Package org.apache.cassandra.spark.data
Class PartitionedDataLayer
- java.lang.Object
-
- org.apache.cassandra.spark.data.DataLayer
-
- org.apache.cassandra.spark.data.PartitionedDataLayer
-
- All Implemented Interfaces:
java.io.Serializable
- Direct Known Subclasses:
CassandraDataLayer
public abstract class PartitionedDataLayer extends DataLayer
A DataLayer that partitions the token range by the number of Spark partitions, and lists only the SSTables overlapping with each partition's range.
- See Also:
- Serialized Form
-
-
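The core idea above — dividing the full token range into one contiguous sub-range per Spark partition — can be sketched in isolation. This is an illustrative stand-in, not the actual TokenPartitioner implementation; the class and method names here are assumptions.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: split the token range [min, max] into numPartitions
// roughly equal, contiguous sub-ranges, one per Spark partition. The real
// TokenPartitioner also accounts for ring topology and replica placement.
public class TokenRangeSplit {
    // Each element is a two-element array {startInclusive, endInclusive}.
    public static List<BigInteger[]> splitTokenRange(BigInteger min, BigInteger max, int numPartitions) {
        List<BigInteger[]> ranges = new ArrayList<>();
        BigInteger total = max.subtract(min).add(BigInteger.ONE);
        BigInteger step = total.divide(BigInteger.valueOf(numPartitions));
        BigInteger start = min;
        for (int i = 0; i < numPartitions; i++) {
            // The last partition absorbs any remainder from integer division
            BigInteger end = (i == numPartitions - 1) ? max : start.add(step).subtract(BigInteger.ONE);
            ranges.add(new BigInteger[]{start, end});
            start = end.add(BigInteger.ONE);
        }
        return ranges;
    }
}
```

Each Spark task then reads only the SSTables whose token span overlaps its assigned sub-range, which is what keeps the per-partition work bounded.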
Nested Class Summary
- static class PartitionedDataLayer.AvailabilityHint
- static class PartitionedDataLayer.ReplicaSet
-
Field Summary
- protected org.apache.cassandra.spark.data.partitioner.ConsistencyLevel consistencyLevel
- protected java.lang.String datacenter
Fields inherited from class org.apache.cassandra.spark.data.DataLayer
serialVersionUID
-
-
Constructor Summary
- PartitionedDataLayer(org.apache.cassandra.spark.data.partitioner.ConsistencyLevel consistencyLevel, java.lang.String datacenter)
-
Method Summary
- org.apache.cassandra.spark.data.partitioner.ConsistencyLevel consistencylevel()
- boolean equals(java.lang.Object other)
- boolean filterNonIntersectingSSTables()
  Overridable method setting whether the PartitionedDataLayer should filter out SSTables that do not intersect with the Spark partition token range.
- protected PartitionedDataLayer.AvailabilityHint getAvailability(org.apache.cassandra.spark.data.partitioner.CassandraInstance instance)
  Data Layer can override this method to hint at the availability of a Cassandra instance, so the Bulk Reader attempts UP instances first and avoids instances known to be down, e.g. if a create snapshot request already failed.
- int hashCode()
- boolean isInPartition(int partitionId, java.math.BigInteger token, java.nio.ByteBuffer key)
- abstract java.util.concurrent.CompletableFuture<java.util.stream.Stream<org.apache.cassandra.spark.data.SSTable>> listInstance(int partitionId, com.google.common.collect.Range<java.math.BigInteger> range, org.apache.cassandra.spark.data.partitioner.CassandraInstance instance)
- int partitionCount()
- org.apache.cassandra.spark.data.partitioner.Partitioner partitioner()
- java.util.List<org.apache.cassandra.spark.sparksql.filters.PartitionKeyFilter> partitionKeyFiltersInRange(int partitionId, java.util.List<org.apache.cassandra.spark.sparksql.filters.PartitionKeyFilter> filters)
- abstract org.apache.cassandra.spark.data.ReplicationFactor replicationFactor(java.lang.String keyspace)
- abstract org.apache.cassandra.spark.data.partitioner.CassandraRing ring()
- org.apache.cassandra.spark.sparksql.filters.SparkRangeFilter sparkRangeFilter(int partitionId)
  DataLayer implementations should provide a SparkRangeFilter to filter out partitions and mutations that do not overlap with the Spark worker's token range.
- org.apache.cassandra.spark.data.SSTablesSupplier sstables(int partitionId, org.apache.cassandra.spark.sparksql.filters.SparkRangeFilter sparkRangeFilter, java.util.List<org.apache.cassandra.spark.sparksql.filters.PartitionKeyFilter> partitionKeyFilters)
- abstract org.apache.cassandra.spark.data.partitioner.TokenPartitioner tokenPartitioner()
- static void validateReplicationFactor(org.apache.cassandra.spark.data.partitioner.ConsistencyLevel consistencyLevel, org.apache.cassandra.spark.data.ReplicationFactor replicationFactor, java.lang.String dc)
- protected void validateReplicationFactor(org.apache.cassandra.spark.data.ReplicationFactor replicationFactor)
Methods inherited from class org.apache.cassandra.spark.data.DataLayer
bigNumberConfig, bridge, cqlTable, executorService, jobId, openCompactionScanner, openCompactionScanner, openPartitionSizeIterator, partitionSizeStructType, readIndexOffset, requestedFeatures, sstableTimeRangeFilter, stats, structType, timeProvider, typeConverter, unsupportedPushDownFilters, useIncrementalRepair, version
-
-
-
-
Method Detail
-
validateReplicationFactor
protected void validateReplicationFactor(@NotNull org.apache.cassandra.spark.data.ReplicationFactor replicationFactor)
-
validateReplicationFactor
public static void validateReplicationFactor(@NotNull org.apache.cassandra.spark.data.partitioner.ConsistencyLevel consistencyLevel, @NotNull org.apache.cassandra.spark.data.ReplicationFactor replicationFactor, @Nullable java.lang.String dc)
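The kind of check this validation might perform can be sketched without the library: a datacenter-local consistency level needs a named datacenter with enough replicas to satisfy its quorum. The quorum arithmetic (rf/2 + 1) is standard Cassandra; the class and method shapes below are assumptions for illustration only.

```java
// Illustrative sketch of replication-factor validation against a
// datacenter-local consistency level such as LOCAL_QUORUM. Not the actual
// PartitionedDataLayer.validateReplicationFactor implementation.
public class RfValidation {
    // Number of replicas that must respond for a quorum within one datacenter
    public static int localQuorum(int rfInDatacenter) {
        return rfInDatacenter / 2 + 1;
    }

    // A local consistency level requires a named datacenter that actually
    // holds replicas; otherwise the read can never be satisfied.
    public static void validate(int rfInDatacenter, String dc) {
        if (dc == null) {
            throw new IllegalArgumentException("LOCAL consistency levels require a datacenter");
        }
        if (rfInDatacenter <= 0) {
            throw new IllegalArgumentException("No replicas in datacenter " + dc);
        }
    }
}
```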
-
listInstance
public abstract java.util.concurrent.CompletableFuture<java.util.stream.Stream<org.apache.cassandra.spark.data.SSTable>> listInstance(int partitionId, @NotNull com.google.common.collect.Range<java.math.BigInteger> range, @NotNull org.apache.cassandra.spark.data.partitioner.CassandraInstance instance)
-
ring
public abstract org.apache.cassandra.spark.data.partitioner.CassandraRing ring()
-
tokenPartitioner
public abstract org.apache.cassandra.spark.data.partitioner.TokenPartitioner tokenPartitioner()
-
partitionCount
public int partitionCount()
- Specified by:
partitionCount in class DataLayer
-
partitioner
public org.apache.cassandra.spark.data.partitioner.Partitioner partitioner()
- Specified by:
partitioner in class DataLayer
-
isInPartition
public boolean isInPartition(int partitionId, java.math.BigInteger token, java.nio.ByteBuffer key)
- Specified by:
isInPartition in class DataLayer
-
sparkRangeFilter
public org.apache.cassandra.spark.sparksql.filters.SparkRangeFilter sparkRangeFilter(int partitionId)
Description copied from class: DataLayer
DataLayer implementations should provide a SparkRangeFilter to filter out partitions and mutations that do not overlap with the Spark worker's token range.
- Overrides:
sparkRangeFilter in class DataLayer
- Parameters:
partitionId - the partitionId for the task
- Returns:
SparkRangeFilter for the Spark worker's token range
-
partitionKeyFiltersInRange
public java.util.List<org.apache.cassandra.spark.sparksql.filters.PartitionKeyFilter> partitionKeyFiltersInRange(int partitionId, java.util.List<org.apache.cassandra.spark.sparksql.filters.PartitionKeyFilter> filters) throws org.apache.cassandra.spark.sparksql.NoMatchFoundException
- Overrides:
partitionKeyFiltersInRange in class DataLayer
- Throws:
org.apache.cassandra.spark.sparksql.NoMatchFoundException
-
consistencylevel
public org.apache.cassandra.spark.data.partitioner.ConsistencyLevel consistencylevel()
-
sstables
public org.apache.cassandra.spark.data.SSTablesSupplier sstables(int partitionId, @Nullable org.apache.cassandra.spark.sparksql.filters.SparkRangeFilter sparkRangeFilter, @NotNull java.util.List<org.apache.cassandra.spark.sparksql.filters.PartitionKeyFilter> partitionKeyFilters)
-
filterNonIntersectingSSTables
public boolean filterNonIntersectingSSTables()
Overridable method setting whether the PartitionedDataLayer should filter out SSTables that do not intersect with the Spark partition token range.
- Returns:
true if we should filter
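When this filtering is enabled, the decision per SSTable reduces to a closed-interval overlap test between the SSTable's first/last tokens and the Spark partition's token range. A minimal self-contained sketch (the real code works with Guava Range objects and SSTable summary bounds, so the signature below is an assumption):

```java
import java.math.BigInteger;

// Illustrative intersection test between an SSTable's token span and a Spark
// partition's token range. Not the actual filtering implementation.
public class SSTableRangeFilter {
    public static boolean intersects(BigInteger sstableFirst, BigInteger sstableLast,
                                     BigInteger rangeStart, BigInteger rangeEnd) {
        // Two closed intervals overlap iff each one starts no later than the
        // other one ends.
        return sstableFirst.compareTo(rangeEnd) <= 0
            && rangeStart.compareTo(sstableLast) <= 0;
    }
}
```

SSTables failing this test can be skipped entirely, which is why the filter is on by default for partitioned reads.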
-
getAvailability
protected PartitionedDataLayer.AvailabilityHint getAvailability(org.apache.cassandra.spark.data.partitioner.CassandraInstance instance)
Data Layer can override this method to hint at the availability of a Cassandra instance, so the Bulk Reader attempts UP instances first and avoids instances known to be down, e.g. if a create snapshot request already failed.
- Parameters:
instance - a Cassandra instance
- Returns:
availability hint
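The intended use of the hint — try instances believed UP before ones known to be down — can be sketched with a stand-in enum. The enum constants and method below are assumptions mirroring PartitionedDataLayer.AvailabilityHint in spirit, not its actual definition.

```java
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Illustrative sketch: order candidate instances so that those hinted UP are
// attempted first, pushing known-down instances to the back of the queue.
public class AvailabilitySort {
    public enum Hint { UP, UNKNOWN, DOWN } // assumed constants, declared best-first

    public static List<String> orderByAvailability(List<String> instances,
                                                   Function<String, Hint> hintOf) {
        // Lower ordinal = better availability, so an ascending sort tries
        // UP instances first; the sort is stable within each hint level.
        return instances.stream()
                        .sorted(Comparator.comparingInt(i -> hintOf.apply(i).ordinal()))
                        .collect(Collectors.toList());
    }
}
```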
-
replicationFactor
public abstract org.apache.cassandra.spark.data.ReplicationFactor replicationFactor(java.lang.String keyspace)
-
hashCode
public int hashCode()
- Overrides:
hashCode in class java.lang.Object
-
equals
public boolean equals(java.lang.Object other)
- Overrides:
equals in class java.lang.Object
-
-