Package org.apache.spark.ml.clustering
Interface KMeansParams
- All Superinterfaces:
- HasDistanceMeasure, HasFeaturesCol, HasMaxBlockSizeInMB, HasMaxIter, HasPredictionCol, HasSeed, HasSolver, HasTol, HasWeightCol, Identifiable, Params, Serializable
- All Known Implementing Classes:
- KMeans, KMeansModel
public interface KMeansParams
extends Params, HasMaxIter, HasFeaturesCol, HasSeed, HasPredictionCol, HasTol, HasDistanceMeasure, HasWeightCol, HasSolver, HasMaxBlockSizeInMB
Common params for KMeans and KMeansModel
Method Summary
- String getInitMode()
- int getInitSteps()
- int getK()
- Param<String> initMode(): Param for the initialization algorithm.
- IntParam initSteps(): Param for the number of steps for the k-means|| initialization mode.
- IntParam k(): The number of clusters to create (k).
- Param<String> solver(): Param for the name of optimization method used in KMeans.
- StructType validateAndTransformSchema(StructType schema): Validates and transforms the input schema.
- Methods inherited from interface org.apache.spark.ml.param.shared.HasDistanceMeasure: distanceMeasure, getDistanceMeasure
- Methods inherited from interface org.apache.spark.ml.param.shared.HasFeaturesCol: featuresCol, getFeaturesCol
- Methods inherited from interface org.apache.spark.ml.param.shared.HasMaxBlockSizeInMB: getMaxBlockSizeInMB, maxBlockSizeInMB
- Methods inherited from interface org.apache.spark.ml.param.shared.HasMaxIter: getMaxIter, maxIter
- Methods inherited from interface org.apache.spark.ml.param.shared.HasPredictionCol: getPredictionCol, predictionCol
- Methods inherited from interface org.apache.spark.ml.param.shared.HasWeightCol: getWeightCol, weightCol
- Methods inherited from interface org.apache.spark.ml.util.Identifiable: toString, uid
- Methods inherited from interface org.apache.spark.ml.param.Params: clear, copy, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, onParamChange, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn
Method Details
- getInitMode
  String getInitMode()
- getInitSteps
  int getInitSteps()
- getK
  int getK()
- initMode
  Param<String> initMode()
  Param for the initialization algorithm. This can be either "random" to choose random points as initial cluster centers, or "k-means||" to use a parallel variant of k-means++ (Bahmani et al., Scalable K-Means++, VLDB 2012). Default: k-means||.
  Returns: (undocumented)
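As a rough illustration of the two initialization styles, the sketch below contrasts uniform random seeding with sequential k-means++ seeding. It implements plain k-means++, not the parallel k-means|| variant, and it is not Spark's code. The class `SeedingSketch` and its method names are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not Spark's implementation.
class SeedingSketch {
    // Squared Euclidean distance between two points of equal dimension.
    static double sqDist(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            s += d * d;
        }
        return s;
    }

    // "random"-style seeding: up to k rows chosen uniformly without replacement.
    static List<double[]> randomInit(double[][] points, int k, long seed) {
        List<double[]> shuffled = new ArrayList<>(Arrays.asList(points));
        Collections.shuffle(shuffled, new Random(seed));
        return new ArrayList<>(shuffled.subList(0, Math.min(k, shuffled.size())));
    }

    // Sequential k-means++ seeding: each new center is drawn with probability
    // proportional to its squared distance from the nearest center chosen so far.
    static List<double[]> kMeansPlusPlusInit(double[][] points, int k, long seed) {
        Random rng = new Random(seed);
        List<double[]> centers = new ArrayList<>();
        centers.add(points[rng.nextInt(points.length)]);
        double[] minDist = new double[points.length];
        Arrays.fill(minDist, Double.POSITIVE_INFINITY);
        while (centers.size() < k) {
            double[] last = centers.get(centers.size() - 1);
            double total = 0.0;
            for (int i = 0; i < points.length; i++) {
                minDist[i] = Math.min(minDist[i], sqDist(points[i], last));
                total += minDist[i];
            }
            if (total == 0.0) {
                break; // every point coincides with an already chosen center
            }
            double threshold = rng.nextDouble() * total;
            double acc = 0.0;
            int chosen = -1;
            for (int i = 0; i < points.length; i++) {
                acc += minDist[i];
                if (minDist[i] > 0.0) {
                    chosen = i; // last positive-weight point doubles as a rounding fallback
                    if (acc >= threshold) {
                        break;
                    }
                }
            }
            centers.add(points[chosen]);
        }
        return centers;
    }

    public static void main(String[] args) {
        double[][] pts = {{0, 0}, {0, 1}, {10, 10}, {10, 11}, {5, 0}};
        for (double[] c : kMeansPlusPlusInit(pts, 3, 42L)) {
            System.out.println(Arrays.toString(c));
        }
    }
}
```

In this sketch, distance-weighted seeding never picks a point that coincides with an existing center, so it stops early when the data has fewer than k distinct points.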
 
- initSteps
  IntParam initSteps()
  Param for the number of steps for the k-means|| initialization mode. This is an advanced setting; the default of 2 is almost always enough. Must be > 0. Default: 2.
  Returns: (undocumented)
 
- k
  IntParam k()
  The number of clusters to create (k). Must be > 1. Note that it is possible for fewer than k clusters to be returned, for example, if there are fewer than k distinct points to cluster. Default: 2.
  Returns: (undocumented)
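The note about fewer than k clusters can be made concrete with a small sketch that counts distinct points and caps the requested k accordingly. This is only an upper bound computed outside Spark, not something the library exposes. The class `DistinctPointsSketch` and its methods are hypothetical names for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical illustration only; not part of the Spark API.
class DistinctPointsSketch {
    // Counts distinct rows. Boxing each row into a List<Double> gives
    // value-based equality (note that Double.equals treats 0.0 and -0.0 as different).
    static int countDistinct(double[][] points) {
        Set<List<Double>> seen = new HashSet<>();
        for (double[] p : points) {
            seen.add(Arrays.stream(p).boxed().collect(Collectors.toList()));
        }
        return seen.size();
    }

    // Upper bound on the number of distinct cluster centers for a requested k.
    static int maxDistinctCenters(double[][] points, int k) {
        if (k <= 1) {
            throw new IllegalArgumentException("k must be > 1");
        }
        return Math.min(k, countDistinct(points));
    }

    public static void main(String[] args) {
        double[][] pts = {{1, 1}, {1, 1}, {2, 2}};
        System.out.println(maxDistinctCenters(pts, 5));
    }
}
```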
 
- solver
  Param<String> solver()
  Param for the name of optimization method used in KMeans. Supported options:
  - "auto": Automatically select the solver based on the input schema and sparsity. If input instances are arrays or input vectors are dense, set to "block"; otherwise, set to "row".
  - "row": Input instances are processed row by row, and the triangle inequality is applied to accelerate training.
  - "block": Input instances are stacked into blocks, and GEMM is applied to compute the distances.
  Default is "auto".
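To show why a matrix product (GEMM) can compute distances, the sketch below compares a row-by-row computation with a block-style one, assuming squared Euclidean distance. The block version relies on the identity ||x - c||^2 = ||x||^2 + ||c||^2 - 2(x . c), where all dot products come from one product X * C^T. This is not Spark's implementation: the triangle-inequality pruning of the "row" solver is omitted, a plain triple loop stands in for an optimized GEMM routine, and the class `BlockDistanceSketch` is a hypothetical name.

```java
// Hypothetical illustration only; not Spark's implementation.
class BlockDistanceSketch {
    static double squaredNorm(double[] v) {
        double s = 0.0;
        for (double x : v) {
            s += x * x;
        }
        return s;
    }

    // Row-style: direct squared Euclidean distance for each (point, center) pair.
    static double[][] rowWise(double[][] points, double[][] centers) {
        double[][] out = new double[points.length][centers.length];
        for (int i = 0; i < points.length; i++) {
            for (int j = 0; j < centers.length; j++) {
                double s = 0.0;
                for (int t = 0; t < points[i].length; t++) {
                    double d = points[i][t] - centers[j][t];
                    s += d * d;
                }
                out[i][j] = s;
            }
        }
        return out;
    }

    // Block-style: precompute squared norms, form all dot products as X * C^T,
    // then combine them with the identity from the lead-in.
    static double[][] blockWise(double[][] points, double[][] centers) {
        double[] pointNorms = new double[points.length];
        for (int i = 0; i < points.length; i++) {
            pointNorms[i] = squaredNorm(points[i]);
        }
        double[] centerNorms = new double[centers.length];
        for (int j = 0; j < centers.length; j++) {
            centerNorms[j] = squaredNorm(centers[j]);
        }
        double[][] out = new double[points.length][centers.length];
        for (int i = 0; i < points.length; i++) {
            for (int j = 0; j < centers.length; j++) {
                double dot = 0.0; // entry (i, j) of X * C^T
                for (int t = 0; t < points[i].length; t++) {
                    dot += points[i][t] * centers[j][t];
                }
                // Clamp at zero to absorb small negative values from rounding.
                out[i][j] = Math.max(0.0, pointNorms[i] + centerNorms[j] - 2.0 * dot);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] x = {{0, 0}, {3, 4}};
        double[][] c = {{0, 0}, {6, 8}};
        System.out.println(java.util.Arrays.deepToString(blockWise(x, c)));
    }
}
```

Both methods return the same distance matrix up to rounding. The block form trades extra arithmetic for a regular access pattern that dense linear algebra routines handle well.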
- validateAndTransformSchema
  StructType validateAndTransformSchema(StructType schema)
  Validates and transforms the input schema.
  Parameters: schema - input schema
  Returns: output schema