Package org.apache.spark.ml.feature
Interface VectorIndexerParams
- All Superinterfaces:
- HasHandleInvalid, HasInputCol, HasOutputCol, Identifiable, Params, Serializable
- All Known Implementing Classes:
- VectorIndexer, VectorIndexerModel
Private trait for the params shared by VectorIndexer and VectorIndexerModel.
- 
Method Summary
Modifier and Type, Method, Description:
- int getMaxCategories()
- Param<String> handleInvalid(): Param for how to handle invalid data (unseen labels or NULL values).
- IntParam maxCategories(): Threshold for the number of values a categorical feature can take.
Methods inherited from interface org.apache.spark.ml.param.shared.HasHandleInvalid
- getHandleInvalid
Methods inherited from interface org.apache.spark.ml.param.shared.HasInputCol
- getInputCol, inputCol
Methods inherited from interface org.apache.spark.ml.param.shared.HasOutputCol
- getOutputCol, outputCol
Methods inherited from interface org.apache.spark.ml.util.Identifiable
- toString, uid
Methods inherited from interface org.apache.spark.ml.param.Params
- clear, copy, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, onParamChange, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn
- 
Method Details
getMaxCategories
int getMaxCategories()
- 
handleInvalid
Param<String> handleInvalid()
Param for how to handle invalid data (unseen labels or NULL values). Note: this param only applies to categorical features, not continuous ones. Options are:
- 'skip': filter out rows with invalid data.
- 'error': throw an error.
- 'keep': put invalid data in a special additional bucket, at an index equal to the number of categories of the feature.
Default value: "error"
- Specified by:
- handleInvalid in interface HasHandleInvalid
- Returns:
- (undocumented)
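The three handleInvalid options can be illustrated without Spark. The following is a minimal sketch in plain Java, not the library's implementation: the class `HandleInvalidSketch`, the method `indexFor`, and the use of `OptionalInt.empty()` to stand in for a filtered-out row are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalInt;

public class HandleInvalidSketch {

    /**
     * Hypothetical helper (not part of Spark): maps a raw categorical value to
     * its category index, applying one of the documented handleInvalid options
     * when the value was not seen during fitting.
     */
    public static OptionalInt indexFor(Map<Double, Integer> categoryMap,
                                       double value,
                                       String handleInvalid) {
        Integer index = categoryMap.get(value);
        if (index != null) {
            return OptionalInt.of(index);
        }
        switch (handleInvalid) {
            case "skip":
                // The row would be filtered out; modeled here as an empty result.
                return OptionalInt.empty();
            case "keep":
                // Extra bucket at an index equal to the number of known categories.
                return OptionalInt.of(categoryMap.size());
            case "error":
                throw new IllegalStateException("Unseen categorical value: " + value);
            default:
                throw new IllegalArgumentException("Unknown handleInvalid option: " + handleInvalid);
        }
    }

    public static void main(String[] args) {
        // Assumed category map for one feature with two known values.
        Map<Double, Integer> categoryMap = new HashMap<>();
        categoryMap.put(0.0, 0);
        categoryMap.put(1.0, 1);

        System.out.println(indexFor(categoryMap, 1.0, "error")); // known value, index 1
        System.out.println(indexFor(categoryMap, 5.0, "keep"));  // unseen value, index 2
        System.out.println(indexFor(categoryMap, 5.0, "skip"));  // unseen value, empty
    }
}
```

With "keep", the unseen value 5.0 lands in bucket 2 because the feature has two known categories (indices 0 and 1).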
 
- 
maxCategories
IntParam maxCategories()
Threshold for the number of values a categorical feature can take. If a feature is found to have > maxCategories values, then it is declared continuous. Must be greater than or equal to 2. (default = 20)
- Returns:
- (undocumented)
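The maxCategories rule can be sketched in plain Java, independent of Spark. This is only an illustration of the threshold described above, assuming that "values" means distinct values observed in one feature column; the class `MaxCategoriesSketch` and the method `isCategorical` are hypothetical names, not library API.

```java
import java.util.HashSet;
import java.util.Set;

public class MaxCategoriesSketch {

    /**
     * Hypothetical helper (not part of Spark): returns true if the feature has
     * at most maxCategories distinct values and would be treated as categorical,
     * false if it exceeds the threshold and would be declared continuous.
     */
    public static boolean isCategorical(double[] featureValues, int maxCategories) {
        if (maxCategories < 2) {
            throw new IllegalArgumentException("maxCategories must be >= 2");
        }
        Set<Double> distinct = new HashSet<>();
        for (double v : featureValues) {
            distinct.add(v);
            // Stop early once the threshold is exceeded.
            if (distinct.size() > maxCategories) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        double[] binary = {0.0, 1.0, 0.0, 1.0};
        double[] spread = {0.1, 0.2, 0.3, 0.4};

        System.out.println(isCategorical(binary, 2)); // true: 2 distinct values
        System.out.println(isCategorical(spread, 3)); // false: 4 distinct values > 3
    }
}
```

A feature with exactly maxCategories distinct values still counts as categorical, since the rule only declares a feature continuous when the count is strictly greater than the threshold.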
 
 
-