Package org.apache.spark.ml.util
Class MetadataUtils
Object
org.apache.spark.ml.util.MetadataUtils
Helper utilities for algorithms using ML metadata
Constructor Summary
- MetadataUtils()

Method Summary
- static scala.collection.immutable.Map<Object,Object> getCategoricalFeatures(StructField featuresSchema): Examine a schema to identify categorical (Binary and Nominal) features.
- static int[] getFeatureIndicesFromNames(StructField col, String[] names): Takes a Vector column and a list of feature names, and returns the corresponding list of feature indices in the column, in order.
- static scala.Option<Object> getNumClasses(StructField labelSchema): Examine a schema to identify the number of classes in a label column.
- static scala.Option<Object> getNumFeatures(StructField vectorSchema): Examine a schema to identify the number of features in a vector column.

Constructor Details
MetadataUtils
public MetadataUtils()

Method Details
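getNumClasses
public static scala.Option<Object> getNumClasses(StructField labelSchema)
Examine a schema to identify the number of classes in a label column. Returns None if the number of labels is not specified, or if the label column is continuous.
- Parameters:
- labelSchema - Schema of the label column.
- Returns:
- The number of classes, or None if it is not specified or the label column is continuous.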
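
The contract above can be sketched without a Spark dependency. The following is a minimal model, not Spark's implementation: LabelMeta, Kind, and numClasses are hypothetical names invented for illustration, and java.util.Optional stands in for scala.Option. It assumes a binary label always has two classes and that a nominal label may or may not record its number of values.

```java
import java.util.Optional;

// Hypothetical stand-in for the metadata of a label column; not a Spark type.
final class LabelMeta {
    enum Kind { BINARY, NOMINAL, NUMERIC, UNSPECIFIED }

    final Kind kind;
    final Integer numValues; // null when the metadata does not record a count

    LabelMeta(Kind kind, Integer numValues) {
        this.kind = kind;
        this.numValues = numValues;
    }

    // Models the documented behaviour: a class count only for categorical
    // labels whose size is known, otherwise an empty result.
    static Optional<Integer> numClasses(LabelMeta meta) {
        switch (meta.kind) {
            case BINARY:
                return Optional.of(2); // assumption: binary means two classes
            case NOMINAL:
                return Optional.ofNullable(meta.numValues);
            default:
                return Optional.empty(); // continuous or unspecified label
        }
    }
}
```

Callers would typically treat an empty result as a signal to determine the number of classes from the data instead.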
 
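getNumFeatures
public static scala.Option<Object> getNumFeatures(StructField vectorSchema)
Examine a schema to identify the number of features in a vector column. Returns None if the number of features is not specified.
- Parameters:
- vectorSchema - Schema of the vector column.
- Returns:
- The number of features, or None if it is not specified.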
 
getCategoricalFeatures
public static scala.collection.immutable.Map<Object,Object> getCategoricalFeatures(StructField featuresSchema)
Examine a schema to identify categorical (Binary and Nominal) features.
- Parameters:
- featuresSchema - Schema of the features column. If a feature does not have metadata, it is assumed to be continuous. If a feature is Nominal, then it must have the number of values specified.
- Returns:
- Map from feature index to number of categories. The map's set of keys will be the set of categorical feature indices.
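
The mapping described above can be illustrated with a Spark-free sketch. CategoricalFeaturesSketch, FeatureMeta, and Kind are hypothetical names, and a null list entry stands for a feature without metadata. The choice of IllegalArgumentException for a Nominal feature with no recorded number of values is an assumption made for this model.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CategoricalFeaturesSketch {
    enum Kind { BINARY, NOMINAL, NUMERIC }

    // Hypothetical per-feature metadata; not a Spark type.
    static final class FeatureMeta {
        final Kind kind;
        final Integer numValues; // null when no count is recorded

        FeatureMeta(Kind kind, Integer numValues) {
            this.kind = kind;
            this.numValues = numValues;
        }
    }

    // Returns feature index -> number of categories, for categorical features only.
    static Map<Integer, Integer> categoricalFeatures(List<FeatureMeta> features) {
        Map<Integer, Integer> arity = new LinkedHashMap<>();
        for (int i = 0; i < features.size(); i++) {
            FeatureMeta f = features.get(i);
            if (f == null || f.kind == Kind.NUMERIC) {
                continue; // no metadata, or numeric: treated as continuous
            }
            if (f.kind == Kind.BINARY) {
                arity.put(i, 2);
            } else {
                if (f.numValues == null) {
                    // Assumed failure mode for this sketch.
                    throw new IllegalArgumentException(
                        "Nominal feature " + i + " has no number of values");
                }
                arity.put(i, f.numValues);
            }
        }
        return arity;
    }
}
```

In this model, a feature list of (numeric, binary, no metadata, nominal with 5 values) yields a map with keys 1 and 3 only, so continuous features never appear as keys.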
 
getFeatureIndicesFromNames
public static int[] getFeatureIndicesFromNames(StructField col, String[] names)
Takes a Vector column and a list of feature names, and returns the corresponding list of feature indices in the column, in order.
- Parameters:
- col - Vector column which must have feature names specified via attributes
- names - List of feature names
- Returns:
- The feature indices corresponding to names, in the same order as names.
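
The name-to-index lookup can be modelled in plain Java as follows. This is a sketch of the documented behaviour only: FeatureIndexSketch and indicesFromNames are hypothetical names, a list of strings stands in for the feature names carried by the column's attributes, and rejecting an unknown name with IllegalArgumentException is an assumption of this model.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FeatureIndexSketch {
    // columnFeatureNames: feature names of the vector column, in column order.
    // names: the names to look up; the result preserves their order.
    static int[] indicesFromNames(List<String> columnFeatureNames, String[] names) {
        Map<String, Integer> indexByName = new HashMap<>();
        for (int i = 0; i < columnFeatureNames.size(); i++) {
            indexByName.put(columnFeatureNames.get(i), i);
        }
        int[] indices = new int[names.length];
        for (int j = 0; j < names.length; j++) {
            Integer idx = indexByName.get(names[j]);
            if (idx == null) {
                // Assumed failure mode for this sketch.
                throw new IllegalArgumentException("Unknown feature name: " + names[j]);
            }
            indices[j] = idx;
        }
        return indices;
    }
}
```

With column features named age, height, weight (in that order), looking up weight and then age gives indices 2 and 0, following the order of the requested names rather than the column order.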
 
 