Package org.apache.spark.mllib.feature
Class StandardScaler
Object
org.apache.spark.mllib.feature.StandardScaler
- All Implemented Interfaces:
- org.apache.spark.internal.Logging
Standardizes features by removing the mean and scaling to unit std using column summary
 statistics on the samples in the training set.
 
The "unit std" is computed using the corrected sample standard deviation (https://en.wikipedia.org/wiki/Standard_deviation#Corrected_sample_standard_deviation), which is computed as the square root of the unbiased sample variance.
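Written out as a sketch, for one feature column with samples x_1, ..., x_n (the symbols n, x_i, \bar{x} and s are notation chosen here for illustration, not names from the API), the corrected sample standard deviation is:

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^{2}}
```

The divisor n - 1, rather than n, is what makes the variance under the square root the unbiased sample variance.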
param: withMean False by default. Centers the data with the mean before scaling. It will build a dense output, so take care when applying to sparse input.
param: withStd True by default. Scales the data to unit standard deviation.
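The effect of the two flags can be illustrated on a single column. The following is a minimal plain-Java sketch, not Spark code: the class ColumnScalingSketch and its standardize method are hypothetical names, and mapping a zero-variance column to 0.0 is an assumption made here to avoid dividing by zero.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Spark: applies the withMean / withStd
// options to one feature column held in a plain array.
final class ColumnScalingSketch {

    static double[] standardize(double[] column, boolean withMean, boolean withStd) {
        int n = column.length;
        double mean = Arrays.stream(column).average().orElse(0.0);

        // Corrected sample standard deviation: divide by (n - 1), not n.
        double sumSq = 0.0;
        for (double x : column) {
            sumSq += (x - mean) * (x - mean);
        }
        double std = n > 1 ? Math.sqrt(sumSq / (n - 1)) : 0.0;

        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            double v = withMean ? column[i] - mean : column[i];
            if (withStd) {
                // Assumption: a zero-variance column is mapped to 0.0.
                v = std != 0.0 ? v / std : 0.0;
            }
            out[i] = v;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] column = {0.0, 0.0, 0.0, 4.0}; // mostly zeros, as in sparse data

        // withMean = false: zeros stay zero, so sparsity is preserved.
        System.out.println(Arrays.toString(standardize(column, false, true)));
        // prints [0.0, 0.0, 0.0, 2.0]

        // withMean = true: every zero becomes non-zero, so the output is dense.
        System.out.println(Arrays.toString(standardize(column, true, true)));
        // prints [-0.5, -0.5, -0.5, 1.5]
    }
}
```

With centering enabled, every entry that was zero takes the value -mean/std, which is why the output has to be dense.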
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter
Constructor Summary
Constructors
- StandardScaler()
- StandardScaler(boolean withMean, boolean withStd)
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| StandardScalerModel | fit(RDD<Vector> data) | Computes the mean and variance and stores as a model to be used for later scaling. |

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext
Constructor Details
StandardScaler
public StandardScaler(boolean withMean, boolean withStd)

StandardScaler
public StandardScaler()

Method Details
fit
public StandardScalerModel fit(RDD<Vector> data)
Computes the mean and variance and stores them as a model to be used for later scaling.
- Parameters:
- data - The data used to compute the mean and variance to build the transformation model.
- Returns:
- a StandardScalerModel
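The two-phase pattern described for fit (compute column statistics once, keep them in a model, reuse the model for later scaling) can be sketched in plain Java. This is an illustration under stated assumptions, not the Spark implementation: FitTransformSketch, Model, and the array-based signatures are hypothetical stand-ins for StandardScaler, StandardScalerModel, and RDD<Vector>, and the transform shown applies only the default settings (withMean false, withStd true).

```java
import java.util.Arrays;

// Hypothetical stand-in for StandardScaler.fit and the model it returns.
final class FitTransformSketch {

    /** Holds the per-column mean and unbiased variance computed by fit. */
    static final class Model {
        final double[] mean;
        final double[] variance;

        Model(double[] mean, double[] variance) {
            this.mean = mean;
            this.variance = variance;
        }

        // Scales each feature to unit standard deviation without centering.
        double[] transform(double[] row) {
            double[] out = new double[row.length];
            for (int j = 0; j < row.length; j++) {
                double std = Math.sqrt(variance[j]);
                // Assumption: zero-variance columns are mapped to 0.0.
                out[j] = std != 0.0 ? row[j] / std : 0.0;
            }
            return out;
        }
    }

    // Computes column means and unbiased variances; expects at least two rows.
    static Model fit(double[][] rows) {
        int n = rows.length;
        int d = rows[0].length;
        double[] mean = new double[d];
        double[] variance = new double[d];
        for (double[] row : rows) {
            for (int j = 0; j < d; j++) {
                mean[j] += row[j] / n;
            }
        }
        for (double[] row : rows) {
            for (int j = 0; j < d; j++) {
                double diff = row[j] - mean[j];
                variance[j] += diff * diff / (n - 1);
            }
        }
        return new Model(mean, variance);
    }

    public static void main(String[] args) {
        double[][] training = {{1.0, 10.0}, {2.0, 20.0}, {3.0, 30.0}};
        Model model = fit(training);

        // The fitted statistics are reused on data not seen during fit.
        System.out.println(Arrays.toString(model.transform(new double[]{4.0, 40.0})));
    }
}
```

Keeping the statistics in a separate model object means that new data is scaled with the training-set mean and variance rather than with statistics recomputed from the new data.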