Package org.apache.spark.ml.feature
Interface Word2VecBase
- All Superinterfaces:
- HasInputCol, HasMaxIter, HasOutputCol, HasSeed, HasStepSize, Identifiable, Params, Serializable
- All Known Implementing Classes:
- Word2Vec, Word2VecModel
public interface Word2VecBase
extends Params, HasInputCol, HasOutputCol, HasMaxIter, HasStepSize, HasSeed
Params for Word2Vec and Word2VecModel.

Method Summary
Modifier and Type, Method, Description
- int getMaxSentenceLength()
- int getMinCount()
- int getNumPartitions()
- int getVectorSize()
- int getWindowSize()
- IntParam maxSentenceLength(): Sets the maximum length (in words) of each sentence in the input data.
- IntParam minCount(): The minimum number of times a token must appear to be included in the word2vec model's vocabulary.
- IntParam numPartitions(): Number of partitions for sentences of words.
- StructType validateAndTransformSchema(StructType schema): Validate and transform the input schema.
- IntParam vectorSize(): The dimension of the vector that each word is transformed into.
- IntParam windowSize(): The window size (context words from [-window, window]).
Methods inherited from interface org.apache.spark.ml.param.shared.HasInputCol
- getInputCol, inputCol
Methods inherited from interface org.apache.spark.ml.param.shared.HasMaxIter
- getMaxIter, maxIter
Methods inherited from interface org.apache.spark.ml.param.shared.HasOutputCol
- getOutputCol, outputCol
Methods inherited from interface org.apache.spark.ml.param.shared.HasStepSize
- getStepSize, stepSize
Methods inherited from interface org.apache.spark.ml.util.Identifiable
- toString, uid
Methods inherited from interface org.apache.spark.ml.param.Params
- clear, copy, copyValues, defaultCopy, defaultParamMap, estimateMatadataSize, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, onParamChange, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn

Method Details

getMaxSentenceLength
int getMaxSentenceLength()

getMinCount
int getMinCount()

getNumPartitions
int getNumPartitions()

getVectorSize
int getVectorSize()

getWindowSize
int getWindowSize()

maxSentenceLength
IntParam maxSentenceLength()
Sets the maximum length (in words) of each sentence in the input data. Any sentence longer than this threshold will be divided into chunks of up to maxSentenceLength words. Default: 1000
- Returns:
- (undocumented)

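The chunking rule described for maxSentenceLength can be illustrated with a small sketch that does not depend on Spark. The names SentenceChunker and chunk are hypothetical and not part of the Spark API. The code only shows one way a long sentence could be split into chunks of at most maxSentenceLength words; it is not Spark's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spark: illustrates the documented chunking rule.
final class SentenceChunker {

    // Splits a tokenized sentence into consecutive chunks of at most maxSentenceLength words.
    static List<List<String>> chunk(List<String> sentence, int maxSentenceLength) {
        if (maxSentenceLength <= 0) {
            throw new IllegalArgumentException("maxSentenceLength must be positive");
        }
        List<List<String>> chunks = new ArrayList<>();
        for (int start = 0; start < sentence.size(); start += maxSentenceLength) {
            int end = Math.min(start + maxSentenceLength, sentence.size());
            // Copy the sub-list so each chunk is independent of the input list.
            chunks.add(new ArrayList<>(sentence.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // A five-word sentence with maxSentenceLength = 2 yields three chunks.
        System.out.println(chunk(List.of("a", "b", "c", "d", "e"), 2)); // prints [[a, b], [c, d], [e]]
    }
}
```

With the default of 1000, a sentence of 2500 words would produce three chunks under this rule: two of 1000 words and one of 500.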
minCount
IntParam minCount()
The minimum number of times a token must appear to be included in the word2vec model's vocabulary. Default: 5
- Returns:
- (undocumented)

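The effect of minCount on the vocabulary can be sketched without Spark as a simple frequency filter. The names VocabularyFilter and vocabulary are hypothetical. The sketch assumes that "appear at least minCount times" means a total count across all sentences that is greater than or equal to minCount, and it does not reproduce how Spark builds its vocabulary internally.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Spark: keeps tokens that occur at least minCount times.
final class VocabularyFilter {

    static Set<String> vocabulary(List<List<String>> sentences, int minCount) {
        // Count every token across all sentences.
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> sentence : sentences) {
            for (String token : sentence) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        // Keep only tokens whose total count reaches the threshold (sorted for stable output).
        Set<String> vocab = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() >= minCount) {
                vocab.add(entry.getKey());
            }
        }
        return vocab;
    }

    public static void main(String[] args) {
        List<List<String>> sentences = List.of(List.of("a", "b", "a"), List.of("a", "c", "b"));
        // "a" occurs 3 times, "b" twice, "c" once; with minCount = 2 only "a" and "b" remain.
        System.out.println(vocabulary(sentences, 2)); // prints [a, b]
    }
}
```

On a small corpus the default of 5 can exclude most tokens, so a lower minCount may be needed there.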
numPartitions
IntParam numPartitions()
Number of partitions for sentences of words. Default: 1
- Returns:
- (undocumented)

validateAndTransformSchema
StructType validateAndTransformSchema(StructType schema)
Validate and transform the input schema.
- Parameters:
- schema - (undocumented)
- Returns:
- (undocumented)

vectorSize
IntParam vectorSize()
The dimension of the vector that each word is transformed into. Default: 100
- Returns:
- (undocumented)

windowSize
IntParam windowSize()
The window size (context words from [-window, window]). Default: 5
- Returns:
- (undocumented)
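The "[-window, window]" notation for windowSize can be made concrete with a short sketch that does not depend on Spark. The names ContextWindow and contextOf are hypothetical. The sketch assumes a fixed window clipped at the sentence boundaries; actual word2vec training code may choose the effective window differently (for example, by varying it per position), so this illustrates the notation rather than Spark's behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spark: lists the context words around a center position.
final class ContextWindow {

    // Returns the words at offsets -window..+window from center, excluding the center word
    // itself and any positions that fall outside the sentence.
    static List<String> contextOf(List<String> sentence, int center, int window) {
        if (center < 0 || center >= sentence.size()) {
            throw new IndexOutOfBoundsException("center out of range: " + center);
        }
        if (window < 0) {
            throw new IllegalArgumentException("window must be non-negative");
        }
        int from = Math.max(0, center - window);
        int to = Math.min(sentence.size() - 1, center + window);
        List<String> context = new ArrayList<>();
        for (int i = from; i <= to; i++) {
            if (i != center) {
                context.add(sentence.get(i));
            }
        }
        return context;
    }

    public static void main(String[] args) {
        List<String> sentence = List.of("the", "quick", "brown", "fox", "jumps");
        // Center word "brown" (index 2) with window = 1 has one neighbor on each side.
        System.out.println(contextOf(sentence, 2, 1)); // prints [quick, fox]
    }
}
```

Under this reading, the default window of 5 gives each word up to 10 context words, with fewer near the start or end of a sentence.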