Algorithm Hyperparameters
Hyperparameters are parameters of a machine learning algorithm that are set prior to the commencement of the training process.
Parameters common to all algorithms:
Field Description | Field Name |
---|---|
Batch Size. The preferred number of instances to process if batch prediction is being performed. More or fewer instances may be provided, but this gives implementations a chance to specify a preferred batch size. | batchSize |
Number of Decimal Places. The number of decimal places to be used for the output of numbers shown in the trained unit info that is returned by the Train function. | numDecimalPlaces |
Parameters specific to an algorithm:
Linear Regression
Field Description | Field Name |
---|---|
Attribute Selection Method. Sets the method used to select attributes. Available methods are: no attribute selection, attribute selection using M5's method (step through the attributes removing the one with the smallest standardized coefficient until no improvement is observed in the estimate of the error given by the Akaike information criterion), and a greedy selection using the Akaike information metric. | attributeSelectionMethod |
Eliminate Collinear Attributes. Sets whether or not collinear attributes are eliminated. | eliminateCollinearAttributes |
Minimal. If enabled, means and standard deviations get discarded to conserve memory. As a consequence, the trained unit info that is returned by the Train function is truncated. | minimal |
Ridge Parameter. The value of the ridge parameter for the L2 regularization. | ridge |
Output Additional Statistics. Determines whether to output additional statistics (such as standard deviation of coefficients and t-statistics) in the trained unit info for regression analysis. | outputAdditionalStats |
Logistic Regression
Field Description | Field Name |
---|---|
Maximum Number of Iterations. Maximum number of iterations to perform. | maxIts |
Ridge Parameter. The value of the ridge parameter. | ridge |
Use Conjugate Gradient Descent. Use conjugate gradient descent rather than BFGS updates; faster for problems with many parameters. | useConjugateGradientDescent |
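
As a rough illustration of how the common and algorithm-specific field names fit together, the sketch below merges them into one flat set of hyperparameters. The source does not say how hyperparameters are passed to the Train function, so the flat JSON layout, the helper name `build_hyperparameters`, and all values are assumptions made for illustration only.

```python
import json

# Parameters common to all algorithms (field names from the table of common parameters).
# The values are illustrative placeholders, not documented defaults.
common = {"batchSize": 100, "numDecimalPlaces": 2}

# Logistic Regression specific parameters; the values are illustrative only.
logistic_regression = {
    "maxIts": 50,
    "ridge": 1e-8,
    "useConjugateGradientDescent": False,
}


def build_hyperparameters(common_params: dict, specific_params: dict) -> dict:
    """Hypothetical helper: merge common and algorithm-specific settings."""
    merged = dict(common_params)
    merged.update(specific_params)
    return merged


hyperparameters = build_hyperparameters(common, logistic_regression)
print(json.dumps(hyperparameters, indent=2))
```
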
Multilayer Perceptron
Field Description | Field Name |
---|---|
Decay. If set to true, the learning rate decreases during training: the starting learning rate is divided by the epoch number to determine the current learning rate. This may help to stop the network from diverging from the target output, as well as improve general performance. | decay |
Hidden Layers. Defines the hidden layers of the neural network as a comma-separated list of positive whole numbers, one for each hidden layer. To have no hidden layers, enter a single 0. The following wildcard values are also available: 'a' = (attributes + classes) / 2, 'i' = attributes, 'o' = classes, 't' = attributes + classes. | hiddenLayers |
Learning Rate. The amount the weights are updated. | learningRate |
Momentum. Momentum applied to the weights during updating. | momentum |
Nominal to Binary Filter. If enabled, the filter will be used to preprocess the instances. This could help improve performance if there are nominal attributes in the data. | nominalToBinaryFilter |
Normalize Attributes. Determines whether to normalize the attributes. This could help improve performance of the network. Nominal attributes will be normalized as well (after they have been run through the nominal to binary filter if that is in use) so that the nominal values are between -1 and 1. | normalizeAttributes |
Normalize Label Values. Determines whether to normalize the label column values if they are numeric. This could help improve performance of the network. The values are normalized to be between -1 and 1. Normalization is applied internally only; the output is scaled back to the original range. | normalizeLabelValues |
Reset. If set to true, the network is allowed to reset with a lower learning rate. If the network diverges from the answer, it is automatically reset with a lower learning rate and training begins again. If the network diverges but is not allowed to reset, the training process fails and returns an error message. | reset |
Random Number Seed. Seed used to initialize the random number generator. Random numbers are used for setting the initial weights of the connections between nodes, and also for shuffling the training data. | seed |
Training Time. The number of epochs to train through. If the validation set size is non-zero, training can terminate early. | trainingTime |
Validation Set Size. The size of the validation set, as a percentage. Training continues until the error on the validation set is observed to be consistently getting worse, or until the training time is reached. If this is set to zero, no validation set is used and the network trains for the specified number of epochs. | validationSetSize |
Validation Threshold. Used to terminate validation testing. The value here dictates how many times in a row the validation set error can get worse before training is terminated. | validationThreshold |
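
The wildcard values for hiddenLayers and the decay rule can be sketched in a few lines. This is only an interpretation of the descriptions above: the function names are hypothetical, rounding the 'a' wildcard down is an assumption (the source does not say how a fractional result is handled), and epochs are assumed to be counted from 1.

```python
from typing import List


def resolve_hidden_layers(spec: str, num_attributes: int, num_classes: int) -> List[int]:
    """Expand a hiddenLayers value such as "a,5" into a list of layer sizes."""
    wildcards = {
        "a": (num_attributes + num_classes) // 2,  # assumption: rounds down
        "i": num_attributes,
        "o": num_classes,
        "t": num_attributes + num_classes,
    }
    tokens = [token.strip() for token in spec.split(",")]
    if tokens == ["0"]:
        # A single 0 means the network has no hidden layers.
        return []
    return [wildcards[t] if t in wildcards else int(t) for t in tokens]


def decayed_learning_rate(starting_rate: float, epoch: int) -> float:
    """Current learning rate when decay is enabled; epochs are counted from 1."""
    return starting_rate / epoch


print(resolve_hidden_layers("a,5", num_attributes=10, num_classes=2))  # [6, 5]
print(decayed_learning_rate(0.3, epoch=2))  # 0.15
```

With 10 attributes and 2 classes, "a,5" gives a first hidden layer of 6 nodes followed by a second layer of 5 nodes.
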
Naive Bayes Classifier
Field Description | Field Name |
---|---|
Use Kernel Estimator. Determines whether to use a kernel estimator for numeric attributes rather than a normal distribution. | useKernelEstimator |
Use Supervised Discretization. Determines whether to use supervised discretization to convert numeric attributes to nominal ones. | useSupervisedDiscretization |
One-Class Support Vector Machine
Field Description | Field Name |
---|---|
Do Not Replace Missing Values. Determines whether to turn off automatic replacement of missing values. WARNING: set to true only if the data does not contain missing values. | doNotReplaceMissingValues |
Kernel. The kernel to use. | svmKernel |
Kernel Parameters. Parameters of the chosen kernel. | svmKernelParameters |
Normalize. Determines whether to normalize the data. | normalize |
Nu. The value of nu. | nu |
Random Number Seed. Seed used to initialize the random number generator. | seed |
Shrinking. Determines whether to use the shrinking heuristic. | shrinking |
Tolerance Parameter. The tolerance of the termination criterion. | toleranceParameter |
Random Forest
Field Description | Field Name |
---|---|
Bag Size Percentage. Size of each bag, as a percentage of the training set size. | bagSizePercent |
Break Ties Randomly. Break ties randomly when several attributes look equally good. | breakTiesRandomly |
Calculate Out-of-bag Error. Determines whether the out-of-bag error is calculated. | calcOutOfBag |
Compute Attribute Importance. Compute attribute importance via mean impurity decrease. | computeAttributeImportance |
Maximum Depth of the Tree. The maximum depth of the tree, 0 for unlimited. | maxDepth |
Number of Execution Slots. The number of execution slots (threads) to use for constructing the ensemble. | numExecutionSlots |
Number of Features. Sets the number of randomly chosen attributes. If 0, int(log_2(num_predictors) + 1) is used. | numFeatures |
Number of Iterations. The number of iterations to be performed. | numIterations |
Output Out-of-bag Complexity Statistics. Determines whether to output complexity-based statistics in the trained unit info when out-of-bag evaluation is performed. | outputOutOfBagComplexityStats |
Output Classifiers. Determines whether to output the individual classifiers in the trained unit info. | outputClassifiers |
Random Number Seed. Seed used to initialize the random number generator. | seed |
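
The default used when numFeatures is 0 can be checked with a short calculation. The sketch below follows the formula int(log_2(num_predictors) + 1) given in the table; the function names are hypothetical.

```python
import math


def default_num_features(num_predictors: int) -> int:
    """Default number of randomly chosen attributes: int(log_2(n) + 1)."""
    return int(math.log2(num_predictors) + 1)


def effective_num_features(num_features: int, num_predictors: int) -> int:
    """Return the configured value, or the default when numFeatures is 0."""
    if num_features == 0:
        return default_num_features(num_predictors)
    return num_features


print(effective_num_features(0, 16))  # 5
print(effective_num_features(0, 10))  # 4
print(effective_num_features(3, 10))  # 3
```
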
Reduced Error Pruning (REP) Decision Tree
Field Description | Field Name |
---|---|
Initial Count. Initial class value count. | initialCount |
Maximum Depth of the Tree. The maximum tree depth (-1 for no restriction). | maxDepth |
Minimum Number of Instances. The minimum total weight of the instances in a leaf. | minNum |
Minimum Proportion of the Variance. The minimum proportion of the variance on all the data that needs to be present at a node in order for splitting to be performed (used only for regression problems). | minVarianceProp |
No Pruning. Determines whether pruning is performed. | noPruning |
Number of Folds. Determines the amount of data used for pruning. One fold is used for pruning, the rest for growing the rules. | numFolds |
Random Number Seed. The seed used for random data shuffling. | seed |
Spread Initial Count. Spread initial count across all values instead of using the count per value. | spreadInitialCount |
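
To illustrate how numFolds controls the amount of data held back for pruning, the sketch below sets aside one fold and keeps the rest for growing. It assumes the instances have already been shuffled (using seed) and that the pruning fold is simply the first 1/numFolds of the data; how the actual implementation assigns instances to folds is not stated in the source, and the function name is hypothetical.

```python
from typing import List, Sequence, Tuple


def split_grow_prune(instances: Sequence, num_folds: int) -> Tuple[List, List]:
    """Hold back one fold for pruning and use the remaining folds for growing."""
    if num_folds < 2:
        raise ValueError("num_folds must be at least 2 to leave data for growing")
    fold_size = len(instances) // num_folds
    prune_set = list(instances[:fold_size])  # one fold for pruning
    grow_set = list(instances[fold_size:])   # remaining folds for growing
    return grow_set, prune_set


grow, prune = split_grow_prune(list(range(9)), num_folds=3)
print(len(grow), len(prune))  # 6 3
```

A larger numFolds therefore means a smaller share of the data is used for pruning.
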
Support Vector Machine
Field Description | Field Name |
---|---|
C. The complexity parameter C. | c |
Filter Type. Determines how/if the data will be transformed. | filterType |
Kernel. The kernel to use. | kernel |
Kernel Parameters. Parameters of the chosen kernel (kernel-specific). | kernelParameters |
Epsilon. The epsilon for round-off error. | epsilon |
Tolerance Parameter. The tolerance parameter. | toleranceParameter |
Build Calibration Models. Determines whether to fit calibration models to the SVM's outputs (for proper probability estimates). | buildClibrationModels |
Calibrator. The calibration method to use. Visible only if buildClibrationModels is set to true. | calibrator |
Calibrator Parameters. Parameters of the calibrator. Visible only if buildClibrationModels is set to true. | calibratorParameters |
Number of Folds. The number of folds for cross-validation used to generate training data for calibration models (-1 means use training data). Visible only if buildClibrationModels is set to true. | calibNumFolds |
Random Number Seed. Random number seed for the cross-validation used to generate training data for calibration models. Visible only if buildClibrationModels is set to true. | calibRandomSeed |
Support Vector Regression
Field Description | Field Name |
---|---|
C. The complexity parameter C. | c |
Filter Type. Determines how/if the data will be transformed. | filterType |
Kernel. The kernel to use. | kernel |
Kernel Parameters. Parameters of the chosen kernel (kernel-specific). | kernelParameters |
Optimizer. The learning algorithm. | regOptimizer |
Optimizer Parameters. Parameters of the Optimizer. | regOptimizerParameters |
Stochastic Gradient Descent
Field Description | Field Name |
---|---|
Do Not Normalize. Determines whether normalization is turned off. | doNotNormalize |
Do Not Replace Missing Values. Determines whether to turn off global replacement of missing values. | doNotReplaceMissingValues |
Number of Epochs. The number of epochs to perform (batch learning). The total number of iterations is the number of epochs multiplied by the number of instances. | epochs |
Lambda. The regularization constant. | lambda |
Learning Rate. Determines the learning rate. If normalization is turned off, then the default learning rate will need to be reduced (e.g. set to 0.0001). | learningRate |
Loss Function. The loss function to use. | lossFunction |
Epsilon. The epsilon threshold for epsilon insensitive and Huber loss. An error with an absolute value less than this threshold has a loss of 0 for epsilon insensitive loss. For Huber loss, this is the boundary between the quadratic and linear parts of the loss function. | epsilon |
Random Number Seed. Seed used to initialize the random number generator. | seed |
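
The role of epsilon in the two loss functions, and the iteration count implied by epochs, can be sketched as follows. These are the common textbook forms of the epsilon insensitive and Huber losses; the exact scaling used by the implementation is not stated in the source, so treat the formulas and function names as assumptions.

```python
def total_iterations(epochs: int, num_instances: int) -> int:
    """Total number of iterations: epochs multiplied by the number of instances."""
    return epochs * num_instances


def epsilon_insensitive_loss(error: float, epsilon: float) -> float:
    """Zero loss inside the epsilon tube, linear loss outside it."""
    return max(0.0, abs(error) - epsilon)


def huber_loss(error: float, epsilon: float) -> float:
    """Quadratic for |error| <= epsilon, linear beyond that boundary."""
    magnitude = abs(error)
    if magnitude <= epsilon:
        return 0.5 * magnitude ** 2
    return epsilon * magnitude - 0.5 * epsilon ** 2


print(total_iterations(10, 500))            # 5000
print(epsilon_insensitive_loss(0.05, 0.1))  # 0.0
print(huber_loss(0.5, 1.0))                 # 0.125
print(huber_loss(2.0, 1.0))                 # 1.5
```
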
Filtered Predictor
Field Description | Field Name |
---|---|
Algorithm. Base algorithm to be used. | algorithm |
Base Algorithm Hyperparameters. Determines the parameters of the selected algorithm. | baseAlgorithmParameters |
Filter. Filter to be used. | filter |
Filter Parameters. Determines the parameters of the selected filter. | filterParameters |
Random Number Seed. Seed used to initialize the random number generator. | seed |
Hoeffding Tree
Field Description | Field Name |
---|---|
Grace Period. Number of instances (or total weight of instances) a leaf should observe between split attempts. | gracePeriod |
Hoeffding Tie Threshold. Threshold below which a split will be forced to break ties. | hoeffdingTieThreshold |
Leaf Prediction Strategy. The leaf prediction strategy to use. | leafPredictionStrategy |
Naive Bayes Prediction Threshold. The number of instances (weight) a leaf should observe before allowing naive Bayes (adaptive) to make predictions. | naiveBayesPredictionThreshold |
Print Leaf Models. Determines whether to output the leaf models in the trained unit info (naive Bayes leaves only). | outputLeafModels |
Split Confidence. The allowable error in a split decision. Values closer to zero will take longer to decide. | splitConfidence |
Splitting Criterion. The splitting criterion to use. | splitCriterion |
Minimum Fraction Of Weight by Information Gain. Minimum fraction of weight required down at least two branches for information gain splitting. | minimumFractionOfWeightInfoGain |
Multiclass Updateable Classifier
Field Description | Field Name |
---|---|
Base Algorithm. Base algorithm to be used. | baseAlgorithm |
Base Algorithm Hyperparameters. Determines the parameters of the selected algorithm. | baseAlgorithmParameters |
Method. Sets the method to use for transforming the multi-class problem into several 2-class ones. | method |
Log Loss Decoding. Determines whether to use log loss decoding for random or exhaustive codes. | logLossDecoding |
Width Factor. Sets the width multiplier when using random codes. The number of codes generated will be this number multiplied by the number of classes. | randomWidthFactor |
Use Pairwise Coupling. Determines whether to use pairwise coupling. | usePairwiseCoupling |
Random Number Seed. Seed used to initialize the random number generator. | seed |
Kernel Parameters
Parameters common to all kernels:
Field Description | Field Name |
---|---|
Cache Size. The size of the cache (a prime number). Use 0 for a full cache and -1 to turn the cache off. | kernelCacheSize |
Parameters specific to a kernel:
Pearson VII Function Kernel
Field Description | Field Name |
---|---|
Omega. The omega value. | kernelOmega |
Sigma. The sigma value. | kernelSigma |
Polynomial and Normalized Polynomial Kernels
Field Description | Field Name |
---|---|
Degree. The exponent value. | kernelExponent |
Use Lower Order. Determines whether to use lower-order terms. | kernelUseLowerOrder |
Radial Basis Function (RBF) Kernel
Field Description | Field Name |
---|---|
Gamma. The gamma value. | kernelGamma |
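
To show what the exponent, lower-order, and gamma parameters control, here is a minimal sketch of the polynomial, normalized polynomial, and RBF kernels in their common textbook forms. The source does not give the formulas, so it is an assumption that the implementation follows these definitions; the function names are hypothetical.

```python
import math
from typing import Sequence


def dot(x: Sequence[float], y: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, exponent: float, use_lower_order: bool = False) -> float:
    """(x.y)^p, or (x.y + 1)^p when lower-order terms are used."""
    base = dot(x, y) + (1.0 if use_lower_order else 0.0)
    return base ** exponent


def normalized_polynomial_kernel(x, y, exponent: float, use_lower_order: bool = False) -> float:
    """K(x, y) / sqrt(K(x, x) * K(y, y)) for the polynomial kernel K."""
    k_xy = polynomial_kernel(x, y, exponent, use_lower_order)
    k_xx = polynomial_kernel(x, x, exponent, use_lower_order)
    k_yy = polynomial_kernel(y, y, exponent, use_lower_order)
    return k_xy / math.sqrt(k_xx * k_yy)


def rbf_kernel(x, y, gamma: float) -> float:
    """exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


print(polynomial_kernel([1, 2], [3, 4], exponent=2))                        # 121.0
print(polynomial_kernel([1, 2], [3, 4], exponent=2, use_lower_order=True))  # 144.0
print(normalized_polynomial_kernel([1, 2], [3, 4], exponent=2))             # 0.968
```
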
Kernel Parameters Used By the One-Class Support Vector Machine Algorithm
Parameters common to all kernels:
Field Description | Field Name |
---|---|
Cache Size. The cache size in Mb. | kernelSvmCacheSize |
Polynomial Kernel
Field Description | Field Name |
---|---|
Coefficient0. Independent term in kernel function. | kernelSvmCoefficient0 |
Degree. The exponent value. | kernelSvmDegree |
Gamma. The gamma to use, if 0 then 1/max_index is used. | kernelSvmGamma |
Radial Basis Function (RBF) Kernel
Field Description | Field Name |
---|---|
Gamma. The gamma to use, if 0 then 1/max_index is used. | kernelSvmGamma |
Sigmoid Kernel
Field Description | Field Name |
---|---|
Coefficient0. Independent term in kernel function. | kernelSvmCoefficient0 |
Gamma. The gamma to use, if 0 then 1/max_index is used. | kernelSvmGamma |
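
The gamma fallback and the role of the independent term can be sketched as below. The kernel formulas are the forms commonly used by LIBSVM-style implementations, and max_index is taken to mean the highest attribute index (in effect, the number of attributes); both points are assumptions, since the source does not spell them out. The function names are hypothetical.

```python
import math
from typing import Sequence


def dot(x: Sequence[float], y: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(x, y))


def effective_gamma(gamma: float, max_index: int) -> float:
    """Use 1/max_index when gamma is 0, otherwise the configured gamma."""
    return 1.0 / max_index if gamma == 0 else gamma


def polynomial_kernel(x, y, gamma: float, coefficient0: float, degree: int) -> float:
    """(gamma * x.y + coefficient0)^degree."""
    return (gamma * dot(x, y) + coefficient0) ** degree


def sigmoid_kernel(x, y, gamma: float, coefficient0: float) -> float:
    """tanh(gamma * x.y + coefficient0)."""
    return math.tanh(gamma * dot(x, y) + coefficient0)


gamma = effective_gamma(0, max_index=4)
print(gamma)                                                             # 0.25
print(polynomial_kernel([1, 2, 0, 1], [2, 1, 3, 0], gamma, 1.0, 2))      # 4.0
```

In the last call, the dot product is 4, so the kernel value is (0.25 * 4 + 1)^2 = 4.
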
Optimizer Parameters
Parameters common to all optimizers:
Field Description | Field Name |
---|---|
Epsilon. The epsilon for round-off error. | epsilon |
Epsilon Parameter. The epsilon parameter of the epsilon insensitive loss function. | epsilonParameter |
Random Number Seed. Seed used to initialize the random number generator. | seed |
Parameters of the RegSMO Improved optimizer:
Field Description | Field Name |
---|---|
Tolerance. Tolerance parameter used for checking the stopping criterion (b_up is less than b_low + 2*tol). | tolerance |
Use Variant 1. Set to true to use variant 1 of the algorithm described in the following paper; otherwise, variant 2 is used. S.K. Shevade, S.S. Keerthi, C. Bhattacharyya, K.R.K. Murthy: Improvements to the SMO Algorithm for SVM Regression. In: IEEE Transactions on Neural Networks, 1999. | useVariant1 |
Clustering Parameters
Parameters for clustering algorithms.
Simple K Means
Field Description | Field Name |
---|---|
Random Number Seed. The initial seed value for the random number generator used in the algorithm. | seed |
Number of Clusters. The number of clusters to be generated by the algorithm. | numClusters |
Number of Execution Slots. The number of parallel executions that can be performed by the algorithm. | numExecutionSlots |
Maximum Number of Iterations. The maximum number of iterations the algorithm can perform. | maxIterations |
Faster Distance Calculations. A flag to indicate if faster distance calculation methods should be used. | fasterDistanceCalc |
Do Not Replace Missing Values. A flag to indicate if missing values in the data should not be replaced. | dontReplaceMissingValues |
Display Standard Deviations. A flag to indicate if the standard deviations should be displayed. | displayStdDevs |
Canopy T1 Distance. The distance metric used in the first phase of the canopy clustering. | canopyT1 |
Canopy T2 Distance. The distance metric used in the second phase of the canopy clustering. | canopyT2 |
Canopy Periodic Pruning Rate. The rate at which the canopy tree is pruned in each periodic pruning cycle. | canopyPeriodicPruningRate |
Minimum Canopy Density. The minimum density of the canopy tree. | canopyMinimumCanopyDensity |
Maximum Number of Canopies to Hold in Memory. The maximum number of canopies that can be held in memory at a given time. | canopyMaxNumCanopiesToHoldInMemory |
Hierarchical Clustering
Field Description | Field Name |
---|---|
Number of clusters. The number of clusters to be generated by the algorithm. | numClusters |
Distance is branch length. A flag to indicate if the distance between clusters should be represented as the length of the branch joining them. | distanceIsBranchLength |
Print hierarchy in Newick format. A flag to indicate if the hierarchy should be printed in Newick format. | printNewick |
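
For readers unfamiliar with the Newick format, the sketch below serializes a small cluster hierarchy with branch lengths into a Newick string. The tree representation and the function names are hypothetical and chosen only for illustration; the exact string produced by the algorithm when printNewick is enabled may differ.

```python
def _subtree(node) -> str:
    """Serialize one node, given as (label_or_children, branch_length)."""
    label_or_children, length = node
    if isinstance(label_or_children, str):
        text = label_or_children  # leaf: just its name
    else:
        # internal node: its children in parentheses, separated by commas
        text = "(" + ",".join(_subtree(child) for child in label_or_children) + ")"
    return f"{text}:{length}"


def to_newick(root_children) -> str:
    """Serialize the children of the root into a Newick string ending in ';'."""
    return "(" + ",".join(_subtree(child) for child in root_children) + ");"


# Leaves C and D are merged first; the resulting cluster joins A and B at the root.
hierarchy = [("A", 0.1), ("B", 0.2), ([("C", 0.3), ("D", 0.4)], 0.5)]
print(to_newick(hierarchy))  # (A:0.1,B:0.2,(C:0.3,D:0.4):0.5);
```

Each number after a colon is the length of the branch that joins that leaf or cluster to its parent.
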
Density Based Clustering
Field Description | Field Name |
---|---|
Number of clusters. The number of clusters to be generated by the algorithm. | numClusters |
Minimum standard deviation. The minimum standard deviation of the clusters to be generated by the algorithm. | minStdDev |
Filtered Predictor
Field Description | Field Name |
---|---|
Base Algorithm. Base algorithm to use for filtering the data. | baseAlgorithm |
Base Algorithm Hyperparameters. Hyperparameters of the base algorithm. | baseAlgorithmHyperparameters |
Filter. The type of filter to be applied to the data: replace missing values or remove missing values. | filterType |
Filter Parameters. If replace missing values is selected for Filter, select whether to ignore the label field. | filterParameters |
Random Number Seed. The initial seed value for the random number generator used in the algorithm. | randomNumberSeed |