Algorithm Hyperparameters

Hyperparameters are parameters of a machine learning algorithm that are set before training begins.

Parameters common to all algorithms:

Field Description

Field Name

Batch Size. The preferred number of instances to process if batch prediction is being performed. More or fewer instances may be provided, but this gives implementations a chance to specify a preferred batch size.

batchSize

Number of Decimal Places. The number of decimal places to be used for the output of numbers shown in the trained unit info that is returned by the Train function.

numDecimalPlaces
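
As an illustration of how the two common parameters might be applied, the following Python sketch splits instances into batches of the preferred size and formats a number with a fixed number of decimal places. The helper names iter_batches and format_number are hypothetical and are not part of the product.

```python
from typing import Iterator, List, Sequence


def iter_batches(instances: Sequence, batch_size: int) -> Iterator[List]:
    """Yield consecutive batches of at most batch_size instances (cf. batchSize)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(instances), batch_size):
        yield list(instances[start:start + batch_size])


def format_number(value: float, num_decimal_places: int) -> str:
    """Format a number with a fixed number of decimal places (cf. numDecimalPlaces)."""
    return f"{value:.{num_decimal_places}f}"
```

The last batch may be smaller than the preferred size, which matches the note that more or fewer instances may be provided.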

Parameters specific to an algorithm:

Linear Regression

Field Description

Field Name

Attribute Selection Method. Sets the method used to select attributes. Available methods are: no attribute selection, attribute selection using M5's method (step through the attributes removing the one with the smallest standardized coefficient until no improvement is observed in the estimate of the error given by the Akaike information criterion), and a greedy selection using the Akaike information metric.

attributeSelectionMethod

Eliminate Collinear Attributes. Sets whether or not collinear attributes are eliminated.

eliminateCollinearAttributes

Minimal. If enabled, means and standard deviations get discarded to conserve memory. As a consequence, the trained unit info that is returned by the Train function is truncated.

minimal

Ridge Parameter. The value of the ridge parameter for the L2 regularization.

ridge

Output Additional Statistics. Determines whether to output additional statistics (such as standard deviation of coefficients and t-statistics) in the trained unit info for regression analysis.

outputAdditionalStats
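
To show the effect of the ridge parameter, the following Python sketch fits a single-attribute model without an intercept by minimizing the squared error plus an L2 penalty. This is the textbook closed form under simplifying assumptions, not the product's implementation, and the function name ridge_slope is hypothetical.

```python
from typing import Sequence


def ridge_slope(xs: Sequence[float], ys: Sequence[float], ridge: float) -> float:
    """Return w minimizing sum((y - w * x)^2) + ridge * w^2 for one attribute."""
    numerator = sum(x * y for x, y in zip(xs, ys))
    # The ridge term is added to the denominator and shrinks w toward zero.
    denominator = sum(x * x for x in xs) + ridge
    return numerator / denominator
```

With ridge set to 0 this reduces to ordinary least squares; larger values shrink the coefficient.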

Logistic Regression

Field Description

Field Name

Maximum Number of Iterations. Maximum number of iterations to perform.

maxIts

Ridge Parameter. The value of the ridge parameter.

ridge

Use Conjugate Gradient Descent. Use conjugate gradient descent rather than BFGS updates; faster for problems with many parameters.

useConjugateGradientDescent

Multilayer Perceptron

Field Description

Field Name

Decay. If set to true, the learning rate decreases during training: the starting learning rate is divided by the epoch number to determine the current learning rate. This may help to stop the network from diverging from the target output and may improve general performance.

decay

Hidden Layers. Defines the hidden layers of the neural network as a comma-separated list of positive whole numbers, one for each hidden layer. To have no hidden layers, enter a single 0. The following wildcard values are also available: 'a' = (attributes + classes) / 2, 'i' = attributes, 'o' = classes, 't' = attributes + classes.

hiddenLayers

Learning Rate. The amount the weights are updated.

learningRate

Momentum. Momentum applied to the weights during updating.

momentum

Nominal to Binary Filter. If enabled, the filter will be used to preprocess the instances. This could help improve performance if there are nominal attributes in the data.

nominalToBinaryFilter

Normalize Attributes. Determines whether to normalize the attributes. This could help improve performance of the network. Nominal attributes will be normalized as well (after they have been run through the nominal to binary filter if that is in use) so that the nominal values are between -1 and 1.

normalizeAttributes

Normalize Label Values. Determines whether to normalize the label column values if they are numeric. This could help improve performance of the network. The values are normalized to be between -1 and 1. Note that the normalization is internal only; the output is scaled back to the original range.

normalizeLabelValues

Reset. Setting this to true will allow the network to reset with a lower learning rate. If the network diverges from the answer this will automatically reset the network with a lower learning rate and begin training again. Note that if the network diverges but isn't allowed to reset it will fail the training process and return an error message.

reset

Random Number Seed. Seed used to initialize the random number generator. Random numbers are used for setting the initial weights of the connections between nodes, and also for shuffling the training data.

seed

Training Time. The number of epochs to train through. If the validation set size is non-zero, training can terminate before this number of epochs is reached.

trainingTime

Validation Set Size. The percentage size of the validation set. The training will continue until it is observed that the error on the validation set has been consistently getting worse, or if the training time is reached. If this is set to zero no validation set will be used and instead the network will train for the specified number of epochs.

validationSetSize

Validation Threshold. Used to terminate validation testing. The value here dictates how many times in a row the validation set error can get worse before training is terminated.

validationThreshold
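
The hiddenLayers wildcards, the decay rule, and the validationThreshold stopping rule described above can be sketched in Python as follows. All function names are hypothetical, the use of integer division for 'a' is an assumption, and epochs are assumed to be numbered from 1.

```python
from typing import List, Sequence


def resolve_hidden_layers(spec: str, num_attributes: int, num_classes: int) -> List[int]:
    """Turn a hiddenLayers string such as "a,3" into a list of layer sizes."""
    wildcards = {
        "a": (num_attributes + num_classes) // 2,  # assumption: integer division
        "i": num_attributes,
        "o": num_classes,
        "t": num_attributes + num_classes,
    }
    sizes = []
    for token in spec.split(","):
        token = token.strip()
        sizes.append(wildcards[token] if token in wildcards else int(token))
    # A single 0 means that the network has no hidden layers.
    return [] if sizes == [0] else sizes


def decayed_learning_rate(starting_rate: float, epoch: int) -> float:
    """Divide the starting learning rate by the epoch number (epoch >= 1)."""
    return starting_rate / epoch


def should_stop(validation_errors: Sequence[float], validation_threshold: int) -> bool:
    """Return True once the validation error got worse validation_threshold times in a row."""
    worse_in_a_row = 0
    for previous, current in zip(validation_errors, validation_errors[1:]):
        worse_in_a_row = worse_in_a_row + 1 if current > previous else 0
        if worse_in_a_row >= validation_threshold:
            return True
    return False
```

For example, with 4 attributes and 2 classes, the specification "a,3" would resolve to two hidden layers of 3 nodes each.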

Naive Bayes Classifier

Field Description

Field Name

Use Kernel Estimator. Determines whether to use a kernel estimator for numeric attributes rather than a normal distribution.

useKernelEstimator

Use Supervised Discretization. Determines whether to use supervised discretization to convert numeric attributes to nominal ones.

useSupervisedDiscretization

One-Class Support Vector Machine

Field Description

Field Name

Do Not Replace Missing Values. Determines whether to turn off automatic replacement of missing values. WARNING: set to true only if the data does not contain missing values.

doNotReplaceMissingValues

Kernel. The kernel to use.

svmKernel

Kernel Parameters. Parameters of the chosen kernel.

svmKernelParameters

Normalize. Determines whether to normalize the data.

normalize

Nu. The value of nu.

nu

Random Number Seed. Seed used to initialize the random number generator.

seed

Shrinking. Determines whether to use the shrinking heuristic.

shrinking

Tolerance Parameter. The tolerance of the termination criterion.

toleranceParameter

Random Forest

Field Description

Field Name

Bag Size Percentage. Size of each bag, as a percentage of the training set size.

bagSizePercent

Break Ties Randomly. Break ties randomly when several attributes look equally good.

breakTiesRandomly

Calculate Out-of-bag Error. Determines whether the out-of-bag error is calculated.

calcOutOfBag

Compute Attribute Importance. Compute attribute importance via mean impurity decrease.

computeAttributeImportance

Maximum Depth of the Tree. The maximum depth of the tree, 0 for unlimited.

maxDepth

Number of Execution Slots. The number of execution slots (threads) to use for constructing the ensemble.

numExecutionSlots

Number of Features. Sets the number of randomly chosen attributes. If 0, int(log_2(num_predictors) + 1) is used.

numFeatures

Number of Iterations. The number of iterations to be performed.

numIterations

Output Out-of-bag Complexity Statistics. Determines whether to output complexity-based statistics in the trained unit info when out-of-bag evaluation is performed.

outputOutOfBagComplexityStats

Output Classifiers. Determines whether to output the individual classifiers in the trained unit info.

outputClassifiers

Random Number Seed. Seed used to initialize the random number generator.

seed
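
The default for numFeatures and the meaning of bagSizePercent can be illustrated with a short Python sketch. The function names are hypothetical, and rounding the bag size down to a whole number is an assumption.

```python
import math


def default_num_features(num_predictors: int) -> int:
    """Number of randomly chosen attributes used when numFeatures is 0."""
    return int(math.log2(num_predictors) + 1)


def bag_size(num_training_instances: int, bag_size_percent: float) -> int:
    """Size of each bag as a percentage of the training set size (rounded down)."""
    return int(num_training_instances * bag_size_percent / 100)
```

For 16 predictors, the default is int(4 + 1) = 5 randomly chosen attributes.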

Reduced Error Pruning (REP) Decision Tree

Field Description

Field Name

Initial Count. Initial class value count.

initialCount

Maximum Depth of the Tree. The maximum tree depth (-1 for no restriction).

maxDepth

Minimum Number of Instances. The minimum total weight of the instances in a leaf.

minNum

Minimum Proportion of the Variance. The minimum proportion of the variance on all the data that needs to be present at a node in order for splitting to be performed (used only for regression problems).

minVarianceProp

No Pruning. Determines whether pruning is turned off. If enabled, no pruning is performed.

noPruning

Number of Folds. Determines the amount of data used for pruning. One fold is used for pruning, the rest for growing the rules.

numFolds

Random Number Seed. The seed used for random data shuffling.

seed

Spread Initial Count. Spread initial count across all values instead of using the count per value.

spreadInitialCount
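
How numFolds and seed divide the data between growing and pruning can be sketched in Python as follows. The function name split_for_pruning is hypothetical, and taking the first fold of the shuffled data as the pruning fold is an assumption.

```python
import random
from typing import List, Sequence, Tuple


def split_for_pruning(instances: Sequence, num_folds: int, seed: int) -> Tuple[List, List]:
    """Shuffle the data, then hold out one fold for pruning and keep the rest for growing."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)  # seeded shuffle, so the split is repeatable
    fold_size = len(shuffled) // num_folds
    pruning = shuffled[:fold_size]
    growing = shuffled[fold_size:]
    return growing, pruning
```

With 90 instances and numFolds set to 3, 30 instances would be used for pruning and 60 for growing.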

Support Vector Machine

Field Description

Field Name

C. The complexity parameter C.

c

Filter Type. Determines how/if the data will be transformed.

filterType

Kernel. The kernel to use.

kernel

Kernel Parameters. Parameters of the chosen kernel (kernel-specific).

kernelParameters

Epsilon. The epsilon for round-off error.

epsilon

Tolerance Parameter. The tolerance parameter.

toleranceParameter

Build Calibration Models. Determines whether to fit calibration models to the SVM's outputs (for proper probability estimates).

buildClibrationModels

Calibrator. The calibration method to use. Visible only if buildClibrationModels is set to true.

calibrator

Calibrator Parameters. Parameters of the calibrator. Visible only if buildClibrationModels is set to true.

calibratorParameters

Number of Folds. The number of folds for cross-validation used to generate training data for calibration models (-1 means use training data). Visible only if buildClibrationModels is set to true.

calibNumFolds

Random Number Seed. Random number seed for the cross-validation used to generate training data for calibration models. Visible only if buildClibrationModels is set to true.

calibRandomSeed

Support Vector Regression

Field Description

Field Name

C. The complexity parameter C.

c

Filter Type. Determines how/if the data will be transformed.

filterType

Kernel. The kernel to use.

kernel

Kernel Parameters. Parameters of the chosen kernel (kernel-specific).

kernelParameters

Optimizer. The learning algorithm.

regOptimizer

Optimizer Parameters. Parameters of the Optimizer.

regOptimizerParameters

Stochastic Gradient Descent

Field Description

Field Name

Do Not Normalize. Determines whether normalization is turned off.

doNotNormalize

Do Not Replace Missing Values. Determines whether to turn off global replacement of missing values.

doNotReplaceMissingValues

Number of Epochs. The number of epochs to perform (batch learning). The total number of iterations is the number of epochs multiplied by the number of instances.

epochs

Lambda. The regularization constant.

lambda

Learning Rate. Determines the learning rate. If normalization is turned off, then the default learning rate will need to be reduced (e.g. set to 0.0001).

learningRate

Loss Function. The loss function to use.

lossFunction

Epsilon. The epsilon threshold for epsilon insensitive and Huber loss. An error with an absolute value less than this threshold has a loss of 0 for epsilon insensitive loss. For Huber loss, this is the boundary between the quadratic and linear parts of the loss function.

epsilon

Random Number Seed. Seed used to initialize the random number generator.

seed
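
The role of epsilon in the two loss functions, and the iteration count implied by epochs, can be sketched in Python as follows. These are the standard textbook forms of the losses and are not taken from the product's implementation; the function names are hypothetical.

```python
def epsilon_insensitive_loss(error: float, epsilon: float) -> float:
    """Loss is 0 while the absolute error stays below epsilon, then grows linearly."""
    return max(0.0, abs(error) - epsilon)


def huber_loss(error: float, epsilon: float) -> float:
    """Quadratic for absolute errors up to epsilon, linear beyond that boundary."""
    magnitude = abs(error)
    if magnitude <= epsilon:
        return 0.5 * error * error
    return epsilon * (magnitude - 0.5 * epsilon)


def total_iterations(epochs: int, num_instances: int) -> int:
    """Total number of iterations: epochs multiplied by the number of instances."""
    return epochs * num_instances
```

The two branches of huber_loss agree at the boundary, so the loss is continuous where it switches from quadratic to linear.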

Filtered Predictor

Field Description

Field Name

Algorithm. Base algorithm to be used.

algorithm

Base Algorithm Hyperparameters. Determines the parameters of the selected algorithm.

baseAlgorithmParameters

Filter. Filter to be used.

filter

Filter Parameters. Determines the parameters of the selected filter.

filterParameters

Random Number Seed. Seed used to initialize the random number generator.

seed

Hoeffding Tree

Field Description

Field Name

Grace Period. Number of instances (or total weight of instances) a leaf should observe between split attempts.

gracePeriod

Hoeffding Tie Threshold. Threshold below which a split will be forced to break ties.

hoeffdingTieThreshold

Leaf Prediction Strategy. The leaf prediction strategy to use.

leafPredictionStrategy

Naive Bayes Prediction Threshold. The number of instances (weight) a leaf should observe before allowing naive Bayes (adaptive) to make predictions.

naiveBayesPredictionThreshold

Print Leaf Models. Determines whether to output the leaf models in the trained unit info (naive Bayes leaves only).

outputLeafModels

Split Confidence. The allowable error in a split decision. Values closer to zero will take longer to decide.

splitConfidence

Splitting Criterion. The splitting criterion to use.

splitCriterion

Minimum Fraction Of Weight by Information Gain. Minimum fraction of weight required down at least two branches for information gain splitting.

minimumFractionOfWeightInfoGain
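
The interplay of splitConfidence, hoeffdingTieThreshold, and the number of observed instances can be sketched with the standard Hoeffding bound. This Python sketch follows the usual textbook formulation of Hoeffding trees and is not necessarily how the product implements it; the function names and the value_range argument are hypothetical.

```python
import math


def hoeffding_bound(value_range: float, split_confidence: float, num_instances: float) -> float:
    """Hoeffding bound: sqrt(R^2 * ln(1 / delta) / (2 * n))."""
    return math.sqrt(
        value_range ** 2 * math.log(1.0 / split_confidence) / (2.0 * num_instances)
    )


def should_split(best_merit: float, second_best_merit: float, value_range: float,
                 split_confidence: float, tie_threshold: float,
                 num_instances: float) -> bool:
    """Split when the best attribute wins clearly, or force a split to break a tie."""
    bound = hoeffding_bound(value_range, split_confidence, num_instances)
    clear_winner = best_merit - second_best_merit > bound
    tie = bound < tie_threshold
    return clear_winner or tie
```

A smaller split confidence produces a larger bound, so more instances are needed before a split is accepted. This is consistent with values closer to zero taking longer to decide.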

Multiclass Updateable Classifier

Field Description

Field Name

Base Algorithm. Base algorithm to be used.

baseAlgorithm

Base Algorithm Hyperparameters. Determines the parameters of the selected algorithm.

baseAlgorithmParameters

Method. Sets the method to use for transforming the multi-class problem into several 2-class ones.

method

Log Loss Decoding. Determines whether to use log loss decoding for random or exhaustive codes.

logLossDecoding

Width Factor. Sets the width multiplier when using random codes. The number of codes generated will be this number multiplied by the number of classes.

randomWidthFactor

Use Pairwise Coupling. Determines whether to use pairwise coupling.

usePairwiseCoupling

Random Number Seed. Seed used to initialize the random number generator.

seed

Kernel Parameters

Parameters common to all kernels:

Field Description

Field Name

Cache Size. The size of the cache (a prime number), 0 for full cache and -1 to turn it off.

kernelCacheSize

Parameters specific to a kernel:

Pearson VII Function Kernel

Field Description

Field Name

Omega. The omega value.

kernelOmega

Sigma. The sigma value.

kernelSigma

Polynomial and Normalized Polynomial Kernels

Field Description

Field Name

Degree. The exponent value.

kernelExponent

Use Lower Order. Determines whether to use lower-order terms.

kernelUseLowerOrder

Radial Basis Function (RBF) Kernel

Field Description

Field Name

Gamma. The gamma value.

kernelGamma
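
To make the roles of kernelExponent, kernelUseLowerOrder, and kernelGamma concrete, the following Python sketch uses the common textbook definitions of the polynomial, normalized polynomial, and RBF kernels. The exact formulas used by the product are an assumption here, and the function names are hypothetical.

```python
import math
from typing import Sequence


def dot(x: Sequence[float], y: Sequence[float]) -> float:
    """Inner product of two attribute vectors."""
    return sum(a * b for a, b in zip(x, y))


def poly_kernel(x: Sequence[float], y: Sequence[float], exponent: float,
                use_lower_order: bool = False) -> float:
    """(x . y)^p, or (x . y + 1)^p when lower-order terms are used."""
    base = dot(x, y) + (1.0 if use_lower_order else 0.0)
    return base ** exponent


def normalized_poly_kernel(x: Sequence[float], y: Sequence[float], exponent: float,
                           use_lower_order: bool = False) -> float:
    """K(x, y) / sqrt(K(x, x) * K(y, y))."""
    k_xy = poly_kernel(x, y, exponent, use_lower_order)
    k_xx = poly_kernel(x, x, exponent, use_lower_order)
    k_yy = poly_kernel(y, y, exponent, use_lower_order)
    return k_xy / math.sqrt(k_xx * k_yy)


def rbf_kernel(x: Sequence[float], y: Sequence[float], gamma: float) -> float:
    """exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)
```

A larger gamma makes the RBF kernel fall off faster with distance, and a higher exponent lets the polynomial kernel model higher-order interactions.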

Kernel Parameters Used By the One-Class Support Vector Machine Algorithm

Parameters common to all kernels:

Field Description

Field Name

Cache Size. The cache size in MB.

kernelSvmCacheSize

Polynomial Kernel

Field Description

Field Name

Coefficient0. Independent term in kernel function.

kernelSvmCoefficient0

Degree. The exponent value.

kernelSvmDegree

Gamma. The gamma to use. If 0, then 1/max_index is used.

kernelSvmGamma

Radial Basis Function (RBF) Kernel

Field Description

Field Name

Gamma. The gamma to use. If 0, then 1/max_index is used.

kernelSvmGamma

Sigmoid Kernel

Field Description

Field Name

Coefficient0. Independent term in kernel function.

kernelSvmCoefficient0

Gamma. The gamma to use. If 0, then 1/max_index is used.

kernelSvmGamma
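
The gamma default and the way kernelSvmGamma, kernelSvmCoefficient0, and kernelSvmDegree enter the kernel functions can be sketched in Python as follows. The formulas are the ones commonly used by LIBSVM-style implementations. Whether the product uses exactly these forms, and the reading of max_index as the number of attributes, are assumptions. All function names are hypothetical.

```python
import math
from typing import Sequence


def effective_gamma(gamma: float, max_index: int) -> float:
    """Use 1 / max_index when gamma is 0 (max_index assumed to be the attribute count)."""
    return 1.0 / max_index if gamma == 0 else gamma


def _dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def svm_polynomial_kernel(u: Sequence[float], v: Sequence[float], gamma: float,
                          coefficient0: float, degree: int) -> float:
    """(gamma * u . v + coefficient0)^degree."""
    return (gamma * _dot(u, v) + coefficient0) ** degree


def svm_rbf_kernel(u: Sequence[float], v: Sequence[float], gamma: float) -> float:
    """exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def svm_sigmoid_kernel(u: Sequence[float], v: Sequence[float], gamma: float,
                       coefficient0: float) -> float:
    """tanh(gamma * u . v + coefficient0)."""
    return math.tanh(gamma * _dot(u, v) + coefficient0)
```

For data with 4 attributes, leaving gamma at 0 would give an effective gamma of 0.25 under this reading.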

Optimizer Parameters

Parameters common to all optimizers:

Field Description

Field Name

Epsilon. The epsilon for round-off error.

epsilon

Epsilon Parameter. The epsilon parameter of the epsilon insensitive loss function.

epsilonParameter

Random Number Seed. Seed used to initialize the random number generator.

seed

Parameters of the RegSMO Improved

Field Description

Field Name

Tolerance. Tolerance parameter used for checking the stopping criterion (b_up is less than b_low + 2*tol).

tolerance

Use Variant 1. Set true to use variant 1 of the paper given below, otherwise use variant 2.

S.K. Shevade, S.S. Keerthi, C. Bhattacharyya, K.R.K. Murthy: Improvements to the SMO Algorithm for SVM Regression. In: IEEE Transactions on Neural Networks, 1999

useVariant1

Clustering Parameters

Parameters for clustering algorithms.

Simple K Means

Field Description

Field Name

Random Number Seed. The initial seed value for the random number generator used in the algorithm.

seed

Number of clusters. The number of clusters to be generated by the algorithm.

numClusters

Number of Execution Slots. The number of parallel executions that can be performed by the algorithm.

numExecutionSlots

Maximum number of iterations. The maximum number of iterations the algorithm can perform.

maxIterations

Faster distance calculations. A flag to indicate if faster distance calculation methods should be used.

fasterDistanceCalc

Do Not Replace Missing Values. A flag to indicate if missing values in the data should not be replaced.

dontReplaceMissingValues

Display standard deviations. A flag to indicate if the standard deviations should be displayed.

displayStdDevs

Canopy T1 distance. The distance metric used in the first phase of the canopy clustering.

canopyT1

Canopy T2 distance. The distance metric used in the second phase of the canopy clustering.

canopyT2

Canopy periodic pruning rate. The rate at which the canopy tree is pruned in each periodic pruning cycle.

canopyPeriodicPruningRate

Minimum canopy density. The minimum density of the canopy tree.

canopyMinimumCanopyDensity

Max number of canopies to hold in memory. The maximum number of canopies that can be held in memory at a given time.

canopyMaxNumCanopiesToHoldInMemory
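
How seed, numClusters, and maxIterations shape a k-means run can be illustrated with a minimal Python sketch. It is a plain textbook k-means using squared Euclidean distance, with the initial centroids drawn at random from the data. It leaves out the canopy options, missing-value handling, and parallel execution, and it is not the product's implementation. The function names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def simple_k_means(points: Sequence[Point], num_clusters: int, max_iterations: int,
                   seed: int) -> Tuple[List[Point], List[int]]:
    """Return (centroids, assignments) after at most max_iterations passes."""
    rng = random.Random(seed)  # the seed fixes the choice of initial centroids
    centroids = rng.sample(list(points), num_clusters)
    assignments: List[int] = []
    for _ in range(max_iterations):
        # Assign every point to its nearest centroid.
        new_assignments = [
            min(range(num_clusters), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed its cluster
        assignments = new_assignments
        # Move each centroid to the mean of its members.
        for c in range(num_clusters):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments
```

Running the sketch twice with the same seed gives the same clusters, which is the purpose of a fixed random number seed.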

Hierarchical Clustering

Field Description

Field Name

Number of clusters. The number of clusters to be generated by the algorithm.

numClusters

Distance is branch length. A flag to indicate if the distance between clusters should be represented as the length of the branch joining them.

distanceIsBranchLength

Print hierarchy in Newick format. A flag to indicate if the hierarchy should be printed in Newick format.

printNewick

Density Based Clustering

Field Description

Field Name

Number of clusters. The number of clusters to be generated by the algorithm.

numClusters

Minimum standard deviation. The minimum standard deviation of the clusters to be generated by the algorithm.

minStdDev

Filtered Predictor

Field Description

Field Name

Base algorithm. Base algorithm to use for filtering the data.

baseAlgorithm

Base algorithm hyperparameters. Hyperparameters of the base algorithm.

baseAlgorithmHyperparameters

Filter. The type of filter to be applied to the data: replace missing values or remove missing values.

filterType

Filter Parameters. If replace missing values is selected for Filter, determines whether to ignore the label field. If true, the label field is temporarily unset before the filter is applied.

filterParameters

Random Number Seed. The initial seed value for the random number generator used in the algorithm.

randomNumberSeed
