Algorithms

The number of algorithms supported by the Iotellect Machine Learning module is constantly growing. The current set of algorithms is presented below.

Algorithms for regression problems:

| Algorithm | Description |
| --- | --- |
| Linear Regression | Multiple linear regression with L2 regularization, attribute selection, and the option to eliminate collinear attributes. Supports numeric, nominal and Date attributes. Supports weighted instances. |
| Support Vector Regression | Implementation of a support vector machine for regression. Several popular kernels are supported. The input values can be normalized or standardized if required. Supports numeric and nominal attributes. Supports weighted instances. |
| REP Decision Tree | Fast decision tree with reduced-error pruning. Supports numeric, nominal and Date attributes. Supports weighted instances. |
| Random Forest | Forest of random trees. Supports numeric, nominal and Date attributes. Supports weighted instances. |
| Multilayer Perceptron | Implementation of a feed-forward neural network trained with backpropagation. The activation function of the nodes in all hidden layers is the sigmoid function; the nodes in the output layer are unthresholded linear units. The input values can be normalized if required. Supports numeric, nominal and Date attributes. Supports weighted instances. |
| Stochastic Gradient Descent | Learns various linear models (binary-class SVM, binary-class logistic regression, and linear regression with squared, Huber, or epsilon-insensitive loss) via stochastic gradient descent. Globally replaces all missing values and transforms nominal attributes into binary ones. Also normalizes all attributes, so the coefficients in the output are based on the normalized data. Updateable. |
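
To illustrate the idea behind the Stochastic Gradient Descent learner, here is a minimal pure-Python sketch of fitting a one-attribute linear model with squared loss, one instance at a time. The function name, learning rate, and data below are illustrative assumptions, not part of the module's API:

```python
import random

def sgd_linear_regression(samples, lr=0.05, epochs=200, seed=0):
    """Fit y = w*x + b by stochastic gradient descent on squared loss."""
    rng = random.Random(seed)
    samples = list(samples)                  # work on a copy before shuffling
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(samples)                 # visit instances in random order
        for x, y in samples:
            err = (w * x + b) - y            # derivative of 0.5*(pred - y)^2 w.r.t. pred
            w -= lr * err * x                # one small step per instance
            b -= lr * err
    return w, b

# Recover the line y = 2x + 1 from noiseless samples
points = [(x / 10, 2 * (x / 10) + 1) for x in range(-10, 11)]
w, b = sgd_linear_regression(points)
```

Because the model improves with every single instance, this kind of learner is naturally updateable: new observations can be folded in without retraining from scratch.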

Algorithms for classification problems:

| Algorithm | Description |
| --- | --- |
| Logistic Regression | Multinomial logistic regression with a ridge estimator. Supports numeric, nominal and Date attributes. Supports weighted instances. |
| Support Vector Machine | Implementation of the sequential minimal optimization algorithm for a support vector machine. The attribute values can be normalized or standardized if required. Multi-class problems are solved using pairwise classification. Supports numeric and nominal attributes. Supports weighted instances. |
| REP Decision Tree | Fast decision tree with reduced-error pruning. Supports numeric, nominal and Date attributes. Supports weighted instances. |
| Random Forest | Forest of random trees. Supports numeric, nominal and Date attributes. Supports weighted instances. |
| Multilayer Perceptron | Implementation of a feed-forward neural network. The activation function of all nodes is the sigmoid function. The input attributes can be normalized if required. Supports numeric, nominal and Date attributes. Supports weighted instances. |
| Naive Bayes Classifier | Naive Bayes classifier using estimator classes. Supports numeric and nominal attributes. Supports weighted instances. |
| Stochastic Gradient Descent | Learns various linear models (binary-class SVM, binary-class logistic regression, and linear regression with squared, Huber, or epsilon-insensitive loss) via stochastic gradient descent. Globally replaces all missing values and transforms nominal attributes into binary ones. Also normalizes all attributes, so the coefficients in the output are based on the normalized data. Updateable. |
| Hoeffding Tree | A Hoeffding tree (VFDT) is an incremental, anytime decision-tree induction algorithm capable of learning from massive data streams, assuming the distribution generating examples does not change over time. Hoeffding trees exploit the fact that a small sample is often enough to choose an optimal splitting attribute. Updateable. |
| Multiclass Updateable Classifier | A metaclassifier for handling multi-class datasets with 2-class classifiers. It can also apply error-correcting output codes for increased accuracy. The base classifier must be updateable. Updateable. |
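
The "small sample" claim behind the Hoeffding tree comes from the Hoeffding bound: after n observations of a quantity with range R, the observed mean lies within epsilon = sqrt(R^2 * ln(1/delta) / (2n)) of the true mean with probability 1 - delta. A hedged sketch of the resulting split decision (function names and defaults are illustrative, not the module's API):

```python
import math

def hoeffding_bound(value_range, delta, n):
    """Epsilon such that, with probability 1 - delta, the true mean of a
    variable with range value_range lies within epsilon of the mean
    observed over n independent samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))

def can_split(best_gain, second_gain, value_range=1.0, delta=1e-7, n=1000):
    """Split once the gap between the two best attributes' observed gains
    exceeds the bound: the leader is then almost certainly truly best."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)
```

The bound shrinks as more instances stream in, so a node that cannot yet be split confidently simply waits for more data instead of rescanning old data.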

Algorithms for anomaly detection problems:

| Algorithm | Description |
| --- | --- |
| One-class Support Vector Machine | Implementation of a one-class support vector machine for anomaly detection. Supports numeric, nominal and Date attributes. |
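
The real one-class SVM solves a quadratic program, usually with a kernel, but the intuition can be sketched with a heavily simplified linear variant trained by gradient steps: find a weight vector that keeps normal instances above a margin, and flag instances that fall well below it. Everything here (the fixed offset, thresholds, and data) is an illustrative assumption, not the module's implementation:

```python
def one_class_linear_svm(points, lam=0.01, lr=0.1, epochs=200):
    """Simplified linear one-class model: push w so that normal points
    satisfy w.x >= 1 while keeping ||w|| small (offset fixed at 1;
    the real algorithm also learns the offset and applies a kernel)."""
    w = [0.0] * len(points[0])
    for _ in range(epochs):
        for x in points:
            score = sum(wi * xi for wi, xi in zip(w, x))
            grad = [lam * wi for wi in w]          # weight decay from (lam/2)*||w||^2
            if score < 1.0:                        # hinge active: pull w toward x
                grad = [g - xi for g, xi in zip(grad, x)]
            w = [wi - lr * g for wi, g in zip(w, grad)]
    return w

def is_anomaly(w, x, margin=0.5):
    """Flag points whose score falls well below the learned margin."""
    return sum(wi * xi for wi, xi in zip(w, x)) < margin

# Train on "normal" points clustered near (1, 1)
normal = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2)]
w = one_class_linear_svm(normal)
```

Points resembling the training data score near 1 and pass; points far from them score low and are flagged.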

Meta-algorithms:

| Algorithm | Description |
| --- | --- |
| Filtered Predictor | Runs an arbitrary classifier on data that has been passed through an arbitrary filter. Supports weighted instances. Updateable if the base classifier is updateable. |
| Multiclass Updateable Classifier | A metaclassifier for handling multi-class datasets with 2-class classifiers. It can also apply error-correcting output codes for increased accuracy. The base classifier must be updateable. Updateable. |
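
The filter-then-predict pattern can be sketched as follows: the filter is fitted on training data only, and every instance, at both training and prediction time, passes through it before reaching the base model. All class names below are illustrative stand-ins (a standardizing filter and a nearest-centroid base model), not the module's API:

```python
class StandardizeFilter:
    """Example filter: rescale each attribute to zero mean, unit variance."""
    def fit(self, rows):
        cols = list(zip(*rows))
        self.means = [sum(c) / len(c) for c in cols]
        self.stds = [max((sum((v - m) ** 2 for v in c) / len(c)) ** 0.5, 1e-12)
                     for c, m in zip(cols, self.means)]
        return self
    def transform(self, row):
        return [(v - m) / s for v, m, s in zip(row, self.means, self.stds)]

class NearestCentroid:
    """Example base model: predict the label of the closest class centroid."""
    def train(self, rows, labels):
        groups = {}
        for r, y in zip(rows, labels):
            groups.setdefault(y, []).append(r)
        self.centroids = {y: [sum(c) / len(rs) for c in zip(*rs)]
                          for y, rs in groups.items()}
    def predict(self, row):
        return min(self.centroids, key=lambda y: sum(
            (a - b) ** 2 for a, b in zip(row, self.centroids[y])))

class FilteredPredictor:
    """Fit the filter on training data, then train and query the base
    model only on filtered instances."""
    def __init__(self, filt, base):
        self.filt, self.base = filt, base
    def train(self, rows, labels):
        self.filt.fit(rows)
        self.base.train([self.filt.transform(r) for r in rows], labels)
        return self
    def predict(self, row):
        return self.base.predict(self.filt.transform(row))
```

Because the same fitted filter is reused at prediction time, training and prediction always see data in the same representation.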

Updateable algorithms:

| Algorithm | Description |
| --- | --- |
| Filtered Predictor | Runs an arbitrary classifier on data that has been passed through an arbitrary filter. Supports weighted instances. Updateable if the base classifier is updateable. |
| Hoeffding Tree | A Hoeffding tree (VFDT) is an incremental, anytime decision-tree induction algorithm capable of learning from massive data streams, assuming the distribution generating examples does not change over time. Hoeffding trees exploit the fact that a small sample is often enough to choose an optimal splitting attribute. Updateable. |
| Multiclass Updateable Classifier | A metaclassifier for handling multi-class datasets with 2-class classifiers. It can also apply error-correcting output codes for increased accuracy. The base classifier must be updateable. Updateable. |
| Stochastic Gradient Descent | Learns various linear models (binary-class SVM, binary-class logistic regression, and linear regression with squared, Huber, or epsilon-insensitive loss) via stochastic gradient descent. Globally replaces all missing values and transforms nominal attributes into binary ones. Also normalizes all attributes, so the coefficients in the output are based on the normalized data. Updateable. |
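
"Updateable" means the model can absorb one labelled instance at a time instead of being retrained from scratch. As a toy illustration of that contract (not the module's API), here is a binary-class logistic regression refined by per-instance gradient steps, consuming a small stream:

```python
import math

class UpdateableLogistic:
    """Binary-class logistic regression trained one instance at a time.
    Each update() call performs a single gradient step on the log loss,
    so the model can keep learning as a stream arrives (sketch only)."""
    def __init__(self, n_features, lr=0.5):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr
    def prob(self, x):
        z = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1.0 / (1.0 + math.exp(-z))
    def update(self, x, y):            # y in {0, 1}
        err = self.prob(x) - y         # gradient of log loss w.r.t. z
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err

# Replay a tiny stream several times; in a real deployment the
# update() calls would arrive as new observations come in.
stream = [((2.0, 1.0), 1), ((1.0, 2.0), 0), ((3.0, 0.0), 1), ((0.0, 3.0), 0)]
clf = UpdateableLogistic(n_features=2)
for _ in range(50):
    for x, y in stream:
        clf.update(x, y)
```

The Filtered Predictor and Multiclass Updateable Classifier wrappers stay updateable in exactly this sense: as long as the base classifier exposes a per-instance update, so does the wrapper.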

Clustering algorithms:

| Algorithm | Description |
| --- | --- |
| Simple K Means | A partition-based clustering algorithm that groups data points into k clusters based on their similarity to each other. |
| Hierarchical Clustering | Creates a hierarchy of clusters by successively merging or splitting them based on a distance metric. |
| Density Based Clustering | Groups together data points that are within a specified distance and have a minimum number of nearby neighbors, resulting in clusters of arbitrary shapes and sizes. |
| Filtered Predictor | Combines clustering with data preprocessing techniques: instances are passed through a filter before being clustered. |
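
For instance, Simple K Means alternates two steps: assign each point to its nearest centroid, then move each centroid to the mean of its assigned points. A minimal sketch, where the random initialization and parameters are illustrative rather than the module's defaults:

```python
import random

def k_means(points, k, iters=20, seed=0):
    """Sketch of k-means: alternate nearest-centroid assignment and
    centroid recomputation for a fixed number of iterations."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                       # assignment step
            j = min(range(k), key=lambda idx: sum(
                (a - b) ** 2 for a, b in zip(p, centroids[idx])))
            clusters[j].append(p)
        for j, members in enumerate(clusters): # update step
            if members:                        # keep old centroid if cluster empties
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, clusters

# Two obvious clumps: the algorithm should separate them cleanly
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = k_means(data, k=2)
```

Each iteration can only lower the total within-cluster squared distance, which is why the assignment stabilizes after a few passes on well-separated data.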

Each algorithm has its own set of hyperparameters. See Algorithm Hyperparameters for more detail.
