
Logistic Regression (SVM)

Synopsis

This operator is a Logistic Regression Learner. It is based on the internal Java implementation of myKLR by Stefan Rueping.

Description

This learner uses the Java implementation of myKLR by Stefan Rueping. myKLR is a tool for large-scale kernel logistic regression based on the algorithm of Keerthi et al. (2003) and the code of mySVM. For compatibility reasons, the model of myKLR differs slightly from that of Keerthi et al. (2003). Because myKLR is based on the code of mySVM, the format of example files, parameter files and kernel definitions is identical. Please see the documentation of the SVM operator for further information. This learning method can be used for both regression and classification and provides a fast algorithm and good results for many learning tasks. mySVM works with linear or quadratic and even asymmetric loss functions.

This operator supports various kernel types including dot, radial, polynomial, neural, anova, epachnenikov, gaussian combination and multiquadric. An explanation of these kernel types is given in the parameters section.

Input

training set

This input port expects an ExampleSet. This operator cannot handle nominal attributes; it can only be applied to data sets with numeric attributes. You may therefore need to apply the Nominal to Numerical operator before using this operator.
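As an illustration of what converting a nominal attribute to numeric attributes can look like, here is a minimal Python sketch of one-hot (dummy) coding. It is not the Nominal to Numerical operator itself; the function name one_hot_encode, the "column=value" naming scheme and the sample data are all hypothetical.

```python
def one_hot_encode(rows, column):
    """Replace one nominal column with a 0/1 column per distinct value."""
    # Sort the distinct values so the generated column order is deterministic.
    values = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every attribute except the nominal one.
        new_row = {key: val for key, val in row.items() if key != column}
        for value in values:
            new_row[f"{column}={value}"] = 1.0 if row[column] == value else 0.0
        encoded.append(new_row)
    return encoded


# Hypothetical example set with one numeric and one nominal attribute.
rows = [
    {"age": 31.0, "color": "red"},
    {"age": 45.0, "color": "blue"},
]
encoded = one_hot_encode(rows, "color")
print(encoded[0])
```

After this step every attribute is numeric, which is the form this operator expects at its training set port.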

Output

model

The Logistic Regression model is delivered from this output port. This model can now be applied on unseen data sets.

weights

This port delivers the attribute weights. This is only possible when the dot kernel type is used; it is not possible with other kernel types.
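One way to see why attribute weights exist only for the dot kernel: a kernel model scores an example as a weighted sum of kernel evaluations against stored training examples, and with an inner-product kernel that sum collapses into a single weight per attribute. The Python sketch below illustrates this under assumed, made-up coefficients; the names kernel_decision and collapse_to_weights are hypothetical and do not reflect the operator's internals.

```python
def dot(x, y):
    """Inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(x, y))


def kernel_decision(coeffs, support, bias, x):
    """Score x as a weighted sum of dot-kernel evaluations plus a bias."""
    return sum(c * dot(s, x) for c, s in zip(coeffs, support)) + bias


def collapse_to_weights(coeffs, support):
    """Fold the kernel expansion into one weight per attribute."""
    n_attributes = len(support[0])
    return [
        sum(c * s[j] for c, s in zip(coeffs, support))
        for j in range(n_attributes)
    ]


# Made-up stored examples and coefficients, for illustration only.
support = [[1.0, 2.0], [3.0, -1.0], [0.5, 0.5]]
coeffs = [0.5, -0.25, 1.0]
bias = 0.1

weights = collapse_to_weights(coeffs, support)
x = [2.0, 4.0]
f_kernel = kernel_decision(coeffs, support, bias, x)
f_linear = dot(weights, x) + bias
print(weights, f_kernel, f_linear)
```

Both scores agree because the dot kernel is linear in its arguments. For a non-linear kernel such as radial, no such per-attribute weight vector exists in the original attribute space.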

example set

The ExampleSet that was given as input is passed unchanged to the output through this port. This is usually used to reuse the same ExampleSet in further operators or to view the ExampleSet in the Results Workspace.

Parameters

Kernel type

The type of the kernel function is selected through this parameter. The following kernel types are supported: dot, radial, polynomial, neural, anova, epachnenikov, gaussian combination, multiquadric.

  • dot: The dot kernel is defined by k(x,y) = x*y, i.e. it is the inner product of x and y.
  • radial: The radial kernel is defined by k(x,y) = exp(-g ||x-y||^2), where g is gamma, specified by the kernel gamma parameter. The adjustable parameter gamma plays a major role in the performance of the kernel and should be carefully tuned to the problem at hand.
  • polynomial: The polynomial kernel is defined by k(x,y) = (x*y+1)^d, where d is the degree of the polynomial, specified by the kernel degree parameter. Polynomial kernels are well suited for problems where all the training data is normalized.
  • neural: The neural kernel is defined by a two-layered neural net tanh(a x*y+b), where a is alpha and b is the intercept constant. These parameters can be adjusted using the kernel a and kernel b parameters. A common value for alpha is 1/N, where N is the data dimension. Note that not all choices of a and b lead to a valid kernel function.
  • anova: The anova kernel is defined by the summation of exp(-g (x-y)) raised to the power d, where g is gamma and d is degree. Gamma and degree are adjusted by the kernel gamma and kernel degree parameters respectively.
  • epachnenikov: The epachnenikov kernel is the function (3/4)(1-u^2) for u between -1 and 1, and zero for u outside that range. It has two adjustable parameters, kernel sigma1 and kernel degree.
  • gaussian_combination: This is the gaussian combination kernel. It has the adjustable parameters kernel sigma1, kernel sigma2 and kernel sigma3.
  • multiquadric: The multiquadric kernel is defined by the square root of ||x-y||^2 + c^2. It has the adjustable parameters kernel sigma1 and kernel shift.
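The kernel formulas above that are fully specified can be written out directly. The following Python sketch is only an illustration of those formulas, not the operator's Java code; all function names are hypothetical. The text does not state how the constant c of the multiquadric kernel or the variable u of the epachnenikov kernel map to the operator's parameters, so they are left as plain arguments here.

```python
import math


def dot_kernel(x, y):
    """k(x,y) = x*y, the inner product."""
    return sum(a * b for a, b in zip(x, y))


def squared_distance(x, y):
    """||x-y||^2, used by the radial and multiquadric kernels."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def radial_kernel(x, y, gamma):
    """k(x,y) = exp(-gamma * ||x-y||^2)."""
    return math.exp(-gamma * squared_distance(x, y))


def polynomial_kernel(x, y, degree):
    """k(x,y) = (x*y + 1)^degree."""
    return (dot_kernel(x, y) + 1.0) ** degree


def neural_kernel(x, y, a, b):
    """k(x,y) = tanh(a * x*y + b); not a valid kernel for every a and b."""
    return math.tanh(a * dot_kernel(x, y) + b)


def multiquadric_kernel(x, y, c):
    """k(x,y) = sqrt(||x-y||^2 + c^2)."""
    return math.sqrt(squared_distance(x, y) + c ** 2)


def epanechnikov_profile(u):
    """(3/4)(1 - u^2) for -1 <= u <= 1, zero elsewhere (listed as 'epachnenikov')."""
    return 0.75 * (1.0 - u * u) if -1.0 <= u <= 1.0 else 0.0


x = [1.0, 2.0]
y = [3.0, 0.0]
print(dot_kernel(x, y))
print(radial_kernel(x, y, gamma=0.5))
print(polynomial_kernel(x, y, degree=2))
print(neural_kernel(x, y, a=1.0 / len(x), b=0.0))  # alpha = 1/N as suggested above
print(multiquadric_kernel(x, y, c=1.0))
```

The anova and gaussian combination kernels are omitted because the descriptions above do not pin down their formulas precisely enough to reproduce.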

Kernel gamma

This is the SVM kernel parameter gamma. This is only available when the kernel type parameter is set to radial or anova.

Kernel sigma1

This is the SVM kernel parameter sigma1. This is only available when the kernel type parameter is set to epachnenikov, gaussian combination or multiquadric.

Kernel sigma2

This is the SVM kernel parameter sigma2. This is only available when the kernel type parameter is set to gaussian combination.

Kernel sigma3

This is the SVM kernel parameter sigma3. This is only available when the kernel type parameter is set to gaussian combination.

Kernel shift

This is the SVM kernel parameter shift. This is only available when the kernel type parameter is set to multiquadric.

Kernel degree

This is the SVM kernel parameter degree. This is only available when the kernel type parameter is set to polynomial, anova or epachnenikov.

Kernel a

This is the SVM kernel parameter a. This is only available when the kernel type parameter is set to neural.

Kernel b

This is the SVM kernel parameter b. This is only available when the kernel type parameter is set to neural.

Kernel cache

This is an expert parameter. It specifies the size of the cache for kernel evaluations in megabytes.

C

This is the SVM complexity constant, which sets the tolerance for misclassification: higher C values penalize misclassification more strongly and create 'harder' boundaries, while lower values allow for 'softer' boundaries. A complexity constant that is too large can lead to over-fitting, while values that are too small may result in over-generalization.
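To make the role of a complexity constant concrete, the sketch below fits a small linear logistic model by plain gradient descent on the objective 0.5*||w||^2 + C * sum(log(1 + exp(-y * w*x))). This objective is a common textbook formulation and is an assumption here; it is not confirmed to be the exact objective myKLR optimizes, and the data, step size and function names are made up.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit(X, y, C, steps=2000, learning_rate=0.01):
    """Gradient descent on 0.5*||w||^2 + C * sum of logistic losses."""
    w = [0.0] * len(X[0])
    for _ in range(steps):
        grad = list(w)  # gradient of the 0.5*||w||^2 term
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            g = -C * yi * sigmoid(-margin)  # derivative of the loss term
            for j, xj in enumerate(xi):
                grad[j] += g * xj
        w = [wj - learning_rate * gj for wj, gj in zip(w, grad)]
    return w


def norm(w):
    return math.sqrt(sum(wj * wj for wj in w))


# Made-up two-attribute training data with labels +1 / -1.
X = [[1.0, 0.5], [2.0, 1.0], [-1.0, -0.5], [-2.0, -1.5], [0.5, -1.0]]
y = [1, 1, -1, -1, 1]

norm_low = norm(fit(X, y, C=0.01))
norm_high = norm(fit(X, y, C=10.0))
print(norm_low, norm_high)
```

In this formulation a larger C gives the training loss more influence relative to the ||w||^2 penalty, so the fitted weights grow and the model follows the training data more closely. That is the mechanism behind the over-fitting risk of large values.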

Convergence epsilon

This is an optimizer parameter. It specifies the precision on the KKT conditions.

Max iterations

This is an optimizer parameter. It specifies the maximum number of iterations; the optimization stops once this number is reached.
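The interplay of a convergence tolerance and an iteration cap can be sketched with a generic iterative optimizer. The Python example below is not the operator's optimizer: it uses plain gradient descent on a made-up quadratic, and it uses the largest gradient component as a stand-in for the KKT violation (for an unconstrained problem the KKT conditions reduce to a zero gradient). All names are hypothetical.

```python
def minimize(grad, x0, step, convergence_epsilon, max_iterations):
    """Iterate until the optimality violation is small or the cap is hit."""
    x = list(x0)
    for iteration in range(1, max_iterations + 1):
        g = grad(x)
        violation = max(abs(gi) for gi in g)
        if violation <= convergence_epsilon:
            return x, iteration, True  # converged within the tolerance
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x, max_iterations, False  # stopped by the iteration cap


def grad(x):
    """Gradient of f(x) = (x0 - 3)^2 + 2*(x1 + 1)^2, minimized at (3, -1)."""
    return [2.0 * (x[0] - 3.0), 4.0 * (x[1] + 1.0)]


solution, used, converged = minimize(grad, [0.0, 0.0], 0.1, 1e-3, 1000)
_, used_capped, converged_capped = minimize(grad, [0.0, 0.0], 0.1, 1e-3, 5)
print(solution, used, converged)
print(used_capped, converged_capped)
```

A tighter epsilon yields a more precise solution at the cost of more iterations, while a low iteration cap can end the run before the tolerance is met.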

Scale

This is a global parameter. If checked, the example values are scaled and the scaling parameters are stored for a test set.
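The idea of storing scaling parameters for a test set can be sketched as follows. The text does not say which scaling method the operator uses, so this Python example assumes z-score standardization (subtract the mean, divide by the standard deviation); the function names and data are hypothetical.

```python
import math


def fit_scaler(rows):
    """Compute per-attribute mean and standard deviation on training data."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    stds = []
    for j in range(dims):
        variance = sum((r[j] - means[j]) ** 2 for r in rows) / n
        # Fall back to 1.0 for constant attributes to avoid division by zero.
        stds.append(math.sqrt(variance) or 1.0)
    return means, stds


def apply_scaler(rows, scaler):
    """Scale rows with previously stored parameters."""
    means, stds = scaler
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


train = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
test = [[2.0, 400.0]]

scaler = fit_scaler(train)                 # parameters come from training data only
train_scaled = apply_scaler(train, scaler)
test_scaled = apply_scaler(test, scaler)   # the same parameters are reused
print(train_scaled)
print(test_scaled)
```

Reusing the training parameters on the test set keeps both sets on the same scale, which matters because kernel values depend on distances and inner products between examples.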