Normalize
Synopsis
This Operator normalizes the values of the selected Attributes.
Description
Normalization scales values so that they fit in a specific range. Adjusting the value range is important when Attributes have different units and scales. For example, when using the Euclidean distance, all Attributes should have the same scale for a fair comparison; otherwise, Attributes with large value ranges dominate the result. This Operator normalizes the selected Attributes. Four normalization methods are provided; they are explained in the parameters.
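The effect of scale on the Euclidean distance can be illustrated with a small sketch. It is plain Python and not the Operator's implementation; the helper names (euclidean, min_max_scale), the sample data and the choice of a simple range scaling to [0, 1] are assumptions made for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equally long numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def min_max_scale(column, lo=0.0, hi=1.0):
    """Linearly map a column of numbers onto the range [lo, hi]."""
    c_min, c_max = min(column), max(column)
    if c_max == c_min:
        # A constant column carries no information; map it to lo.
        return [lo for _ in column]
    return [lo + (v - c_min) * (hi - lo) / (c_max - c_min) for v in column]

# Hypothetical Examples: (age in years, income in dollars).
rows = [(25, 30000.0), (60, 31000.0), (26, 60000.0)]

# Raw distances: income differences swamp age differences.
raw_d01 = euclidean(rows[0], rows[1])
raw_d02 = euclidean(rows[0], rows[2])

# Scale each Attribute separately, then rebuild the Examples.
ages = min_max_scale([r[0] for r in rows])
incomes = min_max_scale([r[1] for r in rows])
scaled = list(zip(ages, incomes))

scaled_d01 = euclidean(scaled[0], scaled[1])
scaled_d02 = euclidean(scaled[0], scaled[2])

print(raw_d01, raw_d02)        # very different magnitudes
print(scaled_d01, scaled_d02)  # comparable after scaling
```

Before scaling, the 35-year age gap between the first two Examples barely registers next to the income gap of the third. After scaling, both Attributes contribute on the same footing.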
Differentiation
Scale by Weights
This Operator can be used to scale Attributes by pre-calculated weights. Instead of adjusting the value range to a common scale, this Operator can be used to give important Attributes even more weight.
De-Normalize
This Operator can be used to revert a previously applied normalization. It requires the preprocessing model returned by a Normalization Operator.
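Why reverting a normalization needs the preprocessing model can be seen in a minimal sketch. It assumes a range transformation to [0, 1] and represents the "model" as a plain dictionary of stored minimum and maximum; the function names fit_range, normalize and denormalize are hypothetical and not part of any Operator API.

```python
def fit_range(column):
    """Record the parameters needed to normalize and later revert a column."""
    return {"min": min(column), "max": max(column)}

def normalize(column, params):
    """Map values to [0, 1] using the stored minimum and maximum."""
    span = params["max"] - params["min"]
    if span == 0:
        return [0.0 for _ in column]
    return [(v - params["min"]) / span for v in column]

def denormalize(column, params):
    """Invert normalize() using the same stored parameters."""
    span = params["max"] - params["min"]
    return [v * span + params["min"] for v in column]

original = [2.0, 4.0, 10.0]
params = fit_range(original)            # plays the role of the preprocessing model
normalized = normalize(original, params)
restored = denormalize(normalized, params)

print(normalized)
print(restored)
```

Without the stored minimum and maximum, the normalized values alone do not determine the original ones, which is why the inverse step depends on the model.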
Input
example set
This input port expects an ExampleSet.
Output
example set
This port outputs the ExampleSet with the selected Attributes in normalized form.
original
The ExampleSet that was given as input is passed through without changes.
preprocessing model
This port delivers the preprocessing model. It can be used by the Apply Model Operator to perform the specified normalization on another ExampleSet. This is helpful, for example, if the normalization is used during training and the same transformation has to be applied to test or live data.
The preprocessing model can also be grouped together with other preprocessing models and learning models by the Group Models Operator.
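The idea of fitting a normalization on training data and reusing it on other data can be sketched as follows. This is a minimal illustration in plain Python, not the Operator's preprocessing model: the class name ZTransformModel, the column-dictionary data layout and the use of the sample standard deviation are all assumptions.

```python
import statistics

class ZTransformModel:
    """Hypothetical stand-in for a preprocessing model (z-transformation)."""

    def __init__(self, columns):
        # Learn mean and (sample) standard deviation per Attribute
        # from the training data only.
        self.stats = {
            name: (statistics.mean(values), statistics.stdev(values))
            for name, values in columns.items()
        }

    def apply(self, columns):
        """Apply the stored statistics to any data with the same Attributes."""
        result = {}
        for name, values in columns.items():
            if name not in self.stats:
                # Attributes unknown to the model pass through unchanged.
                result[name] = list(values)
                continue
            mean, sd = self.stats[name]
            result[name] = [(v - mean) / sd if sd else 0.0 for v in values]
        return result

train = {"height_cm": [150.0, 160.0, 170.0]}
test = {"height_cm": [160.0, 180.0]}

model = ZTransformModel(train)   # fit once, on training data
train_z = model.apply(train)
test_z = model.apply(test)       # reuse the training statistics

print(train_z)
print(test_z)
```

The test values are transformed with the training mean and standard deviation, not with their own, so both sets end up on the same scale.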
Parameters
Create view
Create a View instead of changing the underlying data. If this option is checked, the normalization is delayed until the transformations are needed. This parameter can be considered a legacy option.
Attribute filter type
This parameter selects the Attribute selection filter, that is, the method used for selecting Attributes. It has the following options:
- all: This option selects all the Attributes of the ExampleSet, so that no Attribute is excluded. This is the default option.
- single: This option allows the selection of a single Attribute. The required Attribute is selected by the attribute parameter.
- subset: This option allows the selection of multiple Attributes through a list (see parameter attributes). If the meta data of the ExampleSet is known, all Attributes are present in the list and the required ones can easily be selected.
- regular_expression: This option allows you to specify a regular expression for the Attribute selection. The regular expression filter is configured by the parameters regular expression, use except expression and except expression.
- value_type: This option allows selection of all the Attributes of a particular type. Note that types are hierarchical. For example, both real and integer types belong to the numeric type. The value type filter is configured by the parameters value type, use value type exception and except value type.
- block_type: This option allows the selection of all the Attributes of a particular block type. Note that block types may be hierarchical. For example, value_series_start and value_series_end block types both belong to the value_series block type. The block type filter is configured by the parameters block type, use block type exception and except block type.
- no_missing_values: This option selects all Attributes of the ExampleSet that do not contain a missing value in any Example. Attributes that have even a single missing value are not selected.
- numeric_value_filter: All numeric Attributes whose Examples all match a given numeric condition are selected. The condition is specified by the numeric condition parameter. Please note that all nominal Attributes are also selected irrespective of the given numerical condition.
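Two of the filter types above, no_missing_values and numeric_value_filter, can be sketched in plain Python. This is an illustration only: the column-dictionary layout, the use of None for missing values, the function names, and the decision to skip missing values when checking the numeric condition are assumptions, not documented behavior of the Operator.

```python
def is_numeric(values):
    """True if every non-missing value is an int or float (bool excluded)."""
    return all(
        v is None or (isinstance(v, (int, float)) and not isinstance(v, bool))
        for v in values
    )

def select_no_missing_values(columns):
    """Keep Attributes that have no missing value in any Example."""
    return [
        name for name, values in columns.items()
        if all(v is not None for v in values)
    ]

def select_numeric_value_filter(columns, condition):
    """Keep numeric Attributes whose values all satisfy `condition`.

    Nominal Attributes are always kept, as described for this filter type.
    Missing values are skipped here, which is an assumption of this sketch.
    """
    selected = []
    for name, values in columns.items():
        if not is_numeric(values):
            selected.append(name)
        elif all(condition(v) for v in values if v is not None):
            selected.append(name)
    return selected

# Hypothetical ExampleSet: Attribute name -> values, None marks a missing value.
columns = {
    "age": [25, 40, 31],
    "income": [30000.0, None, 52000.0],
    "city": ["Oslo", "Rome", "Lima"],
}

no_missing = select_no_missing_values(columns)
above_1000 = select_numeric_value_filter(columns, lambda v: v > 1000)

print(no_missing)   # ['age', 'city']
print(above_1000)   # ['income', 'city']
```

The second result shows the caveat from the list: the nominal Attribute city is selected even though the numeric condition cannot apply to it.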
Attribute
The required Attribute is selected with this parameter. If the meta data is known, the Attribute name can be chosen from the drop-down box of the parameter.
Attributes
The required Attributes are selected with this parameter. It opens a new window with two lists. All Attributes are present in the left list. They can be moved to the right list, which is the list of selected Attributes.
Regular expression
Attributes whose names match this expression will be selected. The expression can be specified through the edit and preview regular expression menu. This menu helps with writing regular expressions: it lets you try different expressions and preview the results at the same time.
Use except expression
If enabled, an exception to the first regular expression can be specified. This exception is specified by the except regular expression parameter.
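How a regular expression and its exception interact can be sketched as follows. The sketch uses Python's re module and matches against the whole Attribute name; both the regex dialect and the full-name matching are assumptions, and the function name select_by_regex and the Attribute names are hypothetical.

```python
import re

def select_by_regex(names, expression, except_expression=None):
    """Select names matching `expression`, minus those matching the exception."""
    selected = []
    for name in names:
        if not re.fullmatch(expression, name):
            continue  # does not match the first expression
        if except_expression is not None and re.fullmatch(except_expression, name):
            continue  # matched, but excluded by the exception
        selected.append(name)
    return selected

names = ["sensor_1", "sensor_2", "sensor_total", "label"]

without_exception = select_by_regex(names, r"sensor_.*")
with_exception = select_by_regex(names, r"sensor_.*", r".*_total")

print(without_exception)  # ['sensor_1', 'sensor_2', 'sensor_total']
print(with_exception)     # ['sensor_1', 'sensor_2']
```

The exception only narrows the first match; it never adds names that the first expression did not select.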