Sliding Window Validation
Synopsis
This operator performs a sliding window validation for a machine learning model trained on time-dependent input data.
Description
The operator creates sliding windows from the input data. In each validation step the training window is provided at the inner training set port of the Training subprocess. The size of the training window is defined by the parameter training window size. The training window can be used to train a machine learning model which has to be provided to the model port of the Training subprocess.
The test window of the input data is provided at the inner test set port of the Testing subprocess. Its size is defined by the parameter test window size. The model trained in the Training subprocess is provided at the model port of the Testing subprocess, where it can be applied to the test set. The performance of this prediction can be evaluated, and the resulting performance vector has to be provided to the performance port of the Testing subprocess. For the next validation fold, the training and test windows are shifted by k values, where k is defined by the parameter step size.
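The window arithmetic described above can be sketched in Python. This is only an illustration under stated assumptions: example based windowing that starts at the first example, a test window that begins directly after the training window (no test window offset), and folds that are dropped if they do not fit completely into the data. The function name sliding_windows is hypothetical and not part of the operator.

```python
def sliding_windows(n_examples, training_window_size, test_window_size, step_size):
    """Return a list of (train_range, test_range) index pairs, one per fold.

    Assumptions (not taken from the operator itself): windows start at
    index 0, the test window directly follows the training window, and a
    fold is only produced if both windows fit completely into the data.
    """
    if step_size <= 0:
        raise ValueError("step_size must be positive")
    folds = []
    start = 0
    while start + training_window_size + test_window_size <= n_examples:
        train_end = start + training_window_size
        test_end = train_end + test_window_size
        folds.append((range(start, train_end), range(train_end, test_end)))
        start += step_size  # shift both windows for the next fold
    return folds


folds = sliding_windows(n_examples=10, training_window_size=4,
                        test_window_size=2, step_size=2)
for train, test in folds:
    print(list(train), list(test))
```

In every fold the test indices lie strictly after the training indices, which is the property the validation relies on.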
The described behavior is the default example based windowing. It can be changed to time based windowing or custom windowing by changing the unit parameter. For time based windowing, the window parameters are specified in time durations/periods. For custom windowing, an additional ExampleSet has to be provided to the "custom windows" input port. It holds the start values (and optionally the stop values) of the windows. For more details see the unit parameter and the description of the corresponding parameters.
Expert settings (for example no overlapping windows or empty window handling) can be enabled by selecting the expert settings parameter.
The sliding window validation ensures that the machine learning model built in the Training subprocess is always evaluated on Examples that come after the training window.
If the model output port of the Sliding Window Validation operator is connected, a final window with the same size as the training windows, but ending at the last example of the input series, is used to train a final model. This final model is provided at the model output port.
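The position of this final training window can be sketched as follows. The helper final_training_window is hypothetical, and the handling of a series that is shorter than the training window is an assumption, since that case is not described here.

```python
def final_training_window(n_examples, training_window_size):
    """Return the index range of the final training window.

    The window has the size of a training window and ends at the last
    example. Assumption: if the series is shorter than the training
    window, the whole series is used.
    """
    start = max(0, n_examples - training_window_size)
    return range(start, n_examples)


final_window = final_training_window(n_examples=10, training_window_size=4)
print(list(final_window))
```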
This operator works on all time series (numerical, nominal and time series with date time values).
Input
example set
This input port receives the ExampleSet on which the sliding window validation is performed.
custom windows
The ExampleSet containing the start values (and optionally the stop values) of the custom windows. It only needs to be connected if the parameter unit is set to custom.
Output
model
The final model, trained on a final window that has the same size as the training windows and ends at the last example of the input series. It is only trained and delivered if this output port is connected.
example set
The ExampleSet that was given as input is passed through without changes.
test result set
All test set ExampleSets, appended to one ExampleSet.
performance
This is an expandable port. You can connect any performance vector (the result of a Performance operator) to the performance port of the inner Testing subprocess. The performance output port delivers the average of the performances over all folds of the validation.
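Averaging over folds can be illustrated with a small Python sketch. It assumes that each fold's performance vector can be represented as a mapping from criterion name to value and that a plain arithmetic mean is used; the function name, the criterion names, and the numbers are made up for illustration.

```python
from statistics import mean


def average_performance(fold_performances):
    """Average each criterion over all folds.

    fold_performances: non-empty list of dicts mapping a criterion name
    to its value in one fold. Assumes every fold reports the same criteria.
    """
    criteria = fold_performances[0].keys()
    return {c: mean(p[c] for p in fold_performances) for c in criteria}


# Illustrative values only, not results of a real validation.
fold_performances = [
    {"rmse": 1.0, "mae": 0.5},
    {"rmse": 2.0, "mae": 1.0},
    {"rmse": 3.0, "mae": 1.5},
]
averaged = average_performance(fold_performances)
print(averaged)
```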
Parameters
Has indices
This parameter indicates if there is an index attribute associated with the time series. If this parameter is set to true, the index attribute has to be selected.
Indices attribute
If the parameter has indices is set to true, this parameter defines the associated index attribute. It can be either a date, date_time or numeric value type attribute. The attribute name can be selected from the drop down box of the parameter if the meta data is known.
Sort time series
If this parameter is selected, the input time series is sorted according to the selected indices attribute before the operator is applied. If it is not selected and the input time series is not sorted, a corresponding User Error is thrown.
Keep in mind that the index values still need to be unique. If the values are not unique, a corresponding User Error is thrown.
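The checks described above can be sketched in Python. This is a minimal illustration, assuming the examples are held as a list of dictionaries; the function prepare_indices and the use of ValueError in place of the operator's User Error are assumptions made for the example.

```python
def prepare_indices(rows, index_key, sort_time_series):
    """Return rows ordered by their index values, or raise ValueError.

    Mirrors the described behavior: index values must always be unique;
    unsorted input is only accepted if sort_time_series is True.
    """
    indices = [row[index_key] for row in rows]
    if len(set(indices)) != len(indices):
        raise ValueError("index values must be unique")
    if sort_time_series:
        return sorted(rows, key=lambda row: row[index_key])
    if any(a > b for a, b in zip(indices, indices[1:])):
        raise ValueError("time series is not sorted")
    return rows


# Illustrative data with an unsorted numeric index attribute "t".
rows = [{"t": 3, "value": 30.0}, {"t": 1, "value": 10.0}, {"t": 2, "value": 20.0}]
sorted_rows = prepare_indices(rows, "t", sort_time_series=True)
print([row["t"] for row in sorted_rows])
```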
Expert settings
This parameter can be selected to show expert settings for a more detailed configuration of the operator. The expert settings are: windows defined, custom start point, custom end point, date format, no overlapping windows, and empty window handling.
Unit
The mode in which windows are defined. It defines the unit of the window parameters (training window size, step size, test window size and test window offset).
- example based: The window parameters are specified in number of examples. This is the default option.
- time based: The window parameters are specified in time durations/periods (units ranging from milliseconds to years).
- custom: An additional ExampleSet has to be provided to the "custom windows" input port. It holds the start values (and optionally the stop values) of the windows.
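To illustrate how time based windowing differs from example based windowing, the following Python sketch selects examples by timestamp instead of by position. It rests on several assumptions that are not taken from the operator: window sizes are fixed durations (timedelta), so calendar periods such as months are not covered; windows are half-open intervals; the first window starts at the first timestamp; and folds with an empty training or test window are skipped. The function name time_based_windows is hypothetical.

```python
from datetime import datetime, timedelta


def time_based_windows(timestamps, training_window_size, test_window_size, step_size):
    """Return (train_indices, test_indices) per fold for sorted timestamps.

    Windows are half-open intervals [start, end). Assumes a non-empty,
    sorted list of timestamps; folds with an empty window are skipped.
    """
    if step_size <= timedelta(0):
        raise ValueError("step_size must be positive")
    folds = []
    start = timestamps[0]
    while start + training_window_size <= timestamps[-1]:
        train_end = start + training_window_size
        test_end = train_end + test_window_size
        train = [i for i, t in enumerate(timestamps) if start <= t < train_end]
        test = [i for i, t in enumerate(timestamps) if train_end <= t < test_end]
        if train and test:
            folds.append((train, test))
        start += step_size  # shift both windows in time
    return folds


# Ten daily timestamps, purely illustrative.
timestamps = [datetime(2024, 1, 1) + timedelta(days=d) for d in range(10)]
time_folds = time_based_windows(timestamps, timedelta(days=4),
                                timedelta(days=2), timedelta(days=2))
for train, test in time_folds:
    print(train, test)
```

With irregularly spaced timestamps the number of examples per window varies from fold to fold, which is the main practical difference from example based windowing.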
Windows defined
This parameter defines the reference point from which the windows are set up. It is an expert setting and hence it is only shown if the parameter expert settings is selected.
- from start: The first window will start at the first example of the input data set. The following windows are set up according to the window parameters.
- from end: The last window will end at the last example of the input data set. The previous windows are set up according to the window parameters.
- custom start: The first window will start at the custom start point provided by the parameter custom start point / custom start time. The following windows are set up according to the window parameters.
- custom end: The last window will end at the custom end point provided by the parameter custom end point / custom end time. The previous windows are set up according to the window parameters.
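The difference between the from start and from end options can be sketched for example based windowing. In this illustration total_window_size stands for the combined length of one training and test window; this name, the function window_starts, and the choice to drop windows that do not fit completely are assumptions made for the example.

```python
def window_starts(n_examples, total_window_size, step_size, windows_defined="from start"):
    """Return the start index of every window, in ascending order.

    "from start": the first window starts at index 0.
    "from end":   the last window ends at the last example.
    Assumption: only windows that fit completely into the data are kept.
    """
    if step_size <= 0:
        raise ValueError("step_size must be positive")
    last_possible_start = n_examples - total_window_size
    if last_possible_start < 0:
        return []
    if windows_defined == "from start":
        return list(range(0, last_possible_start + 1, step_size))
    if windows_defined == "from end":
        return sorted(range(last_possible_start, -1, -step_size))
    raise ValueError("unsupported option: " + windows_defined)


starts_from_start = window_starts(11, total_window_size=6, step_size=2)
starts_from_end = window_starts(11, total_window_size=6, step_size=2,
                                windows_defined="from end")
print(starts_from_start, starts_from_end)
```

When the data length is not an exact fit for the window and step sizes, the two options leave unused examples at opposite ends of the series.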
Custom start point
If the parameter windows defined is set to custom start and the unit is set to example based, this parameter defines the custom point from which the windows start. It is an expert setting and hence it is only shown if the parameter expert settings is selected.
Custom end point
If the parameter windows defined is set to custom end and the unit is set to example based, this parameter defines the custom point where the windows end. It is an expert setting and hence it is only shown if the parameter expert settings is selected.
Custom start time
If the parameter windows defined is set to custom start and the unit is set to time based, this parameter defines the custom date time point from which the windows start.
The date time format used to interpret the string provided in this parameter is defined by the parameter date format. It is an expert setting and hence it is only shown if the parameter expert settings is selected.
Custom end time
If the parameter windows defined is set to custom end and the unit is set to time based, this parameter defines the custom date time point where the windows end.
The date time format used to interpret the string provided in this parameter is defined by the parameter date format. It is an expert setting and hence it is only shown if the parameter expert settings is selected.
Date format
Date format used for the custom start time and custom end time parameters. It is an expert setting and hence it is only shown if the parameter expert settings is selected.