Sliding Window Validation
Synopsis
This operator performs a sliding window validation for a machine learning model trained on time dependent input data.
Description
The operator creates sliding windows from the input data. In each validation step the training window is provided at the inner training set port of the Training subprocess. The size of the training window is defined by the parameter training window size. The training window can be used to train a machine learning model which has to be provided to the model port of the Training subprocess.
The test window of the input data is provided at the inner test set port of the Testing subprocess. Its size is defined by the parameter test window size. The model trained in the Training subprocess is provided at the model port of the Testing subprocess. It can be applied on the test set. The performance of this prediction can be evaluated and the performance vector has to be provided to the performance port of the Testing process. For the next validation fold, the training and the test windows are shifted by k values, defined by the parameter step size.
The described behavior is the default example based windowing. It can be changed to time based windowing or custom windowing by changing the unit parameter. For time based windowing, the windowing parameter are specified in time durations/periods. For the "custom" windowing an additional ExampleSet has to be provided to the new "custom windows" input port. It holds the start (and optional the stop values) of the windows. For more details see the unit parameter and the description of the corresponding parameters.
Expert settings (for example no overlapping windows, the empty window handling, ..) can be enabled by selecting the corresponding expert settings parameter.
The sliding window validation ensures that the machine learning model built in the Training subprocess is always evaluated on Examples which are after the training window.
If the model output port of the Sliding Window Validation operator is connected, a final window with the same size as the training windows, but ending at the last example of the input series is used to train a final model. This final model is provided at the model output port.
This operator works on all time series (numerical, nominal and time series with date time values).