ExampleSet to Tensor
Synopsis
This Operator converts series-like data from an ExampleSet into a Tensor object for Deep Learning purposes and lets you set the labeling type (many-to-one or many-to-many).
Description
This Operator expects an ExampleSet containing series-like data. The data needs to contain a batch column and an ID column for example identification. The batch column needs to be nominal, and all rows that belong to one batch need to have the same batch value. The ID column should be numerical and sorted to fit the desired data structure. Batches are allowed to have different numbers of rows.
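The layout requirements above can be checked before the data is handed to the Operator. The following is a minimal sketch in plain Python, not part of the Operator itself: the function name `check_series_layout` and the column names `batch` and `id` are hypothetical, and the rule that the rows of a batch must be contiguous is an assumption derived from the sorting requirement.

```python
def check_series_layout(rows, batch_key="batch", id_key="id"):
    """Return a list of layout problems; an empty list means none were found.

    rows is a list of dicts, one dict per row of the (hypothetical) data set.
    """
    problems = []
    seen_batches = set()
    prev_batch = None
    prev_id = None
    for i, row in enumerate(rows):
        batch, step = row[batch_key], row[id_key]
        if batch != prev_batch:
            # A new batch starts here; it must not have appeared earlier
            # (assumption: the rows of one batch are stored contiguously).
            if batch in seen_batches:
                problems.append(f"row {i}: batch {batch!r} is not contiguous")
            seen_batches.add(batch)
            prev_id = None
        # Within a batch, the ID values must be in ascending order.
        if prev_id is not None and step < prev_id:
            problems.append(f"row {i}: id {step} is out of order in batch {batch!r}")
        prev_batch, prev_id = batch, step
    return problems


good_rows = [
    {"batch": "1", "id": 1}, {"batch": "1", "id": 2},
    {"batch": "2", "id": 1}, {"batch": "2", "id": 2},
]
unsorted_rows = [
    {"batch": "1", "id": 1}, {"batch": "1", "id": 2},
    {"batch": "2", "id": 1}, {"batch": "2", "id": 3}, {"batch": "2", "id": 2},
]
split_batch_rows = [
    {"batch": "1", "id": 1}, {"batch": "2", "id": 1}, {"batch": "1", "id": 2},
]

good_problems = check_series_layout(good_rows)
unsorted_problems = check_series_layout(unsorted_rows)
split_problems = check_series_layout(split_batch_rows)
```

Here `good_problems` is empty, while the other two lists each contain one message pointing at the offending row.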
A typical use case is time-series-like sensor data from a machine. Such a data set could contain measurements taken from one machine over 10 days. Each day the machine was active for a different length of time, resulting in a varying number of time-steps (rows) per day (batch), ranging from 20 to 30 measurements. For this simple example, days might be grouped into 'good' runs and 'bad' runs (our label values). To encode this situation in the data, we would need a batch attribute that contains the values 1 to 10, is nominal, and has the special role 'batch'. The time-steps (rows per batch) would be encoded using a numeric ID attribute with the special role 'ID', which numbers the time-steps within each day, so that each day has between 20 and 30 ID values. Since this example predicts only one value from the many time-steps of each example (batch), the label needs to contain the same label value across all rows of each batch. In this scenario we could have multiple numeric attributes describing each time-step (row) of each measurement (batch).
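To make the sensor example concrete, the sketch below groups rows by their batch value, orders each group by ID, and collects one label per batch (many-to-one). It only illustrates the data layout and is not the Operator's implementation: the attribute names `temperature` and `vibration`, the function `to_sequences`, and the shortened data (two days with 3 and 2 time-steps instead of 10 days with 20 to 30) are all assumptions made for illustration.

```python
from collections import defaultdict


def to_sequences(rows, feature_keys, batch_key="batch", id_key="id", label_key="label"):
    """Group rows into per-batch sequences ordered by ID (many-to-one labels)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[batch_key]].append(row)

    sequences, labels = [], []
    for batch in grouped:  # dicts keep insertion order, so batches stay in input order
        steps = sorted(grouped[batch], key=lambda r: r[id_key])
        # One inner list of feature values per time-step.
        sequences.append([[r[f] for f in feature_keys] for r in steps])
        # Many-to-one: every row of a batch carries the same label, so take the first.
        labels.append(steps[0][label_key])
    return sequences, labels


rows = [
    {"batch": "1", "id": 1, "temperature": 70.1, "vibration": 0.02, "label": "good"},
    {"batch": "1", "id": 2, "temperature": 70.4, "vibration": 0.03, "label": "good"},
    {"batch": "1", "id": 3, "temperature": 70.2, "vibration": 0.02, "label": "good"},
    {"batch": "2", "id": 1, "temperature": 85.0, "vibration": 0.40, "label": "bad"},
    {"batch": "2", "id": 2, "temperature": 86.3, "vibration": 0.45, "label": "bad"},
]

sequences, labels = to_sequences(rows, ["temperature", "vibration"])
lengths = [len(seq) for seq in sequences]
```

The result is one sequence per day with differing lengths (`lengths` is `[3, 2]`) and one label per sequence (`labels` is `["good", "bad"]`).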
Now consider another example with the same data, in which we would like to make a prediction for each time-step. In that case, we would provide a numeric label attribute containing a label value for each time-step. This is a many-to-many problem. The Operator can detect this automatically.
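How the Operator distinguishes the two cases is not described here. One plausible heuristic, shown purely as an assumption, is to treat the data as many-to-one when every batch carries a single label value and as many-to-many otherwise. The function name `infer_label_type` and the column names are hypothetical.

```python
from collections import defaultdict


def infer_label_type(rows, batch_key="batch", label_key="label"):
    """Guess the labeling type from the data (assumed heuristic, see lead-in)."""
    labels_per_batch = defaultdict(set)
    for row in rows:
        labels_per_batch[row[batch_key]].add(row[label_key])
    # Exactly one distinct label in every batch suggests one label per sequence.
    if all(len(values) == 1 for values in labels_per_batch.values()):
        return "many-to-one"
    return "many-to-many"


per_batch_labels = [
    {"batch": "1", "id": 1, "label": "good"},
    {"batch": "1", "id": 2, "label": "good"},
    {"batch": "2", "id": 1, "label": "bad"},
    {"batch": "2", "id": 2, "label": "bad"},
]
per_step_labels = [
    {"batch": "1", "id": 1, "label": 0.1},
    {"batch": "1", "id": 2, "label": 0.4},
    {"batch": "2", "id": 1, "label": 0.3},
    {"batch": "2", "id": 2, "label": 0.9},
]

type_a = infer_label_type(per_batch_labels)
type_b = infer_label_type(per_step_labels)
```

With this heuristic, `type_a` is `"many-to-one"` and `type_b` is `"many-to-many"`. A rule like this would misjudge per-step labels that happen to be constant within every batch, which is one situation where selecting the label type manually is the safer choice.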
After converting series-like data to a Tensor object, use the "Deep Learning (Tensor)" Operator to train a Deep Learning model on it and the "Apply Model (Generic)" Operator to apply the model.
The neural network architecture for series-like problems often features one or more LSTM layers followed by fully connected layers.
Input
ExampleSet
A collection of ExampleSets containing time-series-like data. A single ExampleSet is treated as one Example in the tensor, while each row of an ExampleSet is seen as a time-step, with columns representing the given Attributes. Make sure that all ExampleSets have the same Attribute structure; the number of time-steps (rows) can differ.
Output
Tensor
A Tensor object suitable for use with the "Deep Learning (Tensor)" and "Apply Model (Generic)" Operators.
original
The ExampleSet that was given as input is passed through without changes.
Parameters
Batch attribute
Selects the attribute used for sequence identification. Records with the same value automatically belong to the same sequence (as its elements/steps). Should be sorted in ascending order.
Id attribute
Selects the attribute used to identify the elements/steps within a sequence. Should be sorted in ascending order. Among records with the same batch attribute value, the order of the elements/steps is defined by the values of this attribute.
Infer label sequence type
If TRUE, the sequence type (many-to-one or many-to-many) is determined automatically from the data. Otherwise, the user needs to select the appropriate setting manually.
Label type
The "sequence type" (many-to-one or many-to-many) to use for classification and regression analysis with deep learning. Select it manually when "Infer label sequence type" is not TRUE.