Random Forest Encoder
Synopsis
This operator applies a Random Forest model to a data set. Unlike the usual Apply Model operator, it does not create a prediction and overall confidences; instead it returns the confidence for the positive class from each individual tree in the forest. The result is an ExampleSet with X new attributes (where X is the number of trees) called score_X. The operator can therefore be used as an encoder. One application is to build a voting model more sophisticated than the typical average vote by training another learner on the per-tree scores. Another use case is to encode nominal features as numerical ones.
This operator also provides a preprocessing model. This preprocessing model can be grouped with a subsequent model so that the two are applied one after the other.
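The encoding idea described above can be sketched outside the operator. The following is a minimal illustration using scikit-learn as a stand-in, not the operator's own implementation. The helper name `encode_with_forest`, the synthetic data, the choice of label 1 as the positive class, and the logistic regression meta-learner are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def encode_with_forest(forest, X, positive_class=1):
    """Return one column per tree: that tree's confidence for the positive class.

    Hypothetical helper; mirrors the idea of the operator, not its code.
    """
    # Column index of the positive class in the forest's class list.
    pos_idx = list(forest.classes_).index(positive_class)
    # Ask every tree separately instead of averaging over the forest.
    per_tree = [tree.predict_proba(X)[:, pos_idx] for tree in forest.estimators_]
    return np.column_stack(per_tree)


# Synthetic binary data (assumption: labels 0/1, with 1 as the positive class).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
X_forest, X_meta, y_forest, y_meta = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# Train the forest on one half, encode the other half.
forest = RandomForestClassifier(n_estimators=10, random_state=0).fit(X_forest, y_forest)
scores = encode_with_forest(forest, X_meta)  # shape: (n_examples, n_trees)

# A second learner trained on the per-tree scores replaces the plain average vote.
meta = LogisticRegression(max_iter=1000).fit(scores, y_meta)
```

Averaging the columns of `scores` reproduces the forest's ordinary positive-class confidence in scikit-learn, so the second learner starts from the same information as the average vote and is free to weight individual trees differently.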
Input
exa
Input ExampleSet which should be encoded.
mod
Random Forest model which is used for encoding.
Output
exa
The ExampleSet with the result of the application.
mod
The passed through Random Forest model.
pre
A preprocessing model which can be used to apply the same transformation to another data set. This can also be used with the Group Models operator.
Parameters
Remove original attributes
If checked, all original attributes are removed from the resulting ExampleSet and only the encoding attributes are kept.
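The effect of this parameter can be illustrated with a small sketch in plain Python. The function `attach_scores`, the sample rows, and the score values are hypothetical, and numbering the score attributes from 1 is an assumption; the text above only states that they are called score_X.

```python
def attach_scores(rows, scores, remove_original=False):
    """Add per-tree score attributes to each row.

    rows:   list of dicts, one per example (the original attributes)
    scores: list of lists, one inner list of per-tree confidences per example
    """
    encoded = []
    for row, row_scores in zip(rows, scores):
        # Start from an empty record when the original attributes are dropped.
        out = {} if remove_original else dict(row)
        for i, value in enumerate(row_scores, start=1):
            out[f"score_{i}"] = value  # index base of 1 is an assumption
        encoded.append(out)
    return encoded


# Made-up example data: two examples, three trees.
rows = [{"age": 31, "city": "Berlin"}, {"age": 54, "city": "Boston"}]
scores = [[0.9, 0.2, 0.7], [0.1, 0.4, 0.0]]

kept = attach_scores(rows, scores)                              # unchecked
only_scores = attach_scores(rows, scores, remove_original=True)  # checked
```

With the parameter unchecked, each row keeps `age` and `city` next to the score attributes; with it checked, only the score attributes remain.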