Generate Univariate Interpretation

Synopsis

This operator helps you understand the univariate relationship between your prediction and your attributes. It currently supports two methods: Partial Dependency Plots (PDP) and Accumulated Local Effect (ALE) plots.

Description

Partial Dependency Plots are a tool to understand how your model depends on a certain attribute. Take the Titanic data set as an example. We set the attribute at hand to its minimal value for every example and score the data with the model. We can then calculate the average confidence for the positive class or, in the case of a regression problem, the average of the prediction. We then repeat the procedure with successively higher values of the attribute. The number of steps is defined by the number of bins parameter. The result is a data set that describes the behaviour of the prediction with respect to the attribute. A line chart is often used to display it.
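The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the operator's actual implementation: the function name `partial_dependence`, the toy model `toy_predict`, and the choice of an evenly spaced grid with `n_bins` points between the minimum and maximum are all assumptions made for the example.

```python
import numpy as np

def partial_dependence(predict, X, column, n_bins=10):
    """Average prediction when `column` is forced to each grid value.

    `predict` is any callable mapping a 2-D array to one score per row
    (a confidence or a regression value). Hypothetical helper.
    """
    X = np.asarray(X, dtype=float)
    # Assumed grid: n_bins evenly spaced values from min to max of the column.
    grid = np.linspace(X[:, column].min(), X[:, column].max(), n_bins)
    averages = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, column] = value          # overwrite the attribute for every row
        averages.append(predict(X_mod).mean())
    return grid, np.array(averages)

# Hypothetical toy model: prediction = 2 * x0 + x1
def toy_predict(X):
    return 2.0 * X[:, 0] + X[:, 1]

rng = np.random.default_rng(0)
X_demo = rng.uniform(0.0, 1.0, size=(200, 2))   # synthetic data
grid, pdp = partial_dependence(toy_predict, X_demo, column=0, n_bins=5)
```

Plotting `pdp` against `grid` as a line chart gives the partial dependency curve. For this toy model the curve is a straight line with slope 2, because the model depends linearly on the first column.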

A frequently cited downside of Partial Dependency Plots is that they ignore correlations between the attributes of a data set. In the Titanic example you may set the fare of a first class passenger to 6, which is unreasonable. This problem can be addressed by using Accumulated Local Effect plots.

In Accumulated Local Effect plots we first discretize the column we want to look at. In the case of the Titanic data set with Age as the attribute to interpret, we may first look at the passengers between 0 and 10 years. We take those passengers, set their age to the minimum value of the bin (0) and score them with our model. We then set the age of each of these passengers to the maximal value of the bin (10) and score them again. For each passenger we calculate the difference between the confidences or regression values at the upper and the lower value. The average of these differences reflects the local effect of the attribute in this bin. The ALE value for a bin is then the sum of the local effects of all preceding bins and the given bin. This effect is centered so that the mean effect is zero.
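The ALE steps described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the operator's code: the name `accumulated_local_effects` is hypothetical, bins are equal-width here, and centering is done by subtracting the example-weighted mean of the accumulated values.

```python
import numpy as np

def accumulated_local_effects(predict, X, column, n_bins=10):
    """Return (bin edges, centered ALE value per bin) for one column."""
    X = np.asarray(X, dtype=float)
    x = X[:, column]
    edges = np.linspace(x.min(), x.max(), n_bins + 1)   # assumed equal-width bins
    # Bin index 0..n_bins-1 for every row (each bin includes its upper edge).
    bin_idx = np.digitize(x, edges[1:-1], right=True)
    local = np.zeros(n_bins)
    counts = np.zeros(n_bins, dtype=int)
    for b in range(n_bins):
        rows = X[bin_idx == b]
        counts[b] = len(rows)
        if len(rows) == 0:
            continue                       # empty bin contributes no local effect
        lower, upper = rows.copy(), rows.copy()
        lower[:, column] = edges[b]        # attribute set to the bin minimum
        upper[:, column] = edges[b + 1]    # attribute set to the bin maximum
        local[b] = (predict(upper) - predict(lower)).mean()
    ale = np.cumsum(local)                 # accumulate over preceding bins
    ale -= np.average(ale, weights=counts) # center so the mean effect is zero
    return edges, ale

# Hypothetical toy model: prediction = 2 * x0 + x1
def toy_predict(X):
    return 2.0 * X[:, 0] + X[:, 1]

rng = np.random.default_rng(0)
X_demo = rng.uniform(0.0, 1.0, size=(200, 2))   # synthetic data
edges, ale = accumulated_local_effects(toy_predict, X_demo, column=0, n_bins=5)
```

Only the rows that actually fall into a bin are re-scored for that bin, so the model is never asked about combinations such as a first class passenger with a very low fare unless they occur in the data.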

The value of the ALE can be interpreted as the main effect of the feature at a certain value compared to the average prediction of the data. For example, an ALE estimate of -2 at xj=3 means that when the j-th feature has value 3, then the prediction is lower by 2 compared to the average prediction. For more details, please see: https://christophm.github.io/interpretable-ml-book/ale.html

Input

model

The model whose predictions are to be explained.

exa

The training data.

Output

exa

The data set with all interpretations.

ori

The original data passed through.

mod

The original model passed through.

Parameters

Number of bins

Number of bins to use in PDP calculation.

Method

The method used to generate the interpretation.

  • PDP: Uses Partial Dependency Plots.
  • ALE: Uses Accumulated Local Effect plots.

Equal size binning

If set to true, the bins used in ALE have the same size, i.e. they contain the same number of examples. This leads to bins of different widths, which may look unusual but increases the stability of the values.
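The difference between the two binning modes can be illustrated with a short sketch. The helper `bin_edges` and the synthetic, skewed `fares` values are assumptions for this example. Quantile-based edges are one common way to obtain bins with equal numbers of examples and may differ from what the operator does internally.

```python
import numpy as np

def bin_edges(values, n_bins, equal_size=True):
    """Hypothetical helper returning n_bins + 1 bin edges."""
    values = np.asarray(values, dtype=float)
    if equal_size:
        # Same number of examples per bin: edges at evenly spaced quantiles.
        return np.quantile(values, np.linspace(0.0, 1.0, n_bins + 1))
    # Same width per bin: edges evenly spaced between min and max.
    return np.linspace(values.min(), values.max(), n_bins + 1)

rng = np.random.default_rng(0)
fares = rng.exponential(scale=30.0, size=1000)   # synthetic, skewed values

size_edges = bin_edges(fares, n_bins=4, equal_size=True)
width_edges = bin_edges(fares, n_bins=4, equal_size=False)
size_counts, _ = np.histogram(fares, bins=size_edges)
width_counts, _ = np.histogram(fares, bins=width_edges)
print("equal size :", size_counts)
print("equal width:", width_counts)
```

With skewed data, equal-width bins leave only a few examples in the upper bins, so the averaged differences there rest on little data. Equal-size bins keep the example count per bin roughly constant at the cost of uneven bin widths.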