Local Interpretation (LIME)
Synopsis
This operator is a meta operator that generates an approximation of the decision a given (complex) model made for specific examples. The key idea is to generate local feature weights ("Interpretations") that are easier to interpret and thus help to understand the "reasoning" behind a complex model's decision on a per-example basis.
Description
To do this we run the following algorithm:
1. Draw uniformly distributed random data in [0,1] for all attributes.
2. Scale them to the same min/max as the input data.
3. Score them with the complex model.
For each example in your input set:
4. Calculate the Euclidean distance d between the example and your random examples (normalized).
5. Use w = sqrt(exp(-(pow(d,2)/pow(this.kernel_width,2)))) as a weight for the local model.
6. Run the inner process, which creates feature weights (e.g. via Weight by XXX, Linear Regression or Logistic Regression).
7. Add the top k attributes with their importance to the input set.
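The sampling, distance and weighting steps of the algorithm above can be sketched in plain Python. This is a minimal illustration, not the operator's implementation: the function names, the toy data and the `black_box` model are hypothetical, and the sketch assumes that "normalized" means dividing each attribute difference by its min/max range.

```python
import math
import random


def sample_like(data, n_samples, rng):
    """Draw uniform values in [0, 1] and rescale them to each attribute's min/max."""
    n_atts = len(data[0])
    mins = [min(row[j] for row in data) for j in range(n_atts)]
    maxs = [max(row[j] for row in data) for j in range(n_atts)]
    samples = [
        [mins[j] + rng.random() * (maxs[j] - mins[j]) for j in range(n_atts)]
        for _ in range(n_samples)
    ]
    return samples, mins, maxs


def normalized_distance(a, b, mins, maxs):
    """Euclidean distance after scaling each attribute by its range (assumption)."""
    total = 0.0
    for x, y, lo, hi in zip(a, b, mins, maxs):
        span = (hi - lo) or 1.0  # guard against constant attributes
        total += ((x - y) / span) ** 2
    return math.sqrt(total)


def kernel_weight(d, kernel_width):
    """w = sqrt(exp(-(d^2 / kernel_width^2))), as given in the description."""
    return math.sqrt(math.exp(-(d ** 2) / kernel_width ** 2))


def black_box(row):
    """Hypothetical stand-in for the complex model that scores the samples."""
    return 1.0 if row[0] + row[1] / 10.0 > 3.0 else 0.0


rng = random.Random(42)
data = [[0.0, 10.0], [1.0, 20.0], [2.0, 30.0]]  # toy input set
samples, mins, maxs = sample_like(data, 100, rng)
scores = [black_box(s) for s in samples]

# Weights of all random samples relative to the first input example.
example = data[0]
weights = [
    kernel_weight(normalized_distance(example, s, mins, maxs), kernel_width=0.5)
    for s in samples
]
```

The `scores` and `weights` lists are what a weighted local learner in the inner process would then consume, one weight vector per input example.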
As a result you get an ExampleSet with new attributes describing the most important local attributes, and a collection of attribute weights containing the full weight vector for each example. If you calculate a performance vector in the inner process and connect it to the Per port, the performance of the local description is attached to every example.
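As an illustration of how the most important local attributes might be picked from a full weight vector, here is a small sketch. The attribute names and weights are made up, and ranking by absolute weight is an assumption; the operator's actual ordering rule is not stated here.

```python
def top_k_attributes(weights, k):
    """Return the k (attribute, weight) pairs with the largest absolute weight."""
    ranked = sorted(weights.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:k]


# Hypothetical local weights for a single example.
local_weights = {"age": 0.12, "income": -0.85, "tenure": 0.40}
top = top_k_attributes(local_weights, 2)
```

Here `top` holds the two strongest attributes for the example, while `local_weights` corresponds to the full vector.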
- https://homes.cs.washington.edu/~marcotcr/blog/lime/
- https://arxiv.org/pdf/1602.04938v1.pdf
- https://github.com/marcotcr/lime
Input
exa
The ExampleSet you want to get interpretations for. It needs to be large enough to estimate the minimum and maximum of each attribute.
mod
The (complex) input model.
Output
exa
ExampleSet with local interpretations.
mod
The passed through input model.
wei
A collection of Attribute Weights for each example.
loc
The collection of local models.
Parameters
Use locality heuristics
If this parameter is set to true, the locality heuristic derived from LIME (0.2*sqrt(#atts)) is used; otherwise the locality has to be set manually.
Locality
A factor describing how local the model should be. The smaller this value, the more localized the model. It is used as kernel_width in step 5.
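To get a feel for how the locality influences the sample weights, the following sketch evaluates the weight formula from the description for one fixed distance and several kernel widths, including the heuristic 0.2*sqrt(#atts). The function names are illustrative and not part of the operator.

```python
import math


def heuristic_kernel_width(n_attributes):
    """Locality heuristic quoted above: 0.2 * sqrt(number of attributes)."""
    return 0.2 * math.sqrt(n_attributes)


def kernel_weight(d, kernel_width):
    """w = sqrt(exp(-(d^2 / kernel_width^2)))."""
    return math.sqrt(math.exp(-(d ** 2) / kernel_width ** 2))


d = 0.5  # normalized distance between an example and a random sample
for width in (0.2, heuristic_kernel_width(4), 1.0):
    print(f"kernel_width={width:.2f} -> weight={kernel_weight(d, width):.4f}")
```

For the same distance, a smaller width yields a smaller weight, so distant samples contribute less and the local model becomes more localized.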
Sample size
Number of random examples drawn to build the local models on.
Number of attributes
Number of attributes put into the ExampleSet in step 7. All attribute weights are delivered via the 'wei' port.
Weight threshold
A threshold to remove all examples with weights smaller than this value in each iteration. This removes irrelevant (non-local) random examples from the learning and can significantly speed up the operator.
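The effect of the threshold can be pictured as a simple filter applied before the local model is trained. This is a sketch with made-up samples and weights; `filter_by_weight` is a hypothetical helper, not an operator API.

```python
def filter_by_weight(samples, weights, threshold):
    """Drop random samples whose weight is smaller than `threshold`."""
    kept = [(s, w) for s, w in zip(samples, weights) if w >= threshold]
    kept_samples = [s for s, _ in kept]
    kept_weights = [w for _, w in kept]
    return kept_samples, kept_weights


samples = [[0.1, 0.2], [0.9, 0.8], [0.5, 0.5]]
weights = [0.95, 0.001, 0.30]
kept_samples, kept_weights = filter_by_weight(samples, weights, threshold=0.01)
```

Only the two samples with weights 0.95 and 0.30 remain, so the local learner runs on fewer rows in that iteration.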
Use local random seed
This parameter indicates if a local random seed should be used.
Local random seed
If the Use local random seed parameter is checked, this parameter determines the local random seed.
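The general idea behind a local seed is that a fixed seed makes the random sample, and therefore the resulting interpretations, reproducible. A sketch using Python's standard `random` module follows; `draw_samples` and the seed values are illustrative only.

```python
import random


def draw_samples(n_samples, n_attributes, seed=None):
    """Draw uniform samples in [0, 1); a fixed seed gives repeatable draws."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(n_attributes)] for _ in range(n_samples)]


first = draw_samples(5, 3, seed=1992)
second = draw_samples(5, 3, seed=1992)  # same seed, same samples
other = draw_samples(5, 3, seed=7)      # different seed, different samples
```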