Add Batch Normalization Layer
Synopsis
Adds a batch normalization layer to your neural net structure.
Description
This operator has to be placed inside the subprocess of the Deep Learning, Deep Learning (Tensor) or Autoencoder operator. It adds a batch normalization layer to the neural net structure. Batch normalization is often applied just before an activation function. To set this up, deactivate the activation function of the previous layer, add the batch normalization layer and follow it with an activation layer configured with the desired activation function.
Batch normalization is a method used to potentially speed up training (depending on the size of the data) and to reduce the so-called internal covariate shift. Reducing covariate shift eases the optimization process. Batch normalization works by subtracting the mean of the current batch from its entries and dividing them by their standard deviation. During training, the mean and standard deviation are computed from the current batch, while for testing they are obtained from the whole data set to avoid bias.
After normalization, the batch values can also be scaled and shifted by an offset, using the gamma and beta parameters.
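The normalize-then-scale-and-shift step described above can be sketched as follows. This is an illustrative example only, not the operator's actual implementation; the names gamma, beta and epsilon mirror the operator parameters of the same names.

```python
import numpy as np

def batch_norm(batch, gamma=1.0, beta=0.0, epsilon=1e-5):
    """Normalize a batch per feature, then scale by gamma and shift by beta.

    Illustrative sketch of the batch normalization computation;
    not the operator's internal code.
    """
    mean = batch.mean(axis=0)  # per-feature mean of the current batch
    var = batch.var(axis=0)    # per-feature variance of the current batch
    # epsilon keeps the denominator away from zero for constant features
    normalized = (batch - mean) / np.sqrt(var + epsilon)
    return gamma * normalized + beta  # scale (gamma) and offset (beta)

# Example: a batch of 4 samples with 2 features
batch = np.array([[1.0, 10.0],
                  [2.0, 20.0],
                  [3.0, 30.0],
                  [4.0, 40.0]])
out = batch_norm(batch)
# Each column of `out` now has approximately zero mean and unit variance.
```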
Input
layerArchitecture
A network configuration setup with previous operators. Connect this port to the layerArchitecture output port of another add layer operator or to the layer port of the "Deep Learning" operator if this layer is the first one.
Output
layerArchitecture
The network with the configuration for this batch normalization layer added. Connect this port to the next input port of another layer or the layer port on the right side of the "Deep Learning" operator.
Parameters
Gamma
Provide a scaling factor used to change the data after it has been normalized.
Beta
Provide an offset value that is added to the data after it has been normalized.
Lock gamma and beta
Check this option to keep the gamma and beta values fixed during the whole training process. Uncheck it if these values should be updated using the decay parameter.
Decay
Provide a value that is subtracted from the initial gamma value before it is used for scaling the mean.
Epsilon
Provide a small value that is added to the variance to avoid division by zero when the variance is very small.
Layer name
Provide a name for the layer for easier identification when inspecting the model or re-using it.