Activation functions

Activation functions such as the rectified linear unit (ReLU) or sigmoid are treated as layers within neon. For convenience, these functions are wrapped inside the Activation layer, which handles much of the layer-specific boilerplate. neon has the following activation functions:

Name                        Description
neon.transforms.Identity    \(f(x) = x\)
neon.transforms.Rectlin     \(f(x) = \max(x, 0)\)
neon.transforms.Explin      \(f(x) = \max(x, 0) + \alpha (e^{\min(x, 0)}-1)\)
neon.transforms.Normalizer  \(f(x) = x / \alpha\)
neon.transforms.Softmax     \(f(x_j) = \frac{\exp(x_j)}{\sum_i \exp(x_i)}\)
neon.transforms.Tanh        \(f(x) = \tanh(x)\)
neon.transforms.Logistic    \(f(x) = \frac{1}{1+e^{-x}}\)
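For intuition, the formulas above can be evaluated directly in NumPy. This is an illustrative sketch only: neon evaluates its transforms on the configured backend (CPU or GPU), not via NumPy, and the function names below are local stand-ins, not neon API.

```python
import numpy as np

def rectlin(x):
    # f(x) = max(x, 0)
    return np.maximum(x, 0)

def explin(x, alpha=1.0):
    # f(x) = max(x, 0) + alpha * (exp(min(x, 0)) - 1)
    return np.maximum(x, 0) + alpha * (np.exp(np.minimum(x, 0)) - 1)

def softmax(x):
    # f(x_j) = exp(x_j) / sum_i exp(x_i); shifted by max(x) for
    # numerical stability (the shift cancels in the ratio)
    e = np.exp(x - np.max(x))
    return e / e.sum()

def logistic(x):
    # f(x) = 1 / (1 + exp(-x))
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-2.0, 0.0, 3.0])
print(rectlin(x))        # negatives clipped to zero
print(softmax(x).sum())  # probabilities sum to 1
```

Note that Softmax normalizes over its inputs, so it is typically reserved for the final layer of a classifier.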

Creating custom activations

To create a new activation function, subclass Transform and implement the __call__() (forward) and bprop() (backward) methods.

As an example, we implement the ReLU function:

class MyReLu(Transform):
    """ReLU activation function. Implements f(x) = max(0, x)."""

    def __init__(self, name=None):
        super(MyReLu, self).__init__(name)

    # f(x) = max(0, x)
    def __call__(self, x):
        return self.be.maximum(x, 0)

    # If x > 0, gradient is 1; otherwise 0.
    def bprop(self, x):
        return self.be.greater(x, 0)

Both methods receive as input a Tensor x.
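The forward/backward logic above can be sanity-checked independently of neon's backend. In this sketch, NumPy's `maximum(x, 0)` and the indicator `x > 0` stand in for the `self.be.maximum` and `self.be.greater` backend calls:

```python
import numpy as np

# NumPy stand-ins for the two backend ops used by MyReLu.
# Illustrative only; neon would run these on its backend tensors.

def relu_fprop(x):
    # forward: f(x) = max(0, x)
    return np.maximum(x, 0)

def relu_bprop(x):
    # local gradient: 1 where x > 0, else 0
    return (x > 0).astype(x.dtype)

x = np.array([-1.5, 0.0, 2.0])
print(relu_fprop(x))  # negatives and zero map to 0
print(relu_bprop(x))  # gradient is 1 only where x > 0
```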

In most models, activation functions are appended to a filtering (e.g. convolution) or linear (all-to-all) layer. For this reason, neon provides several convenient CompoundLayer classes (Affine, Conv, and Deconv). For example, a linear layer followed by the custom MyReLu function can be instantiated via

layers = [Affine(nout=1000, activation=MyReLu())]