This API documentation covers each module within neon. Most modules have a corresponding user guide section that introduces the main concepts. Refer to this API reference for specific function definitions.

Module | API Description | User Guide
neon | Holds NervanaObject, the base object available to all other classes |
neon.backends | Computational backend (CPU, MKL, or GPU) | neon backend
neon.data | Data loading and handling | Data loading, Datasets
neon.models | Model architecture | Models
neon.layers | Layer objects | Layers, Creating new layers, Layer containers
neon.initializers | Weight initializer methods | Initializers
neon.transforms | Activation functions and costs/metrics | Activations, Costs and Metrics
neon.callbacks | Callbacks during model training | Callbacks
neon.optimizers | Learning algorithms | Optimizers, Learning schedules
neon.visualizations | Visualization of training cost and weight histograms | Visualizing results
neon.util | Utility module |


The base (global) object NervanaObject contains the attribute be, the reference to the computational backend.

neon.NervanaObject Base (global) object available to all other classes.


This module defines the computational backend of neon, based on CPU, MKL, or GPU hardware. Included are classes that implement neon’s auto-differentiation feature.

neon.backends.gen_backend Construct and return a backend instance of the appropriate type based on the arguments given.
neon.backends.backend.Tensor The n-dimensional array data structure.
neon.backends.backend.Backend Backend interface used to manipulate Tensor data.
neon.backends.backend.OpTreeNode An OpTreeNode is a tuple of length 3.
neon.backends.backend.Block Simple class that identifies different elements of the computation required to train or run inference on neural networks.
neon.backends.nervanacpu.CPUTensor The n-dimensional array data structure that resides in host memory, and is meant to be manipulated on the CPU.
neon.backends.nervanacpu.NervanaCPU Sets up a numpy-based backend for matrix ops.
neon.backends.nervanamkl.MKLTensor The n-dimensional array data structure specialized for the MKL backend.
neon.backends.nervanamkl.NervanaMKL Sets up an MKL-accelerated backend for matrix ops.
neon.backends.autodiff.Autodiff Automatic differentiation given an op-tree.
neon.backends.autodiff.GradNode A node in grad_tree.
neon.backends.autodiff.GradUtil.get_grad_back Get left and right gradient increments from back-propagation.
neon.backends.autodiff.GradUtil.is_invalid Test if the result of grad_op_tree contains NaN, inf, -inf, or abnormally large or small numbers.
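The op-tree representation lends itself to reverse-mode automatic differentiation, in which gradients are pushed from the root of the tree back to its leaves. The following is a minimal, generic sketch of that idea in pure Python, not neon's OpTreeNode/Autodiff implementation; the tuple layout and names here are illustrative:

```python
# Minimal reverse-mode autodiff over a tiny op-tree of nested tuples
# (op, left, right). Illustrative only -- not neon's Autodiff API.

VALUES = {"x": 3.0, "y": 4.0}   # leaf variable bindings

def evaluate(tree):
    """Forward pass: compute the value of an op-tree."""
    if isinstance(tree, str):
        return VALUES[tree]
    if not isinstance(tree, tuple):
        return float(tree)
    op, left, right = tree
    l, r = evaluate(left), evaluate(right)
    return l + r if op == "add" else l * r

def backprop(tree, seed, grads):
    """Backward pass: push the incoming gradient `seed` down the tree,
    accumulating gradients for each leaf variable into `grads`."""
    if isinstance(tree, str):
        grads[tree] = grads.get(tree, 0.0) + seed
    elif isinstance(tree, tuple):
        op, left, right = tree
        if op == "add":            # d(l+r): pass seed to both children
            backprop(left, seed, grads)
            backprop(right, seed, grads)
        else:                      # "mul": d(l*r) = r*dl + l*dr
            backprop(left, seed * evaluate(right), grads)
            backprop(right, seed * evaluate(left), grads)

# f = x*y + x  ->  df/dx = y + 1 = 5.0, df/dy = x = 3.0
grads = {}
backprop(("add", ("mul", "x", "y"), "x"), 1.0, grads)
```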


Data-related classes and methods comprise this module, including methods for loading data and iterating through minibatches of data during training.

neon.data.dataiterator.NervanaDataIterator Abstract class for data iterators.
neon.data.dataiterator.ArrayIterator The ArrayIterator class iterates over minibatches of data that have been preloaded into memory in the form of numpy arrays.
neon.data.hdf5iterator.HDF5Iterator Data iterator which uses an HDF5 file as the source of the data, useful when the entire HDF5 dataset cannot fit into memory (for smaller datasets, use the ArrayIterator).
neon.data.hdf5iterator.HDF5IteratorAutoencoder Extends the base HDF5Iterator class for an autoencoder model.
neon.data.hdf5iterator.HDF5IteratorOneHot Extends the HDF5Iterator class to add one hot conversion of the target data.
neon.data.imageloader.ImageLoader Encapsulates the data loader library and exposes a backward-compatible API to iterate over minibatches of images.
neon.data.dataloader.DataLoader Encapsulates the data loader library and exposes an API to iterate over generic data (images, video or audio given in compressed form).
neon.data.media.ImageParams Used to provide image specific parameters while loading data.
neon.data.media.VideoParams Used to provide video specific parameters while loading data.
neon.data.media.AudioParams Used to provide audio specific parameters while loading data.
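Conceptually, an ArrayIterator-style iterator walks over preloaded arrays one minibatch at a time. The following is a hedged pure-Python sketch of that loop; neon's ArrayIterator additionally moves each batch onto the backend (device) tensors and handles wrap-around, which this sketch omits (the partial final batch is simply dropped here):

```python
# Sketch of ArrayIterator-style minibatching over preloaded data.
# Pure Python for illustration; not neon's dataiterator API.

def minibatches(data, targets, batch_size):
    """Yield (inputs, targets) slices of size batch_size; the final
    partial batch is dropped in this simplified sketch."""
    n = (len(data) // batch_size) * batch_size
    for start in range(0, n, batch_size):
        yield (data[start:start + batch_size],
               targets[start:start + batch_size])

X = list(range(10))
y = [v % 2 for v in X]
batches = list(minibatches(X, y, batch_size=4))
# 10 samples with batch_size=4 -> 2 full batches of 4
```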


The DataLoader and ImageLoader classes will be deprecated with the upcoming release of neon v2.0 in favor of the new Aeon-based DataLoader. For documentation of the aeon package, see http://aeon.nervanasys.com.

The new Aeon-based dataloader supports several classes that perform transformations on the data provisioned by aeon:

neon.data.dataloader_transformers.DataLoaderTransformer DataLoaderTransformers are used to transform the output of a DataLoader.
neon.data.dataloader_transformers.OneHot OneHot will convert index into a onehot vector.
neon.data.dataloader_transformers.PixelWiseOneHot Converts each pixel’s class index into a onehot vector.
neon.data.dataloader_transformers.TypeCast TypeCast data from dataloader at index to dtype and move into device memory if not already.
neon.data.dataloader_transformers.BGRMeanSubtract Subtracts pixel_mean from data at index. Assumes data is in CxHxWxN format.
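The OneHot transformer's job can be shown in a few lines: each integer class index becomes a vector with a single 1 at that index. This is an illustrative pure-Python sketch, not the dataloader_transformers API (which operates on backend tensors):

```python
# Sketch of the OneHot transformation: integer class indices -> one-hot
# vectors. Illustrative only; neon's version works on device tensors.

def onehot(indices, nclasses):
    """Map each index i to a length-nclasses vector with a 1 at position i."""
    return [[1 if j == i else 0 for j in range(nclasses)] for i in indices]

vecs = onehot([2, 0], nclasses=3)
# [[0, 0, 1], [1, 0, 0]]
```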

Dataset objects for storing data from common modalities (e.g. Text), as well as specific stock datasets (e.g. MNIST, CIFAR-10, Penn Treebank) are included.

neon.data.datasets.Dataset Container class for stock datasets.
neon.data.image.MNIST MNIST data set from http://yann.lecun.com/exdb/mnist/
neon.data.image.CIFAR10 CIFAR10 data set from https://www.cs.toronto.edu/~kriz/cifar.html
neon.data.imagecaption.ImageCaption This class loads in the sentences and CNN image features for image captioning that have been taken from Andrej Karpathy’s Deep Visual-Semantic Alignments work.
neon.data.imagecaption.Flickr8k Flickr8k data set from http://cs.stanford.edu/people/karpathy/cvpr2015.pdf
neon.data.imagecaption.Flickr30k Flickr30k data set from http://cs.stanford.edu/people/karpathy/cvpr2015.pdf
neon.data.imagecaption.Coco MSCOCO data set from http://cs.stanford.edu/people/karpathy/cvpr2015.pdf
neon.data.text.Text This class defines methods for loading and iterating over text datasets.
neon.data.text.Shakespeare Shakespeare data set from http://cs.stanford.edu/people/karpathy/char-rnn.
neon.data.text.PTB Penn Treebank data set from http://arxiv.org/pdf/1409.2329v5.pdf
neon.data.text.HutterPrize Hutter Prize data set from http://prize.hutter1.net/
neon.data.text.IMDB IMDB data set from http://www.aclweb.org/anthology/P11-1015.
neon.data.questionanswer.QA A general QA container to take Q&A dataset, which has already been vectorized and create a data iterator to feed data to training.
neon.data.questionanswer.BABI This class loads in the Facebook bAbI dataset and vectorizes them into stories, questions, and answers as described in: “Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks” http://arxiv.org/abs/1502.05698.
neon.data.ticker.Ticker This class defines methods for generating and iterating over ticker datasets.
neon.data.ticker.Task Base class from which ticker tasks inherit.
neon.data.ticker.CopyTask Copy task from the Neural Turing Machines paper: http://arxiv.org/abs/1410.5401.
neon.data.ticker.RepeatCopyTask Repeat Copy task from the Neural Turing Machines paper: http://arxiv.org/abs/1410.5401.
neon.data.ticker.PrioritySortTask Priority Sort task from the Neural Turing Machines paper: http://arxiv.org/abs/1410.5401.
neon.data.speech.Speech Not implemented.
neon.data.video.Video Not implemented.


The Model class stores a list of layers describing the model. Methods are provided to train the model weights, perform inference, and save/load the model.

neon.models.model.Model Class which stores a list of layers describing the model.


This module contains class definitions for common neural network layers. The base layers from which other layers are subclassed are:

neon.layers.layer.Layer Top level generic neural network layer class from which all other layer types inherit.
neon.layers.layer.ParameterLayer Intermediate class used for common functionality for any layer with weights.
neon.layers.layer.CompoundLayer Base class for macro layers.

Common Layers

neon.layers.layer.Bias A bias layer that adds a learned bias to inputs and produces outputs of the same shape.
neon.layers.layer.Linear A fully connected layer implemented as the dot product of inputs and weights.
neon.layers.layer.Affine A linear layer with a learned bias and activation, implemented as a list composing separate linear, bias/batchnorm and activation layers.
neon.layers.layer.Dropout A dropout layer.
neon.layers.layer.LookupTable A lookup table layer or a word embedding layer.
neon.layers.layer.Activation A layer that applies a specified transform to the inputs and produces outputs of the same shape.
neon.layers.layer.BatchNorm A batch normalization layer as described in [Ioffe2015].
neon.layers.layer.BatchNormAutodiff An example to use autodiff in batchnorm.
neon.layers.layer.Pooling Pooling layer implementation.
neon.layers.layer.LRN Local Response Normalization layer.
neon.layers.layer.DataTransform A layer that applies a specified transform to input data in fprop only.
neon.layers.layer.BranchNode Layer that allows branching.
neon.layers.layer.SkipNode Layer that allows pass-through as in [He2015].
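Composite layers such as Affine chain a linear transform, a learned bias, and an activation, i.e. y = f(Wx + b). A minimal numpy sketch of that composition (shapes and the Rectlin-style activation are illustrative assumptions, not neon's Layer API):

```python
import numpy as np

# Sketch of what an Affine layer composes: Linear (dot product), Bias,
# then Activation, i.e. y = f(W @ x + b). Uses the common
# (out_features, in_features) weight convention; not neon's Layer API.

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))    # Linear weights
b = np.zeros(4)                    # learned bias (starts at zero)
x = rng.standard_normal(3)         # one input sample

relu = lambda z: np.maximum(z, 0)  # Rectlin-style activation
y = relu(W @ x + b)                # output of the composite layer
```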

Convolutional Layers

neon.layers.layer.Convolution Convolutional layer implementation.
neon.layers.layer.Conv A convolutional layer with a learned bias and activation, implemented as a list composing separate Convolution, Bias and Activation layers.
neon.layers.layer.Deconvolution Deconvolutional layer implementation.
neon.layers.layer.Deconv Same as Conv layer, but implements a composite deconvolution layer.

Recurrent Layers

neon.layers.recurrent.Recurrent Basic recurrent layer.
neon.layers.recurrent.LSTM Long Short-Term Memory (LSTM) layer based on Hochreiter and Schmidhuber, Neural Computation 9(8): 1735-80 (1997).
neon.layers.recurrent.GRU Implementation of the Gated Recurrent Unit based on [Cho2014].
neon.layers.recurrent.BiRNN Basic bi-directional recurrent layer.
neon.layers.recurrent.BiLSTM Bi-directional Long Short-Term Memory (LSTM) layer.
neon.layers.recurrent.DeepBiRNN A stacked Bi-directional recurrent layer.
neon.layers.recurrent.DeepBiLSTM A stacked Bi-directional LSTM layer.
neon.layers.recurrent.RecurrentOutput A layer to combine the recurrent layer outputs over time steps.
neon.layers.recurrent.RecurrentSum A layer that sums over the recurrent layer outputs over time.
neon.layers.recurrent.RecurrentMean A layer that gets the averaged recurrent layer outputs over time.
neon.layers.recurrent.RecurrentLast A layer that only keeps the recurrent layer output at the last time step.
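The RecurrentSum, RecurrentMean, and RecurrentLast layers reduce a recurrent layer's per-time-step outputs down to a single vector. A small numpy sketch of the three reductions, with an assumed (time, features) layout for illustration (neon's layers operate on backend tensors with their own layout):

```python
import numpy as np

# Sketch of the RecurrentSum / RecurrentMean / RecurrentLast reductions:
# combine per-time-step recurrent outputs over the time axis.

outputs = np.array([[1.0, 2.0],
                    [3.0, 4.0],
                    [5.0, 6.0]])        # T=3 time steps, 2 features

recurrent_sum  = outputs.sum(axis=0)    # sum over time
recurrent_mean = outputs.mean(axis=0)   # average over time
recurrent_last = outputs[-1]            # keep only the last step
```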

Containers govern the structure of the model. For a linear cascade of layers, the Sequential container is sufficient. Models that have branching and merging should use the other containers.

neon.layers.container.LayerContainer Generic class used to encapsulate groups of layers and provide methods for propagating through the constituent layers and allocating memory.
neon.layers.container.Sequential Layer container that encapsulates a simple linear pathway of layers.
neon.layers.container.Tree Layer container that encapsulates branching pathways of layers with multiple output branches.
neon.layers.container.SingleOutputTree Subclass of the Tree container which returns only the output of the main branch (branch index 0) during inference.
neon.layers.container.Broadcast Parent class for MergeSum and MergeBroadcast.
neon.layers.container.MergeBroadcast Branches a single incoming layer or object (broadcast) into multiple output paths that are then combined again (merged).
neon.layers.container.MergeMultistream Merging multiple input sources via concatenation.
neon.layers.layer.RoiPooling RoiPooling uses max pooling to convert the features inside any ROI into a small feature map with a fixed spatial extent of H x W, where H and W are layer parameters independent of any particular ROI.

Generic cost layers are implemented in the following classes. Note that these classes subclass from NervanaObject, not any base layer class.

neon.layers.layer.GeneralizedCost A cost layer that applies the provided cost function and computes errors with respect to inputs and targets.
neon.layers.layer.GeneralizedCostMask A cost layer that applies the provided cost function and computes errors with respect to inputs and targets, with a mask applied to the deltas.
neon.layers.container.Multicost Class used to compute cost from a Tree container with multiple outputs.


Layer weights can be initialized with the following approaches:

neon.initializers.initializer.Initializer Abstract base class from which parameter tensor initializers inherit.
neon.initializers.initializer.Array Initializes parameters with values specified by a provided numpy array.
neon.initializers.initializer.Constant Initializes parameters as a constant.
neon.initializers.initializer.Gaussian Initializes parameters with a gaussian distribution with the provided mean and standard deviation.
neon.initializers.initializer.IdentityInit Initializes parameters with the identity matrix.
neon.initializers.initializer.Uniform Initializes parameters with random values drawn from a uniform distribution.
neon.initializers.initializer.GlorotUniform Initializes parameter tensors with values drawn from a uniform distribution ranging from \(-K\) to \(K\), where \(K=\sqrt{6 / (n_{in} + n_{out})}\) and \(n_{in}\), \(n_{out}\) are the input and output dimensions.
neon.initializers.initializer.Kaiming Initializes parameters with a zero-mean Gaussian distribution.
neon.initializers.initializer.Orthonormal Initializes parameters with the singular value decomposition of a random gaussian matrix.
neon.initializers.initializer.Xavier Initializes parameter tensors with values drawn from a uniform distribution ranging from \(-K\) to \(K\) We define \(K=\sqrt{3 / (n_{in})}\), where \(n_{in}\) is the number of input nodes.
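The Xavier bound above is straightforward to write out: draw every weight from a uniform distribution over \([-K, K]\) with \(K=\sqrt{3 / n_{in}}\). A small numpy sketch under that formula (the function name and signature are illustrative, not neon's Initializer API):

```python
import numpy as np

# Sketch of the Xavier initialization bound described above:
# weights ~ U(-K, K) with K = sqrt(3 / n_in). Illustrative only;
# not neon's Initializer interface.

def xavier(n_in, n_out, seed=0):
    k = np.sqrt(3.0 / n_in)
    rng = np.random.default_rng(seed)
    return rng.uniform(-k, k, size=(n_in, n_out))

W = xavier(300, 100)
# every entry lies within +/- sqrt(3/300) = 0.1
```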


This module contains activation functions, costs, and metrics.

Activation functions

neon.transforms.transform.Transform Base class for activation or cost functions and their derivatives.
neon.transforms.activation.Identity Identity activation function, \(f(x) = x\)
neon.transforms.activation.Explin Exponential Linear activation function, \(f(x) = \max(x, 0) + \alpha (e^{\min(x, 0)}-1)\)
neon.transforms.activation.Rectlin Rectified Linear Unit (ReLU) activation function, \(f(x) = \max(x, 0)\).
neon.transforms.activation.Softmax SoftMax activation function.
neon.transforms.activation.Tanh Hyperbolic tangent activation function, \(f(x) = \tanh(x)\).
neon.transforms.activation.Logistic Logistic sigmoid activation function, \(f(x) = 1 / (1 + \exp(-x))\)
neon.transforms.activation.Normalizer Normalize inputs by a fixed divisor.
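The formulas in the table above can be written out directly. Here they are as plain Python scalar functions for reference; neon's Transform classes apply the same formulas element-wise to backend tensors:

```python
import math

# The activation formulas from the table above, on scalars.

identity = lambda x: x                                   # f(x) = x
rectlin  = lambda x: max(x, 0.0)                         # ReLU
explin   = lambda x, alpha=1.0: (                        # exponential linear
    max(x, 0.0) + alpha * (math.exp(min(x, 0.0)) - 1))
logistic = lambda x: 1.0 / (1.0 + math.exp(-x))          # sigmoid
tanh     = math.tanh                                     # hyperbolic tangent
```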


neon.transforms.cost.Cost Base class for cost functions that are used during training.
neon.transforms.cost.CrossEntropyBinary Binary cross-entropy cost.
neon.transforms.cost.CrossEntropyMulti Multi-class cross-entropy cost.
neon.transforms.cost.SumSquared Total Squared Error cost function.
neon.transforms.cost.MeanSquared Average Squared Error cost function.
neon.transforms.cost.LogLoss LogLoss metric.
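As an example of the costs above, binary cross-entropy computes \(C = -\frac{1}{N}\sum_i \left[t_i \log y_i + (1-t_i)\log(1-y_i)\right]\). A hedged pure-Python sketch (neon's Cost classes operate on backend tensors and handle clipping internally):

```python
import math

# Sketch of the CrossEntropyBinary cost: average of
# -(t*log(y) + (1-t)*log(1-y)) over samples. Illustrative only.

def cross_entropy_binary(outputs, targets, eps=1e-12):
    total = 0.0
    for y, t in zip(outputs, targets):
        y = min(max(y, eps), 1.0 - eps)   # clip to avoid log(0)
        total -= t * math.log(y) + (1 - t) * math.log(1 - y)
    return total / len(outputs)

loss = cross_entropy_binary([0.9, 0.1], [1, 0])
# both predictions confident and correct -> low loss
```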


neon.transforms.cost.Metric Base class for Metrics.
neon.transforms.cost.Misclassification Misclassification error metric.
neon.transforms.cost.TopKMisclassification Multiple misclassification metrics.
neon.transforms.cost.Accuracy Accuracy metric (correct rate).
neon.transforms.cost.PrecisionRecall Precision and Recall metrics.
neon.transforms.cost.ObjectDetection The object detection metric includes object label accuracy, and bounding box regression.


neon implements the following learning algorithms for updating the weights.

neon.optimizers.optimizer.Optimizer The optimizer class handles the gradient update stage of training a neural network.
neon.optimizers.optimizer.GradientDescentMomentum Stochastic gradient descent with momentum.
neon.optimizers.optimizer.RMSProp Root Mean Square propagation.
neon.optimizers.optimizer.Adadelta Adadelta optimization algorithm.
neon.optimizers.optimizer.Adagrad Adagrad optimization algorithm.
neon.optimizers.optimizer.Adam Adam optimizer.
neon.optimizers.optimizer.MultiOptimizer A wrapper class for using multiple Optimizers within the same model.
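Stochastic gradient descent with momentum, the first algorithm listed, maintains a velocity per parameter: \(v \leftarrow \mu v - \eta \nabla w\), then \(w \leftarrow w + v\). A scalar sketch of that update rule under those standard formulas (neon applies it per parameter tensor; names here are illustrative):

```python
# Sketch of the gradient-descent-with-momentum update rule:
#   v <- momentum * v - lr * grad;  w <- w + v
# Plain Python on a scalar; illustrative, not neon's Optimizer API.

def momentum_step(w, v, grad, lr=0.1, momentum=0.9):
    v = momentum * v - lr * grad
    return w + v, v

w, v = 1.0, 0.0
for _ in range(3):
    grad = 2.0 * w          # gradient of f(w) = w**2
    w, v = momentum_step(w, v, grad)
# w moves toward the minimum of f at w = 0
```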

For some optimizers, users can adjust the learning rate over the course of training by providing a schedule.

neon.optimizers.optimizer.Schedule Learning rate schedule.
neon.optimizers.optimizer.StepSchedule Steps the learning rate over training time.
neon.optimizers.optimizer.PowerSchedule Multiplies the learning rate by a factor at regular epoch intervals.
neon.optimizers.optimizer.ExpSchedule Exponential learning rate schedule.
neon.optimizers.optimizer.PolySchedule Polynomial learning rate schedule.
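To make the schedule behavior concrete, here is a sketch of two common forms: a step schedule that multiplies the rate by a factor at each listed epoch, and a simple exponential decay. Function names, signatures, and the exact decay formulas are illustrative assumptions, not neon's Schedule API:

```python
# Sketch of two learning rate schedules. Illustrative formulas only;
# neon's Schedule classes have their own parameters and signatures.

def step_schedule(base_lr, epoch, steps, change=0.1):
    """Multiply base_lr by `change` once for each listed step epoch
    that has already passed."""
    return base_lr * change ** sum(1 for s in steps if epoch >= s)

def exp_schedule(base_lr, epoch, decay=0.96):
    """Simple exponential decay of the learning rate per epoch."""
    return base_lr * decay ** epoch

lr_start = step_schedule(0.1, epoch=0, steps=[10, 20])   # before any step
lr_mid   = step_schedule(0.1, epoch=15, steps=[10, 20])  # after first step
```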


Callbacks are methods that are called at user-defined times during training. They can be scheduled to occur at the beginning/end of training/minibatch/epoch. Callbacks can be used to, for example, periodically report training loss or save weight histograms.

neon.callbacks.callbacks.Callbacks Container class for storing and iterating over callbacks.
neon.callbacks.callbacks.Callback Interface defining common callback functions.
neon.callbacks.callbacks.RunTimerCallback Callback which tracks the total training time.
neon.callbacks.callbacks.TrainCostCallback Callback for computing average training cost periodically during training.
neon.callbacks.callbacks.ProgressBarCallback Callback providing a live updating console based progress bar.
neon.callbacks.callbacks.TrainLoggerCallback Callback for logging training progress.
neon.callbacks.callbacks.SerializeModelCallback Callback for serializing the state of the model.
neon.callbacks.callbacks.LossCallback Callback for calculating the loss on a given dataset periodically during training.
neon.callbacks.callbacks.MetricCallback Callback for calculating a metric on a given dataset periodically during training.
neon.callbacks.callbacks.MultiLabelStatsCallback Callback for calculating statistics on multi-label classification tasks.
neon.callbacks.callbacks.HistCallback Collect histograms of weights of all layers.
neon.callbacks.callbacks.SaveBestStateCallback Callback for saving the best model state so far.
neon.callbacks.callbacks.EarlyStopCallback Callback for stopping training when a threshold has been triggered.
neon.callbacks.callbacks.DeconvCallback Callback to store data after projecting activations back to pixel space using guided backpropagation.
neon.callbacks.callbacks.BatchNormTuneCallback Callback for tuning batch norm parameters with unbiased estimators for global mean and var.
neon.callbacks.callbacks.WatchTickerCallback Callback that examines a single input, output pair using a validation set.


This module generates visualizations using the nvis command-line tool.



Useful utility functions, including parsing the command line and saving/loading of objects.

neon.util.argparser.NeonArgparser Sets up the command line arg parser and parses the arguments in sys.argv (or from a configuration file).
neon.util.argparser.extract_valid_args Given a namespace of argparser args, extract those applicable to func.
neon.util.compat Contains various functions and wrappers to make code Python 2 and Python 3 compatible.
neon.util.persist.load_class Helper function to take a string with the neon module and class name and return the corresponding class.
neon.util.persist.load_obj Loads a saved on-disk representation to a python data structure.
neon.util.persist.save_obj Dumps a python data structure to a saved on-disk representation.
neon.util.modeldesc.ModelDescription Container class for the model serialization dictionary.
neon.util.yaml_parse Tools for parsing neon model definition files (YAML formatted) and generating neon model objects from the definition.