Models

The Model class represents the network architecture, either provided as a layer container, a list of layers, or loaded from a previously saved model file.

Lifecycle of a model

Instantiation

We create a model by passing either a list of layers or a container. If a list of layers is provided, the model will wrap the layers in a Sequential container. Combinations of lists and containers is not supported. When a model is instantiated, the layer shapes are not determined until a training set is provided.

Training

To train, call the model.fit() function and provide

Argument Description
dataset An iterable of minibatches of the dataset (e.g. ArrayIterator).
cost Cost function to apply to the output of the last layer (neon.transforms.Costs)
optimizer The learning rule for updating the model parameters (neon.optimizers)
num_epochs Number of iterations over the dataset
callbacks Functions to run at the start/end of each epoch/minibatch (neon.callbacks)

When model.fit() is called and training data provided, the model is first initialized with model.initialize(). During initialization, the dataset is propagated through the layers to call each layer’s configure() method to set the input and output shapes. Then, the appropriate buffers are allocated with each layer’s allocate() method. Note that a model object can only be initialized once.

During training, the model iterates through mini-batches of the dataset, calling the forward and backward propagation functions to compute the gradients according to the provided cost and update the weights based on the optimizer. The length of training is controlled by the num_epochs argument. Callbacks can also be configured to end training when certain exit conditions are met.

# Pseudo-code of training procedure in neon.models.model
for (x_train, y_train) in dataset:

    # fprop through the layers
    x_train = self.fprop(x_train)

    # get deltas in the cost
    delta = self.cost.get_errors(x_train, y_train)

    # backprop the deltas through the layers
    self.bprop(delta)

    # update the weights
    self.optimizer.optimize(self.layer, epoch=epoch)

Evaluation

When training is completed, the model can be evaluated against a provided Metric and dataset with the model.eval(dataset, metric) method. This method iterates over the provided dataset, and calls fprop to obtain the model output. For efficient inference, the model calls fprop with inference=True argument to avoid unneeded memory and computation.

To directly obtain the model outputs for a specific dataset, the model.get_outputs(dataset) method can also be called, which returns a numpy array with the final layer output for each example in the dataset.

Inspecting the model

The easiest way of inspecting a model’s weights is by accessing the layer parameters directly. For example, to get the Tensor for the first layer in a model, call:

mlp.layers.layers[0].W

To get the entire model configuration and weights, call

pdict = model.get_description(get_weights=True)

We can now inspect each layer by obtaining a list of dicts, one for each layer:

ldict = pdict['model']['config']['layers']

Each layer dict has three keys:

  • 'config': arguments passed to the constructor (e.g., name, weight initializer)
  • 'type': layer class (e.g. neon.layers.layer.Linear)
  • 'params': dict of layer parameters (e.g. 'W' for the weight matrix)

For example, we can obtain a numpy array with the weight matrix of the first layer by calling

W = ldict[0]['params']['W']

# or more directly,
W = pdict['model']['config']['layers'][0]['params']['W']

Note that this copies all the data from the GPU device to host to produce the numpy array values.

Loading and saving models

The entire model (layers, per layer weights, epochs run, optimizer states, etc.) can be saved and loaded from disk with neon’s serialization feature.

There are two ways to save a model. One can call, after fitting is complete:

model.save_params("mnist_model.prm")

This will save the model objects into “save_path.prm”. Alternatively, the command line argument --serialize n will save the model every n epochs:

python mnist_mlp.py --save_path mnist_model.prm --serialize 1 -e 3 \

Then, the model will be saved every epoch of training.

To load the model, pass the file to the model constructor:

new_model = Model("mnist_model.prm")