The Model class represents the network architecture,
which is provided as a layer container, as a list of layers, or loaded from a
previously saved model file.
Lifecycle of a model
We create a model by passing either a list of layers or a container. If
a list of layers is provided, the model will wrap the layers in a
Sequential container. Combinations of lists and
containers are not supported. When a model is instantiated, the layer
shapes are not determined until a training set is provided.
To train, call the
model.fit() function and provide the following arguments:
|dataset||An iterable of minibatches of the dataset (e.g.
|cost||Cost function to apply to the output of the last layer (
|optimizer||The learning rule for updating the model parameters (
|num_epochs||Number of iterations over the dataset|
|callbacks||Functions to run at the start/end of each epoch/minibatch (
When model.fit() is called and training data is provided, the model is
first initialized with
model.initialize(). During initialization,
the dataset is propagated through the layers to call each layer’s
configure() method to set the input and output shapes. Then, the
appropriate buffers are allocated with each layer’s
method. Note that a model object can only be initialized once.
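The two-pass initialization described above can be sketched with toy classes. This is only an illustration of the idea, not neon's implementation: `ToyLinear`, `ToyModel`, and `allocate_buffers` are hypothetical names, and shapes are reduced to plain integers.

```python
# Hypothetical stand-ins for layers and a model; not neon classes.
class ToyLinear:
    def __init__(self, nout):
        self.nout = nout
        self.in_shape = None
        self.out_shape = None
        self.weights = None

    def configure(self, in_shape):
        # Record the input shape and derive the output shape from it.
        self.in_shape = in_shape
        self.out_shape = self.nout
        return self.out_shape

    def allocate_buffers(self):
        # Buffers can only be sized once both shapes are known.
        self.weights = [[0.0] * self.in_shape for _ in range(self.nout)]


class ToyModel:
    def __init__(self, layers):
        self.layers = layers
        self.initialized = False

    def initialize(self, input_shape):
        if self.initialized:
            return  # a model is only initialized once
        shape = input_shape
        # First pass: propagate shapes from the data through the layers.
        for layer in self.layers:
            shape = layer.configure(shape)
        # Second pass: allocate buffers now that every shape is set.
        for layer in self.layers:
            layer.allocate_buffers()
        self.initialized = True


model = ToyModel([ToyLinear(4), ToyLinear(2)])
model.initialize(input_shape=3)
shapes = [(layer.in_shape, layer.out_shape) for layer in model.layers]
```

Calling `initialize` a second time with a different input shape leaves the toy model unchanged, mirroring the note that a model object is initialized only once.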
During training, the model iterates through mini-batches of the dataset,
calling the forward and backward propagation functions to compute the
gradients according to the provided
cost and update the weights
based on the
optimizer. The length of training is controlled by the
num_epochs argument. Callbacks can also be configured to end training
when certain exit conditions are met.
# Pseudo-code of training procedure in neon.models.model
for (x_train, y_train) in dataset:
    # fprop through the layers
    x_train = self.fprop(x_train)
    # get deltas in the cost
    delta = self.cost.get_errors(x_train, y_train)
    # backprop the deltas through the layers
    self.bprop(delta)
    # update the weights
    self.optimizer.optimize(self.layer, epoch=epoch)
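To make the loop concrete, here is a runnable toy version in plain Python. It is a sketch under simplifying assumptions: the "model" is a single weight fitting y = w * x, the cost is squared error, and the optimizer is plain gradient descent. The class `ToyModel` and all of its attributes are hypothetical and only mimic the fprop, get_errors, bprop, optimize sequence.

```python
class ToyModel:
    """One-weight linear model y = w * x, trained with plain SGD."""

    def __init__(self, w=0.0, learning_rate=0.1):
        self.w = w
        self.learning_rate = learning_rate
        self._x = None
        self._grad = 0.0

    def fprop(self, x):
        self._x = x  # keep the input for the backward pass
        return self.w * x

    def get_errors(self, output, target):
        # Derivative of 0.5 * (output - target)**2 with respect to output.
        return output - target

    def bprop(self, delta):
        # Chain rule: d(cost)/dw = delta * x.
        self._grad = delta * self._x

    def optimize(self):
        self.w -= self.learning_rate * self._grad

    def fit(self, dataset, num_epochs):
        for epoch in range(num_epochs):
            for x, y in dataset:
                out = self.fprop(x)
                delta = self.get_errors(out, y)
                self.bprop(delta)
                self.optimize()


# Each "minibatch" here is a single (input, target) pair with target = 2 * x.
dataset = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
model = ToyModel()
model.fit(dataset, num_epochs=50)
```

After training, `model.w` should be very close to 2.0, the slope that generated the targets.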
When training is completed, the model can be evaluated against a
provided Metric and dataset with the
method. This method iterates over the provided dataset, and calls
fprop to obtain the model output. For efficient inference, the model
runs fprop with the inference=True argument to avoid unneeded
memory use and computation.
To directly obtain the model outputs for a specific dataset, the
model.get_outputs(dataset) method can also be called, which returns
a numpy array with the final layer output for each example in the dataset.
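The evaluation flow (iterate over minibatches, run the forward pass in inference mode, then score the outputs with a metric) can be sketched as follows. Everything here is a hypothetical stand-in: `toy_fprop`, `get_outputs`, `evaluate`, and `misclassification` are illustrative functions written for this example, and plain lists take the place of numpy arrays.

```python
def misclassification(outputs, targets):
    """Fraction of examples whose predicted label differs from the target."""
    wrong = sum(1 for o, t in zip(outputs, targets) if o != t)
    return wrong / len(targets)


def toy_fprop(batch, inference=True):
    # Hypothetical model: predicts label 1 when the input is positive.
    # A real forward pass could skip caching intermediate values for
    # backprop when inference is True; this toy has nothing to cache.
    return [1 if x > 0 else 0 for x in batch]


def get_outputs(dataset):
    # Collect the final-layer output for every example, batch by batch.
    outputs = []
    for inputs, _ in dataset:
        outputs.extend(toy_fprop(inputs, inference=True))
    return outputs


def evaluate(dataset, metric):
    targets = [t for _, batch_targets in dataset for t in batch_targets]
    return metric(get_outputs(dataset), targets)


# Two minibatches of (inputs, targets).
dataset = [([0.5, -1.0], [1, 0]), ([2.0, -3.0], [1, 1])]
error_rate = evaluate(dataset, misclassification)
```

Keeping the metric as a separate callable means the same dataset pass can be scored in different ways without touching the model code.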
Inspecting the model
The easiest way of inspecting a model’s weights is by accessing the layer parameters directly. For example, to get the Tensor for the first layer in a model, call:
To get the entire model configuration and weights, call
pdict = model.get_description(get_weights=True)
We can now inspect each layer by obtaining a list of dicts, one for each layer:
ldict = pdict['model']['config']['layers']
Each layer dict has three keys:
'config': arguments passed to the constructor (e.g., name, weight initializer)
'type': layer class (e.g.
'params': dict of layer parameters (e.g. 'W' for the weight matrix)
For example, we can obtain a numpy array with the weight matrix of the first layer by calling
W = ldict[0]['params']['W']
# or more directly,
W = pdict['model']['config']['layers'][0]['params']['W']
Note that this copies all the data from the GPU device to host to produce the numpy array values.
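As a rough illustration of walking such a description, the sketch below builds a small dict by hand that mirrors the nesting described above ('model', then 'config', then 'layers', each layer with 'config', 'type', and 'params'). The layer type names, layer names, and weight values are made up, and nested lists stand in for numpy arrays.

```python
# Hand-built stand-in for the description dict; all values are invented.
pdict = {
    "model": {
        "config": {
            "layers": [
                {
                    "type": "ToyLayerA",
                    "config": {"name": "layer0"},
                    "params": {"W": [[0.1, 0.2], [0.3, 0.4]]},
                },
                {
                    "type": "ToyLayerB",
                    "config": {"name": "layer1"},
                    "params": {"W": [[0.5], [0.6]]},
                },
            ]
        }
    }
}


def summarize_layers(description):
    """Return (name, type, weight shape) for each layer in the description."""
    summary = []
    for layer in description["model"]["config"]["layers"]:
        W = layer.get("params", {}).get("W")
        shape = (len(W), len(W[0])) if W else None
        summary.append((layer["config"].get("name"), layer["type"], shape))
    return summary


summary = summarize_layers(pdict)
# The layers entry is a list, so the first layer is selected by index.
first_W = pdict["model"]["config"]["layers"][0]["params"]["W"]
```

Using `.get` for 'params' and 'W' lets the helper tolerate layers that carry no weights, which is an assumption about how such layers might appear rather than something the text above states.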
Loading and saving models
The entire model (layers, per-layer weights, epochs run, optimizer states, etc.) can be saved to and loaded from disk with neon’s serialization feature.
There are two ways to save a model. One can call, after fitting is complete:
This will save the model objects into “save_path.prm”. Alternatively,
the command line argument
--serialize n will save the model every n epochs. For example:
python mnist_mlp.py --save_path mnist_model.prm --serialize 1 -e 3 \
With these arguments, the model is saved after every epoch of training.
To load the model, pass the file to the Model constructor:
new_model = Model("mnist_model.prm")