neon.models.model.Model

class neon.models.model.Model(layers, dataset=None, weights_only=False, name='model', optimizer=None)[source]

Bases: neon.NervanaObject

Class that stores a list of layers describing the model. It can train the layer weights on a dataset, evaluate on a test set, and serialize the model. Additional functionality can be added to the fit method through callback functions.

Parameters:
  • layers – layer container, a list of layers (that will be containerized), or a serialized model description
  • dataset (NervanaDataIterator) – Data set (ignored, will be removed)
  • weights_only (bool) – set to True if you do not want to recreate layers and states during deserialization from a serialized model description. Defaults to False.
  • name (str) – Model name. Defaults to “model”
  • optimizer (Optimizer) – Optimizer object which defines the learning rule for updating model parameters (e.g., GradientDescentMomentum, Adadelta)
__init__(layers, dataset=None, weights_only=False, name='model', optimizer=None)[source]
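
A minimal construction sketch (the backend, initializer, layer, and activation classes used here are standard neon modules assumed from outside this class reference):

    from neon.backends import gen_backend
    from neon.initializers import Gaussian
    from neon.layers import Affine
    from neon.models import Model
    from neon.transforms import Rectlin, Softmax

    be = gen_backend(backend='cpu', batch_size=128)  # a backend must exist first
    init = Gaussian(scale=0.01)
    layers = [Affine(nout=100, init=init, activation=Rectlin()),
              Affine(nout=10, init=init, activation=Softmax())]
    mlp = Model(layers=layers)  # the plain list is containerized internally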

Methods

__init__(layers[, dataset, weights_only, …])
allocate_deltas()
benchmark(dataset[, inference, cost, …]) Measure runtime for computing fprop and bprop separately, as well as full minibatch run times.
bprop(delta) Back propagates the error of a minibatch through the model.
deserialize(model_dict[, data, load_states]) Loads per layer (weights, states) and other model parameters from the dictionary passed.
eval(dataset, metric) Evaluates a model on a dataset according to an input metric.
fit(dataset, cost, optimizer, num_epochs, …) Trains the model parameters on a dataset by minimizing the cost function through gradient descent and updates the layer weights according to a learning rule defined in optimizer.
fprop(x[, inference]) Forward propagates a minibatch x through the model.
gen_class(pdict)
get_description([get_weights, keep_states]) Gets a description of the model required to reconstruct the model with no weights like from a yaml file.
get_outputs(dataset) Get the activation outputs of the final model layer for the dataset
get_outputs_beam(dataset[, num_beams, steps]) Get the activation outputs of the final model layer for the dataset using beam search
initialize(dataset[, cost]) Propagate shapes through the layers to configure, then allocate space.
load_params(param_path[, load_states]) Loads the model parameters (per layer weights, epochs run, optimizer states) saved in param_path from serialize().
load_weights(weight_path)

Deprecated since version 1.1.4.

recursive_gen(pdict, key) Helper method to check whether the definition dictionary defines a NervanaObject child
save_params(param_path[, keep_states]) Serializes and saves model parameters to the path specified.
serialize([fn, keep_states]) Creates a dictionary storing the layer parameters and epochs complete.
set_batch_size(N) Set the actual minibatch size; even though the buffers are allocated with excessive padding, the processing for some layers may be shortened.
set_seq_len(S) Set the actual minibatch sequence length; even though the buffers are allocated with excessive padding, the processing for some layers may be shortened.
set_shortcut()
allocate_deltas()[source]
be = None
benchmark(dataset, inference=False, cost=None, optimizer=None, niterations=20, nskip=2)[source]

Measure runtime for computing fprop and bprop separately, as well as full minibatch run times. In the inference case, only fprop is measured.

Parameters:
  • dataset (NervanaDataIterator) – Dataset iterator to benchmark on
  • cost (Cost) – Defines the function which the model is minimizing based on the output of the last layer and the input labels
  • niterations (int, optional) – Number of minibatches to average over
  • nskip (int, optional) – Number of iterations at the beginning to skip when calculating the runtime statistics
  • inference (bool, optional) – whether to benchmark the inference use case (fprop only)
  • optimizer (Optimizer) – Defines the learning rule for updating the model parameters.
Returns:

dictionary with fprop and bprop run times
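
A usage sketch (train_set is an assumed NervanaDataIterator; cost and opt are constructed as in the fit() sketch below):

    # average over 20 minibatches, skipping the first 2 as warm-up
    timings = mlp.benchmark(train_set, cost=cost, optimizer=opt,
                            niterations=20, nskip=2)
    # inference-only timing measures fprop alone
    fprop_times = mlp.benchmark(train_set, inference=True, niterations=20, nskip=2)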

bprop(delta)[source]

Back propagates the error of a minibatch through the model.

Parameters:delta (Tensor) – Derivative of cost with respect to the last layer’s output
Returns:Deltas to propagate to the next layer
Return type:Tensor
classnm

Returns the class name.

deserialize(model_dict, data=None, load_states=True)[source]

Loads per layer (weights, states) and other model parameters from the dictionary passed.

Parameters:
  • model_dict (dict) – dictionary describing the model including layers, cost, optimizers, backend settings, etc. generated by the serialize function
  • data (NervanaDataIterator) – Data set (ignored, will be removed)
  • load_states (bool) – if False, only the weights are loaded into a model in which the layers have already been created; otherwise the layers are (re)created from the serialized parameters and the learning states are set as well
eval(dataset, metric)[source]

Evaluates a model on a dataset according to an input metric.

Parameters:
  • dataset (NervanaDataIterator) – dataset to evaluate on.
  • metric (Cost) – metric function used to evaluate the dataset.
Returns:

the error of the final layer for the evaluation dataset

Return type:

Host numpy array
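
A sketch, assuming an already-built valid_set iterator and neon's Misclassification metric:

    from neon.transforms import Misclassification

    error = mlp.eval(valid_set, metric=Misclassification())
    print('misclassification error = %.1f%%' % (error * 100))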

fit(dataset, cost, optimizer, num_epochs, callbacks)[source]

Trains the model parameters on a dataset by minimizing the cost function through gradient descent and updates the layer weights according to a learning rule defined in optimizer.

Parameters:
  • dataset (NervanaDataIterator) – An iterable of minibatches where each element is an (x, y) tuple, with x the input data and y the labels. x has dimensions (feature_size, batch_size) and y has dimensions (label_size, batch_size). The length of the iterator is num_batches, which is num_data / batch_size.
  • cost (Cost) – Defines the function which the model is minimizing based on the output of the last layer and the input labels.
  • optimizer (Optimizer) – Defines the learning rule for updating the model parameters.
  • num_epochs – Number of times to iterate over the dataset.
  • callbacks (Callbacks) – Defines callbacks to run at the end of each mini-batch / epoch.
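
A typical training sketch (the cost, optimizer, and callback classes are standard neon modules; train_set and valid_set are assumed NervanaDataIterator objects built elsewhere, and the Callbacks arguments vary slightly across neon versions):

    from neon.layers import GeneralizedCost
    from neon.transforms import CrossEntropyMulti
    from neon.optimizers import GradientDescentMomentum
    from neon.callbacks.callbacks import Callbacks

    cost = GeneralizedCost(costfunc=CrossEntropyMulti())
    opt = GradientDescentMomentum(0.1, momentum_coef=0.9)
    callbacks = Callbacks(mlp, eval_set=valid_set)  # per-minibatch/epoch hooks
    mlp.fit(train_set, optimizer=opt, num_epochs=10, cost=cost, callbacks=callbacks)
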
fprop(x, inference=False)[source]

Forward propagates a minibatch x through the model.

Parameters:
  • x (Tensor) – Input minibatch data.
  • inference (bool) – Flag for performing training or inference; only affects batch norm and dropout layers.
Returns:

the output of the final layer in the model

Return type:

Tensor
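
A manual single-minibatch sketch of the fprop/bprop pair (mlp, train_set, and cost are assumed from the earlier sketches; initialize() is described below):

    mlp.initialize(train_set, cost=cost)   # configure shapes, allocate buffers
    for x, t in train_set:
        y = mlp.fprop(x)                   # forward pass in training mode
        delta = cost.get_errors(y, t)      # derivative of the cost w.r.t. y
        mlp.bprop(delta)                   # backpropagate through all layers
        break                              # one minibatch is enough to illustrate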

gen_class(pdict)
get_description(get_weights=False, keep_states=False)[source]

Gets a description of the model sufficient to reconstruct the model without weights, e.g., from a YAML file.

Parameters:
  • get_weights (bool) – whether to include the layer weights in the description. Defaults to False.
  • keep_states (bool) – whether to also include optimizer states. Defaults to False.
Returns:

Description of each component of the model.

Return type:

dict
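
A short sketch:

    desc = mlp.get_description()                   # architecture only, no weights
    full = mlp.get_description(get_weights=True,
                               keep_states=True)   # also weights and states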

get_outputs(dataset)[source]

Get the activation outputs of the final model layer for the dataset

Parameters:dataset (NervanaDataIterator) – Dataset iterator to compute outputs on
Returns:the output of the final layer for the entire Dataset
Return type:Host numpy array
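
A sketch (valid_set assumed as before; for a classifier, each row holds the output-layer activations for one example):

    probs = mlp.get_outputs(valid_set)  # host numpy array, one row per example
    preds = probs.argmax(axis=1)        # hard class predictions
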
get_outputs_beam(dataset, num_beams=0, steps=None)[source]

Get the activation outputs of the final model layer for the dataset using beam search.

Parameters:
  • dataset (NervanaDataIterator) – Dataset iterator to compute outputs on
  • num_beams (int, optional) – Number of beams to use in the beam search. Defaults to 0.
  • steps (int) – Length of desired output in number of time steps
Returns:

the output of the final layer for the entire Dataset

Return type:

Host numpy array

initialize(dataset, cost=None)[source]

Propagate shapes through the layers to configure, then allocate space.

Parameters:
  • dataset (NervanaDataIterator) – Dataset iterator to perform initialization on
  • cost (Cost) – Defines the function which the model is minimizing based on the output of the last layer and the input labels.
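
Usually fit() calls this for you; a direct call looks like (train_set and cost assumed as in the fit() sketch above):

    mlp.initialize(train_set, cost=cost)  # propagate shapes, then allocate buffers
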
layers_to_optimize

Helper function to return the layers which will be optimized.

load_params(param_path, load_states=True)[source]

Loads the model parameters (per layer weights, epochs run, optimizer states) saved in param_path from serialize().

Parameters:
  • param_path (str) – File containing serialized python dict with layer weights and states.
  • load_states (bool) – if False, only the weights are loaded into a model in which the layers have already been created; otherwise the layers are (re)created from the serialized parameters and the learning states are set as well
load_weights(weight_path)[source]

Deprecated since version 1.1.4: Use load_params() instead

Parameters:weight_path (str) – File containing the serialized weights
modulenm

Returns the full module path.

recursive_gen(pdict, key)

Helper method to check whether the definition dictionary defines a NervanaObject child; if so, it instantiates that object and replaces the dictionary element with an instance of that object

save_params(param_path, keep_states=True)[source]

Serializes and saves model parameters to the path specified.

Parameters:
  • param_path (str) – File to write serialized parameter dict to.
  • keep_states (bool) – Whether to save optimizer states too. Defaults to True.
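
A save/restore sketch (the .p filename is arbitrary; layers as in the construction sketch above):

    mlp.save_params('mlp_model.p', keep_states=True)  # weights plus optimizer states
    # restore into a freshly built model with the same architecture
    mlp_restored = Model(layers=layers)
    mlp_restored.load_params('mlp_model.p', load_states=True)
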
serialize(fn=None, keep_states=True)[source]

Creates a dictionary storing the layer parameters and epochs complete.

Parameters:
  • fn (str) – file in which to save the pickle-formatted model dictionary
  • keep_states (bool) – Whether to save optimizer states.
Returns:

Model data including layer parameters and epochs complete.

Return type:

dict
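
A round-trip sketch using the returned dictionary (passing fn also writes it to disk as a pickle):

    mdict = mlp.serialize(fn='mlp_model.pkl', keep_states=True)
    # the constructor accepts a serialized model description (see above)
    mlp_new = Model(layers=mdict)
    # equivalently, restore into an existing model:
    # mlp_new.deserialize(mdict, load_states=True)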

set_batch_size(N)[source]

Set the actual minibatch size. Even though the buffers are allocated with excessive padding, the processing for some layers may be shortened accordingly. Currently most neon layers do not use this to control processing; the interface is provided for anyone who wants to set this information and experiment.

Parameters:N (int) – the actual minibatch size

set_seq_len(S)[source]

Set the actual minibatch sequence length. Even though the buffers are allocated with excessive padding, the processing for some layers may be shortened accordingly. Currently most neon layers do not use this to control processing; the interface is provided for anyone who wants to set this information and experiment.

Parameters:S (int) – the actual minibatch sequence length

set_shortcut()[source]