neon.optimizers.optimizer.MultiOptimizer

class neon.optimizers.optimizer.MultiOptimizer(optimizer_mapping, name=None)[source]

Bases: neon.optimizers.optimizer.Optimizer

A wrapper class for using multiple Optimizers within the same model.

To assign different optimizers to different layers we first define the different optimizers:

from neon.optimizers import GradientDescentMomentum, RMSprop

optimizer_A = GradientDescentMomentum(learning_rate=0.01, momentum_coef=0.9)
optimizer_B = GradientDescentMomentum(learning_rate=0.05, momentum_coef=0.9)
optimizer_C = RMSprop(learning_rate=2e-3, decay_rate=0.95)

Then, we instantiate this class and pass a dictionary mapping layers to optimizers. The keys can be 'default', a layer class name (e.g. Bias), or a layer's name attribute. The name attribute takes precedence over the class name, which takes precedence over 'default', giving finer layer-by-layer control.

For example, if we have the following layers,

from neon.initializers import Gaussian
from neon.layers import Affine, Linear
from neon.transforms import Softmax

layers = []
layers.append(Linear(nout=100, init=Gaussian(), name="layer_one"))
layers.append(Linear(nout=50, init=Gaussian(), name="layer_two"))
layers.append(Affine(nout=5, init=Gaussian(), activation=Softmax()))

we can define multiple optimizers with

from neon.optimizers import MultiOptimizer

# dictionary of mappings
mapping = {'default': optimizer_A, # default optimizer
           'Linear': optimizer_B, # all layers from the Linear class
           'layer_two': optimizer_C} # this overrides the previous entry

# use multiple optimizers
opt = MultiOptimizer(mapping)

After definition, we have the following mapping:

Layer           Optimizer
--------------  --------------------
layer_one       optimizer_B
layer_two       optimizer_C
Affine.Linear   optimizer_B
Affine.Bias     optimizer_A
Affine.Softmax  None (no parameters)
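The precedence rules behind this table can be sketched in plain Python. This is a hypothetical illustration of the lookup order, not neon's actual implementation; `resolve_optimizer` and the string stand-ins for the optimizer objects are made up for the example:

```python
def resolve_optimizer(mapping, layer_name, layer_classname):
    """Resolve which optimizer a layer gets (illustrative sketch)."""
    # 1. An exact match on the layer's name attribute takes precedence.
    if layer_name in mapping:
        return mapping[layer_name]
    # 2. Then a match on the layer's class name.
    if layer_classname in mapping:
        return mapping[layer_classname]
    # 3. Otherwise fall back to the 'default' optimizer.
    return mapping['default']

mapping = {'default': 'optimizer_A',
           'Linear': 'optimizer_B',
           'layer_two': 'optimizer_C'}

print(resolve_optimizer(mapping, 'layer_one', 'Linear'))  # optimizer_B
print(resolve_optimizer(mapping, 'layer_two', 'Linear'))  # optimizer_C
print(resolve_optimizer(mapping, 'bias_1', 'Bias'))       # optimizer_A
```

Note that `layer_two` resolves to optimizer_C even though it is a Linear layer, matching the table above.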
__init__(optimizer_mapping, name=None)[source]

Class constructor.

Parameters:
  • optimizer_mapping (dict) – dictionary specifying the mapping of layers to optimizers. Key: 'default', a layer class name, or a layer name attribute (don't name your layers 'default'). Value: the optimizer object to use for those layers.

Methods

__init__(optimizer_mapping[, name])        Class constructor.
clip_gradient_norm(param_list, clip_norm)  Returns a scaling factor to apply to the gradients.
clip_value(v[, abs_bound])                 Element-wise clip a gradient or parameter tensor to between -abs_bound and +abs_bound.
gen_class(pdict)
get_description()
optimize(layer_list, epoch)                Determine which optimizer in the container should go with which layers, then apply their optimize functions to those layers.
recursive_gen(pdict, key)                  Helper method to check whether the definition dictionary is defining a NervanaObject child.
be = None
classnm

Returns the class name.

clip_gradient_norm(param_list, clip_norm)

Returns a scaling factor to apply to the gradients.

The scaling factor is computed such that the root mean squared average of the scaled gradients across all layers will be less than or equal to the provided clip_norm value. This factor is never greater than 1, so it never scales up the gradients.

Parameters:
  • param_list (list) – List of layer parameters
  • clip_norm (float, optional) – Target norm for the gradients. If not provided the returned scale_factor will equal 1.
Returns:

Computed scale factor.

Return type:

scale_factor (float)
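A minimal sketch of this behavior, assuming param_list is a list of per-layer gradient lists and using the overall L2 norm (the exact norm neon computes may differ); this is an illustration, not neon's implementation:

```python
import math

def clip_gradient_norm(param_list, clip_norm=None):
    """Return a gradient scale factor (illustrative sketch, not neon's code)."""
    if clip_norm is None:
        return 1.0  # no target norm given: leave gradients unscaled
    # Overall norm of all gradients concatenated across layers.
    norm = math.sqrt(sum(g * g for grads in param_list for g in grads))
    # At most 1: gradients are only ever scaled down, never up.
    return clip_norm / max(norm, clip_norm)

grads = [[3.0, 4.0]]                              # overall norm = 5.0
print(clip_gradient_norm(grads, clip_norm=1.0))   # 0.2 (scales norm down to 1)
print(clip_gradient_norm(grads, clip_norm=10.0))  # 1.0 (already within bound)
```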

clip_value(v, abs_bound=None)

Element-wise clip a gradient or parameter tensor to between -abs_bound and +abs_bound.

Parameters:
  • v (tensor) – Tensor of gradients or parameters for a single layer
  • abs_bound (float, optional) – Value to element-wise clip gradients or parameters. Defaults to None.
Returns:

Tensor of clipped gradients or parameters.

Return type:

v (tensor)
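The element-wise clamp can be sketched on a plain Python list (neon operates on backend tensors; this stand-in only illustrates the semantics):

```python
def clip_value(v, abs_bound=None):
    """Clamp each element to [-abs_bound, +abs_bound] (illustrative sketch)."""
    if abs_bound is None:
        return v  # no bound given: return the tensor unchanged
    return [max(-abs_bound, min(abs_bound, x)) for x in v]

print(clip_value([-2.0, 0.5, 3.0], abs_bound=1.0))  # [-1.0, 0.5, 1.0]
```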

classmethod gen_class(pdict)[source]
get_description()[source]
modulenm

Returns the full module path.

optimize(layer_list, epoch)[source]

Determine which optimizer in the container should go with which layers, then apply their optimize functions to those layers.

Notes:

map_list is recalculated on each call, in case optimizer_mapping changes during training.
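The dispatch described above can be sketched as grouping layers by their resolved optimizer and then letting each optimizer handle its own group. This is a hypothetical outline (the helper, the (name, classname) tuples, and the string stand-ins for optimizers are invented for the example; neon's real optimize() works on layer objects):

```python
def dispatch_optimizers(mapping, layers, epoch):
    """Group layers by resolved optimizer (sketch; `epoch` is unused here,
    but the real optimize() passes it on to each optimizer)."""
    groups = {}
    for name, classname in layers:
        # name attribute beats class name, which beats 'default'
        opt = mapping.get(name) or mapping.get(classname) or mapping['default']
        groups.setdefault(opt, []).append(name)
    # In neon, each optimizer would now call its own optimize() on its group.
    return groups

layers = [('layer_one', 'Linear'), ('layer_two', 'Linear'), ('bias_1', 'Bias')]
mapping = {'default': 'opt_A', 'Linear': 'opt_B', 'layer_two': 'opt_C'}
print(dispatch_optimizers(mapping, layers, epoch=0))
# {'opt_B': ['layer_one'], 'opt_C': ['layer_two'], 'opt_A': ['bias_1']}
```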

recursive_gen(pdict, key)

Helper method that checks whether the definition dictionary defines a NervanaObject child; if so, it instantiates that object and replaces the dictionary element with an instance of that object.