neon.layers.layer.BatchNorm

class neon.layers.layer.BatchNorm(rho=0.9, eps=0.001, name=None, binary=False)[source]

Bases: neon.layers.layer.Layer

A batch normalization layer as described in [Ioffe2015].

Normalizes a batch's worth of inputs by subtracting the batch mean and dividing by the batch standard deviation, then scales by the learned factor gamma and shifts by the learned bias beta.

Uses the inputs to fprop to infer whether a precomputed batch sum is supplied by the previous layer (the input is a tuple) or the sum still needs to be computed.

Notes:

[Ioffe2015] http://arxiv.org/abs/1502.03167
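
A minimal usage sketch (a hypothetical network fragment; it assumes a generated neon backend and the Model, Affine, and Activation APIs, which are not part of this class's reference):

    from neon.backends import gen_backend
    from neon.initializers import Gaussian
    from neon.layers import Activation, Affine, BatchNorm
    from neon.models import Model
    from neon.transforms import Rectlin

    be = gen_backend(backend='cpu', batch_size=128)

    # insert BatchNorm between the linear transform and its activation
    layers = [
        Affine(nout=100, init=Gaussian(scale=0.01)),
        BatchNorm(rho=0.9, eps=0.001),
        Activation(Rectlin()),
    ]
    model = Model(layers=layers)
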
__init__(rho=0.9, eps=0.001, name=None, binary=False)[source]

Methods

__init__([rho, eps, name, binary])
accumulates(f) Higher order decorator function that enables accumulation functionality for that function.
allocate([shared_outputs, accumulate_updates]) Allocate output buffer to store activations from fprop.
allocate_deltas(global_deltas)
bprop(error[, alpha, beta]) Compute gradients for learning gamma and beta as well as layer weights.
configure(in_obj) Sets shape-based parameters of this layer given an input tuple, int, or input layer.
fprop(inputs[, inference, beta]) Normalize inputs (x) over batch mean and variance.
gen_class(pdict)
get_description([get_weights, keep_states]) Get layer parameters.
get_is_mklop() is_mklop true means this op is on mkl backend
get_param_attrs()
get_params()
get_params_serialize([keep_states])
get_terminal() Used for recursively getting final nodes from layer containers.
init_params(dim0)
load_weights(pdict[, load_states]) Load weights.
nested_str([level]) Utility function for displaying layer info with a given indentation level.
recursive_gen(pdict, key) helper method to check whether the definition
serialize() Get state parameters for this layer.
set_acc_on(acc_on) Set the acc_on flag according to bool argument.
set_batch_size(N) Set minibatch size.
set_deltas(delta_buffers) Use pre-allocated (by layer containers) list of buffers for backpropagated error.
set_is_mklop()
set_next(layer) Set next_layer to provided layer.
set_not_mklop()
set_params(pdict)
set_seq_len(S) Set sequence length.
set_states(pdict)
accumulates(f)

Higher order decorator function that enables accumulation functionality for the wrapped function. Objects that use this decorator are required to have an acc_param attribute. This attribute tuple declares the names of the existing temp parameter and real parameter buffers. The temp parameter buffer copies the value of the parameter buffer before f is called; after f is called, the temp and normal buffers are summed. This decorator can be used to wrap any function that may want to accumulate parameters instead of overwriting them.
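
An illustrative sketch of the copy-then-sum pattern described above (not neon's implementation; the acc_params pairing of temp and real buffers is a hypothetical stand-in for the acc_param attribute):

    import functools

    def accumulates(f):
        @functools.wraps(f)
        def wrapper(self, *args, **kwargs):
            if not getattr(self, 'acc_on', False):
                return f(self, *args, **kwargs)
            # copy each real parameter buffer into its temp buffer before f runs
            for tmp, real in self.acc_params:  # hypothetical (temp, real) pairs
                tmp[:] = real
            result = f(self, *args, **kwargs)
            # sum temp and real buffers, so f accumulates instead of overwriting
            for tmp, real in self.acc_params:
                real[:] += tmp
            return result
        return wrapper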

allocate(shared_outputs=None, accumulate_updates=False)[source]

Allocate output buffer to store activations from fprop.

Parameters:
  • shared_outputs (Tensor, optional) – pre-allocated tensor for activations to be computed into
  • accumulate_updates (bool, optional) – allocate additional scratch accumulation buffers.
allocate_deltas(global_deltas)
be = None
bprop(error, alpha=1.0, beta=0.0)[source]

Compute gradients for learning gamma and beta as well as layer weights.

Parameters:
  • error (Tensor) – deltas back propagated from the adjacent higher layer
  • alpha (float, optional) – scale to apply to input for activation gradient bprop. Defaults to 1.0
  • beta (float, optional) – scale to apply to output activation gradient bprop. Defaults to 0.0

Returns:deltas to propagate to the adjacent lower layer
Return type:Tensor
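
For reference, a numpy sketch of the gamma and beta gradients computed here, following [Ioffe2015] (the (features, batch) layout is an assumption matching neon's convention):

    import numpy as np

    def batchnorm_param_grads(error, xhat):
        # error: deltas from the adjacent higher layer, shape (features, batch)
        # xhat:  normalized inputs saved during fprop, same shape
        grad_gamma = (error * xhat).sum(axis=1, keepdims=True)
        grad_beta = error.sum(axis=1, keepdims=True)
        return grad_gamma, grad_beta
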
classnm

Returns the class name.

configure(in_obj)[source]

Sets shape-based parameters of this layer given an input tuple, int, or input layer.

Parameters:in_obj (int, tuple, Layer or Tensor) – object that provides shape information for layer
Returns:shape of output data
Return type:(tuple)
fprop(inputs, inference=False, beta=0.0)[source]

Normalize inputs (x) over the batch mean and variance: xhat = (x - xmean) / sqrt(xvar + eps)

Scale and shift normalized inputs (xhat) by learned parameters gamma and beta. y = xhat * gamma + beta

Accumulate partial results to global mean and variance buffers used for inference.

Parameters:
  • inputs (Tensor or tuple) – input activations; a tuple indicates a precomputed batch sum supplied by the previous layer
  • inference (bool, optional) – use the accumulated global mean and variance instead of batch statistics (Default value = False)
  • beta (float, optional) – scale to apply to the existing contents of the output buffer (Default value = 0.0)
Returns:output data
Return type:Tensor
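
A numpy sketch of the normalization and global-buffer accumulation described above (the rho-weighted moving average is an assumption based on the conventional role of rho; the shift parameter stands in for the learned beta to avoid colliding with fprop's beta argument):

    import numpy as np

    def batchnorm_fprop(x, gamma, shift, gmean, gvar,
                        rho=0.9, eps=1e-3, inference=False):
        # x: input activations, shape (features, batch)
        if inference:
            xhat = (x - gmean) / np.sqrt(gvar + eps)
        else:
            mean = x.mean(axis=1, keepdims=True)
            var = x.var(axis=1, keepdims=True)
            xhat = (x - mean) / np.sqrt(var + eps)
            # fold batch statistics into the global inference buffers
            gmean[:] = rho * gmean + (1 - rho) * mean
            gvar[:] = rho * gvar + (1 - rho) * var
        return gamma * xhat + shift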

gen_class(pdict)
get_description(get_weights=False, keep_states=True)[source]

Get layer parameters.

Parameters:
  • get_weights (bool, optional) – Control whether all parameters are returned or just weights for serialization.
  • keep_states (bool, optional) – Controls whether the states should be returned
get_is_mklop()

A true is_mklop flag means this op runs on the MKL backend and may require conversion when receiving input from a non-MKL op.

get_param_attrs()
get_params()[source]
get_params_serialize(keep_states=True)[source]
get_terminal()

Used for recursively getting final nodes from layer containers.

init_params(dim0)[source]
load_weights(pdict, load_states=True)

Load weights.

Parameters:
  • pdict (dict) – dictionary of saved layer parameters, as produced by get_description
  • load_states (bool, optional) – controls whether saved states are also loaded (Default value = True)
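
A hedged round-trip sketch using the methods documented here (it assumes a generated backend and that get_description(get_weights=True) produces a dictionary load_weights accepts; the feature count 100 is arbitrary):

    # save parameters (and states) from a trained layer ...
    pdict = layer.get_description(get_weights=True, keep_states=True)

    # ... then restore them into a fresh instance
    restored = BatchNorm()
    restored.configure(100)
    restored.allocate()
    restored.load_weights(pdict, load_states=True)
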
modulenm

Returns the full module path.

nested_str(level=0)

Utility function for displaying layer info with a given indentation level.

Parameters:level (int, optional) – indentation level
Returns:layer info at the given indentation level
Return type:str
plist
recursive_gen(pdict, key)

Helper method that checks whether the definition dictionary defines a NervanaObject child; if so, it instantiates that object and replaces the dictionary element with an instance of that object.

serialize()

Get state parameters for this layer.

Returns:whatever data this model wants to receive in order to restore state
Return type:varies
set_acc_on(acc_on)

Set the acc_on flag according to the bool argument. If set to True, the layer will accumulate some (preset) parameters on calls to functions that are decorated with the accumulates decorator. In order to use this feature, accumulate_updates=True must have been passed to the layer's allocate function.

This currently only works for a few hard-coded parameters in select layers.

Parameters:acc_on (bool) – Value to set the acc_on flag to.
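
A hedged sketch of the enable sequence described above (it assumes a generated backend; the input size 100 is arbitrary):

    layer = BatchNorm()
    layer.configure(100)                     # set shapes from the input size
    layer.allocate(accumulate_updates=True)  # create the scratch buffers
    layer.set_acc_on(True)                   # decorated functions now accumulate
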
set_batch_size(N)

Set minibatch size.

Parameters:N (int) – minibatch size
set_deltas(delta_buffers)

Use a pre-allocated (by layer containers) list of buffers for the backpropagated error. Deltas are only set, and space only allocated, for layers that own their own deltas (e.g., bias and activation layers work in-place, so they do not own theirs).

Parameters:delta_buffers (DeltasTree) – list of pre-allocated tensors (provided by layer container)
set_is_mklop()
set_next(layer)

Set next_layer to provided layer.

Parameters:layer (layer) – Next layer
set_not_mklop()
set_params(pdict)[source]
set_seq_len(S)

Set sequence length.

Parameters:S (int) – sequence length
set_states(pdict)[source]