neon.layers.recurrent.BiRNN

class neon.layers.recurrent.BiRNN(output_size, init, init_inner=None, activation=None, reset_cells=False, split_inputs=False, name=None, parallelism='Data')[source]

Bases: neon.layers.layer.ParameterLayer

Basic bi-directional recurrent layer.

Parameters:
  • output_size (int) – Number of hidden/output units
  • init (Initializer) – Function for initializing the model parameters
  • init_inner (Initializer, optional) – Function for initializing the model’s recurrent parameters. If absent, will default to using same initializer provided to init.
  • activation (Transform) – Activation function for the input modulation
  • reset_cells (bool) – defaults to False, which makes the layer stateful (the hidden state carries over between mini-batches); set to True to make it stateless (the hidden state is reset for each mini-batch).
  • split_inputs (bool) – whether to expect the input to come from the same source or from separate sources; set to True when the two directions receive separate inputs.
  • name (str, optional) – name to refer to this layer as.
W_input

Tensor – weights from inputs to output units (input_size, output_size)

W_recur

Tensor – weights for recurrent connections (output_size, output_size)

b

Tensor – Biases on output units (output_size, 1)
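
For orientation, a minimal construction sketch (not part of the generated reference); it assumes the standard neon initializers and transforms, and that a backend is generated before the layer is actually used:

    from neon.backends import gen_backend
    from neon.initializers import GlorotUniform
    from neon.layers.recurrent import BiRNN
    from neon.transforms import Tanh

    be = gen_backend(backend='cpu', batch_size=32)  # a backend must exist before use

    birnn = BiRNN(output_size=128,        # hidden/output units per direction
                  init=GlorotUniform(),   # initializer for input-to-hidden weights
                  activation=Tanh(),      # activation on the recurrent units
                  reset_cells=True,       # stateless: reset state each mini-batch
                  split_inputs=False)     # both directions share one input source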

__init__(output_size, init, init_inner=None, activation=None, reset_cells=False, split_inputs=False, name=None, parallelism='Data')[source]

Methods

__init__(output_size, init[, init_inner, …])
accumulates(f) Higher order decorator function that enables accumulation functionality for that function.
allocate([shared_outputs]) Allocate output buffer to store activations from fprop.
allocate_deltas(global_deltas)
bprop(error[, alpha, beta]) Backward propagation of errors through bi-directional recurrent layer.
configure(in_obj) Set shape based parameters of this layer given an input tuple, int or input layer.
fprop(inputs[, inference]) Forward propagation of input to bi-directional recurrent layer.
gen_class(pdict)
get_description([get_weights, keep_states]) Get layer parameters.
get_is_mklop() Returns the is_mklop flag; True means this op runs on the MKL backend.
get_param_attrs()
get_params() Get layer parameters, gradients, and states for optimization.
get_params_serialize([keep_states])
get_terminal() Used for recursively getting final nodes from layer containers.
init_buffers(inputs) Initialize buffers for recurrent internal units and outputs.
init_params(shape) Initialize params for this layer, including weights and biases.
load_weights(pdict[, load_states]) Load weights.
nested_str([level]) Utility function for displaying layer info with a given indentation level.
recursive_gen(pdict, key) Helper method to check whether the definition dictionary defines a NervanaObject child.
serialize() Get state parameters for this layer.
set_acc_on(acc_on) Set the acc_on flag according to bool argument.
set_batch_size(N) Set minibatch size.
set_deltas(delta_buffers) Use pre-allocated (by layer containers) list of buffers for backpropagated error.
set_is_mklop()
set_next(layer) Set next_layer to provided layer.
set_not_mklop()
set_params(pdict) Set layer parameters (weights).
set_seq_len(S) Set sequence length.
set_states(pdict)
accumulates(f)

Higher order decorator function that enables accumulation functionality for that function. Objects that use this decorator are required to have an acc_param attribute. This attribute tuple declares the names of the existing temp parameter and real parameter buffers. The temp parameter buffer copies the value of the parameter buffer before f is called, and after f is called the temp and normal buffers are summed. This decorator can be used to wrap any function that wants to accumulate parameters instead of overwriting them.
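
A minimal sketch of the pattern described above, in plain Python; the buffer handling is an assumption for illustration, not neon's actual implementation:

    from functools import wraps

    def accumulates(f):
        @wraps(f)
        def wrapper(self, *args, **kwargs):
            if getattr(self, 'acc_on', False):
                tmp_name, param_name = self.acc_param   # names of the two buffers
                tmp = getattr(self, tmp_name)
                param = getattr(self, param_name)
                tmp[:] = param                          # snapshot before f runs
                result = f(self, *args, **kwargs)
                param[:] = tmp + param                  # accumulate, not overwrite
                return result
            return f(self, *args, **kwargs)
        return wrapper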

allocate(shared_outputs=None)[source]

Allocate output buffer to store activations from fprop.

Parameters:shared_outputs (Tensor, optional) – pre-allocated tensor for activations to be computed into
allocate_deltas(global_deltas)
be = None
bprop(error, alpha=1.0, beta=1.0)[source]

Backward propagation of errors through bi-directional recurrent layer.

Parameters:
  • error (Tensor) – tensor containing the errors for each step of model unrolling. Shape: (output_size, sequence_length * batch_size)
  • alpha (float, optional) – scale to apply to input for activation gradient bprop. Defaults to 1.0
  • beta (float, optional) – scale to apply to output activation gradient bprop. Defaults to 1.0
Returns:back-propagated errors for each step of time unrolling, for each mini-batch element. Shape: (input_size, sequence_length * batch_size)
Return type:Tensor
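
The alpha/beta arguments appear to follow the usual BLAS-style blending convention; a numpy illustration of that assumed semantics:

    import numpy as np

    alpha, beta = 1.0, 1.0
    prev_delta = np.ones((4, 8))        # existing contents of the delta buffer
    new_grad = np.full((4, 8), 2.0)     # freshly computed gradient
    # assumed semantics: delta_out = alpha * new_grad + beta * prev_delta
    delta_out = alpha * new_grad + beta * prev_delta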

classnm

Returns the class name.

configure(in_obj)[source]

Set shape based parameters of this layer given an input tuple, int or input layer.

Parameters:in_obj (int, tuple, Layer, Tensor or dataset) – object that provides shape information for layer
Returns:shape of output data
Return type:(tuple)
fprop(inputs, inference=False)[source]

Forward propagation of input to bi-directional recurrent layer.

Parameters:
  • inputs (Tensor) –

    input to the model for each time step of unrolling and for each input in the mini-batch. Shape: (feature_size, sequence_length * batch_size), where:

    • feature_size: input size
    • sequence_length: degree of model unrolling
    • batch_size: number of inputs in each mini-batch
  • inference (bool, optional) – Set to true if you are running inference (only care about forward propagation without associated backward propagation). Default is False.
Returns:layer output activations for each time step of unrolling and for each input in the minibatch. Shape: (output_size, sequence_length * batch_size)
Return type:Tensor
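
Continuing the construction sketch above, a hedged usage example; x is assumed to be a backend tensor already laid out as described:

    # x: backend tensor of shape (feature_size, sequence_length * batch_size)
    out = birnn.fprop(x, inference=True)
    # out: activations of shape (output_size, sequence_length * batch_size)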

gen_class(pdict)
get_description(get_weights=False, keep_states=True)

Get layer parameters. All parameters are needed for optimization, but only weights are serialized.

Parameters:
  • get_weights (bool, optional) – Control whether all parameters are returned or just the weights for serialization.
  • keep_states (bool, optional) – Control whether the layer states are included along with the weights.
get_is_mklop()

Returns the is_mklop flag. True means this op runs on the MKL backend and may require a conversion when receiving input from a non-MKL op.

get_param_attrs()
get_params()

Get layer parameters, gradients, and states for optimization.

get_params_serialize(keep_states=True)
get_terminal()

Used for recursively getting final nodes from layer containers.

init_buffers(inputs)[source]

Initialize buffers for recurrent internal units and outputs. Buffers are initialized as 2D tensors with the second dimension being steps * batch_size. A list of views is created on the buffer for easy manipulation of data related to a certain time step.

Parameters:inputs (Tensor) – input data as 2D tensor. The dimension is (input_size, sequence_length * batch_size)
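
To make the buffer layout concrete, a numpy sketch of the per-step views described here (numpy stands in for the backend tensor; the slicing scheme is an assumption consistent with the stated shapes):

    import numpy as np

    input_size, steps, batch_size = 64, 20, 32
    buf = np.zeros((input_size, steps * batch_size))  # 2D buffer: steps * batch_size columns
    # one view per time step, each covering the full mini-batch at that step
    views = [buf[:, t * batch_size:(t + 1) * batch_size] for t in range(steps)]
    assert views[0].shape == (input_size, batch_size)
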
init_params(shape)[source]

Initialize params for this layer, including weights and biases. The weight matrix and bias are stored as a concatenation of the weights for inputs, the weights for recurrent connections, and the bias.

Parameters:shape (tuple) – contains the number of outputs and the number of inputs
load_weights(pdict, load_states=True)

Load weights.

Parameters:
  • pdict (dict) – dictionary of layer parameters (weights and, optionally, states) to load
  • load_states (bool, optional) – whether to also load the layer states. (Default value = True)
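
A hedged round-trip sketch pairing this with get_description; the exact dictionary layout (e.g., a 'params' key) is an assumption, so check get_description's output on your install:

    desc = birnn.get_description(get_weights=True, keep_states=True)
    # ... persist desc, then later, on a freshly configured layer:
    birnn.load_weights(desc['params'], load_states=True)  # 'params' key assumed
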
modulenm

Returns the full module path.

nested_str(level=0)

Utility function for displaying layer info with a given indentation level.

Parameters:level (int, optional) – indentation level
Returns:layer info at the given indentation level
Return type:str
recursive_gen(pdict, key)

Helper method to check whether the definition dictionary defines a NervanaObject child; if so, it instantiates that object and replaces the dictionary element with an instance of that object.

serialize()

Get state parameters for this layer.

Returns:whatever data this layer needs to receive in order to restore its state
Return type:varies
set_acc_on(acc_on)

Set the acc_on flag according to the bool argument. If set to True, the layer will accumulate some (preset) parameters on calls to functions that are decorated with the accumulates decorator. In order to use this feature, accumulate_updates=True must have been passed to the layer’s allocate function.

This currently only works for a few hard-coded parameters in select layers.

Parameters:acc_on (bool) – Value to set the acc_on flag to.
set_batch_size(N)

Set minibatch size.

Parameters:N (int) – minibatch size
set_deltas(delta_buffers)[source]

Use the pre-allocated (by layer containers) list of buffers for back-propagated error. Deltas are only set, and space only allocated, for layers that own their own deltas (e.g., bias and activation layers work in-place, so they do not own deltas).

Parameters:delta_buffers (list) – list of pre-allocated tensors (provided by layer container)
set_is_mklop()
set_next(layer)

Set next_layer to provided layer.

Parameters:layer (layer) – Next layer
set_not_mklop()
set_params(pdict)

Set layer parameters (weights). Allocate space for other parameters but do not initialize them.

Parameters:pdict (dict, ndarray) – dictionary or ndarray with layer parameters [support for ndarray is DEPRECATED and will be removed]
set_seq_len(S)

Set sequence length.

Parameters:S (int) – sequence length
set_states(pdict)