neon.backends.backend.Backend

class neon.backends.backend.Backend(rng_seed=None, default_dtype=<type 'numpy.float32'>, compat_mode=None, deterministic=None)
Bases: neon.backends.abstract_backend.AbstractBackend
Backend interface used to manipulate Tensor data. This abstract base class defines what operations each concrete backend must support. NervanaGPU and NervanaCPU inherit Backend.
Parameters:  rng_seed (int, optional) – random number generator seed value
 default_dtype (numpy.dtype, optional) – Elemental data type to use when creating new tensors if not otherwise specified. Defaults to np.float32.
 compat_mode (str, optional) – Flag to match implementation of other libraries. Currently only ‘caffe’ is supported, defaults to None.
 deterministic (bool, optional) – Flag to use deterministic kernels where applicable. This may cause a small increase in memory usage and a slowdown. Only relevant for GPU backends.

__init__(rng_seed=None, default_dtype=<type 'numpy.float32'>, compat_mode=None, deterministic=None)
Methods
__init__([rng_seed, default_dtype, ...])
absolute(a[, out]) – Perform elementwise absolute value of Tensor a, storing the result in Tensor out.
add(a, b[, out]) – Perform elementwise addition on the operands, storing the resultant values in the out Tensor.
add_fc_bias(inputs, bias) – Add the bias for a fully connected network layer.
allocate_backend(name, **kargs) – Allocate a named backend.
allocate_new_deltas(delta, in_shape, parallelism) – For MKL backends, allocate new deltas for broadcast.
allocate_new_outputs(layer, share_output)
argmax(a[, axis, out, keepdims]) – Calculates the indices of the maximal element value along the specified axis.
argmin(a[, axis, out, keepdims]) – Calculates the indices of the minimal element value along the specified axis.
array(ary[, dtype, name, persist_values, ...]) – Instantiate a new instance of this backend’s Tensor class, populating elements based on ary values.
backend_choices() – Return the list of available backends.
batched_dot(A, B, C[, alpha, beta, relu]) – Perform one of several batched dot product operations.
begin(block, identifier) – Signal the start of a block of repeated computation (at the start of a loop).
binarize(a[, stochastic, out]) – Perform elementwise binarization.
bprop_conv(layer, F, E, grad_I[, alpha, repeat]) – Backward propagate the error through a convolutional network layer.
bprop_pool(layer, I, E, grad_I) – Backward propagate pooling layer.
check_caffe_compat() – Check whether compatibility mode is set to ‘caffe’.
clean_data(tensor, layer_mkl) – For MKL backends, clean MKL data (memory is not freed).
cleanup_backend() – Release any resources that have been acquired by this backend.
clip(a, a_min, a_max[, out]) – Performs elementwise clipping of Tensor a, storing the result in out.
compound_bprop_lut(nin, inputs, error, ...) – Backward propagate lookup table layer.
compound_dot(A, B, C[, alpha, beta, relu]) – Perform one of the following operations (* is dot product): C = alpha * A * B + beta * C; C = alpha * A.T * B + beta * C; C = alpha * A * B.T + beta * C.
compound_rnn_unroll_bprop(W_recur, ...[, ...]) – Time step unrolling portion of recurrent layer bprop.
compound_rnn_unroll_fprop(W_recur, h_prev_s, ...) – Time step unrolling portion of recurrent layer fprop.
conv_layer(dtype, N, C, K[, D, H, W, T, R, ...]) – Create a new ConvLayer parameter object.
convert_data(tensor, layer_mkl) – For MKL backends, convert data from MKL layout to normal NumPy layout.
deconv_layer(dtype, N, C, K, P, Q[, R, S, ...]) – Create a new Deconvolution parameter object.
distribute_data(tensor, layer_parallelism) – For backends which support distributed training, this will distribute or gather the error or activation tensor depending on the type of parallelism used to distribute the layer computation.
divide(a, b[, out]) – Perform elementwise division on the operands, storing the resultant values in the out Tensor.
dot(a, b[, out]) – Dot product of two Tensors.
empty(shape[, dtype, name, persist_values, ...]) – Instantiate a new instance of this backend’s Tensor class, without initializing element values.
empty_like(other_ary[, name, persist_values]) – Instantiate a new instance of this backend’s Tensor class, with the shape taken from other_ary.
end(block, identifier) – Signal the corresponding end of a block of repeated computation (at the end of a loop).
equal(a, b[, out]) – Performs elementwise equality testing on each element of left and right, storing the result in out.
execute(node) – Execute the optree.
exp(a[, out]) – Perform elementwise exponential transformation on Tensor a, storing the result in Tensor out.
exp2(a[, out]) – Perform elementwise base-2 exponential transformation on Tensor a, storing the result in Tensor out.
fabs(a[, out]) – Perform elementwise absolute value of Tensor a, storing the result in Tensor out.
finite(a[, out]) – Perform elementwise test of finiteness (neither infinity nor Not a Number) on Tensor a, storing the result in Tensor out.
fprop_conv(layer, I, F, O[, alpha, relu, repeat]) – Forward propagate the inputs of a convolutional network layer to produce output.
fprop_pool(layer, I, O) – Forward propagate pooling layer.
gen_rng([seed]) – Set up the random number generator(s) and store the state in self.init_rng_state.
greater(a, b[, out]) – Performs elementwise greater than testing on each element of left and right, storing the result in out.
greater_equal(a, b[, out]) – Performs elementwise greater than or equal testing on each element of left and right, storing the result in out.
iobuf(dim0[, x, dtype, name, ...]) – Allocate input and output buffer for layer based on batch size.
less(a, b[, out]) – Performs elementwise less than testing on each element of left and right, storing the result in out.
less_equal(a, b[, out]) – Performs elementwise less than or equal testing on each element of left and right, storing the result in out.
log(a[, out]) – Perform elementwise natural logarithm transformation on Tensor a, storing the result in Tensor out.
log2(a[, out]) – Perform elementwise base-2 logarithm transformation on Tensor a, storing the result in Tensor out.
make_binary_mask(out[, keepthresh]) – Create a binary mask for dropout layers.
max(a[, axis, out, keepdims]) – Calculates the maximal element value along the specified axes.
maximum(a, b[, out]) – Performs elementwise maximum value assignment based on corresponding elements of left and right, storing the result in out.
mean(a[, axis, partial, out, keepdims]) – Calculates the arithmetic mean of the elements along the specified axes.
min(a[, axis, out, keepdims]) – Calculates the minimal element value along the specified axes.
minimum(a, b[, out]) – Performs elementwise minimum value assignment based on corresponding elements of left and right, storing the result in out.
multiply(a, b[, out]) – Perform elementwise multiplication on the operands, storing the resultant values in the out Tensor.
negative(a[, out]) – Perform elementwise negation of Tensor a, storing the result in Tensor out.
not_equal(a, b[, out]) – Performs elementwise inequality testing on each element of left and right, storing the result in out.
onehot(indices, axis[, out]) – Generate optree for converting indices to a onehot representation.
ones(shape[, dtype, name, persist_values, ...]) – Instantiate a new instance of this backend’s Tensor class, populating each element with a value of 1.
output_dim(X, S, padding, strides[, ...]) – Compute along 1 dimension, with these sizes, what will be the output dimension.
pool_layer(dtype, op, N, C[, D, H, W, J, T, ...]) – Create a new PoolLayer parameter object.
power(a, b[, out]) – Perform elementwise raise of tsr values to specified power, storing the result in Tensor out.
reciprocal(a[, out]) – Perform elementwise reciprocal of Tensor a, storing the result in Tensor out.
revert_tensor(tensor) – Reverts a tensor to its original state after being distributed by distribute_data.
rint(a[, out]) – Perform elementwise rounding to nearest int.
rng_get_state(state) – Get the random number generator state.
rng_reset() – Reset the random state to the state where the Backend is first initialized.
rng_set_state(state) – Set the random number generator state to a specific state.
safelog(a[, out]) – Perform elementwise natural logarithm transformation on Tensor a, storing the result in Tensor out.
set_caffe_compat() – Set flag to make layers compatible with caffe in terms of conv and pool layer output size determination and dropout layer implementation.
sgn(a[, out]) – Perform elementwise indication of the sign of Tensor a, storing the result in Tensor out.
shared_iobuf_size(shape, parallelism) – Computes the backend specific size needed for an iobuf with a specified shape that is meant to be shared between layers.
shift(a, b[, value, out]) – Performs elementwise shift based on corresponding elements of left and right, storing the result in out.
sig(a[, out]) – Perform elementwise sigmoid transformation on Tensor a, storing the result in Tensor out.
sig2(a[, out]) – Perform elementwise base-2 sigmoid transformation on Tensor a, storing the result in Tensor out.
sqrt(a[, out]) – Perform elementwise square root of Tensor a, storing the result in Tensor out.
square(a[, out]) – Perform elementwise square of Tensor a, storing the result in Tensor out.
std(a[, axis, partial, out, keepdims]) – Calculates the standard deviation of the elements along the specified axes.
subtract(a, b[, out]) – Perform elementwise subtraction on the operands, storing the resultant values in the out Tensor.
sum(a[, axis, out, keepdims]) – Calculates the summation of the elements along the specified axis.
take(a, indices, axis[, out]) – Extract elements based on the indices along a given axis.
tanh(a[, out]) – Perform elementwise hyperbolic tangent transformation on Tensor a, storing the result in Tensor out.
tanh2(a[, out]) – Perform elementwise base-2 hyperbolic tangent transformation on Tensor a, storing the result in Tensor out.
true_divide(a, b[, out]) – Here it is an alias of divide.
update_conv(layer, I, E, grad_F[, alpha, repeat]) – Compute the updated gradient for a convolutional network layer.
update_fc_bias(err, out) – Compute the updated bias gradient for a fully connected network layer.
var(a[, axis, partial, out, keepdims, binary]) – Calculates the variance of the elements along the specified axes.
xnor_compound_dot(A, B, C[, beta]) – Performs XNOR GEMM.
zeros(shape[, dtype, name, persist_values, ...]) – Instantiate a new instance of this backend’s Tensor class, populating each element with a value of 0.
zeros_like(other_ary[, name, persist_values]) – Instantiate a new instance of this backend’s Tensor class, with the shape taken from other_ary and populating each element with a value of 0.
Attributes
backends

absolute(a, out=None)
Perform elementwise absolute value of Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.
Parameters:  a (Tensor) – input to be transformed.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:

add(a, b, out=None)
Perform elementwise addition on the operands, storing the resultant values in the out Tensor. Each operand and out must have identical shape or be broadcastable as such.
Parameters:  a (Tensor, numeric) – left-hand side operand.
 b (Tensor, numeric) – right-hand side operand.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:

allocate_new_deltas(delta, in_shape, parallelism)
For MKL backends, allocate new deltas for broadcast.

argmax(a, axis=1, out=None, keepdims=True)
Calculates the indices of the maximal element value along the specified axis. If multiple elements contain the maximum, only the indices of the first are returned.
Parameters:  a (Tensor) – the Tensor on which to perform the operation
 axis (int, optional) – the dimension along which to compute. If set to None, we will take argmax over all dimensions. Defaults to 1
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
 keepdims (bool, optional) – Keep the axes being computed over in the output (with size 1), instead of collapsing. Defaults to True.
Returns: the resulting optree
Return type:
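The tie-breaking and keepdims behaviour described above can be illustrated with a small pure-Python sketch. It is not the backend's implementation: argmax_rows is a hypothetical helper that works on nested lists rather than Tensors, and it only covers the default axis=1 case.

```python
def argmax_rows(a, keepdims=True):
    """Index of the first maximal element in each row (axis=1) of a 2-D list."""
    result = []
    for row in a:
        best = 0
        for j, value in enumerate(row):
            # Strict '>' keeps the earliest index when the maximum repeats.
            if value > row[best]:
                best = j
        # keepdims=True keeps the reduced axis with size 1.
        result.append([best] if keepdims else best)
    return result


scores = [[0.1, 0.7, 0.7],
          [0.9, 0.2, 0.9]]
print(argmax_rows(scores))                  # [[1], [0]]
print(argmax_rows(scores, keepdims=False))  # [1, 0]
```

With keepdims=True the result keeps one column per row, which matches the "size 1" axis the parameter description refers to.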

argmin(a, axis=1, out=None, keepdims=True)
Calculates the indices of the minimal element value along the specified axis. If multiple elements contain the minimum, only the indices of the first are returned.
Parameters:  a (Tensor) – the Tensor on which to perform the operation
 axis (int, optional) – the dimension along which to compute. If set to None, we will take argmin over all dimensions. Defaults to 1
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
 keepdims (bool, optional) – Keep the axes being computed over in the output (with size 1), instead of collapsing. Defaults to True.
Returns: the resulting optree
Return type:

array(ary, dtype=None, name=None, persist_values=True, parallel=False, distributed=False)
Instantiate a new instance of this backend’s Tensor class, populating elements based on ary values.
Parameters:  ary (array_like) – input array object to construct from. Can be builtin python scalar or list (of lists), or a numpy.ndarray
 dtype (datatype, optional) – If present, specifies the underlying type to employ for each element.
 name (str, optional) – name identifying the tensor (used in printing).
 persist_values (bool, optional) – If set to True (the default), the values assigned to this Tensor will persist across multiple begin and end calls. Setting to False may provide a performance increase if values do not need to be maintained across such calls
 parallel (bool, optional) – If True and using multiGPU backend, replicate copies of this tensor across devices. Defaults to False, and has no effect on CPU, or (single) GPU backends.
 distributed (bool, optional) – If True and using multiGPU backend, this tensor is fragmented and partitioned across devices. Defaults to False, and has no effect on CPU, or (single) GPU backends.
Returns: array object
Return type: Tensor
Raises: NotImplementedError – Can’t be instantiated directly.

backends = {'mkl': <class 'neon.backends.nervanamkl.NervanaMKL'>, 'cpu': <class 'neon.backends.nervanacpu.NervanaCPU'>}

batched_dot(A, B, C, alpha=1.0, beta=0.0, relu=False)
Perform one of the following operations:
1. For fprop: A(K,C), B(X,C,N), C(X,K,N) -> call batched_dot(A, B, C)
2. For bprop: A(K,C), B(X,K,N), C(X,C,N) -> call batched_dot(A.T, B, C)
3. For update: A(X,K,N), B(X,C,N), C(K,C) -> call batched_dot(A, B.T, C)
Parameters:

begin(block, identifier)
Signal the start of a block of repeated computation (at the start of a loop). This operation can be used to help the compiler optimize instruction performance, but has no direct effect on calculations. It must be bookended by a corresponding Backend.end() call. Note that multiple begin calls can appear adjacent in nested loops.
Parameters:  block (Block.attr) – identifies the type of computation being worked on based on Block attribute specified
 identifier (int) – unique identifier for this particular iteration of the block. Will typically be something like epoch number, minibatch number, and so forth.
See also: Backend.end()
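One way to keep begin and end calls paired in nested loops is a context manager. The sketch below is illustrative only: RecordingBackend is a hypothetical stand-in that just records calls, block_scope is a hypothetical helper, and the string block names are placeholders for the Block attributes a real backend expects.

```python
from contextlib import contextmanager


class RecordingBackend:
    """Hypothetical stand-in that only records begin/end calls."""

    def __init__(self):
        self.calls = []

    def begin(self, block, identifier):
        self.calls.append(("begin", block, identifier))

    def end(self, block, identifier):
        self.calls.append(("end", block, identifier))


@contextmanager
def block_scope(be, block, identifier):
    """Match every begin() with an end(), even if the body raises."""
    be.begin(block, identifier)
    try:
        yield
    finally:
        be.end(block, identifier)


be = RecordingBackend()
for epoch in range(2):
    with block_scope(be, "epoch", epoch):
        for minibatch in range(2):
            with block_scope(be, "minibatch", minibatch):
                pass  # per-minibatch work would go here

print(be.calls[:3])
```

The try/finally form means an exception inside the loop body cannot leave a begin call without its end.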

binarize(a, stochastic=True, out=None)
Perform elementwise binarization.
Parameters:  a (Tensor) – input to be transformed.
 stochastic (bool, optional) – If True, binarize stochastically; if False, deterministically. Defaults to True.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:

bprop_conv(layer, F, E, grad_I, alpha=1.0, repeat=1)
Backward propagate the error through a convolutional network layer.
Parameters:  layer – the conv layer as a parameter object
 F (Tensor) – the weights (filters)
 E (Tensor) – errors
 grad_I (Tensor) – gradient to inputs (output delta)
 alpha (float, optional) – linear scaling. Defaults to 1.0
 repeat (int, optional) – Repeat this operation the specified number of times. Defaults to 1.

bprop_pool(layer, I, E, grad_I)
Backward propagate pooling layer.
Parameters:

clip(a, a_min, a_max, out=None)
Performs elementwise clipping of Tensor a, storing the result in out. Each clipped value lies in the closed interval [a_min, a_max].
Parameters:  a (Tensor) – the Tensor on which to perform the operation
 a_min (Tensor, numeric) – lower bound for clip (inclusive).
 a_max (Tensor, numeric) – upper bound for clip (inclusive).
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:
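As a quick illustration of the inclusive bounds, here is a pure-Python sketch of the same clamping rule. clip_list is a hypothetical helper operating on a flat list, not the backend method, and it assumes scalar bounds.

```python
def clip_list(values, a_min, a_max):
    """Clamp every element into the closed range [a_min, a_max]."""
    return [min(max(v, a_min), a_max) for v in values]


print(clip_list([-2.0, 0.0, 0.5, 1.0, 3.0], 0.0, 1.0))  # [0.0, 0.0, 0.5, 1.0, 1.0]
```

Values already equal to a bound pass through unchanged, which is what "inclusive" means for a_min and a_max.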

compound_bprop_lut(nin, inputs, error, error_t, dW, pad_idx, alpha=1.0, beta=0)
Backward propagate lookup table layer.
Parameters:

compound_dot(A, B, C, alpha=1.0, beta=0.0, relu=False)
Perform one of the following operations (* is dot product):
C = alpha * A * B + beta * C
C = alpha * A.T * B + beta * C
C = alpha * A * B.T + beta * C
relu: if True, ReLU is applied before output (and prior to the beta addition).
The operation is short-circuited to out <- alpha * left * right if beta has value 0 (the default).
Parameters:
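The arithmetic above can be written out as a small reference sketch on nested lists. This is an assumption-laden illustration, not the backend kernel: matmul and compound_dot_ref are hypothetical names, and the ordering (scale by alpha, apply ReLU, then add beta * C) follows the description of relu given above.

```python
def matmul(A, B):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def compound_dot_ref(A, B, C, alpha=1.0, beta=0.0, relu=False):
    """Reference for C = alpha * A * B + beta * C, updating C in place."""
    prod = matmul(A, B)
    for i, row in enumerate(prod):
        for j, p in enumerate(row):
            value = alpha * p
            if relu:
                value = max(value, 0.0)  # applied before the beta term
            # With beta == 0 the old contents of C are never read.
            C[i][j] = value if beta == 0 else value + beta * C[i][j]
    return C


A = [[1, 2], [3, 4]]
B = [[1, 0], [0, -1]]
C = [[10.0, 10.0], [10.0, 10.0]]
print(compound_dot_ref(A, B, C, alpha=2.0, beta=0.5, relu=True))
# [[7.0, 5.0], [11.0, 5.0]]
```

The transposed variants correspond to passing a transposed operand, for example [list(r) for r in zip(*A)] in place of A in this sketch.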

compound_rnn_unroll_bprop(W_recur, delta_prev_s, delta_s, h_s, nout, num_steps, num_used_steps, activation, reverse=True)
Time step unrolling portion of recurrent layer bprop.
Parameters:  W_recur (Tensor) – Recurrent weight matrix.
 delta_prev_s (Array) – Array of per time step input delta tensors. Each element in the array is a single time step view into one tensor containing all of the time steps in sequence.
 delta_s (Array) – Array of per time step input delta tensors. Each element in the array is a single time step view into one tensor containing all of the time steps in sequence.
 h_s (Tensor) – Array of per time step hidden state tensors. Each element in the array is a single time step view into one tensor containing all of the time steps in sequence.
 nout (integer) – Number of output units for the layer.
 num_steps (integer) – Total number of time steps in the buffer.
 num_used_steps (integer) – Number of time steps being used for real data.
 activation (Transform) – Activation function for the layer.
 reverse (boolean) – When true, unrolling will iterate over time steps in reverse (default case for RNN).

compound_rnn_unroll_fprop(W_recur, h_prev_s, h_ff_s, h_s, bias, nout, num_steps, num_used_steps, activation, reverse=False)
Time step unrolling portion of recurrent layer fprop.
Parameters:  W_recur (Tensor) – Recurrent weight matrix.
 h_prev_s (Array) – Array of per time step hidden state tensors. Each element in the array is a single time step view into one tensor containing all of the time steps in sequence.
 h_ff_s (Array) – Array of per time step hidden state tensors. Each element in the array is a single time step view into one tensor containing all of the time steps in sequence.
 h_s (Array) – Array of per time step hidden state tensors. Each element in the array is a single time step view into one tensor containing all of the time steps in sequence.
 bias (Tensor) – Bias tensor to add at each time step.
 nout (integer) – Number of output units for the layer.
 num_steps (integer) – Total number of time steps in the buffer.
 num_used_steps (integer) – Number of time steps being used for real data.
 activation (Transform) – Activation function for the layer.
 reverse (boolean) – When true, unrolling will iterate over time steps in reverse (for BiRNN).

conv_layer(dtype, N, C, K, D=1, H=1, W=1, T=1, R=1, S=1, pad_d=0, pad_h=0, pad_w=0, str_d=1, str_h=1, str_w=1, relu=False, bsum=False)
Create a new ConvLayer parameter object. This is then passed as an argument to all the convolution operations.
Parameters:  dtype (datatype, optional) – If present, specifies the underlying type to employ for each element.
 N (int) – Number of images in minibatch
 C (int) – Number of input feature maps
 K (int) – Number of output feature maps
 D (int, optional) – Depth of input image. Defaults to 1
 H (int, optional) – Height of input image. Defaults to 1
 W (int, optional) – Width of input image. Defaults to 1
 T (int, optional) – Depth of filter kernel. Defaults to 1
 R (int, optional) – Height of filter kernel. Defaults to 1
 S (int, optional) – Width of filter kernel. Defaults to 1
 pad_d (int, optional) – amount of zero-padding around the depth edge. Defaults to 0.
 pad_h (int, optional) – amount of zero-padding around the height edge. Defaults to 0.
 pad_w (int, optional) – amount of zero-padding around the width edge. Defaults to 0.
 str_d (int, optional) – factor to step the filters by in the depth direction. Defaults to 1
 str_h (int, optional) – factor to step the filters by in the height direction. Defaults to 1
 str_w (int, optional) – factor to step the filters by in the width direction. Defaults to 1
 relu (bool, optional) – apply a relu transform to the output for fprop or bprop. Defaults to False
 bsum (bool, optional) – calculate the sum along the batchnorm axis for fprop or bprop. Outputs an fp32 tensor of size Kx1. Defaults to False.
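To see how the input size, filter size, padding and stride parameters interact, the usual convolution output-size arithmetic can be sketched as below. This is the standard floor-division formula, offered as an assumption; conv_output_size is a hypothetical helper, and the backend's own output_dim may differ (for example in caffe compatibility mode).

```python
def conv_output_size(X, S, padding, stride):
    """Output length along one axis for input size X and filter size S."""
    return (X - S + 2 * padding) // stride + 1


# Height axis: H=32, R=5, pad_h=2, str_h=1 keeps the size at 32.
print(conv_output_size(32, 5, 2, 1))   # 32
# Width axis: W=224, S=7, pad_w=3, str_w=2 halves the size.
print(conv_output_size(224, 7, 3, 2))  # 112
```

The same function would be applied once per spatial axis (depth, height, width) with that axis's pad and stride values.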

convert_data(tensor, layer_mkl)
For MKL backends, convert data from MKL layout to normal NumPy layout.

deconv_layer(dtype, N, C, K, P, Q, R=1, S=1, pad_d=0, pad_h=0, pad_w=0, str_d=1, str_h=1, str_w=1)
Create a new Deconvolution parameter object. This is then passed as an argument to all deconvolution kernels.
Parameters:  dtype (datatype, optional) – If present, specifies the underlying type to employ for each element.
 N (int) – Number of images in minibatch
 C (int) – Number of input feature maps
 K (int) – Number of output feature maps
 P (int) – Height of output
 Q (int) – Width of output
 R (int, optional) – Height of filter kernel. Defaults to 1
 S (int, optional) – Width of filter kernel. Defaults to 1
 pad_d (int, optional) – amount of zero-padding around the depth edge. Defaults to 0.
 pad_h (int, optional) – amount of zero-padding around the height edge. Defaults to 0.
 pad_w (int, optional) – amount of zero-padding around the width edge. Defaults to 0.
 str_d (int, optional) – factor to step the filters by in the depth direction. Defaults to 1
 str_h (int, optional) – factor to step the filters by in the height direction. Defaults to 1
 str_w (int, optional) – factor to step the filters by in the width direction. Defaults to 1
Leave spatial dimensions at 1 to allow feature map pooling in the fc layers.

distribute_data(tensor, layer_parallelism)
For backends which support distributed training, this will distribute or gather the error or activation tensor depending on the type of parallelism used to distribute the layer computation. Currently this is only supported by multiGPU in Nervana cloud.
Parameters:  tensor – Tensor containing either activations or errors
 layer_parallelism – Type of parallelism expected by the layer
Returns: Tensor which has been altered by this call or None

divide(a, b, out=None)
Perform elementwise division on the operands, storing the resultant values in the out Tensor. Each operand and out must have identical shape or be broadcastable as such.
Parameters:  a (Tensor, numeric) – left-hand side operand.
 b (Tensor, numeric) – right-hand side operand.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:

dot(a, b, out=None)
Dot product of two Tensors.
Parameters:
Returns: the resulting optree from this operation.
Return type:

empty(shape, dtype=None, name=None, persist_values=True, parallel=False, distributed=False)
Instantiate a new instance of this backend’s Tensor class, without initializing element values. This is slightly faster than array(), ones(), and zeros(), but the element values will be arbitrary (uninitialized).
Parameters:  shape (int, list) – length of each dimension of the Tensor.
 dtype (datatype, optional) – If present, specifies the underlying type to employ for each element.
 name (str, optional) – name identifying the tensor (used in printing).
 persist_values (bool, optional) – If set to True (the default), the values assigned to this Tensor will persist across multiple begin and end calls. Setting to False may provide a performance increase if values do not need to be maintained across such calls
 parallel (bool, optional) – If True and using multiGPU backend, replicate copies of this tensor across devices. Defaults to False, and has no effect on CPU, or (single) GPU backends.
 distributed (bool, optional) – If True and using multiGPU backend, this tensor is fragmented and partitioned across devices. Defaults to False, and has no effect on CPU, or (single) GPU backends.
Returns: array object
Return type: Tensor
Raises: NotImplementedError – Can’t be instantiated directly.

empty_like(other_ary, name=None, persist_values=True)
Instantiate a new instance of this backend’s Tensor class, with the shape taken from other_ary.
Parameters:  other_ary (tensor object) – Tensor to inherit the dimensions of.
 name (str, optional) – name identifying the tensor (used in printing).
 dtype (datatype, optional) – If present, specifies the underlying type to employ for each element.
 persist_values (bool, optional) – If set to True (the default), the values assigned to this Tensor will persist across multiple begin and end calls. Setting to False may provide a performance increase if values do not need to be maintained across such calls.
Returns: array object
Return type: Tensor
Raises: NotImplementedError – Can’t be instantiated directly.

end(block, identifier)
Signal the corresponding end of a block of repeated computation (at the end of a loop). This operation can be used to help the compiler optimize performance, but has no direct effect on calculations. It must be preceded by a corresponding Backend.begin() call.
Parameters:  block (Block.attr) – identifies the type of computation being worked on based on Block attribute specified
 identifier (int) – unique identifier for this particular iteration of the block. Will typically be something like epoch number, minibatch number, and so forth.
See also: Backend.begin()

equal(a, b, out=None)
Performs elementwise equality testing on each element of left and right, storing the result in out. Each operand is assumed to be the same shape (or broadcastable as such).
Parameters:  a (Tensor, numeric) – left-hand side operand.
 b (Tensor, numeric) – right-hand side operand.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:

execute(node)
Execute the optree. There must be one and only one ‘assign’ op at the top of the optree when execute is called.
Parameters: node (OpTreeNode) – The optree to execute.
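The idea of building an optree first and executing it later can be sketched with a toy lazy expression tree. Everything here is hypothetical and simplified: nodes are plain tuples rather than OpTreeNode objects, leaves are numbers rather than Tensors, and the output is a one-element list standing in for the out Tensor.

```python
import operator

OPS = {"add": operator.add, "mul": operator.mul}


def op(name, left, right):
    """Build a tree node without computing anything."""
    return (name, left, right)


def assign(out, tree):
    """Wrap a tree in the single top-level 'assign' node."""
    return ("assign", out, tree)


def evaluate(node):
    """Recursively evaluate a tree; leaves are plain numbers."""
    if isinstance(node, tuple):
        name, left, right = node
        return OPS[name](evaluate(left), evaluate(right))
    return node


def execute(node):
    """Evaluate the tree and store the result in the 'assign' target."""
    if not (isinstance(node, tuple) and node[0] == "assign"):
        raise ValueError("top of the tree must be an 'assign' op")
    _, out, tree = node
    out[0] = evaluate(tree)
    return out


tree = op("add", 2.0, op("mul", 3.0, 4.0))  # nothing is computed yet
result = execute(assign([None], tree))
print(result)  # [14.0]
```

In this sketch, calling execute on a tree without the top-level assign raises ValueError, mirroring the requirement stated above.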

exp(a, out=None)
Perform elementwise exponential transformation on Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.
Parameters:  a (Tensor) – input to be transformed.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:

exp2(a, out=None)
Perform elementwise base-2 exponential transformation on Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.
Parameters:  a (Tensor) – input to be transformed.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:

fabs(a, out=None)
Perform elementwise absolute value of Tensor a, storing the result in Tensor out. Both Tensors should have identical shape. Implemented as an alias of absolute.
Parameters:  a (Tensor) – input to be transformed.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:

finite(a, out=None)
Perform elementwise test of finiteness (neither infinity nor Not a Number) on Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.
Parameters:  a (Tensor) – input to be transformed.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:
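The finiteness rule can be checked with the standard library. This sketch uses a hypothetical finite_list helper on a flat list of floats; how a backend encodes the result in the out Tensor (for example as 1 and 0) is not covered here.

```python
import math


def finite_list(values):
    """True where the value is neither infinite nor NaN."""
    return [math.isfinite(v) for v in values]


samples = [1.0, float("inf"), float("-inf"), float("nan")]
print(finite_list(samples))  # [True, False, False, False]
```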

fprop_conv(layer, I, F, O, alpha=1.0, relu=False, repeat=1)
Forward propagate the inputs of a convolutional network layer to produce output.
Parameters:  layer – the conv layer as a parameter object
 I (Tensor) – inputs
 F (Tensor) – the weights (filters)
 O (Tensor) – outputs
 alpha (float, optional) – linear scaling. Defaults to 1.0
 relu (bool, optional) – apply ReLU before output. Defaults to False.
 repeat (int, optional) – Repeat this operation the specified number of times. Defaults to 1.

fprop_pool(layer, I, O)
Forward propagate pooling layer.
Parameters:

gen_rng(seed=None)
Set up the random number generator(s) and store the state in self.init_rng_state.
Parameters: seed (int or None) – RNG seed. If the seed is None, a seed will be chosen randomly.
Returns: numpy RNG
Return type: np.random.RandomState

greater(a, b, out=None)
Performs elementwise greater than testing on each element of left and right, storing the result in out. Each operand is assumed to be the same shape (or broadcastable as such).
Parameters:  a (Tensor, numeric) – left-hand side operand.
 b (Tensor, numeric) – right-hand side operand.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:

greater_equal(a, b, out=None)
Performs elementwise greater than or equal testing on each element of left and right, storing the result in out. Each operand is assumed to be the same shape (or broadcastable as such).
Parameters:  a (Tensor, numeric) – left-hand side operand.
 b (Tensor, numeric) – right-hand side operand.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:

iobuf(dim0, x=None, dtype=None, name=None, persist_values=True, shared=None, parallelism=None)
Allocate input and output buffer for layer based on batch size. This is used because the layer does not know about the batch size.
Parameters:  dim0 (tuple or int) – I/O buffer dimension for layer (without the axis specifying the batch size).
 x (datatype, optional) – If present and not None, x will be returned directly. x is not None when the buffer has already been allocated.
 dtype (datatype, optional) – If present, specifies the underlying type to employ for each element.
 name (str, optional) – name identifying the tensor (used in printing).
 persist_values (bool, optional) – If set to True (the default), the values assigned to this Tensor will persist across multiple begin and end calls. Setting to False may provide a performance increase if values do not need to be maintained across such calls
 shared (buffer, optional) – If present will attempt to reuse the memory in shared to allocate the I/O buffer
 parallelism (str, optional) – Indicates type of parallelism (Data, Model) employed by this buffer. Ignored on CPU and GPU backends, defaults to no parallelism.
Returns: array object
Return type:
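The relationship between dim0, the batch size and the x argument can be sketched as follows. This is a guess at the shape logic, not the backend code: iobuf_sketch is a hypothetical helper, the batch size is passed explicitly, and the layout (flattened feature dimensions first, batch axis last) is an assumption.

```python
import math


def iobuf_sketch(dim0, batch_size, x=None):
    """Return x if already allocated, else a zeroed (features, batch) buffer."""
    if x is not None:
        return x  # buffer already allocated: hand it back unchanged
    features = math.prod(dim0) if isinstance(dim0, tuple) else dim0
    return [[0.0] * batch_size for _ in range(features)]


buf = iobuf_sketch((2, 3, 4), batch_size=5)
print(len(buf), len(buf[0]))                               # 24 5
print(iobuf_sketch((2, 3, 4), batch_size=5, x=buf) is buf)  # True
```

Passing the previous result back as x makes repeated calls cheap, which is the reuse pattern the x parameter describes.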

less(a, b, out=None)
Performs elementwise less than testing on each element of left and right, storing the result in out. Each operand is assumed to be the same shape (or broadcastable as such).
Parameters:  a (Tensor, numeric) – left-hand side operand.
 b (Tensor, numeric) – right-hand side operand.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:

less_equal(a, b, out=None)
Performs elementwise less than or equal testing on each element of left and right, storing the result in out. Each operand is assumed to be the same shape (or broadcastable as such).
Parameters:  a (Tensor, numeric) – left-hand side operand.
 b (Tensor, numeric) – right-hand side operand.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:
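The elementwise comparisons and the "broadcastable" wording can be illustrated with NumPy, whose `np.less` and `np.less_equal` follow the same elementwise idea. This is an analogy rather than neon code; NumPy returns booleans, whereas a backend may store the result in whatever dtype out has.

```python
import numpy as np

a = np.array([[1.0, 5.0, 3.0],
              [4.0, 2.0, 6.0]])
b = np.array([[2.0],
              [4.0]])           # shape (2, 1), broadcast across the columns of a

lt = np.less(a, b)              # a < b, elementwise
le = np.less_equal(a, b)        # a <= b, elementwise
```

The two results differ only where the operands are equal (here the entry 4.0 in the second row).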

log
(a, out=None)[source]¶ Perform elementwise natural logarithm transformation on Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.
Parameters:  a (Tensor) – input to be transformed.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:

log2
(a, out=None)[source]¶ Perform elementwise base-2 logarithm transformation on Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.
Parameters:  a (Tensor) – input to be transformed.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:
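As a quick illustration of how the natural and base-2 logarithms relate, here is a NumPy sketch (NumPy stands in for the backend; no neon calls are used).

```python
import numpy as np

a = np.array([1.0, 2.0, 8.0])

nat = np.log(a)      # natural logarithm
base2 = np.log2(a)   # base-2 logarithm: 0, 1, 3 for these inputs
```

The two differ by a constant factor: `np.log(a)` equals `np.log2(a) * np.log(2)`.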

make_binary_mask
(out, keepthresh=0.5)¶ Create a binary mask for dropout layers.
Parameters:  out (Tensor) – Output tensor
 keepthresh (float, optional) – fraction of ones. Defaults to 0.5
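One way to picture a dropout mask with a given fraction of ones is the NumPy sketch below. The function name `make_binary_mask_sketch` and the `rng` argument are hypothetical, and drawing uniform samples and thresholding them is an assumed technique, not necessarily what a given backend does.

```python
import numpy as np

def make_binary_mask_sketch(out, keepthresh=0.5, rng=None):
    """Fill out in place with 1s (keep) and 0s (drop)."""
    rng = np.random.default_rng() if rng is None else rng
    # Each element is kept with probability keepthresh.
    out[...] = rng.random(out.shape) < keepthresh
    return out

mask = make_binary_mask_sketch(
    np.empty((100, 100), dtype=np.float32),
    keepthresh=0.8,
    rng=np.random.default_rng(0),
)
```

With 10,000 elements, the mean of the mask should sit close to keepthresh.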

max
(a, axis=None, out=None, keepdims=True)[source]¶ Calculates the maximal element value along the specified axes.
Parameters:  a (Tensor) – the Tensor on which to perform the operation
 axis (int, optional) – the dimension along which to compute. If set to None, we will take max over all dimensions.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
 keepdims (bool, optional) – Keep the axes being computed over in the output (with size 1), instead of collapsing. Defaults to True.
Returns: the resulting optree
Return type:
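The effect of axis and keepdims on the result shape can be seen in a NumPy analogue (NumPy's `keepdims` defaults to False, whereas the signature above defaults to True, so it is passed explicitly here).

```python
import numpy as np

a = np.array([[1.0, 7.0, 3.0],
              [4.0, 2.0, 9.0]])

col_max = np.max(a, axis=0, keepdims=True)   # shape (1, 3)
row_max = np.max(a, axis=1, keepdims=True)   # shape (2, 1)
all_max = np.max(a, keepdims=True)           # shape (1, 1), max over all dimensions
```

Keeping the reduced axis at size 1 means the result can be broadcast back against the input without reshaping.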

maximum
(a, b, out=None)[source]¶ Performs elementwise maximum value assignment based on corresponding elements of left and right, storing the result in out. Each operand is assumed to be the same shape (or broadcastable as such).
Parameters:  a (Tensor, numeric) – lefthand side operand.
 b (Tensor, numeric) – righthand side operand.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:
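Unlike max, which reduces along an axis, maximum compares two operands element by element. A NumPy sketch of the distinction (an analogy, not neon code):

```python
import numpy as np

a = np.array([-1.0, 0.5, 3.0])
b = np.array([0.0, 1.0, 2.0])

pairwise = np.maximum(a, b)   # larger of each pair
relu = np.maximum(a, 0.0)     # numeric operand is broadcast: a rectified-linear clamp
```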

mean
(a, axis=None, partial=None, out=None, keepdims=True)[source]¶ Calculates the arithmetic mean of the elements along the specified axes.
Parameters:  a (Tensor) – the Tensor on which to perform the operation
 axis (int, optional) – the dimension along which to compute. If set to None, we will take mean over all dimensions. Defaults to None
 partial (bool, optional) – Not currently used.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
 keepdims (bool, optional) – Keep the axes being computed over in the output (with size 1), instead of collapsing. Defaults to True.
Returns: the resulting optree
Return type:

min
(a, axis=None, out=None, keepdims=True)[source]¶ Calculates the minimal element value along the specified axes.
Parameters:  a (Tensor) – the Tensor on which to perform the operation
 axis (int, optional) – the dimension along which to compute. If set to None, we will take min over all dimensions.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
 keepdims (bool, optional) – Keep the axes being computed over in the output (with size 1), instead of collapsing. Defaults to True.
Returns: the resulting optree
Return type:

minimum
(a, b, out=None)[source]¶ Performs elementwise minimum value assignment based on corresponding elements of left and right, storing the result in out. Each operand is assumed to be the same shape (or broadcastable as such).
Parameters:  a (Tensor, numeric) – lefthand side operand.
 b (Tensor, numeric) – righthand side operand.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:

multiply
(a, b, out=None)[source]¶ Perform elementwise multiplication on the operands, storing the resultant values in the out Tensor. Each operand and out must have identical shape or be broadcastable as such.
Parameters:  a (Tensor, numeric) – lefthand side operand.
 b (Tensor, numeric) – righthand side operand.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:

negative
(a, out=None)[source]¶ Perform elementwise negation of Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.
Parameters:  a (Tensor) – input to be transformed.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:

not_equal
(a, b, out=None)[source]¶ Performs elementwise inequality testing on each element of left and right, storing the result in out. Each operand is assumed to be the same shape (or broadcastable as such).
Parameters:  a (Tensor, numeric) – lefthand side operand.
 b (Tensor, numeric) – righthand side operand.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:

onehot
(indices, axis, out=None)[source]¶ Generate optree for converting indices to a one-hot representation.
Parameters: Returns: the resulting optree
Return type:
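A one-hot representation turns each index into a vector that is zero everywhere except at that index. The NumPy sketch below uses a hypothetical helper `onehot_sketch` and assumes a (num_classes, N) layout with classes along axis 0; in the backend the axis argument would select the layout.

```python
import numpy as np

def onehot_sketch(indices, num_classes):
    """Return a (num_classes, N) array with a single 1 per column."""
    indices = np.asarray(indices).ravel()
    out = np.zeros((num_classes, indices.size), dtype=np.float32)
    # Row = class index, column = position of the sample.
    out[indices, np.arange(indices.size)] = 1.0
    return out

oh = onehot_sketch([2, 0, 1], num_classes=3)
```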

ones
(shape, dtype=None, name=None, persist_values=True, parallel=False, distributed=False)¶ Instantiate a new instance of this backend’s Tensor class, populating each element with a value of 1.
Parameters:  shape (int, list) – length of each dimension of the Tensor.
 dtype (datatype, optional) – If present, specifies the underlying type to employ for each element.
 name (str, optional) – name identifying the tensor (used in printing).
 persist_values (bool, optional) – If set to True (the default), the values assigned to this Tensor will persist across multiple begin and end calls. Setting to False may provide a performance increase if values do not need to be maintained across such calls.
 parallel (bool, optional) – If True and using multiGPU backend, replicate copies of this tensor across devices. Defaults to False, and has no effect on CPU, or (single) GPU backends.
 distributed (bool, optional) – If True and using multiGPU backend, this tensor is fragmented and partitioned across devices. Defaults to False, and has no effect on CPU, or (single) GPU backends.
Returns: array object
Return type:
Raises: NotImplementedError – Can’t be instantiated directly.

output_dim
(X, S, padding, strides, pooling=False, dilation=1)[source]¶ Compute along 1 dimension, with these sizes, what will be the output dimension.
Parameters:
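For orientation, the conventional output-size formula for a one-dimensional convolution or pooling window is sketched below. It assumes X is the input size and S the window size, and it uses plain floor rounding; the pooling flag and any compatibility mode may round differently, which this sketch does not model. The function name is hypothetical.

```python
def output_dim_sketch(X, S, padding, strides, dilation=1):
    """Conventional output size along one dimension (floor rounding)."""
    S_eff = dilation * (S - 1) + 1               # extent of the dilated window
    return (X + 2 * padding - S_eff) // strides + 1

same_size = output_dim_sketch(32, 3, padding=1, strides=1)
no_pad = output_dim_sketch(32, 5, padding=0, strides=1)
strided = output_dim_sketch(7, 3, padding=0, strides=2)
dilated = output_dim_sketch(32, 3, padding=0, strides=1, dilation=2)
```

A 3-wide window with dilation 2 covers 5 input positions, so it shrinks the output by as much as an undilated 5-wide window.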

pool_layer
(dtype, op, N, C, D=1, H=1, W=1, J=1, T=1, R=1, S=1, pad_j=0, pad_d=0, pad_h=0, pad_w=0, str_j=None, str_d=None, str_h=None, str_w=None)¶ Create a new PoolLayer parameter object. This then is passed as an argument to all pooling kernels.
Parameters:  op (str) – “max”, “avg”, “l2” pooling (currently bprop only supports max, but not avg and l2)
 N (int) – Number of images in minibatch
 C (int) – Number of input feature maps
 D (int, optional) – Depth of input image. Defaults to 1
 H (int, optional) – Height of input image. Defaults to 1
 W (int, optional) – Width of input image. Defaults to 1
 J (int, optional) – Size of feature map pooling window (maxout n_pieces). Defaults to 1
 T (int, optional) – Depth of pooling window. Defaults to 1
 R (int, optional) – Height of pooling window. Defaults to 1
 S (int, optional) – Width of pooling window. Defaults to 1
 pad_j (int, optional) – amount of zero-padding around the fm pooling window edge. Defaults to 0.
 pad_d (int, optional) – amount of zero-padding around the depth edge. Defaults to 0.
 pad_h (int, optional) – amount of zero-padding around the height edge. Defaults to 0.
 pad_w (int, optional) – amount of zero-padding around the width edge. Defaults to 0.
 str_j (int, optional) – factor to step the filters by in the fm pooling window direction. Defaults to 1.
 str_d (int, optional) – factor to step the filters by in the depth direction. Defaults to 1.
 str_h (int, optional) – factor to step the filters by in the height direction. Defaults to 1.
 str_w (int, optional) – factor to step the filters by in the width direction. Defaults to 1.
Leave spatial dimensions at 1 to allow feature map pooling in the fc layers.

power
(a, b, out=None)[source]¶ Raise each element of Tensor a to the specified power b, storing the result in Tensor out. Both Tensors should have identical shape.
Parameters:  a (Tensor) – input to be transformed.
 b (Tensor, numeric) – exponent to be applied to each element. Examples include 2 (square), 0.5 (square root).
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:
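The two exponents mentioned in the parameter description behave as follows in a NumPy analogue (`np.power` stands in for the backend call).

```python
import numpy as np

a = np.array([1.0, 4.0, 9.0])

squared = np.power(a, 2)     # exponent 2: square
roots = np.power(a, 0.5)     # exponent 0.5: square root
```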

reciprocal
(a, out=None)[source]¶ Perform elementwise reciprocal of Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.
Parameters:  a (Tensor) – input to be transformed.
 power (Tensor, numeric) – exponentiated value to be applied to element. Examples include 2 (square), 0.5 (square root).
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:

revert_tensor
(tensor)[source]¶ Reverts a tensor to its original state after being distributed by distribute_data.
Parameters: tensor – Tensor to be reverted

rint
(a, out=None)[source]¶ Perform elementwise rounding to nearest int.
Parameters:  a (Tensor) – input to be transformed.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:

rng_get_state
(state)¶ Get the current random number generator state.
Returns a tuple since some backends have multiple RNG states (e.g. on-host and on-device).
Returns: array of numpy ndarray which defines the current state of the RNGs
Return type: tuple

rng_reset
()¶ Reset the random state to the state it had when the Backend was first initialized.

rng_set_state
(state)¶ Set the random number generator state to a specific state.
Parameters: state (np.array) – array which is used to define the RNG state
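The get/set pattern for RNG state is easiest to see with a host-side NumPy generator. This sketch uses `np.random.RandomState` only; a backend with on-device generators would carry additional state, which is not modelled here.

```python
import numpy as np

rng = np.random.RandomState(42)

state = rng.get_state()          # snapshot the generator
first = rng.uniform(size=3)      # advances the generator

rng.set_state(state)             # rewind to the snapshot
second = rng.uniform(size=3)     # reproduces the same draws
```

Saving and restoring state in this way is what makes a sequence of random draws repeatable.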

safelog
(a, out=None)[source]¶ Perform elementwise natural logarithm transformation on Tensor a, storing the result in Tensor out. Both Tensors should have identical shape. This log function has built-in safety against underflow.
Parameters:  a (Tensor) – input to be transformed.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:
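One common way to guard a logarithm against underflow is to clamp the input to a small positive floor before taking the log. The sketch below shows that idea in NumPy; the helper name and the floor value `exp(-50)` are assumptions chosen for illustration, not a statement of what any backend uses.

```python
import numpy as np

def safelog_sketch(a, floor=np.exp(-50.0)):
    """Natural log with inputs clamped to a small positive floor."""
    return np.log(np.maximum(a, floor))

vals = safelog_sketch(np.array([0.0, 1.0, np.e]))
```

An input of 0 yields the log of the floor rather than negative infinity, so downstream sums and gradients stay finite.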

set_caffe_compat
()[source]¶ Set flag to make layers compatible with caffe in terms of conv and pool layer output size determination and dropout layer implementation.

sgn
(a, out=None)[source]¶ Perform elementwise indication of the sign of Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.
Parameters:  a (Tensor) – input to be transformed.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:

Computes the backend-specific size needed for an iobuf with a specified shape that is meant to be shared between layers.
Parameters: Returns: Size of required iobuf
Return type:

shift
(a, b, value=True, out=None)[source]¶ Performs elementwise shift based on corresponding elements of left and right, storing the result in out. Positive is left shift, and negative is right shift. Each operand is assumed to be the same shape (or broadcastable as such).
Parameters:  a (Tensor, numeric) – lefthand side operand.
 b (Tensor, numeric) – righthand side operand.
 value (int) – shift by value or exponent
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:
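The sign convention (positive shifts left, negative shifts right) can be sketched for integer arrays in NumPy. The helper name is hypothetical, the sketch covers integers only, and the value flag from the signature is not modelled.

```python
import numpy as np

def shift_sketch(a, b):
    """Shift a left by b where b >= 0, right by -b where b < 0."""
    a = np.asarray(a)
    b = np.asarray(b)
    left = np.left_shift(a, np.maximum(b, 0))     # shift counts clamped to >= 0
    right = np.right_shift(a, np.maximum(-b, 0))
    return np.where(b >= 0, left, right)

shifted = shift_sketch([8, 8, 8], [1, -2, 0])
```

Shifting left by one doubles the value and shifting right by two divides it by four.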

sig
(a, out=None)[source]¶ Perform elementwise sigmoid transformation on Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.
Parameters:  a (Tensor) – input to be transformed.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:
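For reference, the logistic sigmoid is 1 / (1 + exp(-x)). A minimal NumPy sketch (the function name is illustrative):

```python
import numpy as np

def sigmoid_sketch(a):
    """Elementwise logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-a))

s = sigmoid_sketch(np.array([-2.0, 0.0, 2.0]))
```

The output lies strictly between 0 and 1, equals 0.5 at zero, and satisfies sigmoid(-x) = 1 - sigmoid(x).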

sig2
(a, out=None)[source]¶ Perform elementwise base-2 sigmoid transformation on Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.
Parameters:  a (Tensor) – input to be transformed.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:

sqrt
(a, out=None)[source]¶ Perform elementwise square root of Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.
Parameters:  a (Tensor) – input to be transformed.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:

square
(a, out=None)[source]¶ Perform elementwise square of Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.
Parameters:  a (Tensor) – input to be transformed.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:

std
(a, axis=None, partial=None, out=None, keepdims=True)[source]¶ Calculates the standard deviation of the elements along the specified axes.
Parameters:  a (Tensor) – the Tensor on which to perform the operation
 axis (int, optional) – the dimension along which to compute. If set to None, we will take std over all dimensions.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
 partial (bool, optional) – Not currently used.
 keepdims (bool, optional) – Keep the axes being computed over in the output (with size 1), instead of collapsing. Defaults to True.
Returns: the resulting optree
Return type:

subtract
(a, b, out=None)[source]¶ Perform elementwise subtraction on the operands, storing the resultant values in the out Tensor. Each operand and out must have identical shape or be broadcastable as such.
Parameters:  a (Tensor, numeric) – lefthand side operand.
 b (Tensor, numeric) – righthand side operand.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:

sum
(a, axis=None, out=None, keepdims=True)[source]¶ Calculates the summation of the elements along the specified axis.
Parameters:  a (Tensor) – the Tensor on which to perform the sum
 axis (int, optional) – the dimension along which to compute. If set to None, we will sum over all dimensions.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
 keepdims (bool, optional) – Keep the axes being computed over in the output (with size 1), instead of collapsing. Defaults to True.
Returns: the resulting optree
Return type:
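A typical reason to keep the reduced axis is to divide the input by its own sums. The NumPy sketch below normalizes each column; `keepdims` is passed explicitly because NumPy defaults it to False.

```python
import numpy as np

a = np.array([[1.0, 2.0],
              [3.0, 4.0]])

col_sum = np.sum(a, axis=0, keepdims=True)   # shape (1, 2)
normalized = a / col_sum                     # broadcasts (2, 2) / (1, 2)
collapsed = np.sum(a, axis=0)                # shape (2,), axis dropped
```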

take
(a, indices, axis, out=None)[source]¶ Extract elements based on the indices along a given axis.
Parameters:  a (Tensor) – the Tensor on which to perform the operation
 indices (Tensor, numpy ndarray) – indices of elements to select
 axis (int, optional) – the dimension along which to compute. If set to None, we will extract over all dimensions (flattened first)
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
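NumPy's `np.take` has the same axis and flatten-when-None behavior as described above, so it serves as a convenient illustration (it is an analogy, not the backend call).

```python
import numpy as np

a = np.array([[10, 11, 12],
              [20, 21, 22]])

cols = np.take(a, [2, 0], axis=1)   # pick columns 2 and 0, in that order
flat = np.take(a, [0, 4])           # axis=None: index into the flattened array
```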

tanh
(a, out=None)[source]¶ Perform elementwise hyperbolic tangent transformation on Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.
Parameters:  a (Tensor) – input to be transformed.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:

tanh2
(a, out=None)[source]¶ Perform elementwise base-2 hyperbolic tangent transformation on Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.
Parameters:  a (Tensor) – input to be transformed.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:
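The base-2 variants (sig2, tanh2) are not defined in detail here. A plausible reading, assumed in this sketch, is that the exponential e^x is replaced by 2^x. Under that assumption each base-2 function equals its ordinary counterpart evaluated at x·ln 2. The helper names are hypothetical.

```python
import numpy as np

def sig2_sketch(a):
    """Sigmoid with 2**x in place of e**x (assumed definition)."""
    return 1.0 / (1.0 + np.exp2(-a))

def tanh2_sketch(a):
    """Hyperbolic tangent with 2**x in place of e**x (assumed definition)."""
    e = np.exp2(2.0 * a)
    return (e - 1.0) / (e + 1.0)

x = np.array([-1.5, 0.0, 0.75])
t2 = tanh2_sketch(x)
s2 = sig2_sketch(x)
```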

true_divide
(a, b, out=None)[source]¶ Alias of divide. Performs true division rather than Python’s traditional floor division.
Parameters:  a (Tensor, numeric) – lefthand side operand.
 b (Tensor, numeric) – righthand side operand.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
Returns: the resulting optree
Return type:
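The difference between true division and floor division shows up with integer operands. A NumPy sketch of the contrast:

```python
import numpy as np

a = np.array([7, 8, 9])
b = np.array([2, 2, 2])

true_q = np.true_divide(a, b)     # keeps the fractional part
floor_q = np.floor_divide(a, b)   # rounds down to an integer
```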

update_conv
(layer, I, E, grad_F, alpha=1.0, repeat=1)¶ Compute the updated gradient for a convolutional network layer.
Parameters:

update_fc_bias
(err, out)[source]¶ Compute the updated bias gradient for a fully connected network layer.
Parameters:

var
(a, axis=None, partial=None, out=None, keepdims=True, binary=False)[source]¶ Calculates the variance of the elements along the specified axes.
Parameters:  a (Tensor) – the Tensor on which to perform the operation
 axis (int, optional) – the dimension along which to compute. If set to None, we will take var over all dimensions. Defaults to None
 partial (bool, optional) – Not currently used.
 out (Tensor, optional) – where the result will be stored. If out is None, only the optree will be returned.
 keepdims (bool, optional) – Keep the axes being computed over in the output (with size 1), instead of collapsing. Defaults to True.
Returns: the resulting optree
Return type:
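Variance and standard deviation are related by std = sqrt(var). The NumPy sketch below computes both along one axis using the population form (dividing by N); whether a given backend uses the same normalization is an assumption, and the binary flag from the signature is not modelled.

```python
import numpy as np

a = np.array([[1.0, 2.0, 3.0],
              [4.0, 6.0, 8.0]])

v = np.var(a, axis=1, keepdims=True)   # per-row variance, shape (2, 1)
s = np.std(a, axis=1, keepdims=True)   # per-row standard deviation, shape (2, 1)
```

For the first row the mean is 2 and the squared deviations are 1, 0, 1, giving a variance of 2/3.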

zeros
(shape, dtype=None, name=None, persist_values=True, parallel=False, distributed=False)¶ Instantiate a new instance of this backend’s Tensor class, populating each element with a value of 0.
Parameters:  shape (int, list) – length of each dimension of the Tensor.
 dtype (datatype, optional) – If present, specifies the underlying type to employ for each element.
 name (str, optional) – name identifying the tensor (used in printing).
 persist_values (bool, optional) – If set to True (the default), the values assigned to this Tensor will persist across multiple begin and end calls. Setting to False may provide a performance increase if values do not need to be maintained across such calls.
 parallel (bool, optional) – If True and using multiGPU backend, replicate copies of this tensor across devices. Defaults to False, and has no effect on CPU, or (single) GPU backends.
 distributed (bool, optional) – If True and using multiGPU backend, this tensor is fragmented and partitioned across devices. Defaults to False, and has no effect on CPU, or (single) GPU backends.
Returns: array object
Return type:
Raises: NotImplementedError – Can’t be instantiated directly.

zeros_like
(other_ary, name=None, persist_values=True)¶ Instantiate a new instance of this backend’s Tensor class, with the shape taken from other_ary and populating each element with a value of 0.
Parameters:  other_ary (tensor object) – Tensor to inherit the dimensions of.
 name (str, optional) – name identifying the tensor (used in printing).
 dtype (datatype, optional) – If present, specifies the underlying type to employ for each element.
 persist_values (bool, optional) – If set to True (the default), the values assigned to this Tensor will persist across multiple begin and end calls. Setting to False may provide a performance increase if values do not need to be maintained across such calls.
Returns: array object
Return type:
Raises: NotImplementedError – Can’t be instantiated directly.
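The "inherit the dimensions of another tensor" idea matches NumPy's `np.zeros_like`, shown here as an analogy. NumPy arrays stand in for backend Tensors, and NumPy also copies the dtype by default.

```python
import numpy as np

other = np.ones((2, 3), dtype=np.float32)

z = np.zeros_like(other)   # same shape (and, in NumPy, same dtype) as other
```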