neon.backends.backend.Backend

class neon.backends.backend.Backend(rng_seed=None, default_dtype=numpy.float32, compat_mode=None, deterministic=None)[source]

Bases: neon.backends.abstract_backend.AbstractBackend

Backend interface used to manipulate Tensor data. This abstract base class defines what operations each concrete backend must support. NervanaGPU and NervanaCPU inherit Backend.

Parameters:
  • rng_seed (int, optional) – random number generator seed value
  • default_dtype (numpy.dtype, optional) – Elemental data type to use when creating new tensors if not otherwise specified. Defaults to np.float32
  • compat_mode (str, optional) – Flag to match implementation of other libraries. Currently only ‘caffe’ is supported, defaults to None.
  • deterministic (bool, optional) – Flag to use deterministic kernels where applicable. This may cause a small increase in memory usage and a slowdown. Only relevant for GPU backends.
__init__(rng_seed=None, default_dtype=numpy.float32, compat_mode=None, deterministic=None)[source]
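
In practice a concrete backend is usually constructed through the gen_backend factory in neon.backends rather than by instantiating this class directly. A minimal sketch:

    from neon.backends import gen_backend

    # construct a CPU backend with a fixed RNG seed; tensors created through
    # this backend default to float32 elements
    be = gen_backend(backend='cpu', batch_size=128, rng_seed=0)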

Methods

__init__([rng_seed, default_dtype, ...])
absolute(a[, out]) Perform element-wise absolute value of Tensor a, storing the result in Tensor out.
add(a, b[, out]) Perform element-wise addition on the operands, storing the resultant values in the out Tensor.
add_fc_bias(inputs, bias) Add the bias for a fully connected network layer.
allocate_backend(name, **kargs) Allocate a named backend.
argmax(a[, axis, out, keepdims]) Calculates the indices of the maximal element value along the specified axis.
argmin(a[, axis, out, keepdims]) Calculates the indices of the minimal element value along the specified axis.
array(ary[, dtype, name, persist_values, ...]) Instantiate a new instance of this backend’s Tensor class, populating elements based on ary values.
backend_choices() Return the list of available backends.
batched_dot(A, B, C[, alpha, beta, relu]) Perform one of the following operations:
begin(block, identifier) Signal the start of a block of repeated computation (at the start of a loop).
binarize(a[, stochastic, out]) Perform element-wise binarization.
bprop_conv(layer, F, E, grad_I[, alpha, repeat]) Backward propagate the error through a convolutional network layer.
bprop_pool(layer, I, E, grad_I) Backward propagate pooling layer.
check_caffe_compat() Check whether compatibility mode is set to ‘caffe’.
cleanup_backend() Release any resources that have been acquired by this backend.
clip(a, a_min, a_max[, out]) Performs element-wise clipping of Tensor a, storing the result in out.
compound_bprop_lut(nin, inputs, error, ...) Backward propagate lookup table layer.
compound_dot(A, B, C[, alpha, beta, relu]) Perform one of the following operations (* is dot product): C = alpha*A*B + beta*C; C = alpha*A.T*B + beta*C; C = alpha*A*B.T + beta*C.
compound_rnn_unroll_bprop(W_recur, ...[, ...]) Time step unrolling portion of recurrent layer bprop.
compound_rnn_unroll_fprop(W_recur, h_prev_s, ...) Time step unrolling portion of recurrent layer fprop.
conv_layer(dtype, N, C, K[, D, H, W, T, R, ...]) Create a new ConvLayer parameter object.
deconv_layer(dtype, N, C, K, P, Q[, R, S, ...]) Create a new Deconvolution parameter object.
distribute_data(tensor, layer_parallelism) For backends which support distributed training, this will distribute or gather the error or activation tensor depending on the type of parallelism used to distribute the layer computation.
divide(a, b[, out]) Perform element-wise division on the operands, storing the resultant values in the out Tensor.
dot(a, b[, out]) Dot product of two Tensors.
empty(shape[, dtype, name, persist_values, ...]) Instantiate a new instance of this backend’s Tensor class, without initializing element values.
empty_like(other_ary[, name, persist_values]) Instantiate a new instance of this backend’s Tensor class, with the shape taken from other_ary.
end(block, identifier) Signal the corresponding end of a block of repeated computation (at the end of a loop).
equal(a, b[, out]) Performs element-wise equality testing on each element of left and right, storing the result in out.
execute(node) Execute the optree.
exp(a[, out]) Perform element-wise exponential transformation on Tensor a, storing the result in Tensor out.
exp2(a[, out]) Perform element-wise 2-based exponential transformation on Tensor a, storing the result in Tensor out.
fabs(a[, out]) Perform element-wise absolute value of Tensor a, storing the result in Tensor out.
finite(a[, out]) Perform element-wise test of finiteness (neither infinity nor NaN) on Tensor a, storing the result in Tensor out.
fprop_conv(layer, I, F, O[, alpha, relu, repeat]) Forward propagate the inputs of a convolutional network layer to produce output.
fprop_pool(layer, I, O) Forward propagate pooling layer.
gen_rng([seed]) Setup the random number generator(s) and store the state in self.init_rng_state.
greater(a, b[, out]) Performs element-wise greater than testing on each element of left and right, storing the result in out.
greater_equal(a, b[, out]) Performs element-wise greater than or equal testing on each element of left and right, storing the result in out.
iobuf(dim0[, x, dtype, name, ...]) Allocate input and output buffer for layer based on batch size.
less(a, b[, out]) Performs element-wise less than testing on each element of left and right, storing the result in out.
less_equal(a, b[, out]) Performs element-wise less than or equal testing on each element of left and right, storing the result in out.
log(a[, out]) Perform element-wise natural logarithm transformation on Tensor a, storing the result in Tensor out.
log2(a[, out]) Perform element-wise 2-based logarithm transformation on Tensor a, storing the result in Tensor out.
make_binary_mask(out[, keepthresh]) Create a binary mask for dropout layers.
max(a[, axis, out, keepdims]) Calculates the maximal element value along the specified axes.
maximum(a, b[, out]) Performs element-wise maximum value assignment based on corresponding elements of left and right, storing the result in out.
mean(a[, axis, partial, out, keepdims]) Calculates the arithmetic mean of the elements along the specified axes.
min(a[, axis, out, keepdims]) Calculates the minimal element value along the specified axes.
minimum(a, b[, out]) Performs element-wise minimum value assignment based on corresponding elements of left and right, storing the result in out.
multiply(a, b[, out]) Perform element-wise multiplication on the operands, storing the resultant values in the out Tensor.
negative(a[, out]) Perform element-wise negation of Tensor a, storing the result in Tensor out.
not_equal(a, b[, out]) Performs element-wise non-equality testing on each element of left and right, storing the result in out.
onehot(indices, axis[, out]) Generate optree for converting indices to a onehot representation.
ones(shape[, dtype, name, persist_values, ...]) Instantiate a new instance of this backend’s Tensor class, populating each element with a value of 1.
output_dim(X, S, padding, strides[, ...]) Compute the output dimension along one spatial dimension, given the input size, filter size, padding, and striding.
pool_layer(dtype, op, N, C[, D, H, W, J, T, ...]) Create a new PoolLayer parameter object.
power(a, b[, out]) Perform element-wise raise of tsr values to specified power, storing the result in Tensor out.
reciprocal(a[, out]) Perform element-wise reciprocal of Tensor a, storing the result in Tensor out.
revert_tensor(tensor) Reverts a tensor to its original state after being distributed by distribute_data.
rint(a[, out]) Perform element-wise rounding to nearest int.
rng_get_state(state) Return the current state of the random number generator(s).
rng_reset() Reset the random state to the state where the Backend is first initialized.
rng_set_state(state) Set the random number generator state to a specific state.
safelog(a[, out]) Perform element-wise natural logarithm transformation on Tensor a, storing the result in Tensor out.
set_caffe_compat() Set flag to make layers compatible with caffe in terms of conv and pool layer output size determination and dropout layer implementation.
sgn(a[, out]) Perform element-wise indication of the sign of Tensor a, storing the result in Tensor out.
shared_iobuf_size(shape, parallelism) Computes the backend specific size needed for an iobuf with a specified shape that is meant to be shared between layers.
shift(a, b[, value, out]) Performs element-wise shift based on corresponding elements of left and right, storing the result in out.
sig(a[, out]) Perform element-wise sigmoid transformation on Tensor a, storing the result in Tensor out.
sig2(a[, out]) Perform element-wise 2-based sigmoid transformation on Tensor a, storing the result in Tensor out.
sqrt(a[, out]) Perform element-wise square-root of Tensor a, storing the result in Tensor out.
square(a[, out]) Perform element-wise square of Tensor a, storing the result in Tensor out.
std(a[, axis, partial, out, keepdims]) Calculates the standard deviation of the elements along the specified axes.
subtract(a, b[, out]) Perform element-wise subtraction on the operands, storing the resultant values in the out Tensor.
sum(a[, axis, out, keepdims]) Calculates the summation of the elements along the specified axis.
take(a, indices, axis[, out]) Extract elements based on the indices along a given axis.
tanh(a[, out]) Perform element-wise hyperbolic tangent transformation on Tensor a, storing the result in Tensor out.
tanh2(a[, out]) Perform element-wise 2-based hyperbolic tangent transformation on Tensor a, storing the result in Tensor out.
true_divide(a, b[, out]) Alias of divide; always performs true (non-floor) division.
update_conv(layer, I, E, grad_F[, alpha, repeat]) Compute the updated gradient for a convolutional network layer.
update_fc_bias(err, out) Compute the updated bias gradient for a fully connected network layer.
var(a[, axis, partial, out, keepdims, binary]) Calculates the variance of the elements along the specified axes.
xnor_compound_dot(A, B, C[, beta]) Performs XNOR GEMM
zeros(shape[, dtype, name, persist_values, ...]) Instantiate a new instance of this backend’s Tensor class, populating each element with a value of 0.
zeros_like(other_ary[, name, persist_values]) Instantiate a new instance of this backend’s Tensor class, with the shape taken from other_ary and populating each element with a value of 0.

Attributes

backends
absolute(a, out=None)[source]

Perform element-wise absolute value of Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.

Parameters:
  • a (Tensor) – input to be transformed.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode
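
Like most element-wise operations below, this call is lazy when out is omitted: it returns an op-tree that is only evaluated on assignment. A sketch of both usage patterns (assumes a backend be generated as above):

    import numpy as np

    x = be.array(np.array([[-1., 2.], [3., -4.]]))
    out = be.empty((2, 2))

    be.absolute(x, out=out)   # evaluates immediately into out
    op = be.absolute(x)       # builds an op-tree; nothing is computed yet
    out[:] = op               # slice assignment triggers evaluation
    print(out.get())          # copy the result back to host as a numpy array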

add(a, b, out=None)[source]

Perform element-wise addition on the operands, storing the resultant values in the out Tensor. Each operand and out must have identical shape or be broadcastable as such.

Parameters:
  • a (Tensor, numeric) – left-hand side operand.
  • b (Tensor, numeric) – right-hand side operand.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

add_fc_bias(inputs, bias)[source]

Add the bias for a fully connected network layer.

Parameters:
  • inputs (Tensor) – the input to update.
  • bias (Tensor) – the amount to increment by.
static allocate_backend(name, **kargs)[source]

Allocate a named backend.

argmax(a, axis=1, out=None, keepdims=True)[source]

Calculates the indices of the maximal element value along the specified axis. If multiple elements contain the maximum, only the indices of the first are returned.

Parameters:
  • a (Tensor) – the Tensor on which to perform the operation
  • axis (int, optional) – the dimension along which to compute. If set to None, we will take argmax over all dimensions. Defaults to 1
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
  • keepdims (bool, optional) – Keep the axes being computed over in the output (with size 1), instead of collapsing. Defaults to True.
Returns:

the resulting op-tree

Return type:

OpTreeNode

argmin(a, axis=1, out=None, keepdims=True)[source]

Calculates the indices of the minimal element value along the specified axis. If multiple elements contain the minimum, only the indices of the first are returned.

Parameters:
  • a (Tensor) – the Tensor on which to perform the operation
  • axis (int, optional) – the dimension along which to compute. If set to None, we will take argmin over all dimensions. Defaults to 1
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
  • keepdims (bool, optional) – Keep the axes being computed over in the output (with size 1), instead of collapsing. Defaults to True.
Returns:

the resulting op-tree

Return type:

OpTreeNode

array(ary, dtype=None, name=None, persist_values=True, parallel=False, distributed=False)

Instantiate a new instance of this backend’s Tensor class, populating elements based on ary values.

Parameters:
  • ary (array_like) – input array object to construct from. Can be a built-in Python scalar, a list (of lists), or a numpy.ndarray
  • dtype (data-type, optional) – If present, specifies the underlying type to employ for each element.
  • name (str, optional) – name identifying the tensor (used in printing).
  • persist_values (bool, optional) – If set to True (the default), the values assigned to this Tensor will persist across multiple begin and end calls. Setting to False may provide a performance increase if values do not need to be maintained across such calls.
  • parallel (bool, optional) – If True and using multi-GPU backend, replicate copies of this tensor across devices. Defaults to False, and has no effect on CPU, or (single) GPU backends.
  • distributed (bool, optional) – If True and using multi-GPU backend, this tensor is fragmented and partitioned across devices. Defaults to False, and has no effect on CPU, or (single) GPU backends.
Returns:

array object

Return type:

Tensor

Raises:

NotImplementedError – Can’t be instantiated directly.

See also

empty(), zeros(), ones()
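
A short sketch of the creation routines side by side (assumes a backend be):

    import numpy as np

    a = be.array(np.arange(6).reshape(2, 3))   # copy host values into a Tensor
    z = be.zeros((2, 3))                       # filled with 0
    o = be.ones((2, 3), name='ones_buf')       # filled with 1
    e = be.empty((2, 3))                       # uninitialized; values arbitrary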

static backend_choices()[source]

Return the list of available backends.

backends = {'gpu': <class 'neon.backends.nervanagpu.NervanaGPU'>, 'cpu': <class 'neon.backends.nervanacpu.NervanaCPU'>}
batched_dot(A, B, C, alpha=1.0, beta=0.0, relu=False)

Perform one of the following operations:

  1. For fprop: A(K, C), B(X, C, N), C(X, K, N) -> call batched_dot(A, B, C)
  2. For bprop: A(K, C), B(X, K, N), C(X, C, N) -> call batched_dot(A.T, B, C)
  3. For update: A(X, K, N), B(X, C, N), C(K, C) -> call batched_dot(A, B.T, C)

Parameters:
  • A (Tensor) – left-hand input operand
  • B (Tensor) – right-hand input operand
  • C (Tensor) – output operand
  • alpha (float, optional) – scale A*B term
  • beta (float, optional) – scale C term before sum
  • relu (bool, optional) – If True, apply ReLU non-linearity before output. Defaults to False.
begin(block, identifier)[source]

Signal the start of a block of repeated computation (at the start of a loop). This operation can be used to help the compiler optimize instruction performance, but has no direct effect on calculations. It must be book-ended by a corresponding Backend.end() call. Note that multiple begin calls can appear adjacent in nested loops.

Parameters:
  • block (Block.attr) – identifies the type of computation being worked on based on Block attribute specified
  • identifier (int) – unique identifier for this particular iteration of the block. Will typically be something like epoch number, mini-batch number, and so forth.

See also

end()
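
A sketch of how a training loop might bracket each mini-batch (assumes the Block attribute holder from neon.backends.backend, and that num_batches is defined elsewhere):

    from neon.backends.backend import Block

    for mb_idx in range(num_batches):
        be.begin(Block.minibatch, mb_idx)
        # ... fprop / bprop / update computation for this mini-batch ...
        be.end(Block.minibatch, mb_idx)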

binarize(a, stochastic=True, out=None)[source]

Perform element-wise binarization.

Parameters:
  • a (Tensor) – input to be transformed.
  • stochastic (bool, optional) – whether to binarize stochastically or deterministically. Defaults to True.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

bprop_conv(layer, F, E, grad_I, alpha=1.0, repeat=1)

Backward propagate the error through a convolutional network layer.

Parameters:
  • layer – the conv layer as a parameter object
  • F (Tensor) – the weights (filters)
  • E (Tensor) – errors
  • grad_I (Tensor) – gradient to inputs (output delta)
  • alpha (float, optional) – linear scaling. Defaults to 1.0
  • repeat (int, optional) – Repeat this operation the specified number of times. Defaults to 1.
bprop_pool(layer, I, E, grad_I)

Backward propagate pooling layer.

Parameters:
  • layer (PoolLayer) – The pool layer object. Different backends have different pool layers.
  • I (Tensor) – Input tensor.
  • E (Tensor) – Error tensor.
  • grad_I (Tensor) – Gradient tensor (delta)
check_caffe_compat()[source]

Check whether compatibility mode is set to ‘caffe’.

cleanup_backend()[source]

Release any resources that have been acquired by this backend.

clip(a, a_min, a_max, out=None)[source]

Performs element-wise clipping of Tensor a, storing the result in out. The clipped values will lie within [a_min, a_max].

Parameters:
  • a (Tensor) – the Tensor on which to perform the operation
  • a_min (Tensor, numeric) – lower bound for clip (inclusive).
  • a_max (Tensor, numeric) – upper bound for clip (inclusive).
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

compound_bprop_lut(nin, inputs, error, error_t, dW, pad_idx, alpha=1.0, beta=0)

Backward propagate lookup table layer.

Parameters:
  • nin (int) – Number of input word_ids.
  • inputs (Tensor) – Input tensor.
  • error (Tensor) – Error tensor.
  • error_t (Tensor) – Transposed error tensor.
  • dW (Tensor) – Gradient tensor (delta).
  • pad_idx (int) –
  • alpha (float) –
  • beta (float) –
compound_dot(A, B, C, alpha=1.0, beta=0.0, relu=False)

Perform one of the following operations (* is dot product):

  C = alpha * A * B + beta * C
  C = alpha * A.T * B + beta * C
  C = alpha * A * B.T + beta * C

relu: if True, applied before output (and prior to the beta addition)

The operation will be short-circuited to C = alpha * A * B if beta has value 0 (the default). A sketch follows the parameter list below.

Parameters:
  • A (Tensor) – left-hand side operand.
  • B (Tensor) – right-hand side operand.
  • C (Tensor) – output operand
  • alpha (float, optional) – scale A*B term
  • beta (float, optional) – scale C term before sum
  • relu (bool, optional) – If True, apply ReLU non-linearity before output. Defaults to False.
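
A sketch of a plain GEMM through this interface (assumes a backend be; C is the output operand):

    import numpy as np

    A = be.array(np.random.rand(4, 3))
    B = be.array(np.random.rand(3, 5))
    C = be.empty((4, 5))

    be.compound_dot(A, B, C)                       # C = A * B
    be.compound_dot(A, B, C, alpha=2.0, beta=1.0)  # C = 2 * A * B + C
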
compound_rnn_unroll_bprop(W_recur, delta_prev_s, delta_s, h_s, nout, num_steps, num_used_steps, activation, reverse=True)[source]

Time step unrolling portion of recurrent layer bprop.

Parameters:
  • W_recur (Tensor) – Recurrent weight matrix.
  • delta_prev_s (Array) – Array of per time step input delta tensors. Each element in the array is a single time step view into one tensor containing all of the time steps in sequence.
  • delta_s (Array) – Array of per time step input delta tensors. Each element in the array is a single time step view into one tensor containing all of the time steps in sequence.
  • h_s (Array) – Array of per time step hidden state tensors. Each element in the array is a single time step view into one tensor containing all of the time steps in sequence.
  • nout (integer) – Number of output units for the layer.
  • num_steps (integer) – Total number of time steps in the buffer.
  • num_used_steps (integer) – Number of time steps being used for real data.
  • activation (Transform) – Activation function for the layer.
  • reverse (boolean) – When true, unrolling will iterate over time steps in reverse (default case for RNN).
compound_rnn_unroll_fprop(W_recur, h_prev_s, h_ff_s, h_s, bias, nout, num_steps, num_used_steps, activation, reverse=False)[source]

Time step unrolling portion of recurrent layer fprop.

Parameters:
  • W_recur (Tensor) – Recurrent weight matrix.
  • h_prev_s (Array) – Array of per time step hidden state tensors. Each element in the array is a single time step view into one tensor containing all of the time steps in sequence.
  • h_ff_s (Array) – Array of per time step hidden state tensors. Each element in the array is a single time step view into one tensor containing all of the time steps in sequence.
  • h_s (Array) – Array of per time step hidden state tensors. Each element in the array is a single time step view into one tensor containing all of the time steps in sequence.
  • bias (Tensor) – Bias tensor to add at each time step.
  • nout (integer) – Number of output units for the layer.
  • num_steps (integer) – Total number of time steps in the buffer.
  • num_used_steps (integer) – Number of time steps being used for real data.
  • activation (Transform) – Activation function for the layer.
  • reverse (boolean) – When true, unrolling will iterate over time steps in reverse (for BiRNN).
conv_layer(dtype, N, C, K, D=1, H=1, W=1, T=1, R=1, S=1, pad_d=0, pad_h=0, pad_w=0, str_d=1, str_h=1, str_w=1, relu=False, bsum=False)

Create a new ConvLayer parameter object. This is then passed as an argument to all the convolution operations.

Parameters:
  • dtype (data-type, optional) – If present, specifies the underlying type to employ for each element.
  • N (int) – Number of images in mini-batch
  • C (int) – Number of input feature maps
  • K (int) – Number of output feature maps
  • D (int, optional) – Depth of input image. Defaults to 1
  • H (int, optional) – Height of input image. Defaults to 1
  • W (int, optional) – Width of input image. Defaults to 1
  • T (int, optional) – Depth of filter kernel. Defaults to 1
  • R (int, optional) – Height of filter kernel. Defaults to 1
  • S (int, optional) – Width of filter kernel. Defaults to 1
  • pad_d (int, optional) – amount of zero-padding around the depth edge. Defaults to 0.
  • pad_h (int, optional) – amount of zero-padding around the height edge. Defaults to 0.
  • pad_w (int, optional) – amount of zero-padding around the width edge. Defaults to 0.
  • str_d (int, optional) – factor to step the filters by in the depth direction. Defaults to 1
  • str_h (int, optional) – factor to step the filters by in the height direction. Defaults to 1
  • str_w (int, optional) – factor to step the filters by in the width direction. Defaults to 1
  • relu (bool, optional) – apply a relu transform to the output for fprop or bprop. Defaults to False
  • bsum (bool, optional) – calculate the sum along the batchnorm axis for fprop or bprop. Outputs an fp32 tensor of size Kx1. Defaults to False.
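
A sketch of setting up and running a 2D convolution (a hypothetical configuration; assumes neon's feature-major layout with the batch as the last axis, i.e. inputs (C*H*W, N), filters (C*R*S, K), outputs (K*P*Q, N)):

    import numpy as np

    N, C, K = 128, 3, 16      # batch size, input fmaps, output fmaps
    H = W = 32                # input spatial size
    R = S = 3                 # filter spatial size

    layer = be.conv_layer(np.float32, N, C, K, H=H, W=W, R=R, S=S,
                          pad_h=1, pad_w=1)
    P = Q = 32                # 3x3 filters, stride 1, pad 1 preserve the size

    I = be.array(np.random.rand(C * H * W, N))
    F = be.array(np.random.rand(C * R * S, K))
    O = be.empty((K * P * Q, N))

    be.fprop_conv(layer, I, F, O)
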
deconv_layer(dtype, N, C, K, P, Q, R=1, S=1, pad_d=0, pad_h=0, pad_w=0, str_d=1, str_h=1, str_w=1)

Create a new Deconvolution parameter object. This then is passed as an argument to all deconvolution kernels.

Parameters:
  • dtype (data-type, optional) – If present, specifies the underlying type to employ for each element.
  • N (int) – Number of images in mini-batch
  • C (int) – Number of input feature maps
  • K (int) – Number of output feature maps
  • P (int) – Height of output
  • Q (int) – Width of output
  • R (int, optional) – Height of filter kernel. Defaults to 1
  • S (int, optional) – Width of filter kernel. Defaults to 1
  • pad_d (int, optional) – amount of zero-padding around the depth edge. Defaults to 0.
  • pad_h (int, optional) – amount of zero-padding around the height edge. Defaults to 0.
  • pad_w (int, optional) – amount of zero-padding around the width edge. Defaults to 0.
  • str_d (int, optional) – factor to step the filters by in the depth direction. Defaults to 1
  • str_h (int, optional) – factor to step the filters by in the height direction. Defaults to 1
  • str_w (int, optional) – factor to step the filters by in the width direction. Defaults to 1

Leave spatial dimensions at 1 to allow feature map pooling in the fc layers.

distribute_data(tensor, layer_parallelism)[source]

For backends which support distributed training, this will distribute or gather the error or activation tensor depending on the type of parallelism used to distribute the layer computation. Currently this is only supported by multi-GPU in Nervana cloud.

Parameters:
  • tensor – Tensor containing either activations or errors
  • layer_parallelism – Type of parallelism expected by the layer
Returns:

Tensor which has been altered by this call, or None

divide(a, b, out=None)[source]

Perform element-wise division on the operands, storing the resultant values in the out Tensor. Each operand and out must have identical shape or be broadcastable as such.

Parameters:
  • a (Tensor, numeric) – left-hand side operand.
  • b (Tensor, numeric) – right-hand side operand.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

dot(a, b, out=None)[source]

Dot product of two Tensors.

Parameters:
  • a (Tensor) – left-hand side operand.
  • b (Tensor) – right-hand side operand.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned. Note that this object should differ from left and right.
Returns:

the resulting op-tree from this operation.

Return type:

OpTreeNode

empty(shape, dtype=None, name=None, persist_values=True, parallel=False, distributed=False)

Instantiate a new instance of this backend’s Tensor class, without initializing element values. This is slightly faster than array(), ones(), zeros(), but the memory is left uninitialized, so element values are arbitrary.

Parameters:
  • shape (int, list) – length of each dimension of the Tensor.
  • dtype (data-type, optional) – If present, specifies the underlying type to employ for each element.
  • name (str, optional) – name identifying the tensor (used in printing).
  • persist_values (bool, optional) – If set to True (the default), the values assigned to this Tensor will persist across multiple begin and end calls. Setting to False may provide a performance increase if values do not need to be maintained across such calls.
  • parallel (bool, optional) – If True and using multi-GPU backend, replicate copies of this tensor across devices. Defaults to False, and has no effect on CPU, or (single) GPU backends.
  • distributed (bool, optional) – If True and using multi-GPU backend, this tensor is fragmented and partitioned across devices. Defaults to False, and has no effect on CPU, or (single) GPU backends.
Returns:

array object

Return type:

Tensor

Raises:

NotImplementedError – Can’t be instantiated directly.

See also

array(), zeros(), ones()

empty_like(other_ary, name=None, persist_values=True)

Instantiate a new instance of this backend’s Tensor class, with the shape taken from other_ary.

Parameters:
  • other_ary (tensor object) – Tensor to inherit the dimensions of.
  • name (str, optional) – name identifying the tensor (used in printing).
  • dtype (data-type, optional) – If present, specifies the underlying type to employ for each element.
  • persist_values (bool, optional) – If set to True (the default), the values assigned to this Tensor will persist across multiple begin and end calls. Setting to False may provide a performance increase if values do not need to be maintained across such calls.
Returns:

array object

Return type:

Tensor

Raises:

NotImplementedError – Can’t be instantiated directly.

See also

empty(), ones(), array()

end(block, identifier)[source]

Signal the corresponding end of a block of repeated computation (at the end of a loop). This operation can be used to help the compiler optimize performance, but has no direct effect on calculations. It must be preceded by a corresponding Backend.begin() call.

Parameters:
  • block (Block.attr) – identifies the type of computation being worked on based on Block attribute specified
  • identifier (int) – unique identifier for this particular iteration of the block. Will typically be something like epoch number, mini-batch number, and so forth.

See also

begin()

equal(a, b, out=None)[source]

Performs element-wise equality testing on each element of left and right, storing the result in out. Each operand is assumed to be the same shape (or broadcastable as such).

Parameters:
  • a (Tensor, numeric) – left-hand side operand.
  • b (Tensor, numeric) – right-hand side operand.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

execute(node)[source]

Execute the optree. There must be one and only one ‘assign’ op at the top of the optree when execute is called.

Parameters:node (OpTreeNode) – The op-tree to execute.
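
Callers rarely invoke execute() directly: assigning an op-tree into a tensor slice builds the top-level ‘assign’ op and executes it. A sketch (assumes a backend be and a tensor x, e.g. as in the absolute() example above):

    z = be.empty((2, 2))
    optree = be.add(x, 1.0)   # builds the op-tree; no computation yet
    z[:] = optree             # wraps it in an 'assign' op and executes
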
exp(a, out=None)[source]

Perform element-wise exponential transformation on Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.

Parameters:
  • a (Tensor) – input to be transformed.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

exp2(a, out=None)[source]

Perform element-wise 2-based exponential transformation on Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.

Parameters:
  • a (Tensor) – input to be transformed.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

fabs(a, out=None)[source]

Perform element-wise absolute value of Tensor a, storing the result in Tensor out. Both Tensors should have identical shape. Implemented as an alias of absolute.

Parameters:
  • a (Tensor) – input to be transformed.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

finite(a, out=None)[source]

Perform element-wise test of finiteness (neither infinity nor NaN) on Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.

Parameters:
  • a (Tensor) – input to be transformed.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

fprop_conv(layer, I, F, O, alpha=1.0, relu=False, repeat=1)

Forward propagate the inputs of a convolutional network layer to produce output.

Parameters:
  • layer – the conv layer as a parameter object
  • I (Tensor) – inputs
  • F (Tensor) – the weights (filters)
  • O (Tensor) – outputs
  • alpha (float, optional) – linear scaling. Defaults to 1.0
  • relu (bool, optional) – apply ReLU before output. Defaults to False.
  • repeat (int, optional) – Repeat this operation the specified number of times. Defaults to 1.
fprop_pool(layer, I, O)

Forward propagate pooling layer.

Parameters:
  • layer (PoolLayer) – The pool layer object, different backends have different pool layers.
  • I (Tensor) – Input tensor.
  • O (Tensor) – output tensor.
gen_rng(seed=None)

Setup the random number generator(s) and store the state in self.init_rng_state.

Parameters:seed (int or None) – RNG seed, if the seed is None, then a seed will be randomly chosen
Returns:numpy RNG
Return type:np.random.RandomState
greater(a, b, out=None)[source]

Performs element-wise greater than testing on each element of left and right, storing the result in out. Each operand is assumed to be the same shape (or broadcastable as such).

Parameters:
  • a (Tensor, numeric) – left-hand side operand.
  • b (Tensor, numeric) – right-hand side operand.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

greater_equal(a, b, out=None)[source]

Performs element-wise greater than or equal testing on each element of left and right, storing the result in out. Each operand is assumed to be the same shape (or broadcastable as such).

Parameters:
  • a (Tensor, numeric) – left-hand side operand.
  • b (Tensor, numeric) – right-hand side operand.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

iobuf(dim0, x=None, dtype=None, name=None, persist_values=True, shared=None, parallelism=None)[source]

Allocate input and output buffer for layer based on batch size. This is used because the layer does not know about the batch size.

Parameters:
  • dim0 (tuple or int) – I/O buffer dimension for layer (without the axis specifying the batch size).
  • x (data-type, optional) – If present and not None, x will be returned directly. x will not be None if the buffer has already been allocated.
  • dtype (data-type, optional) – If present, specifies the underlying type to employ for each element.
  • name (str, optional) – name identifying the tensor (used in printing).
  • persist_values (bool, optional) – If set to True (the default), the values assigned to this Tensor will persist across multiple begin and end calls. Setting to False may provide a performance increase if values do not need to be maintained across such calls.
  • shared (buffer, optional) – If present will attempt to reuse the memory in shared to allocate the I/O buffer
  • parallelism (str, optional) – Indicates type of parallelism (Data, Model) employed by this buffer. Ignored on CPU and GPU backends, defaults to no parallelism.
Returns:

array object

Return type:

Tensor
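
A sketch: a layer with nout output units can allocate its activation buffer without knowing the batch size, since the backend appends its configured batch size as the second axis:

    nout = 100
    activations = be.iobuf(nout)   # shape (nout, batch_size), e.g. (100, 128)
    deltas = be.iobuf(nout)        # matching buffer for backpropagated errors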

less(a, b, out=None)[source]

Performs element-wise less than testing on each element of left and right, storing the result in out. Each operand is assumed to be the same shape (or broadcastable as such).

Parameters:
  • a (Tensor, numeric) – left-hand side operand.
  • b (Tensor, numeric) – right-hand side operand.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

less_equal(a, b, out=None)[source]

Performs element-wise less than or equal testing on each element of left and right, storing the result in out. Each operand is assumed to be the same shape (or broadcastable as such).

Parameters:
  • a (Tensor, numeric) – left-hand side operand.
  • b (Tensor, numeric) – right-hand side operand.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

log(a, out=None)[source]

Perform element-wise natural logarithm transformation on Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.

Parameters:
  • a (Tensor) – input to be transformed.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

log2(a, out=None)[source]

Perform element-wise 2-based logarithm transformation on Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.

Parameters:
  • a (Tensor) – input to be transformed.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

make_binary_mask(out, keepthresh=0.5)

Create a binary mask for dropout layers.

Parameters:
  • out (Tensor) – Output tensor
  • keepthresh (float, optional) – fraction of ones. Defaults to 0.5
max(a, axis=None, out=None, keepdims=True)[source]

Calculates the maximal element value along the specified axes.

Parameters:
  • a (Tensor) – the Tensor on which to perform the operation
  • axis (int, optional) – the dimension along which to compute. If set to None, we will take max over all dimensions.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
  • keepdims (bool, optional) – Keep the axes being computed over in the output (with size 1), instead of collapsing. Defaults to True.
Returns:

the resulting op-tree

Return type:

OpTreeNode

maximum(a, b, out=None)[source]

Performs element-wise maximum value assignment based on corresponding elements of left and right, storing the result in out. Each operand is assumed to be the same shape (or broadcastable as such).

Parameters:
  • a (Tensor, numeric) – left-hand side operand.
  • b (Tensor, numeric) – right-hand side operand.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

mean(a, axis=None, partial=None, out=None, keepdims=True)[source]

Calculates the arithmetic mean of the elements along the specified axes.

Parameters:
  • a (Tensor) – the Tensor on which to perform the operation
  • axis (int, optional) – the dimension along which to compute. If set to None, we will take mean over all dimensions. Defaults to None
  • partial (bool, optional) – Not currently used.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
  • keepdims (bool, optional) – Keep the axes being computed over in the output (with size 1), instead of collapsing. Defaults to True.
Returns:

the resulting op-tree

Return type:

OpTreeNode

min(a, axis=None, out=None, keepdims=True)[source]

Calculates the minimal element value along the specified axes.

Parameters:
  • a (Tensor) – the Tensor on which to perform the operation
  • axis (int, optional) – the dimension along which to compute. If set to None, we will take min over all dimensions.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
  • keepdims (bool, optional) – Keep the axes being computed over in the output (with size 1), instead of collapsing. Defaults to True.
Returns:

the resulting op-tree

Return type:

OpTreeNode

minimum(a, b, out=None)[source]

Performs element-wise minimum value assignment based on corresponding elements of left and right, storing the result in out. Each operand is assumed to be the same shape (or broadcastable as such).

Parameters:
  • a (Tensor, numeric) – left-hand side operand.
  • b (Tensor, numeric) – right-hand side operand.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

multiply(a, b, out=None)[source]

Perform element-wise multiplication on the operands, storing the resultant values in the out Tensor. Each operand and out must have identical shape or be broadcastable as such.

Parameters:
  • a (Tensor, numeric) – left-hand side operand.
  • b (Tensor, numeric) – right-hand side operand.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

negative(a, out=None)[source]

Perform element-wise negation of Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.

Parameters:
  • a (Tensor) – input to be transformed.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

not_equal(a, b, out=None)[source]

Performs element-wise non-equality testing on each element of left and right, storing the result in out. Each operand is assumed to be the same shape (or broadcastable as such).

Parameters:
  • a (Tensor, numeric) – left-hand side operand.
  • b (Tensor, numeric) – right-hand side operand.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

onehot(indices, axis, out=None)[source]

Generate optree for converting indices to a onehot representation.

Parameters:
  • indices (Tensor) – Elements must be of numpy integer type for gpu onehot to work.
  • axis (int) – the axis along the feature length dimension
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode
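
A sketch of expanding integer class labels into a one-hot matrix (a hypothetical 10-class setup with 4 labels; note the integer dtype requirement above):

    import numpy as np

    nclass = 10
    labels = be.array(np.array([[1, 0, 3, 2]], dtype=np.int32))  # shape (1, 4)
    onehot_out = be.empty((nclass, 4))

    onehot_out[:] = be.onehot(labels, axis=0)  # column n is 1 at row labels[n]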

ones(shape, dtype=None, name=None, persist_values=True, parallel=False, distributed=False)

Instantiate a new instance of this backend’s Tensor class, populating each element with a value of 1.

Parameters:
  • shape (int, list) – length of each dimension of the Tensor.
  • dtype (data-type, optional) – If present, specifies the underlying type to employ for each element.
  • name (str, optional) – name identifying the tensor (used in printing).
  • persist_values (bool, optional) – If set to True (the default), the values assigned to this Tensor will persist across multiple begin and end calls. Setting to False may provide a performance increase if values do not need to be maintained across such calls.
  • parallel (bool, optional) – If True and using multi-GPU backend, replicate copies of this tensor across devices. Defaults to False, and has no effect on CPU, or (single) GPU backends.
  • distributed (bool, optional) – If True and using multi-GPU backend, this tensor is fragmented and partitioned across devices. Defaults to False, and has no effect on CPU, or (single) GPU backends.
Returns:

array object

Return type:

Tensor

Raises:

NotImplementedError – Can’t be instantiated directly.

See also

empty(), zeros(), array()

output_dim(X, S, padding, strides, pooling=False, dilation=1)[source]

Compute the output dimension along one spatial dimension, given the input size, filter size, padding, and striding.

Parameters:
  • X (int) – input data dimension
  • S (int) – filter dimension
  • padding (int) – padding on each side
  • strides (int) – striding
  • pooling (bool) – flag for setting pooling layer size
  • dilation (int) – dilation of filter
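
Absent pooling, this follows the standard convolution arithmetic: with the dilated filter size S' = dilation * (S - 1) + 1, the output size is (X - S' + 2 * padding) // strides + 1. A worked example:

    # X=32, S=3, padding=1, strides=1, dilation=1:
    # S' = 3, so (32 - 3 + 2*1) // 1 + 1 = 32 (same-size output)
    P = be.output_dim(32, 3, padding=1, strides=1)
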
pool_layer(dtype, op, N, C, D=1, H=1, W=1, J=1, T=1, R=1, S=1, pad_j=0, pad_d=0, pad_h=0, pad_w=0, str_j=None, str_d=None, str_h=None, str_w=None)

Create a new PoolLayer parameter object. This then is passed as an argument to all pooling kernels.

Parameters:
  • op (str) – “max”, “avg”, “l2” pooling (currently bprop only supports max, but not avg and l2)
  • N (int) – Number of images in mini-batch
  • C (int) – Number of input feature maps
  • D (int, optional) – Depth of input image. Defaults to 1
  • H (int, optional) – Height of input image. Defaults to 1
  • W (int, optional) – Width of input image. Defaults to 1
  • J (int, optional) – Size of feature map pooling window (maxout n_pieces). Defaults to 1
  • T (int, optional) – Depth of pooling window. Defaults to 1
  • R (int, optional) – Height of pooling window. Defaults to 1
  • S (int, optional) – Width of pooling window. Defaults to 1
  • pad_j (int, optional) – amount of zero-padding around the fm pooling window edge. Defaults to 0.
  • pad_d (int, optional) – amount of zero-padding around the depth edge. Defaults to 0.
  • pad_h (int, optional) – amount of zero-padding around the height edge. Defaults to 0.
  • pad_w (int, optional) – amount of zero-padding around the width edge. Defaults to 0.
  • str_j (int, optional) – factor to step the filters by in the fm pooling window direction. Defaults to 1
  • str_d (int, optional) – factor to step the filters by in the depth direction. Defaults to 1
  • str_h (int, optional) – factor to step the filters by in the height direction. Defaults to 1
  • str_w (int, optional) – factor to step the filters by in the width direction. Defaults to 1

Leave spatial dimensions at 1 to allow feature map pooling in the fc layers.
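
A sketch of a 2x2 max-pooling pass (a hypothetical configuration, using the same (C*H*W, N) layout as the convolution example above; pooling preserves the number of feature maps):

    import numpy as np

    N, C, H, W = 128, 16, 32, 32
    layer = be.pool_layer(np.float32, "max", N, C, H=H, W=W,
                          R=2, S=2, str_h=2, str_w=2)
    P = Q = 16                     # 32 / 2 with non-overlapping 2x2 windows

    I = be.array(np.random.rand(C * H * W, N))
    O = be.empty((C * P * Q, N))

    be.fprop_pool(layer, I, O)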

power(a, b, out=None)[source]

Perform element-wise raise of Tensor a values to the specified power, storing the result in Tensor out. Both Tensors should have identical shape.

Parameters:
  • a (Tensor) – input to be transformed.
  • b (Tensor, numeric) – exponentiated value to be applied to each element. Examples include 2 (square), 0.5 (square root).
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

reciprocal(a, out=None)[source]

Perform element-wise reciprocal of Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.

Parameters:
  • a (Tensor) – input to be transformed.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

revert_tensor(tensor)[source]

Reverts a tensor to its original state after being distributed by distribute_data.

Parameters:tensor – Tensor to be reverted
rint(a, out=None)[source]

Perform element-wise rounding to nearest int.

Parameters:
  • a (Tensor) – input to be transformed.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

rng_get_state(state)

Return the current state of the random number generator(s).

Returns a tuple, since some backends have multiple RNG states (e.g. on-host and on-device).

Returns:tuple of numpy ndarrays which defines the current state of the RNGs
Return type:tuple
rng_reset()

Reset the random state to the state at which the Backend was first initialized.

rng_set_state(state)

Set the random number generator state to a specific state.

Parameters:state (np.array) – array which is used to define the RNG state
safelog(a, out=None)[source]

Perform element-wise natural logarithm transformation on Tensor a, storing the result in Tensor out. Both Tensors should have identical shape. This log function has built-in safety for underflow.

Parameters:
  • a (Tensor) – input to be transformed.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

set_caffe_compat()[source]

Set flag to make layers compatible with caffe in terms of conv and pool layer output size determination and dropout layer implementation.

sgn(a, out=None)[source]

Perform element-wise indication of the sign of Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.

Parameters:
  • a (Tensor) – input to be transformed.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

shared_iobuf_size(shape, parallelism)[source]

Computes the backend specific size needed for an iobuf with a specified shape that is meant to be shared between layers.

Parameters:
  • shape (tuple) – Requested iobuf shape
  • parallelism (string) – Parallelism of layer requesting this iobuf
Returns:

Size of required iobuf

Return type:

int

shift(a, b, value=True, out=None)[source]

Performs element-wise shift based on corresponding elements of left and right, storing the result in out. Positive is left shift, and negative is right shift. Each operand is assumed to be the same shape (or broadcastable as such).

Parameters:
  • a (Tensor, numeric) – left-hand side operand.
  • b (Tensor, numeric) – right-hand side operand.
  • value (int) – shift by value or exponent
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

sig(a, out=None)[source]

Perform element-wise sigmoid transformation on Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.

Parameters:
  • a (Tensor) – input to be transformed.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

sig2(a, out=None)[source]

Perform element-wise 2-based sigmoid transformation on Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.

Parameters:
  • a (Tensor) – input to be transformed.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

sqrt(a, out=None)[source]

Perform element-wise square-root of Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.

Parameters:
  • a (Tensor) – input to be transformed.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

square(a, out=None)[source]

Perform element-wise square of Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.

Parameters:
  • a (Tensor) – input to be transformed.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

std(a, axis=None, partial=None, out=None, keepdims=True)[source]

Calculates the standard deviation of the elements along the specified axes.

Parameters:
  • a (Tensor) – the Tensor on which to perform the operation
  • axis (int, optional) – the dimension along which to compute. If set to None, we will take std over all dimensions.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
  • partial (bool, optional) – Not currently used.
  • keepdims (bool, optional) – Keep the axes being computed over in the output (with size 1), instead of collapsing. Defaults to True.
Returns:

the resulting op-tree

Return type:

OpTreeNode

subtract(a, b, out=None)[source]

Perform element-wise subtraction on the operands, storing the resultant values in the out Tensor. Each operand and out must have identical shape or be broadcastable as such.

Parameters:
  • a (Tensor, numeric) – left-hand side operand.
  • b (Tensor, numeric) – right-hand side operand.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

sum(a, axis=None, out=None, keepdims=True)[source]

Calculates the summation of the elements along the specified axis.

Parameters:
  • a (Tensor) – the Tensor on which to perform the sum
  • axis (int, optional) – the dimension along which to compute. If set to None, we will sum over all dimensions.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
  • keepdims (bool, optional) – Keep the axes being computed over in the output (with size 1), instead of collapsing. Defaults to True.
Returns:

the resulting op-tree

Return type:

OpTreeNode
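
A sketch of the reduction semantics (axis=None reduces over all elements; keepdims keeps the reduced axes at size 1):

    import numpy as np

    x = be.array(np.arange(6.).reshape(2, 3))   # [[0, 1, 2], [3, 4, 5]]

    col_sums = be.empty((1, 3))
    col_sums[:] = be.sum(x, axis=0)   # [[3., 5., 7.]]

    total = be.empty((1, 1))
    total[:] = be.sum(x)              # [[15.]]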

take(a, indices, axis, out=None)[source]

Extract elements based on the indices along a given axis.

Parameters:
  • a (Tensor) – the Tensor on which to perform the operation
  • indices (Tensor, numpy ndarray) – indices of elements to select
  • axis (int, optional) – the dimension along which to compute. If set to None, we will extract over all dimensions (flattened first)
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
tanh(a, out=None)[source]

Perform element-wise hyperbolic tangent transformation on Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.

Parameters:
  • a (Tensor) – input to be transformed.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

tanh2(a, out=None)[source]

Perform element-wise 2-based hyperbolic tangent transformation on Tensor a, storing the result in Tensor out. Both Tensors should have identical shape.

Parameters:
  • a (Tensor) – input to be transformed.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

true_divide(a, b, out=None)[source]

An alias of divide: unlike traditional floor division, this always performs true division.

Parameters:
  • a (Tensor, numeric) – left-hand side operand.
  • b (Tensor, numeric) – right-hand side operand.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
Returns:

the resulting op-tree

Return type:

OpTreeNode

update_conv(layer, I, E, grad_F, alpha=1.0, repeat=1)

Compute the updated gradient for a convolutional network layer.

Parameters:
  • layer – the conv layer as a parameter object
  • I (Tensor) – the inputs
  • E (Tensor) – the errors
  • grad_F (Tensor) – filter gradients (weights) to update.
  • alpha (float, optional) – linear scaling. Defaults to 1.0
  • repeat (int, optional) – Repeat this operation the specified number of times. Defaults to 1.
update_fc_bias(err, out)[source]

Compute the updated bias gradient for a fully connected network layer.

Parameters:
  • err (Tensor) – backpropagated error
  • out (Tensor) – Where to store the updated gradient value.
var(a, axis=None, partial=None, out=None, keepdims=True, binary=False)[source]

Calculates the variance of the elements along the specified axes.

Parameters:
  • a (Tensor) – the Tensor on which to perform the operation
  • axis (int, optional) – the dimension along which to compute. If set to None, we will take var over all dimensions. Defaults to None
  • partial (bool, optional) – Not currently used.
  • out (Tensor, optional) – where the result will be stored. If out is None, only the op-tree will be returned.
  • keepdims (bool, optional) – Keep the axes being computed over in the output (with size 1), instead of collapsing. Defaults to True.
Returns:

the resulting op-tree

Return type:

OpTreeNode

xnor_compound_dot(A, B, C, beta=0.0)[source]

Performs an XNOR GEMM: C = A * B.

Parameters:
  • A (Tensor) – left-hand side operand.
  • B (Tensor) – right-hand side operand.
  • C (Tensor) – output operand
  • beta (float, optional) – scale C term before sum
zeros(shape, dtype=None, name=None, persist_values=True, parallel=False, distributed=False)

Instantiate a new instance of this backend’s Tensor class, populating each element with a value of 0.

Parameters:
  • shape (int, list) – length of each dimension of the Tensor.
  • dtype (data-type, optional) – If present, specifies the underlying type to employ for each element.
  • name (str, optional) – name identifying the tensor (used in printing).
  • persist_values (bool, optional) – If set to True (the default), the values assigned to this Tensor will persist across multiple begin and end calls. Setting to False may provide a performance increase if values do not need to be maintained across such calls.
  • parallel (bool, optional) – If True and using multi-GPU backend, replicate copies of this tensor across devices. Defaults to False, and has no effect on CPU, or (single) GPU backends.
  • distributed (bool, optional) – If True and using multi-GPU backend, this tensor is fragmented and partitioned across devices. Defaults to False, and has no effect on CPU, or (single) GPU backends.
Returns:

array object

Return type:

Tensor

Raises:

NotImplementedError – Can’t be instantiated directly.

See also

empty(), ones(), array()

zeros_like(other_ary, name=None, persist_values=True)

Instantiate a new instance of this backend’s Tensor class, with the shape taken from other_ary and populating each element with a value of 0.

Parameters:
  • other_ary (tensor object) – Tensor to inherit the dimensions of.
  • name (str, optional) – name identifying the tensor (used in printing).
  • dtype (data-type, optional) – If present, specifies the underlying type to employ for each element.
  • persist_values (bool, optional) – If set to True (the default), the values assigned to this Tensor will persist across multiple begin and end calls. Setting to False may provide a performance increase if values do not need to be maintained across such calls.
Returns:

array object

Return type:

Tensor

Raises:

NotImplementedError – Can’t be instantiated directly.

See also

empty(), ones(), array()