Layer containers are used to wrap groups of layers to provide convenience functions for iterating through them during training. Containers also support efficiently allocating memory for outputs or parameters that are shared between different layers.
Sequential containers are the
default container for models. This is the simplest type of container,
and is used to encapsulate linear pathways of multiple layers. Upon
construction, a Sequential container will automatically flatten compound
layers. Calling fprop() on a Sequential container will in turn call
fprop() on the constituent layers in its layers list.
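The flattening and chained fprop() behaviour described above can be illustrated with a minimal pure-Python sketch. The names ToyLayer and ToySequential are hypothetical stand-ins, not the library's classes, and a "compound layer" is modelled here simply as a nested list.

```python
class ToyLayer:
    """Hypothetical stand-in for a single layer: adds a constant."""

    def __init__(self, delta):
        self.delta = delta

    def fprop(self, x):
        return x + self.delta


class ToySequential:
    """Hypothetical container: flattens nested lists, then chains fprop."""

    def __init__(self, layers):
        self.layers = list(self._flatten(layers))

    @staticmethod
    def _flatten(items):
        for item in items:
            if isinstance(item, (list, tuple)):
                # Assumption: a compound layer is modelled as a nested list.
                yield from ToySequential._flatten(item)
            else:
                yield item

    def fprop(self, x):
        # Feed the output of each layer into the next one, in order.
        for layer in self.layers:
            x = layer.fprop(x)
        return x


seq = ToySequential([ToyLayer(1), [ToyLayer(2), ToyLayer(3)], ToyLayer(4)])
result = seq.fprop(0)
```

After construction, seq.layers holds four plain layers, and a single call to seq.fprop() runs all of them in order.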
Tree containers are used to represent
branching pathways of layers where the overall structure has multiple
output nodes. The Tree is constructed by specifying a list of
Sequential containers. Each
Sequential represents a pathway that
terminates in an output node. The pathways can be provided to the
Tree constructor as either a list of
Sequential containers, or
as a list of lists of layers, in which case each list will be implicitly
encapsulated in a
Sequential during construction.
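The implicit wrapping of plain lists can be pictured as a small normalisation step. This is only a sketch under the assumption that wrapping happens once per pathway at construction time; PathwaySketch and as_pathways are hypothetical names, not part of the library.

```python
class PathwaySketch:
    """Hypothetical stand-in for a Sequential container."""

    def __init__(self, layers):
        self.layers = list(layers)


def as_pathways(layers):
    """Wrap every plain list in a PathwaySketch; keep existing ones as-is."""
    return [
        item if isinstance(item, PathwaySketch) else PathwaySketch(item)
        for item in layers
    ]


# One pathway is already wrapped, the other is a plain list of layers.
mixed = [PathwaySketch(["conv", "affine"]), ["conv", "affine"]]
pathways = as_pathways(mixed)
```

Both input styles end up as the same kind of object, which is why the two constructor forms can be treated as equivalent.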
BranchNode is used to define branching points for the Tree.
The following are equivalent:
    # Pathways given as lists of layers
    bnode = BranchNode()
    layer_list = [
        [Conv((3,3,2), init=ifunc), bnode, Affine(nout=32, init=ifunc)],
        [bnode, Conv((3,3,2), init=ifunc), Affine(nout=16, init=ifunc)]
    ]
    t = Tree(layers=layer_list, alphas=[1., 1.])

    # Pathways given as Sequential containers
    bnode = BranchNode()
    layer_list = [
        Sequential([Conv((3,3,2), init=ifunc), bnode, Affine(nout=32, init=ifunc)]),
        Sequential([bnode, Conv((3,3,2), init=ifunc), Affine(nout=16, init=ifunc)])
    ]
    t = Tree(layers=layer_list, alphas=[1., 1.])
The layer pathways must be specified in order of precedence from the root of the tree: the “trunk” of the tree must be provided first, followed by the subsequent pathways in the order in which they occur up the tree. All non-trunk pathways must start with an instantiated BranchNode layer that also occurs in the trunk at the point of branching.
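The ordering rule above can be expressed as a simple check. The sketch below is not the library's validation logic; ToyBranchNode and find_pathway_problems are hypothetical names, layers are represented by strings, and it is assumed that branch points are matched by object identity.

```python
class ToyBranchNode:
    """Hypothetical marker; object identity marks the branch point."""


def find_pathway_problems(pathways):
    """Return a list of messages; an empty list means the rule holds."""
    problems = []
    trunk = pathways[0]  # the trunk must come first
    for index, branch in enumerate(pathways[1:], start=1):
        head = branch[0] if branch else None
        if not isinstance(head, ToyBranchNode):
            problems.append(
                "pathway %d does not start with a branch node" % index)
        elif not any(head is layer for layer in trunk):
            problems.append(
                "pathway %d starts with a node missing from the trunk" % index)
    return problems


bnode = ToyBranchNode()
# Valid: the second pathway starts with the node that sits in the trunk.
good = [["conv", bnode, "affine32"], [bnode, "conv", "affine16"]]
# Invalid: the second pathway does not start with a branch node.
bad = [["conv", bnode, "affine32"], ["conv", "affine16"]]
# Invalid: the branch node never appears in the trunk.
stray = [["conv", "affine32"], [bnode, "conv", "affine16"]]
```

Calling find_pathway_problems(good) returns an empty list, while the other two layouts each yield one message.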
Multiple branching points can be used to construct complex networks. The figure below shows a branching model with three output nodes, and how that model is constructed from a list of containers.
An example of how to create a branching model using a Tree container is provided in mnist_branch.py.
During training, the backpropagated errors of each branch are combined
using the weighting parameters in the
alphas list. By default, all
branches are equally weighted.
During inference, only the trunk branch (the first pathway in the list) is used.
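To make the role of the alphas list concrete, here is a small sketch that assumes "combined" means a weighted element-wise sum of the per-branch errors at the branching point, and that equal weighting corresponds to a weight of 1.0 per branch. Both are assumptions for illustration, and combine_branch_errors is a hypothetical helper, not a library function.

```python
def combine_branch_errors(branch_errors, alphas=None):
    """Weighted element-wise sum of per-branch error vectors."""
    if alphas is None:
        # Assumption: equal weighting is modelled as 1.0 per branch.
        alphas = [1.0] * len(branch_errors)
    if len(alphas) != len(branch_errors):
        raise ValueError("need exactly one alpha per branch")
    combined = [0.0] * len(branch_errors[0])
    for errors, alpha in zip(branch_errors, alphas):
        for i, value in enumerate(errors):
            combined[i] += alpha * value
    return combined


errors = [[0.5, -1.0], [2.0, 4.0]]  # one error vector per branch
equal = combine_branch_errors(errors)
weighted = combine_branch_errors(errors, alphas=[1.0, 0.5])
```

Lowering the alpha of a branch reduces how strongly its error influences the layers shared with the trunk.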
MergeBroadcast containers consist of multiple Sequential pathways that receive input from a single
input layer (the broadcast), and then combine their outputs via
concatenation (the merge). A
MergeBroadcast container’s layers
attribute consists of multiple
Sequential containers, each
representing one of the pathways receiving the broadcasted input. The
output of the
MergeBroadcast is the merged output of the Sequential
members of layers. The method of concatenation is specified by the merge
argument, which indicates the axis along which to concatenate. As in
Tree, the provided layers list can consist of either Sequential
objects or lists of layers, which will be converted to Sequential
objects during construction.
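The broadcast-then-merge data flow can be sketched in a few lines of plain Python. In this illustration each pathway is a hypothetical function on a flat list, and the merge is a plain concatenation; the axis selected by the merge argument is not modelled.

```python
def double(values):
    """Hypothetical pathway: doubles every element."""
    return [2 * v for v in values]


def negate(values):
    """Hypothetical pathway: negates every element."""
    return [-v for v in values]


def merge_broadcast_fprop(pathways, x):
    """Send the same input to every pathway, then concatenate the outputs."""
    outputs = [pathway(x) for pathway in pathways]  # the broadcast
    merged = []
    for out in outputs:
        merged.extend(out)  # the merge
    return merged


merged = merge_broadcast_fprop([double, negate], [1, 2, 3])
```

Every pathway sees the identical input, and the merged output is as long as the pathway outputs combined.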
Multistream merge containers are special cases of the
MergeBroadcast container, except that their pathways
originate from the input provided to the model (by the data source,
for example). Instead of broadcasting from an input layer to their
Sequential pathways, each pathway gets its own input, which
is unpacked from the input object. One scenario is a data
source that provides a tuple of Tensor objects, with each
Tensor meant as input to one pathway (stream).
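The difference from the broadcast case is that the input tuple is unpacked, one element per pathway. A minimal sketch follows, with flat lists standing in for Tensor objects and hypothetical function names throughout; the merge is again a plain concatenation.

```python
def double(values):
    """Hypothetical pathway for the first stream."""
    return [2 * v for v in values]


def negate(values):
    """Hypothetical pathway for the second stream."""
    return [-v for v in values]


def multistream_fprop(pathways, inputs):
    """Give each pathway its own input, then concatenate the outputs."""
    if len(pathways) != len(inputs):
        raise ValueError("need exactly one input per pathway")
    merged = []
    for pathway, stream_input in zip(pathways, inputs):
        merged.extend(pathway(stream_input))
    return merged


# The data source is assumed to yield a tuple with one entry per stream.
batch = ([1, 2], [10, 20, 30])
merged = multistream_fprop([double, negate], batch)
```

The streams may have different sizes, since each pathway only ever sees its own element of the tuple.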