Date: March 29, 2016
- Support for commonly used models including convnets, RNNs, LSTMs, and autoencoders. You can find many pre-trained implementations of these in our model zoo
- Tight integration with our state-of-the-art GPU kernel library
- 3s/macrobatch (3072 images) for AlexNet on a Titan X (full training run on 1 GPU in ~32 hrs)
- Basic automatic differentiation support
- Framework for visualization
- Swappable hardware backends: write code once and deploy on CPUs, GPUs, or Nervana hardware
New features in recent releases:
- Winograd algorithm for faster convolutions (up to 2x)
- Kepler GPU support
- Greatly expanded model zoo, now featuring deep residual nets for image classification, Fast R-CNN for object localization, C3D for video action recognition, and many more
This release includes 3x3 convolution kernels based on the Winograd minimal filtering algorithm, for up to a 2x algorithmic performance gain. Winograd is enabled by default but can be controlled with the backend setting `enable_winograd`. When Winograd is enabled, the first time users run a network on the GPU an autotuning process computes the fastest kernel parameters; the results are cached in the `~/nervana/cache` folder for reuse.
The backend can also be configured to prefer 2x2 or 4x4 transform sizes through the `enable_winograd` option (0 = disabled (direct convolution), 2 = prefer 2x2, 4 = prefer 4x4).
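To see where the algorithmic gain comes from, here is an illustrative sketch (plain Python, not neon's actual kernel code) of the 1-D building block behind fast 3x3 convolutions: Winograd's F(2,3) minimal filtering algorithm. Computing 2 outputs of a 3-tap filter directly takes 6 multiplications; F(2,3) does it in 4, and nesting this form in 2-D (F(2x2,3x3)) yields roughly the 2x reduction in multiplies quoted above. The function names here are ours, chosen for illustration.

```python
# Illustrative sketch of Winograd F(2,3) minimal filtering (Lavin-Gray style).
# Not neon library code; function names are hypothetical.

def winograd_f23(d, g):
    """Compute 2 outputs of 3-tap filter g over 4 inputs d using 4 multiplies."""
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # The four multiplications (the filter-side factors can be precomputed once
    # per filter, so they are not counted against the per-tile cost):
    m1 = (d0 - d2) * g0
    m2 = (d1 + d2) * (g0 + g1 + g2) / 2
    m3 = (d2 - d1) * (g0 - g1 + g2) / 2
    m4 = (d1 - d3) * g2
    # Recombine with additions only:
    return [m1 + m2 + m3, m2 - m3 - m4]

def direct(d, g):
    """Direct sliding-window filtering: 6 multiplies for the same 2 outputs."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]

d = [1.0, 2.0, 3.0, 4.0]
g = [0.5, -1.0, 0.25]
print(winograd_f23(d, g))  # matches direct(d, g): [-0.75, -1.0]
```

The larger F(4x4,3x3) transform (the `enable_winograd = 4` preference) saves even more multiplications per output tile, at the cost of bigger intermediate transforms, which is why the backend exposes a choice between the two.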