Model Zoo
Neon features fast implementations of most state-of-the-art models reported in the academic literature.
Several examples are packaged with neon in the neon/examples folder.
Note that some of these are shortened models that reduce training time; they are meant to illustrate different ways to use neon.
Our Model Zoo contains complete models, with python scripts as well as pre-trained weights. For the latest updates, we recommend paying a visit to the Model Zoo. Admission is free!
Multilayer Perceptron (MLP)
These are the simplest models, applying a multilayer perceptron (MLP) to the problem of recognizing handwritten digits (MNIST dataset). One example instead uses the CIFAR-10 dataset (60,000 natural images from 10 categories).
Model | Description |
---|---|
mnist_mlp.py | Simple MLP model |
mnist_branch.py | Small MLP with multiple branches |
mnist_merge.py | MLP model that demonstrates merging |
cifar10.py | Small MLP applied to natural images |
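To make the idea of an MLP classifier concrete, here is a minimal sketch of the forward pass in plain Python. It is not the code in mnist_mlp.py and does not use the neon API. The layer sizes (784 inputs for a 28x28 MNIST image, 100 hidden units, 10 outputs), the ReLU hidden activation, and the helper names are all assumptions chosen for illustration, and the weights are random rather than trained.

```python
import math
import random


def relu(values):
    # Rectified linear activation applied elementwise.
    return [max(0.0, v) for v in values]


def softmax(values):
    # Subtract the maximum before exponentiating for numerical stability.
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def affine(inputs, weights, biases):
    # weights holds one row per output unit.
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    # Small Gaussian weights and zero biases (an illustrative choice).
    weights = [[rng.gauss(0.0, 0.01) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def mlp_forward(pixels, layers):
    # Hidden layers use ReLU; the final layer produces class probabilities.
    activations = pixels
    for weights, biases in layers[:-1]:
        activations = relu(affine(activations, weights, biases))
    weights, biases = layers[-1]
    return softmax(affine(activations, weights, biases))


rng = random.Random(0)
layers = [init_layer(784, 100, rng), init_layer(100, 10, rng)]
image = [rng.random() for _ in range(784)]  # stand-in for a flattened digit image
probabilities = mlp_forward(image, layers)
predicted_digit = max(range(10), key=lambda d: probabilities[d])
```

With untrained weights the predicted digit is arbitrary; the point is only the shape of the computation, from 784 pixel values to 10 class probabilities.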
Convolutional Neural Networks
Convolutional neural networks are the state-of-the-art architecture for many image and video processing problems. The main datasets involved are:
- ImageNet: a large corpus of 1 million natural images (256x256 pixels), divided into 1000 categories.
- CIFAR-10: 60,000 natural images (32x32 pixels) from 10 categories.
- PASCAL VOC: Natural images annotated with object bounding boxes.
- UCF101: 13,320 videos from 101 action categories.
- Mini-Places2: Subset of the Places2 dataset. Includes 100,000 images from 100 scene categories.
Example scripts
These Python scripts are found in the neon/examples folder. While these examples load a particular image dataset, in principle they can be adapted to any dataset.
Model | Dataset | Description |
---|---|---|
cifar10_allcnn.py | CIFAR-10 | All-convolutional neural network |
cifar10_conv.py | CIFAR-10 | Small all-convolution network demonstrating use of fp16 data format |
cifar10_msra | CIFAR-10 | Deep residual network detailed in He, 2015 |
alexnet.py | ImageNet | Implementation of AlexNet |
imagenet_allcnn.py | ImageNet | All-convolutional network based on Springenberg, 2014 |
fast_rcnn | PASCAL VOC | Fast region-based CNN (Fast R-CNN) for object localization and detection. Uses a VGG16 network pre-trained on ImageNet (I1K) to initialize the convolution layers. |
conv_autoencoder.py | MNIST | Autoencoder convolutional network that reconstructs the image with deconvolutional layers |
Model Zoo
Our model zoo also includes complete models with both the model script and pre-trained weights upon which to build your networks. The links below lead to individual pages where you can download the model and weights.
Model | Dataset | Description |
---|---|---|
AlexNet | ImageNet | Implementation of AlexNet described in Krizhevsky, 2012 |
VGG | ImageNet | The 16- and 19-layer VGG models, adapted from Caffe for use with neon |
GoogLeNet | ImageNet | 22-layer CNN with multiple branches. See Szegedy, 2014 |
ALLCNN | CIFAR10 | All convolutional model inspired by Springenberg, 2014 |
DeepResNet | CIFAR10 | Deep residual network detailed in He, 2015 |
DeepResNet | mini-Places2 | Deep residual network for scene classification |
FastRCNN | PASCAL VOC | Fast R-CNN model for object localization. The CNN layers are seeded by AlexNet pre-trained in neon with ImageNet. |
C3D model | UCF101 | 3D convolutional networks for video action recognition |
Recurrent Neural Networks
Neon has implementations of all-to-all recurrent neural networks (RNNs), as well as long short-term memory (LSTM) networks and gated recurrent unit (GRU) networks. Training datasets include:
- Penn Treebank (PTB): Text corpus with ~1 million words. Vocabulary is limited to 10,000 words. The task is predicting downstream words/characters.
- Shakespeare: Complete text from Shakespeare’s works.
- IMDB reviews: 25,000 movie reviews, labeled as positive or negative.
- Facebook bAbI: A set of 20 question & answer tasks, each with 1,000 training examples.
- Flickr8k, COCO: Images with associated captions (sentences). Flickr8k consists of 8,092 images captioned by Amazon Mechanical Turk workers, with ~40,000 captions in total. COCO has 328,000 images, each with 5 captions. The COCO images also come with objects labeled using segmentation algorithms.
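The Penn Treebank entry above mentions two preprocessing ideas: capping the vocabulary at a fixed size and predicting downstream words. A minimal sketch of both in plain Python follows. It is not neon's data loader; the helper names (`build_vocab`, `encode`, `next_token_pairs`), the `<unk>` placeholder token, and the toy sentence are assumptions made for illustration.

```python
from collections import Counter

UNK = "<unk>"  # placeholder for out-of-vocabulary words (assumed convention)


def build_vocab(tokens, max_size):
    # Reserve index 0 for UNK, then keep the most frequent tokens.
    counts = Counter(tokens)
    vocab = {UNK: 0}
    for token, _ in counts.most_common(max_size - 1):
        if token not in vocab:
            vocab[token] = len(vocab)
    return vocab


def encode(tokens, vocab):
    # Map each token to its index; unknown tokens fall back to UNK.
    return [vocab.get(token, vocab[UNK]) for token in tokens]


def next_token_pairs(ids):
    # The target sequence is the input sequence shifted by one position.
    return ids[:-1], ids[1:]


tokens = "the cat sat on the mat the end".split()
vocab = build_vocab(tokens, max_size=3)
ids = encode(tokens, vocab)
inputs, targets = next_token_pairs(ids)
```

For the real corpus, `max_size` would be 10,000. The same shift-by-one pairing applies to character-level models if `tokens` is a list of characters instead of words.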
Example scripts
These example scripts, found in neon/examples, demonstrate how to load and preprocess text data (for some models) and construct the recurrent networks.
Model | Dataset | Description |
---|---|---|
word_lstm.py | PTB (word) | LSTM/GRU network for prediction |
char_lstm.py | PTB (char) | LSTM/GRU network for prediction |
char_rnn.py | PTB (char) | One-layer RNN with tanh units for prediction |
text_generation_lstm.py | Shakespeare | Trains an LSTM network then demonstrates how to draw samples from the network |
timeseries_lstm.py | Time series | Trains a network on a synthetic time series and generates sequences |
imdb_lstm.py | IMDB | Performs sentiment analysis on IMDB (see Li, 2015) |
image_caption.py | Flickr, COCO | This model connects image features with sentences to learn how to caption unseen images. Uses precomputed VGG features and a sentence to train an LSTM. See Karpathy Neural Talk. |
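As a rough illustration of what a one-layer RNN with tanh units computes, the sketch below runs the recurrence h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h) over a short character sequence in plain Python. It is not taken from char_rnn.py and does not use the neon API; the hidden size, the weight initialization range, and the function names are assumptions for illustration, and no training is performed.

```python
import math
import random


def one_hot(index, size):
    # Encode a character index as a vector with a single 1.0 entry.
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def rnn_step(x, h_prev, w_xh, w_hh, b_h):
    # One recurrence step: combine the current input with the previous state.
    h_new = []
    for j in range(len(h_prev)):
        total = b_h[j]
        total += sum(w * xi for w, xi in zip(w_xh[j], x))
        total += sum(w * hk for w, hk in zip(w_hh[j], h_prev))
        h_new.append(math.tanh(total))
    return h_new


def run_rnn(char_ids, vocab_size, hidden_size, seed=0):
    # Random, untrained weights; the hidden state starts at zero.
    rng = random.Random(seed)
    w_xh = [[rng.uniform(-0.1, 0.1) for _ in range(vocab_size)]
            for _ in range(hidden_size)]
    w_hh = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
            for _ in range(hidden_size)]
    b_h = [0.0] * hidden_size
    h = [0.0] * hidden_size
    states = []
    for cid in char_ids:
        h = rnn_step(one_hot(cid, vocab_size), h, w_xh, w_hh, b_h)
        states.append(h)
    return states


text = "hello"
alphabet = sorted(set(text))
char_ids = [alphabet.index(c) for c in text]
states = run_rnn(char_ids, vocab_size=len(alphabet), hidden_size=8)
```

The two "l" characters in the input produce different hidden states because each step also depends on the state carried over from the previous characters, which is the property that lets the network model sequences.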
Model Zoo
Model | Dataset | Description |
---|---|---|
Image Captioning | Flickr8k | Image captioning model based on Vinyals, 2015 using precomputed VGG features. |
Question & Answering | bAbI | Facebook’s baseline GRU/LSTM model |
Sentiment analysis | IMDB | LSTM model for classifying movie reviews as positive/negative (Li, 2015) |
Other Examples
Model | Dataset | Description |
---|---|---|
Deep-Q Network | Atari video games | Deep reinforcement learning model to play video games (based on Mnih, 2015) |