Model Zoo

Neon features fast implementations of most state-of-the-art models reported in the academic literature.

Several examples are packaged with neon in the neon/examples folder. Note that these are sometimes shortened models to reduce training time, and are meant to illustrate different ways to use neon.

Our Model Zoo contains complete models, with Python scripts as well as pre-trained weights. For the latest updates, we recommend paying a visit to the Model Zoo. Admission is free!

Multilayer Perceptron (MLP)

These are the simplest models, applying a multilayer perceptron (MLP) to the problem of recognizing handwritten digits (MNIST dataset). One example instead uses the CIFAR-10 dataset (60,000 natural images from 10 categories).

Model Description
Simple MLP model
Small MLP with multiple branches
MLP model that demonstrates merging
Small MLP applied to natural images

Convolutional Neural Networks

Convolutional neural networks are the state-of-the-art architecture for many image and video processing problems. The main datasets involved are:

  1. ImageNet: a large corpus of 1 million natural images (256x256 pixels), divided into 1000 categories.
  2. CIFAR-10: 60,000 natural images (32x32 pixels) from 10 categories.
  3. PASCAL VOC: Natural images annotated with object classes and bounding boxes.
  4. UCF101: 13,320 videos from 101 action categories.
  5. Mini-Places2: Subset of the Places2 dataset. Includes 100,000 images from 100 scene categories.

Example scripts

These Python scripts are found in the neon/examples folder. Each example loads a particular image dataset, but in principle the scripts can be adapted to any dataset.

Model Dataset Description
CIFAR-10 All-convolutional neural network
CIFAR-10 Small all-convolution network demonstrating use of fp16 data format
cifar10_msra CIFAR-10 Deep residual network detailed in He, 2015
ImageNet Implementation of AlexNet
ImageNet All-convolutional network based on Springenberg, 2014
fast_rcnn PASCAL VOC Fast region-based CNN (R-CNN) for object localization and detection. Uses a pre-trained VGG16 network trained on ImageNet 1K to initialize the convolution layers.
MNIST Autoencoder convolutional network that reconstructs the image with deconvolutional layers
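One of the CIFAR-10 examples listed above demonstrates the fp16 data format. As a rough illustration of what half precision means for stored values, the sketch below rounds Python floats to fp16 using only the standard library. It does not use neon, and the helper name `to_fp16` is hypothetical.

```python
import struct


def to_fp16(value: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    # The 'e' format code packs a 16-bit float; unpacking widens it back
    # to a Python float without further loss.
    return struct.unpack("<e", struct.pack("<e", value))[0]


# 0.1 has no exact fp16 representation, so it is rounded.
print(to_fp16(0.1))     # 0.0999755859375
# Above 2048, fp16 can no longer represent every integer.
print(to_fp16(2049.0))  # 2048.0
```

The reduced precision is the trade-off for halving memory use compared with 32-bit floats, which is why fp16 is attractive for training on GPUs.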

Model Zoo

Our model zoo also includes complete models with both the model script and pre-trained weights upon which to build your networks. The links below lead to individual pages where you can download the model and weights.

Model Dataset Description
Alexnet ImageNet Implementation of AlexNet described in Krizhevsky, 2012
VGG ImageNet The 16- and 19-layer VGG models, adapted from Caffe for use with neon
GoogleNet ImageNet 22-layer CNN with multiple branches. See Szegedy, 2014.
ALLCNN CIFAR10 All convolutional model inspired by Springenberg, 2014
DeepResNet CIFAR10 Deep residual network detailed in He, 2015
DeepResNet mini-Places2 Deep residual network for scene classification
FastRCNN Pascal-VOC Fast R-CNN model for object localization. The CNN layers are seeded by AlexNet pre-trained in neon with ImageNet.
C3D model UCF101 3D convolutional networks for video action recognition

Recurrent Neural Networks

Neon has implementations of all-to-all recurrent neural networks (RNNs), as well as long short-term memory (LSTM) and gated recurrent unit (GRU) networks. Training datasets include:

  1. Penn Treebank (PTB): Text corpus with ~1 million words. Vocabulary is limited to 10,000 words. The task is to predict the words or characters that come next in the text.
  2. Shakespeare: Complete text from Shakespeare’s works.
  3. IMDB reviews: 25,000 movie reviews, labeled as positive or negative.
  4. Facebook bAbI: A set of 20 question & answer tasks, each with 1,000 training examples.
  5. Flickr8k, COCO: Images with associated captions (sentences). Flickr8k consists of 8,092 images with ~40,000 captions written by Amazon Mechanical Turk workers. COCO has 328,000 images, each with 5 captions. The COCO images also come with segmented, labeled objects.

Example scripts

These example scripts, found in neon/examples, demonstrate how to load and preprocess text data (for some models) and construct the recurrent networks.

Model Dataset Description
PTB (word) LSTM/GRU network for prediction
PTB (char) LSTM/GRU network for prediction
PTB (char) One-layer RNN with tanh units for prediction
Shakespeare Trains an LSTM network then demonstrates how to draw samples from the network
Time series Trains a network on a synthetic time series and generates sequences
IMDB Performs sentiment analysis on IMDB (see Li, 2015)
Flickr, COCO This model connects image features with sentences to learn how to caption unseen images. Uses precomputed VGG features and a sentence to train an LSTM. See Karpathy's NeuralTalk.
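The character-level prediction examples above rely on turning raw text into integer sequences whose targets are the inputs shifted by one position. The following is a minimal sketch of that preprocessing idea in plain Python. It is not neon's own data-loading code, and the names `build_vocab`, `make_sequences`, and `time_steps` are assumptions chosen for illustration.

```python
def build_vocab(text):
    """Map each distinct character to an integer index (sorted for determinism)."""
    return {ch: idx for idx, ch in enumerate(sorted(set(text)))}


def make_sequences(text, vocab, time_steps):
    """Split encoded text into (input, target) pairs for next-character prediction.

    Each target is its input shifted one position to the right.
    """
    encoded = [vocab[ch] for ch in text]
    pairs = []
    # Step by time_steps so the input windows do not overlap; stop early
    # enough that every target window is full length.
    for start in range(0, len(encoded) - time_steps, time_steps):
        inputs = encoded[start:start + time_steps]
        targets = encoded[start + 1:start + time_steps + 1]
        pairs.append((inputs, targets))
    return pairs


sample = "to be or not to be"
vocab = build_vocab(sample)
pairs = make_sequences(sample, vocab, time_steps=4)
```

A word-level variant would follow the same pattern after splitting the text into tokens and capping the vocabulary size, as is done for Penn Treebank.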

Model Zoo

Model Dataset Description
Image Captioning Flickr8k Image captioning model based on Vinyals, 2015 using precomputed VGG features.
Question & Answering bAbI Facebook’s baseline GRU/LSTM model
Sentiment analysis IMDB LSTM model for classifying movie reviews as positive/negative (Li, 2015)

Other Examples

Model Dataset Description
Deep-Q Network Atari video games Deep reinforcement learning model to play video games (based on Mnih, 2015)