neon.data.imagecaption.ImageCaption

class neon.data.imagecaption.ImageCaption(path, max_images=-1)[source]

Bases: neon.data.dataiterator.NervanaDataIterator

This class loads the sentences and CNN image features for image captioning taken from Andrej Karpathy’s Deep Visual-Semantic Alignments work. They are converted to pkl format to avoid requiring scipy to load the .mat image features.

The dataset assumes the model takes the precomputed VGG features of an image and a sentence converted to its one-hot representation. The model then transforms both the image and the sentence into the same space and prepends the image to the sentence, so that the image is treated as the first word of the sequence fed to an RNN.
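The image-prepending step described above can be sketched with NumPy. This is an illustrative toy, not neon's implementation: the projection matrices `W_img` and `W_word` are hypothetical stand-ins for weights the captioning model would learn, and the vocabulary and sentence are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size, hidden = 10, 8
image_size = 4096  # matches the image_size class attribute

# Hypothetical projection matrices mapping image features and words
# into the same hidden space (the real weights are learned by the model).
W_img = rng.standard_normal((hidden, image_size)) * 0.01
W_word = rng.standard_normal((hidden, vocab_size)) * 0.01

vgg_feat = rng.standard_normal(image_size)   # precomputed VGG feature
sentence = [3, 1, 4]                         # word indices of one sentence

# One-hot encode the sentence: one column per word.
one_hot = np.zeros((vocab_size, len(sentence)))
one_hot[sentence, np.arange(len(sentence))] = 1.0

img_vec = W_img @ vgg_feat                   # shape (hidden,)
word_vecs = W_word @ one_hot                 # shape (hidden, 3)

# Prepend the projected image so it acts as the first "word" of the sequence.
rnn_input = np.column_stack([img_vec, word_vecs])
```

The RNN then consumes `rnn_input` column by column, with the image occupying time step zero.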

__init__(path, max_images=-1)[source]

Load vocab and image features. Convert sentences to indices.

Parameters:
  • path (str) – Directory containing sentences and image features.
  • max_images (int) – Number of images to load. Set to -1 for max.

Methods

__init__(path[, max_images]) Load vocab and image features.
bleu_score(sents, targets) Compute the BLEU score from a list of predicted sentences and reference sentences
gen_class(pdict)
get_description([skip]) Returns a dict that contains all necessary information needed to serialize this object.
iterImageSentenceGroup() Iterate over all image sentence groups
iterImageSentencePair() Iterate over all image sentence pairs where an image may be repeated
iterSentences() Iterate over all sentences
load_vocab() Load vocab and initialize buffers. The input sentence batch has dimension (vocab_size, max_sentence_length * batch_size), where each column is the 1-hot representation of a word and the first batch_size columns are the first words of each sentence.
nbatches() Return the number of minibatches in this dataset.
predict(model) Given a model, generate sentences from this dataset.
prob_to_word(prob) Convert 1 hot probabilities to sentences.
read_images(split) Read sentences and image features from pickled dict
recursive_gen(pdict, key) Helper method to check whether the definition dictionary defines a NervanaObject child.
reset() Reset the starting index of this dataset back to zero.
be = None
bleu_score(sents, targets)[source]

Compute the BLEU score from a list of predicted sentences and reference sentences.

Parameters:
  • sents (list) – list of predicted sentences
  • targets (list) – list of reference sentences where each element is a list of multiple references.
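For intuition about what this method measures, here is a deliberately simplified unigram BLEU (BLEU-1 with clipped counts and a brevity penalty). It is not neon's implementation, which handles higher-order n-grams; the sentences are invented for illustration.

```python
import math
from collections import Counter

def bleu1(candidate, references):
    """Simplified unigram BLEU with brevity penalty (illustrative only)."""
    cand = candidate.split()
    cand_counts = Counter(cand)
    # Clip each unigram count by its maximum count over all references.
    max_ref = Counter()
    for ref in references:
        for w, c in Counter(ref.split()).items():
            max_ref[w] = max(max_ref[w], c)
    clipped = sum(min(c, max_ref[w]) for w, c in cand_counts.items())
    precision = clipped / max(len(cand), 1)
    # Brevity penalty against the closest reference length.
    ref_len = min((len(r.split()) for r in references),
                  key=lambda rl: (abs(rl - len(cand)), rl))
    bp = 1.0 if len(cand) > ref_len else math.exp(1 - ref_len / max(len(cand), 1))
    return bp * precision

# One predicted sentence scored against two references.
score = bleu1("a dog runs", ["a dog runs fast", "the dog is running"])
```

Every candidate word appears in a reference, so precision is 1.0 and the score is just the brevity penalty exp(1 - 4/3) ≈ 0.717.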
classnm

Returns the class name.

end_token = '.'
gen_class(pdict)
get_description(skip=[], **kwargs)

Returns a dict that contains all necessary information needed to serialize this object.

Parameters: skip (list) – Objects to omit from the dictionary.
Returns: Dictionary format for object information.
Return type: (dict)
image_size = 4096
iterImageSentenceGroup()[source]

Iterate over all image sentence groups

iterImageSentencePair()[source]

Iterate over all image sentence pairs where an image may be repeated

iterSentences()[source]

Iterate over all sentences

load_vocab()[source]

Load vocab and initialize buffers. The input sentence batch has dimension (vocab_size, max_sentence_length * batch_size), where each column is the 1-hot representation of a word and the first batch_size columns are the first words of each sentence.
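The interleaved column layout described above can be demonstrated with a toy buffer. This is a sketch under assumed toy dimensions, not neon's buffer code; index 0 stands in for padding.

```python
import numpy as np

vocab_size, max_len, batch = 6, 3, 2
# Two toy sentences as word indices, padded to max_len with index 0.
sentences = np.array([[2, 4, 1],
                      [5, 3, 0]])

buf = np.zeros((vocab_size, max_len * batch))
for t in range(max_len):        # time step
    for b in range(batch):      # sentence within the batch
        # Column layout: the first `batch` columns hold the first word
        # of every sentence, the next `batch` columns the second, etc.
        buf[sentences[b, t], t * batch + b] = 1.0
```

Columns 0 and 1 are the first words of sentences 0 and 1 (indices 2 and 5), columns 2 and 3 the second words, and so on, which is exactly the (vocab_size, max_sentence_length * batch_size) layout described above.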

modulenm

Returns the full module path.

nbatches()

Return the number of minibatches in this dataset.

predict(model)[source]

Given a model, generate sentences from this dataset.

Parameters: model (Model) – Image captioning model.
Returns: (list, list) – predicted sentences and target sentences.
prob_to_word(prob)[source]

Convert 1 hot probabilities to sentences.

Parameters: prob (Tensor) – Word probabilities for each sentence in the batch, of size (vocab_size, batch_size * (max_sentence_length + 1)).
Returns: list containing the predicted sentences.
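The essence of this conversion can be sketched as an argmax over each column, stopping at the end token. The helper name, the toy vocabulary, and the deterministic probability matrix below are all hypothetical; this is not neon's code.

```python
import numpy as np

vocab = ['.', 'a', 'dog', 'runs']   # index 0 is the end token '.'
end_token = '.'

def probs_to_sentences(prob, batch_size):
    """Hypothetical sketch: argmax each column, stop at the end token."""
    vocab_size, n_cols = prob.shape
    steps = n_cols // batch_size
    sents = []
    for b in range(batch_size):
        words = []
        for t in range(steps):
            # Columns are interleaved by time step, batch-first within a step.
            w = vocab[prob[:, t * batch_size + b].argmax()]
            if w == end_token:
                break
            words.append(w)
        sents.append(' '.join(words))
    return sents

# Deterministic probabilities spelling out "a dog runs ." for one sentence.
prob = np.zeros((len(vocab), 4))
for t, idx in enumerate([1, 2, 3, 0]):
    prob[idx, t] = 0.9
sents = probs_to_sentences(prob, batch_size=1)
```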
read_images(split)[source]

Read sentences and image features from the pickled dict.

Parameters: split (str) – 'test' or 'train' split.
recursive_gen(pdict, key)

Helper method that checks whether the definition dictionary defines a NervanaObject child; if so, it instantiates that object and replaces the dictionary element with the instance.
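The deserialization pattern described above can be illustrated with a generic sketch. The registry, the `register` decorator, and the `Affine` toy class are all invented for illustration; neon's actual mechanism resolves NervanaObject subclasses rather than a hand-built registry.

```python
# Illustrative sketch (not neon's code): walk a definition dict and, when an
# entry looks like a serialized object ({'type': ..., 'config': ...}),
# instantiate it and replace the entry in place.
REGISTRY = {}

def register(cls):
    REGISTRY[cls.__name__] = cls
    return cls

@register
class Affine:
    """Toy stand-in for a serializable object class."""
    def __init__(self, nout):
        self.nout = nout

def recursive_gen(pdict, key):
    val = pdict[key]
    if isinstance(val, dict) and val.get('type') in REGISTRY:
        # The entry defines a registered class: instantiate and replace it.
        pdict[key] = REGISTRY[val['type']](**val.get('config', {}))
    elif isinstance(val, dict):
        # Otherwise recurse into nested definition dictionaries.
        for k in val:
            recursive_gen(val, k)
    return pdict[key]

model_def = {'layer': {'type': 'Affine', 'config': {'nout': 10}}}
layer = recursive_gen(model_def, 'layer')
```

After the call, `model_def['layer']` holds an `Affine` instance instead of the defining dictionary, mirroring the in-place replacement the docstring describes.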

reset()

Reset the starting index of this dataset back to zero.