neon.data.datasets.Dataset

class neon.data.datasets.Dataset(filename, url, size, path='.', subset_pct=100)

Bases: neon.NervanaObject

Container class for stock datasets.

Parameters:
  • filename (str/list) – name of the file to download
  • url (str) – url for dataset
  • size (int/list) – file size
  • path (str) – local path to place files
  • subset_pct (float/int) – percentage of data set to use for training
__init__(filename, url, size, path='.', subset_pct=100)
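
To make the meaning of subset_pct concrete, the sketch below keeps only a leading percentage of a list of training samples. The helper name take_subset, the rounding, and the choice to truncate from the front rather than sample at random are all assumptions made for illustration; the page above does not say how neon applies the percentage.

```python
def take_subset(samples, subset_pct=100):
    """Return the leading subset_pct percent of samples (hypothetical helper)."""
    if not 0 < subset_pct <= 100:
        raise ValueError("subset_pct must be in (0, 100]")
    # subset_pct may be a float, so convert the count to int before slicing.
    keep = max(1, int(len(samples) * subset_pct / 100))
    return samples[:keep]


train = list(range(1000))
print(len(take_subset(train, subset_pct=25)))  # 250
```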

Methods

__init__(filename, url, size[, path, subset_pct])
fetch_dataset(url, sourcefile, destfile, totalsz) Download the file specified by the given URL.
gen_class(pdict)
gen_iterators() Method that generates the data set iterators for the train, test and validation data sets.
get_description([skip]) Returns a dict that contains all necessary information needed to serialize this object.
get_iterator(setname) Helper method to get the data iterator for the specified data set.
load_zip(filename, size) Helper function for downloading test files.
recursive_gen(pdict, key) Helper method to check whether the definition dictionary is defining a NervanaObject child.
serialize() Generates a dictionary with the required parameters to describe this object.
be = None
classnm

Returns the class name.

data_dict
static fetch_dataset(url, sourcefile, destfile, totalsz)

Download the file specified by the given URL.

Parameters:
  • url (str) – Base URL of the file to be downloaded.
  • sourcefile (str) – Name of the source file.
  • destfile (str) – Path to the destination.
  • totalsz (int) – Size of the file to be downloaded.
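
The four parameters suggest a chunked download with progress reporting. The following is a minimal sketch of that idea using only the standard library. The function name fetch_file, the chunk_size parameter, and the way url and sourcefile are joined are assumptions; this is not neon's implementation.

```python
from urllib.request import urlopen


def fetch_file(url, sourcefile, destfile, totalsz, chunk_size=64 * 1024):
    """Stream url/sourcefile to destfile and return the number of bytes written.

    Hypothetical stand-in for illustration only.
    """
    full_url = url.rstrip("/") + "/" + sourcefile
    done = 0
    with urlopen(full_url) as response, open(destfile, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            done += len(chunk)
            # In this sketch totalsz is used only for the progress display.
            pct = 100 * done // totalsz if totalsz else 100
            print("\rDownloading %s: %3d%%" % (sourcefile, min(pct, 100)), end="")
    print()
    return done
```

Because urlopen also accepts file:// URLs, the sketch can be tried against a local directory without network access.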
gen_class(pdict)
gen_iterators()

Method that generates the data set iterators for the train, test and validation data sets. This method needs to set the instance data_dict attribute to a dictionary of data iterators.

Returns: Dictionary with the various data set iterators.
Return type: dict
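
A subclass is expected to override gen_iterators and return a dictionary keyed by set name. The sketch below shows that pattern with a stand-in base class so it runs without neon installed. SketchDataset, ToyDataset, the private _data_dict cache, and the lazy property are assumptions for illustration, and plain lists stand in for real data iterators.

```python
class SketchDataset(object):
    """Hypothetical stand-in mirroring the documented interface; not neon code."""

    def __init__(self):
        self._data_dict = None

    @property
    def data_dict(self):
        # Build the iterators once, on first access, then reuse them.
        if self._data_dict is None:
            self._data_dict = self.gen_iterators()
        return self._data_dict

    def gen_iterators(self):
        raise NotImplementedError("subclasses build their iterators here")

    def get_iterator(self, setname):
        if setname not in ("train", "valid", "test"):
            raise ValueError("unknown set name: %r" % (setname,))
        return self.data_dict[setname]


class ToyDataset(SketchDataset):
    def gen_iterators(self):
        data = list(range(10))
        # Plain lists stand in for real data iterators.
        self._data_dict = {"train": data[:6], "valid": data[6:8], "test": data[8:]}
        return self._data_dict


ds = ToyDataset()
print(ds.get_iterator("valid"))  # [6, 7]
```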
get_description(skip=[], **kwargs)

Returns a dict that contains all necessary information needed to serialize this object.

Parameters: skip (list) – Objects to omit from the dictionary.
Returns: Dictionary format for object information.
Return type: dict
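
One way to picture a description dictionary is as the object's constructor arguments plus a type name, minus anything listed in skip. The sketch below builds such a dictionary by introspection. The classes Describable and Source, the 'type' and 'config' keys, and the use of inspect.signature are assumptions for illustration rather than neon's confirmed behaviour.

```python
import inspect


class Describable(object):
    """Hypothetical stand-in; not neon's NervanaObject."""

    def get_description(self, skip=None):
        skip = set(skip or [])
        # Collect constructor argument names, then read same-named attributes.
        params = inspect.signature(type(self).__init__).parameters
        config = {
            name: getattr(self, name)
            for name in params
            if name != "self" and name not in skip and hasattr(self, name)
        }
        return {"type": type(self).__module__ + "." + type(self).__name__,
                "config": config}


class Source(Describable):
    def __init__(self, filename, url, path="."):
        self.filename, self.url, self.path = filename, url, path


desc = Source("data.zip", "https://example.com", path="/tmp").get_description(skip=["path"])
print(desc["config"])  # {'filename': 'data.zip', 'url': 'https://example.com'}
```

The sketch defaults skip to None rather than to a list literal, since a mutable default argument in Python is shared between calls.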
get_iterator(setname)

Helper method to get the data iterator for the specified data set.

Parameters: setname (str) – which iterator to return (e.g. ‘train’, ‘valid’)
load_zip(filename, size)

Helper function for downloading test files. Downloads the file and unzips it into the directory self.path.

Parameters:
  • filename (str) – name of file to download from self.url
  • size (int) – size of the file in bytes
Returns: Path to the downloaded dataset.
Return type: str
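
The download-then-unzip flow can be sketched as follows. The function load_zip_sketch, its fetch callback, and the convention of returning the archive path without its .zip suffix are assumptions for illustration; only the extraction into the given directory is taken from the description above.

```python
import os
import zipfile


def load_zip_sketch(path, filename, size, fetch=None):
    """Ensure path/filename exists, unzip it into path, return the dataset path.

    Hypothetical helper; fetch(filename, destfile, size) is an assumed callback
    that downloads the archive when it is not already on disk.
    """
    archive = os.path.join(path, filename)
    if not os.path.exists(archive):
        if fetch is None:
            raise FileNotFoundError(archive)
        fetch(filename, archive, size)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(path)
    # Assumed convention: the dataset lives at the archive path minus ".zip".
    return os.path.splitext(archive)[0]
```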

modulenm

Returns the full module path.

recursive_gen(pdict, key)

Helper method that checks whether the definition dictionary defines a NervanaObject child. If so, it instantiates that object and replaces the dictionary element with the instance.
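
The replace-in-place behaviour can be illustrated with a small self-contained sketch. The Node class, the REGISTRY lookup, and the 'type' and 'config' keys are assumptions made for this example; neon's actual definition format and class lookup may differ.

```python
class Node(object):
    """Hypothetical object type used only for this illustration."""

    def __init__(self, name, child=None):
        self.name, self.child = name, child


REGISTRY = {"Node": Node}  # maps assumed 'type' strings to classes


def gen_class(pdict):
    """Instantiate the class named in pdict['type'] from pdict['config']."""
    config = dict(pdict.get("config", {}))
    for key in config:
        recursive_gen(config, key)
    return REGISTRY[pdict["type"]](**config)


def recursive_gen(pdict, key):
    """Replace pdict[key] in place if it looks like an object definition."""
    value = pdict[key]
    if isinstance(value, dict) and value.get("type") in REGISTRY:
        pdict[key] = gen_class(value)


definition = {"root": {"type": "Node",
                       "config": {"name": "a",
                                  "child": {"type": "Node", "config": {"name": "b"}}}}}
recursive_gen(definition, "root")
print(definition["root"].child.name)  # b
```

Values that are not definition dictionaries are left untouched, so the helper can be called on every key of a configuration.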

serialize()

Generates a dictionary with the required parameters to describe this object.

test_iter

Helper method to return the test set iterator.

train_iter

Helper method to return the training set iterator.

valid_iter

Helper method to return the validation set iterator.