neuralgym.data

class neuralgym.data.Dataset
Bases: object
Base class for datasets.
Dataset members are logged automatically, except members whose names end with '_', e.g. 'self.fnamelists_'.
data_pipeline(batch_size)
Return a batch of data of the given batch size, e.g. return batch_image or return (batch_data, batch_label).
Parameters: batch_size (int) – Batch size.
maybe_download_and_extract()
Abstract method: the dataset may need to download and extract files.
view_dataset_info()
View information about the current dataset.
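The interface above can be illustrated with a small stand-in. This is only a sketch: `SketchDataset`, `ListDataset` and `loggable_members` are hypothetical names that do not exist in neuralgym, the underscore filter merely mimics the logging rule described above (how neuralgym implements it is an assumption), and the toy `data_pipeline` returns plain Python lists rather than tensors.

```python
class SketchDataset(object):
    """Stand-in for the documented base class (hypothetical, not neuralgym code)."""

    def data_pipeline(self, batch_size):
        raise NotImplementedError

    def maybe_download_and_extract(self):
        # Subclasses override this if files must be fetched first.
        pass

    def loggable_members(self):
        # Assumed rule: skip members whose names end with '_'.
        return {k: v for k, v in vars(self).items() if not k.endswith('_')}

    def view_dataset_info(self):
        for name, value in sorted(self.loggable_members().items()):
            print('{}: {}'.format(name, value))


class ListDataset(SketchDataset):
    """Toy dataset that batches an in-memory list."""

    def __init__(self, items):
        self.num_items = len(items)
        self.items_ = list(items)  # trailing '_' keeps it out of the log
        self.cursor_ = 0

    def data_pipeline(self, batch_size):
        # Return the next batch_size items, wrapping around at the end.
        batch = [self.items_[(self.cursor_ + i) % self.num_items]
                 for i in range(batch_size)]
        self.cursor_ = (self.cursor_ + batch_size) % self.num_items
        return batch


dataset = ListDataset(['a', 'b', 'c'])
dataset.view_dataset_info()
first_batch = dataset.data_pipeline(2)
second_batch = dataset.data_pipeline(2)
```

Only `num_items` is reported here, because the other two members end with an underscore.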
class neuralgym.data.DataFromFNames(fnamelists, shapes, random=False, random_crop=False, fn_preprocess=None, dtypes=tf.float32, enqueue_size=32, queue_size=256, nthreads=16, return_fnames=False, filetype='image')
Bases: neuralgym.data.dataset.Dataset
Data pipeline from list of filenames.
Parameters:
- fnamelists (list) – A list of filenames or of tuples of filenames, e.g. ['image_001.png', …] or [('pair_image_001_0.png', 'pair_image_001_1.png'), …].
- shapes (tuple) – Shapes of data, e.g. [256, 256, 3] or [[256, 256, 3], [1]].
- random (bool) – Read from fnamelists randomly (defaults to False).
- random_crop (bool) – If True, randomly crop raw images to the shape; otherwise resize raw images directly to the shape.
- dtypes (tf.Type) – Data types, defaults to tf.float32.
- enqueue_size (int) – Enqueue size for pipeline.
- queue_size (int) – Queue size for pipeline.
- nthreads (int) – Number of parallel threads for reading data.
- return_fnames (bool) – If True, data_pipeline will also return fnames (last tensor).
- filetype (str) – File type; currently only 'image' is supported.
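The two accepted layouts for fnamelists and shapes can be easier to see side by side. The sketch below uses a hypothetical helper, `normalize_inputs`, which is not part of neuralgym; it assumes that each tuple of filenames carries exactly one filename per shape, and whether the library performs such a check itself is not stated here.

```python
def normalize_inputs(fnamelists, shapes):
    """Return (list of filename tuples, list of shapes) in one uniform layout."""
    if isinstance(fnamelists[0], str):
        # Single-input form: 'a.png' -> ('a.png',) and [256, 256, 3] -> [[256, 256, 3]].
        fnamelists = [(fname,) for fname in fnamelists]
        shapes = [shapes]
    else:
        fnamelists = [tuple(entry) for entry in fnamelists]
    shapes = [list(shape) for shape in shapes]
    # Assumed rule: one filename per shape in every entry.
    for entry in fnamelists:
        if len(entry) != len(shapes):
            raise ValueError('expected {} filenames per entry, got {}'.format(
                len(shapes), len(entry)))
    return fnamelists, shapes


single = normalize_inputs(['image_001.png', 'image_002.png'], [256, 256, 3])
paired = normalize_inputs(
    [('pair_image_001_0.png', 'pair_image_001_1.png')],
    [[256, 256, 3], [256, 256, 3]])
```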
Examples
>>> import tensorflow as tf
>>> import neuralgym as ng
>>> fnames = ['img001.png', 'img002.png', ..., 'img999.png']
>>> data = ng.data.DataFromFNames(fnames, [256, 256, 3])
>>> images = data.data_pipeline(128)
>>> sess = tf.Session(config=tf.ConfigProto())
>>> tf.train.start_queue_runners(sess)
>>> for i in range(5):
...     sess.run(images)
To get the file list, you can either read it from a file:
    with open('data/images.flist') as f:
        fnames = f.read().splitlines()
or use glob:
    import glob
    fnames = glob.glob('data/*.png')
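The two approaches can also be combined: collect the filenames once with glob, save them as a file list, and read that list back later. The following is a minimal sketch using only the standard library; the temporary directory, the placeholder files and the `images.flist` name are illustrative assumptions, not files shipped with neuralgym.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as data_dir:
    # Create a few empty placeholder files standing in for real images.
    for name in ('img002.png', 'img001.png', 'img003.png'):
        open(os.path.join(data_dir, name), 'w').close()

    # glob returns paths in arbitrary order, so sort for a stable list.
    globbed = sorted(glob.glob(os.path.join(data_dir, '*.png')))

    # Write one filename per line, then read the list back with splitlines().
    flist_path = os.path.join(data_dir, 'images.flist')
    with open(flist_path, 'w') as f:
        f.write('\n'.join(globbed) + '\n')
    with open(flist_path) as f:
        fnames = f.read().splitlines()

basenames = [os.path.basename(p) for p in fnames]
```

Sorting before writing keeps the file list reproducible across runs; shuffling, if wanted, can then be left to the random parameter.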