Python chainer.datasets() Examples
The following are 4 code examples of chainer.datasets. The original project and source file are noted above each example. You may also want to check out the other available functions and classes of the chainer module.
Example #1
Source File: cifar.py From chainer with MIT License
def get_cifar10(withlabel=True, ndim=3, scale=1., dtype=None):
    """Gets the CIFAR-10 dataset.

    `CIFAR-10 <https://www.cs.toronto.edu/~kriz/cifar.html>`_ is a set of
    small natural images. Each example is an RGB color image of size 32x32,
    classified into 10 groups. In the original images, each component of
    pixels is represented by one-byte unsigned integer. This function scales
    the components to floating point values in the interval ``[0, scale]``.

    This function returns the training set and the test set of the official
    CIFAR-10 dataset. If ``withlabel`` is ``True``, each dataset consists of
    tuples of images and labels, otherwise it only consists of images.

    Args:
        withlabel (bool): If ``True``, it returns datasets with labels. In
            this case, each example is a tuple of an image and a label.
            Otherwise, the datasets only contain images.
        ndim (int): Number of dimensions of each image. The shape of each
            image is determined depending on ndim as follows:

            - ``ndim == 1``: the shape is ``(3072,)``
            - ``ndim == 3``: the shape is ``(3, 32, 32)``

        scale (float): Pixel value scale. If it is 1 (default), pixels are
            scaled to the interval ``[0, 1]``.
        dtype: Data type of resulting image arrays. ``chainer.config.dtype``
            is used by default (see :ref:`configuration`).

    Returns:
        A tuple of two datasets. If ``withlabel`` is ``True``, both datasets
        are :class:`~chainer.datasets.TupleDataset` instances. Otherwise,
        both datasets are arrays of images.

    """
    return _get_cifar('cifar-10', withlabel, ndim, scale, dtype)
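For context, here is a minimal usage sketch of the function above. It assumes Chainer is installed and that the CIFAR-10 archive can be downloaded on first use; the printed shape reflects the default ``ndim=3`` and ``dtype``.

import chainer

# Default call: two TupleDatasets whose examples are (image, label) pairs.
train, test = chainer.datasets.get_cifar10()
image, label = train[0]
print(image.shape)            # (3, 32, 32) with the default ndim=3
print(len(train), len(test))  # 50000 training and 10000 test examples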
Example #2
Source File: cifar.py From chainer with MIT License
def get_cifar100(withlabel=True, ndim=3, scale=1., dtype=None):
    """Gets the CIFAR-100 dataset.

    `CIFAR-100 <https://www.cs.toronto.edu/~kriz/cifar.html>`_ is a set of
    small natural images. Each example is an RGB color image of size 32x32,
    classified into 100 groups. In the original images, each component of
    pixels is represented by one-byte unsigned integer. This function scales
    the components to floating point values in the interval ``[0, scale]``.

    This function returns the training set and the test set of the official
    CIFAR-100 dataset. If ``withlabel`` is ``True``, each dataset consists of
    tuples of images and labels, otherwise it only consists of images.

    Args:
        withlabel (bool): If ``True``, it returns datasets with labels. In
            this case, each example is a tuple of an image and a label.
            Otherwise, the datasets only contain images.
        ndim (int): Number of dimensions of each image. The shape of each
            image is determined depending on ndim as follows:

            - ``ndim == 1``: the shape is ``(3072,)``
            - ``ndim == 3``: the shape is ``(3, 32, 32)``

        scale (float): Pixel value scale. If it is 1 (default), pixels are
            scaled to the interval ``[0, 1]``.
        dtype: Data type of resulting image arrays. ``chainer.config.dtype``
            is used by default (see :ref:`configuration`).

    Returns:
        A tuple of two datasets. If ``withlabel`` is ``True``, both datasets
        are :class:`~chainer.datasets.TupleDataset` instances. Otherwise,
        both datasets are arrays of images.

    """
    return _get_cifar('cifar-100', withlabel, ndim, scale, dtype)
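As a short sketch of the ``withlabel`` and ``ndim`` options documented above (same download-on-first-use assumption as before), the datasets can also be returned as plain image arrays:

import chainer

# withlabel=False returns bare image arrays; ndim=1 flattens each image.
train, test = chainer.datasets.get_cifar100(withlabel=False, ndim=1)
print(train.shape)   # expected (50000, 3072): 50000 flattened RGB images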
Example #3
Source File: scatter.py From chainer with MIT License
def scatter_index(n_total_samples, comm, root=0, *, force_equal_length=True):
    '''Scatters only the index to avoid a heavy dataset broadcast

    This is the core functionality of ``scatter_dataset``, which is almost
    equal to the following code snippet::

        (b, e) = scatter_index(len(dataset), comm)
        order = None
        if shuffle:
            order = numpy.random.RandomState(seed).permutation(
                n_total_samples)
            order = comm.bcast_obj(order)
        dataset = SubDataset(dataset, b, e, order)

    Note::
        Make sure the ``force_equal_length`` flag is *not* off for
        multinode evaluators or multinode updaters, which assume that the
        iterator has the same length among processes to work correctly.

    Args:
        n_total_samples (int): number of total samples to scatter
        comm: ChainerMN communicator object
        root (int): root rank to coordinate the operation
        force_equal_length (bool): Force the scattered fragments of the
            index to have equal length. If ``True``, the number of scattered
            indices is guaranteed to be equal among processes, and the
            scattered datasets may have duplication among processes.
            Otherwise, the number of scattered indices may not be equal
            among processes, but the scattered indices are guaranteed to
            have no duplication among processes; this is intended for strict
            evaluation of the test dataset to avoid duplicated examples.

    Returns:
        A tuple of two integers that stand for the beginning and ending
        offsets of the assigned sub-part of the samples. The ending offset
        is exclusive.

    '''
    if comm.rank == root:
        for (i, b, e) in _scatter_index(n_total_samples, comm.size,
                                        force_equal_length):
            if i == root:
                mine = (b, e)
            else:
                comm.send_obj((b, e), dest=i)
        return mine
    else:
        return comm.recv_obj(source=root)
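The snippet quoted in the docstring can be fleshed out into the sketch below. It is assumption-laden: it must be launched under MPI (e.g. ``mpiexec -n 4 python script.py``), it assumes ChainerMN and mpi4py are available, and the import path of ``scatter_index`` may differ between Chainer versions.

import numpy
import chainer
import chainermn
from chainer.datasets import SubDataset
# Assumed import location; adjust to your Chainer/ChainerMN version.
from chainermn.datasets.scatter import scatter_index

comm = chainermn.create_communicator('naive')
train, _ = chainer.datasets.get_cifar10()

b, e = scatter_index(len(train), comm)   # each rank receives its own [b, e)
order = None
if comm.rank == 0:
    # One shuffle order, generated on the root and shared with every rank.
    order = numpy.random.RandomState(0).permutation(len(train))
order = comm.bcast_obj(order)
local_train = SubDataset(train, b, e, order)
print('rank %d holds %d examples' % (comm.rank, len(local_train)))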
Example #4
Source File: svhn.py From chainer with MIT License
def get_svhn(withlabel=True, scale=1., dtype=None,
             label_dtype=numpy.int32, add_extra=False):
    """Gets the SVHN dataset.

    `The Street View House Numbers (SVHN) dataset
    <http://ufldl.stanford.edu/housenumbers/>`_ is a dataset similar to
    MNIST but composed of cropped images of house numbers. The functionality
    of this function is identical to the counterpart for the MNIST dataset
    (:func:`~chainer.datasets.get_mnist`), with the exception that there is
    no ``ndim`` argument.

    .. note::
        `SciPy <https://www.scipy.org/>`_ is required to use this feature.

    Args:
        withlabel (bool): If ``True``, it returns datasets with labels. In
            this case, each example is a tuple of an image and a label.
            Otherwise, the datasets only contain images.
        scale (float): Pixel value scale. If it is 1 (default), pixels are
            scaled to the interval ``[0, 1]``.
        dtype: Data type of resulting image arrays. ``chainer.config.dtype``
            is used by default (see :ref:`configuration`).
        label_dtype: Data type of the labels.
        add_extra: Use the extra training set.

    Returns:
        If ``add_extra`` is ``False``, a tuple of two datasets (train and
        test). Otherwise, a tuple of three datasets (train, test, and
        extra). If ``withlabel`` is ``True``, all datasets are
        :class:`~chainer.datasets.TupleDataset` instances. Otherwise, all
        datasets are arrays of images.

    """
    if not _scipy_available:
        raise RuntimeError('SciPy is not available: %s' % _error)
    train_raw = _retrieve_svhn_training()
    dtype = chainer.get_dtype(dtype)
    train = _preprocess_svhn(train_raw, withlabel, scale, dtype,
                             label_dtype)
    test_raw = _retrieve_svhn_test()
    test = _preprocess_svhn(test_raw, withlabel, scale, dtype,
                            label_dtype)
    if add_extra:
        extra_raw = _retrieve_svhn_extra()
        extra = _preprocess_svhn(extra_raw, withlabel, scale, dtype,
                                 label_dtype)
        return train, test, extra
    else:
        return train, test
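Finally, a brief usage sketch: it assumes SciPy is installed and that the SVHN data files can be downloaded on first use, and it simply exercises the two return forms described in the docstring above.

import chainer

train, test = chainer.datasets.get_svhn()
image, label = train[0]     # an image array and an int32 label
print(len(train), len(test))

# With add_extra=True a third "extra" training split is returned as well.
train, test, extra = chainer.datasets.get_svhn(add_extra=True)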