Datasets¶

TransformDataset¶

class chainercv.datasets.TransformDataset(dataset, transform)¶

Dataset that indexes data of a base dataset and transforms it.

This dataset wraps a base dataset by modifying the behavior of the base dataset’s __getitem__(). Arrays returned by __getitem__() of the base dataset with an integer index are transformed by the given function transform.

The function transform takes one argument, in_data, which is the output of the base dataset’s __getitem__(), and returns the transformed arrays. Please see the following example.

>>> from chainer.datasets import get_mnist
>>> from chainercv.datasets import TransformDataset
>>> dataset, _ = get_mnist()
>>> def transform(in_data):
...     img, label = in_data
...     img -= 0.5  # scale to [-0.5, 0.5]
...     return img, label
>>> dataset = TransformDataset(dataset, transform)

Note

The index used to access data is either an integer or a slice. If it is a slice, the base dataset is assumed to return a list of outputs each corresponding to the output of the integer indexing.
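The indexing behaviour described in this note can be sketched with a small stand-in class. This is not the library's implementation: SketchTransformDataset, shift and the toy base list are hypothetical names used only for illustration, and the sketch assumes the base dataset returns a list when indexed with a slice.

```python
class SketchTransformDataset(object):
    """Hypothetical stand-in that mimics the documented indexing behaviour."""

    def __init__(self, dataset, transform):
        self._dataset = dataset
        self._transform = transform

    def __getitem__(self, index):
        in_data = self._dataset[index]
        if isinstance(index, slice):
            # A slice yields a list of examples; transform each one.
            return [self._transform(example) for example in in_data]
        # An integer index yields a single example.
        return self._transform(in_data)

    def __len__(self):
        return len(self._dataset)


def shift(in_data):
    # Toy transform: subtract 0.5 from the value and keep the label.
    value, label = in_data
    return value - 0.5, label


base = [(0.0, 7), (1.0, 2), (0.5, 1)]  # toy (value, label) pairs
dataset = SketchTransformDataset(base, shift)
single = dataset[0]    # one transformed example
batch = dataset[0:2]   # list of transformed examples
```

Here single is one transformed tuple, while batch is a list with one transformed tuple per element of the slice.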

Note

This class is deprecated. Please use chainer.datasets.TransformDataset instead.

Parameters:

dataset – Underlying dataset. The index of this dataset corresponds to the index of the base dataset.
transform (callable) – A function that is called to transform values returned by the underlying dataset’s `__getitem__()`.

CamVid¶

CamVidDataset¶

class chainercv.datasets.CamVidDataset(data_dir='auto', split='train')¶

Dataset class for a semantic segmentation task on CamVid.

Parameters:

data_dir (string) – Path to the root of the training data. If this is `auto`, this class will automatically download data for you under `$CHAINER_DATASET_ROOT/pfnet/chainercv/camvid`.
split ({'train', 'val', 'test'}) – Select from dataset splits used in CamVid Dataset.

CUB¶

CUBLabelDataset¶

class chainercv.datasets.CUBLabelDataset(data_dir='auto', crop_bbox=True)¶

Caltech-UCSD Birds-200-2011 dataset with annotated class labels.

When queried by an index, this dataset returns a corresponding img, label, a tuple of an image and a class id. The image is in RGB and CHW format. The class ids are between 0 and 199.

There are 200 labels of birds in total.

Parameters:

data_dir (string) – Path to the root of the training data. If this is `auto`, this class will automatically download data for you under `$CHAINER_DATASET_ROOT/pfnet/chainercv/cub`.
crop_bbox (bool) – If true, this class returns an image cropped by the bounding box of the bird inside it.

CUBKeypointDataset¶

class chainercv.datasets.CUBKeypointDataset(data_dir='auto', crop_bbox=True, mask_dir='auto', return_mask=False)¶

Caltech-UCSD Birds-200-2011 dataset with annotated keypoints.

An index corresponds to each image.

When queried by an index, this dataset returns the corresponding img, keypoint, kp_mask, a tuple of an image, keypoints and a keypoint mask that indicates visible keypoints in the image. The data types of the three elements are float32, float32 and bool. If return_mask = True, mask is returned as well, making the returned tuple of length four. mask is a uint8 image that indicates the region of the image where the bird is located.

The keypoints are packed into a two-dimensional array of shape $(K, 2)$, where $K$ is the number of keypoints. Note that $K=15$ in the CUB dataset. Also note that not all fifteen keypoints are visible in every image. When a keypoint is not visible, the values stored for that keypoint are undefined. The second axis corresponds to the $y$ and $x$ coordinates of the keypoints in the image.

A keypoint mask array indicates whether a keypoint is visible in the image or not. This is a boolean array of shape $(K,)$.

A mask image of the bird shows how likely it is that the bird is located at a given pixel. The closer the value is to 255, the more likely it is that the bird is located at that pixel. The shape of this array is $(1, H, W)$, where $H$ and $W$ are the height and width of the image respectively.
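Because the coordinates of hidden keypoints are undefined, they usually need to be filtered out with the keypoint mask before use. The following is a minimal sketch of that step, assuming toy data with three keypoints instead of fifteen; plain Python lists stand in for the arrays returned by the dataset so that the snippet runs without any dependency.

```python
# Toy data standing in for one example; real keypoints have K = 15 rows.
keypoint = [[10.0, 20.0], [0.0, 0.0], [35.5, 42.0]]  # each row is (y, x)
kp_mask = [True, False, True]                         # visibility per keypoint

# Keep only the keypoints marked visible; values of hidden ones are undefined.
visible = [tuple(yx) for yx, is_visible in zip(keypoint, kp_mask) if is_visible]

# Unpack in (y, x) order, which is the order of the second axis.
ys = [y for y, _ in visible]
xs = [x for _, x in visible]
```

With NumPy arrays, boolean indexing such as keypoint[kp_mask] expresses the same selection.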

Parameters:

data_dir (string) – Path to the root of the training data. If this is auto, this class will automatically download data for you under $CHAINER_DATASET_ROOT/pfnet/chainercv/cub.
crop_bbox (bool) – If true, this class returns an image cropped by the bounding box of the bird inside it.
mask_dir (string) – Path to the root of the mask data. If this is auto, this class will automatically download data for you under $CHAINER_DATASET_ROOT/pfnet/chainercv/cub.
return_mask (bool) – Decide whether to include mask image of the bird in a tuple served for a query.

OnlineProducts¶

OnlineProductsDataset¶

class chainercv.datasets.OnlineProductsDataset(data_dir='auto', split='train')¶

Dataset class for Stanford Online Products Dataset.

When queried by an index, this dataset returns a corresponding img, class_id, super_class_id, a tuple of an image, a class id and a coarse level class id. Images are in RGB and CHW format. Class ids start from 0.

The split argument selects the train or test split of the dataset, as done in [1]. The train split contains the first 11318 classes and the test split contains the remaining 11316 classes.

[1]	Hyun Oh Song, Yu Xiang, Stefanie Jegelka, Silvio Savarese. Deep Metric Learning via Lifted Structured Feature Embedding. arXiv 2015.

Parameters:

data_dir (string) – Path to the root of the training data. If this is `auto`, this class will automatically download data for you under `$CHAINER_DATASET_ROOT/pfnet/chainercv/online_products`.
split ({'train', 'test'}) – Select a split of the dataset.

PASCAL VOC¶

VOCDetectionDataset¶

class chainercv.datasets.VOCDetectionDataset(data_dir='auto', split='train', year='2012', use_difficult=False, return_difficult=False)¶

Dataset class for the detection task of PASCAL VOC.

The index corresponds to each image.

When queried by an index, if return_difficult == False, this dataset returns a corresponding img, bbox, label, a tuple of an image, bounding boxes and labels. This is the default behaviour. If return_difficult == True, this dataset returns corresponding img, bbox, label, difficult. difficult is a boolean array that indicates whether bounding boxes are labeled as difficult or not.

The bounding boxes are packed into a two dimensional tensor of shape $(R, 4)$, where $R$ is the number of bounding boxes in the image. The second axis represents attributes of the bounding box. They are (y_min, x_min, y_max, x_max), where the four attributes are coordinates of the top left and the bottom right vertices.
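The (y_min, x_min, y_max, x_max) ordering differs from the x-first ordering used by some other tools, so it is worth unpacking explicitly. The sketch below computes box heights, widths and areas from toy boxes; bbox_size is a hypothetical helper, and plain Python lists stand in for the float32 array returned by the dataset.

```python
def bbox_size(box):
    """Return (height, width) of a box given as (y_min, x_min, y_max, x_max)."""
    y_min, x_min, y_max, x_max = box
    return y_max - y_min, x_max - x_min


# Two toy boxes, i.e. R = 2, each row in (y_min, x_min, y_max, x_max) order.
bbox = [
    [10.0, 20.0, 110.0, 70.0],
    [0.0, 0.0, 50.0, 50.0],
]
sizes = [bbox_size(box) for box in bbox]
areas = [height * width for height, width in sizes]
```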

The labels are packed into a one dimensional tensor of shape $(R,)$. $R$ is the number of bounding boxes in the image. The class name of the label $l$ is the $l$-th element of chainercv.datasets.voc_detection_label_names.

The array difficult is a one dimensional boolean array of shape $(R,)$. $R$ is the number of bounding boxes in the image. If use_difficult is False, all elements of this array are False.
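Since label and difficult share the same length $R$, they can be traversed together. The sketch below drops the boxes flagged as difficult and maps the remaining labels to class names. The label_names tuple is a hypothetical placeholder for chainercv.datasets.voc_detection_label_names, and the toy lists stand in for the arrays returned with return_difficult == True.

```python
# Hypothetical placeholder for chainercv.datasets.voc_detection_label_names.
label_names = ('class_a', 'class_b', 'class_c')

# Toy annotations for one image with R = 3 bounding boxes.
label = [2, 0, 2]
difficult = [False, True, False]

# Keep the class names of boxes that are not flagged as difficult.
easy_names = [
    label_names[l] for l, is_difficult in zip(label, difficult)
    if not is_difficult
]
```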

The data types of the image, the bounding boxes, the labels and the difficult array are as follows.

img.dtype == numpy.float32
bbox.dtype == numpy.float32
label.dtype == numpy.int32
difficult.dtype == numpy.bool

Parameters:

data_dir (string) – Path to the root of the training data. If this is auto, this class will automatically download data for you under $CHAINER_DATASET_ROOT/pfnet/chainercv/voc.
split ({'train', 'val', 'trainval', 'test'}) – Select a split of the dataset. The test split is only available for the 2007 dataset.
year ({'2007', '2012'}) – Use a dataset prepared for a challenge held in year.
use_difficult (bool) – If true, use images that are labeled as difficult in the original annotation.
return_difficult (bool) – If true, this dataset returns a boolean array that indicates whether bounding boxes are labeled as difficult or not. The default value is False.

VOCSemanticSegmentationDataset¶

class chainercv.datasets.VOCSemanticSegmentationDataset(data_dir='auto', split='train')¶

Dataset class for the semantic segmentation task of PASCAL VOC2012.

The class name of the label $l$ is the $l$-th element of chainercv.datasets.voc_semantic_segmentation_label_names.

Parameters:

data_dir (string) – Path to the root of the training data. If this is `auto`, this class will automatically download data for you under `$CHAINER_DATASET_ROOT/pfnet/chainercv/voc`.
split ({'train', 'val', 'trainval'}) – Select a split of the dataset.