Datasets
General datasets
DirectoryParsingLabelDataset
class chainercv.datasets.DirectoryParsingLabelDataset(root, check_img_file=None, color=True, numerical_sort=False)
A label dataset whose label names are the names of the subdirectories.
The label names are the names of the directories located one level below the root directory. All images under a subdirectory are assigned to the class named after that subdirectory. An image is parsed only when the function check_img_file returns True when called with the path to the image. If check_img_file is None, any path with an image extension is parsed.
Example
A directory structure should look like the one below.
root
|-- class_0
|   |-- img_0.png
|   |-- img_1.png
|
--- class_1
    |-- img_0.png
>>> from chainercv.datasets import DirectoryParsingLabelDataset
>>> dataset = DirectoryParsingLabelDataset('root')
>>> dataset.paths
['root/class_0/img_0.png', 'root/class_0/img_1.png', 'root/class_1/img_0.png']
>>> dataset.labels
array([0, 0, 1])
Parameters:
- root (string) – The root directory.
- check_img_file (callable) – A function to determine if a file should be included in the dataset.
- color (bool) – If True, this dataset reads images as color images. The default value is True.
- numerical_sort (bool) – Label names are sorted numerically. This means that label 2 comes before label 10, which is not the case when string sort is used. Regardless of this option, string sort is used for the order of files with the same label. The default value is False.
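The difference between string sort and numerical sort can be seen in a few lines. This is only an illustration with hypothetical label names; it is not taken from the library's implementation.

```python
# Hypothetical label names, as they might appear as subdirectory names.
label_names = ['10', '2', '1', '20']

# String sort compares character by character, so '10' precedes '2'.
string_sorted = sorted(label_names)

# Numerical sort compares integer values, so '2' precedes '10'.
numerical_sorted = sorted(label_names, key=int)

print(string_sorted)     # ['1', '10', '2', '20']
print(numerical_sorted)  # ['1', '2', '10', '20']
```

Numerical sort of this kind only works when every label name can be parsed as an integer.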
This dataset returns the following data.
name | shape | dtype | format
img | \((3, H, W)\) [1] | float32 | RGB, \([0, 255]\)
label | scalar | int32 | \([0, \#class - 1]\)
[1] \((1, H, W)\) if color = False.
directory_parsing_label_names
chainercv.datasets.directory_parsing_label_names(root, numerical_sort=False)
Get label names from the directories that are named by them.
The label names are the names of the directories located one level below the root directory.
The label names can be used together with DirectoryParsingLabelDataset. The index of a label name corresponds to the label id that the dataset uses to refer to the label.
Parameters:
- root (string) – The root directory.
- numerical_sort (bool) – Label names are sorted numerically. The default value is False.
Returns: Sorted names of classes.
Return type: list of strings
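The relationship between label names and label ids can be sketched with the standard library alone. The helpers parse_label_names and parse_paths_and_labels are hypothetical names, the default extension check is an assumption, and the sketch only looks at files directly inside each class directory, so it is a simplification rather than the library's code.

```python
import os
import tempfile


def parse_label_names(root, numerical_sort=False):
    # Label names are the subdirectories directly below root.
    names = [d for d in os.listdir(root)
             if os.path.isdir(os.path.join(root, d))]
    return sorted(names, key=int) if numerical_sort else sorted(names)


def parse_paths_and_labels(root, check_img_file=None, numerical_sort=False):
    # Hypothetical default check: accept a few common image extensions.
    if check_img_file is None:
        def check_img_file(path):
            return path.lower().endswith(('.png', '.jpg', '.jpeg', '.bmp'))
    paths, labels = [], []
    label_names = parse_label_names(root, numerical_sort)
    # The index of a label name is the label id.
    for label, name in enumerate(label_names):
        label_dir = os.path.join(root, name)
        # Files with the same label are ordered by string sort.
        for filename in sorted(os.listdir(label_dir)):
            path = os.path.join(label_dir, filename)
            if os.path.isfile(path) and check_img_file(path):
                paths.append(path)
                labels.append(label)
    return paths, labels, label_names


# Build a small directory layout in a temporary directory.
with tempfile.TemporaryDirectory() as root:
    for rel in ['class_0/img_0.png', 'class_0/img_1.png',
                'class_1/img_0.png', 'class_1/notes.txt']:
        path = os.path.join(root, *rel.split('/'))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, 'wb').close()
    paths, labels, label_names = parse_paths_and_labels(root)
    rel_paths = [os.path.relpath(p, root).replace(os.sep, '/') for p in paths]

print(label_names)  # ['class_0', 'class_1']
print(labels)       # [0, 0, 1]
```

The text file is skipped because the check function rejects it, and label_names[labels[i]] recovers the class name of the i-th path.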
MixUpSoftLabelDataset
class chainercv.datasets.MixUpSoftLabelDataset(dataset, n_class)
Dataset which returns mixed images and labels for mixup learning [2].
MixUpSoftLabelDataset mixes two pairs of labeled images fetched from the base dataset.
Unlike LabeledImageDatasets, label is a one-dimensional float array with at most two nonnegative weights (i.e. soft label). The sum of the two weights is one.
Example
We construct a mixup dataset from MNIST.
>>> from chainer.datasets import get_mnist
>>> from chainercv.datasets import SiameseDataset
>>> from chainercv.datasets import MixUpSoftLabelDataset
>>> mnist, _ = get_mnist()
>>> base_dataset = SiameseDataset(mnist, mnist)
>>> dataset = MixUpSoftLabelDataset(base_dataset, 10)
>>> mixed_image, mixed_label = dataset[0]
>>> mixed_label.shape
(10,)
>>> mixed_label.dtype
dtype('float32')
Parameters:
- dataset – The underlying dataset. The dataset returns img_0, label_0, img_1, label_1, which is a tuple containing two pairs of an image and a label. Typically, dataset is SiameseDataset. The shapes of images and labels should be constant.
- n_class (int) – The number of classes in the base dataset.
[2] Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin, David Lopez-Paz. mixup: Beyond Empirical Risk Minimization. arXiv 2017.
This dataset returns the following data.
name | shape | dtype | format
img | [3] | [3] | [3]
label | \((\#class,)\) | float32 | \([0, 1]\)
[3] Same as dataset.
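The mixing step can be illustrated as follows. This is a minimal sketch that uses plain Python lists in place of arrays; the helper mixup_sample is hypothetical, and drawing the mixing ratio uniformly from [0, 1) is an assumption rather than something stated above.

```python
import random


def mixup_sample(img_0, label_0, img_1, label_1, n_class, rng=random):
    # Mixing ratio drawn uniformly from [0, 1); this choice is an assumption.
    mix_ratio = rng.random()
    # Element-wise convex combination of the two (flattened) images.
    img = [mix_ratio * a + (1 - mix_ratio) * b for a, b in zip(img_0, img_1)]
    # Soft label: at most two nonzero weights that sum to one.
    label = [0.0] * n_class
    label[label_0] += mix_ratio
    label[label_1] += 1 - mix_ratio
    return img, label


rng = random.Random(0)
img, label = mixup_sample([0.0, 255.0], 3, [255.0, 0.0], 7, n_class=10, rng=rng)
print(sum(label))
```

When both samples carry the same label, the two weights fall on the same index, so the soft label has a single entry equal to one.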
SiameseDataset
class chainercv.datasets.SiameseDataset(dataset_0, dataset_1, pos_ratio=None, length=None, labels_0=None, labels_1=None)
A dataset that returns samples fetched from two datasets.
The dataset returns samples from the two base datasets. If pos_ratio is not None, SiameseDataset can be configured to return positive pairs at the ratio of pos_ratio and negative pairs at the ratio of 1 - pos_ratio. In this mode, the base datasets are assumed to be label datasets that return an image and a label as a sample.
Example
We construct a siamese dataset from MNIST.
>>> from chainer.datasets import get_mnist
>>> from chainercv.datasets import SiameseDataset
>>> mnist, _ = get_mnist()
>>> dataset = SiameseDataset(mnist, mnist, pos_ratio=0.3)
# The probability of the two samples having the same label
# is 0.3 as specified by pos_ratio.
>>> img_0, label_0, img_1, label_1 = dataset[0]
# The returned examples may change in the next
# call even if the index is the same as before
# because SiameseDataset picks examples randomly
# (e.g., img_0_new may differ from img_0).
>>> img_0_new, label_0_new, img_1_new, label_1_new = dataset[0]
Parameters:
- dataset_0 – The first base dataset.
- dataset_1 – The second base dataset.
- pos_ratio (float) – If this is not None, this dataset tries to construct positive pairs at the given rate. If None, this dataset randomly samples examples from the base datasets. The default value is None.
- length (int) – The length of this dataset. If None, the length of the first base dataset is the length of this dataset.
- labels_0 (numpy.ndarray) – The labels associated with the first base dataset. The length should be the same as the length of the first dataset. If this is None, the labels are automatically fetched using the following line of code: [ex[1] for ex in dataset_0]. By setting labels_0 and skipping the fetching iteration, the computation cost can be reduced. Also, if pos_ratio is None, this value is ignored. The default value is None. If labels_1 is specified and dataset_0 and dataset_1 are the same, labels_0 can be skipped.
- labels_1 (numpy.ndarray) – The labels associated with the second base dataset. If labels_0 is specified and dataset_0 and dataset_1 are the same, labels_1 can be skipped. Please consult the explanation for labels_0.
This dataset returns the following data.
name | shape | dtype | format
img_0 | [4] | [4] | [4]
label_0 | scalar | int32 | \([0, \#class - 1]\)
img_1 | [5] | [5] | [5]
label_1 | scalar | int32 | \([0, \#class - 1]\)
[4] Same as dataset_0.
[5] Same as dataset_1.
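One way to picture the effect of pos_ratio is the sampling sketch below. The helper sample_pair_indices and the toy labels are hypothetical, and the procedure (pick the first example uniformly, then pick a second example with a matching or non-matching label) is an assumed strategy, not necessarily how the class is implemented.

```python
import random


def sample_pair_indices(labels_0, labels_1, pos_ratio, rng):
    # Pick the first example uniformly at random.
    idx_0 = rng.randrange(len(labels_0))
    if pos_ratio is None:
        # No label constraint: sample the second example uniformly.
        return idx_0, rng.randrange(len(labels_1))
    # Decide whether this pair should be positive (same label).
    want_positive = rng.random() < pos_ratio
    # Candidates in the second dataset with the same (or a different) label.
    candidates = [i for i, l in enumerate(labels_1)
                  if (l == labels_0[idx_0]) == want_positive]
    return idx_0, rng.choice(candidates)


labels = [0, 0, 1, 1, 2, 2]  # hypothetical labels shared by both datasets
rng = random.Random(0)
pairs = [sample_pair_indices(labels, labels, 0.3, rng) for _ in range(2000)]
pos_fraction = sum(labels[i] == labels[j] for i, j in pairs) / len(pairs)
print(pos_fraction)  # close to 0.3
```

Precomputed label lists play the role of labels_0 and labels_1 here: the sampler never has to read an image to decide which pair to return.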
ADE20K
ADE20KSemanticSegmentationDataset
class chainercv.datasets.ADE20KSemanticSegmentationDataset(data_dir='auto', split='train')
Semantic segmentation dataset for ADE20K.
This is the ADE20K dataset distributed on the MIT Scene Parsing Benchmark website. It has 20,210 training images and 2,000 validation images.
Parameters:
- data_dir (string) – Path to the dataset directory. The directory should contain the ADEChallengeData2016 directory, and that directory should contain at least the images and annotations directories. If auto is given, the dataset is automatically downloaded into $CHAINER_DATASET_ROOT/pfnet/chainercv/ade20k.
- split ({'train', 'val'}) – Select from dataset splits used in MIT Scene Parsing Benchmark dataset (ADE20K).
This dataset returns the following data.
name | shape | dtype | format
img | \((3, H, W)\) | float32 | RGB, \([0, 255]\)
label | \((H, W)\) | int32 | \([0, \#class - 1]\)
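The way data_dir='auto' maps to a directory, and the layout expected of a manually prepared directory, can be sketched as below. Both helper names are hypothetical, and the fallback root used when CHAINER_DATASET_ROOT is unset is an assumption about Chainer's default rather than something stated in this section.

```python
import os


def resolve_data_dir(data_dir, dataset_name, environ=os.environ):
    # An explicit path is used as given.
    if data_dir != 'auto':
        return data_dir
    # Assumed fallback when CHAINER_DATASET_ROOT is not set.
    default_root = os.path.join(os.path.expanduser('~'), '.chainer', 'dataset')
    root = environ.get('CHAINER_DATASET_ROOT', default_root)
    return os.path.join(root, 'pfnet', 'chainercv', dataset_name)


def has_ade20k_layout(data_dir):
    # The documented layout: ADEChallengeData2016/{images,annotations}.
    base = os.path.join(data_dir, 'ADEChallengeData2016')
    return all(os.path.isdir(os.path.join(base, d))
               for d in ('images', 'annotations'))


print(resolve_data_dir('auto', 'ade20k', {'CHAINER_DATASET_ROOT': '/data'}))
```

The same resolution pattern applies to the other datasets on this page, with only the final path component changing.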
ADE20KTestImageDataset
class chainercv.datasets.ADE20KTestImageDataset(data_dir='auto')
Image dataset for test split of ADE20K.
This is an image dataset of the test split of the ADE20K dataset distributed on the MIT Scene Parsing Benchmark website. It has 3,352 test images.
Parameters: data_dir (string) – Path to the dataset directory. The directory should contain the release_test directory. If auto is given, the dataset is automatically downloaded into $CHAINER_DATASET_ROOT/pfnet/chainercv/ade20k.
This dataset returns the following data.
name | shape | dtype | format
img | \((3, H, W)\) | float32 | RGB, \([0, 255]\)
CamVid
CamVidDataset
class chainercv.datasets.CamVidDataset(data_dir='auto', split='train')
Semantic segmentation dataset for CamVid.
Parameters:
- data_dir (string) – Path to the root of the training data. If this is auto, this class will automatically download data for you under $CHAINER_DATASET_ROOT/pfnet/chainercv/camvid.
- split ({'train', 'val', 'test'}) – Select from dataset splits used in CamVid Dataset.
This dataset returns the following data.
name | shape | dtype | format
img | \((3, H, W)\) | float32 | RGB, \([0, 255]\)
label | \((H, W)\) | int32 | \([-1, \#class - 1]\)
Cityscapes
CityscapesSemanticSegmentationDataset
class chainercv.datasets.CityscapesSemanticSegmentationDataset(data_dir='auto', label_resolution=None, split='train', ignore_labels=True)
Semantic segmentation dataset for Cityscapes dataset.
Note
Please download the data manually because redistributing the Cityscapes dataset is not allowed.
Parameters:
- data_dir (string) – Path to the dataset directory. The directory should contain at least two directories, leftImg8bit and either gtFine or gtCoarse. If auto is given, it uses $CHAINER_DATASET_ROOT/pfnet/chainercv/cityscapes by default.
- label_resolution ({'fine', 'coarse'}) – The resolution of the labels. It should be either fine or coarse.
- split ({'train', 'val'}) – Select from dataset splits used in Cityscapes dataset.
- ignore_labels (bool) – If True, the labels marked ignoreInEval defined in the original cityscapesScripts will be replaced with -1 in the get_example() method. The default value is True.
This dataset returns the following data.
name | shape | dtype | format
img | \((3, H, W)\) | float32 | RGB, \([0, 255]\)
label | \((H, W)\) | int32 | \([-1, \#class - 1]\)
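Labels replaced with -1 are typically excluded when a prediction is scored. The sketch below shows one such metric on nested lists standing in for \((H, W)\) label maps; the function pixel_accuracy and the toy data are hypothetical and not part of the library.

```python
def pixel_accuracy(pred_label, gt_label, ignore_label=-1):
    # Count correct predictions over pixels whose ground truth is not ignored.
    correct = 0
    valid = 0
    for pred_row, gt_row in zip(pred_label, gt_label):
        for pred, gt in zip(pred_row, gt_row):
            if gt == ignore_label:
                continue
            valid += 1
            correct += int(pred == gt)
    return correct / valid if valid else float('nan')


# Hypothetical 2 x 3 label maps; -1 marks ignored pixels.
gt_label = [[0, 1, -1],
            [2, -1, 1]]
pred_label = [[0, 1, 5],
              [0, 3, 1]]
print(pixel_accuracy(pred_label, gt_label))  # 0.75
```

Whatever the model predicts at the two ignored pixels has no influence on the result.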
CityscapesTestImageDataset
class chainercv.datasets.CityscapesTestImageDataset(data_dir='auto')
Image dataset for test split of Cityscapes dataset.
Note
Please download the data manually because redistributing the Cityscapes dataset is not allowed.
Parameters: data_dir (string) – Path to the dataset directory. The directory should contain the leftImg8bit directory. If auto is given, it uses $CHAINER_DATASET_ROOT/pfnet/chainercv/cityscapes by default.
This dataset returns the following data.
name | shape | dtype | format
img | \((3, H, W)\) | float32 | RGB, \([0, 255]\)
CUB
CUBLabelDataset
class chainercv.datasets.CUBLabelDataset(data_dir='auto', return_bb=False, prob_map_dir='auto', return_prob_map=False)
Caltech-UCSD Birds-200-2011 dataset with annotated class labels.
Parameters:
- data_dir (string) – Path to the root of the training data. If this is auto, this class will automatically download data for you under $CHAINER_DATASET_ROOT/pfnet/chainercv/cub.
- return_bb (bool) – If True, this returns a bounding box around a bird. The default value is False.
- prob_map_dir (string) – Path to the root of the probability maps. If this is auto, this class will automatically download data for you under $CHAINER_DATASET_ROOT/pfnet/chainercv/cub.
- return_prob_map (bool) – Decide whether to include a probability map of the bird in a tuple served for a query. The default value is False.
This dataset returns the following data.
name | shape | dtype | format
img | \((3, H, W)\) | float32 | RGB, \([0, 255]\)
label | scalar | int32 | \([0, \#class - 1]\)
bb [6] | \((4,)\) | float32 | \((y_{min}, x_{min}, y_{max}, x_{max})\)
prob_map [7] | \((H, W)\) | float32 | \([0, 1]\)
[6] bb indicates the location of a bird. It is available if return_bb = True.
[7] prob_map indicates how likely a bird is located at each pixel. It is available if return_prob_map = True.
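The \((y_{min}, x_{min}, y_{max}, x_{max})\) ordering puts rows before columns, which is easy to get wrong when cropping. The sketch below crops a channel-first image stored as nested lists; crop_by_bb is a hypothetical helper, and treating the maxima as exclusive bounds is an assumption.

```python
def crop_by_bb(img, bb):
    # img is channel-first (C, H, W); bb is (y_min, x_min, y_max, x_max).
    y_min, x_min, y_max, x_max = (int(round(v)) for v in bb)
    # Slice rows with the y range and columns with the x range.
    return [[row[x_min:x_max] for row in channel[y_min:y_max]]
            for channel in img]


# Hypothetical 1 x 4 x 5 image whose pixel value encodes its position.
img = [[[10 * y + x for x in range(5)] for y in range(4)]]
bb = (1.0, 2.0, 3.0, 5.0)
patch = crop_by_bb(img, bb)
print(patch)  # [[[12, 13, 14], [22, 23, 24]]]
```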
CUBPointDataset
class chainercv.datasets.CUBPointDataset(data_dir='auto', return_bb=False, prob_map_dir='auto', return_prob_map=False)
Caltech-UCSD Birds-200-2011 dataset with annotated points.
Parameters:
- data_dir (string) – Path to the root of the training data. If this is auto, this class will automatically download data for you under $CHAINER_DATASET_ROOT/pfnet/chainercv/cub.
- return_bb (bool) – If True, this returns a bounding box around a bird. The default value is False.
- prob_map_dir (string) – Path to the root of the probability maps. If this is auto, this class will automatically download data for you under $CHAINER_DATASET_ROOT/pfnet/chainercv/cub.
- return_prob_map (bool) – Decide whether to include a probability map of the bird in a tuple served for a query. The default value is False.
This dataset returns the following data.
name | shape | dtype | format
img | \((3, H, W)\) | float32 | RGB, \([0, 255]\)
point | \((P, 2)\) | float32 | \((y, x)\)
mask | \((P,)\) | bool | –
bb [8] | \((4,)\) | float32 | \((y_{min}, x_{min}, y_{max}, x_{max})\)
prob_map [9] | \((H, W)\) | float32 | \([0, 1]\)
[8] bb indicates the location of a bird. It is available if return_bb = True.
[9] prob_map indicates how likely a bird is located at each pixel. It is available if return_prob_map = True.
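A common use of point and mask together is to keep only the annotated points. The sketch below assumes that a True entry in mask marks a valid point, which the table above does not spell out; select_valid_points and the coordinates are hypothetical.

```python
def select_valid_points(point, mask):
    # Keep only the (y, x) rows whose mask entry is True.
    return [tuple(p) for p, m in zip(point, mask) if m]


# Hypothetical points in (y, x) order and their mask.
point = [(12.0, 30.5), (0.0, 0.0), (44.0, 18.0)]
mask = [True, False, True]
valid = select_valid_points(point, mask)
print(valid)  # [(12.0, 30.5), (44.0, 18.0)]
```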
OnlineProducts
OnlineProductsDataset
class chainercv.datasets.OnlineProductsDataset(data_dir='auto', split='train')
Dataset class for Stanford Online Products Dataset.
The split selects the train or test split of the dataset as done in [10]. The train split contains the first 11318 classes and the test split contains the remaining 11316 classes.
[10] Hyun Oh Song, Yu Xiang, Stefanie Jegelka, Silvio Savarese. Deep Metric Learning via Lifted Structured Feature Embedding. arXiv 2015.
Parameters:
- data_dir (string) – Path to the root of the training data. If this is auto, this class will automatically download data for you under $CHAINER_DATASET_ROOT/pfnet/chainercv/online_products.
- split ({'train', 'test'}) – Select a split of the dataset.
This dataset returns the following data.
name | shape | dtype | format
img | \((3, H, W)\) | float32 | RGB, \([0, 255]\)
label | scalar | int32 | \([0, \#class - 1]\)
super_label | scalar | int32 | \([0, \#super\_class - 1]\)
PASCAL VOC
VOCBboxDataset
class chainercv.datasets.VOCBboxDataset(data_dir='auto', split='train', year='2012', use_difficult=False, return_difficult=False)
Bounding box dataset for PASCAL VOC.
Parameters:
- data_dir (string) – Path to the root of the training data. If this is auto, this class will automatically download data for you under $CHAINER_DATASET_ROOT/pfnet/chainercv/voc.
- split ({'train', 'val', 'trainval', 'test'}) – Select a split of the dataset. The test split is only available for the 2007 dataset.
- year ({'2007', '2012'}) – Use a dataset prepared for a challenge held in year.
- use_difficult (bool) – If True, use images that are labeled as difficult in the original annotation.
- return_difficult (bool) – If True, this dataset returns a boolean array that indicates whether bounding boxes are labeled as difficult or not. The default value is False.
This dataset returns the following data.
name | shape | dtype | format
img | \((3, H, W)\) | float32 | RGB, \([0, 255]\)
bbox [11] | \((R, 4)\) | float32 | \((y_{min}, x_{min}, y_{max}, x_{max})\)
label [11] | \((R,)\) | int32 | \([0, \#fg\_class - 1]\)
difficult (optional [12]) | \((R,)\) | bool | –
[11] If use_difficult = True, bbox and label contain difficult instances.
[12] difficult is available if return_difficult = True.
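The interplay of bbox, label and difficult can be pictured with a small filter. This is a sketch on plain lists with hypothetical values; filter_difficult is not a library function, it only mimics the effect that use_difficult is described as having.

```python
def filter_difficult(bbox, label, difficult, use_difficult=False):
    # With use_difficult=True every instance is kept.
    if use_difficult:
        return bbox, label, difficult
    # Otherwise drop the instances flagged as difficult.
    keep = [i for i, d in enumerate(difficult) if not d]
    return ([bbox[i] for i in keep],
            [label[i] for i in keep],
            [difficult[i] for i in keep])


# Hypothetical annotations for one image with R = 2 instances.
bbox = [(10.0, 20.0, 110.0, 220.0), (5.0, 5.0, 50.0, 60.0)]
label = [14, 11]
difficult = [False, True]
kept_bbox, kept_label, kept_difficult = filter_difficult(bbox, label, difficult)
print(kept_label)  # [14]
```

Keeping the three lists aligned matters because row i of bbox, label and difficult all describe the same instance.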
VOCInstanceSegmentationDataset
class chainercv.datasets.VOCInstanceSegmentationDataset(data_dir='auto', split='train')
Instance segmentation dataset for PASCAL VOC2012.
Parameters:
- data_dir (string) – Path to the root of the training data. If this is auto, this class will automatically download data for you under $CHAINER_DATASET_ROOT/pfnet/chainercv/voc.
- split ({'train', 'val', 'trainval'}) – Select a split of the dataset.
This dataset returns the following data.
name | shape | dtype | format
img | \((3, H, W)\) | float32 | RGB, \([0, 255]\)
mask | \((R, H, W)\) | bool | –
label | \((R,)\) | int32 | \([0, \#fg\_class - 1]\)
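Each of the R instance masks can be reduced to a bounding box in the \((y_{min}, x_{min}, y_{max}, x_{max})\) format used elsewhere on this page. The helper below is a local sketch on nested lists, not the library's own utility, and it assumes exclusive maxima and a degenerate box for an empty mask.

```python
def mask_to_bbox(mask):
    # mask has shape (R, H, W); returns R boxes (y_min, x_min, y_max, x_max).
    bbox = []
    for instance in mask:
        # Rows and columns that contain at least one True pixel.
        ys = [y for y, row in enumerate(instance) if any(row)]
        xs = [x for row in instance for x, v in enumerate(row) if v]
        if not ys:
            bbox.append((0, 0, 0, 0))  # empty mask: degenerate box
            continue
        # Maxima are exclusive so that height = y_max - y_min.
        bbox.append((min(ys), min(xs), max(ys) + 1, max(xs) + 1))
    return bbox


# Hypothetical mask with R = 1 instance on a 3 x 4 image.
mask = [[[False, False, False, False],
         [False, True, True, False],
         [False, True, False, False]]]
print(mask_to_bbox(mask))  # [(1, 1, 3, 3)]
```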
VOCSemanticSegmentationDataset
class chainercv.datasets.VOCSemanticSegmentationDataset(data_dir='auto', split='train')
Semantic segmentation dataset for PASCAL VOC2012.
Parameters:
- data_dir (string) – Path to the root of the training data. If this is auto, this class will automatically download data for you under $CHAINER_DATASET_ROOT/pfnet/chainercv/voc.
- split ({'train', 'val', 'trainval'}) – Select a split of the dataset.
This dataset returns the following data.
name | shape | dtype | format
img | \((3, H, W)\) | float32 | RGB, \([0, 255]\)
label | \((H, W)\) | int32 | \([-1, \#class - 1]\)
Semantic Boundaries Dataset
SBDInstanceSegmentationDataset
class chainercv.datasets.SBDInstanceSegmentationDataset(data_dir='auto', split='train')
Instance segmentation dataset for Semantic Boundaries Dataset SBD.
Parameters:
- data_dir (string) – Path to the root of the training data. If this is auto, this class will automatically download data for you under $CHAINER_DATASET_ROOT/pfnet/chainercv/sbd.
- split ({'train', 'val', 'trainval'}) – Select a split of the dataset.
This dataset returns the following data.
name | shape | dtype | format
img | \((3, H, W)\) | float32 | RGB, \([0, 255]\)
mask | \((R, H, W)\) | bool | –
label | \((R,)\) | int32 | \([0, \#fg\_class - 1]\)