Utils
Bounding Box Utilities
bbox_iou
chainercv.utils.bbox_iou(bbox_a, bbox_b)
Calculate the Intersection over Union (IoU) between bounding boxes.
IoU is the ratio of the area of the intersection to the area of the union of two bounding boxes.
This function accepts both numpy.ndarray and cupy.ndarray as inputs. Note that bbox_a and bbox_b must be of the same type. The output has the same type as the inputs.
Parameters:
Returns: An array of shape \((N, K)\). The element at index \((n, k)\) contains the IoU between the \(n\)-th bounding box in bbox_a and the \(k\)-th bounding box in bbox_b.
Return type:
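The ratio described above can be sketched in plain Python. This is an illustration only, not ChainerCV's implementation: it uses nested lists instead of arrays, the name bbox_iou_sketch is hypothetical, and it assumes boxes are given as (y_min, x_min, y_max, x_max).

```python
def bbox_iou_sketch(bbox_a, bbox_b):
    """Pairwise IoU between two lists of boxes (hypothetical helper).

    Each box is assumed to be (y_min, x_min, y_max, x_max).
    Returns a nested list with len(bbox_a) rows and len(bbox_b) columns.
    """
    def area(box):
        return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])

    ious = []
    for a in bbox_a:
        row = []
        for b in bbox_b:
            # Corners of the intersection rectangle.
            top, left = max(a[0], b[0]), max(a[1], b[1])
            bottom, right = min(a[2], b[2]), min(a[3], b[3])
            inter = max(0.0, bottom - top) * max(0.0, right - left)
            union = area(a) + area(b) - inter
            row.append(inter / union if union > 0 else 0.0)
        ious.append(row)
    return ious
```

For one box in bbox_a and three boxes in bbox_b, the result has one row and three columns, matching the \((N, K)\) shape described above.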
non_maximum_suppression
chainercv.utils.non_maximum_suppression(bbox, thresh, score=None, limit=None)
Suppress bounding boxes according to their IoUs.
This method checks each bounding box sequentially and selects it if the Intersection over Union (IoU) between it and every previously selected bounding box is less than thresh. This method is mainly used as postprocessing for object detection. Bounding boxes are considered in descending order of score. If score is not provided as an argument, the bounding boxes are considered in ascending order of index.
The bounding boxes are expected to be packed into a two-dimensional tensor of shape \((R, 4)\), where \(R\) is the number of bounding boxes in the image. The second axis holds the attributes of each bounding box, (y_min, x_min, y_max, x_max), which are the coordinates of the top-left and bottom-right vertices.
score is a float array of shape \((R,)\). Each score indicates the confidence of a prediction.
This function accepts both numpy.ndarray and cupy.ndarray as inputs. Note that bbox and score must be of the same type. The output has the same type as the inputs.
Parameters:
- bbox (array) – Bounding boxes to be transformed. The shape is \((R, 4)\), where \(R\) is the number of bounding boxes.
- thresh (float) – Threshold of IoUs.
- score (array) – An array of confidences whose shape is \((R,)\).
- limit (int) – The upper bound on the number of output bounding boxes. If it is not specified, this method selects as many bounding boxes as possible.
Returns: An array of indices of the selected bounding boxes, sorted by score in descending order. The shape of this array is \((K,)\) and its dtype is numpy.int32. Note that \(K \leq R\).
Return type:
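The greedy selection described above can be sketched as follows. This is a minimal plain-Python illustration under stated assumptions, not the library's code: it works on lists rather than arrays, and the names nms_sketch and _iou are hypothetical.

```python
def _iou(a, b):
    """IoU of two boxes given as (y_min, x_min, y_max, x_max)."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms_sketch(bbox, thresh, score=None, limit=None):
    """Return indices of the boxes kept by greedy suppression."""
    order = list(range(len(bbox)))
    if score is not None:
        # Visit boxes from the highest score to the lowest.
        order.sort(key=lambda i: score[i], reverse=True)
    selected = []
    for i in order:
        if limit is not None and len(selected) >= limit:
            break
        # Keep the box only if it overlaps every kept box by less than thresh.
        if all(_iou(bbox[i], bbox[j]) < thresh for j in selected):
            selected.append(i)
    return selected
```

With boxes [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)], scores [0.6, 0.9, 0.8] and thresh 0.5, the second box is kept first, the third does not overlap it, and the first is suppressed because its IoU with the second is about 0.68.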
Download Utilities
cached_download
chainercv.utils.cached_download(url)
Downloads a file and caches it.
This differs from the original cached_download in that the download progress is reported.
It downloads a file from the URL if there is no corresponding cache. After the download, this function stores a cache in the directory under the dataset root (see set_dataset_root()). If a cache for the given URL already exists, it returns the path to the cache without downloading the file again.
Parameters: url (str) – URL to download from.
Returns: Path to the downloaded file.
Return type: str
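The download-once behaviour described above can be sketched as follows. This is an illustration under assumptions rather than the library's code: the names cached_download_sketch, cache_dir and fetch are hypothetical, the download itself is delegated to a caller-supplied fetch(url, path) callable, and naming the cache file after a hash of the URL is an assumption made for the sketch.

```python
import hashlib
import os


def cached_download_sketch(url, cache_dir, fetch):
    """Return a local path for url, calling fetch only on a cache miss.

    fetch is a hypothetical callable fetch(url, path) that writes the
    downloaded content to path.
    """
    os.makedirs(cache_dir, exist_ok=True)
    # Assumption: the cache file is named after a hash of the URL.
    name = hashlib.sha256(url.encode('utf-8')).hexdigest()
    cache_path = os.path.join(cache_dir, name)
    if not os.path.exists(cache_path):
        fetch(url, cache_path)
    return cache_path
```

Calling the sketch twice with the same URL and cache directory invokes fetch once and returns the same path both times.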
download_model
chainercv.utils.download_model(url)
Downloads a model file and puts it under the model directory.
It downloads a file from the URL and puts it under the model directory. For example, if url is http://example.com/subdir/model.npz, the pretrained weights file will be saved to $CHAINER_DATASET_ROOT/pfnet/chainercv/models/model.npz. If a file already exists at the destination path, it returns the path without downloading the file again.
Parameters: url (str) – URL to download from.
Returns: Path to the downloaded file.
Return type: str
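The mapping from URL to destination path in the example above can be sketched as follows. The function name model_destination_sketch and the explicit dataset_root argument are hypothetical; the sketch only derives the path and does not download anything.

```python
import os


def model_destination_sketch(url, dataset_root):
    """Build the destination path from the last component of the URL."""
    filename = os.path.basename(url)
    return os.path.join(
        dataset_root, 'pfnet', 'chainercv', 'models', filename)
```

Note that the subdir component of the example URL does not appear in the destination: only the file name model.npz is kept.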
Image Utilities
read_image
chainercv.utils.read_image(path, dtype=numpy.float32, color=True)
Read an image from a file.
This function reads an image from the given file. The image is in CHW format and its values are in the range \([0, 255]\). If color = True, the order of the channels is RGB.
Parameters:
- path (str) – The path to an image file.
- dtype – The type of the array. The default value is float32.
- color (bool) – This option determines the number of channels. If True, the number of channels is three, and the order of the channels is RGB. This is the default behaviour. If False, this function returns a grayscale image.
Returns: An image.
Return type:
Iterator Utilities
apply_prediction_to_iterator
chainercv.utils.apply_prediction_to_iterator(predict, iterator, hook=None)
Apply a prediction function/method to an iterator.
This function applies a prediction function/method to an iterator. It assumes that the iterator returns a batch of images or a batch of tuples whose first element is an image. If it returns a batch of tuples, the remaining elements are treated as ground truth values.
>>> imgs = next(iterator)
>>> # imgs: [img]
or
>>> batch = next(iterator)
>>> # batch: [(img, gt_val0, gt_val1)]
This function applies predict() to a batch of images and gets the predicted value(s). predict() should take a batch of images and return a batch of prediction values or a tuple of batches of prediction values.
>>> pred_vals0 = predict(imgs)
>>> # pred_vals0: [pred_val0]
or
>>> pred_vals0, pred_vals1 = predict(imgs)
>>> # pred_vals0: [pred_val0]
>>> # pred_vals1: [pred_val1]
Here is an example that applies a pretrained Faster R-CNN to the PASCAL VOC dataset.
>>> from chainer import iterators
>>>
>>> from chainercv.datasets import VOCDetectionDataset
>>> from chainercv.links import FasterRCNNVGG16
>>> from chainercv.utils import apply_prediction_to_iterator
>>>
>>> dataset = VOCDetectionDataset(year='2007', split='test')
>>> # next(iterator) -> [(img, gt_bbox, gt_label)]
>>> iterator = iterators.SerialIterator(
...     dataset, 2, repeat=False, shuffle=False)
>>>
>>> # model.predict([img]) -> ([pred_bbox], [pred_label], [pred_score])
>>> model = FasterRCNNVGG16(pretrained_model='voc07')
>>>
>>> imgs, pred_values, gt_values = apply_prediction_to_iterator(
...     model.predict, iterator)
>>>
>>> # pred_values contains three iterators
>>> pred_bboxes, pred_labels, pred_scores = pred_values
>>> # gt_values contains two iterators
>>> gt_bboxes, gt_labels = gt_values
Parameters:
- predict – A callable that takes a batch of images and returns a prediction.
- iterator (chainer.Iterator) – An iterator. Each sample should have an image as its first element. This image is passed to predict() as an argument. The remaining elements are treated as ground truth values.
- hook – A callable that is called after each iteration. imgs, pred_values and gt_values are passed as arguments. Note that these values do not contain data from the previous iterations.
Returns: This function returns an iterator and two tuples of iterators: imgs, pred_values and gt_values.
- imgs: An iterator that returns an image.
- pred_values: A tuple of iterators. Each iterator returns the corresponding predicted value. For example, if predict() returns ([pred_val0], [pred_val1]), next(pred_values[0]) and next(pred_values[1]) will be pred_val0 and pred_val1.
- gt_values: A tuple of iterators. Each iterator returns the corresponding ground truth value. For example, if the iterator returns [(img, gt_val0, gt_val1)], next(gt_values[0]) and next(gt_values[1]) will be gt_val0 and gt_val1. If the input iterator does not give any ground truth values, this tuple will be empty.
Return type: An iterator and two tuples of iterators
unzip
chainercv.utils.unzip(iterable)
Converts an iterable of tuples into a tuple of iterators.
This function converts an iterable of tuples into a tuple of iterators. This is the inverse of six.moves.zip().
>>> from chainercv.utils import unzip
>>> data = [(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd'), (4, 'e')]
>>> int_iter, str_iter = unzip(data)
>>>
>>> next(int_iter)  # 0
>>> next(int_iter)  # 1
>>> next(int_iter)  # 2
>>>
>>> next(str_iter)  # 'a'
>>> next(str_iter)  # 'b'
>>> next(str_iter)  # 'c'
Parameters: iterable (iterable) – An iterable of tuples. All tuples should have the same length.
Returns: A tuple of iterators, one for each element position of the input tuples. Note that each iterator stores values until they are popped. To reduce memory usage, it is recommended to delete unused iterators.
Return type: tuple of iterators
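One way to obtain this behaviour can be sketched with itertools.tee from the standard library. This is not ChainerCV's implementation, and the name unzip_sketch is hypothetical. It peeks at the first tuple to learn how many iterators to create, and tee buffers items that one iterator has consumed and another has not, which mirrors the memory note above.

```python
import itertools
import operator


def unzip_sketch(iterable):
    """Split an iterable of equal-length tuples into a tuple of iterators."""
    it = iter(iterable)
    try:
        first = next(it)
    except StopIteration:
        # Nothing to inspect, so the number of iterators is unknown.
        return ()
    # Put the peeked tuple back, then make one independent copy per position.
    copies = itertools.tee(itertools.chain([first], it), len(first))
    # itemgetter(i) binds the position eagerly; map keeps evaluation lazy.
    return tuple(
        map(operator.itemgetter(i), copy) for i, copy in enumerate(copies))
```

Each returned iterator can be advanced independently of the others, as in the example above.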
Testing Utilities
assert_is_bbox
chainercv.utils.assert_is_bbox(bbox, size=None)
Checks if bounding boxes satisfy bounding box format.
This function checks whether the given bounding boxes satisfy the bounding box format. If they do not, this function raises an AssertionError.
Parameters:
- bbox (ndarray) – Bounding boxes to be checked.
- size (tuple of ints) – The size of an image. If this argument is specified, each bounding box should be within the image.
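The kind of check this function performs can be sketched in plain Python. The exact conditions the library tests are not listed here, so the sketch assumes the following: boxes are (y_min, x_min, y_max, x_max), minimum coordinates must not exceed maximum coordinates, and size is given as (height, width). The name assert_is_bbox_sketch is hypothetical.

```python
def assert_is_bbox_sketch(bbox, size=None):
    """Raise AssertionError if any box violates the assumed format."""
    for box in bbox:
        assert len(box) == 4, 'each box needs four coordinates'
        y_min, x_min, y_max, x_max = box
        assert y_min <= y_max and x_min <= x_max, 'min must not exceed max'
        if size is not None:
            height, width = size  # assumed order: (height, width)
            assert 0 <= y_min and y_max <= height, 'box exceeds image height'
            assert 0 <= x_min and x_max <= width, 'box exceeds image width'
```

In a test, such a helper is called directly on a function's output, so a malformed box fails the test with an AssertionError.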
assert_is_detection_dataset
chainercv.utils.assert_is_detection_dataset(dataset, n_fg_class, n_example=None)
Checks if a dataset satisfies detection dataset APIs.
This function checks whether a given dataset satisfies the detection dataset APIs. If it does not, this function raises an AssertionError.
Parameters:
assert_is_detection_link
chainercv.utils.assert_is_detection_link(link, n_fg_class)
Checks if a link satisfies detection link APIs.
This function checks whether a given link satisfies the detection link APIs. If it does not, this function raises an AssertionError.
Parameters:
- link – A link to be checked.
- n_fg_class (int) – The number of foreground classes.
assert_is_image
chainercv.utils.assert_is_image(img, color=True, check_range=True)
Checks if an image satisfies image format.
This function checks whether a given image satisfies the image format. If it does not, this function raises an AssertionError.
Parameters:
- img (ndarray) – An image to be checked.
- color (bool) – A boolean that determines the expected number of channels. If it is True, the number of channels should be 3. Otherwise, it should be 1. The default value is True.
- check_range (bool) – A boolean that determines whether the range of values is checked. If it is True, the values of the image must be in \([0, 255]\). Otherwise, this function does not check the range. The default value is True.
assert_is_semantic_segmentation_dataset
chainercv.utils.assert_is_semantic_segmentation_dataset(dataset, n_class, n_example=None)
Checks if a dataset satisfies semantic segmentation dataset APIs.
This function checks whether a given dataset satisfies the semantic segmentation dataset APIs. If it does not, this function raises an AssertionError.
Parameters:
assert_is_semantic_segmentation_link
chainercv.utils.assert_is_semantic_segmentation_link(link, n_class)
Checks if a link satisfies semantic segmentation link APIs.
This function checks whether a given link satisfies the semantic segmentation link APIs. If it does not, this function raises an AssertionError.
Parameters:
- link – A link to be checked.
- n_class (int) – The number of classes including background.
ConstantStubLink
class chainercv.utils.ConstantStubLink(outputs)
A chainer.Link that returns constant value(s).
This is a chainer.Link that returns constant chainer.Variable(s) when the __call__() method is called.
Parameters: outputs (ndarray or tuple of ndarray) – The value(s) of the variable(s) returned by __call__(). If an array is specified, __call__() returns a chainer.Variable. Otherwise, it returns a tuple of chainer.Variable.
generate_random_bbox
chainercv.utils.generate_random_bbox(n, img_size, min_length, max_length)
Generate valid bounding boxes with random position and shape.
Parameters:
Returns: Coordinates of bounding boxes. Its shape is \((R, 4)\), where \(R\) equals n. The second axis contains \(y_{min}, x_{min}, y_{max}, x_{max}\), where \(min\_length \leq y_{max} - y_{min} < max\_length\) and \(min\_length \leq x_{max} - x_{min} < max\_length\).
Return type:
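Boxes that satisfy the length constraints above could be drawn as in the following sketch. It is an illustration under assumptions, not the library's code: img_size is taken to be (height, width), the seed argument and the name generate_random_bbox_sketch are hypothetical, and the result is a list of tuples rather than an array.

```python
import random


def generate_random_bbox_sketch(n, img_size, min_length, max_length,
                                seed=None):
    """Draw n boxes as (y_min, x_min, y_max, x_max) tuples."""
    rng = random.Random(seed)
    height, width = img_size  # assumed order: (height, width)
    boxes = []
    for _ in range(n):
        # Leave room for the longest allowed side so the box stays inside.
        y_min = rng.uniform(0, height - max_length)
        x_min = rng.uniform(0, width - max_length)
        y_max = y_min + rng.uniform(min_length, max_length)
        x_max = x_min + rng.uniform(min_length, max_length)
        boxes.append((y_min, x_min, y_max, x_max))
    return boxes
```

Because the minimum corner is drawn from a range shortened by max_length, every box stays inside the image as long as the image is at least max_length in each dimension.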