Utils

Bounding Box Utilities

bbox_iou

chainercv.utils.bbox_iou(bbox_a, bbox_b)

Calculate the Intersection over Union (IoU) between bounding boxes.

IoU is the ratio of the area of the intersection to the area of the union.

This function accepts both numpy.ndarray and cupy.ndarray as inputs. Note that bbox_a and bbox_b must be of the same type. The output has the same type as the inputs.

Parameters:
  • bbox_a (array) – An array whose shape is \((N, 4)\). \(N\) is the number of bounding boxes. The dtype should be numpy.float32.
  • bbox_b (array) – An array similar to bbox_a, whose shape is \((K, 4)\). The dtype should be numpy.float32.
Returns:

An array whose shape is \((N, K)\). The element at index \((n, k)\) is the IoU between the \(n\) th bounding box in bbox_a and the \(k\) th bounding box in bbox_b.

Return type:

array
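
The pairwise computation described above can be sketched in plain NumPy. This is an illustrative re-implementation, not ChainerCV's own code. The name bbox_iou_sketch is hypothetical, and the sketch assumes each box is stored as (y_min, x_min, y_max, x_max).

```python
import numpy as np


def bbox_iou_sketch(bbox_a, bbox_b):
    """Pairwise IoU for boxes stored as (y_min, x_min, y_max, x_max)."""
    # Corners of every pairwise intersection, each of shape (N, K, 2).
    tl = np.maximum(bbox_a[:, None, :2], bbox_b[None, :, :2])
    br = np.minimum(bbox_a[:, None, 2:], bbox_b[None, :, 2:])
    # The intersection area is zero when two boxes do not overlap.
    area_i = np.prod(br - tl, axis=2) * (tl < br).all(axis=2)
    area_a = np.prod(bbox_a[:, 2:] - bbox_a[:, :2], axis=1)
    area_b = np.prod(bbox_b[:, 2:] - bbox_b[:, :2], axis=1)
    return area_i / (area_a[:, None] + area_b[None, :] - area_i)


bbox_a = np.array([[0, 0, 2, 2], [0, 0, 1, 1]], dtype=np.float32)
bbox_b = np.array([[1, 1, 3, 3]], dtype=np.float32)
iou = bbox_iou_sketch(bbox_a, bbox_b)  # shape (2, 1)
```

The first box of bbox_a overlaps the box in bbox_b on a unit square, so their IoU is 1 / (4 + 4 - 1); the second box only touches it at a corner and gets an IoU of zero.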

non_maximum_suppression

chainercv.utils.non_maximum_suppression(bbox, thresh, score=None, limit=None)

Suppress bounding boxes according to their IoUs.

This method checks each bounding box sequentially and selects it if its Intersection over Union (IoU) with every previously selected bounding box is less than thresh. This method is mainly used for postprocessing in object detection. Bounding boxes are considered in descending order of score. If score is not provided as an argument, the bounding boxes are considered in ascending order of index.

The bounding boxes are expected to be packed into a two dimensional tensor of shape \((R, 4)\), where \(R\) is the number of bounding boxes in the image. The second axis represents attributes of the bounding box. They are \((y_{min}, x_{min}, y_{max}, x_{max})\), where the four attributes are coordinates of the top left and the bottom right vertices.

score is a float array of shape \((R,)\). Each score indicates confidence of prediction.

This function accepts both numpy.ndarray and cupy.ndarray as inputs. Note that bbox and score must be of the same type. The type of the output is the same as that of the inputs.

Parameters:
  • bbox (array) – Bounding boxes to be transformed. The shape is \((R, 4)\). \(R\) is the number of bounding boxes.
  • thresh (float) – Threshold of IoUs.
  • score (array) – An array of confidences whose shape is \((R,)\).
  • limit (int) – The upper bound of the number of the output bounding boxes. If it is not specified, this method selects as many bounding boxes as possible.
Returns:

An array with indices of bounding boxes that are selected. They are sorted by the scores of bounding boxes in descending order. The shape of this array is \((K,)\) and its dtype is numpy.int32. Note that \(K \leq R\).

Return type:

array
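
The greedy selection described above can be sketched in NumPy as follows. This is a simplified illustration under stated assumptions, not ChainerCV's implementation: nms_sketch and _iou_one_to_many are hypothetical names, boxes are assumed to be (y_min, x_min, y_max, x_max), and the handling of tied scores is left unspecified.

```python
import numpy as np


def _iou_one_to_many(box, boxes):
    """IoU between one box and an array of boxes of shape (M, 4)."""
    tl = np.maximum(box[:2], boxes[:, :2])
    br = np.minimum(box[2:], boxes[:, 2:])
    area_i = np.prod(br - tl, axis=1) * (tl < br).all(axis=1)
    area = np.prod(box[2:] - box[:2])
    areas = np.prod(boxes[:, 2:] - boxes[:, :2], axis=1)
    return area_i / (area + areas - area_i)


def nms_sketch(bbox, thresh, score=None, limit=None):
    # Visit boxes from the highest score down; without scores, use index order.
    if score is None:
        order = np.arange(len(bbox))
    else:
        order = np.argsort(score)[::-1]
    selected = []
    for i in order:
        # Skip a box whose IoU with any already selected box reaches thresh.
        if selected and (_iou_one_to_many(bbox[i], bbox[selected]) >= thresh).any():
            continue
        selected.append(i)
        if limit is not None and len(selected) >= limit:
            break
    return np.array(selected, dtype=np.int32)


bbox = np.array(
    [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=np.float32)
score = np.array([0.1, 0.9, 0.5], dtype=np.float32)
keep = nms_sketch(bbox, 0.5, score)
keep_unscored = nms_sketch(bbox, 0.5)
keep_limited = nms_sketch(bbox, 0.5, score, limit=1)
```

Here the first two boxes overlap heavily (IoU of 81 / 119), so only the higher-ranked one of the pair survives, while the third box is disjoint from both and is always kept unless limit cuts the selection short.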

Download Utilities

cached_download

chainercv.utils.cached_download(url)

Downloads a file and caches it.

This is different from the original cached_download() in that the download progress is reported.

It downloads a file from the URL if there is no corresponding cache. After the download, this function stores a cache to the directory under the dataset root (see set_dataset_root()). If there is already a cache for the given URL, it just returns the path to the cache without downloading the same file.

Parameters:url (str) – URL to download from.
Returns:Path to the downloaded file.
Return type:str
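
The download-or-reuse behaviour described above can be sketched with the standard library alone. This is not ChainerCV's code: cached_fetch_sketch, the fetch callback, and the hash-based file naming are all assumptions made for illustration, and a fake downloader is used so that the example runs offline.

```python
import hashlib
import os
import tempfile


def cached_fetch_sketch(url, cache_root, fetch):
    """Return a cached file path for url, calling fetch(url, path) on a miss."""
    os.makedirs(cache_root, exist_ok=True)
    # Hypothetical naming scheme: one cache file per URL, keyed by a hash.
    name = hashlib.sha256(url.encode('utf-8')).hexdigest()
    cache_path = os.path.join(cache_root, name)
    if not os.path.exists(cache_path):
        fetch(url, cache_path)
    return cache_path


calls = []


def fake_fetch(url, path):
    # Stand-in for a real download; it only records the call and writes bytes.
    calls.append(url)
    with open(path, 'wb') as f:
        f.write(b'payload')


cache_root = tempfile.mkdtemp()
first = cached_fetch_sketch('http://example.com/data.bin', cache_root, fake_fetch)
second = cached_fetch_sketch('http://example.com/data.bin', cache_root, fake_fetch)
```

The second call returns the same path without invoking the downloader again, which mirrors the caching contract stated above.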

download_model

chainercv.utils.download_model(url)

Downloads a model file and puts it under the model directory.

It downloads a file from the URL and puts it under the model directory. For example, if url is http://example.com/subdir/model.npz, the pretrained weights file will be saved to $CHAINER_DATASET_ROOT/pfnet/chainercv/models/model.npz. If there is already a file at the destination path, it just returns the path without downloading the same file.

Parameters:url (str) – URL to download from.
Returns:Path to the downloaded file.
Return type:str
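
The mapping from URL to destination path in the example above can be sketched as follows. The helper model_path_sketch is hypothetical, and the dataset root is passed in explicitly rather than read from the environment; the sketch only reproduces the path layout shown in the description.

```python
import os


def model_path_sketch(url, dataset_root):
    # Keep only the final path component of the URL as the file name.
    file_name = url.rstrip('/').split('/')[-1]
    return os.path.join(dataset_root, 'pfnet', 'chainercv', 'models', file_name)


path = model_path_sketch('http://example.com/subdir/model.npz', '/data')
```

Note that the subdir component of the URL does not appear in the result: only the file name is kept.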

extractall

chainercv.utils.extractall(file_path, destination, ext)

Extracts an archive file.

This function extracts an archive file to a destination.

Parameters:
  • file_path (str) – The path of a file to be extracted.
  • destination (str) – A directory path. The archive file will be extracted under this directory.
  • ext (str) – An extension suffix of the archive file. This function supports '.zip', '.tar', '.gz' and '.tgz'.
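
A minimal sketch of dispatching on the extension, using only the standard library, might look like this. It is not ChainerCV's implementation: extractall_sketch is a hypothetical name, and the sketch assumes that '.gz' refers to a gzip-compressed tar archive.

```python
import os
import tarfile
import tempfile
import zipfile


def extractall_sketch(file_path, destination, ext):
    if ext == '.zip':
        with zipfile.ZipFile(file_path, 'r') as archive:
            archive.extractall(destination)
    elif ext in ('.tar', '.gz', '.tgz'):
        # Mode 'r' lets tarfile detect the compression transparently.
        with tarfile.open(file_path, 'r') as archive:
            archive.extractall(destination)
    else:
        raise ValueError('unsupported extension: {}'.format(ext))


# Build a tiny zip archive in a temporary directory and extract it.
work_dir = tempfile.mkdtemp()
archive_path = os.path.join(work_dir, 'sample.zip')
with zipfile.ZipFile(archive_path, 'w') as archive:
    archive.writestr('hello.txt', 'hello')
destination = os.path.join(work_dir, 'out')
extractall_sketch(archive_path, destination, '.zip')
```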

Image Utilities

read_image

chainercv.utils.read_image(path, dtype=numpy.float32, color=True)

Read an image from a file.

This function reads an image from the given file. The image is in CHW format and its values are in the range \([0, 255]\). If color = True, the order of the channels is RGB.

Parameters:
  • path (str) – The path of an image file.
  • dtype – The type of array. The default value is float32.
  • color (bool) – This option determines the number of channels. If True, the number of channels is three. In this case, the order of the channels is RGB. This is the default behaviour. If False, this function returns a grayscale image.
Returns:

An image.

Return type:

ndarray
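
To make the CHW layout concrete, the sketch below converts an array in the more common HWC layout (height, width, channel) into CHW. It does not read a file and is not part of ChainerCV; to_chw_sketch is a hypothetical helper, and the input image is synthetic.

```python
import numpy as np


def to_chw_sketch(img_hwc, dtype=np.float32):
    img = np.asarray(img_hwc, dtype=dtype)
    if img.ndim == 2:
        # A grayscale image gets a channel axis of length one.
        img = img[:, :, None]
    # Move the channel axis to the front: (H, W, C) -> (C, H, W).
    return img.transpose((2, 0, 1))


# A 4 x 6 pure-red RGB image with values in [0, 255].
img_hwc = np.zeros((4, 6, 3), dtype=np.uint8)
img_hwc[:, :, 0] = 255
img_chw = to_chw_sketch(img_hwc)  # shape (3, 4, 6), dtype float32
```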

tile_images

chainercv.utils.tile_images(imgs, n_col, pad=2, fill=0)

Make a tile of images.

Parameters:
  • imgs (numpy.ndarray) – A batch of images whose shape is BCHW.
  • n_col (int) – The number of columns in a tile.
  • pad (int or tuple of two ints) – pad_y, pad_x. This is the amounts of padding in y and x directions. If this is an integer, the amounts of padding in the two directions are the same. The default value is 2.
  • fill (float, tuple or ndarray) – The value of padded pixels. If it is numpy.ndarray, its shape should be \((C, 1, 1)\), where \(C\) is the number of channels of imgs.
Returns:

An image array in CHW format. The size of this image is \(((H + pad_{y}) \times \lceil B / n_{col} \rceil, (W + pad_{x}) \times n_{col})\).

Return type:

ndarray
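
The output size formula above can be checked with a few lines of arithmetic. The helper tiled_size_sketch is hypothetical and only evaluates the formula; it does not tile any images.

```python
import math


def tiled_size_sketch(n_img, height, width, n_col, pad=2):
    # An integer pad applies to both directions; a tuple is (pad_y, pad_x).
    pad_y, pad_x = (pad, pad) if isinstance(pad, int) else pad
    n_row = math.ceil(n_img / n_col)
    return (height + pad_y) * n_row, (width + pad_x) * n_col


# Five 32 x 48 images in three columns need two rows.
size = tiled_size_sketch(n_img=5, height=32, width=48, n_col=3)
size_tuple_pad = tiled_size_sketch(5, 32, 48, 3, pad=(1, 4))
```

With the default padding of 2, the result is (34 × 2, 50 × 3) = (68, 150).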

write_image

chainercv.utils.write_image(img, path)

Save an image to a file.

This function saves an image to the given file. The image is in CHW format and its values are in the range \([0, 255]\).

Parameters:
  • img (ndarray) – An image to be saved.
  • path (str) – The path of an image file.

Iterator Utilities

apply_to_iterator

chainercv.utils.apply_to_iterator(func, iterator, n_input=1, hook=None)

Apply a function/method to batches from an iterator.

This function applies a function/method to an iterator of batches.

It assumes that the iterator iterates over a collection of tuples that contain inputs to func(). Additionally, the tuples may contain values that are not used by func(). For convenience, the iterator may also iterate over a collection of inputs that are not tuples. Here is an illustration of the expected behavior of the iterator. This behavior is the same as that of chainer.Iterator.

>>> batch = next(iterator)
>>> # batch: [in_val]
or
>>> # batch: [(in_val0, ..., in_val{n_input - 1})]
or
>>> # batch: [(in_val0, ..., in_val{n_input - 1}, rest_val0, ...)]

func() should take batch(es) of data and return batch(es) of computed values. Here is an illustration of the expected behavior of the function.

>>> out_vals = func([in_val0], ..., [in_val{n_input - 1}])
>>> # out_vals: [out_val]
or
>>> out_vals0, out_vals1, ... = func([in_val0], ..., [in_val{n_input - 1}])
>>> # out_vals0: [out_val0]
>>> # out_vals1: [out_val1]

With apply_to_iterator(), users can get iterator(s) of values returned by func(). It also returns iterator(s) of input values and values that are not used for computation.

>>> in_values, out_values, rest_values = apply_to_iterator(
...     func, iterator, n_input)
>>> # in_values: (iter of in_val0, ..., iter of in_val{n_input - 1})
>>> # out_values: (iter of out_val0, ...)
>>> # rest_values: (iter of rest_val0, ...)

Here is an example, which applies a pretrained Faster R-CNN to the PASCAL VOC dataset.

>>> from chainer import iterators
>>>
>>> from chainercv.datasets import VOCBBoxDataset
>>> from chainercv.links import FasterRCNNVGG16
>>> from chainercv.utils import apply_to_iterator
>>>
>>> dataset = VOCBBoxDataset(year='2007', split='test')
>>> # next(iterator) -> [(img, gt_bbox, gt_label)]
>>> iterator = iterators.SerialIterator(
...     dataset, 2, repeat=False, shuffle=False)
>>>
>>> # model.predict([img]) -> ([pred_bbox], [pred_label], [pred_score])
>>> model = FasterRCNNVGG16(pretrained_model='voc07')
>>>
>>> in_values, out_values, rest_values = apply_to_iterator(
...     model.predict, iterator)
>>>
>>> # in_values contains one iterator
>>> imgs, = in_values
>>> # out_values contains three iterators
>>> pred_bboxes, pred_labels, pred_scores = out_values
>>> # rest_values contains two iterators
>>> gt_bboxes, gt_labels = rest_values
Parameters:
  • func – A callable that takes batch(es) of input data and returns computed data.
  • iterator (iterator) – An iterator of batches. The first n_input elements in each sample are treated as input values. They are passed to func.
  • n_input (int) – The number of input data. The default value is 1.
  • hook – A callable that is called after each iteration. in_values, out_values, and rest_values are passed as arguments. Note that these values do not contain data from the previous iterations.
Returns:

This function returns three tuples of iterators: in_values, out_values and rest_values.

  • in_values: A tuple of iterators. Each iterator returns a corresponding input value. For example, if func() takes [in_val0], [in_val1], next(in_values[0]) and next(in_values[1]) will be in_val0 and in_val1.
  • out_values: A tuple of iterators. Each iterator returns a corresponding computed value. For example, if func() returns ([out_val0], [out_val1]), next(out_values[0]) and next(out_values[1]) will be out_val0 and out_val1.
  • rest_values: A tuple of iterators. Each iterator returns a corresponding rest value. For example, if the iterator returns [(in_val0, in_val1, rest_val0, rest_val1)], next(rest_values[0]) and next(rest_values[1]) will be rest_val0 and rest_val1. If the input iterator does not give any rest values, this tuple will be empty.

Return type:

Three tuples of iterators
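
The way batches are split into input, output and rest values can be illustrated with a simplified, eager sketch in pure Python. This is not ChainerCV's implementation: apply_to_batches_sketch and _extend are hypothetical names, the hook argument is omitted, and the sketch returns tuples of lists instead of lazy iterators.

```python
def _extend(columns, new_values):
    # Create one column per value on first use, then append batch by batch.
    if not columns:
        columns.extend([] for _ in new_values)
    for column, values in zip(columns, new_values):
        column.extend(values)


def apply_to_batches_sketch(func, batches, n_input=1):
    in_cols, out_cols, rest_cols = [], [], []
    for batch in batches:
        # Treat a bare sample as a one-element tuple.
        samples = [s if isinstance(s, tuple) else (s,) for s in batch]
        # Transpose the batch: one list per input and one per rest value.
        inputs = [list(col) for col in zip(*(s[:n_input] for s in samples))]
        rests = [list(col) for col in zip(*(s[n_input:] for s in samples))]
        outputs = func(*inputs)
        if not isinstance(outputs, tuple):
            outputs = (outputs,)
        _extend(in_cols, inputs)
        _extend(out_cols, outputs)
        _extend(rest_cols, rests)
    return tuple(in_cols), tuple(out_cols), tuple(rest_cols)


def double_and_square(xs):
    # Takes one batch of inputs and returns two batches of outputs.
    return [2 * x for x in xs], [x * x for x in xs]


batches = [[(1, 'a'), (2, 'b')], [(3, 'c')]]
in_values, out_values, rest_values = apply_to_batches_sketch(
    double_and_square, batches)
```

Each returned tuple holds one sequence per value position, so out_values has two entries because double_and_square returns two batches, and rest_values has one entry for the unused string in each sample.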

ProgressHook

class chainercv.utils.ProgressHook(n_total=None)

A hook class reporting the progress of iteration.

This is a hook class designed for apply_to_iterator().

Parameters:n_total (int) – The number of images. This argument is optional.

unzip

chainercv.utils.unzip(iterable)

Converts an iterable of tuples into a tuple of iterators.

This function converts an iterable of tuples into a tuple of iterators. This is an inverse function of six.moves.zip().

>>> from chainercv.utils import unzip
>>> data = [(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd'), (4, 'e')]
>>> int_iter, str_iter = unzip(data)
>>>
>>> next(int_iter)  # 0
>>> next(int_iter)  # 1
>>> next(int_iter)  # 2
>>>
>>> next(str_iter)  # 'a'
>>> next(str_iter)  # 'b'
>>> next(str_iter)  # 'c'
Parameters:iterable (iterable) – An iterable of tuples. All tuples should have the same length.
Returns:Each iterator corresponds to each element of input tuple. Note that each iterator stores values until they are popped. To reduce memory usage, it is recommended to delete unused iterators.
Return type:tuple of iterators
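
One way to obtain this behaviour is with itertools.tee, as in the sketch below. It is an illustration, not ChainerCV's implementation; unzip_sketch and _column are hypothetical names. Because tee buffers items that one copy has consumed and another has not, it also shows why unused iterators hold on to memory.

```python
import itertools


def _column(iterator, index):
    # Lazily yield the index-th element of every tuple.
    return (item[index] for item in iterator)


def unzip_sketch(iterable):
    iterator = iter(iterable)
    try:
        first = next(iterator)
    except StopIteration:
        return ()
    # Put the peeked item back, then make one independent copy per element.
    copies = itertools.tee(itertools.chain([first], iterator), len(first))
    return tuple(_column(copy, i) for i, copy in enumerate(copies))


ints, strs = unzip_sketch([(0, 'a'), (1, 'b'), (2, 'c')])
first_two = [next(ints), next(ints)]
letters = list(strs)
```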

Mask Utilities

mask_iou

chainercv.utils.mask_iou(mask_a, mask_b)

Calculate the Intersection over Union (IoU) between masks.

IoU is the ratio of the area of the intersection to the area of the union.

This function accepts both numpy.ndarray and cupy.ndarray as inputs. Note that mask_a and mask_b must be of the same type. The output has the same type as the inputs.

Parameters:
  • mask_a (array) – An array whose shape is \((N, H, W)\). \(N\) is the number of masks. The dtype should be numpy.bool.
  • mask_b (array) – An array similar to mask_a, whose shape is \((K, H, W)\). The dtype should be numpy.bool.
Returns:

An array whose shape is \((N, K)\). The element at index \((n, k)\) is the IoU between the \(n\) th mask in mask_a and the \(k\) th mask in mask_b.

Return type:

array
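
For boolean masks, the intersection and union areas are pixel counts, which a matrix product can compute for all pairs at once. The NumPy sketch below illustrates this; it is not ChainerCV's implementation, mask_iou_sketch is a hypothetical name, and a pair of empty masks (zero union) is not handled.

```python
import numpy as np


def mask_iou_sketch(mask_a, mask_b):
    # Flatten each mask to a row vector of zeros and ones.
    flat_a = mask_a.reshape(len(mask_a), -1).astype(np.float32)
    flat_b = mask_b.reshape(len(mask_b), -1).astype(np.float32)
    # Entry (n, k) counts the pixels set in both mask n and mask k.
    inter = flat_a @ flat_b.T
    area_a = flat_a.sum(axis=1)
    area_b = flat_b.sum(axis=1)
    union = area_a[:, None] + area_b[None, :] - inter
    return inter / union


mask_a = np.array([[[True, True], [False, False]]], dtype=bool)
mask_b = np.array(
    [[[True, False], [False, False]],
     [[False, False], [True, True]]], dtype=bool)
iou = mask_iou_sketch(mask_a, mask_b)  # shape (1, 2)
```

The single mask in mask_a shares one of its two pixels with the first mask in mask_b and none with the second.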

mask_to_bbox

chainercv.utils.mask_to_bbox(mask)

Compute the bounding boxes around the masked regions.

This function accepts both numpy.ndarray and cupy.ndarray as inputs.

Parameters:mask (array) – An array whose shape is \((R, H, W)\). \(R\) is the number of masks. The dtype should be numpy.bool.
Returns:The bounding boxes around the masked regions. This is an array whose shape is \((R, 4)\). \(R\) is the number of bounding boxes. The dtype should be numpy.float32.
Return type:array
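
A minimal NumPy sketch of this conversion is shown below. It is not ChainerCV's implementation, and mask_to_bbox_sketch is a hypothetical name. It makes two assumptions that the description above does not state: the maximum edges are exclusive (one past the last masked pixel), and an empty mask yields an all-zero box.

```python
import numpy as np


def mask_to_bbox_sketch(mask):
    bbox = np.zeros((len(mask), 4), dtype=np.float32)
    for i, m in enumerate(mask):
        ys, xs = np.nonzero(m)
        if len(ys) == 0:
            # Assumption: an empty mask keeps the all-zero box.
            continue
        # Assumption: y_max and x_max are exclusive, hence the + 1.
        bbox[i] = (ys.min(), xs.min(), ys.max() + 1, xs.max() + 1)
    return bbox


mask = np.zeros((2, 5, 5), dtype=bool)
mask[0, 1:3, 2:5] = True  # rows 1-2 and columns 2-4 are masked
bbox = mask_to_bbox_sketch(mask)
```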

Testing Utilities

assert_is_bbox

chainercv.utils.assert_is_bbox(bbox, size=None)

Checks if bounding boxes satisfy bounding box format.

This function checks whether the given bounding boxes satisfy the bounding box format. If the bounding boxes do not satisfy the format, this function raises an AssertionError.

Parameters:
  • bbox (ndarray) – Bounding boxes to be checked.
  • size (tuple of ints) – The size of an image. If this argument is specified, each bounding box should be within the image.

assert_is_bbox_dataset

chainercv.utils.assert_is_bbox_dataset(dataset, n_fg_class, n_example=None)

Checks if a dataset satisfies the bounding box dataset API.

This function checks whether a given dataset satisfies the bounding box dataset API. If the dataset does not satisfy the API, this function raises an AssertionError.

Parameters:
  • dataset – A dataset to be checked.
  • n_fg_class (int) – The number of foreground classes.
  • n_example (int) – The number of examples to be checked. If this argument is specified, this function picks examples randomly and checks them. Otherwise, this function checks all examples.

assert_is_image

chainercv.utils.assert_is_image(img, color=True, check_range=True)

Checks if an image satisfies image format.

This function checks whether a given image satisfies the image format. If the image does not satisfy the format, this function raises an AssertionError.

Parameters:
  • img (ndarray) – An image to be checked.
  • color (bool) – A boolean that determines the expected channel size. If it is True, the number of channels should be 3. Otherwise, it should be 1. The default value is True.
  • check_range (bool) – A boolean that determines whether the range of values is checked. If it is True, the values of the image must be in \([0, 255]\). Otherwise, this function does not check the range. The default value is True.

assert_is_instance_segmentation_dataset

chainercv.utils.assert_is_instance_segmentation_dataset(dataset, n_fg_class, n_example=None)

Checks if a dataset satisfies instance segmentation dataset APIs.

This function checks whether a given dataset satisfies the instance segmentation dataset APIs. If the dataset does not satisfy the APIs, this function raises an AssertionError.

Parameters:
  • dataset – A dataset to be checked.
  • n_fg_class (int) – The number of foreground classes.
  • n_example (int) – The number of examples to be checked. If this argument is specified, this function picks examples randomly and checks them. Otherwise, this function checks all examples.

assert_is_label_dataset

chainercv.utils.assert_is_label_dataset(dataset, n_class, n_example=None, color=True)

Checks if a dataset satisfies the label dataset API.

This function checks whether a given dataset satisfies the label dataset API. If the dataset does not satisfy the API, this function raises an AssertionError.

Parameters:
  • dataset – A dataset to be checked.
  • n_class (int) – The number of classes.
  • n_example (int) – The number of examples to be checked. If this argument is specified, this function picks examples randomly and checks them. Otherwise, this function checks all examples.
  • color (bool) – A boolean that determines the expected channel size. If it is True, the number of channels should be 3. Otherwise, it should be 1. The default value is True.

assert_is_point

chainercv.utils.assert_is_point(point, mask=None, size=None)

Checks if points satisfy the format.

This function checks if given points satisfy the format and raises an AssertionError when the points violate the convention.

Parameters:
  • point (ndarray) – Points to be checked.
  • mask (ndarray) – A mask of the points. If this is None, all points are regarded as valid.
  • size (tuple of ints) – The size of an image. If this argument is specified, the coordinates of valid points are checked to be within the image.

assert_is_point_dataset

chainercv.utils.assert_is_point_dataset(dataset, n_point=None, n_example=None, no_mask=False)

Checks if a dataset satisfies the point dataset API.

This function checks whether a given dataset satisfies the point dataset API. If the dataset does not satisfy the API, this function raises an AssertionError.

Parameters:
  • dataset – A dataset to be checked.
  • n_point (int) – The number of expected points per image. If this is None, the number of points per image can be arbitrary.
  • n_example (int) – The number of examples to be checked. If this argument is specified, this function picks examples randomly and checks them. Otherwise, this function checks all examples.
  • no_mask (bool) – If True, the point mask is assumed never to be included. If False, the point mask may or may not be included.

assert_is_semantic_segmentation_dataset

chainercv.utils.assert_is_semantic_segmentation_dataset(dataset, n_class, n_example=None)

Checks if a dataset satisfies semantic segmentation dataset APIs.

This function checks whether a given dataset satisfies the semantic segmentation dataset APIs. If the dataset does not satisfy the APIs, this function raises an AssertionError.

Parameters:
  • dataset – A dataset to be checked.
  • n_class (int) – The number of classes including background.
  • n_example (int) – The number of examples to be checked. If this argument is specified, this function picks examples randomly and checks them. Otherwise, this function checks all examples.

generate_random_bbox

chainercv.utils.generate_random_bbox(n, img_size, min_length, max_length)

Generate valid bounding boxes with random position and shape.

Parameters:
  • n (int) – The number of bounding boxes.
  • img_size (tuple) – A tuple of length 2. The height and the width of the image in which the bounding boxes are located.
  • min_length (float) – The minimum length of edges of bounding boxes.
  • max_length (float) – The maximum length of edges of bounding boxes.
Returns:

Coordinates of bounding boxes. Its shape is \((R, 4)\). Here, \(R\) equals n. The second axis contains \(y_{min}, x_{min}, y_{max}, x_{max}\), where \(min\_length \leq y_{max} - y_{min} < max\_length\) and \(min\_length \leq x_{max} - x_{min} < max\_length\).

Return type:

numpy.ndarray
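
One way to draw boxes that satisfy the constraints above is sketched below in NumPy. It is not ChainerCV's implementation: random_bbox_sketch is a hypothetical name, the seed argument is added for reproducibility, and the sketch assumes that max_length does not exceed either side of the image.

```python
import numpy as np


def random_bbox_sketch(n, img_size, min_length, max_length, seed=0):
    rng = np.random.RandomState(seed)
    height, width = img_size
    # Choose the top-left corner so that the longest box still fits.
    y_min = rng.uniform(0, height - max_length, size=n)
    x_min = rng.uniform(0, width - max_length, size=n)
    # uniform() samples edge lengths from [min_length, max_length).
    y_max = y_min + rng.uniform(min_length, max_length, size=n)
    x_max = x_min + rng.uniform(min_length, max_length, size=n)
    return np.stack((y_min, x_min, y_max, x_max), axis=1).astype(np.float32)


bbox = random_bbox_sketch(10, (64, 80), min_length=4, max_length=16)
edge_y = bbox[:, 2] - bbox[:, 0]
edge_x = bbox[:, 3] - bbox[:, 1]
```

After the cast to float32, the edge lengths can differ from the sampled values by a small rounding error, so comparisons against min_length and max_length should allow a small tolerance.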