YOLO¶
Detection Links¶
YOLOv2¶

class chainercv.links.model.yolo.YOLOv2(n_fg_class=None, pretrained_model=None)[source]¶

YOLOv2.

This is a model of YOLOv2 [1]. This model uses Darknet19Extractor as its feature extractor.

[1] Joseph Redmon, Ali Farhadi. YOLO9000: Better, Faster, Stronger. CVPR 2017.

Parameters
- n_fg_class (int) – The number of classes excluding the background.
- pretrained_model (string) – The weight file to be loaded. This can take 'voc0712', a filepath, or None. The default value is None.
  - 'voc0712': Load weights trained on the trainval split of PASCAL VOC 2007 and 2012. The weight file is downloaded and cached automatically. n_fg_class must be 20 or None. These weights were converted from the Darknet model provided by the original implementation. The conversion code is chainercv/examples/yolo/darknet2npz.py.
  - filepath: A path to an npz file. In this case, n_fg_class must be specified properly.
  - None: Do not load weights.
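The rules for combining n_fg_class and pretrained_model above can be sketched as a small validation helper. This is a hypothetical stand-in written for illustration, not ChainerCV's actual implementation:

```python
def resolve_n_fg_class(n_fg_class=None, pretrained_model=None):
    """Sketch of the argument rules above: 'voc0712' implies 20 classes."""
    if pretrained_model == 'voc0712':
        # The VOC-trained weights assume 20 foreground classes.
        if n_fg_class not in (None, 20):
            raise ValueError("n_fg_class must be 20 or None for 'voc0712'")
        return 20
    # For a filepath or no weights, n_fg_class must be given explicitly.
    if n_fg_class is None:
        raise ValueError('n_fg_class must be specified')
    return n_fg_class
```

For example, `resolve_n_fg_class(pretrained_model='voc0712')` yields 20, while passing a conflicting class count raises an error.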
YOLOv3¶

class chainercv.links.model.yolo.YOLOv3(n_fg_class=None, pretrained_model=None)[source]¶

YOLOv3.

This is a model of YOLOv3 [2]. This model uses Darknet53Extractor as its feature extractor.

[2] Joseph Redmon, Ali Farhadi. YOLOv3: An Incremental Improvement. arXiv 2018.

Parameters
- n_fg_class (int) – The number of classes excluding the background.
- pretrained_model (string) – The weight file to be loaded. This can take 'voc0712', a filepath, or None. The default value is None.
  - 'voc0712': Load weights trained on the trainval split of PASCAL VOC 2007 and 2012. The weight file is downloaded and cached automatically. n_fg_class must be 20 or None. These weights were converted from the Darknet model. The conversion code is chainercv/examples/yolo/darknet2npz.py.
  - filepath: A path to an npz file. In this case, n_fg_class must be specified properly.
  - None: Do not load weights.
forward(x)[source]¶

Compute localization, objectness, and classification from a batch of images.

This method computes three variables, locs, objs, and confs. self._decode() converts these variables to bounding box coordinates and confidence scores. These variables are also used in training YOLOv3.

Parameters
- x (chainer.Variable) – A variable holding a batch of images.

Returns
This method returns three variables, locs, objs, and confs.
- locs: A variable of float arrays of shape \((B, K, 4)\), where \(B\) is the number of samples in the batch and \(K\) is the number of default bounding boxes.
- objs: A variable of float arrays of shape \((B, K)\).
- confs: A variable of float arrays of shape \((B, K, n\_fg\_class)\).

Return type
tuple of chainer.Variable
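The shapes above can be illustrated with dummy NumPy arrays. The way objectness and class scores are combined below is one plausible scheme, shown only to make the roles of objs and confs concrete; the exact decoding is an assumption, not ChainerCV's verbatim code:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

B, K, C = 2, 5, 20  # batch size, number of default boxes, foreground classes
rng = np.random.default_rng(0)
locs = rng.standard_normal((B, K, 4)).astype(np.float32)   # box offsets
objs = rng.standard_normal((B, K)).astype(np.float32)      # objectness logits
confs = rng.standard_normal((B, K, C)).astype(np.float32)  # class logits

# One plausible way to combine objectness and class scores into per-box,
# per-class detection scores; the real decoding in self._decode() also
# converts locs to image coordinates and applies non-maximum suppression.
scores = sigmoid(objs)[:, :, None] * sigmoid(confs)
```

The broadcasted product gives an array of shape \((B, K, n\_fg\_class)\), one score per default box and class.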
to_cpu()[source]¶

Copies parameter variables and persistent values to the CPU.

This method does not handle non-registered attributes. If any such attributes must be copied to the CPU, the link implementation should override device_resident_accept() to do so.

Returns: self
to_gpu(device=None)[source]¶

Copies parameter variables and persistent values to the GPU.

This method does not handle non-registered attributes. If any such attributes must be copied to the GPU, the link implementation must override device_resident_accept() to do so.

Parameters
- device – Target device specifier. If omitted, the current device is used.

Returns: self
Utility¶
ResidualBlock¶
Darknet19Extractor¶
Darknet53Extractor¶
class chainercv.links.model.yolo.Darknet53Extractor[source]¶

A Darknet53-based feature extractor for YOLOv3.

This is a feature extractor for YOLOv3.

forward(x)[source]¶

Compute feature maps from a batch of images.

This method extracts feature maps from 3 layers.

Parameters
- x (ndarray) – An array holding a batch of images. The images should be resized to \(416\times 416\).

Returns
Each variable contains a feature map.

Return type
list of Variable
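For the \(416\times 416\) input mentioned above, YOLOv3 detects at three scales whose strides are 32, 16, and 8 (per the YOLOv3 paper), so the spatial sizes of the three extracted feature maps follow directly:

```python
insize = 416            # input resolution expected by the extractor
strides = (32, 16, 8)   # coarse-to-fine detection scales in YOLOv3

# Each feature map covers the input at its stride, giving square grids.
grid_sizes = [insize // s for s in strides]
print(grid_sizes)  # [13, 26, 52]
```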
YOLOBase¶

class chainercv.links.model.yolo.YOLOBase(**links)[source]¶

Base class for YOLOv2 and YOLOv3.

A subclass of this class should have extractor, forward(), and _decode().
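The subclass contract above can be sketched as a skeleton. The class below is a hypothetical stand-in (it does not inherit from the real YOLOBase) and exists only to show the three members a subclass is expected to provide:

```python
class MyYOLO:
    """Hypothetical sketch of a YOLOBase-style subclass."""

    def __init__(self, extractor):
        # Feature extractor link, e.g. a Darknet-based backbone.
        self.extractor = extractor

    def forward(self, x):
        # Should compute locs, objs, and confs from extracted features.
        raise NotImplementedError

    def _decode(self, locs, objs, confs):
        # Should convert raw outputs to bboxes, labels, and scores.
        raise NotImplementedError
```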
predict(imgs)[source]¶

Detect objects from images.

This method predicts objects for each image.

Parameters
- imgs (iterable of numpy.ndarray) – Arrays holding images. All images are in CHW and RGB format and the range of their values is \([0, 255]\).

Returns
This method returns a tuple of three lists, (bboxes, labels, scores).
- bboxes: A list of float arrays of shape \((R, 4)\), where \(R\) is the number of bounding boxes in an image. Each bounding box is organized by \((y_{min}, x_{min}, y_{max}, x_{max})\) in the second axis.
- labels: A list of integer arrays of shape \((R,)\). Each value indicates the class of the bounding box. Values are in the range \([0, L - 1]\), where \(L\) is the number of foreground classes.
- scores: A list of float arrays of shape \((R,)\). Each value indicates how confident the prediction is.

Return type
tuple of lists
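The return format above can be consumed as follows. The detections here are dummy arrays in the documented layout (one image, two boxes); the class indices and scores are made-up placeholders:

```python
import numpy as np

# Dummy detections in the documented format for a single image:
# each bbox is (y_min, x_min, y_max, x_max); labels lie in [0, L - 1].
bboxes = [np.array([[10., 20., 110., 220.],
                    [50., 60., 150., 260.]], dtype=np.float32)]
labels = [np.array([7, 11], dtype=np.int32)]
scores = [np.array([0.92, 0.41], dtype=np.float32)]

# Iterate per image, then per detected box within that image.
for bbox, label, score in zip(bboxes, labels, scores):
    for (y_min, x_min, y_max, x_max), lb, sc in zip(bbox, label, score):
        print(f'class {lb}: score {sc:.2f} at '
              f'({y_min:.0f}, {x_min:.0f})-({y_max:.0f}, {x_max:.0f})')
```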
use_preset(preset)[source]¶

Use the given preset during prediction.

This method changes the values of nms_thresh and score_thresh. These are the threshold used for non-maximum suppression and the threshold for discarding low-confidence proposals in predict(), respectively.

If the attributes need to be changed to something other than the values provided in the presets, please modify them by directly accessing the public attributes.

Parameters
- preset ({'visualize', 'evaluate'}) – A string to determine the preset to use.
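A minimal sketch of the preset mechanism follows. The threshold values are illustrative assumptions (a higher score cutoff for visualization, a near-zero one for evaluation); they are not necessarily ChainerCV's exact defaults:

```python
# Illustrative preset values; not necessarily ChainerCV's exact numbers.
_PRESETS = {
    'visualize': {'nms_thresh': 0.45, 'score_thresh': 0.5},
    'evaluate': {'nms_thresh': 0.45, 'score_thresh': 0.005},
}

class Detector:
    """Hypothetical detector showing only the preset switching."""

    def use_preset(self, preset):
        try:
            params = _PRESETS[preset]
        except KeyError:
            raise ValueError(
                f"preset must be 'visualize' or 'evaluate', got {preset!r}")
        # The attributes stay public so they can also be set directly.
        self.nms_thresh = params['nms_thresh']
        self.score_thresh = params['score_thresh']

d = Detector()
d.use_preset('visualize')
print(d.score_thresh)  # 0.5
```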
YOLOv2Base¶

class chainercv.links.model.yolo.YOLOv2Base(n_fg_class=None, pretrained_model=None)[source]¶

Base class for YOLOv2 and YOLOv2Tiny.

A subclass of this class should have _extractor, _models, and _anchors.
forward(x)[source]¶

Compute localization, objectness, and classification from a batch of images.

This method computes three variables, locs, objs, and confs. self._decode() converts these variables to bounding box coordinates and confidence scores. These variables are also used in training YOLOv2.

Parameters
- x (chainer.Variable) – A variable holding a batch of images.

Returns
This method returns three variables, locs, objs, and confs.
- locs: A variable of float arrays of shape \((B, K, 4)\), where \(B\) is the number of samples in the batch and \(K\) is the number of default bounding boxes.
- objs: A variable of float arrays of shape \((B, K)\).
- confs: A variable of float arrays of shape \((B, K, n\_fg\_class)\).

Return type
tuple of chainer.Variable
to_cpu()[source]¶

Copies parameter variables and persistent values to the CPU.

This method does not handle non-registered attributes. If any such attributes must be copied to the CPU, the link implementation should override device_resident_accept() to do so.

Returns: self
to_gpu(device=None)[source]¶

Copies parameter variables and persistent values to the GPU.

This method does not handle non-registered attributes. If any such attributes must be copied to the GPU, the link implementation must override device_resident_accept() to do so.

Parameters
- device – Target device specifier. If omitted, the current device is used.

Returns: self