YOLO
Detection Links
YOLOv2
class chainercv.links.model.yolo.YOLOv2(n_fg_class=None, pretrained_model=None)
YOLOv2.
This is a model of YOLOv2 [1]. This model uses Darknet19Extractor as its feature extractor.

[1] Joseph Redmon, Ali Farhadi. YOLO9000: Better, Faster, Stronger. CVPR 2017.

Parameters:
- n_fg_class (int) – The number of classes excluding the background.
- pretrained_model (string) – The weight file to be loaded. This can take 'voc0712', filepath or None. The default value is None.
  - 'voc0712': Load weights trained on the trainval split of PASCAL VOC 2007 and 2012. The weight file is downloaded and cached automatically. n_fg_class must be 20 or None. These weights were converted from the darknet model provided by the original implementation. The conversion code is chainercv/examples/yolo/darknet2npz.py.
  - filepath: A path to an npz file. In this case, n_fg_class must be specified properly.
  - None: Do not load weights.
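The three accepted forms of pretrained_model, and the constraint each places on n_fg_class, can be summarized in a small sketch. The helper resolve_pretrained and the file name my_model.npz are hypothetical and are not part of ChainerCV; the code only mirrors the rules listed above. It leaves the None case unconstrained because this reference does not state a requirement for it.

```python
def resolve_pretrained(n_fg_class=None, pretrained_model=None):
    """Hypothetical helper (not a ChainerCV function) mirroring the documented rules.

    Returns a (n_fg_class, source) pair, where source is 'voc0712',
    a file path, or None when no weights are loaded.
    """
    if pretrained_model == 'voc0712':
        # The VOC weights only fit a model with 20 foreground classes.
        if n_fg_class not in (None, 20):
            raise ValueError("n_fg_class must be 20 or None for 'voc0712'")
        return 20, 'voc0712'
    if pretrained_model is not None:
        # Any other string is treated as a path to an npz file.
        if n_fg_class is None:
            raise ValueError('n_fg_class must be specified with a filepath')
        return n_fg_class, pretrained_model
    # None: no weights are loaded.
    return n_fg_class, None


print(resolve_pretrained(pretrained_model='voc0712'))
print(resolve_pretrained(n_fg_class=5, pretrained_model='my_model.npz'))
```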
to_cpu()
Copies parameter variables and persistent values to CPU.
This method does not handle non-registered attributes. If any such attributes must be copied to CPU, the link implementation must override this method to do so.
Returns: self
to_gpu(device=None)
Copies parameter variables and persistent values to GPU.
This method does not handle non-registered attributes. If any such attributes must be copied to GPU, the link implementation must override this method to do so.
Parameters: device – Target device specifier. If omitted, the current device is used.
Returns: self
YOLOv3
class chainercv.links.model.yolo.YOLOv3(n_fg_class=None, pretrained_model=None)
YOLOv3.
This is a model of YOLOv3 [2]. This model uses Darknet53Extractor as its feature extractor.

[2] Joseph Redmon, Ali Farhadi. YOLOv3: An Incremental Improvement. arXiv 2018.

Parameters:
- n_fg_class (int) – The number of classes excluding the background.
- pretrained_model (string) – The weight file to be loaded. This can take 'voc0712', filepath or None. The default value is None.
  - 'voc0712': Load weights trained on the trainval split of PASCAL VOC 2007 and 2012. The weight file is downloaded and cached automatically. n_fg_class must be 20 or None. These weights were converted from the darknet model. The conversion code is chainercv/examples/yolo/darknet2npz.py.
  - filepath: A path to an npz file. In this case, n_fg_class must be specified properly.
  - None: Do not load weights.
to_cpu()
Copies parameter variables and persistent values to CPU.
This method does not handle non-registered attributes. If any such attributes must be copied to CPU, the link implementation must override this method to do so.
Returns: self
to_gpu(device=None)
Copies parameter variables and persistent values to GPU.
This method does not handle non-registered attributes. If any such attributes must be copied to GPU, the link implementation must override this method to do so.
Parameters: device – Target device specifier. If omitted, the current device is used.
Returns: self
Utility
ResidualBlock
Darknet19Extractor
Darknet53Extractor
class chainercv.links.model.yolo.Darknet53Extractor
A Darknet53-based feature extractor for YOLOv3.
This is a feature extractor for YOLOv3.
__call__(x)
Compute feature maps from a batch of images.
This method extracts feature maps from 3 layers.
Parameters: x (ndarray) – An array holding a batch of images. The images should be resized to \(416\times 416\).
Returns: Each variable contains a feature map.
Return type: list of Variable
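As a rough illustration of what three feature maps look like for a \(416\times 416\) input, the sketch below computes the spatial size of each map. It assumes the downsampling strides of 32, 16, and 8 described in the YOLOv3 paper. The strides, the helper name feature_map_sizes, and the order of the maps are assumptions and are not stated in this reference.

```python
def feature_map_sizes(insize=416, strides=(32, 16, 8)):
    """Spatial size (height == width) of each feature map for a square input.

    The strides are an assumption taken from the YOLOv3 paper,
    not from this API reference.
    """
    return [insize // stride for stride in strides]


print(feature_map_sizes())  # [13, 26, 52]
```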
YOLOBase
class chainercv.links.model.yolo.YOLOBase(**links)
Base class for YOLOv2 and YOLOv3.
A class inheriting from this class should have extractor, __call__(), and _decode().
predict(imgs)
Detect objects from images.
This method predicts objects for each image.
Parameters: imgs (iterable of numpy.ndarray) – Arrays holding images. All images are in CHW and RGB format and the range of their values is \([0, 255]\).
Returns: This method returns a tuple of three lists, (bboxes, labels, scores).
- bboxes: A list of float arrays of shape \((R, 4)\), where \(R\) is the number of bounding boxes in an image. Each bounding box is organized by \((y_{min}, x_{min}, y_{max}, x_{max})\) in the second axis.
- labels: A list of integer arrays of shape \((R,)\). Each value indicates the class of the bounding box. Values are in range \([0, L - 1]\), where \(L\) is the number of foreground classes.
- scores: A list of float arrays of shape \((R,)\). Each value indicates how confident the prediction is.
Return type: tuple of lists
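Because each bounding box is stored as \((y_{min}, x_{min}, y_{max}, x_{max})\), code that expects (x, y, width, height) has to reorder the coordinates. The sketch below uses plain Python lists in place of the arrays returned by predict(). The sample values and the helper to_xywh are made up for illustration.

```python
def to_xywh(bbox):
    """Convert (y_min, x_min, y_max, x_max) to (x_min, y_min, width, height)."""
    y_min, x_min, y_max, x_max = bbox
    return (x_min, y_min, x_max - x_min, y_max - y_min)


# Stand-in for the output of predict() on a batch of two images.
# The second image has no detections (R = 0).
bboxes = [[(10.0, 20.0, 110.0, 70.0)], []]
labels = [[14], []]
scores = [[0.9], []]

# The three lists are aligned per image, and per box within each image.
for img_bboxes, img_labels, img_scores in zip(bboxes, labels, scores):
    for bbox, label, score in zip(img_bboxes, img_labels, img_scores):
        print(label, score, to_xywh(bbox))
```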
use_preset(preset)
Use the given preset during prediction.
This method changes the values of nms_thresh and score_thresh. nms_thresh is the threshold used for non-maximum suppression, and score_thresh is the threshold used to discard low-confidence proposals in predict().
To set these attributes to values other than those provided by the presets, modify them by directly accessing the public attributes.
Parameters: preset ({'visualize', 'evaluate'}) – A string to determine the preset to use.
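To make the roles of the two thresholds concrete, here is a minimal pure-Python sketch of score filtering followed by greedy non-maximum suppression. It is not ChainerCV's implementation and it ignores class labels. The function names and the threshold values in the usage lines are assumptions chosen for illustration, not the values of the 'visualize' or 'evaluate' presets.

```python
def iou(a, b):
    """Intersection over union of two (y_min, x_min, y_max, x_max) boxes."""
    top, left = max(a[0], b[0]), max(a[1], b[1])
    bottom, right = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, bottom - top) * max(0.0, right - left)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(bboxes, scores, nms_thresh, score_thresh):
    """Return indices of boxes kept after score filtering and greedy NMS."""
    # Discard low-confidence proposals, then visit the rest by descending score.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_thresh),
        key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        # Keep a box only if its overlap with every kept box is small enough.
        if all(iou(bboxes[i], bboxes[j]) <= nms_thresh for j in kept):
            kept.append(i)
    return kept


bboxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.3]
# Illustrative thresholds only.
print(filter_detections(bboxes, scores, nms_thresh=0.5, score_thresh=0.5))   # [0]
print(filter_detections(bboxes, scores, nms_thresh=0.5, score_thresh=0.01))  # [0, 2]
```

With the higher score threshold, only the most confident box survives. Lowering it keeps the distant low-confidence box as well, while the near-duplicate is still suppressed by the overlap test.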