PSPNet
Semantic Segmentation Link
PSPNetResNet101
class chainercv.experimental.links.model.pspnet.PSPNetResNet101(n_class=None, pretrained_model=None, input_size=None, initialW=None, comm=None)
PSPNet with Dilated ResNet101 as the feature extractor.
Parameters:
- n_class (int) – The number of channels in the last convolution layer.
- pretrained_model (string) – The weight file to be loaded. This can be 'cityscapes', a filepath, or None. The default value is None.
- input_size (tuple) – The size of the input. This value is \((height, width)\).
- initialW (callable) – Initializer for the weights of convolution kernels.
- comm (chainermn.communicator) – If a ChainerMN communicator is given, it is used for distributed batch normalization during training. If None, the batch normalization links do not share input vectors among GPUs before calculating the mean and variance. The original PSPNet implementation uses distributed batch normalization.
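The difference that comm makes can be illustrated with plain NumPy. The following is a minimal sketch, not ChainerMN code: the two "workers", their batch shapes, and the helper channel_stats are all hypothetical, and the way real communicators exchange statistics is not shown here. It only demonstrates that per-worker statistics differ from the statistics of the combined batch, and that the combined values can be recovered from per-worker moments when batch sizes are equal.

```python
import numpy as np

rng = np.random.RandomState(0)

# Hypothetical setup: 2 workers, each holding a mini-batch of 4 samples
# with 3 channels of 8x8 feature maps (NCHW layout). The workers see
# differently distributed data on purpose.
batches = [rng.normal(loc=w, size=(4, 3, 8, 8)) for w in range(2)]


def channel_stats(x):
    """Per-channel mean and variance over the batch and spatial axes."""
    return x.mean(axis=(0, 2, 3)), x.var(axis=(0, 2, 3))


# Without sharing: each worker normalizes with its own statistics.
local_stats = [channel_stats(x) for x in batches]

# With sharing: statistics computed as if over the union of all batches.
global_mean, global_var = channel_stats(np.concatenate(batches, axis=0))

# The same global values can be rebuilt from per-worker moments
# (valid here because every worker holds the same number of elements).
combined_mean = np.mean([m for m, _ in local_stats], axis=0)
combined_var = np.mean(
    [v + m ** 2 for m, v in local_stats], axis=0) - combined_mean ** 2
```

With small per-GPU batches, the local statistics can be noisy, which is one reason to aggregate them across workers.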
Utility
convolution_crop
chainercv.experimental.links.model.pspnet.convolution_crop(img, size, stride, return_param=False)
Strided cropping.
This extracts cropped images from the input. The crops are taken across the entire image, with a constant step between neighboring patches.
Parameters:
- img (ndarray) – An image array to be cropped. This is in CHW format.
- size (tuple) – The size of the output image after cropping. This value is \((height, width)\).
- stride (tuple) – The stride between crops. This contains two values: the stride in the vertical direction and the stride in the horizontal direction.
- return_param (bool) – If True, this function also returns information about the slices.
Returns: If return_param = False, returns an array crop_imgs that is a stack of cropped images.
If return_param = True, returns a tuple whose elements are crop_imgs, param. param is a dictionary of intermediate parameters whose contents are listed below with key, value-type and the description of the value.
- y_slices (list of slices): Slices used to crop the input image. The relation below holds together with x_slices.
- x_slices (list of slices): Similar to y_slices.
- crop_y_slices (list of slices): This indicates the region of the cropped image that is actually extracted from the input. This is relevant only when borders of the input are cropped.
- crop_x_slices (list of slices): Similar to crop_y_slices.

    crop_img = crop_imgs[i][:, crop_y_slices[i], crop_x_slices[i]]
    crop_img == img[:, y_slices[i], x_slices[i]]
Return type: array or tuple
Examples
>>> import numpy as np
>>> from chainercv.datasets import VOCBboxDataset
>>> from chainercv.transforms import resize
>>> from chainercv.experimental.links.model.pspnet import \
...     convolution_crop
>>>
>>> img, _, _ = VOCBboxDataset(year='2007')[0]
>>> img = resize(img, (300, 300))
>>> imgs, param = convolution_crop(
...     img, (128, 128), (96, 96), return_param=True)
>>> # Restore the original image from the cropped images.
>>> output = np.zeros((3, 300, 300))
>>> count = np.zeros((300, 300))
>>> for i in range(len(imgs)):
...     crop_y_slice = param['crop_y_slices'][i]
...     crop_x_slice = param['crop_x_slices'][i]
...     y_slice = param['y_slices'][i]
...     x_slice = param['x_slices'][i]
...     output[:, y_slice, x_slice] += \
...         imgs[i][:, crop_y_slice, crop_x_slice]
...     count[y_slice, x_slice] += 1
>>> output = output / count[None]
>>> np.testing.assert_equal(output, img)
>>>
>>> # Visualization of the cropped images
>>> import matplotlib.pyplot as plt
>>> from chainercv.utils import tile_images
>>> from chainercv.visualizations import vis_image
>>> v_imgs = tile_images(imgs, 5, fill=122.5)
>>> vis_image(v_imgs)
>>> plt.show()
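To make the slice bookkeeping concrete, here is a NumPy-only sketch of strided cropping that produces the same four lists of slices. It is not the ChainerCV implementation: the helpers axis_windows and strided_crop are hypothetical, and the choices to start windows at the top-left corner and to zero-pad crops that run past the border are assumptions made for illustration. The library may place and pad its windows differently.

```python
import math

import numpy as np


def axis_windows(length, size, stride):
    """Return (input_slice, crop_slice) pairs along one axis.

    Hypothetical helper, not part of ChainerCV. Windows start at
    0, stride, 2 * stride, ...; a window that runs past the border
    keeps only the part that overlaps the input.
    """
    if stride > size:
        raise ValueError('stride must not exceed size, or pixels are skipped')
    n_windows = max(math.ceil((length - size) / stride), 0) + 1
    windows = []
    for k in range(n_windows):
        start = k * stride
        stop = min(start + size, length)
        windows.append((slice(start, stop), slice(0, stop - start)))
    return windows


def strided_crop(img, size, stride):
    """Crop a CHW image into zero-padded patches of shape (C,) + size."""
    n_channel, height, width = img.shape
    keys = ('y_slices', 'x_slices', 'crop_y_slices', 'crop_x_slices')
    param = {key: [] for key in keys}
    crop_imgs = []
    for y_slice, crop_y_slice in axis_windows(height, size[0], stride[0]):
        for x_slice, crop_x_slice in axis_windows(width, size[1], stride[1]):
            crop = np.zeros((n_channel, size[0], size[1]), dtype=img.dtype)
            # Copy only the region that actually overlaps the input.
            crop[:, crop_y_slice, crop_x_slice] = img[:, y_slice, x_slice]
            crop_imgs.append(crop)
            param['y_slices'].append(y_slice)
            param['x_slices'].append(x_slice)
            param['crop_y_slices'].append(crop_y_slice)
            param['crop_x_slices'].append(crop_x_slice)
    return np.stack(crop_imgs), param


# Synthetic 300x300 image, cropped with the same size and stride as above.
img = np.arange(3 * 300 * 300, dtype=np.float32).reshape(3, 300, 300)
crop_imgs, param = strided_crop(img, (128, 128), (96, 96))
```

With a 300-pixel axis, a 128-pixel window and a stride of 96, this sketch places three windows per axis, so nine crops are produced, and the last window on each axis contains only 108 valid rows or columns.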
PSPNet
class chainercv.experimental.links.model.pspnet.PSPNet(extractor, n_class, input_size, initialW=None, bn_kwargs=None)
Pyramid Scene Parsing Network.
This is a PSPNet [1] model for semantic segmentation. This is based on the implementation found here.
[1] Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia. “Pyramid Scene Parsing Network”. CVPR, 2017.
Parameters:
- extractor (chainer.Chain) – A feature extractor.
- n_class (int) – The number of channels in the last convolution layer.
- input_size (tuple) – The size of the input. This value is \((height, width)\).
- initialW (callable) – Initializer for the weights of convolution kernels.
- bn_kwargs (dict) – Keyword arguments passed to initialize chainer.links.BatchNormalization. If a ChainerMN communicator (CommunicatorBase) is given with the key comm, MultiNodeBatchNormalization will be used for the batch normalization. Otherwise, BatchNormalization will be used.
predict(imgs)
Conduct semantic segmentation from images.
Parameters: imgs (iterable of numpy.ndarray) – Arrays holding images. All images are in CHW and RGB format and the range of their values is \([0, 255]\).
Returns: List of integer labels predicted from each image in the input list.
Return type: list of numpy.ndarray
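As a small illustration of the input format described above, the sketch below converts an image from HWC layout into the CHW layout that predict expects. The source image here is synthetic, and both its HWC uint8 layout and the float32 target dtype are assumptions for the example, not requirements stated in this reference.

```python
import numpy as np

# Hypothetical input: an 8-bit RGB image in HWC layout (height 4, width 6).
hwc_img = np.random.RandomState(0).randint(
    0, 256, size=(4, 6, 3), dtype=np.uint8)

# Move the channel axis to the front; values stay in the range [0, 255].
chw_img = hwc_img.transpose(2, 0, 1).astype(np.float32)

# predict takes an iterable of such arrays.
imgs = [chw_img]
```

A list built this way could then be passed as the imgs argument.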