Functions

Spatial Pooling

ps_roi_average_align_2d

chainercv.functions.ps_roi_average_align_2d(x, rois, roi_indices, outsize, spatial_scale, group_size, sampling_ratio=None)

Position Sensitive Region of Interest (ROI) Average align function.

This function computes the position-sensitive average of an input spatial patch over the given regions of interest. Each ROI is split into \((group\_size, group\_size)\) regions, and the position-sensitive values in each region are computed.

Parameters
  • x (Variable) – Input variable. The shape is expected to be 4-dimensional: (n: batch, c: channel, h: height, w: width).

  • rois (array) – Input ROIs. The shape is expected to be \((R, 4)\), and each row is laid out as (y_min, x_min, y_max, x_max). The dtype is numpy.float32.

  • roi_indices (array) – Input ROI indices, giving the batch index of the image each ROI belongs to. The shape is expected to be \((R, )\). The dtype is numpy.int32.

  • outsize ((int, int, int) or (int, int) or int) – Expected output size after pooling: (channel, height, width), (height, width), or a single int. outsize=o and outsize=(o, o) are equivalent. The channel value, when given, is used to assert the input shape.

  • spatial_scale (float) – Scale by which the ROI coordinates are resized.

  • group_size (int) – Position sensitive group size.

  • sampling_ratio ((int, int) or int) – Sampling step for the alignment. It must be an integer greater than or equal to \(1\), or None; when None is passed, the value is decided automatically. A different ratio for the height and width axes can be given by passing a tuple of ints as (sampling_ratio_h, sampling_ratio_w). sampling_ratio=s and sampling_ratio=(s, s) are equivalent.

Returns

Output variable.

Return type

Variable

See the original paper proposing PSROIPooling: R-FCN. See the original paper proposing ROIAlign: Mask R-CNN.
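The "align" part of this function can be illustrated with a minimal pure-Python sketch. It is not ChainerCV's implementation: the helper names `bilinear` and `aligned_bin_average` are hypothetical, and placing the samples at the centres of evenly sized sub-cells is an assumption based on the ROIAlign idea in Mask R-CNN. It shows how one output bin could be computed as the average of sampling_ratio_h × sampling_ratio_w bilinearly interpolated values instead of integer-snapped pixels.

```python
import math


def bilinear(feature, y, x):
    """Bilinearly interpolate a 2-D list `feature` at the real point (y, x)."""
    h, w = len(feature), len(feature[0])
    # Clamp the sample point into the valid range of the map.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    ly, lx = y - y0, x - x0
    top = feature[y0][x0] * (1 - lx) + feature[y0][x1] * lx
    bottom = feature[y1][x0] * (1 - lx) + feature[y1][x1] * lx
    return top * (1 - ly) + bottom * ly


def aligned_bin_average(feature, y_start, x_start, bin_h, bin_w,
                        ratio_h, ratio_w):
    """Average ratio_h * ratio_w bilinear samples spread evenly over one bin.

    Assumption: each sample sits at the centre of its sub-cell.
    """
    total = 0.0
    for iy in range(ratio_h):
        y = y_start + (iy + 0.5) * bin_h / ratio_h
        for ix in range(ratio_w):
            x = x_start + (ix + 0.5) * bin_w / ratio_w
            total += bilinear(feature, y, x)
    return total / (ratio_h * ratio_w)


# A 3x3 map whose value is 3*y + x, so interpolation is exact.
feature = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]
value = aligned_bin_average(feature, 0.5, 0.5, 1.0, 1.0, 2, 2)
```

In the position-sensitive setting, `feature` would be the single input channel assigned to the bin's group cell; that channel selection is left out here to keep the sketch short.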

ps_roi_average_pooling_2d

chainercv.functions.ps_roi_average_pooling_2d(x, rois, roi_indices, outsize, spatial_scale, group_size)

Position Sensitive Region of Interest (ROI) Average pooling function.

This function computes the position-sensitive average of an input spatial patch over the given regions of interest. Each ROI is split into \((group\_size, group\_size)\) regions, and the position-sensitive values in each region are computed.

Parameters
  • x (Variable) – Input variable. The shape is expected to be 4-dimensional: (n: batch, c: channel, h: height, w: width).

  • rois (array) – Input ROIs. The shape is expected to be \((R, 4)\), and each row is laid out as (y_min, x_min, y_max, x_max). The dtype is numpy.float32.

  • roi_indices (array) – Input ROI indices, giving the batch index of the image each ROI belongs to. The shape is expected to be \((R, )\). The dtype is numpy.int32.

  • outsize ((int, int, int) or (int, int) or int) – Expected output size after pooling: (channel, height, width), (height, width), or a single int. outsize=o and outsize=(o, o) are equivalent. The channel value, when given, is used to assert the input shape.

  • spatial_scale (float) – Scale by which the ROI coordinates are resized.

  • group_size (int) – Position sensitive group size.

Returns

Output variable.

Return type

Variable

See the original paper proposing PSROIPooling: R-FCN.
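The position-sensitive averaging described above can be sketched in plain Python for a single ROI on a single image. This is an illustrative sketch under stated assumptions, not the library's code: the function name `ps_roi_average_pool` is hypothetical, the input is a nested list rather than a Variable, and the channel layout (each output channel owning group_size × group_size consecutive input channels, one per group cell) is an assumption modelled on R-FCN.

```python
import math


def ps_roi_average_pool(x, roi, out_c, out_h, out_w, spatial_scale,
                        group_size):
    """Position-sensitive average pooling of one ROI.

    x   : nested list indexed as [channel][height][width]
    roi : (y_min, x_min, y_max, x_max) in image coordinates
    """
    height, width = len(x[0]), len(x[0][0])
    # Assumed relation between input channels and the requested output size.
    assert len(x) == out_c * group_size * group_size
    y_min, x_min, y_max, x_max = (v * spatial_scale for v in roi)
    bin_h = max(y_max - y_min, 0.1) / out_h
    bin_w = max(x_max - x_min, 0.1) / out_w

    out = [[[0.0] * out_w for _ in range(out_h)] for _ in range(out_c)]
    for c_out in range(out_c):
        for ph in range(out_h):
            for pw in range(out_w):
                # Integer pixel range covered by this bin, clipped to the map.
                h0 = min(max(int(math.floor(y_min + ph * bin_h)), 0), height)
                h1 = min(max(int(math.ceil(y_min + (ph + 1) * bin_h)), 0),
                         height)
                w0 = min(max(int(math.floor(x_min + pw * bin_w)), 0), width)
                w1 = min(max(int(math.ceil(x_min + (pw + 1) * bin_w)), 0),
                         width)
                # Group cell of this bin decides which input channel is read.
                gh = min(ph * group_size // out_h, group_size - 1)
                gw = min(pw * group_size // out_w, group_size - 1)
                c_in = (c_out * group_size + gh) * group_size + gw
                values = [x[c_in][i][j]
                          for i in range(h0, h1) for j in range(w0, w1)]
                out[c_out][ph][pw] = sum(values) / len(values) if values else 0.0
    return out


# Four constant 4x4 channels with values 0, 1, 2, 3; group_size = 2.
x = [[[float(k)] * 4 for _ in range(4)] for k in range(4)]
pooled = ps_roi_average_pool(x, (0, 0, 4, 4), 1, 2, 2, 1.0, 2)
```

With these inputs each of the four output bins reads a different input channel, which is what makes the pooling position sensitive.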

ps_roi_max_align_2d

chainercv.functions.ps_roi_max_align_2d(x, rois, roi_indices, outsize, spatial_scale, group_size, sampling_ratio=None)

Position Sensitive Region of Interest (ROI) Max align function.

This function computes the position-sensitive maximum of an input spatial patch over the given regions of interest. Each ROI is split into \((group\_size, group\_size)\) regions, and the position-sensitive values in each region are computed.

Parameters
  • x (Variable) – Input variable. The shape is expected to be 4-dimensional: (n: batch, c: channel, h: height, w: width).

  • rois (array) – Input ROIs. The shape is expected to be \((R, 4)\), and each row is laid out as (y_min, x_min, y_max, x_max). The dtype is numpy.float32.

  • roi_indices (array) – Input ROI indices, giving the batch index of the image each ROI belongs to. The shape is expected to be \((R, )\). The dtype is numpy.int32.

  • outsize ((int, int, int) or (int, int) or int) – Expected output size after pooling: (channel, height, width), (height, width), or a single int. outsize=o and outsize=(o, o) are equivalent. The channel value, when given, is used to assert the input shape.

  • spatial_scale (float) – Scale by which the ROI coordinates are resized.

  • group_size (int) – Position sensitive group size.

  • sampling_ratio ((int, int) or int) – Sampling step for the alignment. It must be an integer greater than or equal to \(1\), or None; when None is passed, the value is decided automatically. A different ratio for the height and width axes can be given by passing a tuple of ints as (sampling_ratio_h, sampling_ratio_w). sampling_ratio=s and sampling_ratio=(s, s) are equivalent.

Returns

Output variable.

Return type

Variable

See the original paper proposing PSROIPooling: R-FCN. See the original paper proposing ROIAlign: Mask R-CNN.
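The equivalences stated for outsize and sampling_ratio can be made concrete with a small sketch. The helpers `normalize_outsize` and `normalize_sampling_ratio` are hypothetical names used only for illustration, and how ChainerCV normalises these arguments internally is not shown in this reference; treating 1 as the smallest valid ratio is an assumption taken from the parameter description.

```python
def normalize_outsize(outsize):
    """Return (channel, height, width); channel is None when not given."""
    if isinstance(outsize, int):
        return None, outsize, outsize
    outsize = tuple(outsize)
    if len(outsize) == 2:
        return (None,) + outsize
    if len(outsize) == 3:
        return outsize
    raise ValueError('outsize must be an int or a tuple of 2 or 3 ints')


def normalize_sampling_ratio(sampling_ratio):
    """Return (ratio_h, ratio_w); None means "decide automatically"."""
    if sampling_ratio is None or isinstance(sampling_ratio, int):
        pair = (sampling_ratio, sampling_ratio)
    else:
        pair = tuple(sampling_ratio)
        if len(pair) != 2:
            raise ValueError('sampling_ratio must be an int or a pair of ints')
    for s in pair:
        # Assumed lower bound: every explicit ratio must be an int >= 1.
        if s is not None and (not isinstance(s, int) or s < 1):
            raise TypeError('sampling_ratio must be an integer >= 1 or None')
    return pair
```

Under these assumptions `normalize_outsize(7)` and `normalize_outsize((7, 7))` give the same result, as do `normalize_sampling_ratio(2)` and `normalize_sampling_ratio((2, 2))`.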

ps_roi_max_pooling_2d

chainercv.functions.ps_roi_max_pooling_2d(x, rois, roi_indices, outsize, spatial_scale, group_size)

Position Sensitive Region of Interest (ROI) Max pooling function.

This function computes the position-sensitive maximum of an input spatial patch over the given regions of interest. Each ROI is split into \((group\_size, group\_size)\) regions, and the position-sensitive values in each region are computed.

Parameters
  • x (Variable) – Input variable. The shape is expected to be 4-dimensional: (n: batch, c: channel, h: height, w: width).

  • rois (array) – Input ROIs. The shape is expected to be \((R, 4)\), and each row is laid out as (y_min, x_min, y_max, x_max). The dtype is numpy.float32.

  • roi_indices (array) – Input ROI indices, giving the batch index of the image each ROI belongs to. The shape is expected to be \((R, )\). The dtype is numpy.int32.

  • outsize ((int, int, int) or (int, int) or int) – Expected output size after pooling: (channel, height, width), (height, width), or a single int. outsize=o and outsize=(o, o) are equivalent. The channel value, when given, is used to assert the input shape.

  • spatial_scale (float) – Scale by which the ROI coordinates are resized.

  • group_size (int) – Position sensitive group size.

Returns

Output variable.

Return type

Variable

See the original paper proposing PSROIPooling: R-FCN.
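The role of spatial_scale and the per-region maximum can be illustrated with a short sketch. It is not the library's implementation: `scaled_bins` and `bin_max` are hypothetical helpers, the floor/ceil rounding of bin edges is an assumption, and the ROI is assumed to lie fully inside the feature map so no clipping is done. The example uses spatial_scale = 1/16, as would fit a feature map 16 times smaller than the image.

```python
import math


def scaled_bins(roi, spatial_scale, out_h, out_w):
    """Map an image-space ROI onto the feature map and split it into bins.

    Returns a list of rows; each entry is (h_start, h_end, w_start, w_end),
    a half-open integer index range on the feature map.
    """
    y_min, x_min, y_max, x_max = (v * spatial_scale for v in roi)
    bin_h = (y_max - y_min) / out_h
    bin_w = (x_max - x_min) / out_w
    bins = []
    for ph in range(out_h):
        row = []
        for pw in range(out_w):
            row.append((
                int(math.floor(y_min + ph * bin_h)),
                int(math.ceil(y_min + (ph + 1) * bin_h)),
                int(math.floor(x_min + pw * bin_w)),
                int(math.ceil(x_min + (pw + 1) * bin_w)),
            ))
        bins.append(row)
    return bins


def bin_max(channel, bounds):
    """Maximum of a 2-D list `channel` inside one bin."""
    h0, h1, w0, w1 = bounds
    return max(channel[i][j] for i in range(h0, h1) for j in range(w0, w1))


# A 64x64 image-space ROI lands on a 4x4 patch of the feature map.
bins = scaled_bins((0, 0, 64, 64), 1.0 / 16, 2, 2)
channel = [[r * 4 + c for c in range(4)] for r in range(4)]
maxima = [[bin_max(channel, b) for b in row] for row in bins]
```

In the position-sensitive functions, each bin would read from the input channel assigned to its group cell rather than from one shared channel as in this simplified example.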