Naming Conventions¶

The following notation is used.

\(B\) is the size of a batch.
\(H\) is the height of an image.
\(W\) is the width of an image.
\(C\) is the number of channels.
\(R\) is the total number of instances in an image.
\(L\) is the number of classes.

Data objects¶

Images¶

imgs: \((B, C, H, W)\) or \([(C, H, W)]\)
img: \((C, H, W)\)

Note

The full word image is used in the names of functions and classes (e.g., chainercv.utils.write_image()).
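As a minimal sketch of the two forms of imgs, the snippet below builds a batched array and a list of per-image arrays with NumPy. The sizes and the float32 dtype are arbitrary choices made for illustration and are not prescribed by the convention.

```python
import numpy as np

B, C, H, W = 2, 3, 4, 5  # illustrative sizes only

# Batched form: one array of shape (B, C, H, W).
imgs = np.zeros((B, C, H, W), dtype=np.float32)

# A single element of the batch is an img of shape (C, H, W).
img = imgs[0]

# List form [(C, H, W)]: useful when images differ in height and width
# and therefore cannot be stacked into one array.
imgs_list = [
    np.zeros((C, 4, 5), dtype=np.float32),
    np.zeros((C, 6, 7), dtype=np.float32),
]
```

The list form keeps each image's own \(H\) and \(W\), while the batched form requires every image to share them.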

Bounding boxes¶

bboxes: \((B, R, 4)\) or \([(R, 4)]\)
bbox: \((R, 4)\)
bb: \((4,)\)
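The three names can be related by indexing, as in the sketch below. The box values are placeholders, since the coordinate order is not defined in this section, and the list form is used because the images are assumed to contain different numbers of boxes.

```python
import numpy as np

# bboxes in list form [(R, 4)]: image 0 has 2 boxes, image 1 has 3.
# All values are placeholders; only the shapes matter here.
bboxes = [
    np.zeros((2, 4), dtype=np.float32),
    np.zeros((3, 4), dtype=np.float32),
]

# bbox: all boxes of one image, shape (R, 4).
bbox = bboxes[1]

# bb: a single box, shape (4,).
bb = bbox[0]
```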

Labels¶

name	classification	detection and instance segmentation	semantic segmentation
`labels`	\((B,)\)	\((B, R)\) or \([(R,)]\)	\((B, H, W)\)
`label`	\(()\)	\((R,)\)	\((H, W)\)
`l` or `lb`	–	\(()\)	–
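The shapes in the table can be checked with a short NumPy sketch. The `_clf`, `_det`, and `_seg` suffixes are hypothetical and exist only to keep the three tasks apart in one script; they are not part of the convention. The int32 dtype and the class ids are likewise assumptions.

```python
import numpy as np

B, H, W = 2, 4, 5  # illustrative sizes only

# Classification: one class id per image.
labels_clf = np.array([1, 0], dtype=np.int32)  # labels: (B,)
label_clf = labels_clf[0]                      # label: ()

# Detection / instance segmentation: one class id per instance.
# The list form [(R,)] is used because R differs between images.
labels_det = [
    np.array([2, 0, 1], dtype=np.int32),
    np.array([1], dtype=np.int32),
]
label_det = labels_det[0]  # label: (R,)
lb = label_det[0]          # l or lb: ()

# Semantic segmentation: one class id per pixel.
labels_seg = np.zeros((B, H, W), dtype=np.int32)  # labels: (B, H, W)
label_seg = labels_seg[0]                         # label: (H, W)
```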

Scores and probabilities¶

A score is an unbounded confidence value. A probability, in contrast, lies in [0, 1], and the probabilities over all classes sum to 1.

name	classification	detection and instance segmentation	semantic segmentation
`scores` or `probs`	\((B, L)\)	\((B, R, L)\) or \([(R, L)]\)	\((B, L, H, W)\)
`score` or `prob`	\((L,)\)	\((R, L)\)	\((L, H, W)\)
`sc` or `pb`	–	\((L,)\)	–

Note

An object that satisfies the definition of a probability may still be named score.
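To illustrate the distinction, the sketch below converts classification scores of shape \((B, L)\) into probs of the same shape. Using a softmax for this conversion is an assumption made for the example; the convention itself does not say how probabilities are obtained.

```python
import numpy as np


def softmax(score):
    """Map unbounded scores to probabilities along the last (class) axis."""
    # Subtracting the maximum improves numerical stability and does not
    # change the result.
    e = np.exp(score - score.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


# scores: (B, L) with B = 2 images and L = 3 classes; values are unbounded.
scores = np.array([[2.0, -1.0, 0.5],
                   [0.0, 0.0, 10.0]])

# probs: (B, L); each row lies in [0, 1] and sums to 1.
probs = softmax(scores)

# prob: (L,), the probabilities for a single image.
prob = probs[0]
```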

Instance segmentations¶

masks: \((B, R, H, W)\) or \([(R, H, W)]\)
mask: \((R, H, W)\)
msk: \((H, W)\)

Attributing an additional meaning to a basic data object¶

RoIs¶

rois: \((R', 4)\). Bounding boxes gathered from multiple images. If there are \(B\) images and the \(i\)-th image contains \(R_i\) bounding boxes, then \(R' = \sum_i R_i\).
roi_indices: An array of shape \((R',)\) that contains batch indices of images to which bounding boxes correspond.
roi: \((R, 4)\). These are the RoIs for a single image.
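The relationship between rois, roi_indices, and roi can be sketched as follows. `roi_list` is a hypothetical name for the per-image RoIs, and the box values are placeholders.

```python
import numpy as np

# Hypothetical per-image RoIs: image 0 has R_0 = 2 boxes, image 1 has R_1 = 3.
roi_list = [
    np.zeros((2, 4), dtype=np.float32),
    np.ones((3, 4), dtype=np.float32),
]

# rois: (R', 4) with R' = R_0 + R_1 = 5.
rois = np.concatenate(roi_list, axis=0)

# roi_indices: (R',), the batch index of the image each RoI belongs to.
roi_indices = np.concatenate(
    [np.full((len(r),), i, dtype=np.int32) for i, r in enumerate(roi_list)]
)

# roi: (R, 4), the RoIs of a single image, recovered with a boolean mask.
roi = rois[roi_indices == 1]
```

Flattening the batch this way avoids padding when images have different numbers of RoIs; roi_indices is what keeps the association with the source image.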

Attributes associated with RoIs¶

RoIs may have additional attributes, such as class scores and masks. These attributes are named by prepending roi_ (e.g., a scores-like object is named roi_scores).

roi_xs: \((R',) + x_{shape}\)
roi_x: \((R,) + x_{shape}\)

In the case of scores with shape \((L,)\), roi_xs would have shape \((R', L)\).
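The shape rule \((R',) + x_{shape}\) reads directly as tuple concatenation in Python. The sketch below applies it to scores; `score_shape` is a hypothetical name, and the roi_indices values are made up for illustration.

```python
import numpy as np

R_prime, L = 5, 3
score_shape = (L,)  # shape of the attribute for a single RoI

# roi_scores: (R',) + score_shape == (R', L)
roi_scores = np.zeros((R_prime,) + score_shape, dtype=np.float32)

# Illustrative batch indices: 2 RoIs in image 0 and 3 in image 1.
roi_indices = np.array([0, 0, 1, 1, 1], dtype=np.int32)

# roi_score: (R,) + score_shape, the scores of the RoIs in image 0 (R = 2).
roi_score = roi_scores[roi_indices == 0]
```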

Note

When the batch size is 1, roi_nouns, roi_noun, and noun refer to the same object, so these names may be used interchangeably.

Class-wise vs class-independent¶

cls_nouns is a multi-class version of nouns. For instance, cls_locs has shape \((B, R, L, 4)\) and locs has shape \((B, R, 4)\).

Note

cls_probs and probs can be used interchangeably when there is no risk of confusion.
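One way to see how the two shapes relate is to reduce cls_locs to locs by picking one class per instance. Selecting with labels of shape \((B, R)\) is an assumption made for this sketch, not something the convention requires, and the array values are arbitrary.

```python
import numpy as np

B, R, L = 2, 3, 4  # illustrative sizes only

# cls_locs: (B, R, L, 4), one 4-vector per instance and per class.
cls_locs = np.arange(B * R * L * 4, dtype=np.float32).reshape(B, R, L, 4)

# labels: (B, R), the class chosen for each instance (made-up values).
labels = np.array([[0, 1, 3],
                   [2, 2, 0]])

# Index arrays that broadcast to (B, R) alongside labels.
b_idx = np.arange(B)[:, None]
r_idx = np.arange(R)[None, :]

# locs: (B, R, 4), the class-independent version after selection.
locs = cls_locs[b_idx, r_idx, labels]
```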

Arbitrary input¶

x is a variable whose shape can be inferred from the context. Use it only when its shape is unambiguous, which is usually the case when naming an input to a neural network.