Naming Conventions

The following notation is used throughout.

  • \(B\) is the size of a batch.

  • \(H\) is the height of an image.

  • \(W\) is the width of an image.

  • \(C\) is the number of channels.

  • \(R\) is the total number of instances in an image.

  • \(L\) is the number of classes.

Data objects

Images

  • imgs: \((B, C, H, W)\) or \([(C, H, W)]\)

  • img: \((C, H, W)\)

Note

The full word image is used in the names of functions and classes (e.g., chainercv.utils.write_image()).
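A minimal sketch of the two accepted forms of imgs, assuming NumPy arrays stand in for images. The sizes and the name imgs_list are illustrative assumptions; the list form is shown here with images of different sizes, which is one plausible reason to prefer it over a single batched array.

```python
import numpy as np

B, C, H, W = 2, 3, 32, 48  # illustrative sizes, not prescribed by the conventions

# Batched form: a single array of shape (B, C, H, W).
imgs = np.zeros((B, C, H, W), dtype=np.float32)

# List form [(C, H, W)]: one array per image (imgs_list is a hypothetical name).
imgs_list = [
    np.zeros((C, 32, 48), dtype=np.float32),
    np.zeros((C, 40, 24), dtype=np.float32),
]

# A single image taken from the batch has shape (C, H, W).
img = imgs[0]
```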

Bounding boxes

  • bboxes: \((B, R, 4)\) or \([(R, 4)]\)

  • bbox: \((R, 4)\)

  • bb: \((4,)\)
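The three levels of naming can be sketched as follows, assuming NumPy arrays. The coordinate values are placeholders, since the ordering of the four values is not specified in this section.

```python
import numpy as np

# List form [(R, 4)]: two images with 2 and 3 instances respectively.
# The four coordinate values are placeholders.
bboxes = [
    np.zeros((2, 4), dtype=np.float32),
    np.zeros((3, 4), dtype=np.float32),
]

bbox = bboxes[1]  # all bounding boxes of one image, shape (R, 4)
bb = bbox[0]      # a single bounding box, shape (4,)
```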

Labels

| name | classification | detection and instance segmentation | semantic segmentation |
|---|---|---|---|
| labels | \((B,)\) | \((B, R)\) or \([(R,)]\) | \((B, H, W)\) |
| label | \(()\) | \((R,)\) | \((H, W)\) |
| l or lb | -- | \(()\) | -- |
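As a rough illustration of how the label shapes differ by task, assuming NumPy integer arrays. The dictionary labels_by_task is a hypothetical container used only to keep the three cases apart, and the scalar lb is taken from the detection case.

```python
import numpy as np

B, R, H, W = 2, 3, 4, 5  # illustrative sizes

# labels for each task (labels_by_task is a hypothetical name).
labels_by_task = {
    "classification": np.zeros((B,), dtype=np.int32),
    "detection": np.zeros((B, R), dtype=np.int32),
    "semantic_segmentation": np.zeros((B, H, W), dtype=np.int32),
}

# Detection: label holds the classes of all instances in one image.
label = labels_by_task["detection"][0]  # shape (R,)
lb = label[0]                           # a single instance's class, shape ()
```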

Scores and probabilities

A score is an unbounded confidence value. A probability, in contrast, is bounded in [0, 1], and the probabilities sum to 1.

| name | classification | detection and instance segmentation | semantic segmentation |
|---|---|---|---|
| scores or probs | \((B, L)\) | \((B, R, L)\) or \([(R, L)]\) | \((B, L, H, W)\) |
| score or prob | \((L,)\) | \((R, L)\) | \((L, H, W)\) |
| sc or pb | -- | \((L,)\) | -- |

Note

An object that satisfies the definition of a probability may still be named score.
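To make the distinction concrete, the sketch below converts classification scores of shape \((B, L)\) into probs with a softmax over the class axis. Softmax is only one common choice and is an assumption here, not something these conventions prescribe; the helper softmax is hypothetical.

```python
import numpy as np

def softmax(x, axis):
    """Hypothetical helper: numerically stable softmax along one axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

# scores: unbounded values, shape (B, L) with B = 2 and L = 3.
scores = np.array([[2.0, -1.0, 0.5],
                   [0.0, 0.0, 0.0]])

# probs: same shape, each value in [0, 1], each row summing to 1.
probs = softmax(scores, axis=1)
```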

Instance segmentations

  • masks: \((B, R, H, W)\) or \([(R, H, W)]\)

  • mask: \((R, H, W)\)

  • msk: \((H, W)\)

Attributing an additional meaning to a basic data object

RoIs

  • rois: \((R', 4)\), which consists of bounding boxes for multiple images. Assuming that there are \(B\) images, the \(i\)-th of which contains \(R_i\) bounding boxes, \(R' = \sum_i R_i\).

  • roi_indices: An array of shape \((R',)\) that contains, for each bounding box, the batch index of the image it belongs to.

  • roi: \((R, 4)\). These are the RoIs for a single image.
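The relation between rois, roi_indices, and roi can be sketched with NumPy as follows. The per-image list roi_list and the placeholder values are assumptions made for illustration.

```python
import numpy as np

# Per-image RoIs, each of shape (R_i, 4); roi_list is a hypothetical name
# and the values are placeholders (zeros for image 0, ones for image 1).
roi_list = [
    np.zeros((2, 4), dtype=np.float32),
    np.ones((3, 4), dtype=np.float32),
]

# rois stacks all boxes: shape (R', 4) with R' = 2 + 3 = 5.
rois = np.concatenate(roi_list, axis=0)

# roi_indices records which image each row of rois came from: shape (R',).
roi_indices = np.concatenate(
    [np.full((len(r),), i, dtype=np.int32) for i, r in enumerate(roi_list)]
)

# Recover the RoIs of a single image (here image 1): shape (R, 4).
roi = rois[roi_indices == 1]
```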

Attributes associated with RoIs

RoIs may have additional attributes, such as class scores and masks. These attributes are named by prepending roi_ (e.g., a scores-like object is named roi_scores).

  • roi_xs: \((R',) + x_{shape}\)

  • roi_x: \((R,) + x_{shape}\)

In the case of scores with shape \((L,)\), roi_xs would have shape \((R', L)\).
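A short sketch of the scores case, assuming NumPy arrays and an roi_indices array like the one described for RoIs. The sizes are illustrative.

```python
import numpy as np

L = 4  # number of classes (illustrative)

# Batch indices for R' = 5 RoIs spread over two images.
roi_indices = np.array([0, 0, 1, 1, 1], dtype=np.int32)

# roi_scores: shape (R',) + (L,) = (R', L).
roi_scores = np.zeros((len(roi_indices), L), dtype=np.float32)

# roi_score: the scores of the RoIs of one image, shape (R, L).
roi_score = roi_scores[roi_indices == 0]
```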

Note

When the batch size is 1, roi_nouns, roi_noun, and noun all refer to the same object, so the names may be used interchangeably.

Class-wise vs class-independent

cls_nouns is a multi-class version of nouns. For instance, cls_locs is \((B, R, L, 4)\) and locs is \((B, R, 4)\).

Note

cls_probs and probs can be used interchangeably when there is no risk of confusion.
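The shape difference between cls_locs and locs can be illustrated as below. Selecting one class per instance with a labels array is only one plausible way to go from the class-wise to the class-independent form; it is an assumption for this sketch, not a rule stated by these conventions.

```python
import numpy as np

B, R, L = 2, 3, 5  # illustrative sizes
rng = np.random.default_rng(0)

# Class-wise localizations: one 4-vector per class, shape (B, R, L, 4).
cls_locs = rng.standard_normal((B, R, L, 4))

# One class index per instance, shape (B, R) (assumed to be available).
labels = rng.integers(0, L, size=(B, R))

# Class-independent localizations: pick each instance's own class,
# giving shape (B, R, 4).
batch_idx = np.arange(B)[:, None]
inst_idx = np.arange(R)[None, :]
locs = cls_locs[batch_idx, inst_idx, labels]
```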

Arbitrary input

x is a variable whose shape can be inferred from the context. It can be used only when there is no confusion about its shape. This is usually the case when naming an input to a neural network.