Naming Conventions
The following notation is used throughout.
\(B\) is the size of a batch.
\(H\) is the height of an image.
\(W\) is the width of an image.
\(C\) is the number of channels.
\(R\) is the total number of instances in an image.
\(L\) is the number of classes.
Data objects
Images
imgs
: \((B, C, H, W)\) or \([(C, H, W)]\)
img
: \((C, H, W)\)
Note
image is used in the name of a function or a class (e.g., chainercv.utils.write_image()).
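As a rough illustration of the two forms of imgs, the sketch below builds both with NumPy. The concrete sizes and the float32 dtype are assumptions chosen for the example, as is the idea that the list form is used when images differ in size; none of this is prescribed by the convention itself.

```python
import numpy as np

B, C, H, W = 2, 3, 4, 5

# Batched form: a single array of shape (B, C, H, W).
imgs = np.zeros((B, C, H, W), dtype=np.float32)

# List form [(C, H, W)]: one array per image, so H and W may vary.
imgs_list = [
    np.zeros((C, 4, 5), dtype=np.float32),
    np.zeros((C, 6, 7), dtype=np.float32),
]

# Taking one element from either form gives an img of shape (C, H, W).
img = imgs[0]
print(imgs.shape, img.shape, [im.shape for im in imgs_list])
```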
Bounding boxes
bboxes
: \((B, R, 4)\) or \([(R, 4)]\)
bbox
: \((R, 4)\)
bb
: \((4,)\)
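The three levels of bounding-box names can be sketched as follows. The coordinate values are placeholders, and no particular coordinate order is implied, since this page does not specify one.

```python
import numpy as np

# List form [(R, 4)]: two images with a different number of instances R.
bboxes = [
    np.array([[0, 0, 10, 10], [5, 5, 20, 20]], dtype=np.float32),  # R = 2
    np.array([[1, 2, 3, 4]], dtype=np.float32),                    # R = 1
]

bbox = bboxes[0]  # (R, 4): all bounding boxes of one image
bb = bbox[1]      # (4,): a single bounding box
print(bbox.shape, bb.shape)
```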
Labels
name | classification | detection and instance segmentation | semantic segmentation |
---|---|---|---|
labels | \((B,)\) | \((B, R)\) or \([(R,)]\) | \((B, H, W)\) |
label | \(()\) | \((R,)\) | \((H, W)\) |
 | – | \(()\) | – |
Scores and probabilities
A score is an unbounded confidence value.
A probability, by contrast, is bounded in [0, 1], and the probabilities sum to 1.
name | classification | detection and instance segmentation | semantic segmentation |
---|---|---|---|
scores or probs | \((B, L)\) | \((B, R, L)\) or \([(R, L)]\) | \((B, L, H, W)\) |
score or prob | \((L,)\) | \((R, L)\) | \((L, H, W)\) |
 | – | \((L,)\) | – |
Note
An object that satisfies the definition of a probability may still be named score.
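To make the distinction concrete, here is a minimal sketch that turns a score of shape \((L,)\) into a probability of the same shape. Softmax is used only as one common choice; this page does not say how probabilities are obtained, and the function name softmax is an assumption of this example.

```python
import numpy as np

def softmax(score):
    """Map an unbounded score of shape (L,) to a probability of shape (L,)."""
    shifted = score - np.max(score)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / np.sum(exp)

score = np.array([2.0, -1.0, 0.5])  # unbounded values, L = 3
prob = softmax(score)               # each entry in [0, 1], entries sum to 1
print(prob, prob.sum())
```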
Instance segmentations
masks
: \((B, R, H, W)\) or \([(R, H, W)]\)
mask
: \((R, H, W)\)
msk
: \((H, W)\)
Attributing an additional meaning to a basic data object
RoIs
rois
: \((R', 4)\). Bounding boxes collected from multiple images. If there are \(B\) images and the \(i\)-th image contains \(R_i\) bounding boxes, then \(R' = \sum_{i} R_i\).
roi_indices
: \((R',)\). The batch index of the image to which each bounding box belongs.
roi
: \((R, 4)\). The RoIs for a single image.
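The relationship between rois, roi_indices, and roi can be sketched as below. The per-image list roi_list is a hypothetical name introduced for this example, and the box values are placeholders.

```python
import numpy as np

# Hypothetical per-image RoIs: B = 3 images with R_i = 2, 0, and 1 boxes.
roi_list = [
    np.array([[0, 0, 4, 4], [1, 1, 5, 5]], dtype=np.float32),
    np.zeros((0, 4), dtype=np.float32),
    np.array([[2, 2, 6, 6]], dtype=np.float32),
]

# Stack all boxes into one array of shape (R', 4), where R' = sum of R_i.
rois = np.concatenate(roi_list, axis=0)

# For each box, record the index of the image it came from: shape (R',).
roi_indices = np.concatenate(
    [np.full(len(roi), i, dtype=np.int32) for i, roi in enumerate(roi_list)]
)

# Recover the RoIs of image 2 from the flat representation: shape (R, 4).
roi = rois[roi_indices == 2]
print(rois.shape, roi_indices, roi)
```

Keeping a flat array plus an index array avoids padding when images contain different numbers of boxes.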
Attributes associated with RoIs
RoIs may have additional attributes, such as class scores and masks.
These attributes are named by prepending roi_ (e.g., a scores-like object is named roi_scores).
roi_xs
: \((R',) + x_{shape}\)
roi_x
: \((R,) + x_{shape}\)
For example, when x is a score of shape \((L,)\), roi_xs has shape \((R', L)\).
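As a small sketch of the shape rule, take scores as the attribute. The values are placeholders, and selecting rows with roi_indices is an assumption about how the per-image form would be obtained.

```python
import numpy as np

L = 3
roi_indices = np.array([0, 0, 2], dtype=np.int32)  # R' = 3 RoIs across the batch

# roi_scores has shape (R',) + (L,) = (R', L).
roi_scores = np.arange(9, dtype=np.float32).reshape(len(roi_indices), L)

# Scores of the RoIs that belong to image 0: shape (R, L) with R = 2.
roi_score = roi_scores[roi_indices == 0]
print(roi_scores.shape, roi_score.shape)
```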
Note
roi_nouns = roi_noun = noun when batchsize=1.
These names may then be used interchangeably.
Class-wise vs class-independent
cls_nouns is a multi-class version of nouns.
For instance, cls_locs is \((B, R, L, 4)\) and locs is \((B, R, 4)\).
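The shape difference between cls_locs and locs can be seen by reducing one to the other. Picking the entry that matches each instance's label is an assumption made for illustration; this page does not say how the two are related in practice.

```python
import numpy as np

B, R, L = 2, 3, 4

# Class-wise values: one 4-vector per class for every instance.
cls_locs = np.arange(B * R * L * 4, dtype=np.float32).reshape(B, R, L, 4)

# One class index per instance, shape (B, R).
labels = np.array([[0, 1, 3], [2, 2, 0]])

# Index arrays broadcast to (B, R), so the result has shape (B, R, 4).
b_idx = np.arange(B)[:, None]  # (B, 1)
r_idx = np.arange(R)[None, :]  # (1, R)
locs = cls_locs[b_idx, r_idx, labels]
print(cls_locs.shape, locs.shape)
```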
Note
cls_probs and probs can be used interchangeably when there is no confusion.
Arbitrary input
x is a variable whose shape can be inferred from the context.
It can be used only when there is no ambiguity about its shape.
This is usually the case when naming an input to a neural network.