mmrotate.apis
- mmrotate.apis.inference_detector_by_patches(model, img, sizes, steps, ratios, merge_iou_thr, bs=1)
Run inference on image patches with the detector.
Split huge image(s) into patches and run the detector on them. Finally, merge the patch results on the full image by NMS.
- Parameters
model (nn.Module) – The loaded detector.
img (str | ndarray) – Either an image file or a loaded image.
sizes (list) – The sizes of patches.
steps (list) – The steps between two patches.
ratios (list) – Image resizing ratios for multi-scale detecting.
merge_iou_thr (float) – IoU threshold for merging results.
bs (int) – Batch size; must be greater than or equal to 1.
- Returns
Detection results.
- Return type
list[np.ndarray]
mmrotate.core
anchor
- class mmrotate.core.anchor.PseudoAnchorGenerator(strides)
Non-standard pseudo anchor generator that is used to generate valid flags only.
- property num_base_anchors
Total number of base anchors in a feature grid.
- Type
list[int]
- class mmrotate.core.anchor.RotatedAnchorGenerator(strides, ratios, scales=None, base_sizes=None, scale_major=True, octave_base_scale=None, scales_per_octave=None, centers=None, center_offset=0.0)
Fake rotated anchor generator for 2D anchor-based detectors.
Horizontal bounding boxes are represented by (x, y, w, h, theta).
- single_level_grid_priors(featmap_size, level_idx, dtype=torch.float32, device='cuda')
Generate grid anchors of a single level.
Note
This function is usually called by the method self.grid_priors.
- Parameters
featmap_size (tuple[int]) – Size of the feature maps.
level_idx (int) – The index of corresponding feature map level.
dtype (torch.dtype) – Data type of points. Defaults to torch.float32.
device (str, optional) – The device the tensor will be put on. Defaults to 'cuda'.
- Returns
Anchors in the overall feature maps.
- Return type
torch.Tensor
- mmrotate.core.anchor.rotated_anchor_inside_flags(flat_anchors, valid_flags, img_shape, allowed_border=0)
Check whether the rotated anchors are inside the border.
- Parameters
flat_anchors (torch.Tensor) – Flattened anchors, shape (n, 5).
valid_flags (torch.Tensor) – Existing valid flags of the anchors.
img_shape (tuple(int)) – Shape of current image.
allowed_border (int, optional) – The border allowed for valid anchors. Defaults to 0.
- Returns
Flags indicating whether the anchors are inside a valid range.
- Return type
torch.Tensor
bbox
- class mmrotate.core.bbox.ATSSKldAssigner(topk, use_reassign=False)
Assign a corresponding gt bbox or background to each bbox.
Each proposal will be assigned 0 or a positive integer indicating the ground truth index.
0: negative sample, no assigned gt
positive integer: positive sample, index (1-based) of assigned gt
- Parameters
topk (float) – Number of bboxes selected in each level.
use_reassign (bool, optional) – If true, it is used to reassign samples.
- AspectRatio(gt_rbboxes)
Compute the aspect ratio of all gts.
- Parameters
gt_rbboxes (torch.Tensor) – Groundtruth polygons, shape (k, 8).
- Returns
ratios: The aspect ratio of gt_rbboxes, shape (k, 1).
- Return type
torch.Tensor
- assign(bboxes, num_level_bboxes, gt_bboxes, gt_bboxes_ignore=None, gt_labels=None)
Assign gt to bboxes.
The assignment is done in the following steps:
1. compute iou between all bboxes (bboxes of all pyramid levels) and gt
2. compute center distance between all bboxes and gt
3. on each pyramid level, for each gt, select the k bboxes whose centers are closest to the gt center, so k*l bboxes are selected in total as candidates for each gt
4. get the corresponding iou for these candidates, compute the mean and std, and set mean + std as the iou threshold
5. compute the mean aspect ratio of all gts, and set exp(-(mean aspect ratio) / 4) * (mean + std) as the iou threshold
6. select the candidates whose iou is greater than or equal to the threshold as positive
7. limit the positive samples' centers to lie inside the gt
- Parameters
bboxes (Tensor) – Bounding boxes to be assigned, shape (n, 4).
num_level_bboxes (List) – Number of bboxes in each level.
gt_bboxes (Tensor) – Groundtruth boxes, shape (k, 4).
gt_bboxes_ignore (Tensor, optional) – Ground truth bboxes that are labelled as ignored, e.g., crowd boxes in COCO.
gt_labels (Tensor, optional) – Label of gt_bboxes, shape (k, ).
- Returns
The assign result.
- Return type
AssignResult
- get_horizontal_bboxes(gt_rbboxes)
Get horizontal bboxes from polygons.
- Parameters
gt_rbboxes (torch.Tensor) – Groundtruth polygons, shape (k, 8).
- Returns
gt_rect_bboxes: The horizontal bboxes, shape (k, 4).
- Return type
torch.Tensor
- kld_mixture2single(g1, g2)
Compute the Kullback-Leibler divergence between two Gaussian distributions.
- Parameters
g1 (dict[str, torch.Tensor]) – Gaussian distribution 1.
g2 (torch.Tensor) – Gaussian distribution 2.
- Returns
Kullback-Leibler Divergence.
- Return type
torch.Tensor
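As an illustration of the quantity involved, the following is a minimal sketch of the closed-form KL divergence between two single 2-D Gaussians, written in plain Python. It is not the library's batched implementation; the function name kl_gaussian_2d and the list-based inputs are assumptions made for this example.

```python
import math


def kl_gaussian_2d(mu0, cov0, mu1, cov1):
    """KL(N0 || N1) for two 2-D Gaussians given as plain Python lists.

    Hypothetical helper for illustration only; means are [x, y] and
    covariances are 2x2 nested lists.
    """
    (a0, b0), (c0, d0) = cov0
    (a1, b1), (c1, d1) = cov1
    det0 = a0 * d0 - b0 * c0
    det1 = a1 * d1 - b1 * c1
    # Inverse of the 2x2 covariance of the second Gaussian.
    inv1 = [[d1 / det1, -b1 / det1], [-c1 / det1, a1 / det1]]
    # trace(inv1 @ cov0): sum of the two diagonal entries of the product.
    trace = (inv1[0][0] * a0 + inv1[0][1] * c0
             + inv1[1][0] * b0 + inv1[1][1] * d0)
    # Mahalanobis term (mu1 - mu0)^T inv1 (mu1 - mu0).
    dx, dy = mu1[0] - mu0[0], mu1[1] - mu0[1]
    maha = (dx * (inv1[0][0] * dx + inv1[0][1] * dy)
            + dy * (inv1[1][0] * dx + inv1[1][1] * dy))
    # The constant 2.0 is the dimensionality of the distributions.
    return 0.5 * (trace + maha - 2.0 + math.log(det1 / det0))
```

Identical Gaussians give a divergence of zero, and shifting the mean of a unit-covariance Gaussian by one unit gives 0.5.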
- kld_overlaps(gt_rbboxes, points, eps=1e-06)
Compute overlaps between polygons and points by Kullback-Leibler Divergence loss.
- Parameters
gt_rbboxes (torch.Tensor) – Ground truth polygons, shape (k, 8).
points (torch.Tensor) – Points to be assigned, shape (n, 18).
eps (float, optional) – Defaults to 1e-6.
- Returns
Kullback-Leibler Divergence loss.
- Return type
Tensor
- class mmrotate.core.bbox.ATSSObbAssigner(topk, angle_version='oc', iou_calculator={'type': 'RBboxOverlaps2D'})
Assign a corresponding gt bbox or background to each bbox.
Each proposal will be assigned 0 or a positive integer indicating the ground truth index.
0: negative sample, no assigned gt
positive integer: positive sample, index (1-based) of assigned gt
- Parameters
topk (float) – Number of bboxes selected in each level.
- assign(bboxes, num_level_bboxes, gt_bboxes, gt_bboxes_ignore=None, gt_labels=None)
Assign gt to bboxes.
The assignment is done in the following steps:
1. compute iou between all bboxes (bboxes of all pyramid levels) and gt
2. compute center distance between all bboxes and gt
3. on each pyramid level, for each gt, select the k bboxes whose centers are closest to the gt center, so k*l bboxes are selected in total as candidates for each gt
4. get the corresponding iou for these candidates, compute the mean and std, and set mean + std as the iou threshold
5. select the candidates whose iou is greater than or equal to the threshold as positive
6. limit the positive samples' centers to lie inside the gt
- Parameters
bboxes (Tensor) – Bounding boxes to be assigned, shape (n, 5).
num_level_bboxes (List) – Number of bboxes in each level.
gt_bboxes (Tensor) – Groundtruth boxes, shape (k, 5).
gt_bboxes_ignore (Tensor, optional) – Ground truth bboxes that are labelled as ignored, e.g., crowd boxes in COCO.
gt_labels (Tensor, optional) – Label of gt_bboxes, shape (k, ).
- Returns
The assign result.
- Return type
AssignResult
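The adaptive threshold described in the steps above (mean + std of the candidate IoUs for one gt) can be sketched in a few lines of plain Python. This is only an illustration for a single gt; the function name atss_positive_indices is hypothetical, and the use of the sample standard deviation is an assumption.

```python
import statistics


def atss_positive_indices(candidate_ious):
    """Return (threshold, positive indices) for the candidates of one gt.

    Hypothetical helper: the threshold is mean + std of the candidate IoUs.
    The sample standard deviation is assumed here.
    """
    mean = statistics.mean(candidate_ious)
    std = statistics.stdev(candidate_ious) if len(candidate_ious) > 1 else 0.0
    threshold = mean + std
    # Keep candidates whose IoU reaches the adaptive threshold.
    positives = [i for i, iou in enumerate(candidate_ious) if iou >= threshold]
    return threshold, positives


threshold, positives = atss_positive_indices([0.2, 0.4, 0.6, 0.8])
print(round(threshold, 4), positives)
```

The final step of the procedure (keeping only positives whose center lies inside the gt) is omitted from this sketch.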
- class mmrotate.core.bbox.CSLCoder(angle_version, omega=1, window='gaussian', radius=6)
Circular Smooth Label Coder.
- Parameters
angle_version (str) – Angle definition.
omega (float, optional) – Angle discretization granularity. Default: 1.
window (str, optional) – Window function. Default: gaussian.
radius (int/float) – Window radius; int type for [‘triangle’, ‘rect’, ‘pulse’], float type for [‘gaussian’]. Default: 6.
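To make the roles of omega, window and radius concrete, here is a rough sketch of a circular smooth label with a Gaussian window in plain Python. It assumes a 180-degree angle range, an angle already shifted into [0, 180), and that radius acts as the Gaussian standard deviation; csl_encode is a hypothetical name, not the coder's method.

```python
import math


def csl_encode(angle_deg, angle_range=180, omega=1.0, radius=6.0):
    """Encode one angle (degrees) as a circular smooth label.

    Hypothetical helper: bins have width omega, and the label decays with
    the circular bin distance from the target bin (Gaussian window).
    """
    num_bins = int(angle_range / omega)
    target_bin = int(angle_deg / omega) % num_bins
    label = []
    for k in range(num_bins):
        distance = abs(k - target_bin)
        # Wrap around so the first and last bins are neighbours.
        distance = min(distance, num_bins - distance)
        label.append(math.exp(-(distance ** 2) / (2 * radius ** 2)))
    return label


label = csl_encode(2.0)
print(len(label), label[2])
```

Because the distance is circular, bins on either side of the wrap-around point receive the same weight as equally distant bins elsewhere.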
- class mmrotate.core.bbox.ConvexAssigner(scale=4, pos_num=3)
Assign a corresponding gt bbox or background to each bbox. Each proposal will be assigned 0 or a positive integer indicating the ground truth index.
0: negative sample, no assigned gt
positive integer: positive sample, index (1-based) of assigned gt
- Parameters
scale (float) – IoU threshold for positive bboxes.
pos_num (float) – Find the nearest pos_num points to gt center in this level.
- assign(points, gt_rbboxes, gt_rbboxes_ignore=None, gt_labels=None, overlaps=None)
Assign gt to bboxes.
The assignment is done in the following steps:
1. compute iou between all bboxes (bboxes of all pyramid levels) and gt
2. compute center distance between all bboxes and gt
3. on each pyramid level, for each gt, select the k bboxes whose centers are closest to the gt center, so k*l bboxes are selected in total as candidates for each gt
4. get the corresponding iou for these candidates, compute the mean and std, and set mean + std as the iou threshold
5. select the candidates whose iou is greater than or equal to the threshold as positive
6. limit the positive samples' centers to lie inside the gt
- Parameters
points (torch.Tensor) – Points to be assigned, shape (n, 18).
gt_rbboxes (torch.Tensor) – Groundtruth polygons, shape (k, 8).
gt_rbboxes_ignore (Tensor, optional) – Ground truth polygons that are labelled as ignored, e.g., crowd boxes in COCO.
gt_labels (Tensor, optional) – Label of gt_bboxes, shape (k, ).
- Returns
The assign result.
- Return type
AssignResult
- class mmrotate.core.bbox.DeltaXYWHAHBBoxCoder(target_means=(0.0, 0.0, 0.0, 0.0, 0.0), target_stds=(1.0, 1.0, 1.0, 1.0, 1.0), angle_range='oc', norm_factor=None, edge_swap=False, clip_border=True, add_ctr_clamp=False, ctr_clamp=32)
Delta XYWHA HBBox coder.
This coder encodes bbox (x1, y1, x2, y2) into delta (dx, dy, dw, dh, da) and decodes delta (dx, dy, dw, dh, da) back to original bbox (cx, cy, w, h, a).
- Parameters
target_means (Sequence[float]) – Denormalizing means of target for delta coordinates.
target_stds (Sequence[float]) – Denormalizing standard deviation of target for delta coordinates.
angle_range (str, optional) – Angle representations. Defaults to ‘oc’.
norm_factor (None|float, optional) – Regularization factor of angle.
edge_swap (bool, optional) – Whether to swap the edges if w < h. Defaults to False.
clip_border (bool, optional) – Whether to clip the objects outside the border of the image. Defaults to True.
add_ctr_clamp (bool) – Whether to add center clamp. When added, the predicted box is clamped if its center is too far away from the original anchor’s center. Only used by YOLOF. Default False.
ctr_clamp (int) – The maximum pixel shift to clamp. Only used by YOLOF. Default 32.
- decode(bboxes, pred_bboxes, max_shape=None, wh_ratio_clip=0.016)
Apply transformation pred_bboxes to boxes.
- Parameters
bboxes (torch.Tensor) – Basic boxes. Shape (B, N, 4) or (N, 4)
pred_bboxes (torch.Tensor) – Encoded offsets with respect to each roi. Has shape (B, N, num_classes * 5) or (B, N, 5) or (N, num_classes * 5) or (N, 5). Note N = num_anchors * W * H when rois is a grid of anchors.
max_shape (Sequence[int] or torch.Tensor or Sequence[Sequence[int]], optional) – Maximum bounds for boxes, specifies (H, W, C) or (H, W). If bboxes shape is (B, N, 5), then the max_shape should be a Sequence[Sequence[int]] and the length of max_shape should also be B.
wh_ratio_clip (float, optional) – The allowed ratio between width and height.
- Returns
Decoded boxes.
- Return type
torch.Tensor
- encode(bboxes, gt_bboxes)
Get box regression transformation deltas that can be used to transform the bboxes into the gt_bboxes.
- Parameters
bboxes (torch.Tensor) – Source boxes, e.g., object proposals.
gt_bboxes (torch.Tensor) – Target of the transformation, e.g., ground-truth boxes.
- Returns
Box transformation deltas
- Return type
torch.Tensor
- class mmrotate.core.bbox.DeltaXYWHAOBBoxCoder(target_means=(0.0, 0.0, 0.0, 0.0, 0.0), target_stds=(1.0, 1.0, 1.0, 1.0, 1.0), angle_range='oc', norm_factor=None, edge_swap=False, proj_xy=False, add_ctr_clamp=False, ctr_clamp=32)
Delta XYWHA OBBox coder. This coder is used for rotated object detection (for example on task1 of the DOTA dataset). It encodes bbox (xc, yc, w, h, a) into delta (dx, dy, dw, dh, da) and decodes delta (dx, dy, dw, dh, da) back to original bbox (xc, yc, w, h, a).
- Parameters
target_means (Sequence[float]) – Denormalizing means of target for delta coordinates.
target_stds (Sequence[float]) – Denormalizing standard deviation of target for delta coordinates.
angle_range (str, optional) – Angle representations. Defaults to ‘oc’.
norm_factor (None|float, optional) – Regularization factor of angle.
edge_swap (bool, optional) – Whether to swap the edges if w < h. Defaults to False.
proj_xy (bool, optional) – Whether to project x and y according to angle. Defaults to False.
add_ctr_clamp (bool) – Whether to add center clamp. When added, the predicted box is clamped if its center is too far away from the original anchor’s center. Only used by YOLOF. Default False.
ctr_clamp (int) – The maximum pixel shift to clamp. Only used by YOLOF. Default 32.
- decode(bboxes, pred_bboxes, max_shape=None, wh_ratio_clip=0.016)
Apply transformation pred_bboxes to boxes.
- Parameters
bboxes (torch.Tensor) – Basic boxes. Shape (B, N, 5) or (N, 5)
pred_bboxes (torch.Tensor) – Encoded offsets with respect to each roi. Has shape (B, N, num_classes * 5) or (B, N, 5) or (N, num_classes * 5) or (N, 5). Note N = num_anchors * W * H when rois is a grid of anchors.
max_shape (Sequence[int] or torch.Tensor or Sequence[Sequence[int]], optional) – Maximum bounds for boxes, specifies (H, W, C) or (H, W). If bboxes shape is (B, N, 5), then the max_shape should be a Sequence[Sequence[int]] and the length of max_shape should also be B.
wh_ratio_clip (float, optional) – The allowed ratio between width and height.
- Returns
Decoded boxes.
- Return type
torch.Tensor
- encode(bboxes, gt_bboxes)
Get box regression transformation deltas that can be used to transform the bboxes into the gt_bboxes.
- Parameters
bboxes (torch.Tensor) – Source boxes, e.g., object proposals.
gt_bboxes (torch.Tensor) – Target of the transformation, e.g., ground-truth boxes.
- Returns
Box transformation deltas
- Return type
torch.Tensor
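The encode/decode relationship for (xc, yc, w, h, a) boxes can be illustrated with a simplified sketch in plain Python. It assumes the common delta parameterisation (center offsets scaled by the proposal size, log size ratios, angle difference) and deliberately leaves out mean/std normalization, angle normalization, norm_factor, edge_swap, proj_xy and center clamping. The names encode_obb and decode_obb are hypothetical.

```python
import math


def encode_obb(proposal, gt):
    """Encode gt relative to proposal; both are (xc, yc, w, h, a) tuples."""
    px, py, pw, ph, pa = proposal
    gx, gy, gw, gh, ga = gt
    return (
        (gx - px) / pw,       # dx: center shift in units of proposal width
        (gy - py) / ph,       # dy: center shift in units of proposal height
        math.log(gw / pw),    # dw: log width ratio
        math.log(gh / ph),    # dh: log height ratio
        ga - pa,              # da: angle difference (radians)
    )


def decode_obb(proposal, delta):
    """Invert encode_obb: apply delta to proposal."""
    px, py, pw, ph, pa = proposal
    dx, dy, dw, dh, da = delta
    return (
        px + dx * pw,
        py + dy * ph,
        pw * math.exp(dw),
        ph * math.exp(dh),
        pa + da,
    )


proposal = (50.0, 40.0, 20.0, 10.0, 0.1)
gt = (54.0, 38.0, 30.0, 8.0, 0.3)
recovered = decode_obb(proposal, encode_obb(proposal, gt))
print([round(v, 6) for v in recovered])
```

Decoding the encoded deltas recovers the ground-truth box up to floating-point error, which is the property the coder relies on.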
- class mmrotate.core.bbox.GVFixCoder(angle_range='oc', **kwargs)
Gliding vertex fix coder.
This coder encodes bbox (cx, cy, w, h, a) into delta (dt, dr, dd, dl) and decodes delta (dt, dr, dd, dl) back to original bbox (cx, cy, w, h, a).
- Parameters
angle_range (str, optional) – Angle representations. Defaults to ‘oc’.
- decode(hbboxes, fix_deltas)
Apply transformation fix_deltas to boxes.
- Parameters
hbboxes (torch.Tensor) – Basic boxes. Shape (B, N, 4) or (N, 4)
fix_deltas (torch.Tensor) – Encoded offsets with respect to each roi. Has shape (B, N, num_classes * 4) or (B, N, 4) or (N, num_classes * 4) or (N, 4). Note N = num_anchors * W * H when rois is a grid of anchors.
- Returns
Decoded boxes.
- Return type
torch.Tensor
- class mmrotate.core.bbox.GVRatioCoder(angle_range='oc', **kwargs)
Gliding vertex ratio coder.
This coder encodes bbox (cx, cy, w, h, a) into delta (ratios).
- Parameters
angle_range (str, optional) – Angle representations. Defaults to ‘oc’.
- class mmrotate.core.bbox.GaussianMixture(n_components, n_features=2, mu_init=None, var_init=None, eps=1e-06, requires_grad=False)
Initializes the Gaussian mixture model and brings all tensors into their required shape.
- Parameters
n_components (int) – number of components.
n_features (int, optional) – number of features.
mu_init (torch.Tensor, optional) – (T, k, d)
var_init (torch.Tensor, optional) – (T, k, d) or (T, k, d, d)
eps (float, optional) – Defaults to 1e-6.
requires_grad (bool, optional) – Defaults to False.
- EM_step(x, log_resp)
From the log-probabilities, computes new parameters pi, mu, var (that maximize the log-likelihood). This is the maximization step of the EM algorithm.
- Parameters
x (torch.Tensor) – (T, n, d) or (T, n, 1, d)
log_resp (torch.Tensor) – (T, n, k, 1)
- Returns
pi (torch.Tensor): (T, k, 1)
mu (torch.Tensor): (T, k, d)
var (torch.Tensor): (T, k, d) or (T, k, d, d)
- Return type
tuple
- check_size(x)
Make sure that the shape of x is (T, n, 1, d).
- Parameters
x (torch.Tensor) – input tensor.
- Returns
output tensor.
- Return type
torch.Tensor
- em_runner(x)
Performs one iteration of the expectation-maximization algorithm by calling the respective subroutines.
- Parameters
x (torch.Tensor) – (n, 1, d)
- estimate_log_prob(x)
Estimate the log-likelihood probability that samples belong to the k-th Gaussian.
- Parameters
x (torch.Tensor) – (T, n, d) or (T, n, 1, d)
- Returns
log-likelihood probability that samples belong to the k-th Gaussian with dimensions (T, n, k, 1).
- Return type
torch.Tensor
- fit(x, delta=0.001, n_iter=10)
Fits Gaussian mixture model to the data.
- Parameters
x (torch.Tensor) – input tensor.
delta (float, optional) – threshold.
n_iter (int, optional) – number of iterations.
- get_score(x, sum_data=True)
Computes the log-likelihood of the data under the model.
- Parameters
x (torch.Tensor) – (T, n, 1, d)
sum_data (bool, optional) – Flag of whether to sum scores.
- Returns
score or per_sample_score.
- Return type
torch.Tensor
- log_resp_step(x)
Computes log-responses that indicate the (logarithmic) posterior belief (sometimes called responsibilities) that a data point was generated by one of the k mixture components. Also returns the mean of the mean of the logarithms of the probabilities (as is done in sklearn). This is the so-called expectation step of the EM algorithm.
- Parameters
x (torch.Tensor) – (T, n, d) or (T, n, 1, d)
- Returns
log_prob_norm (torch.Tensor): the mean of the mean of the logarithms of the probabilities.
log_resp (torch.Tensor): log-responses that indicate the posterior belief.
- Return type
tuple
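The expectation step can be sketched for a 1-D mixture in plain Python. This is a simplified illustration of log-responsibilities computed with the log-sum-exp trick, not the class's tensor implementation; the names log_gaussian_1d and log_resp_step_1d and the list-based inputs are assumptions.

```python
import math


def log_gaussian_1d(x, mu, var):
    """Log density of a 1-D Gaussian with mean mu and variance var."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)


def log_resp_step_1d(xs, weights, mus, variances):
    """Return (mean log-likelihood, log-responsibilities per sample)."""
    log_resp, log_norms = [], []
    for x in xs:
        # log(pi_k) + log N(x | mu_k, var_k) for every component k.
        weighted = [math.log(w) + log_gaussian_1d(x, m, v)
                    for w, m, v in zip(weights, mus, variances)]
        # log-sum-exp for numerical stability.
        top = max(weighted)
        log_norm = top + math.log(sum(math.exp(v - top) for v in weighted))
        log_norms.append(log_norm)
        log_resp.append([v - log_norm for v in weighted])
    return sum(log_norms) / len(log_norms), log_resp


_, resp = log_resp_step_1d([0.0], [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
print([round(math.exp(r), 3) for r in resp[0]])
```

A sample halfway between two equally weighted, equal-variance components is attributed to each with probability 0.5, and the responsibilities of every sample sum to one.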
- class mmrotate.core.bbox.MaxConvexIoUAssigner(pos_iou_thr, neg_iou_thr, min_pos_iou=0.0, gt_max_assign_all=True, ignore_iof_thr=-1, ignore_wrt_candidates=True, gpu_assign_thr=-1)
Assign a corresponding gt bbox or background to each bbox. Each proposal will be assigned -1 or a non-negative integer indicating the ground truth index.
-1: negative sample, no assigned gt
non-negative integer: positive sample, index (0-based) of assigned gt
- Parameters
pos_iou_thr (float) – IoU threshold for positive bboxes.
neg_iou_thr (float or tuple) – IoU threshold for negative bboxes.
min_pos_iou (float) – Minimum iou for a bbox to be considered as a positive bbox. Positive samples can have smaller IoU than pos_iou_thr due to the 4th step (assign max IoU sample to each gt).
gt_max_assign_all (bool) – Whether to assign all bboxes with the same highest overlap with some gt to that gt.
ignore_iof_thr (float) – IoF threshold for ignoring bboxes (if gt_bboxes_ignore is specified). Negative values mean not ignoring any bboxes.
ignore_wrt_candidates (bool) – Whether to compute the iof between bboxes and gt_bboxes_ignore, or the contrary.
gpu_assign_thr (int) – The upper bound of the number of GT for GPU assign. When the number of gt is above this threshold, the assignment is done on the CPU. Negative values mean never assigning on the CPU.
- assign(points, gt_rbboxes, overlaps, gt_rbboxes_ignore=None, gt_labels=None)
Assign gt to bboxes.
The assignment is done in the following steps:
1. compute iou between all bboxes (bboxes of all pyramid levels) and gt
2. compute center distance between all bboxes and gt
3. on each pyramid level, for each gt, select the k bboxes whose centers are closest to the gt center, so k*l bboxes are selected in total as candidates for each gt
4. get the corresponding iou for these candidates, compute the mean and std, and set mean + std as the iou threshold
5. select the candidates whose iou is greater than or equal to the threshold as positive
6. limit the positive samples' centers to lie inside the gt
- Parameters
points (torch.Tensor) – Points to be assigned, shape (n, 18).
gt_rbboxes (torch.Tensor) – Groundtruth polygons, shape (k, 8).
overlaps (torch.Tensor) – Overlaps between k gt_bboxes and n bboxes, shape (k, n).
gt_rbboxes_ignore (Tensor, optional) – Ground truth polygons that are labelled as ignored, e.g., crowd boxes in COCO.
gt_labels (Tensor, optional) – Label of gt_bboxes, shape (k, ).
- Returns
The assign result.
- Return type
AssignResult
- assign_wrt_overlaps(overlaps, gt_labels=None)
Assign w.r.t. the overlaps of bboxes with gts.
- Parameters
overlaps (torch.Tensor) – Overlaps between k gt_bboxes and n bboxes, shape (k, n).
gt_labels (Tensor, optional) – Labels of k gt_bboxes, shape (k, ).
- Returns
The assign result.
- Return type
AssignResult
- convex_overlaps(gt_rbboxes, points)
Compute overlaps between polygons and points.
- Parameters
gt_rbboxes (torch.Tensor) – Groundtruth polygons, shape (k, 8).
points (torch.Tensor) – Points to be assigned, shape (n, 18).
- Returns
overlaps: Overlaps between k gt_bboxes and n bboxes, shape (k, n).
- Return type
torch.Tensor
- class mmrotate.core.bbox.MidpointOffsetCoder(target_means=(0.0, 0.0, 0.0, 0.0, 0.0, 0.0), target_stds=(1.0, 1.0, 1.0, 1.0, 1.0, 1.0), angle_range='oc')
Midpoint offset coder. This coder encodes bbox (x1, y1, x2, y2) into delta (dx, dy, dw, dh, da, db) and decodes delta (dx, dy, dw, dh, da, db) back to original bbox (x1, y1, x2, y2).
- Parameters
target_means (Sequence[float]) – Denormalizing means of target for delta coordinates
target_stds (Sequence[float]) – Denormalizing standard deviation of target for delta coordinates
angle_range (str, optional) – Angle representations. Defaults to ‘oc’.
- decode(bboxes, pred_bboxes, max_shape=None, wh_ratio_clip=0.016)
Apply transformation pred_bboxes to bboxes.
- Parameters
bboxes (torch.Tensor) – Basic boxes. Shape (B, N, 4) or (N, 4)
pred_bboxes (torch.Tensor) – Encoded offsets with respect to each roi. Has shape (B, N, 5) or (N, 5). Note N = num_anchors * W * H when rois is a grid of anchors.
max_shape (Sequence[int] or torch.Tensor or Sequence[Sequence[int]], optional) – Maximum bounds for boxes, specifies (H, W, C) or (H, W). If bboxes shape is (B, N, 6), then the max_shape should be a Sequence[Sequence[int]] and the length of max_shape should also be B.
wh_ratio_clip (float, optional) – The allowed ratio between width and height.
- Returns
Decoded boxes.
- Return type
torch.Tensor
- encode(bboxes, gt_bboxes)
Get box regression transformation deltas that can be used to transform the bboxes into the gt_bboxes.
- Parameters
bboxes (torch.Tensor) – Source boxes, e.g., object proposals.
gt_bboxes (torch.Tensor) – Target of the transformation, e.g., ground-truth boxes.
- Returns
Box transformation deltas
- Return type
torch.Tensor
- class mmrotate.core.bbox.RRandomSampler(num, pos_fraction, neg_pos_ub=-1, add_gt_as_proposals=True, **kwargs)
Random sampler.
- Parameters
num (int) – Number of samples.
pos_fraction (float) – Fraction of positive samples.
neg_pos_ub (int, optional) – Upper bound number of negative and positive samples. Defaults to -1.
add_gt_as_proposals (bool, optional) – Whether to add ground truth boxes as proposals. Defaults to True.
- random_choice(gallery, num)
Randomly select some elements from the gallery.
If gallery is a Tensor, the returned indices will be a Tensor; if gallery is an ndarray or list, the returned indices will be an ndarray.
- Parameters
gallery (Tensor | ndarray | list) – indices pool.
num (int) – expected sample num.
- Returns
sampled indices.
- Return type
Tensor or ndarray
- sample(assign_result, bboxes, gt_bboxes, gt_labels=None, **kwargs)
Sample positive and negative bboxes.
This is a simple implementation of bbox sampling given candidates, assigning results and ground truth bboxes.
- Parameters
assign_result (AssignResult) – Bbox assigning results.
bboxes (torch.Tensor) – Boxes to be sampled from.
gt_bboxes (torch.Tensor) – Ground truth bboxes.
gt_labels (Tensor, optional) – Class labels of ground truth bboxes.
- Returns
Sampling result.
- Return type
SamplingResult
Example
>>> from mmdet.core.bbox import RandomSampler
>>> from mmdet.core.bbox import AssignResult
>>> from mmdet.core.bbox.demodata import ensure_rng, random_boxes
>>> rng = ensure_rng(None)
>>> assign_result = AssignResult.random(rng=rng)
>>> bboxes = random_boxes(assign_result.num_preds, rng=rng)
>>> gt_bboxes = random_boxes(assign_result.num_gts, rng=rng)
>>> gt_labels = None
>>> self = RandomSampler(num=32, pos_fraction=0.5, neg_pos_ub=-1,
...                      add_gt_as_proposals=False)
>>> self = self.sample(assign_result, bboxes, gt_bboxes, gt_labels)
- class mmrotate.core.bbox.SASAssigner(topk)
Assign a corresponding gt bbox or background to each bbox. Each proposal will be assigned 0 or a positive integer indicating the ground truth index.
0: negative sample, no assigned gt
positive integer: positive sample, index (1-based) of assigned gt
- Parameters
topk (float) – Number of bboxes selected in each level.
- assign(bboxes, num_level_bboxes, gt_bboxes, gt_bboxes_ignore=None, gt_labels=None)
Assign gt to bboxes.
The assignment is done in the following steps:
1. compute iou between all bboxes (bboxes of all pyramid levels) and gt
2. compute center distance between all bboxes and gt
3. on each pyramid level, for each gt, select the k bboxes whose centers are closest to the gt center, so k*l bboxes are selected in total as candidates for each gt
4. get the corresponding iou for these candidates, compute the mean and std, and set mean + std as the iou threshold
5. select the candidates whose iou is greater than or equal to the threshold as positive
6. limit the positive samples' centers to lie inside the gt
- Parameters
bboxes (torch.Tensor) – Bounding boxes to be assigned, shape (n, 4).
num_level_bboxes (List) – Number of bboxes in each level.
gt_bboxes (torch.Tensor) – Groundtruth boxes, shape (k, 4).
gt_bboxes_ignore (Tensor, optional) – Ground truth bboxes that are labelled as ignored, e.g., crowd boxes in COCO.
gt_labels (Tensor, optional) – Label of gt_bboxes, shape (k, ).
- Returns
The assign result.
- Return type
AssignResult
- mmrotate.core.bbox.bbox_mapping_back(bboxes, img_shape, scale_factor, flip, flip_direction='horizontal')
Map bboxes from testing scale to original image scale.
- mmrotate.core.bbox.gaussian2bbox(gmm)
Convert Gaussian distribution to polygons by SVD.
- Parameters
gmm (dict[str, torch.Tensor]) – Dict of Gaussian distribution.
- Returns
Polygons.
- Return type
torch.Tensor
- mmrotate.core.bbox.gt2gaussian(target)
Convert polygons to Gaussian distributions.
- Parameters
target (torch.Tensor) – Polygons with shape (N, 8).
- Returns
Gaussian distributions.
- Return type
dict[str, torch.Tensor]
- mmrotate.core.bbox.hbb2obb(hbboxes, version='oc')
Convert horizontal bounding boxes to oriented bounding boxes.
- Parameters
hbboxes (torch.Tensor) – [x_lt,y_lt,x_rb,y_rb]
version (str) – angle representations.
- Returns
obbs: [x_ctr,y_ctr,w,h,angle]
- Return type
torch.Tensor
- mmrotate.core.bbox.norm_angle(angle, angle_range)
Limit the range of angles.
- Parameters
angle (ndarray) – shape (n, ).
angle_range (str) – angle representations.
- Returns
angle: shape (n, ).
- Return type
ndarray
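As a rough scalar illustration of limiting an angle to a range, the sketch below wraps a single angle in radians. It assumes that 'oc' leaves the angle unchanged, 'le135' wraps into [-pi/4, 3*pi/4) and 'le90' wraps into [-pi/2, pi/2); the name norm_angle_sketch is hypothetical and the function works on floats rather than arrays.

```python
import math


def norm_angle_sketch(angle, angle_range):
    """Wrap one angle (radians) into the interval assumed for angle_range."""
    if angle_range == 'oc':
        # Assumed: the 'oc' definition needs no wrapping here.
        return angle
    if angle_range == 'le135':
        return (angle + math.pi / 4) % math.pi - math.pi / 4
    if angle_range == 'le90':
        return (angle + math.pi / 2) % math.pi - math.pi / 2
    raise ValueError(f'unsupported angle_range: {angle_range}')


print(norm_angle_sketch(math.pi, 'le90'))
print(norm_angle_sketch(-math.pi / 2, 'le135'))
```

Since a rectangle is unchanged by a rotation of pi, wrapping modulo pi keeps the same box while making the angle unique within the chosen interval.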
- mmrotate.core.bbox.obb2hbb(rbboxes, version='oc')
Convert oriented bounding boxes to horizontal bounding boxes.
- Parameters
rbboxes (torch.Tensor) – [x_ctr,y_ctr,w,h,angle]
version (str) – angle representations.
- Returns
hbbs: [x_ctr,y_ctr,w,h,-pi/2]
- Return type
torch.Tensor
- mmrotate.core.bbox.obb2poly(rbboxes, version='oc')
Convert oriented bounding boxes to polygons.
- Parameters
rbboxes (torch.Tensor) – [x_ctr,y_ctr,w,h,angle]
version (str) – angle representations.
- Returns
polys: [x0,y0,x1,y1,x2,y2,x3,y3]
- Return type
torch.Tensor
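The geometry behind this conversion can be sketched for a single box in plain Python: the four corners are the center plus or minus the rotated half-width and half-height vectors. The vertex order and the function name obb_to_poly are assumptions made for this example and may differ from the library's convention for a given angle version.

```python
import math


def obb_to_poly(cx, cy, w, h, angle):
    """Return [x0, y0, ..., x3, y3] for one oriented box (angle in radians)."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    # Half-extent vector along the box width, rotated by angle.
    wx, wy = w / 2 * cos_a, w / 2 * sin_a
    # Half-extent vector along the box height, rotated by angle.
    hx, hy = -h / 2 * sin_a, h / 2 * cos_a
    return [
        cx - wx - hx, cy - wy - hy,
        cx + wx - hx, cy + wy - hy,
        cx + wx + hx, cy + wy + hy,
        cx - wx + hx, cy - wy + hy,
    ]


print(obb_to_poly(5.0, 5.0, 4.0, 2.0, 0.0))
```

With a zero angle the polygon reduces to the corners of the axis-aligned rectangle.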
- mmrotate.core.bbox.obb2poly_np(rbboxes, version='oc')
Convert oriented bounding boxes to polygons.
- Parameters
rbboxes (ndarray) – [x_ctr,y_ctr,w,h,angle]
version (str) – angle representations.
- Returns
polys: [x0,y0,x1,y1,x2,y2,x3,y3]
- Return type
ndarray
- mmrotate.core.bbox.obb2xyxy(rbboxes, version='oc')
Convert oriented bounding boxes to horizontal bounding boxes.
- Parameters
rbboxes (torch.Tensor) – [x_ctr,y_ctr,w,h,angle]
version (str) – angle representations.
- Returns
hbbs: [x_lt,y_lt,x_rb,y_rb]
- Return type
torch.Tensor
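The enclosing horizontal box of a rotated rectangle follows from projecting its half-extents onto the image axes. The sketch below shows this for one box in plain Python; obb_to_xyxy is a hypothetical name and the code is an illustration of the geometry, not the library's tensor implementation.

```python
import math


def obb_to_xyxy(cx, cy, w, h, angle):
    """Return (x_lt, y_lt, x_rb, y_rb) enclosing one oriented box."""
    cos_a, sin_a = abs(math.cos(angle)), abs(math.sin(angle))
    # Projections of the rotated half-width and half-height onto x and y.
    half_x = w / 2 * cos_a + h / 2 * sin_a
    half_y = w / 2 * sin_a + h / 2 * cos_a
    return (cx - half_x, cy - half_y, cx + half_x, cy + half_y)


print(obb_to_xyxy(5.0, 5.0, 4.0, 2.0, 0.0))
```

Rotating the same box by a quarter turn swaps the horizontal and vertical extents of the result.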
- mmrotate.core.bbox.poly2obb(polys, version='oc')
Convert polygons to oriented bounding boxes.
- Parameters
polys (torch.Tensor) – [x0,y0,x1,y1,x2,y2,x3,y3]
version (str) – angle representations.
- Returns
obbs: [x_ctr,y_ctr,w,h,angle]
- Return type
torch.Tensor
- mmrotate.core.bbox.poly2obb_np(polys, version='oc')
Convert polygons to oriented bounding boxes.
- Parameters
polys (ndarray) – [x0,y0,x1,y1,x2,y2,x3,y3]
version (str) – angle representations.
- Returns
obbs: [x_ctr,y_ctr,w,h,angle]
- Return type
ndarray
- mmrotate.core.bbox.rbbox2result(bboxes, labels, num_classes)
Convert detection results to a list of numpy arrays.
- Parameters
bboxes (torch.Tensor) – shape (n, 6)
labels (torch.Tensor) – shape (n, )
num_classes (int) – class number, including background class
- Returns
bbox results of each class
- Return type
list(ndarray)
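The per-class grouping this function performs can be pictured with a small list-based sketch. It uses plain Python lists instead of tensors and numpy arrays, and group_by_class is a hypothetical name; the box values are illustrative only.

```python
def group_by_class(bboxes, labels, num_classes):
    """Group boxes by label into one list per class (empty if unused)."""
    results = [[] for _ in range(num_classes)]
    for box, label in zip(bboxes, labels):
        results[label].append(box)
    return results


# Each box is assumed to be [cx, cy, w, h, angle, score].
boxes = [
    [10.0, 10.0, 4.0, 2.0, 0.1, 0.9],
    [30.0, 20.0, 6.0, 3.0, 0.5, 0.8],
    [12.0, 11.0, 4.0, 2.0, 0.2, 0.7],
]
per_class = group_by_class(boxes, [0, 2, 0], num_classes=3)
print([len(c) for c in per_class])
```

Classes without detections still get an entry, so the outer list always has num_classes elements.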
- mmrotate.core.bbox.rbbox2roi(bbox_list)
Convert a list of bboxes to roi format.
- Parameters
bbox_list (list[Tensor]) – a list of bboxes corresponding to a batch of images.
- Returns
shape (n, 6), [batch_ind, cx, cy, w, h, a]
- Return type
Tensor
- mmrotate.core.bbox.rbbox_overlaps(bboxes1, bboxes2, mode='iou', is_aligned=False)
Calculate overlap between two sets of bboxes.
- Parameters
bboxes1 (torch.Tensor) – shape (B, m, 5) in <cx, cy, w, h, a> format or empty.
bboxes2 (torch.Tensor) – shape (B, n, 5) in <cx, cy, w, h, a> format or empty.
mode (str) – “iou” (intersection over union), “iof” (intersection over foreground) or “giou” (generalized intersection over union). Default “iou”.
is_aligned (bool, optional) – If True, then m and n must be equal. Default False.
- Returns
shape (m, n) if is_aligned is False else shape (m,)
- Return type
Tensor
patch
- mmrotate.core.patch.get_multiscale_patch(sizes, steps, ratios)
Get multiscale patch sizes and steps.
- Parameters
sizes (list) – A list of patch sizes.
steps (list) – A list of steps to slide patches.
ratios (list) – Multiscale ratios. Each size and step is divided by every ratio to generate patches at new scales.
- Returns
new_sizes (list): A list of multiscale patch sizes.
new_steps (list): A list of steps corresponding to new_sizes.
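A minimal sketch of the size and step expansion described above, in plain Python. It assumes each (size, step) pair is divided by every ratio and truncated to an integer, with ratios iterated in the inner loop; the output ordering, the truncation and the name multiscale_patch_sketch are assumptions.

```python
def multiscale_patch_sketch(sizes, steps, ratios):
    """Divide every (size, step) pair by every ratio."""
    assert len(sizes) == len(steps), 'each size needs a matching step'
    new_sizes, new_steps = [], []
    for size, step in zip(sizes, steps):
        for ratio in ratios:
            # A ratio below 1 enlarges the patch, which matches detecting
            # on a downscaled image.
            new_sizes.append(int(size / ratio))
            new_steps.append(int(step / ratio))
    return new_sizes, new_steps


print(multiscale_patch_sketch([1024], [824], [1.0, 0.5]))
```

With one size, one step and two ratios the result holds two patch configurations, one per scale.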
- mmrotate.core.patch.merge_results(results, offsets, img_shape, iou_thr=0.1, device='cpu')
Merge patch results via NMS.
- Parameters
results (list[np.ndarray] | list[tuple]) – A list of patch results.
offsets (np.ndarray) – Positions of the left top points of patches.
img_shape (tuple) – A tuple of the huge image’s width and height.
iou_thr (float) – The IoU threshold of NMS.
device (str) – The device to call nms.
- Returns
Detection results after merging.
- Return type
list[np.ndarray]
- mmrotate.core.patch.slide_window(width, height, sizes, steps, img_rate_thr=0.6)
Slide windows in images and get window positions.
- Parameters
width (int) – The width of the image.
height (int) – The height of the image.
sizes (list) – List of window sizes.
steps (list) – List of window steps.
img_rate_thr (float) – Threshold of window area divided by image area.
- Returns
Information of valid windows.
- Return type
np.ndarray
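The way window start positions can be laid out along one image axis is sketched below in plain Python. It assumes the last window is shifted back so it ends at the image border, and it ignores the img_rate_thr filtering entirely; window_starts is a hypothetical helper, not part of the library.

```python
import math


def window_starts(length, size, step):
    """Start offsets of windows of width size along an axis of given length."""
    # A single window suffices when the image is not larger than the window.
    num = 1 if length <= size else math.ceil((length - size) / step + 1)
    starts = [step * i for i in range(num)]
    # Assumed behaviour: pull the last window back inside the image.
    if len(starts) > 1 and starts[-1] + size > length:
        starts[-1] = length - size
    return starts


print(window_starts(11, 4, 3))
```

Applying the same helper to the width and the height and taking all pairs of starts gives a grid of window positions for one (size, step) setting.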
evaluation
- mmrotate.core.evaluation.eval_rbbox_map(det_results, annotations, scale_ranges=None, iou_thr=0.5, use_07_metric=True, dataset=None, logger=None, nproc=4)
Evaluate mAP of a rotated dataset.
- Parameters
det_results (list[list]) – [[cls1_det, cls2_det, …], …]. The outer list indicates images, and the inner list indicates per-class detected bboxes.
annotations (list[dict]) –
Ground truth annotations where each item of the list indicates an image. Keys of annotations are:
bboxes: numpy array of shape (n, 5)
labels: numpy array of shape (n, )
bboxes_ignore (optional): numpy array of shape (k, 5)
labels_ignore (optional): numpy array of shape (k, )
scale_ranges (list[tuple] | None) – Range of scales to be evaluated, in the format [(min1, max1), (min2, max2), …]. A range of (32, 64) means the area range between (32**2, 64**2). Default: None.
iou_thr (float) – IoU threshold to be considered as matched. Default: 0.5.
use_07_metric (bool) – Whether to use the voc07 metric.
dataset (list[str] | str | None) – Dataset name or dataset classes, there are minor differences in metrics for different datasets, e.g. “voc07”, “imagenet_det”, etc. Default: None.
logger (logging.Logger | str | None) – The way to print the mAP summary. See mmcv.utils.print_log() for details. Default: None.
nproc (int) – Processes used for computing TP and FP. Default: 4.
- Returns
(mAP, [dict, dict, …])
- Return type
tuple
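For readers unfamiliar with the voc07 metric, the sketch below shows the PASCAL VOC 2007 style 11-point interpolated average precision in plain Python, assuming that is what use_07_metric selects. The name voc07_ap and the list inputs are assumptions; the example values are illustrative, not results.

```python
def voc07_ap(recalls, precisions):
    """11-point interpolated AP from matching recall/precision lists."""
    ap = 0.0
    for i in range(11):
        threshold = i / 10
        # Best precision among points whose recall reaches the threshold.
        candidates = [p for r, p in zip(recalls, precisions) if r >= threshold]
        ap += (max(candidates) if candidates else 0.0) / 11
    return ap


print(round(voc07_ap([0.5, 1.0], [1.0, 0.5]), 4))
```

The metric averages the interpolated precision at recall levels 0, 0.1, ..., 1.0, so recall levels that are never reached contribute zero.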
post_processing
- mmrotate.core.post_processing.aug_multiclass_nms_rotated(merged_bboxes, merged_labels, score_thr, nms, max_num, classes)
NMS for aug multi-class bboxes.
- Parameters
multi_bboxes (torch.Tensor) – shape (n, #class*5) or (n, 5)
multi_scores (torch.Tensor) – shape (n, #class), where the last column contains scores of the background class, but this will be ignored.
score_thr (float) – bbox threshold, bboxes with scores lower than it will not be considered.
nms (float) – Config of NMS.
max_num (int, optional) – if there are more than max_num bboxes after NMS, only top max_num will be kept. Default to -1.
classes (int) – number of classes.
- Returns
tensors of shape (k, 5), and (k). Dets are boxes with scores. Labels are 0-based.
- Return type
tuple (dets, labels)
- mmrotate.core.post_processing.multiclass_nms_rotated(multi_bboxes, multi_scores, score_thr, nms, max_num=-1, score_factors=None, return_inds=False)
NMS for multi-class bboxes.
- Parameters
multi_bboxes (torch.Tensor) – shape (n, #class*5) or (n, 5)
multi_scores (torch.Tensor) – shape (n, #class), where the last column contains scores of the background class, but this will be ignored.
score_thr (float) – Score threshold; bboxes with scores lower than it will not be considered.
nms (float) – Config of NMS.
max_num (int, optional) – If there are more than max_num bboxes after NMS, only the top max_num will be kept. Defaults to -1.
score_factors (Tensor, optional) – The factors multiplied to scores before applying NMS. Defaults to None.
return_inds (bool, optional) – Whether to return the indices of kept bboxes. Defaults to False.
- Returns
Tensors of shape (k, 5), (k), and (k). Dets are boxes with scores. Labels are 0-based.
- Return type
tuple (dets, labels, indices (optional))
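The bookkeeping around the NMS call (ignoring the background column, applying score_thr, keeping at most max_num entries) can be sketched in plain Python. This is only an illustration under stated assumptions: the rotated NMS step itself is omitted, the function name select_candidates is hypothetical, and in the real function the max_num cut is described as happening after NMS.

```python
def select_candidates(multi_scores, score_thr, max_num=-1):
    """Return (box_index, label, score) triples that pass the threshold.

    multi_scores is a list of rows, one per box, with one score per class;
    the last column is the background class and is ignored.
    """
    candidates = []
    for box_idx, row in enumerate(multi_scores):
        for label, score in enumerate(row[:-1]):  # drop background column
            if score > score_thr:
                candidates.append((box_idx, label, score))
    # Highest scores first, so that truncation keeps the best entries.
    candidates.sort(key=lambda c: c[2], reverse=True)
    if max_num > 0:
        candidates = candidates[:max_num]
    return candidates
```

Labels produced this way are 0-based, matching the return description above.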
visualization¶
- mmrotate.core.visualization.get_palette(palette, num_classes)[source]¶
Get palette from various inputs.
- Parameters
palette (list[tuple] | str | tuple | Color) – Palette inputs.
num_classes (int) – The number of classes.
- Returns
A list of color tuples.
- Return type
list[tuple[int]]
- mmrotate.core.visualization.imshow_det_rbboxes(img, bboxes=None, labels=None, segms=None, class_names=None, score_thr=0, bbox_color='green', text_color='green', mask_color=None, thickness=2, font_size=13, win_name='', show=True, wait_time=0, out_file=None)[source]¶
Draw bboxes and class labels (with scores) on an image.
- Parameters
img (str | ndarray) – The image to be displayed.
bboxes (ndarray) – Bounding boxes (with scores), shaped (n, 5) or (n, 6).
labels (ndarray) – Labels of bboxes.
segms (ndarray | None) – Masks, shaped (n,h,w) or None.
class_names (list[str]) – Names of each class.
score_thr (float) – Minimum score of bboxes to be shown. Default: 0.
bbox_color (list[tuple] | tuple | str | None) – Colors of bbox lines. If a single color is given, it will be applied to all classes. The tuple of color should be in RGB order. Default: ‘green’.
text_color (list[tuple] | tuple | str | None) – Colors of texts. If a single color is given, it will be applied to all classes. The tuple of color should be in RGB order. Default: ‘green’.
mask_color (list[tuple] | tuple | str | None, optional) – Colors of masks. If a single color is given, it will be applied to all classes. The tuple of color should be in RGB order. Default: None.
thickness (int) – Thickness of lines. Default: 2.
font_size (int) – Font size of texts. Default: 13.
show (bool) – Whether to show the image. Default: True.
win_name (str) – The window name. Default: ‘’.
wait_time (float) – Value of waitKey param. Default: 0.
out_file (str, optional) – The filename to write the image. Default: None.
- Returns
The image with bboxes drawn on it.
- Return type
ndarray
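Drawing a rotated box requires its four corner points. The sketch below converts one (cx, cy, w, h, theta) row to corners, assuming theta is in radians and the box is rotated about its centre; the helper name rbbox_to_corners is hypothetical and this is not necessarily how the library computes the polygon.

```python
import math


def rbbox_to_corners(cx, cy, w, h, theta):
    """Return the four corners of a rotated box as (x, y) tuples.

    Assumes theta is in radians and rotation is about the box centre.
    """
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Corner offsets of the unrotated box, relative to its centre.
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [
        (cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
        for dx, dy in offsets
    ]
```

With theta = 0 the result is the ordinary axis-aligned rectangle around (cx, cy).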
mmrotate.datasets¶
datasets¶
- class mmrotate.datasets.DOTADataset(ann_file, pipeline, version='oc', difficulty=100, **kwargs)[source]¶
DOTA dataset for detection.
- Parameters
ann_file (str) – Annotation file path.
pipeline (list[dict]) – Processing pipeline.
version (str, optional) – Angle representations. Defaults to ‘oc’.
difficulty (int, optional) – The difficulty threshold of GT. Defaults to 100.
- evaluate(results, metric='mAP', logger=None, proposal_nums=(100, 300, 1000), iou_thr=0.5, scale_ranges=None, nproc=4)[source]¶
Evaluate the dataset.
- Parameters
results (list) – Testing results of the dataset.
metric (str | list[str]) – Metrics to be evaluated.
logger (logging.Logger | None | str) – Logger used for printing related information during evaluation. Default: None.
proposal_nums (Sequence[int]) – Proposal number used for evaluating recalls, such as recall@100, recall@1000. Default: (100, 300, 1000).
iou_thr (float | list[float]) – IoU threshold. It must be a float when evaluating mAP, and can be a list when evaluating recall. Default: 0.5.
scale_ranges (list[tuple] | None) – Scale ranges for evaluating mAP. Default: None.
nproc (int) – Processes used for computing TP and FP. Default: 4.
- format_results(results, submission_dir=None, nproc=4, **kwargs)[source]¶
Format the results to submission text (standard format for DOTA evaluation).
- Parameters
results (list) – Testing results of the dataset.
submission_dir (str, optional) – The folder that contains submission files. If not specified, a temp folder will be created. Default: None.
nproc (int, optional) – Number of processes. Default: 4.
- Returns
result_files (dict): a dict containing the json filepaths
tmp_dir (str): the temporary directory created for saving json files when submission_dir is not specified.
- Return type
tuple
- class mmrotate.datasets.HRSCDataset(ann_file, pipeline, img_subdir='JPEGImages', ann_subdir='Annotations', classwise=False, version='oc', **kwargs)[source]¶
HRSC dataset for detection.
- Parameters
ann_file (str) – Annotation file path.
pipeline (list[dict]) – Processing pipeline.
img_subdir (str) – Subdir where images are stored. Default: JPEGImages.
ann_subdir (str) – Subdir where annotations are. Default: Annotations.
classwise (bool) – Whether to use all classes or only ship.
version (str, optional) – Angle representations. Defaults to ‘oc’.
- evaluate(results, metric='mAP', logger=None, proposal_nums=(100, 300, 1000), iou_thr=[0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95], scale_ranges=None, use_07_metric=True, nproc=4)[source]¶
Evaluate the dataset.
- Parameters
results (list) – Testing results of the dataset.
metric (str | list[str]) – Metrics to be evaluated.
logger (logging.Logger | None | str) – Logger used for printing related information during evaluation. Default: None.
proposal_nums (Sequence[int]) – Proposal number used for evaluating recalls, such as recall@100, recall@1000. Default: (100, 300, 1000).
iou_thr (float | list[float]) – IoU threshold. It must be a float when evaluating mAP, and can be a list when evaluating recall. Default: [0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95].
scale_ranges (list[tuple] | None) – Scale ranges for evaluating mAP. Default: None.
use_07_metric (bool) – Whether to use the voc07 metric.
nproc (int) – Processes used for computing TP and FP. Default: 4.
pipelines¶
- class mmrotate.datasets.pipelines.LoadPatchFromImage(to_float32=False, color_type='color', channel_order='bgr', file_client_args={'backend': 'disk'})[source]¶
Load a patch from the huge image.
Similar to LoadImageFromFile, but only keeps the patch of results['img'] specified by results['win'].
- class mmrotate.datasets.pipelines.PolyRandomRotate(rotate_ratio=0.5, mode='range', angles_range=180, auto_bound=False, rect_classes=None, version='le90')[source]¶
Rotate img & bbox. Reference: https://github.com/hukaixuan19970627/OrientedRepPoints_DOTA
- Parameters
rotate_ratio (float, optional) – The rotating probability. Default: 0.5.
mode (str, optional) – Indicates whether the angle is chosen in a random range (mode=’range’) or in a preset list of angles (mode=’value’). Defaults to ‘range’.
angles_range (int|list[int], optional) – The range of angles. If mode=’range’, angles_range is an int and the angle is chosen in (-angles_range, +angles_range). If mode=’value’, angles_range is a non-empty list of int and the angle is chosen from angles_range. Defaults to 180 as the default mode is ‘range’.
auto_bound (bool, optional) – Whether to find the new width and height bounds.
rect_classes (None|list, optional) – Specifies classes that need to be rotated by a multiple of 90 degrees.
version (str, optional) – Angle representations. Defaults to ‘le90’.
- apply_coords(coords)[source]¶
coords should be an N * 2 array-like containing N pairs of (x, y) points.
- apply_image(img, bound_h, bound_w, interp=1)[source]¶
img should be a numpy array, formatted as Height * Width * Nchannels.
- filter_border(bboxes, h, w)[source]¶
Filter out boxes whose center point is outside the image or whose side length is less than 5.
- property is_rotate¶
Randomly decide whether to rotate.
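How rotate_ratio, mode and angles_range interact can be sketched as follows. This is a minimal illustration built only from the parameter descriptions above; the function name sample_rotation and the rng argument are hypothetical, and the handling of rect_classes is left out.

```python
import random


def sample_rotation(rotate_ratio=0.5, mode='range', angles_range=180, rng=random):
    """Return a rotation angle in degrees, or None when no rotation is applied."""
    # Rotate with probability rotate_ratio.
    if rng.random() >= rotate_ratio:
        return None
    if mode == 'range':
        # angles_range is a single number: sample between -range and +range.
        return rng.uniform(-angles_range, angles_range)
    if mode == 'value':
        # angles_range is a non-empty list: pick one of the preset angles.
        return rng.choice(angles_range)
    raise ValueError(f'unsupported mode: {mode!r}')
```

Passing a seeded random.Random instance as rng makes the augmentation reproducible in tests.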
- class mmrotate.datasets.pipelines.RRandomFlip(flip_ratio=None, direction='horizontal', version='oc')[source]¶
- Parameters
flip_ratio (float | list[float], optional) – The flipping probability. Default: None.
direction (str | list[str], optional) – The flipping direction. Options are ‘horizontal’, ‘vertical’, ‘diagonal’.
version (str, optional) – Angle representations. Defaults to ‘oc’.
- class mmrotate.datasets.pipelines.RResize(img_scale=None, multiscale_mode='range', ratio_range=None)[source]¶
Resize images and rotated bboxes. Inherits the Resize pipeline class to handle rotated bboxes.
- Parameters
img_scale (tuple or list[tuple]) – Images scales for resizing.
multiscale_mode (str) – Either “range” or “value”.
ratio_range (tuple[float]) – (min_ratio, max_ratio).
mmrotate.models¶
detectors¶
- class mmrotate.models.detectors.GlidingVertex(backbone, rpn_head, roi_head, train_cfg, test_cfg, neck=None, pretrained=None, init_cfg=None)[source]¶
Implementation of Gliding Vertex on the Horizontal Bounding Box for Multi-Oriented Object Detection.
- class mmrotate.models.detectors.OrientedRCNN(backbone, rpn_head, roi_head, train_cfg, test_cfg, neck=None, pretrained=None, init_cfg=None)[source]¶
Implementation of Oriented R-CNN for Object Detection.
- class mmrotate.models.detectors.R3Det(num_refine_stages, backbone, neck=None, bbox_head=None, frm_cfgs=None, refine_heads=None, train_cfg=None, test_cfg=None, pretrained=None, init_cfg=None)[source]¶
Rotated Refinement RetinaNet.
- simple_test(img, img_meta, rescale=False)[source]¶
Test function without test time augmentation.
- Parameters
imgs (list[torch.Tensor]) – List of multiple images
img_metas (list[dict]) – List of image information.
rescale (bool, optional) – Whether to rescale the results. Defaults to False.
- Returns
BBox results of each image and classes. The outer list corresponds to each image. The inner list corresponds to each class.
- Return type
list[list[np.ndarray]]
- class mmrotate.models.detectors.ReDet(backbone, rpn_head, roi_head, train_cfg, test_cfg, neck=None, pretrained=None, init_cfg=None)[source]¶
Implementation of ReDet: A Rotation-equivariant Detector for Aerial Object Detection.
- class mmrotate.models.detectors.RoITransformer(backbone, rpn_head, roi_head, train_cfg, test_cfg, neck=None, pretrained=None, init_cfg=None)[source]¶
Implementation of Learning RoI Transformer for Oriented Object Detection in Aerial Images.
- class mmrotate.models.detectors.RotatedBaseDetector(init_cfg=None)[source]¶
Base class for rotated detectors.
- show_result(img, result, score_thr=0.3, bbox_color=(72, 101, 241), text_color=(72, 101, 241), mask_color=None, thickness=2, font_size=13, win_name='', show=False, wait_time=0, out_file=None, **kwargs)[source]¶
Draw result over img.
- Parameters
img (str or Tensor) – The image to be displayed.
result (Tensor or tuple) – The results to draw over img: bbox_result or (bbox_result, segm_result).
score_thr (float, optional) – Minimum score of bboxes to be shown. Default: 0.3.
bbox_color (str or tuple(int) or Color) – Color of bbox lines. The tuple of color should be in BGR order. Default: (72, 101, 241).
text_color (str or tuple(int) or Color) – Color of texts. The tuple of color should be in BGR order. Default: (72, 101, 241).
mask_color (None or str or tuple(int) or Color) – Color of masks. The tuple of color should be in BGR order. Default: None.
thickness (int) – Thickness of lines. Default: 2.
font_size (int) – Font size of texts. Default: 13
win_name (str) – The window name. Default: ‘’
wait_time (float) – Value of waitKey param. Default: 0.
show (bool) – Whether to show the image. Default: False.
out_file (str or None) – The filename to write the image. Default: None.
- Returns
The image with the results drawn on it. Returned only if neither show nor out_file is set.
- Return type
img (torch.Tensor)
- class mmrotate.models.detectors.RotatedFCOS(backbone, neck, bbox_head, train_cfg=None, test_cfg=None, pretrained=None, init_cfg=None)[source]¶
Implementation of Rotated FCOS.
- class mmrotate.models.detectors.RotatedFasterRCNN(backbone, rpn_head, roi_head, train_cfg, test_cfg, neck=None, pretrained=None, init_cfg=None)[source]¶
Implementation of Rotated Faster R-CNN.
- class mmrotate.models.detectors.RotatedRepPoints(backbone, neck, bbox_head, train_cfg=None, test_cfg=None, pretrained=None)[source]¶
Implementation of Rotated RepPoints.
- class mmrotate.models.detectors.RotatedRetinaNet(backbone, neck, bbox_head, train_cfg=None, test_cfg=None, pretrained=None, init_cfg=None)[source]¶
Implementation of Rotated RetinaNet.
- class mmrotate.models.detectors.RotatedSingleStageDetector(backbone, neck=None, bbox_head=None, train_cfg=None, test_cfg=None, pretrained=None, init_cfg=None)[source]¶
Base class for rotated single-stage detectors.
Single-stage detectors directly and densely predict bounding boxes on the output features of the backbone+neck.
- aug_test(imgs, img_metas, rescale=False)[source]¶
Test function with test time augmentation.
- Parameters
imgs (list[Tensor]) – the outer list indicates test-time augmentations and inner Tensor should have a shape NxCxHxW, which contains all images in the batch.
img_metas (list[list[dict]]) – the outer list indicates test-time augs (multiscale, flip, etc.) and the inner list indicates images in a batch. each dict has image information.
rescale (bool, optional) – Whether to rescale the results. Defaults to False.
- Returns
BBox results of each image and classes. The outer list corresponds to each image. The inner list corresponds to each class.
- Return type
list[list[np.ndarray]]
- forward_dummy(img)[source]¶
Used for computing network flops.
See mmdetection/tools/analysis_tools/get_flops.py
- forward_train(img, img_metas, gt_bboxes, gt_labels, gt_bboxes_ignore=None)[source]¶
- Parameters
img (Tensor) – Input images of shape (N, C, H, W). Typically these should be mean centered and std scaled.
img_metas (list[dict]) – A list of image info dicts where each dict has: ‘img_shape’, ‘scale_factor’, ‘flip’, and may also contain ‘filename’, ‘ori_shape’, ‘pad_shape’, and ‘img_norm_cfg’. For details on the values of these keys see mmdet.datasets.pipelines.Collect.
gt_bboxes (list[Tensor]) – Each item holds the ground-truth boxes for one image in [tl_x, tl_y, br_x, br_y] format.
gt_labels (list[Tensor]) – Class indices corresponding to each box
gt_bboxes_ignore (None | list[Tensor]) – Specify which bounding boxes can be ignored when computing the loss.
- Returns
A dictionary of loss components.
- Return type
dict[str, Tensor]
- simple_test(img, img_metas, rescale=False)[source]¶
Test function without test time augmentation.
- Parameters
imgs (list[torch.Tensor]) – List of multiple images
img_metas (list[dict]) – List of image information.
rescale (bool, optional) – Whether to rescale the results. Defaults to False.
- Returns
BBox results of each image and classes. The outer list corresponds to each image. The inner list corresponds to each class.
- Return type
list[list[np.ndarray]]
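The nested return value (outer list per image, inner list per class) can be walked as sketched below. The sketch uses plain Python lists in place of arrays and assumes each detection row is laid out as (cx, cy, w, h, a, score); the function name flatten_results is hypothetical.

```python
def flatten_results(results, score_thr=0.0):
    """Flatten per-image, per-class detections into one list of records.

    Each row is assumed to be (cx, cy, w, h, a, score).
    """
    rows = []
    for img_idx, per_class in enumerate(results):      # outer list: images
        for cls_idx, dets in enumerate(per_class):     # inner list: classes
            for det in dets:
                *box, score = det
                if score >= score_thr:
                    rows.append((img_idx, cls_idx, tuple(box), score))
    return rows
```

A class with no detections simply contributes an empty inner entry, so the class index stays aligned with the dataset's class list.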
- class mmrotate.models.detectors.RotatedTwoStageDetector(backbone, neck=None, rpn_head=None, roi_head=None, train_cfg=None, test_cfg=None, pretrained=None, init_cfg=None)[source]¶
Base class for rotated two-stage detectors.
Two-stage detectors typically consist of a region proposal network and a task-specific regression head.
- async async_simple_test(img, img_meta, proposals=None, rescale=False)[source]¶
Async test without augmentation.
- aug_test(imgs, img_metas, rescale=False)[source]¶
Test with augmentations.
If rescale is False, then returned bboxes and masks will fit the scale of imgs[0].
- forward_dummy(img)[source]¶
Used for computing network flops.
See mmdetection/tools/analysis_tools/get_flops.py
- forward_train(img, img_metas, gt_bboxes, gt_labels, gt_bboxes_ignore=None, gt_masks=None, proposals=None, **kwargs)[source]¶
- Parameters
img (Tensor) – of shape (N, C, H, W) encoding input images. Typically these should be mean centered and std scaled.
img_metas (list[dict]) – list of image info dict where each dict has: ‘img_shape’, ‘scale_factor’, ‘flip’, and may also contain ‘filename’, ‘ori_shape’, ‘pad_shape’, and ‘img_norm_cfg’. For details on the values of these keys see mmdet/datasets/pipelines/formatting.py:Collect.
gt_bboxes (list[Tensor]) – Ground truth bboxes for each image with shape (num_gts, 5) in [cx, cy, w, h, a] format.
gt_labels (list[Tensor]) – class indices corresponding to each box
gt_bboxes_ignore (None | list[Tensor]) – specify which bounding boxes can be ignored when computing the loss.
gt_masks (None | Tensor) – true segmentation masks for each box used if the architecture supports a segmentation task.
proposals – override rpn proposals with custom proposals. Use when with_rpn is False.
- Returns
A dictionary of loss components.
- Return type
dict[str, Tensor]
- property with_roi_head¶
whether the detector has a RoI head
- Type
bool
- property with_rpn¶
whether the detector has RPN
- Type
bool
- class mmrotate.models.detectors.S2ANet(backbone, neck=None, fam_head=None, align_cfgs=None, odm_head=None, train_cfg=None, test_cfg=None, pretrained=None)[source]¶
Implementation of Align Deep Features for Oriented Object Detection.
- forward_train(img, img_metas, gt_bboxes, gt_labels, gt_bboxes_ignore=None)[source]¶
Forward function of S2ANet.
- simple_test(img, img_meta, rescale=False)[source]¶
Test function without test time augmentation.
- Parameters
imgs (list[torch.Tensor]) – List of multiple images
img_metas (list[dict]) – List of image information.
rescale (bool, optional) – Whether to rescale the results. Defaults to False.
- Returns
BBox results of each image and classes. The outer list corresponds to each image. The inner list corresponds to each class.
- Return type
list[list[np.ndarray]]
backbones¶
- class mmrotate.models.backbones.ReResNet(depth, in_channels=3, stem_channels=64, base_channels=64, expansion=None, num_stages=4, strides=(1, 2, 2, 2), dilations=(1, 1, 1, 1), out_indices=(3,), style='pytorch', deep_stem=False, avg_down=False, frozen_stages=-1, conv_cfg=None, norm_cfg={'requires_grad': True, 'type': 'BN'}, norm_eval=False, with_cp=False, zero_init_residual=True, pretrained=None, init_cfg=None)[source]¶
ReResNet backbone.
Please refer to the paper for details.
- Parameters
depth (int) – Network depth, from {18, 34, 50, 101, 152}.
in_channels (int) – Number of input image channels. Default: 3.
stem_channels (int) – Output channels of the stem layer. Default: 64.
base_channels (int) – Middle channels of the first stage. Default: 64.
num_stages (int) – Stages of the network. Default: 4.
strides (Sequence[int]) – Strides of the first block of each stage. Default: (1, 2, 2, 2).
dilations (Sequence[int]) – Dilation of each stage. Default: (1, 1, 1, 1).
out_indices (Sequence[int]) – Output from which stages. If only one stage is specified, a single tensor (feature map) is returned; if multiple stages are specified, a tuple of tensors is returned. Default: (3, ).
style (str) – pytorch or caffe. If set to “pytorch”, the stride-two layer is the 3x3 conv layer, otherwise the stride-two layer is the first 1x1 conv layer.
deep_stem (bool) – Replace 7x7 conv in input stem with 3 3x3 conv. Default: False.
avg_down (bool) – Use AvgPool instead of stride conv when downsampling in the bottleneck. Default: False.
frozen_stages (int) – Stages to be frozen (stop grad and set eval mode). -1 means not freezing any parameters. Default: -1.
conv_cfg (dict | None) – The config dict for conv layers. Default: None.
norm_cfg (dict) – The config dict for norm layers.
norm_eval (bool) – Whether to set norm layers to eval mode, namely, freeze running stats (mean and var). Note: Effect on Batch Norm and its variants only. Default: False.
with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed. Default: False.
zero_init_residual (bool) – Whether to use zero init for last norm layer in resblocks to let them behave as identity. Default: True.
- property norm1¶
Get normalization layer’s name.
necks¶
- class mmrotate.models.necks.ReFPN(in_channels, out_channels, num_outs, start_level=0, end_level=-1, add_extra_convs=False, extra_convs_on_inputs=True, relu_before_extra_convs=False, no_norm_on_lateral=False, conv_cfg=None, norm_cfg=None, activation=None, init_cfg={'distribution': 'uniform', 'layer': 'Conv2d', 'type': 'Xavier'})[source]¶
ReFPN.
- Parameters
in_channels (List[int]) – Number of input channels per scale.
out_channels (int) – Number of output channels (used at each scale)
num_outs (int) – Number of output scales.
start_level (int, optional) – Index of the start input backbone level used to build the feature pyramid. Default: 0.
end_level (int, optional) – Index of the end input backbone level (exclusive) to build the feature pyramid. Default: -1, which means the last level.
add_extra_convs (bool, optional) – It decides whether to add conv layers on top of the original feature maps. Default to False.
extra_convs_on_inputs (bool, optional) – Whether the extra convs take the last feature map of the neck inputs as their source feature map.
relu_before_extra_convs (bool) – Whether to apply relu before the extra conv. Default: False.
no_norm_on_lateral (bool) – Whether to apply norm on lateral. Default: False.
conv_cfg (dict, optional) – Config dict for convolution layer. Default: None.
norm_cfg (dict, optional) – Config dict for normalization layer. Default: None.
activation (str, optional) – Activation layer in ConvModule. Default: None.
init_cfg (dict or list[dict], optional) – Initialization config dict.
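The start_level and end_level semantics described above (end_level is exclusive, and -1 means the last level) can be sketched as a small helper. The name used_backbone_levels is hypothetical and the validation rule is an assumption of this sketch, not the library's code.

```python
def used_backbone_levels(in_channels, start_level=0, end_level=-1):
    """Return the indices of backbone levels that feed the pyramid."""
    num_ins = len(in_channels)
    # end_level is exclusive; -1 stands for "up to and including the last level".
    stop = num_ins if end_level == -1 else end_level
    if not 0 <= start_level < stop <= num_ins:
        raise ValueError('invalid start_level / end_level combination')
    return list(range(start_level, stop))
```

For a four-level backbone, start_level=1 with the default end_level selects levels 1, 2 and 3.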
dense_heads¶
- class mmrotate.models.dense_heads.CSLRFCOSHead(separate_angle=True, scale_angle=False, angle_coder={'angle_version': 'le90', 'omega': 1, 'radius': 6, 'type': 'CSLCoder', 'window': 'gaussian'}, **kwargs)[source]¶
Use Circular Smooth Label (CSL) (https://link.springer.com/chapter/10.1007/978-3-030-58598-3_40) in FCOS.
- Parameters
separate_angle (bool) – If true, angle prediction is separated from bbox regression loss. CSL only supports True. Default: True.
scale_angle (bool) – If true, add scale to angle pred branch. CSL only supports False. Default: False.
angle_coder (dict) – Config of angle coder.
- loss(cls_scores, bbox_preds, angle_preds, centernesses, gt_bboxes, gt_labels, img_metas, gt_bboxes_ignore=None)[source]¶
Compute loss of the head.
- Parameters
cls_scores (list[Tensor]) – Box scores for each scale level, each is a 4D-tensor, the channel number is num_points * num_classes.
bbox_preds (list[Tensor]) – Box energies / deltas for each scale level, each is a 4D-tensor, the channel number is num_points * 4.
angle_preds (list[Tensor]) – Box angle for each scale level, each is a 4D-tensor, the channel number is num_points * 1.
centernesses (list[Tensor]) – centerness for each scale level, each is a 4D-tensor, the channel number is num_points * 1.
gt_bboxes (list[Tensor]) – Ground truth bboxes for each image with shape (num_gts, 4) in [tl_x, tl_y, br_x, br_y] format.
gt_labels (list[Tensor]) – class indices corresponding to each box
img_metas (list[dict]) – Meta information of each image, e.g., image size, scaling factor, etc.
gt_bboxes_ignore (None | list[Tensor]) – specify which bounding boxes can be ignored when computing the loss.
- Returns
A dictionary of loss components.
- Return type
dict[str, Tensor]
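The idea behind the angle_coder defaults (omega=1, radius=6, a Gaussian window) can be illustrated with a small sketch of circular smooth label encoding. Everything here is an assumption for illustration: the helper name csl_encode, the 180-degree range with an offset of 90 for the ‘le90’ version, and the exact window formula are not taken from the CSLCoder source.

```python
import math


def csl_encode(angle_deg, angle_range=180, omega=1, radius=6, angle_offset=90):
    """Encode an angle as a circular, Gaussian-smoothed label vector."""
    coding_len = int(angle_range // omega)
    # Bin index of the ground-truth angle, wrapped onto the circle.
    center = int(round((angle_deg + angle_offset) / omega)) % coding_len
    label = []
    for i in range(coding_len):
        d = abs(i - center)
        d = min(d, coding_len - d)  # circular distance between bins
        label.append(math.exp(-d * d / (2 * radius ** 2)))
    return label
```

Because the distance is circular, bins just below the upper end of the range still receive a high value when the target sits at the lower end, which is the point of the smoothing.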
- class mmrotate.models.dense_heads.CSLRRetinaHead(use_encoded_angle=True, shield_reg_angle=False, angle_coder={'angle_version': 'le90', 'omega': 1, 'radius': 6, 'type': 'CSLCoder', 'window': 'gaussian'}, loss_angle={'loss_weight': 1.0, 'type': 'CrossEntropyLoss', 'use_sigmoid': True}, init_cfg={'layer': 'Conv2d', 'override': [{'type': 'Normal', 'name': 'retina_cls', 'std': 0.01, 'bias_prob': 0.01}, {'type': 'Normal', 'name': 'retina_angle_cls', 'std': 0.01, 'bias_prob': 0.01}], 'std': 0.01, 'type': 'Normal'}, **kwargs)[source]¶
Rotational Anchor-based refine head.
- Parameters
use_encoded_angle (bool) – Decide whether to use encoded angle or gt angle as target. Default: True.
shield_reg_angle (bool) – Decide whether to shield the angle loss from reg branch. Default: False.
angle_coder (dict) – Config of angle coder.
loss_angle (dict) – Config of angle classification loss.
init_cfg (dict or list[dict], optional) – Initialization config dict.
- forward_single(x)[source]¶
Forward feature of a single scale level.
- Parameters
x (torch.Tensor) – Features of a single scale level.
- Returns
cls_score (torch.Tensor): Cls scores for a single scale level the channels number is num_anchors * num_classes.
bbox_pred (torch.Tensor): Box energies / deltas for a single scale level, the channels number is num_anchors * 5.
angle_cls (torch.Tensor): Angle for a single scale level the channels number is num_anchors * coding_len.
- Return type
tuple (torch.Tensor)
- get_bboxes(cls_scores, bbox_preds, angle_clses, img_metas, cfg=None, rescale=False, with_nms=True)[source]¶
Transform network output for a batch into bbox predictions.
- Parameters
cls_scores (list[Tensor]) – Box scores for each scale level Has shape (N, num_anchors * num_classes, H, W)
bbox_preds (list[Tensor]) – Box energies / deltas for each scale level with shape (N, num_anchors * 5, H, W)
angle_clses (list[Tensor]) – Box angles for each scale level with shape (N, num_anchors * coding_len, H, W)
img_metas (list[dict]) – Meta information of each image, e.g., image size, scaling factor, etc.
cfg (mmcv.Config | None) – Test / postprocessing configuration, if None, test_cfg would be used
rescale (bool) – If True, return boxes in original image space. Default: False.
with_nms (bool) – If True, do nms before return boxes. Default: True.
- Returns
Each item in result_list is a 2-tuple. The first item is an (n, 6) tensor, where the first 5 columns are bounding box positions (cx, cy, w, h, a) and the 6-th column is a score between 0 and 1. The second item is a (n,) tensor where each item is the predicted class label of the corresponding box.
- Return type
list[tuple[Tensor, Tensor]]
Example
>>> import mmcv
>>> self = AnchorHead(
>>>     num_classes=9,
>>>     in_channels=1,
>>>     anchor_generator=dict(
>>>         type='AnchorGenerator',
>>>         scales=[8],
>>>         ratios=[0.5, 1.0, 2.0],
>>>         strides=[4,]))
>>> img_metas = [{'img_shape': (32, 32, 3), 'scale_factor': 1}]
>>> cfg = mmcv.Config(dict(
>>>     score_thr=0.00,
>>>     nms=dict(type='nms', iou_thr=1.0),
>>>     max_per_img=10))
>>> feat = torch.rand(1, 1, 3, 3)
>>> cls_score, bbox_pred = self.forward_single(feat)
>>> # Note the input lists are over different levels, not images
>>> cls_scores, bbox_preds = [cls_score], [bbox_pred]
>>> result_list = self.get_bboxes(cls_scores, bbox_preds,
>>>                               img_metas, cfg)
>>> det_bboxes, det_labels = result_list[0]
>>> assert len(result_list) == 1
>>> assert det_bboxes.shape[1] == 5
>>> assert len(det_bboxes) == len(det_labels) == cfg.max_per_img
- loss(cls_scores, bbox_preds, angle_clses, gt_bboxes, gt_labels, img_metas, gt_bboxes_ignore=None)[source]¶
Compute losses of the head.
- Parameters
cls_scores (list[Tensor]) – Box scores for each scale level Has shape (N, num_anchors * num_classes, H, W)
bbox_preds (list[Tensor]) – Box energies / deltas for each scale level with shape (N, num_anchors * 5, H, W)
angle_clses (list[Tensor]) – Box angles for each scale level with shape (N, num_anchors * coding_len, H, W)
gt_bboxes (list[Tensor]) – Ground truth bboxes for each image with shape (num_gts, 5) in [cx, cy, w, h, a] format.
gt_labels (list[Tensor]) – class indices corresponding to each box
img_metas (list[dict]) – Meta information of each image, e.g., image size, scaling factor, etc.
gt_bboxes_ignore (None | list[Tensor]) – specify which bounding boxes can be ignored when computing the loss. Default: None
- Returns
A dictionary of loss components.
- Return type
dict[str, Tensor]
- loss_single(cls_score, bbox_pred, angle_cls, anchors, labels, label_weights, bbox_targets, bbox_weights, angle_targets, angle_weights, num_total_samples)[source]¶
Compute loss of a single scale level.
- Parameters
cls_score (torch.Tensor) – Box scores for each scale level Has shape (N, num_anchors * num_classes, H, W).
bbox_pred (torch.Tensor) – Box energies / deltas for each scale level with shape (N, num_anchors * 5, H, W).
anchors (torch.Tensor) – Box reference for each scale level with shape (N, num_total_anchors, 5).
labels (torch.Tensor) – Labels of each anchors with shape (N, num_total_anchors).
label_weights (torch.Tensor) – Label weights of each anchor with shape (N, num_total_anchors)
bbox_targets (torch.Tensor) – BBox regression targets of each anchor with shape (N, num_total_anchors, 5).
bbox_weights (torch.Tensor) – BBox regression loss weights of each anchor with shape (N, num_total_anchors, 5).
angle_targets (torch.Tensor) – Angle classification targets of each anchor with shape (N, num_total_anchors, coding_len).
angle_weights (torch.Tensor) – Angle classification loss weights of each anchor with shape (N, num_total_anchors, 1).
num_total_samples (int) – If sampling, num total samples equal to the number of total anchors; Otherwise, it is the number of positive anchors.
- Returns
loss_cls (torch.Tensor): cls. loss for each scale level.
loss_bbox (torch.Tensor): reg. loss for each scale level.
loss_angle (torch.Tensor): angle cls. loss for each scale level.
- Return type
tuple (torch.Tensor)
- class mmrotate.models.dense_heads.KFIoUODMRefineHead(num_classes, in_channels, stacked_convs=2, conv_cfg=None, norm_cfg=None, anchor_generator={'strides': [8, 16, 32, 64, 128], 'type': 'PseudoAnchorGenerator'}, init_cfg={'layer': 'Conv2d', 'override': {'bias_prob': 0.01, 'name': 'odm_cls', 'std': 0.01, 'type': 'Normal'}, 'std': 0.01, 'type': 'Normal'}, **kwargs)[source]¶
Rotated Anchor-based refine head for KFIoU. It’s a part of the Oriented Detection Module (ODM), which produces orientation-sensitive features for classification and orientation-invariant features for localization. The difference from ODMRefineHead is that its loss_bbox requires bbox_pred, bbox_targets, pred_decode and targets_decode as inputs.
- Parameters
num_classes (int) – Number of categories excluding the background category.
in_channels (int) – Number of channels in the input feature map.
feat_channels (int) – Number of hidden channels. Used in child classes.
anchor_generator (dict) – Config dict for anchor generator
bbox_coder (dict) – Config of bounding box coder.
reg_decoded_bbox (bool) – If true, the regression loss would be applied on decoded bounding boxes. Default: False
background_label (int | None) – Label ID of background, set as 0 for RPN and num_classes for other heads. It will automatically set as num_classes if None is given.
loss_cls (dict) – Config of classification loss.
loss_bbox (dict) – Config of localization loss.
train_cfg (dict) – Training config of anchor head.
test_cfg (dict) – Testing config of anchor head.
init_cfg (dict or list[dict], optional) – Initialization config dict.
- forward_single(x)[source]¶
Forward feature of a single scale level.
- Parameters
x (torch.Tensor) – Features of a single scale level.
- Returns
cls_score (torch.Tensor): Cls scores for a single scale level the channels number is num_anchors * num_classes.
bbox_pred (torch.Tensor): Box energies / deltas for a single scale level, the channels number is num_anchors * 4.
- Return type
tuple (torch.Tensor)
- get_anchors(featmap_sizes, img_metas, device='cuda')[source]¶
Get anchors according to feature map sizes.
- Parameters
featmap_sizes (list[tuple]) – Multi-level feature map sizes.
img_metas (list[dict]) – Image meta info.
bboxes_as_anchors (list[list[Tensor]]) – before further regression just like anchors.
device (torch.device | str) – Device for returned tensors
- Returns
anchor_list (list[Tensor]): Anchors of each image
valid_flag_list (list[Tensor]): Valid flags of each image
- Return type
tuple
- get_bboxes(cls_scores, bbox_preds, img_metas, cfg=None, rescale=False, rois=None)[source]¶
Transform network output for a batch into labeled boxes.
- Parameters
cls_scores (list[Tensor]) – Box scores for each scale level Has shape (N, num_anchors * num_classes, H, W)
bbox_preds (list[Tensor]) – Box energies / deltas for each scale level with shape (N, num_anchors * 5, H, W)
img_metas (list[dict]) – size / scale info for each image
cfg (mmcv.Config) – test / postprocessing configuration
rescale (bool) – if True, return boxes in original image space
rois (list[list[Tensor]]) – input rbboxes of each level of each image. rois output by former stages and are to be refined.
- Returns
Each item in result_list is a 2-tuple. The first item is an (n, 6) tensor, where the first 5 columns are bounding box positions (xc, yc, w, h, a) and the 6-th column is a score between 0 and 1. The second item is a (n,) tensor where each item is the class index of the corresponding box.
- Return type
list[tuple[Tensor, Tensor]]
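The (n, 6) layout described above can be unpacked as in the following minimal sketch. It uses plain Python lists instead of tensors so that it runs without mmrotate installed; the helper name `split_detections` and the sample values are hypothetical and only illustrate the column order.

```python
from typing import List, Sequence, Tuple


def split_detections(
    det_bboxes: Sequence[Sequence[float]],
) -> Tuple[List[List[float]], List[float]]:
    """Split (n, 6) rows into (n, 5) rotated boxes and (n,) scores."""
    boxes, scores = [], []
    for row in det_bboxes:
        if len(row) != 6:
            raise ValueError("expected 6 values per row: xc, yc, w, h, a, score")
        boxes.append(list(row[:5]))  # (xc, yc, w, h, a)
        scores.append(row[5])        # confidence in [0, 1]
    return boxes, scores


# Hypothetical values, for illustration only.
det_bboxes = [
    [50.0, 40.0, 20.0, 10.0, 0.3, 0.92],
    [10.0, 12.0, 6.0, 4.0, -0.5, 0.41],
]
det_labels = [3, 0]  # one class index per box

boxes, scores = split_detections(det_bboxes)
for box, score, label in zip(boxes, scores, det_labels):
    print(f"class={label} score={score:.2f} box={box}")
```

With real outputs the same slicing would apply to each `(det_bboxes, det_labels)` pair in `result_list`.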
- class mmrotate.models.dense_heads.KFIoURRetinaHead(num_classes, in_channels, stacked_convs=4, conv_cfg=None, norm_cfg=None, anchor_generator={'octave_base_scale': 4, 'ratios': [0.5, 1.0, 2.0], 'scales_per_octave': 3, 'strides': [8, 16, 32, 64, 128], 'type': 'AnchorGenerator'}, init_cfg={'layer': 'Conv2d', 'override': {'bias_prob': 0.01, 'name': 'retina_cls', 'std': 0.01, 'type': 'Normal'}, 'std': 0.01, 'type': 'Normal'}, **kwargs)[source]
Rotated Anchor-based head for KFIoU. The difference from RRetinaHead is that its loss_bbox requires bbox_pred, bbox_targets, pred_decode and targets_decode as inputs.
- Parameters
num_classes (int) – Number of categories excluding the background category.
in_channels (int) – Number of channels in the input feature map.
stacked_convs (int, optional) – Number of stacked convolutions.
conv_cfg (dict, optional) – Config dict for convolution layer. Default: None.
norm_cfg (dict, optional) – Config dict for normalization layer. Default: None.
anchor_generator (dict) – Config dict for anchor generator.
init_cfg (dict or list[dict], optional) – Initialization config dict.
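As a rough illustration of what the default `anchor_generator` config in the signature implies, the sketch below derives the per-octave scales and the number of base anchors per location. The formula `octave_base_scale * 2 ** (i / scales_per_octave)` is an assumption based on the usual mmdet `AnchorGenerator` convention, and `octave_scales` is a hypothetical helper, not a library function.

```python
def octave_scales(octave_base_scale: float, scales_per_octave: int) -> list:
    """Scales within one octave (assumed convention, see lead-in)."""
    return [
        octave_base_scale * 2 ** (i / scales_per_octave)
        for i in range(scales_per_octave)
    ]


# Default values copied from the class signature above.
anchor_generator = dict(
    type='AnchorGenerator',
    octave_base_scale=4,
    scales_per_octave=3,
    ratios=[0.5, 1.0, 2.0],
    strides=[8, 16, 32, 64, 128],
)

scales = octave_scales(
    anchor_generator['octave_base_scale'],
    anchor_generator['scales_per_octave'],
)
# One base anchor per (ratio, scale) pair at every feature-map location.
num_base_anchors = len(anchor_generator['ratios']) * len(scales)
print(scales, num_base_anchors)
```

Under these assumptions each location carries 3 ratios x 3 scales, which is the `num_anchors` factor that appears in the channel counts of `cls_score` and `bbox_pred`.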
- loss_single(cls_score, bbox_pred, anchors, labels, label_weights, bbox_targets, bbox_weights, num_total_samples)[source]
Compute loss of a single scale level.
- Parameters
cls_score (torch.Tensor) – Box scores for each scale level, with shape (N, num_anchors * num_classes, H, W).
bbox_pred (torch.Tensor) – Box energies / deltas for each scale level, with shape (N, num_anchors * 5, H, W).
anchors (torch.Tensor) – Box reference for each scale level, with shape (N, num_total_anchors, 5).
labels (torch.Tensor) – Labels of each anchor, with shape (N, num_total_anchors).
label_weights (torch.Tensor) – Label weights of each anchor, with shape (N, num_total_anchors).
bbox_targets (torch.Tensor) – BBox regression targets of each anchor, with shape (N, num_total_anchors, 5).
bbox_weights (torch.Tensor) – BBox regression loss weights of each anchor, with shape (N, num_total_anchors, 5).
num_total_samples (int) – If sampling, num_total_samples equals the number of total anchors; otherwise, it is the number of positive anchors.
- Returns
loss_cls (torch.Tensor): Classification loss for each scale level.
loss_bbox (torch.Tensor): Regression loss for each scale level.
- Return type
tuple (torch.Tensor)
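The rule for `num_total_samples` can be sketched as below. This is a minimal illustration, not the head's code: it reads "total anchors" as the sampled positives plus negatives, which is an assumption, and the function name and the sample counts are hypothetical.

```python
def resolve_num_total_samples(num_total_pos: int, num_total_neg: int,
                              sampling: bool) -> int:
    """Pick the normalizer used to average the per-level losses."""
    if sampling:
        # With a sampler, both positives and negatives contribute.
        return num_total_pos + num_total_neg
    # Without sampling (e.g. focal-loss style heads), only positives count.
    return num_total_pos


# Hypothetical counts, for illustration only.
with_sampling = resolve_num_total_samples(12, 244, sampling=True)
without_sampling = resolve_num_total_samples(12, 244, sampling=False)
print(with_sampling, without_sampling)
```

The resulting value would typically be passed as the averaging factor of the classification and regression losses.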
- class mmrotate.models.dense_heads.KFIoURRetinaRefineHead(num_classes, in_channels, stacked_convs=4, conv_cfg=None, norm_cfg=None, anchor_generator={'strides': [8, 16, 32, 64, 128], 'type': 'PseudoAnchorGenerator'}, bbox_coder={'target_means': (0.0, 0.0, 0.0, 0.0, 0.0), 'target_stds': (1.0, 1.0, 1.0, 1.0, 1.0), 'type': 'DeltaXYWHABBoxCoder'}, init_cfg={'layer': 'Conv2d', 'override': {'bias_prob': 0.01, 'name': 'retina_cls', 'std': 0.01, 'type': 'Normal'}, 'std': 0.01, 'type': 'Normal'}, **kwargs)[source]
Rotated Anchor-based refine head. The difference from RRetinaRefineHead is that its loss_bbox requires bbox_pred, bbox_targets, pred_decode and targets_decode as inputs.
- Parameters
num_classes (int) – Number of categories excluding the background category.
in_channels (int) – Number of channels in the input feature map.
stacked_convs (int, optional) – Number of stacked convolutions.
conv_cfg (dict, optional) – Config dict for convolution layer. Default: None.
norm_cfg (dict, optional) – Config dict for normalization layer. Default: None.
anchor_generator (dict) – Config dict for anchor generator.
bbox_coder (dict) – Config of bounding box coder.
init_cfg (dict or list[dict], optional) – Initialization config dict.
- get_anchors(featmap_sizes, img_metas, device='cuda')[source]
Get anchors according to feature map sizes.
- Parameters
featmap_sizes (list[tuple]) – Multi-level feature map sizes.
img_metas (list[dict]) – Image meta info.
bboxes_as_anchors (list[list[Tensor]]) – Boxes of each level of each image, treated like anchors before further regression.
device (torch.device | str) – Device for returned tensors.
- Returns
anchor_list (list[Tensor]): Anchors of each image.
valid_flag_list (list[Tensor]): Valid flags of each image.
- Return type
tuple (list[Tensor])
- get_bboxes(cls_scores, bbox_preds, img_metas, cfg=None, rescale=False, rois=None)[source]
Transform network output for a batch into labeled boxes.
- Parameters
cls_scores (list[Tensor]) – Box scores for each scale level, with shape (N, num_anchors * num_classes, H, W).
bbox_preds (list[Tensor]) – Box energies / deltas for each scale level, with shape (N, num_anchors * 5, H, W).
img_metas (list[dict]) – Size / scale info for each image.
cfg (mmcv.Config) – Test / postprocessing configuration.
rois (list[list[Tensor]]) – Input rbboxes of each level of each image. These are output by former stages and are to be refined.
rescale (bool) – If True, return boxes in original image space.
- Returns
- Each item in result_list is a 2-tuple.
The first item is an (n, 6) tensor, where the first 5 columns are bounding box positions (xc, yc, w, h, a) and the 6th column is a score between 0 and 1. The second item is an (n,) tensor where each item is the class index of the corresponding box.
- Return type
list[tuple[Tensor, Tensor]]
- loss(cls_scores, bbox_preds, gt_bboxes, gt_labels, img_metas, rois=None, gt_bboxes_ignore=None)[source]
Loss function of KFIoURRetinaRefineHead.
- refine_bboxes(cls_scores, bbox_preds, rois)[source]
Refine predicted bounding boxes at each position of the feature maps. This method is used in the refinement stages of R3Det.
- Parameters
cls_scores (list[Tensor]) – Box scores for each scale level, with shape (N, num_classes, H, W).
bbox_preds (list[Tensor]) – Box energies / deltas for each scale level, with shape (N, 5, H, W).
rois (list[list[Tensor]]) – Input rbboxes of each level of each image. These are output by former stages and are to be refined.
- Returns
Best or refined rbboxes of each level of each image.
- Return type
list[list[Tensor]]
- class mmrotate.models.dense_heads.ODMRefineHead(num_classes, in_channels, stacked_convs=2, conv_cfg=None, norm_cfg=None, anchor_generator={'strides': [8, 16, 32, 64, 128], 'type': 'PseudoAnchorGenerator'}, init_cfg={'layer': 'Conv2d', 'override': {'bias_prob': 0.01, 'name': 'odm_cls', 'std': 0.01, 'type': 'Normal'}, 'std': 0.01, 'type': 'Normal'}, **kwargs)[source]
Rotated Anchor-based refine head. It is part of the Oriented Detection Module (ODM), which produces orientation-sensitive features for classification and orientation-invariant features for localization.
- Parameters
num_classes (int) – Number of categories excluding the background category.
in_channels (int) – Number of channels in the input feature map.
stacked_convs (int, optional) – Number of stacked convolutions.
conv_cfg (dict, optional) – Config dict for convolution layer. Default: None.
norm_cfg (dict, optional) – Config dict for normalization layer. Default: None.
anchor_generator (dict) – Config dict for anchor generator.
init_cfg (dict or list[dict], optional) – Initialization config dict.
- forward_single(x)[source]
Forward feature of a single scale level.
- Parameters
x (torch.Tensor) – Features of a single scale level.
- Returns
cls_score (torch.Tensor): Classification scores for a single scale level; the number of channels is num_anchors * num_classes.
bbox_pred (torch.Tensor): Box energies / deltas for a single scale level; the number of channels is num_anchors * 5.
- Return type
tuple (torch.Tensor)
- get_anchors(featmap_sizes, img_metas, device='cuda')[source]
Get anchors according to feature map sizes.
- Parameters
featmap_sizes (list[tuple]) – Multi-level feature map sizes.
img_metas (list[dict]) – Image meta info.
bboxes_as_anchors (list[list[Tensor]]) – Boxes of each level of each image, treated like anchors before further regression.
device (torch.device | str) – Device for returned tensors.
- Returns
anchor_list (list[Tensor]): Anchors of each image.
valid_flag_list (list[Tensor]): Valid flags of each image.
- Return type
tuple (list[Tensor])
- get_bboxes(cls_scores, bbox_preds, img_metas, cfg=None, rescale=False, rois=None)[source]
Transform network output for a batch into labeled boxes.
- Parameters
cls_scores (list[Tensor]) – Box scores for each scale level, with shape (N, num_anchors * num_classes, H, W).
bbox_preds (list[Tensor]) – Box energies / deltas for each scale level, with shape (N, num_anchors * 5, H, W).
img_metas (list[dict]) – Size / scale info for each image.
cfg (mmcv.Config) – Test / postprocessing configuration.
rois (list[list[Tensor]]) – Input rbboxes of each level of each image. These are output by former stages and are to be refined.
rescale (bool) – If True, return boxes in original image space.
- Returns
- Each item in result_list is a 2-tuple.
The first item is an (n, 6) tensor, where the first 5 columns are bounding box positions (xc, yc, w, h, a) and the 6th column is a score between 0 and 1. The second item is an (n,) tensor where each item is the class index of the corresponding box.
- Return type
list[tuple[Tensor, Tensor]]
- class mmrotate.models.dense_heads.OrientedRPNHead(in_channels, init_cfg={'layer': 'Conv2d', 'std': 0.01, 'type': 'Normal'}, version='oc', **kwargs)[source]
Oriented RPN head for Oriented R-CNN.
- loss_single(cls_score, bbox_pred, anchors, labels, label_weights, bbox_targets, bbox_weights, num_total_samples)[source]
Compute loss of a single scale level.
- Parameters
cls_score (torch.Tensor) – Box scores for each scale level, with shape (N, num_anchors * num_classes, H, W).
bbox_pred (torch.Tensor) – Box energies / deltas for each scale level, with shape (N, num_anchors * 5, H, W).
anchors (torch.Tensor) – Box reference for each scale level, with shape (N, num_total_anchors, 4).
labels (torch.Tensor) – Labels of each anchor, with shape (N, num_total_anchors).
label_weights (torch.Tensor) – Label weights of each anchor, with shape (N, num_total_anchors).
bbox_targets (torch.Tensor) – BBox regression targets of each anchor.
bbox_weights (torch.Tensor) – BBox regression loss weights of each anchor, with shape (N, num_total_anchors, 4).
num_total_samples (int) – If sampling, num_total_samples equals the number of total anchors; otherwise, it is the number of positive anchors.
- Returns
loss_cls (torch.Tensor): Classification loss for each scale level.
loss_bbox (torch.Tensor): Regression loss for each scale level.
- Return type
tuple (torch.Tensor)
- class mmrotate.models.dense_heads.RotatedATSSHead(num_classes, in_channels, stacked_convs=4, conv_cfg=None, norm_cfg=None, anchor_generator={'octave_base_scale': 4, 'ratios': [0.5, 1.0, 2.0], 'scales_per_octave': 3, 'strides': [8, 16, 32, 64, 128], 'type': 'AnchorGenerator'}, init_cfg={'layer': 'Conv2d', 'override': {'bias_prob': 0.01, 'name': 'retina_cls', 'std': 0.01, 'type': 'Normal'}, 'std': 0.01, 'type': 'Normal'}, **kwargs)[source]
An anchor-based head used in ATSS.
The head contains two subnetworks. The first classifies anchor boxes and the second regresses deltas for the anchors.
- get_targets(anchor_list, valid_flag_list, gt_bboxes_list, img_metas, gt_bboxes_ignore_list=None, gt_labels_list=None, label_channels=1, unmap_outputs=True, return_sampling_results=False)[source]
Compute regression and classification targets for anchors in multiple images.
- Parameters
anchor_list (list[list[Tensor]]) – Multi level anchors of each image. The outer list indicates images, and the inner list corresponds to feature levels of the image. Each element of the inner list is a tensor of shape (num_anchors, 5).
valid_flag_list (list[list[Tensor]]) – Multi level valid flags of each image. The outer list indicates images, and the inner list corresponds to feature levels of the image. Each element of the inner list is a tensor of shape (num_anchors, )
gt_bboxes_list (list[Tensor]) – Ground truth bboxes of each image.
img_metas (list[dict]) – Meta info of each image.
gt_bboxes_ignore_list (list[Tensor]) – Ground truth bboxes to be ignored.
gt_labels_list (list[Tensor]) – Ground truth labels of each box.
label_channels (int) – Channel of label.
unmap_outputs (bool) – Whether to map outputs back to the original set of anchors.
- Returns
Usually returns a tuple containing learning targets.
labels_list (list[Tensor]): Labels of each level.
label_weights_list (list[Tensor]): Label weights of each level.
bbox_targets_list (list[Tensor]): BBox targets of each level.
bbox_weights_list (list[Tensor]): BBox weights of each level.
num_total_pos (int): Number of positive samples in all images.
num_total_neg (int): Number of negative samples in all images.
- additional_returns: This function enables user-defined returns from self._get_targets_single. These returns are currently restricted to properties at each feature map (i.e. having HxW dimension). The results will be concatenated at the end.
- Return type
tuple
- class mmrotate.models.dense_heads.RotatedAnchorFreeHead(num_classes, in_channels, feat_channels=256, stacked_convs=4, strides=(4, 8, 16, 32, 64), dcn_on_last_conv=False, conv_bias='auto', loss_cls={'alpha': 0.25, 'gamma': 2.0, 'loss_weight': 1.0, 'type': 'FocalLoss', 'use_sigmoid': True}, loss_bbox={'loss_weight': 1.0, 'type': 'IoULoss'}, bbox_coder={'type': 'DistancePointBBoxCoder'}, conv_cfg=None, norm_cfg=None, train_cfg=None, test_cfg=None, init_cfg={'layer': 'Conv2d', 'override': {'bias_prob': 0.01, 'name': 'conv_cls', 'std': 0.01, 'type': 'Normal'}, 'std': 0.01, 'type': 'Normal'})[source]
Rotated Anchor-free head (Rotated FCOS, etc.).
- Parameters
num_classes (int) – Number of categories excluding the background category.
in_channels (int) – Number of channels in the input feature map.
feat_channels (int) – Number of hidden channels. Used in child classes.
stacked_convs (int) – Number of stacking convs of the head.
strides (tuple) – Downsample factor of each feature map.
dcn_on_last_conv (bool) – If true, use dcn in the last layer of towers. Default: False.
conv_bias (bool | str) – If specified as auto, it will be decided by the norm_cfg. Bias of conv will be set as True if norm_cfg is None, otherwise False. Default: “auto”.
loss_cls (dict) – Config of classification loss.
loss_bbox (dict) – Config of localization loss.
bbox_coder (dict) – Config of bbox coder. Defaults to 'DistancePointBBoxCoder'.
conv_cfg (dict) – Config dict for convolution layer. Default: None.
norm_cfg (dict) – Config dict for normalization layer. Default: None.
train_cfg (dict) – Training config of anchor head.
test_cfg (dict) – Testing config of anchor head.
init_cfg (dict or list[dict], optional) – Initialization config dict.
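To make the role of `strides` more concrete, here is a minimal sketch of how an anchor-free head can map feature-map cells back to image coordinates. The 0.5 cell-center offset and the helper name `grid_points` are assumptions made for illustration; the source only states that each stride is the downsample factor of its feature map.

```python
from typing import List, Tuple


def grid_points(featmap_size: Tuple[int, int], stride: int,
                offset: float = 0.5) -> List[Tuple[float, float]]:
    """Image-space (x, y) location of every cell of one feature map."""
    height, width = featmap_size
    return [
        ((x + offset) * stride, (y + offset) * stride)
        for y in range(height)   # row-major order: y outer, x inner
        for x in range(width)
    ]


# A 2x2 feature map at stride 8 covers a 16x16 image region.
points = grid_points((2, 2), stride=8)
print(points)
```

Each level of the default `strides=(4, 8, 16, 32, 64)` would produce its own list of points in this scheme, one per feature-map cell.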
- class mmrotate.models.dense_heads.RotatedAnchorHead(num_classes, in_channels, feat_channels=256, anchor_generator={'octave_base_scale': 4, 'ratios': [1.0, 0.5, 2.0], 'scales_per_octave': 3, 'strides': [8, 16, 32, 64, 128], 'type': 'RotatedAnchorGenerator'}, bbox_coder={'target_means': (0.0, 0.0, 0.0, 0.0, 0.0), 'target_stds': (1.0, 1.0, 1.0, 1.0, 1.0), 'type': 'DeltaXYWHAOBBoxCoder'}, reg_decoded_bbox=False, assign_by_circumhbbox='oc', loss_cls={'alpha': 0.25, 'gamma': 2.0, 'loss_weight': 1.0, 'type': 'FocalLoss', 'use_sigmoid': True}, loss_bbox={'loss_weight': 1.0, 'type': 'L1Loss'}, train_cfg=None, test_cfg=None, init_cfg={'layer': 'Conv2d', 'std': 0.01, 'type': 'Normal'})[source]
Rotated Anchor-based head (RotatedRPN, RotatedRetinaNet, etc.).
- Parameters
num_classes (int) – Number of categories excluding the background category.
in_channels (int) – Number of channels in the input feature map.
feat_channels (int) – Number of hidden channels. Used in child classes.
anchor_generator (dict) – Config dict for anchor generator.
bbox_coder (dict) – Config of bounding box coder.
reg_decoded_bbox (bool) – If true, the regression loss is applied on decoded bounding boxes. Default: False.
assign_by_circumhbbox (str) – If None, the assigner assigns according to the IoU between the anchor and the GT (OBB), called RetinaNet-OBB. If set to an angle definition method (e.g. 'oc', the default in the signature), the assigner assigns according to the IoU between the anchor and the GT’s circumscribed box (HBB), called RetinaNet-HBB.
loss_cls (dict) – Config of classification loss.
loss_bbox (dict) – Config of localization loss.
train_cfg (dict) – Training config of anchor head.
test_cfg (dict) – Testing config of anchor head.
init_cfg (dict or list[dict], optional) – Initialization config dict.
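The circumscribed horizontal box (HBB) mentioned for `assign_by_circumhbbox` can be computed from a rotated box as in the sketch below. It assumes the angle `a` is given in radians and measured about the box center; the function name `circum_hbb` is hypothetical and this is a geometric illustration, not the library's conversion routine.

```python
import math
from typing import Tuple


def circum_hbb(cx: float, cy: float, w: float, h: float,
               a: float) -> Tuple[float, float, float, float]:
    """Axis-aligned box (x1, y1, x2, y2) enclosing a rotated box."""
    cos_a, sin_a = abs(math.cos(a)), abs(math.sin(a))
    # Extent of the rotated rectangle along the x and y axes.
    box_w = w * cos_a + h * sin_a
    box_h = w * sin_a + h * cos_a
    return (cx - box_w / 2, cy - box_h / 2, cx + box_w / 2, cy + box_h / 2)


print(circum_hbb(0.0, 0.0, 4.0, 2.0, 0.0))           # unrotated box
print(circum_hbb(0.0, 0.0, 4.0, 2.0, math.pi / 2))   # width and height swap
```

In the RetinaNet-HBB setting the IoU for assignment would then be taken between the anchor and this enclosing box rather than the oriented GT itself.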
- aug_test(feats, img_metas, rescale=False)[source]
Test det bboxes with test-time augmentation. Can be applied in DenseHead except for RPNHead and its variants, e.g., GARPNHead, etc.
- Parameters
feats (list[Tensor]) – The outer list indicates test-time augmentations, and each inner Tensor should have shape NxCxHxW and contain features for all images in the batch.
img_metas (list[list[dict]]) – The outer list indicates test-time augmentations (multiscale, flip, etc.) and the inner list indicates images in a batch. Each dict has image information.
rescale (bool, optional) – Whether to rescale the results. Defaults to False.
- Returns
- Each item in result_list is a 2-tuple.
The first item is bboxes with shape (n, 6), where 6 represents (x, y, w, h, a, score). The second item is labels with shape (n,). The length of the list should always be 1.
- Return type
list[tuple[Tensor, Tensor]]
- forward(feats)[source]
Forward features from the upstream network.
- Parameters
feats (tuple[Tensor]) – Features from the upstream network, each is a 4D-tensor.
- Returns
A tuple of classification scores and bbox predictions.
cls_scores (list[Tensor]): Classification scores for all scale levels, each is a 4D-tensor; the number of channels is num_anchors * num_classes.
bbox_preds (list[Tensor]): Box energies / deltas for all scale levels, each is a 4D-tensor; the number of channels is num_anchors * 5.
- Return type
tuple
- forward_single(x)[source]
Forward feature of a single scale level.
- Parameters
x (torch.Tensor) – Features of a single scale level.
- Returns
cls_score (torch.Tensor): Classification scores for a single scale level; the number of channels is num_anchors * num_classes.
bbox_pred (torch.Tensor): Box energies / deltas for a single scale level; the number of channels is num_anchors * 5.
- Return type
tuple (torch.Tensor)
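The channel counts above translate into output shapes as in this small sketch. It only does the arithmetic stated in the docstring (num_anchors * num_classes and num_anchors * 5); the helper name `head_output_shapes` and the example sizes are hypothetical.

```python
from typing import Tuple

Shape = Tuple[int, int, int, int]


def head_output_shapes(batch: int, featmap_hw: Tuple[int, int],
                       num_anchors: int, num_classes: int,
                       box_dim: int = 5) -> Tuple[Shape, Shape]:
    """Expected (N, C, H, W) shapes of cls_score and bbox_pred."""
    height, width = featmap_hw
    cls_shape = (batch, num_anchors * num_classes, height, width)
    # box_dim = 5 for rotated boxes: (cx, cy, w, h, a) deltas per anchor.
    bbox_shape = (batch, num_anchors * box_dim, height, width)
    return cls_shape, bbox_shape


# Hypothetical setting: 9 anchors per location, 15 classes, 32x32 feature map.
cls_shape, bbox_shape = head_output_shapes(2, (32, 32), 9, 15)
print(cls_shape, bbox_shape)
```

A quick shape check of this kind can help catch a mismatch between the anchor generator config and the head's convolution outputs.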
- get_anchors(featmap_sizes, img_metas, device='cuda')[source]
Get anchors according to feature map sizes.
- Parameters
featmap_sizes (list[tuple]) – Multi-level feature map sizes.
img_metas (list[dict]) – Image meta info.
device (torch.device | str) – Device for returned tensors.
- Returns
anchor_list (list[Tensor]): Anchors of each image.
valid_flag_list (list[Tensor]): Valid flags of each image.
- Return type
tuple (list[Tensor])
- get_bboxes(cls_scores, bbox_preds, img_metas, cfg=None, rescale=False, with_nms=True)[source]
Transform network output for a batch into bbox predictions.
- Parameters
cls_scores (list[Tensor]) – Box scores for each scale level, with shape (N, num_anchors * num_classes, H, W).
bbox_preds (list[Tensor]) – Box energies / deltas for each scale level, with shape (N, num_anchors * 5, H, W).
img_metas (list[dict]) – Meta information of each image, e.g., image size, scaling factor, etc.
cfg (mmcv.Config | None) – Test / postprocessing configuration. If None, test_cfg is used.
rescale (bool) – If True, return boxes in original image space. Default: False.
with_nms (bool) – If True, apply NMS before returning boxes. Default: True.
- Returns
- Each item in result_list is a 2-tuple.
The first item is an (n, 6) tensor, where the first 5 columns are bounding box positions (cx, cy, w, h, a) and the 6th column is a score between 0 and 1. The second item is an (n,) tensor where each item is the predicted class label of the corresponding box.
- Return type
list[tuple[Tensor, Tensor]]
Example
>>> import mmcv
>>> import torch
>>> self = AnchorHead(
>>>     num_classes=9,
>>>     in_channels=1,
>>>     anchor_generator=dict(
>>>         type='AnchorGenerator',
>>>         scales=[8],
>>>         ratios=[0.5, 1.0, 2.0],
>>>         strides=[4,]))
>>> img_metas = [{'img_shape': (32, 32, 3), 'scale_factor': 1}]
>>> cfg = mmcv.Config(dict(
>>>     score_thr=0.00,
>>>     nms=dict(type='nms', iou_thr=1.0),
>>>     max_per_img=10))
>>> feat = torch.rand(1, 1, 3, 3)
>>> cls_score, bbox_pred = self.forward_single(feat)
>>> # note the input lists are over different levels, not images
>>> cls_scores, bbox_preds = [cls_score], [bbox_pred]
>>> result_list = self.get_bboxes(cls_scores, bbox_preds,
>>>                               img_metas, cfg)
>>> det_bboxes, det_labels = result_list[0]
>>> assert len(result_list) == 1
>>> assert det_bboxes.shape[1] == 5
>>> assert len(det_bboxes) == len(det_labels) == cfg.max_per_img
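The `score_thr` and `max_per_img` keys used in the test config can be pictured with the simplified sketch below. It deliberately leaves out NMS and works on plain lists, so it is only an approximation of the postprocessing; the function name `filter_detections` and the sample rows are hypothetical.

```python
from typing import List, Sequence


def filter_detections(dets: Sequence[Sequence[float]], score_thr: float,
                      max_per_img: int) -> List[List[float]]:
    """Keep rows whose last column (score) exceeds score_thr, best first."""
    kept = [list(d) for d in dets if d[-1] > score_thr]
    kept.sort(key=lambda d: d[-1], reverse=True)
    return kept[:max_per_img]


# Hypothetical (cx, cy, w, h, a, score) rows, for illustration only.
dets = [
    [10.0, 10.0, 4.0, 2.0, 0.1, 0.90],
    [12.0, 11.0, 4.0, 2.0, 0.1, 0.20],
    [30.0, 25.0, 8.0, 3.0, -0.4, 0.60],
]
top = filter_detections(dets, score_thr=0.3, max_per_img=2)
print(top)
```

With `score_thr=0.00` and a large `iou_thr`, as in the doctest above, almost nothing is filtered, which is why that example can assert exactly `max_per_img` results.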
- get_targets(anchor_list, valid_flag_list, gt_bboxes_list, img_metas, gt_bboxes_ignore_list=None, gt_labels_list=None, label_channels=1, unmap_outputs=True, return_sampling_results=False)[source]
Compute regression and classification targets for anchors in multiple images.
- Parameters
anchor_list (list[list[Tensor]]) – Multi level anchors of each image. The outer list indicates images, and the inner list corresponds to feature levels of the image. Each element of the inner list is a tensor of shape (num_anchors, 5).
valid_flag_list (list[list[Tensor]]) – Multi level valid flags of each image. The outer list indicates images, and the inner list corresponds to feature levels of the image. Each element of the inner list is a tensor of shape (num_anchors, )
gt_bboxes_list (list[Tensor]) – Ground truth bboxes of each image.
img_metas (list[dict]) – Meta info of each image.
gt_bboxes_ignore_list (list[Tensor]) – Ground truth bboxes to be ignored.
gt_labels_list (list[Tensor]) – Ground truth labels of each box.
label_channels (int) – Channel of label.
unmap_outputs (bool) – Whether to map outputs back to the original set of anchors.
- Returns
Usually returns a tuple containing learning targets.
labels_list (list[Tensor]): Labels of each level.
label_weights_list (list[Tensor]): Label weights of each level.
bbox_targets_list (list[Tensor]): BBox targets of each level.
bbox_weights_list (list[Tensor]): BBox weights of each level.
num_total_pos (int): Number of positive samples in all images.
num_total_neg (int): Number of negative samples in all images.
- additional_returns: This function enables user-defined returns from self._get_targets_single. These returns are currently restricted to properties at each feature map (i.e. having HxW dimension). The results will be concatenated at the end.
- Return type
tuple
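Inputs to `get_targets` are grouped per image, while the returned lists are grouped per level. The sketch below shows one way such a regrouping can work on plain Python lists; it is a stand-in for illustration, and the function name `regroup_by_level` and the toy data are hypothetical.

```python
from typing import List, Sequence


def regroup_by_level(per_image: Sequence[Sequence[int]],
                     num_level_anchors: Sequence[int]) -> List[List[List[int]]]:
    """Turn [image][all anchors] targets into [level][image][anchors]."""
    per_level = []
    start = 0
    for count in num_level_anchors:
        end = start + count
        # Take the same slice of anchors from every image.
        per_level.append([list(img[start:end]) for img in per_image])
        start = end
    return per_level


# Two images, each with 4 anchors on level 0 and 2 anchors on level 1.
labels_per_image = [
    [0, 1, 2, 3, 4, 5],
    [10, 11, 12, 13, 14, 15],
]
labels_per_level = regroup_by_level(labels_per_image, [4, 2])
print(labels_per_level)
```

The same regrouping would apply to label weights, bbox targets and bbox weights, so that each per-level loss call sees the targets of all images for that level.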
- loss(cls_scores, bbox_preds, gt_bboxes, gt_labels, img_metas, gt_bboxes_ignore=None)[source]
Compute losses of the head.
- Parameters
cls_scores (list[Tensor]) – Box scores for each scale level, with shape (N, num_anchors * num_classes, H, W).
bbox_preds (list[Tensor]) – Box energies / deltas for each scale level, with shape (N, num_anchors * 5, H, W).
gt_bboxes (list[Tensor]) – Ground truth bboxes for each image, with shape (num_gts, 5) in [cx, cy, w, h, a] format.
gt_labels (list[Tensor]) – Class indices corresponding to each box.
img_metas (list[dict]) – Meta information of each image, e.g., image size, scaling factor, etc.
gt_bboxes_ignore (None | list[Tensor]) – Specifies which bounding boxes can be ignored when computing the loss. Default: None.
- Returns
A dictionary of loss components.
- Return type
dict[str, Tensor]
- loss_single(cls_score, bbox_pred, anchors, labels, label_weights, bbox_targets, bbox_weights, num_total_samples)[source]
Compute loss of a single scale level.
- Parameters
cls_score (torch.Tensor) – Box scores for each scale level, with shape (N, num_anchors * num_classes, H, W).
bbox_pred (torch.Tensor) – Box energies / deltas for each scale level, with shape (N, num_anchors * 5, H, W).
anchors (torch.Tensor) – Box reference for each scale level, with shape (N, num_total_anchors, 5).
labels (torch.Tensor) – Labels of each anchor, with shape (N, num_total_anchors).
label_weights (torch.Tensor) – Label weights of each anchor, with shape (N, num_total_anchors).
bbox_targets (torch.Tensor) – BBox regression targets of each anchor.
bbox_weights (torch.Tensor) – BBox regression loss weights of each anchor, with shape (N, num_total_anchors, 5).
num_total_samples (int) – If sampling, num_total_samples equals the number of total anchors; otherwise, it is the number of positive anchors.
- Returns
loss_cls (torch.Tensor): Classification loss for each scale level.
loss_bbox (torch.Tensor): Regression loss for each scale level.
- Return type
tuple (torch.Tensor)
- merge_aug_bboxes(aug_bboxes, aug_scores, img_metas)[source]
Merge augmented detection bboxes and scores.
- Parameters
aug_bboxes (list[Tensor]) – shape (n, 4*#class)
aug_scores (list[Tensor] or None) – shape (n, #class)
img_shapes (list[Tensor]) – shape (3, ).
- Returns
bboxes with shape (n, 4), where 4 represents (tl_x, tl_y, br_x, br_y), and scores with shape (n,).
- Return type
tuple[Tensor]
- class mmrotate.models.dense_heads.RotatedFCOSHead(num_classes, in_channels, regress_ranges=((-1, 64), (64, 128), (128, 256), (256, 512), (512, 100000000.0)), center_sampling=False, center_sample_radius=1.5, norm_on_bbox=False, centerness_on_reg=False, separate_angle=False, scale_angle=True, h_bbox_coder={'type': 'DistancePointBBoxCoder'}, loss_cls={'alpha': 0.25, 'gamma': 2.0, 'loss_weight': 1.0, 'type': 'FocalLoss', 'use_sigmoid': True}, loss_bbox={'loss_weight': 1.0, 'type': 'IoULoss'}, loss_angle={'loss_weight': 1.0, 'type': 'L1Loss'}, loss_centerness={'loss_weight': 1.0, 'type': 'CrossEntropyLoss', 'use_sigmoid': True}, norm_cfg={'num_groups': 32, 'requires_grad': True, 'type': 'GN'}, init_cfg={'layer': 'Conv2d', 'override': {'bias_prob': 0.01, 'name': 'conv_cls', 'std': 0.01, 'type': 'Normal'}, 'std': 0.01, 'type': 'Normal'}, **kwargs)[source]
Anchor-free head used in FCOS. The FCOS head does not use anchor boxes. Instead, bounding boxes are predicted at each pixel and a centerness measure is used to suppress low-quality predictions. Here norm_on_bbox, centerness_on_reg and dcn_on_last_conv are training tricks used in the official repo, which bring remarkable mAP gains of up to 4.9. Please see https://github.com/tianzhi0549/FCOS for more detail.
- Parameters
num_classes (int) – Number of categories excluding the background category.
in_channels (int) – Number of channels in the input feature map.
strides (list[int] | list[tuple[int, int]]) – Strides of points in multiple feature levels. Default: (4, 8, 16, 32, 64).
regress_ranges (tuple[tuple[int, int]]) – Regress range of multiple level points.
center_sampling (bool) – If true, use center sampling. Default: False.
center_sample_radius (float) – Radius of center sampling. Default: 1.5.
norm_on_bbox (bool) – If true, normalize the regression targets with FPN strides. Default: False.
centerness_on_reg (bool) – If true, position centerness on the regress branch. Please refer to https://github.com/tianzhi0549/FCOS/issues/89#issuecomment-516877042. Default: False.
separate_angle (bool) – If true, angle prediction is separated from bbox regression loss. Default: False.
scale_angle (bool) – If true, add scale to angle pred branch. Default: True.
h_bbox_coder (dict) – Config of horizontal bbox coder, only used when separate_angle is True.
conv_bias (bool | str) – If specified as auto, it will be decided by the norm_cfg. Bias of conv will be set as True if norm_cfg is None, otherwise False. Default: “auto”.
loss_cls (dict) – Config of classification loss.
loss_bbox (dict) – Config of localization loss.
loss_angle (dict) – Config of angle loss, only used when separate_angle is True.
loss_centerness (dict) – Config of centerness loss.
norm_cfg (dict) – Dictionary to construct and configure the norm layer. Default: norm_cfg=dict(type='GN', num_groups=32, requires_grad=True).
init_cfg (dict or list[dict], optional) – Initialization config dict.
Example
>>> import torch
>>> self = RotatedFCOSHead(11, 7)
>>> feats = [torch.rand(1, 7, s, s) for s in [4, 8, 16, 32, 64]]
>>> cls_score, bbox_pred, angle_pred, centerness = self.forward(feats)
>>> assert len(cls_score) == len(self.scales)
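The default `regress_ranges` split objects across feature levels by size. As a rough sketch, assuming the usual FCOS rule that a point belongs to the level whose range contains its largest regression distance, the lookup could look like this. The function name `level_for_distance` is hypothetical, and returning the first matching level for boundary values is a simplification of this sketch.

```python
from typing import Optional, Sequence, Tuple

# Default ranges copied from the class signature above.
REGRESS_RANGES = ((-1, 64), (64, 128), (128, 256), (256, 512), (512, 1e8))


def level_for_distance(
    max_regress_distance: float,
    regress_ranges: Sequence[Tuple[float, float]] = REGRESS_RANGES,
) -> Optional[int]:
    """Index of the first level whose range contains the distance."""
    for level, (low, high) in enumerate(regress_ranges):
        if low <= max_regress_distance <= high:
            return level
    return None


for distance in (30, 100, 600):
    print(distance, "-> level", level_for_distance(distance))
```

Small objects therefore land on the high-resolution levels and large objects on the coarse ones, matching the order of the head's strides.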
- centerness_target(pos_bbox_targets)[source]
Compute centerness targets.
- Parameters
pos_bbox_targets (Tensor) – BBox targets of positive bboxes, with shape (num_pos, 4).
- Returns
Centerness target.
- Return type
Tensor
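For reference, the centerness measure from the FCOS paper can be written for a single point as below. The sketch assumes the four target values are positive distances ordered (left, top, right, bottom), which is an assumption about the layout of `pos_bbox_targets`; the function name is hypothetical.

```python
import math


def centerness(left: float, top: float, right: float, bottom: float) -> float:
    """FCOS centerness: 1.0 at the box center, approaching 0 near the edges."""
    left_right = min(left, right) / max(left, right)
    top_bottom = min(top, bottom) / max(top, bottom)
    return math.sqrt(left_right * top_bottom)


print(centerness(5.0, 5.0, 5.0, 5.0))  # point at the exact center
print(centerness(1.0, 4.0, 4.0, 1.0))  # point close to the top-left corner... of the box interior
```

Points far from the center get a low target, which is what lets the centerness branch down-weight low-quality predictions at test time.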
- forward(feats)[source]
Forward features from the upstream network.
- Parameters
feats (tuple[Tensor]) – Features from the upstream network, each is a 4D-tensor.
- Returns
cls_scores (list[Tensor]): Box scores for each scale level, each is a 4D-tensor; the number of channels is num_points * num_classes.
bbox_preds (list[Tensor]): Box energies / deltas for each scale level, each is a 4D-tensor; the number of channels is num_points * 4.
angle_preds (list[Tensor]): Box angle for each scale level, each is a 4D-tensor; the number of channels is num_points * 1.
centernesses (list[Tensor]): Centerness for each scale level, each is a 4D-tensor; the number of channels is num_points * 1.
- Return type
tuple
- forward_single(x, scale, stride)[source]
Forward features of a single scale level.
- Parameters
x (Tensor) – FPN feature maps of the specified stride.
scale (mmcv.cnn.Scale) – Learnable scale module to resize the bbox prediction.
stride (int) – The corresponding stride for feature maps, only used to normalize the bbox prediction when self.norm_on_bbox is True.
- Returns
Scores for each class, bbox predictions, angle predictions and centerness predictions of input feature maps.
- Return type
tuple
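How `scale`, `stride` and `norm_on_bbox` might interact can be sketched on a single scalar prediction. This mirrors the behaviour commonly found in FCOS-style heads (clamp and multiply by the stride at test time when `norm_on_bbox` is set, exponentiate otherwise) and is an assumption, not a transcription of this head's code; `decode_distance` is a hypothetical name.

```python
import math


def decode_distance(raw: float, scale: float, stride: int,
                    norm_on_bbox: bool, training: bool) -> float:
    """Turn one raw regression output into a non-negative distance."""
    value = raw * scale  # learnable per-level scale
    if norm_on_bbox:
        value = max(value, 0.0)   # keep distances non-negative
        if not training:
            value *= stride       # undo the stride normalization at test time
    else:
        value = math.exp(value)   # exp keeps the distance positive
    return value


print(decode_distance(0.5, 2.0, 8, norm_on_bbox=True, training=False))
print(decode_distance(0.0, 1.0, 8, norm_on_bbox=False, training=True))
```

In this reading, `stride` has no effect unless `norm_on_bbox` is enabled, which agrees with the parameter description above.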
- get_bboxes(cls_scores, bbox_preds, angle_preds, centernesses, img_metas, cfg=None, rescale=None)[source]
Transform network output for a batch into bbox predictions.
- Parameters
cls_scores (list[Tensor]) – Box scores for each scale level, with shape (N, num_points * num_classes, H, W).
bbox_preds (list[Tensor]) – Box energies / deltas for each scale level, with shape (N, num_points * 4, H, W).
angle_preds (list[Tensor]) – Box angle for each scale level, with shape (N, num_points * 1, H, W).
centernesses (list[Tensor]) – Centerness for each scale level, with shape (N, num_points * 1, H, W).
img_metas (list[dict]) – Meta information of each image, e.g., image size, scaling factor, etc.
cfg (mmcv.Config) – Test / postprocessing configuration. If None, test_cfg is used.
rescale (bool) – If True, return boxes in original image space.
- Returns
- Each item in result_list is a 2-tuple.
The first item is an (n, 6) tensor, where the first 5 columns are bounding box positions (x, y, w, h, angle) and the 6th column is a score between 0 and 1. The second item is an (n,) tensor where each item is the predicted class label of the corresponding box.
- Return type
list[tuple[Tensor, Tensor]]
- get_targets(points, gt_bboxes_list, gt_labels_list)[source]
Compute regression, classification and centerness targets for points in multiple images.
- Parameters
points (list[Tensor]) – Points of each FPN level, each has shape (num_points, 2).
gt_bboxes_list (list[Tensor]) – Ground truth bboxes of each image, each has shape (num_gt, 4).
gt_labels_list (list[Tensor]) – Ground truth labels of each box, each has shape (num_gt,).
- Returns
concat_lvl_labels (list[Tensor]): Labels of each level.
concat_lvl_bbox_targets (list[Tensor]): BBox targets of each level.
concat_lvl_angle_targets (list[Tensor]): Angle targets of each level.
- Return type
tuple
- loss(cls_scores, bbox_preds, angle_preds, centernesses, gt_bboxes, gt_labels, img_metas, gt_bboxes_ignore=None)[source]
Compute loss of the head.
- Parameters
cls_scores (list[Tensor]) – Box scores for each scale level, each is a 4D-tensor; the number of channels is num_points * num_classes.
bbox_preds (list[Tensor]) – Box energies / deltas for each scale level, each is a 4D-tensor; the number of channels is num_points * 4.
angle_preds (list[Tensor]) – Box angle for each scale level, each is a 4D-tensor; the number of channels is num_points * 1.
centernesses (list[Tensor]) – Centerness for each scale level, each is a 4D-tensor; the number of channels is num_points * 1.
gt_bboxes (list[Tensor]) – Ground truth bboxes for each image, with shape (num_gts, 4) in [tl_x, tl_y, br_x, br_y] format.
gt_labels (list[Tensor]) – Class indices corresponding to each box.
img_metas (list[dict]) – Meta information of each image, e.g., image size, scaling factor, etc.
gt_bboxes_ignore (None | list[Tensor]) – Specifies which bounding boxes can be ignored when computing the loss.
- Returns
A dictionary of loss components.
- Return type
dict[str, Tensor]
- class mmrotate.models.dense_heads.RotatedRPNHead(in_channels, init_cfg={'layer': 'Conv2d', 'std': 0.01, 'type': 'Normal'}, version='oc', **kwargs)[source]
Rotated RPN head for rotated bboxes.
- Parameters
in_channels (int) – Number of channels in the input feature map.
init_cfg (dict or list[dict], optional) – Initialization config dict.
- get_bboxes(cls_scores, bbox_preds, img_metas, cfg=None, rescale=False, with_nms=True)[源代码]¶
Transform network output for a batch into bbox predictions.
- 参数
cls_scores (list[Tensor]) – Box scores for each scale level Has shape (N, num_anchors * num_classes, H, W)
bbox_preds (list[Tensor]) – Box energies / deltas for each scale level with shape (N, num_anchors * 5, H, W)
img_metas (list[dict]) – Meta information of each image, e.g., image size, scaling factor, etc.
cfg (mmcv.Config | None) – Test / postprocessing configuration, if None, test_cfg would be used
rescale (bool) – If True, return boxes in original image space. Default: False.
with_nms (bool) – If True, do nms before return boxes. Default: True.
- 返回
- Each item in result_list is 2-tuple.
The first item is an (n, 6) tensor, where the first 5 columns are bounding box positions (cx, cy, w, h, a) and the 6-th column is a score between 0 and 1. The second item is a (n,) tensor where each item is the predicted class label of the corresponding box.
- 返回类型
list[tuple[Tensor, Tensor]]
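The return layout described above can be unpacked as in the following minimal sketch. It uses plain Python lists in place of tensors; `split_detections` and the sample values are hypothetical and are not part of mmrotate.

```python
def split_detections(det_bboxes, det_labels):
    """Split (n, 6) detections into boxes, scores, and labels.

    Each row of det_bboxes is assumed to be (cx, cy, w, h, a, score),
    as described for the first item of each 2-tuple.
    """
    boxes = [row[:5] for row in det_bboxes]
    scores = [row[5] for row in det_bboxes]
    return boxes, scores, list(det_labels)


# Hypothetical result for one image: two detections of class 0.
result = (
    [[50.0, 40.0, 20.0, 10.0, 0.3, 0.92],
     [12.0, 18.0, 6.0, 4.0, -0.1, 0.55]],
    [0, 0],
)
boxes, scores, labels = split_detections(*result)
```

With real outputs the same split would be done by slicing the tensor columns instead of iterating over rows.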
- get_targets(anchor_list, valid_flag_list, gt_bboxes_list, img_metas, gt_bboxes_ignore_list=None, gt_labels_list=None, label_channels=1, unmap_outputs=True, return_sampling_results=False)[源代码]¶
Compute regression and classification targets for anchors in multiple images.
- 参数
anchor_list (list[list[Tensor]]) – Multi level anchors of each image. The outer list indicates images, and the inner list corresponds to feature levels of the image. Each element of the inner list is a tensor of shape (num_anchors, 4).
valid_flag_list (list[list[Tensor]]) – Multi level valid flags of each image. The outer list indicates images, and the inner list corresponds to feature levels of the image. Each element of the inner list is a tensor of shape (num_anchors, )
gt_bboxes_list (list[Tensor]) – Ground truth bboxes of each image.
img_metas (list[dict]) – Meta info of each image.
gt_bboxes_ignore_list (list[Tensor]) – Ground truth bboxes to be ignored.
gt_labels_list (list[Tensor]) – Ground truth labels of each box.
label_channels (int) – Channel of label.
unmap_outputs (bool) – Whether to map outputs back to the original set of anchors.
- 返回
Usually returns a tuple containing learning targets.
labels_list (list[Tensor]): Labels of each level.
label_weights_list (list[Tensor]): Label weights of each level.
bbox_targets_list (list[Tensor]): BBox targets of each level.
bbox_weights_list (list[Tensor]): BBox weights of each level.
num_total_pos (int): Number of positive samples in all images.
num_total_neg (int): Number of negative samples in all images.
- additional_returns: This function enables user-defined returns from self._get_targets_single. These returns are currently refined to properties at each feature map (i.e. having HxW dimension). The results will be concatenated at the end.
- 返回类型
tuple
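The inputs above are indexed by image first and by feature level second, while the returned targets are lists with one entry per level. The sketch below only illustrates that change of index order on placeholder strings; `images_to_levels` here is a hypothetical helper written for this example, and the real implementation may also concatenate tensors along the way.

```python
def images_to_levels(per_image):
    """Regroup a [image][level] nested list into [level][image]."""
    num_levels = len(per_image[0])
    # Every image is assumed to have the same number of feature levels.
    if any(len(levels) != num_levels for levels in per_image):
        raise ValueError("all images must have the same number of levels")
    return [[levels[i] for levels in per_image] for i in range(num_levels)]


# Placeholder entries stand in for per-level anchor tensors.
anchor_list = [
    ["img0_lvl0", "img0_lvl1", "img0_lvl2"],
    ["img1_lvl0", "img1_lvl1", "img1_lvl2"],
]
by_level = images_to_levels(anchor_list)
```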
- loss(cls_scores, bbox_preds, gt_bboxes, img_metas, gt_bboxes_ignore=None)[源代码]¶
Compute losses of the head.
- 参数
cls_scores (list[Tensor]) – Box scores for each scale level, with shape (N, num_anchors * num_classes, H, W).
bbox_preds (list[Tensor]) – Box energies / deltas for each scale level with shape (N, num_anchors * 5, H, W)
gt_bboxes (list[Tensor]) – Ground truth bboxes for each image with shape (num_gts, 5) in [cx, cy, w, h, a] format.
gt_labels (list[Tensor]) – class indices corresponding to each box
img_metas (list[dict]) – Meta information of each image, e.g., image size, scaling factor, etc.
gt_bboxes_ignore (None | list[Tensor]) – specify which bounding boxes can be ignored when computing the loss. Default: None
- 返回
A dictionary of loss components.
- 返回类型
dict[str, Tensor]
- loss_single(cls_score, bbox_pred, anchors, labels, label_weights, bbox_targets, bbox_weights, num_total_samples)[源代码]¶
Compute loss of a single scale level.
- 参数
cls_score (torch.Tensor) – Box scores for each scale level, with shape (N, num_anchors * num_classes, H, W).
bbox_pred (torch.Tensor) – Box energies / deltas for each scale level with shape (N, num_anchors * 5, H, W).
anchors (torch.Tensor) – Box reference for each scale level with shape (N, num_total_anchors, 4).
labels (torch.Tensor) – Labels of each anchors with shape (N, num_total_anchors).
label_weights (torch.Tensor) – Label weights of each anchor with shape (N, num_total_anchors)
bbox_targets (torch.Tensor) – BBox regression targets of each anchor.
bbox_weights (torch.Tensor) – BBox regression loss weights of each anchor with shape (N, num_total_anchors, 4).
num_total_samples (int) – If sampling, num total samples equal to the number of total anchors; Otherwise, it is the number of positive anchors.
- 返回
A dictionary of loss components.
- 返回类型
dict[str, Tensor]
- class mmrotate.models.dense_heads.RotatedRepPointsHead(num_classes, in_channels, feat_channels, point_feat_channels=256, stacked_convs=3, num_points=9, gradient_mul=0.1, point_strides=[8, 16, 32, 64, 128], point_base_scale=4, conv_bias='auto', loss_cls={'alpha': 0.25, 'gamma': 2.0, 'loss_weight': 1.0, 'type': 'FocalLoss', 'use_sigmoid': True}, loss_bbox_init={'beta': 0.1111111111111111, 'loss_weight': 0.5, 'type': 'SmoothL1Loss'}, loss_bbox_refine={'beta': 0.1111111111111111, 'loss_weight': 1.0, 'type': 'SmoothL1Loss'}, conv_cfg=None, norm_cfg=None, train_cfg=None, test_cfg=None, center_init=True, transform_method='rotrect', use_reassign=False, topk=6, anti_factor=0.75, version='oc', init_cfg={'layer': 'Conv2d', 'override': {'bias_prob': 0.01, 'name': 'reppoints_cls_out', 'std': 0.01, 'type': 'Normal'}, 'std': 0.01, 'type': 'Normal'}, **kwargs)[源代码]¶
Rotated RepPoints head.
- 参数
num_classes (int) – Number of classes.
in_channels (int) – Number of input channels.
feat_channels (int) – Number of feature channels.
point_feat_channels (int, optional) – Number of channels of points features.
stacked_convs (int, optional) – Number of stacked convolutions.
num_points (int, optional) – Number of points in points set.
gradient_mul (float, optional) – The multiplier to gradients from points refinement and recognition.
point_strides (Iterable, optional) – points strides.
point_base_scale (int, optional) – Bbox scale for assigning labels.
conv_bias (str, optional) – The bias of convolution.
loss_cls (dict, optional) – Config of classification loss.
loss_bbox_init (dict, optional) – Config of initial points loss.
loss_bbox_refine (dict, optional) – Config of points loss in refinement.
conv_cfg (dict, optional) – The config of convolution.
norm_cfg (dict, optional) – The config of normalization.
train_cfg (dict, optional) – The config of train.
test_cfg (dict, optional) – The config of test.
center_init (bool, optional) – Whether to use center point assignment.
transform_method (str, optional) – The methods to transform RepPoints to bbox.
use_reassign (bool, optional) – Whether to reassign samples.
topk (int, optional) – Number of the highest topk points. Defaults to 6.
anti_factor (float, optional) – Feature anti-aliasing coefficient.
version (str, optional) – Angle representations. Defaults to ‘oc’.
init_cfg (dict or list[dict], optional) – Initialization config dict.
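As an illustration of how these parameters might appear in a config file, here is a sketch of a head config dict. The optional values are copied from the defaults in the signature above; `num_classes`, `in_channels`, and `feat_channels` are hypothetical placeholders, and the use of a `type` key naming the class follows the usual registry convention rather than anything stated in this section.

```python
# Sketch of a bbox_head config entry; not taken from an official config.
bbox_head = dict(
    type='RotatedRepPointsHead',
    num_classes=15,       # hypothetical: depends on the dataset
    in_channels=256,      # hypothetical: must match the neck output
    feat_channels=256,    # hypothetical
    point_feat_channels=256,
    stacked_convs=3,
    num_points=9,
    gradient_mul=0.1,
    point_strides=[8, 16, 32, 64, 128],
    point_base_scale=4,
    center_init=True,
    transform_method='rotrect',
    use_reassign=False,
    topk=6,
    anti_factor=0.75,
    version='oc',
)
```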
- get_bboxes(cls_scores, pts_preds_init, pts_preds_refine, img_metas, cfg=None, rescale=False, with_nms=True, **kwargs)[源代码]¶
Transform network outputs of a batch into bbox results.
- 参数
cls_scores (list[Tensor]) – Classification scores for all scale levels, each is a 4D-tensor, has shape (batch_size, num_priors * num_classes, H, W).
pts_preds_init (list[Tensor]) – Box energies / deltas for all scale levels, each is a 4D-tensor with shape (batch_size, num_points * 2, H, W).
pts_preds_refine (list[Tensor]) – Box energies / deltas for all scale levels, each is a 4D-tensor with shape (batch_size, num_points * 2, H, W).
img_metas (list[dict], Optional) – Image meta info. Default None.
cfg (mmcv.Config, Optional) – Test / postprocessing configuration, if None, test_cfg would be used. Default None.
rescale (bool) – If True, return boxes in original image space. Default False.
with_nms (bool) – If True, do nms before return boxes. Default True.
- 返回
- Each item in result_list is a 2-tuple.
The first item is an (n, 6) tensor, where the first 5 columns are bounding box positions (cx, cy, w, h, a) and the 6-th column is a score between 0 and 1. The second item is a (n,) tensor where each item is the predicted class label of the corresponding box.
- 返回类型
list[list[Tensor, Tensor]]
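The point predictions carry num_points * 2 channels per location, so the default num_points=9 gives 18 values. The sketch below groups such a channel vector into coordinate pairs. It assumes the two coordinates of each point are stored next to each other, which this section does not state, and it makes no claim about which coordinate comes first; `group_point_pairs` is a hypothetical helper.

```python
def group_point_pairs(channel_values, num_points=9):
    """Group 2 * num_points channel values into num_points pairs.

    Assumes an interleaved layout (two consecutive values per point).
    """
    if len(channel_values) != 2 * num_points:
        raise ValueError("expected 2 * num_points channel values")
    return [
        tuple(channel_values[i:i + 2])
        for i in range(0, 2 * num_points, 2)
    ]


# Placeholder values 0..17 stand in for one location's predictions.
pairs = group_point_pairs(list(range(18)))
```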
- get_cfa_targets(proposals_list, valid_flag_list, gt_bboxes_list, img_metas, gt_bboxes_ignore_list=None, gt_labels_list=None, stage='init', label_channels=1, unmap_outputs=True)[源代码]¶
Compute corresponding GT box and classification targets for proposals.
- 参数
proposals_list (list[list]) – Multi level points/bboxes of each image.
valid_flag_list (list[list]) – Multi level valid flags of each image.
gt_bboxes_list (list[Tensor]) – Ground truth bboxes of each image.
img_metas (list[dict]) – Meta info of each image.
gt_bboxes_ignore_list (list[Tensor]) – Ground truth bboxes to be ignored.
gt_labels_list (list[Tensor]) – Ground truth labels of each box.
stage (str) – init or refine. Generate target for init stage or refine stage
label_channels (int) – Channel of label.
unmap_outputs (bool) – Whether to map outputs back to the original set of anchors.
- 返回
all_labels (list[Tensor]): Labels of each level.
all_label_weights (list[Tensor]): Label weights of each level.
all_bbox_gt (list[Tensor]): Ground truth bbox of each level.
all_proposals (list[Tensor]): Proposals(points/bboxes) of each level.
all_proposal_weights (list[Tensor]): Proposal weights of each level.
pos_inds (list[Tensor]): Index of positive samples in all images.
gt_inds (list[Tensor]): Index of ground truth bbox in all images.
- 返回类型
tuple
- get_points(featmap_sizes, img_metas, device)[源代码]¶
Get points according to feature map sizes.
- 参数
featmap_sizes (list[tuple]) – Multi-level feature map sizes.
img_metas (list[dict]) – Image meta info.
- 返回
points of each image, valid flags of each image
- 返回类型
tuple
- get_pos_loss(cls_score, pts_pred, label, bbox_gt, label_weight, convex_weight, pos_inds)[源代码]¶
Calculate loss of all potential positive samples obtained from first match process.
- 参数
cls_score (Tensor) – Box scores of single image with shape (num_anchors, num_classes)
pts_pred (Tensor) – Box energies / deltas of single image with shape (num_anchors, 4)
label (Tensor) – classification target of each anchor with shape (num_anchors,)
bbox_gt (Tensor) – Ground truth box.
label_weight (Tensor) – Classification loss weight of each anchor with shape (num_anchors).
convex_weight (Tensor) – Bbox weight of each anchor with shape (num_anchors, 4).
pos_inds (Tensor) – Index of all positive samples got from first assign process.
- 返回
Losses of all positive samples in single image.
- 返回类型
Tensor
- get_targets(proposals_list, valid_flag_list, gt_bboxes_list, img_metas, gt_bboxes_ignore_list=None, gt_labels_list=None, stage='init', label_channels=1, unmap_outputs=True)[源代码]¶
Compute corresponding GT box and classification targets for proposals.
- 参数
proposals_list (list[list]) – Multi level points/bboxes of each image.
valid_flag_list (list[list]) – Multi level valid flags of each image.
gt_bboxes_list (list[Tensor]) – Ground truth bboxes of each image.
img_metas (list[dict]) – Meta info of each image.
gt_bboxes_ignore_list (list[Tensor]) – Ground truth bboxes to be ignored.
gt_labels_list (list[Tensor]) – Ground truth labels of each box.
stage (str) – init or refine. Generate target for init stage or refine stage
label_channels (int) – Channel of label.
unmap_outputs (bool) – Whether to map outputs back to the original set of anchors.
- 返回
labels_list (list[Tensor]): Labels of each level.
label_weights_list (list[Tensor]): Label weights of each level.
bbox_gt_list (list[Tensor]): Ground truth bbox of each level.
proposal_list (list[Tensor]): Proposals(points/bboxes) of each level.
proposal_weights_list (list[Tensor]): Proposal weights of each level.
num_total_pos (int): Number of positive samples in all images.
num_total_neg (int): Number of negative samples in all images.
- 返回类型
tuple (list[Tensor])
- loss(cls_scores, pts_preds_init, pts_preds_refine, gt_bboxes, gt_labels, img_metas, gt_bboxes_ignore=None)[源代码]¶
Loss function of CFA head.
- loss_single(cls_score, pts_pred_init, pts_pred_refine, labels, label_weights, rbbox_gt_init, convex_weights_init, rbbox_gt_refine, convex_weights_refine, stride, num_total_samples_refine)[源代码]¶
Single loss function.
- reassign(pos_losses, label, label_weight, pts_pred_init, convex_weight, gt_bbox, pos_inds, pos_gt_inds, num_proposals_each_level=None, num_level=None)[源代码]¶
CFA reassign process.
- 参数
pos_losses (Tensor) – Losses of all positive samples in single image.
label (Tensor) – classification target of each anchor with shape (num_anchors,)
label_weight (Tensor) – Classification loss weight of each anchor with shape (num_anchors).
pts_pred_init (Tensor) –
convex_weight (Tensor) – Bbox weight of each anchor with shape (num_anchors, 4).
gt_bbox (Tensor) – Ground truth box.
pos_inds (Tensor) – Index of all positive samples got from first assign process.
pos_gt_inds (Tensor) – Gt_index of all positive samples got from first assign process.
num_proposals_each_level (list, optional) – Number of proposals of each level.
num_level (int, optional) – Number of level.
- 返回
Usually returns a tuple containing learning targets.
label (Tensor): classification target of each anchor after paa assign, with shape (num_anchors,)
label_weight (Tensor): Classification loss weight of each anchor after paa assign, with shape (num_anchors).
convex_weight (Tensor): Bbox weight of each anchor with shape (num_anchors, 4).
pos_normalize_term (list): pos normalize term for refine points losses.
- 返回类型
tuple
- class mmrotate.models.dense_heads.RotatedRetinaHead(num_classes, in_channels, stacked_convs=4, conv_cfg=None, norm_cfg=None, anchor_generator={'octave_base_scale': 4, 'ratios': [0.5, 1.0, 2.0], 'scales_per_octave': 3, 'strides': [8, 16, 32, 64, 128], 'type': 'AnchorGenerator'}, init_cfg={'layer': 'Conv2d', 'override': {'bias_prob': 0.01, 'name': 'retina_cls', 'std': 0.01, 'type': 'Normal'}, 'std': 0.01, 'type': 'Normal'}, **kwargs)[源代码]¶
An anchor-based head used in RotatedRetinaNet.
The head contains two subnetworks. The first classifies anchor boxes and the second regresses deltas for the anchors.
- 参数
num_classes (int) – Number of categories excluding the background category.
in_channels (int) – Number of channels in the input feature map.
stacked_convs (int, optional) – Number of stacked convolutions.
conv_cfg (dict, optional) – Config dict for convolution layer. Default: None.
norm_cfg (dict, optional) – Config dict for normalization layer. Default: None.
anchor_generator (dict) – Config dict for anchor generator
init_cfg (dict or list[dict], optional) – Initialization config dict.
- filter_bboxes(cls_scores, bbox_preds)[源代码]¶
Filter predicted bounding boxes at each position of the feature maps. Only the bounding box with the highest score is kept at each position. This filter is used in R3Det prior to the first feature refinement stage.
- 参数
cls_scores (list[Tensor]) – Box scores for each scale level, with shape (N, num_anchors * num_classes, H, W).
bbox_preds (list[Tensor]) – Box energies / deltas for each scale level with shape (N, num_anchors * 5, H, W)
- 返回
best or refined rbboxes of each level of each image.
- 返回类型
list[list[Tensor]]
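The per-position filtering described above can be sketched with nested lists. This is an illustration only: ranking each anchor by its highest class score is an assumption about how "highest score" is measured, and `best_anchor_per_position` is a hypothetical name.

```python
def best_anchor_per_position(scores):
    """Pick one anchor index per position.

    scores[p][a][c] is the score of class c for anchor a at position p.
    Each anchor is ranked by its best class score (an assumption).
    """
    best = []
    for anchors in scores:
        per_anchor_max = [max(class_scores) for class_scores in anchors]
        best.append(per_anchor_max.index(max(per_anchor_max)))
    return best


# Two positions, three anchors each, two classes.
scores = [
    [[0.1, 0.2], [0.7, 0.05], [0.3, 0.3]],
    [[0.4, 0.1], [0.2, 0.2], [0.1, 0.9]],
]
keep = best_anchor_per_position(scores)
```

The box predicted by the kept anchor at each position would then be the single box passed on to the refinement stage.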
- forward_single(x)[源代码]¶
Forward feature of a single scale level.
- 参数
x (torch.Tensor) – Features of a single scale level.
- 返回
cls_score (torch.Tensor): Cls scores for a single scale level, the channels number is num_anchors * num_classes.
bbox_pred (torch.Tensor): Box energies / deltas for a single scale level, the channels number is num_anchors * 5.
- 返回类型
tuple (torch.Tensor)
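The channel counts above follow from the anchor configuration. The sketch below works through the arithmetic for the default anchor_generator in the signature, assuming the number of anchors per location is len(ratios) * scales_per_octave. `num_classes=15` is a hypothetical value, and `retina_head_channels` is not an mmrotate function.

```python
def retina_head_channels(num_classes, ratios, scales_per_octave, box_dim=5):
    """Return (num_anchors, cls_channels, bbox_channels) per location.

    box_dim=5 matches the (cx, cy, w, h, a) deltas regressed per anchor.
    """
    num_anchors = len(ratios) * scales_per_octave
    return num_anchors, num_anchors * num_classes, num_anchors * box_dim


num_anchors, cls_channels, bbox_channels = retina_head_channels(
    num_classes=15,            # hypothetical
    ratios=[0.5, 1.0, 2.0],    # default from the signature
    scales_per_octave=3,       # default from the signature
)
```

Under these assumptions each location has 9 anchors, so the classification branch outputs 135 channels and the regression branch 45.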
- refine_bboxes(cls_scores, bbox_preds)[源代码]¶
This function will be used in S2ANet, whose num_anchors=1.
- 参数
cls_scores (list[Tensor]) – Box scores for each scale level, with shape (N, num_classes, H, W).
bbox_preds (list[Tensor]) – Box energies / deltas for each scale level with shape (N, 5, H, W)
- 返回
refined rbboxes of each level of each image.
- 返回类型
list[list[Tensor]]
- class mmrotate.models.dense_heads.RotatedRetinaRefineHead(num_classes, in_channels, stacked_convs=4, conv_cfg=None, norm_cfg=None, anchor_generator={'strides': [8, 16, 32, 64, 128], 'type': 'PseudoAnchorGenerator'}, bbox_coder={'target_means': (0.0, 0.0, 0.0, 0.0, 0.0), 'target_stds': (1.0, 1.0, 1.0, 1.0, 1.0), 'type': 'DeltaXYWHABBoxCoder'}, init_cfg={'layer': 'Conv2d', 'override': {'bias_prob': 0.01, 'name': 'retina_cls', 'std': 0.01, 'type': 'Normal'}, 'std': 0.01, 'type': 'Normal'}, **kwargs)[源代码]¶
Rotated Anchor-based refine head.
- 参数
num_classes (int) – Number of categories excluding the background category.
in_channels (int) – Number of channels in the input feature map.
stacked_convs (int, optional) – Number of stacked convolutions.
conv_cfg (dict, optional) – Config dict for convolution layer. Default: None.
norm_cfg (dict, optional) – Config dict for normalization layer. Default: None.
anchor_generator (dict) – Config dict for anchor generator
bbox_coder (dict) – Config of bounding box coder.
init_cfg (dict or list[dict], optional) – Initialization config dict.
- get_anchors(featmap_sizes, img_metas, device='cuda')[源代码]¶
Get anchors according to feature map sizes.
- 参数
featmap_sizes (list[tuple]) – Multi-level feature map sizes.
img_metas (list[dict]) – Image meta info.
bboxes_as_anchors (list[list[Tensor]]) – Bboxes that are treated like anchors before further regression.
device (torch.device | str) – Device for returned tensors
- 返回
anchor_list (list[Tensor]): Anchors of each image
valid_flag_list (list[Tensor]): Valid flags of each image
- 返回类型
tuple (list[Tensor])
- get_bboxes(cls_scores, bbox_preds, img_metas, cfg=None, rescale=False, rois=None)[源代码]¶
Transform network output for a batch into labeled boxes.
- 参数
cls_scores (list[Tensor]) – Box scores for each scale level, with shape (N, num_anchors * num_classes, H, W).
bbox_preds (list[Tensor]) – Box energies / deltas for each scale level with shape (N, num_anchors * 5, H, W)
img_metas (list[dict]) – size / scale info for each image
cfg (mmcv.Config) – test / postprocessing configuration
rois (list[list[Tensor]]) – Input rbboxes of each level of each image. The rois are output by former stages and are to be refined.
rescale (bool) – if True, return boxes in original image space
- 返回
- each item in result_list is 2-tuple.
The first item is an (n, 6) tensor, where the first 5 columns are bounding box positions (xc, yc, w, h, a) and the 6-th column is a score between 0 and 1. The second item is a (n,) tensor where each item is the class index of the corresponding box.
- 返回类型
list[tuple[Tensor, Tensor]]
- loss(cls_scores, bbox_preds, gt_bboxes, gt_labels, img_metas, rois=None, gt_bboxes_ignore=None)[源代码]¶
Loss function of RotatedRetinaRefineHead.
- refine_bboxes(cls_scores, bbox_preds, rois)[源代码]¶
Refine predicted bounding boxes at each position of the feature maps. This method will be used in R3Det in refinement stages.
- 参数
cls_scores (list[Tensor]) – Box scores for each scale level, with shape (N, num_classes, H, W).
bbox_preds (list[Tensor]) – Box energies / deltas for each scale level with shape (N, 5, H, W)
rois (list[list[Tensor]]) – Input rbboxes of each level of each image. The rois are output by former stages and are to be refined.
- 返回
best or refined rbboxes of each level of each image.
- 返回类型
list[list[Tensor]]
- class mmrotate.models.dense_heads.SAMRepPointsHead(num_classes, in_channels, feat_channels, point_feat_channels=256, stacked_convs=3, num_points=9, gradient_mul=0.1, point_strides=[8, 16, 32, 64, 128], point_base_scale=4, conv_bias='auto', loss_cls={'alpha': 0.25, 'gamma': 2.0, 'loss_weight': 1.0, 'type': 'FocalLoss', 'use_sigmoid': True}, loss_bbox_init={'beta': 0.1111111111111111, 'loss_weight': 0.5, 'type': 'SmoothL1Loss'}, loss_bbox_refine={'beta': 0.1111111111111111, 'loss_weight': 1.0, 'type': 'SmoothL1Loss'}, conv_cfg=None, norm_cfg=None, train_cfg=None, test_cfg=None, center_init=True, transform_method='rotrect', topk=6, anti_factor=0.75, version='oc', init_cfg={'layer': 'Conv2d', 'override': {'bias_prob': 0.01, 'name': 'reppoints_cls_out', 'std': 0.01, 'type': 'Normal'}, 'std': 0.01, 'type': 'Normal'}, **kwargs)[源代码]¶
Rotated RepPoints head for SASM.
- 参数
num_classes (int) – Number of classes.
in_channels (int) – Number of input channels.
feat_channels (int) – Number of feature channels.
point_feat_channels (int, optional) – Number of channels of points features.
stacked_convs (int, optional) – Number of stacked convolutions.
num_points (int, optional) – Number of points in points set.
gradient_mul (float, optional) – The multiplier to gradients from points refinement and recognition.
point_strides (Iterable, optional) – points strides.
point_base_scale (int, optional) – Bbox scale for assigning labels.
conv_bias (str, optional) – The bias of convolution.
loss_cls (dict, optional) – Config of classification loss.
loss_bbox_init (dict, optional) – Config of initial points loss.
loss_bbox_refine (dict, optional) – Config of points loss in refinement.
conv_cfg (dict, optional) – The config of convolution.
norm_cfg (dict, optional) – The config of normalization.
train_cfg (dict, optional) – The config of train.
test_cfg (dict, optional) – The config of test.
center_init (bool, optional) – Whether to use center point assignment.
transform_method (str, optional) – The methods to transform RepPoints to bbox.
topk (int, optional) – Number of the highest topk points. Defaults to 6.
anti_factor (float, optional) – Feature anti-aliasing coefficient.
version (str, optional) – Angle representations. Defaults to ‘oc’.
init_cfg (dict or list[dict], optional) – Initialization config dict.
- get_bboxes(cls_scores, pts_preds_init, pts_preds_refine, img_metas, cfg=None, rescale=False, with_nms=True, **kwargs)[源代码]¶
Transform network outputs of a batch into bbox results.
- 参数
cls_scores (list[Tensor]) – Classification scores for all scale levels, each is a 4D-tensor, has shape (batch_size, num_priors * num_classes, H, W).
pts_preds_init (list[Tensor]) – Box energies / deltas for all scale levels, each is a 4D-tensor with shape (batch_size, num_points * 2, H, W).
pts_preds_refine (list[Tensor]) – Box energies / deltas for all scale levels, each is a 4D-tensor with shape (batch_size, num_points * 2, H, W).
img_metas (list[dict], Optional) – Image meta info. Default None.
cfg (mmcv.Config, Optional) – Test / postprocessing configuration, if None, test_cfg would be used. Default None.
rescale (bool) – If True, return boxes in original image space. Default False.
with_nms (bool) – If True, do nms before return boxes. Default True.
- 返回
- Each item in result_list is a 2-tuple.
The first item is an (n, 6) tensor, where the first 5 columns are bounding box positions (cx, cy, w, h, a) and the 6-th column is a score between 0 and 1. The second item is a (n,) tensor where each item is the predicted class label of the corresponding box.
- 返回类型
list[list[Tensor, Tensor]]
- get_points(featmap_sizes, img_metas, device)[源代码]¶
Get points according to feature map sizes.
- 参数
featmap_sizes (list[tuple]) – Multi-level feature map sizes.
img_metas (list[dict]) – Image meta info.
- 返回
points of each image, valid flags of each image
- 返回类型
tuple
- get_targets(proposals_list, valid_flag_list, gt_bboxes_list, img_metas, gt_bboxes_ignore_list=None, gt_labels_list=None, stage='init', label_channels=1, unmap_outputs=True)[源代码]¶
Compute corresponding GT box and classification targets for proposals.
- 参数
proposals_list (list[list]) – Multi level points/bboxes of each image.
valid_flag_list (list[list]) – Multi level valid flags of each image.
gt_bboxes_list (list[Tensor]) – Ground truth bboxes of each image.
img_metas (list[dict]) – Meta info of each image.
gt_bboxes_ignore_list (list[Tensor]) – Ground truth bboxes to be ignored.
gt_labels_list (list[Tensor]) – Ground truth labels of each box.
stage (str) – init or refine. Generate target for init stage or refine stage
label_channels (int) – Channel of label.
unmap_outputs (bool) – Whether to map outputs back to the original set of anchors.
- 返回
labels_list (list[Tensor]): Labels of each level.
label_weights_list (list[Tensor]): Label weights of each level.
bbox_gt_list (list[Tensor]): Ground truth bbox of each level.
proposal_list (list[Tensor]): Proposals(points/bboxes) of each level.
proposal_weights_list (list[Tensor]): Proposal weights of each level.
num_total_pos (int): Number of positive samples in all images.
num_total_neg (int): Number of negative samples in all images.
- 返回类型
tuple (list[Tensor])
- loss(cls_scores, pts_preds_init, pts_preds_refine, gt_bboxes, gt_labels, img_metas, gt_bboxes_ignore=None)[源代码]¶
Loss function of SAM RepPoints head.
roi_heads¶
- class mmrotate.models.roi_heads.GVRatioRoIHead(bbox_roi_extractor=None, bbox_head=None, shared_head=None, train_cfg=None, test_cfg=None, pretrained=None, init_cfg=None, version='oc')[源代码]¶
Gliding vertex roi head including one bbox head.
- forward_dummy(x, proposals)[源代码]¶
Dummy forward function.
- 参数
x (list[Tensors]) – list of multi-level img features.
proposals (list[Tensors]) – list of region proposals.
- 返回
list of region of interest.
- 返回类型
list[Tensors]
- simple_test_bboxes(x, img_metas, proposals, rcnn_test_cfg, rescale=False)[源代码]¶
Test only det bboxes without augmentation.
- 参数
x (tuple[Tensor]) – Feature maps of all scale level.
img_metas (list[dict]) – Image meta info.
proposals (List[Tensor]) – Region proposals.
rcnn_test_cfg (ConfigDict) – test_cfg of R-CNN.
rescale (bool) – If True, return boxes in original image space. Default: False.
- 返回
The first list contains the boxes of the corresponding image in a batch. Each tensor has the shape (num_boxes, 6), and the last dimension represents (cx, cy, w, h, a, score). Each Tensor in the second list is the labels with shape (num_boxes, ). The length of both lists should be equal to batch_size.
- 返回类型
tuple[list[Tensor], list[Tensor]]
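For visualisation or export it is often handy to turn a (cx, cy, w, h, a) box into its four corner points. The sketch below does this with the standard 2D rotation formula. It assumes the angle is in radians and rotates about the box centre; the exact angle convention depends on the `version` setting and is not covered here. `rbox_to_corners` is a hypothetical helper.

```python
import math


def rbox_to_corners(cx, cy, w, h, a):
    """Return the four corners of a rotated box as (x, y) tuples.

    The angle a is assumed to be in radians.
    """
    cos_a, sin_a = math.cos(a), math.sin(a)
    half_w, half_h = w / 2.0, h / 2.0
    corners = []
    # Offsets of the corners in the box's own (unrotated) frame.
    for dx, dy in ((-half_w, -half_h), (half_w, -half_h),
                   (half_w, half_h), (-half_w, half_h)):
        x = cx + dx * cos_a - dy * sin_a
        y = cy + dx * sin_a + dy * cos_a
        corners.append((x, y))
    return corners


# With a zero angle the result is an axis-aligned rectangle.
corners = rbox_to_corners(10.0, 20.0, 4.0, 2.0, 0.0)
```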
- class mmrotate.models.roi_heads.OrientedStandardRoIHead(bbox_roi_extractor=None, bbox_head=None, shared_head=None, train_cfg=None, test_cfg=None, pretrained=None, init_cfg=None, version='oc')[源代码]¶
Oriented RCNN roi head including one bbox head.
- forward_dummy(x, proposals)[源代码]¶
Dummy forward function.
- 参数
x (list[Tensors]) – list of multi-level img features.
proposals (list[Tensors]) – list of region proposals.
- 返回
list of region of interest.
- 返回类型
list[Tensors]
- forward_train(x, img_metas, proposal_list, gt_bboxes, gt_labels, gt_bboxes_ignore=None, gt_masks=None)[源代码]¶
- 参数
x (list[Tensor]) – list of multi-level img features.
img_metas (list[dict]) – list of image info dict where each dict has: ‘img_shape’, ‘scale_factor’, ‘flip’, and may also contain ‘filename’, ‘ori_shape’, ‘pad_shape’, and ‘img_norm_cfg’. For details on the values of these keys see mmdet/datasets/pipelines/formatting.py:Collect.
proposal_list (list[Tensors]) – list of region proposals.
gt_bboxes (list[Tensor]) – Ground truth bboxes for each image with shape (num_gts, 5) in [cx, cy, w, h, a] format.
gt_labels (list[Tensor]) – class indices corresponding to each box
gt_bboxes_ignore (None | list[Tensor]) – specify which bounding boxes can be ignored when computing the loss.
gt_masks (None | Tensor) – true segmentation masks for each box used if the architecture supports a segmentation task. Always set to None.
- 返回
a dictionary of loss components
- 返回类型
dict[str, Tensor]
- simple_test_bboxes(x, img_metas, proposals, rcnn_test_cfg, rescale=False)[源代码]¶
Test only det bboxes without augmentation.
- 参数
x (tuple[Tensor]) – Feature maps of all scale level.
img_metas (list[dict]) – Image meta info.
proposals (List[Tensor]) – Region proposals.
rcnn_test_cfg (ConfigDict) – test_cfg of R-CNN.
rescale (bool) – If True, return boxes in original image space. Default: False.
- 返回
The first list contains the boxes of the corresponding image in a batch. Each tensor has the shape (num_boxes, 6), and the last dimension represents (cx, cy, w, h, a, score). Each Tensor in the second list is the labels with shape (num_boxes, ). The length of both lists should be equal to batch_size.
- 返回类型
tuple[list[Tensor], list[Tensor]]
- class mmrotate.models.roi_heads.RoITransRoIHead(num_stages, stage_loss_weights, bbox_roi_extractor=None, bbox_head=None, shared_head=None, train_cfg=None, test_cfg=None, pretrained=None, version='oc', init_cfg=None)[源代码]¶
RoI Trans cascade roi head including one bbox head.
- 参数
num_stages (int) – number of cascade stages.
stage_loss_weights (list[float]) – loss weights of cascade stages.
bbox_roi_extractor (dict, optional) – Config of bbox_roi_extractor.
bbox_head (dict, optional) – Config of bbox_head.
shared_head (dict, optional) – Config of shared_head.
train_cfg (dict, optional) – Config of train.
test_cfg (dict, optional) – Config of test.
pretrained (str, optional) – Path of pretrained weight.
version (str, optional) – Angle representations. Defaults to ‘oc’.
init_cfg (dict, optional) – Config of initialization.
- forward_dummy(x, proposals)[源代码]¶
Dummy forward function.
- 参数
x (list[Tensors]) – list of multi-level img features.
proposals (list[Tensors]) – list of region proposals.
- 返回
list of region of interest.
- 返回类型
list[Tensors]
- forward_train(x, img_metas, proposal_list, gt_bboxes, gt_labels, gt_bboxes_ignore=None, gt_masks=None)[源代码]¶
- 参数
x (list[Tensor]) – list of multi-level img features.
img_metas (list[dict]) – list of image info dict where each dict has: ‘img_shape’, ‘scale_factor’, ‘flip’, and may also contain ‘filename’, ‘ori_shape’, ‘pad_shape’, and ‘img_norm_cfg’. For details on the values of these keys see mmdet/datasets/pipelines/formatting.py:Collect.
proposal_list (list[Tensors]) – list of region proposals.
gt_bboxes (list[Tensor]) – Ground truth bboxes for each image with shape (num_gts, 5) in [cx, cy, w, h, a] format.
gt_labels (list[Tensor]) – class indices corresponding to each box
gt_bboxes_ignore (None | list[Tensor]) – specify which bounding boxes can be ignored when computing the loss.
gt_masks (None | Tensor) – true segmentation masks for each box used if the architecture supports a segmentation task. Always set to None.
- 返回
a dictionary of loss components
- 返回类型
dict[str, Tensor]
- init_bbox_head(bbox_roi_extractor, bbox_head)[源代码]¶
Initialize box head and box roi extractor.
- 参数
bbox_roi_extractor (dict) – Config of box roi extractor.
bbox_head (dict) – Config of box in box head.
- simple_test(x, proposal_list, img_metas, rescale=False)[源代码]¶
Test without augmentation.
- 参数
x (list[Tensor]) – list of multi-level img features.
proposal_list (list[Tensors]) – list of region proposals.
img_metas (list[dict]) – list of image info dict where each dict has: ‘img_shape’, ‘scale_factor’, ‘flip’, and may also contain ‘filename’, ‘ori_shape’, ‘pad_shape’, and ‘img_norm_cfg’.
rescale (bool) – If True, return boxes in original image space. Default: False.
- 返回
a dictionary of bbox_results.
- 返回类型
dict[str, Tensor]
- class mmrotate.models.roi_heads.RotatedBBoxHead(with_avg_pool=False, with_cls=True, with_reg=True, roi_feat_size=7, in_channels=256, num_classes=80, bbox_coder={'clip_border': True, 'target_means': [0.0, 0.0, 0.0, 0.0], 'target_stds': [0.1, 0.1, 0.2, 0.2], 'type': 'DeltaXYWHBBoxCoder'}, reg_class_agnostic=False, reg_decoded_bbox=False, reg_predictor_cfg={'type': 'Linear'}, cls_predictor_cfg={'type': 'Linear'}, loss_cls={'loss_weight': 1.0, 'type': 'CrossEntropyLoss', 'use_sigmoid': False}, loss_bbox={'beta': 1.0, 'loss_weight': 1.0, 'type': 'SmoothL1Loss'}, init_cfg=None)[源代码]¶
Simplest RoI head, with only two fc layers for classification and regression respectively.
- 参数
with_avg_pool (bool, optional) – If True, use avg_pool.
with_cls (bool, optional) – If True, use classification branch.
with_reg (bool, optional) – If True, use regression branch.
roi_feat_size (int, optional) – Size of RoI features.
in_channels (int, optional) – Input channels.
num_classes (int, optional) – Number of classes.
bbox_coder (dict, optional) – Config of bbox coder.
reg_class_agnostic (bool, optional) – If True, the regression branch is class agnostic.
reg_decoded_bbox (bool, optional) – If True, the regression branch uses decoded bboxes to compute the loss.
reg_predictor_cfg (dict, optional) – Config of regression predictor.
cls_predictor_cfg (dict, optional) – Config of classification predictor.
loss_cls (dict, optional) – Config of classification loss.
loss_bbox (dict, optional) – Config of regression loss.
init_cfg (dict, optional) – Config of initialization.
- property custom_accuracy¶
The custom accuracy.
- property custom_activation¶
The custom activation.
- property custom_cls_channels¶
The custom cls channels.
- get_bboxes(rois, cls_score, bbox_pred, img_shape, scale_factor, rescale=False, cfg=None)[source]¶
Transform network output for a batch into bbox predictions.
- Parameters
rois (torch.Tensor) – Boxes to be transformed. Has shape (num_boxes, 5). The last dimension is arranged as (batch_index, x1, y1, x2, y2).
cls_score (torch.Tensor) – Box scores. Has shape (num_boxes, num_classes + 1).
bbox_pred (Tensor, optional) – Box energies / deltas. Has shape (num_boxes, num_classes * 5).
img_shape (Sequence[int], optional) – Maximum bounds for boxes, given as (H, W, C) or (H, W).
scale_factor (ndarray) – Scale factor of the image, arranged as (w_scale, h_scale, w_scale, h_scale).
rescale (bool) – If True, return boxes in original image space. Default: False.
cfg (obj:ConfigDict) – test_cfg of the bbox head. Default: None.
- Returns
The first tensor is det_bboxes, with shape (num_boxes, 6), where the last dimension represents (cx, cy, w, h, a, score). The second tensor holds the labels, with shape (num_boxes, ).
- Return type
tuple[Tensor, Tensor]
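To make the (cx, cy, w, h, a, score) layout concrete, the following sketch splits one det_bboxes row and converts the box part into four corner points. It is a minimal illustration in plain Python, assuming the angle a is in radians and rotates the box about its centre; the actual rotation direction and angle range depend on the angle version in use, and both helper names are hypothetical, not part of mmrotate.

```python
import math


def split_det_bbox(row):
    """Split one row (cx, cy, w, h, a, score) into the box tuple and the score."""
    *box, score = row
    return tuple(box), score


def rbox_to_corners(cx, cy, w, h, a):
    """Return the four corners of a rotated box as (x, y) tuples.

    Assumption: `a` is in radians and the box is rotated about its centre.
    """
    cos_a, sin_a = math.cos(a), math.sin(a)
    # Corner offsets in the box's own (unrotated) frame, in order around the box.
    local = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each offset by the angle, then translate by the centre.
    return [(cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a)
            for dx, dy in local]


box, score = split_det_bbox((10, 20, 4, 2, 0.0, 0.9))
corners = rbox_to_corners(*box)
```

With a zero angle the corners reduce to the usual axis-aligned rectangle around (cx, cy).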
- get_targets(sampling_results, gt_bboxes, gt_labels, rcnn_train_cfg, concat=True)[source]¶
Calculate the ground truth for all samples in a batch according to the sampling_results.
Almost the same as the implementation in bbox_head, except that the additional parameters pos_inds_list and neg_inds_list are passed to the _get_target_single function.
- Parameters
sampling_results (List[obj:SamplingResults]) – Assign results of all images in a batch after sampling.
gt_bboxes (list[Tensor]) – Gt_bboxes of all images in a batch. Each tensor has shape (num_gt, 5), where the last dimension represents [cx, cy, w, h, a].
gt_labels (list[Tensor]) – Gt_labels of all images in a batch. Each tensor has shape (num_gt,).
rcnn_train_cfg (obj:ConfigDict) – train_cfg of RCNN.
concat (bool) – Whether to concatenate the results of all the images in a single batch.
- Returns
Ground truth for proposals in a single image, containing the following list of Tensors:
labels (list[Tensor],Tensor): Gt_labels for all proposals in a batch. Each tensor in the list has shape (num_proposals,) when concat=False; otherwise a single tensor has shape (num_all_proposals,).
label_weights (list[Tensor]): Labels_weights for all proposals in a batch. Each tensor in the list has shape (num_proposals,) when concat=False; otherwise a single tensor has shape (num_all_proposals,).
bbox_targets (list[Tensor],Tensor): Regression target for all proposals in a batch. Each tensor in the list has shape (num_proposals, 5) when concat=False; otherwise a single tensor has shape (num_all_proposals, 5). The last dimension of size 5 represents [cx, cy, w, h, a].
bbox_weights (list[Tensor],Tensor): Regression weights for all proposals in a batch. Each tensor in the list has shape (num_proposals, 5) when concat=False; otherwise a single tensor has shape (num_all_proposals, 5).
- Return type
Tuple[Tensor]
- loss(cls_score, bbox_pred, rois, labels, label_weights, bbox_targets, bbox_weights, reduction_override=None)[source]¶
Loss function.
- Parameters
cls_score (torch.Tensor) – Box scores. Has shape (num_boxes, num_classes + 1).
bbox_pred (Tensor, optional) – Box energies / deltas. Has shape (num_boxes, num_classes * 5).
rois (torch.Tensor) – Boxes to be transformed. Has shape (num_boxes, 5). The last dimension is arranged as (batch_index, x1, y1, x2, y2).
labels (torch.Tensor) – Shape (n*bs, ).
label_weights (torch.Tensor) – Labels_weights for all proposals. Has shape (num_proposals,).
bbox_targets (torch.Tensor) – Regression target for all proposals. Has shape (num_proposals, 5), where the last dimension represents [cx, cy, w, h, a].
bbox_weights (list[Tensor],Tensor) – Regression weights for all proposals in a batch. Each tensor in the list has shape (num_proposals, 5) when concat=False; otherwise a single tensor has shape (num_all_proposals, 5).
reduction_override (str, optional) – The reduction method used to override the original reduction method of the loss. Defaults to None.
- refine_bboxes(rois, labels, bbox_preds, pos_is_gts, img_metas)[source]¶
Refine bboxes during training.
- Parameters
rois (torch.Tensor) – Shape (n*bs, 5), where n is the number of images per GPU and bs is the number of sampled RoIs per image. The first column is the image id and the next 4 columns are x1, y1, x2, y2.
labels (torch.Tensor) – Shape (n*bs, ).
bbox_preds (torch.Tensor) – Shape (n*bs, 5) or (n*bs, 5*#class).
pos_is_gts (list[Tensor]) – Flags indicating whether each positive bbox is a gt bbox.
img_metas (list[dict]) – Meta info of each image.
- Returns
Refined bboxes of each image in a mini-batch.
- Return type
list[Tensor]
- regress_by_class(rois, label, bbox_pred, img_meta)[source]¶
Regress the bbox for the predicted class. Used in Cascade R-CNN.
- Parameters
rois (torch.Tensor) – Shape (n, 4) or (n, 5).
label (torch.Tensor) – Shape (n, ).
bbox_pred (torch.Tensor) – Shape (n, 5*(#class)) or (n, 5).
img_meta (dict) – Image meta info.
- Returns
Regressed bboxes, with the same shape as the input rois.
- Return type
Tensor
- class mmrotate.models.roi_heads.RotatedConvFCBBoxHead(num_shared_convs=0, num_shared_fcs=0, num_cls_convs=0, num_cls_fcs=0, num_reg_convs=0, num_reg_fcs=0, conv_out_channels=256, fc_out_channels=1024, conv_cfg=None, norm_cfg=None, init_cfg=None, *args, **kwargs)[source]¶
More general bbox head, with shared conv and fc layers and two optional separated branches.
                            /-> cls convs -> cls fcs -> cls
shared convs -> shared fcs
                            \-> reg convs -> reg fcs -> reg
- Parameters
num_shared_convs (int, optional) – Number of shared_convs.
num_shared_fcs (int, optional) – Number of shared_fcs.
num_cls_convs (int, optional) – Number of cls_convs.
num_cls_fcs (int, optional) – Number of cls_fcs.
num_reg_convs (int, optional) – Number of reg_convs.
num_reg_fcs (int, optional) – Number of reg_fcs.
conv_out_channels (int, optional) – Output channels of convolution.
fc_out_channels (int, optional) – Output channels of fc.
conv_cfg (dict, optional) – Config of convolution.
norm_cfg (dict, optional) – Config of normalization.
init_cfg (dict, optional) – Config of initialization.
Shared2FC RBBox head.
- class mmrotate.models.roi_heads.RotatedSingleRoIExtractor(roi_layer, out_channels, featmap_strides, finest_scale=56, init_cfg=None)[source]¶
Extract RoI features from a single level feature map.
If there are multiple input feature levels, each RoI is mapped to a level according to its scale. The mapping rule is proposed in FPN.
- Parameters
roi_layer (dict) – Specify RoI layer type and arguments.
out_channels (int) – Output channels of RoI layers.
featmap_strides (List[int]) – Strides of input feature maps.
finest_scale (int) – Scale threshold of mapping to level 0. Default: 56.
init_cfg (dict or list[dict], optional) – Initialization config dict. Default: None.
- build_roi_layers(layer_cfg, featmap_strides)[source]¶
Build the RoI operator that extracts features from each level feature map.
- Parameters
layer_cfg (dict) – Dictionary to construct and configure the RoI layer operation. Options are modules under mmcv/ops such as RoIAlign.
featmap_strides (List[int]) – The strides of the input feature maps w.r.t. the original image size, used to scale RoI coordinates (original image coordinate system) to the feature coordinate system.
- Returns
The RoI extractor modules for each level feature map.
- Return type
nn.ModuleList
- forward(feats, rois, roi_scale_factor=None)[source]¶
Forward function.
- Parameters
feats (torch.Tensor) – Input features.
rois (torch.Tensor) – Input RoIs, shape (k, 5).
roi_scale_factor (float, optional) – Scale factor that the RoIs will be multiplied by. Defaults to None.
- Returns
Scaled RoI features.
- Return type
torch.Tensor
- map_roi_levels(rois, num_levels)[source]¶
Map RoIs to corresponding feature levels by scale.
scale < finest_scale * 2: level 0
finest_scale * 2 <= scale < finest_scale * 4: level 1
finest_scale * 4 <= scale < finest_scale * 8: level 2
scale >= finest_scale * 8: level 3
- Parameters
rois (torch.Tensor) – Input RoIs, shape (k, 5).
num_levels (int) – Total number of levels.
- Returns
Level index (0-based) of each RoI, shape (k, ).
- Return type
Tensor
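The thresholds listed above amount to taking the floor of log2(scale / finest_scale) and clamping it to the available levels. The sketch below reproduces that rule in plain Python on precomputed scales. How a scale is derived from an RoI (for example, the square root of its area) is not stated above and is left to the caller, and the name map_scales_to_levels is hypothetical.

```python
import math


def map_scales_to_levels(scales, num_levels, finest_scale=56):
    """Map RoI scales to 0-based feature levels using the rule listed above."""
    levels = []
    for scale in scales:
        # The small epsilon keeps log2 defined for a zero scale and puts
        # exact thresholds (e.g. finest_scale * 2) on the higher level.
        lvl = math.floor(math.log2(scale / finest_scale + 1e-6))
        # Clamp to the valid level range [0, num_levels - 1].
        levels.append(min(max(lvl, 0), num_levels - 1))
    return levels


levels = map_scales_to_levels([50, 112, 223.9, 224, 448, 1000], num_levels=4)
```

With the default finest_scale of 56, the level boundaries fall at scales 112, 224 and 448.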
- class mmrotate.models.roi_heads.RotatedStandardRoIHead(bbox_roi_extractor=None, bbox_head=None, shared_head=None, train_cfg=None, test_cfg=None, pretrained=None, init_cfg=None, version='oc')[source]¶
Simplest base rotated RoI head, including one bbox head.
- Parameters
bbox_roi_extractor (dict, optional) – Config of bbox_roi_extractor.
bbox_head (dict, optional) – Config of bbox_head.
shared_head (dict, optional) – Config of shared_head.
train_cfg (dict, optional) – Config of train.
test_cfg (dict, optional) – Config of test.
pretrained (str, optional) – Path of pretrained weight.
init_cfg (dict, optional) – Config of initialization.
version (str, optional) – Angle representations. Defaults to ‘oc’.
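As an illustration of how these constructor arguments nest, a config fragment might look like the sketch below. Only keys documented in this section are used for the head and the extractor. The roi_layer entry (type and key names) and the concrete values such as num_classes=15 are assumptions for illustration and may differ across mmcv and mmrotate versions. The bbox_coder and loss configs are omitted, so a real config would still need to set them.

```python
# Illustrative config fragment; values are placeholders, not recommended settings.
roi_head = dict(
    type='RotatedStandardRoIHead',
    version='oc',  # angle representation, the documented default
    bbox_roi_extractor=dict(
        type='RotatedSingleRoIExtractor',
        # Assumed RoI layer config; key names vary between mmcv versions.
        roi_layer=dict(type='RoIAlignRotated', out_size=7, sample_num=2),
        out_channels=256,
        featmap_strides=[4, 8, 16, 32],
        finest_scale=56),
    bbox_head=dict(
        type='RotatedBBoxHead',
        in_channels=256,
        roi_feat_size=7,
        num_classes=15,  # placeholder class count
        reg_class_agnostic=True))
```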
- async async_simple_test(x, proposal_list, img_metas, rescale=False)[source]¶
Async test without augmentation.
- Parameters
x (list[Tensor]) – List of multi-level image features.
proposal_list (list[Tensor]) – List of region proposals.
img_metas (list[dict]) – List of image info dicts, where each dict has ‘img_shape’, ‘scale_factor’ and ‘flip’, and may also contain ‘filename’, ‘ori_shape’, ‘pad_shape’ and ‘img_norm_cfg’.
rescale (bool) – If True, return boxes in original image space. Default: False.
- Returns
A dictionary of bbox_results.
- Return type
dict[str, Tensor]
- forward_dummy(x, proposals)[source]¶
Dummy forward function.
- Parameters
x (list[Tensor]) – List of multi-level image features.
proposals (list[Tensor]) – List of region proposals.
- Returns
List of regions of interest.
- Return type
list[Tensor]
- forward_train(x, img_metas, proposal_list, gt_bboxes, gt_labels, gt_bboxes_ignore=None, gt_masks=None)[source]¶
- Parameters
x (list[Tensor]) – List of multi-level image features.
img_metas (list[dict]) – List of image info dicts, where each dict has ‘img_shape’, ‘scale_factor’ and ‘flip’, and may also contain ‘filename’, ‘ori_shape’, ‘pad_shape’ and ‘img_norm_cfg’. For details on the values of these keys, see mmdet/datasets/pipelines/formatting.py:Collect.
proposal_list (list[Tensor]) – List of region proposals.
gt_bboxes (list[Tensor]) – Ground truth bboxes for each image, with shape (num_gts, 5) in [cx, cy, w, h, a] format.
gt_labels (list[Tensor]) – Class indices corresponding to each box.
gt_bboxes_ignore (None | list[Tensor]) – Bounding boxes that can be ignored when computing the loss.
gt_masks (None | Tensor) – True segmentation masks for each box, used if the architecture supports a segmentation task. Always set to None.
- Returns
A dictionary of loss components.
- Return type
dict[str, Tensor]
- init_bbox_head(bbox_roi_extractor, bbox_head)[source]¶
Initialize bbox_head.
- Parameters
bbox_roi_extractor (dict) – Config of bbox_roi_extractor.
bbox_head (dict) – Config of bbox_head.
- simple_test(x, proposal_list, img_metas, rescale=False)[source]¶
Test without augmentation.
- Parameters
x (list[Tensor]) – List of multi-level image features.
proposal_list (list[Tensor]) – List of region proposals.
img_metas (list[dict]) – List of image info dicts, where each dict has ‘img_shape’, ‘scale_factor’ and ‘flip’, and may also contain ‘filename’, ‘ori_shape’, ‘pad_shape’ and ‘img_norm_cfg’.
rescale (bool) – If True, return boxes in original image space. Default: False.
- Returns
A dictionary of bbox_results.
- Return type
dict[str, Tensor]
- simple_test_bboxes(x, img_metas, proposals, rcnn_test_cfg, rescale=False)[source]¶
Test only det bboxes without augmentation.
- Parameters
x (tuple[Tensor]) – Feature maps of all scale levels.
img_metas (list[dict]) – Image meta info.
proposals (List[Tensor]) – Region proposals.
rcnn_test_cfg (obj:ConfigDict) – test_cfg of R-CNN.
rescale (bool) – If True, return boxes in original image space. Default: False.
- Returns
The first list contains the boxes of the corresponding image in a batch. Each tensor has shape (num_boxes, 6), where the last dimension represents (cx, cy, w, h, a, score). Each tensor in the second list holds the labels, with shape (num_boxes, ). The length of both lists should be equal to batch_size.
- Return type
tuple[list[Tensor], list[Tensor]]
losses¶
- class mmrotate.models.losses.BCConvexGIoULoss(reduction='mean', loss_weight=1.0)[source]¶
BCConvex GIoU loss.
Computes the BCConvex GIoU loss between a set of predicted convexes and target convexes.
- Parameters
reduction (str, optional) – The reduction method of the loss. Defaults to ‘mean’.
loss_weight (float, optional) – The weight of loss. Defaults to 1.0.
- Returns
Loss tensor.
- Return type
torch.Tensor
- forward(pred, target, weight=None, avg_factor=None, reduction_override=None, **kwargs)[source]¶
Forward function.
- Parameters
pred (torch.Tensor) – Predicted convexes.
target (torch.Tensor) – Corresponding gt convexes.
weight (torch.Tensor, optional) – The weight of loss for each prediction. Defaults to None.
avg_factor (int, optional) – Average factor that is used to average the loss. Defaults to None.
reduction_override (str, optional) – The reduction method used to override the original reduction method of the loss. Defaults to None.
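The weight, avg_factor and reduction_override arguments recur across the losses in this section. The sketch below shows one common way such arguments combine in MMDetection-style losses. It is an assumption that each loss here follows exactly this convention, and both helper names are hypothetical, written in plain Python over lists rather than tensors.

```python
def resolve_reduction(default, reduction_override=None):
    """Pick the effective reduction: the override wins when it is given."""
    if reduction_override not in (None, 'none', 'mean', 'sum'):
        raise ValueError(f'invalid reduction_override: {reduction_override!r}')
    return reduction_override if reduction_override else default


def reduce_elementwise_loss(losses, weight=None, reduction='mean', avg_factor=None):
    """Apply per-element weights, then reduce a list of per-sample losses."""
    if weight is not None:
        losses = [l * w for l, w in zip(losses, weight)]
    if avg_factor is None:
        if reduction == 'mean':
            return sum(losses) / len(losses)
        if reduction == 'sum':
            return sum(losses)
        if reduction == 'none':
            return losses
        raise ValueError(f'invalid reduction: {reduction!r}')
    # With avg_factor, 'mean' divides the sum by avg_factor instead of the count.
    if reduction == 'mean':
        return sum(losses) / avg_factor
    if reduction == 'none':
        return losses
    raise ValueError('avg_factor cannot be combined with reduction="sum"')


per_sample = [1.0, 2.0, 3.0]
reduction = resolve_reduction('mean', reduction_override=None)
loss = reduce_elementwise_loss(per_sample, weight=[1.0, 0.0, 1.0],
                               reduction=reduction, avg_factor=2)
```

In this sketch, avg_factor lets the caller normalise by a quantity other than the number of elements, such as the number of positive samples.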
- class mmrotate.models.losses.ConvexGIoULoss(reduction='mean', loss_weight=1.0)[source]¶
Convex GIoU loss.
Computes the Convex GIoU loss between a set of predicted convexes and target convexes.
- Parameters
reduction (str, optional) – The reduction method of the loss. Defaults to ‘mean’.
loss_weight (float, optional) – The weight of loss. Defaults to 1.0.
- Returns
Loss tensor.
- Return type
torch.Tensor
- forward(pred, target, weight=None, avg_factor=None, reduction_override=None, **kwargs)[source]¶
Forward function.
- Parameters
pred (torch.Tensor) – Predicted convexes.
target (torch.Tensor) – Corresponding gt convexes.
weight (torch.Tensor, optional) – The weight of loss for each prediction. Defaults to None.
avg_factor (int, optional) – Average factor that is used to average the loss. Defaults to None.
reduction_override (str, optional) – The reduction method used to override the original reduction method of the loss. Defaults to None.
- class mmrotate.models.losses.GDLoss(loss_type, representation='xy_wh_r', fun='log1p', tau=0.0, alpha=1.0, reduction='mean', loss_weight=1.0, **kwargs)[source]¶
Gaussian based loss.
- Parameters
loss_type (str) – Type of loss.
representation (str, optional) – Coordinate system.
fun (str, optional) – The function applied to distance. Defaults to ‘log1p’.
tau (float, optional) – Defaults to 0.0.
alpha (float, optional) – Defaults to 1.0.
reduction (str, optional) – The reduction method of the loss. Defaults to ‘mean’.
loss_weight (float, optional) – The weight of loss. Defaults to 1.0.
- Returns
loss (torch.Tensor)
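The fun and tau parameters are only briefly described above. One plausible reading, assumed here rather than confirmed by this page, is that fun is a non-linear function applied to the raw distance and that tau, when at least 1.0, maps the result into a bounded loss. The sketch below illustrates that reading on a scalar distance. The name postprocess_distance and the exact formula are assumptions.

```python
import math


def postprocess_distance(distance, fun='log1p', tau=1.0):
    """Turn a non-negative distance into a loss value (illustrative sketch)."""
    if fun == 'log1p':
        d = math.log1p(distance)
    elif fun == 'sqrt':
        # Clamp away from zero, as a tensor version would to keep gradients finite.
        d = math.sqrt(max(distance, 1e-7))
    elif fun == 'none':
        d = distance
    else:
        raise ValueError(f'invalid non-linear function: {fun!r}')
    # Assumed behaviour: tau >= 1 squashes the loss into [0, 1); otherwise the
    # transformed distance is returned unchanged.
    if tau >= 1.0:
        return 1 - 1 / (tau + d)
    return d


bounded = postprocess_distance(4.0, fun='sqrt', tau=1.0)
raw = postprocess_distance(3.0, fun='none', tau=0.0)
```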
- forward(pred, target, weight=None, avg_factor=None, reduction_override=None, **kwargs)[source]¶
Forward function.
- Parameters
pred (torch.Tensor) – Predicted convexes.
target (torch.Tensor) – Corresponding gt convexes.
weight (torch.Tensor, optional) – The weight of loss for each prediction. Defaults to None.
avg_factor (int, optional) – Average factor that is used to average the loss. Defaults to None.
reduction_override (str, optional) – The reduction method used to override the original reduction method of the loss. Defaults to None.
- class mmrotate.models.losses.GDLoss_v1(loss_type, fun='sqrt', tau=1.0, reduction='mean', loss_weight=1.0, **kwargs)[source]¶
Gaussian based loss.
- Parameters
loss_type (str) – Type of loss.
fun (str, optional) – The function applied to distance. Defaults to ‘sqrt’.
tau (float, optional) – Defaults to 1.0.
reduction (str, optional) – The reduction method of the loss. Defaults to ‘mean’.
loss_weight (float, optional) – The weight of loss. Defaults to 1.0.
- Returns
loss (torch.Tensor)
- forward(pred, target, weight=None, avg_factor=None, reduction_override=None, **kwargs)[source]¶
Forward function.
- Parameters
pred (torch.Tensor) – Predicted convexes.
target (torch.Tensor) – Corresponding gt convexes.
weight (torch.Tensor, optional) – The weight of loss for each prediction. Defaults to None.
avg_factor (int, optional) – Average factor that is used to average the loss. Defaults to None.
reduction_override (str, optional) – The reduction method used to override the original reduction method of the loss. Defaults to None.
- class mmrotate.models.losses.KFLoss(fun='none', reduction='mean', loss_weight=1.0, **kwargs)[source]¶
Kalman filter based loss.
- Parameters
fun (str, optional) – The function applied to distance. Defaults to ‘none’.
reduction (str, optional) – The reduction method of the loss. Defaults to ‘mean’.
loss_weight (float, optional) – The weight of loss. Defaults to 1.0.
- Returns
loss (torch.Tensor)
- forward(pred, target, weight=None, avg_factor=None, pred_decode=None, targets_decode=None, reduction_override=None, **kwargs)[source]¶
Forward function.
- Parameters
pred (torch.Tensor) – Predicted convexes.
target (torch.Tensor) – Corresponding gt convexes.
weight (torch.Tensor, optional) – The weight of loss for each prediction. Defaults to None.
avg_factor (int, optional) – Average factor that is used to average the loss. Defaults to None.
pred_decode (torch.Tensor) – Predicted decode bboxes.
targets_decode (torch.Tensor) – Corresponding gt decode bboxes.
reduction_override (str, optional) – The reduction method used to override the original reduction method of the loss. Defaults to None.
- Returns
loss (torch.Tensor)
- class mmrotate.models.losses.KLDRepPointsLoss(eps=1e-06, reduction='mean', loss_weight=1.0)[source]¶
Kullback-Leibler Divergence loss for RepPoints.
- Parameters
eps (float) – Defaults to 1e-6.
reduction (str, optional) – The reduction method of the loss. Defaults to ‘mean’.
loss_weight (float, optional) – The weight of loss. Defaults to 1.0.
- forward(pred, target, weight=None, avg_factor=None, reduction_override=None, **kwargs)[source]¶
Forward function.
- Parameters
pred (torch.Tensor) – Predicted convexes.
target (torch.Tensor) – Corresponding gt convexes.
weight (torch.Tensor, optional) – The weight of loss for each prediction. Defaults to None.
avg_factor (int, optional) – Average factor that is used to average the loss. Defaults to None.
reduction_override (str, optional) – The reduction method used to override the original reduction method of the loss. Defaults to None.
- Returns
loss (torch.Tensor)
- class mmrotate.models.losses.RotatedIoULoss(linear=False, eps=1e-06, reduction='mean', loss_weight=1.0, mode='log')[source]¶
RotatedIoULoss.
Computes the IoU loss between a set of predicted rbboxes and target rbboxes.
- Parameters
linear (bool) – If True, use a linear scale of loss; otherwise the scale is determined by mode. Default: False.
eps (float) – Eps to avoid log(0).
reduction (str) – Options are “none”, “mean” and “sum”.
loss_weight (float) – Weight of loss.
mode (str) – Loss scaling mode, including “linear”, “square”, and “log”. Default: ‘log’
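The three scaling modes can be illustrated on a single IoU value. The sketch below assumes the conventional definitions (linear: 1 - IoU, square: 1 - IoU², log: -log(IoU)) and that linear=True simply forces the linear mode. It is a plain-Python illustration with a hypothetical function name, not the library's implementation.

```python
import math


def scaled_iou_loss(iou, mode='log', linear=False, eps=1e-6):
    """Scale an IoU value in [0, 1] into a loss according to `mode`."""
    if linear:
        # Assumed behaviour: the `linear` flag overrides `mode`.
        mode = 'linear'
    # Clamp so that log(0) is never evaluated.
    iou = max(iou, eps)
    if mode == 'linear':
        return 1 - iou
    if mode == 'square':
        return 1 - iou ** 2
    if mode == 'log':
        return -math.log(iou)
    raise ValueError(f'invalid mode: {mode!r}')


losses = {m: scaled_iou_loss(0.5, mode=m) for m in ('linear', 'square', 'log')}
```

All three modes give zero loss for a perfect overlap. The log mode penalises poor overlaps most strongly, which is why eps is needed to keep it finite.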
- forward(pred, target, weight=None, avg_factor=None, reduction_override=None, **kwargs)[source]¶
Forward function.
- Parameters
pred (torch.Tensor) – The prediction.
target (torch.Tensor) – The learning target of the prediction.
weight (torch.Tensor, optional) – The weight of loss for each prediction. Defaults to None.
avg_factor (int, optional) – Average factor that is used to average the loss. Defaults to None.
reduction_override (str, optional) – The reduction method used to override the original reduction method of the loss. Defaults to None. Options are “none”, “mean” and “sum”.
- class mmrotate.models.losses.SmoothFocalLoss(gamma=2.0, alpha=0.25, reduction='mean', loss_weight=1.0)[source]¶
Smooth Focal Loss. Implementation of Circular Smooth Label (CSL).
- Parameters
gamma (float, optional) – The gamma for calculating the modulating factor. Defaults to 2.0.
alpha (float, optional) – A balanced form for Focal Loss. Defaults to 0.25.
reduction (str, optional) – The method used to reduce the loss into a scalar. Defaults to ‘mean’. Options are “none”, “mean” and “sum”.
loss_weight (float, optional) – Weight of loss. Defaults to 1.0.
- Returns
loss (torch.Tensor)
- forward(pred, target, weight=None, avg_factor=None, reduction_override=None)[source]¶
Forward function.
- Parameters
pred (torch.Tensor) – The prediction.
target (torch.Tensor) – The learning label of the prediction.
weight (torch.Tensor, optional) – The weight of loss for each prediction. Defaults to None.
avg_factor (int, optional) – Average factor that is used to average the loss. Defaults to None.
reduction_override (str, optional) – The reduction method used to override the original reduction method of the loss. Options are “none”, “mean” and “sum”.
- Returns
The calculated loss.
- Return type
torch.Tensor
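To show how gamma and alpha enter a focal loss with soft (smoothed) labels, the sketch below evaluates the usual sigmoid focal formulation for one logit and one target in [0, 1]. It assumes the loss follows that standard formulation and is written in plain Python with a hypothetical function name.

```python
import math


def smooth_focal_loss_scalar(logit, target, gamma=2.0, alpha=0.25):
    """Sigmoid focal loss for one logit with a soft target in [0, 1]."""
    # Numerically stable sigmoid.
    if logit >= 0:
        p = 1 / (1 + math.exp(-logit))
    else:
        e = math.exp(logit)
        p = e / (1 + e)
    # pt is large when the prediction disagrees with the target.
    pt = (1 - p) * target + p * (1 - target)
    # alpha balances positives and negatives; pt ** gamma is the modulating factor.
    focal_weight = (alpha * target + (1 - alpha) * (1 - target)) * pt ** gamma
    # Numerically stable binary cross-entropy with logits.
    bce = max(logit, 0) - logit * target + math.log1p(math.exp(-abs(logit)))
    return bce * focal_weight


easy = smooth_focal_loss_scalar(5.0, 1.0)   # confident and correct
hard = smooth_focal_loss_scalar(-5.0, 1.0)  # confident and wrong
```

The modulating factor shrinks the contribution of well-classified samples, so the easy example contributes far less to the loss than the hard one.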
utils¶
- class mmrotate.models.utils.ORConv2d(in_channels, out_channels, kernel_size=3, arf_config=None, stride=1, padding=0, dilation=1, groups=1, bias=True)[source]¶
Oriented 2-D convolution.
- Parameters
in_channels (List[int]) – Number of input channels per scale.
out_channels (int) – Number of output channels (used at each scale).
kernel_size (int, optional) – The size of kernel.
arf_config (tuple, optional) – A tuple consisting of nOrientation and nRotation.
stride (int, optional) – Stride of the convolution. Default: 1.
padding (int or tuple) – Zero-padding added to both sides of the input. Default: 0.
dilation (int or tuple) – Spacing between kernel elements. Default: 1.
groups (int) – Number of blocked connections from input channels to output channels. Default: 1.
bias (bool) – If True, adds a learnable bias to the output. Default: True.
- class mmrotate.models.utils.RotationInvariantPooling(nInputPlane, nOrientation=8)[source]¶
Rotation-invariant pooling module.
- Parameters
nInputPlane (int) – The number of Input plane.
nOrientation (int, optional) – The number of oriented channels.
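A common way to obtain rotation invariance from oriented channels is to take the maximum response over each group of nOrientation channels. The sketch below shows that idea on the channel vector of a single spatial location. It assumes that consecutive channels form one orientation group and that the pooling is a max, neither of which is stated above, and the function name is hypothetical.

```python
def rotation_invariant_pool(channels, n_orientation=8):
    """Max-pool a flat channel vector over consecutive orientation groups."""
    if len(channels) % n_orientation != 0:
        raise ValueError('channel count must be a multiple of n_orientation')
    # Each group of n_orientation channels collapses to its strongest response.
    return [max(channels[i:i + n_orientation])
            for i in range(0, len(channels), n_orientation)]


pooled = rotation_invariant_pool([1, 5, 2, 0, 3, 3, 9, 4], n_orientation=4)
```

Under these assumptions the output has nInputPlane / nOrientation channels.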
- mmrotate.models.utils.build_enn_divide_feature(planes)[source]¶
Build an enn regular feature map with the specified number of channels divided by N.
- mmrotate.models.utils.build_enn_feature(planes)[source]¶
Build an enn regular feature map with the specified number of channels.
- mmrotate.models.utils.build_enn_norm_layer(num_features, postfix='')[source]¶
Build an enn normalization layer.
- mmrotate.models.utils.build_enn_trivial_feature(planes)[source]¶
Build an enn trivial feature map with the specified number of channels.
- mmrotate.models.utils.ennAvgPool(inplanes, kernel_size=1, stride=None, padding=0, ceil_mode=False)[source]¶
enn Average Pooling.
- Parameters
inplanes (int) – The number of input channel.
kernel_size (int, optional) – The size of kernel.
stride (int, optional) – Stride of the pooling window. Default: None.
padding (int or tuple) – Zero-padding added to both sides of the input. Default: 0.
ceil_mode (bool, optional) – If True, keep information in the corner of the feature map. Default: False.
- mmrotate.models.utils.ennConv(inplanes, outplanes, kernel_size=3, stride=1, padding=0, groups=1, bias=False, dilation=1)[source]¶
enn convolution.
- Parameters
inplanes (List[int]) – Number of input channels per scale.
outplanes (int) – Number of output channels (used at each scale).
kernel_size (int, optional) – The size of kernel.
stride (int, optional) – Stride of the convolution. Default: 1.
padding (int or tuple) – Zero-padding added to both sides of the input. Default: 0.
groups (int) – Number of blocked connections from input channels to output channels. Default: 1.
bias (bool) – If True, adds a learnable bias to the output. Default: False.
dilation (int or tuple) – Spacing between kernel elements. Default: 1.
- mmrotate.models.utils.ennInterpolate(inplanes, scale_factor, mode='nearest', align_corners=False)[source]¶
enn Interpolate.
- mmrotate.models.utils.ennTrivialConv(inplanes, outplanes, kernel_size=3, stride=1, padding=0, groups=1, bias=False, dilation=1)[source]¶
enn convolution with trivial input feature.
- Parameters
inplanes (List[int]) – Number of input channels per scale.
outplanes (int) – Number of output channels (used at each scale).
kernel_size (int, optional) – The size of kernel.
stride (int, optional) – Stride of the convolution. Default: 1.
padding (int or tuple) – Zero-padding added to both sides of the input. Default: 0.
groups (int) – Number of blocked connections from input channels to output channels. Default: 1.
bias (bool) – If True, adds a learnable bias to the output. Default: False.
dilation (int or tuple) – Spacing between kernel elements. Default: 1.
mmrotate.utils¶
- mmrotate.utils.compat_cfg(cfg)[source]¶
This function modifies some fields to keep the config compatible.
For example, it moves arguments that will be deprecated to the correct fields.
- mmrotate.utils.find_latest_checkpoint(path, suffix='pth')[source]¶
Find the latest checkpoint in the working directory.
- Parameters
path (str) – The path to find checkpoints.
suffix (str) – File extension. Defaults to pth.
- Returns
File path of the latest checkpoint, or None if no checkpoint is found.
- Return type
str | None
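A lookup of this kind can be sketched with the standard library alone. The version below is an assumption about reasonable behaviour, not the library's implementation. It prefers a file named latest.<suffix> if one exists, and otherwise picks the checkpoint whose file name ends in the largest number (for example epoch_12.pth). The function name and the naming scheme are hypothetical.

```python
import glob
import os
import re


def find_latest_checkpoint_sketch(path, suffix='pth'):
    """Return the newest checkpoint path under `path`, or None if there is none."""
    if not os.path.isdir(path):
        return None
    # Assumed convention: an explicit "latest" file takes priority.
    latest = os.path.join(path, f'latest.{suffix}')
    if os.path.exists(latest):
        return latest
    best_path, best_num = None, -1
    for ckpt in glob.glob(os.path.join(path, f'*.{suffix}')):
        stem = os.path.splitext(os.path.basename(ckpt))[0]
        numbers = re.findall(r'\d+', stem)
        if not numbers:
            continue  # skip files without an epoch or iteration number
        num = int(numbers[-1])
        if num > best_num:
            best_path, best_num = ckpt, num
    return best_path
```

Comparing the parsed numbers rather than the raw file names avoids the lexicographic trap in which epoch_3.pth would sort after epoch_12.pth.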
References
- 1
https://github.com/microsoft/SoftTeacher/blob/main/ssod/utils/patch.py