yanghai-1218 / radet
Rigidity-Aware Detection for 6D Object Pose Estimation (CVPR 2023)
License: Apache License 2.0
Good work on 6D object pose estimation!
However, an error occurred when I ran 'python tools/test.py --config configs/bop/r50_ycbv_pbr.py --checkpoint checkpoints/radet_ycbv_pbr.pth --format-only --eval-options jsonfile_prefix=work_dirs/results/radet_ycbv_pbr'. It seems to be related to the 'mmdetection' package. Could you please tell me how to install the correct version of 'mmdetection', or suggest another solution? Thank you!
Detailed information:
Traceback (most recent call last):
File "/data3/tantao/my_projects/RADet-main/test.py", line 13, in <module>
from radet.apis import multi_gpu_test, single_gpu_test
File "/data3/tantao/my_projects/RADet-main/radet/apis/__init__.py", line 1, in <module>
from .inference import (async_inference_detector, inference_detector,
File "/data3/tantao/my_projects/RADet-main/radet/apis/inference.py", line 10, in <module>
from radet.core import get_classes
File "/data3/tantao/my_projects/RADet-main/radet/core/__init__.py", line 7, in <module>
from .post_processing import * # noqa: F401, F403
File "/data3/tantao/my_projects/RADet-main/radet/core/post_processing/__init__.py", line 1, in <module>
from .bbox_nms import fast_nms, multiclass_nms, multiclass_vote
File "/data3/tantao/my_projects/RADet-main/radet/core/post_processing/bbox_nms.py", line 3, in <module>
from mmdet.ops import vote_nms
ModuleNotFoundError: No module named 'mmdet.ops'
Pip list:
Package Version Editable project location
addict 2.4.0
certifi 2023.5.7
contourpy 1.1.0
cycler 0.11.0
Cython 0.29.35
fonttools 4.41.1
importlib-metadata 6.8.0
importlib-resources 6.0.0
Jinja2 3.1.2
kiwisolver 1.4.4
markdown-it-py 3.0.0
MarkupSafe 2.1.3
matplotlib 3.4.3
mdurl 0.1.2
mkl-fft 1.3.1
mkl-random 1.2.2
mkl-service 2.4.0
mmcv 1.3.18
mmcv-full 1.3.18
mmdet 3.1.0 /data3/xx/download/mmdetection-main
mmengine 0.8.2
mmpycocotools 12.0.3
numpy 1.24.3
opencv-python 4.8.0.74
packaging 23.1
Pillow 10.0.0
pip 23.1.2
platformdirs 3.9.1
pycocotools 2.0.6
Pygments 2.15.1
pyparsing 3.0.9
pyproject 1.3.1
python-dateutil 2.8.2
PyYAML 6.0.1
RADet 1.0.0 /data3/xx/my_projects/RADet-main
rich 13.4.2
scipy 1.11.1
setuptools 67.8.0
shapely 2.0.1
six 1.16.0
termcolor 2.3.0
terminaltables 3.1.10
tomli 2.0.1
torch 1.10.0
torchaudio 0.10.0
torchvision 0.11.0
tornado 6.3.2
typing_extensions 4.7.1
wheel 0.38.4
yapf 0.40.1
zipp 3.16.2
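The traceback above fails on `from mmdet.ops import vote_nms`, and the pip list shows `mmdet 3.1.0` installed next to `mmcv-full 1.3.18`. A quick way to confirm which submodules the installed packages actually provide, without running the test script, is sketched below. It is only a diagnostic, assuming it is run with the same interpreter that launches `tools/test.py`; `module_available` is a hypothetical helper name, and the sketch does not say which `mmdet` release RADet expects.

```python
import importlib.util


def module_available(name):
    """Return True if `name` can be located by the import system."""
    try:
        return importlib.util.find_spec(name) is not None
    except Exception:
        # find_spec imports the parent package first; that import can fail
        # if the parent (e.g. `mmdet` itself) is missing or broken.
        return False


if __name__ == '__main__':
    for name in ('mmdet', 'mmdet.ops', 'mmcv', 'mmcv.ops'):
        status = 'found' if module_available(name) else 'NOT found'
        print(name, '->', status)
```

If `mmdet` is reported as found but `mmdet.ops` is not, the installed `mmdet` simply does not ship that subpackage, which matches the `ModuleNotFoundError` in the traceback.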
The ITODD dataset does not seem to provide ground truth for its test set. What should I do?
RADet/radet/datasets/pipelines/loading.py
Line 543 in 22296c7
If with_gt_mask=True, then distancemap will be the "mask_visible" in the dataset, not the distancemap generated using the method in the paper.
Hello, author
Thanks for your great work and for sharing it.
I tried to prepare the data using the ycbv_pbr dataset, but I got an error:
usage: collect_image_list.py [-h] [--source-dir SOURCE_DIR]
[--save-path SAVE_PATH] [--pattern PATTERN]
collect_image_list.py: error: unrecognized arguments: --image_list /home/ivc2411/path/data/image_lists/train_pbr.txt
Can you give me some advice?
Best
Has anyone else run into the same problem, and how did you fix it?
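The usage message above lists only `--source-dir`, `--save-path` and `--pattern`, so `--image_list` is rejected by the parser. As a rough illustration of what a script with those three options could do, here is a stand-in written from the usage text alone. It is not the repository's `collect_image_list.py`: the function names are made up, and both the default pattern and the choice to write each entry relative to the source directory with a leading `/` are assumptions.

```python
import glob
import os
from argparse import ArgumentParser


def collect_image_list(source_dir, pattern, save_path):
    """Write one image path per line, relative to `source_dir`."""
    matches = sorted(glob.glob(os.path.join(source_dir, pattern)))
    # Assumed output format: '/<scene>/rgb/<image>.jpg'
    lines = ['/' + os.path.relpath(m, source_dir).replace(os.sep, '/')
             for m in matches]
    save_dir = os.path.dirname(save_path)
    if save_dir:
        os.makedirs(save_dir, exist_ok=True)
    with open(save_path, 'w') as f:
        f.write('\n'.join(lines))
    return lines


def main(argv=None):
    parser = ArgumentParser(description='Collect image paths into a text file')
    parser.add_argument('--source-dir', required=True)
    parser.add_argument('--save-path', required=True)
    # Assumed default; the real script's default is not shown in the thread.
    parser.add_argument('--pattern', default='*/rgb/*.jpg')
    args = parser.parse_args(argv)
    return collect_image_list(args.source_dir, args.pattern, args.save_path)
```

Calling `main(['--source-dir', 'data/ycbv/train_pbr', '--save-path', 'data/ycbv/image_lists/train_pbr.txt'])` would write the list; note that the output location is given with `--save-path`, not `--image_list`.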
Hello,
So to begin, I'm trying to get the YCB-V dataset set up with the annotations, and I started off by running python tools/collect_image_list.py --source-dir data/ycbv/train_pbr --save-path data/ycbv/train_pbr/train_pbr.txt --pattern */rgb/*.jpg to get the image list.
Now I'm trying to run the tools/bop_to_coco.py script and I've run into a few errors regarding the command line arguments. The first problem I ran into was this error:
$ python tools/bop_to_coco.py --images-dir data/ycbv/train_pbr --images-list data/ycbv/train_pbr/train_pbr.txt --save-path data/ycbv/detector_annotations/train_pbr.json --dataset ycbv
Traceback (most recent call last):
File "/data2/6d-pose-estimation/radet-attempt-2/RADet/tools/bop_to_coco.py", line 238, in <module>
data_root, txt, seg_collect, thread_num = args.images_dir, args.images_list, args.segmentation, args.thread_num
AttributeError: 'Namespace' object has no attribute 'thread_num'
I then changed the line data_root, txt, seg_collect, thread_num = args.images_dir, args.images_list, args.segmentation, args.thread_num
to => data_root, txt, seg_collect = args.images_dir, args.images_list, args.segmentation
which solved that first error (I'm not sure whether thread_num is critical, but it is not used anywhere in the script, so I'm assuming it isn't). But then I ran into a new error:
$ python tools/bop_to_coco.py --images-dir data/ycbv/train_pbr --images-list data/ycbv/train_pbr/train_pbr.txt --save-path data/ycbv/detector_annotations/train_pbr.json --dataset ycbv
Traceback (most recent call last):
File "/data2/6d-pose-estimation/radet-attempt-2/RADet/tools/bop_to_coco.py", line 241, in <module>
if args.amodal:
AttributeError: 'Namespace' object has no attribute 'amodal'
So I added the line parser.add_argument('--amodal', action='store_true') to the parse_args() function in bop_to_coco.py:
def parse_args():
    parser = ArgumentParser(description='Extract ground annotations from BOP format to COCO format')
    parser.add_argument('--images-dir', default='data/hb/train_pbr', type=str)
    parser.add_argument('--images-list', default='data/hb/image_lists/train_pbr.txt', type=str)
    parser.add_argument('--save-path', default='data/hb/detector_annotations/train_pbr.json', type=str)
    parser.add_argument('--segmentation', action='store_true', help='collect segmentation info or not')
    parser.add_argument('--without-gt', action='store_true')
    parser.add_argument('--amodal', action='store_true')  # <= New line added here!!!
    parser.add_argument('--dataset', choices=['icbin', 'tudl', 'tless', 'lmo', 'itodd', 'hb', 'ycbv'])
    args = parser.parse_args()
    return args
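The two `AttributeError`s above have the same cause: `argparse` only creates attributes on the returned `Namespace` for options that were registered with `add_argument`. A minimal, self-contained sketch of that behaviour follows; the parser here is a toy one built for illustration, not the one in `bop_to_coco.py`.

```python
from argparse import ArgumentParser

parser = ArgumentParser()
parser.add_argument('--segmentation', action='store_true')

# '--amodal' has not been registered yet, so the namespace lacks the attribute.
args = parser.parse_args([])
had_amodal_before = hasattr(args, 'amodal')

# Once the flag is registered it defaults to False and is True when passed.
parser.add_argument('--amodal', action='store_true')
default_args = parser.parse_args([])
amodal_args = parser.parse_args(['--amodal'])

print(had_amodal_before, default_args.amodal, amodal_args.amodal)
```

When a flag may or may not be registered, reading it with `getattr(args, 'thread_num', None)` is another way to avoid the exception, although removing the unused variable as done above is simpler.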
This fixed the issue with the script not being aware of the amodal command line argument. However, I did some further debugging and noticed that the annotations weren't being generated. I tracked the issue down to the make_coco_anno() and construct_gt_info() functions. The construct_gt_info() function removes the leading "/" from the image paths with the line image_path = image_path[1:] so that you get paths that look something like 000000/rgb/000000.jpg. It then updates the annos_info dict with the line annos_info[image_path] = dict(id=image_id, gts_info=per_img_info) and finally returns annos_info, which the if __name__ == '__main__': section uses to update the collect_info dict with the line collect_info.update(construct_gt_info(seq, anno_start_end_id, image_start_end_id)). The main problem lies in the make_coco_anno() function, which constructs a list of paths with this block of code:
with open(txt_path, 'r') as f:
    paths = f.read()
paths = list(paths.split())
After this, the paths variable will contain a list that looks something like [ ..., '/000049/rgb/000558.jpg', '/000049/rgb/000574.jpg', '/000049/rgb/000575.jpg', ...]. Basically, the paths keep the leading "/", which causes the check if path in collect_annos: to fail every time. The solution I have is to add the line path = path[1:] to the beginning of the for loop in make_coco_anno():
def make_coco_anno(txt_path, collect_annos, coco_annos_dict):
    with open(txt_path, 'r') as f:
        paths = f.read()
    paths = list(paths.split())
    images_info = []
    annos_info = []
    for path in paths:
        path = path[1:]  # <= New line added here!!!
        if path in collect_annos:
            images_info.append(dict(file_name=path, id=collect_annos[path]['id'], width=image_w, height=image_h))
            annos_info.extend(collect_annos[path]['gts_info'])
    coco_annos_dict['images'].extend(images_info)
    coco_annos_dict['annotations'].extend(annos_info)
    return coco_annos_dict
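One caveat with `path = path[1:]`: it drops the first character unconditionally, so an image list whose entries do not start with `/` would lose a real character and still miss every lookup. A slightly more tolerant variant, offered as a suggestion rather than as the repository's behaviour, strips only leading slashes. The helper name `normalize_rel_path` and the sample entries are made up for illustration.

```python
def normalize_rel_path(path):
    """Strip leading '/' characters so list entries match the dict keys."""
    return path.lstrip('/')


# Keys as produced by construct_gt_info(): no leading slash.
collect_annos = {'000049/rgb/000558.jpg': dict(id=1, gts_info=[])}

# Entries as they might appear in the image list, with or without the slash.
paths = ['/000049/rgb/000558.jpg', '000049/rgb/000558.jpg']

matched = [normalize_rel_path(p) in collect_annos for p in paths]
sliced = [p[1:] in collect_annos for p in paths]

print(matched)  # [True, True]
print(sliced)   # [True, False]
```

With `lstrip('/')` both forms of the entry are found, while the unconditional slice only works for the entry that really starts with a slash.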
So overall, here's the updated bop_to_coco.py file that I made changes to and I tested to make sure it works:
import os
import json
import cv2
import numpy as np
from os import path as osp
from tqdm import tqdm
from argparse import ArgumentParser
class_names_cfg = dict(
    icbin=('coffee_cup', 'juice_carton'),
    tudl=('dragon', 'frog', 'can'),
    lmo=('ape', 'benchvise', 'bowl', 'cam', 'can', 'cat', 'cup', 'driller', 'duck', 'eggbox', 'glue', 'holepuncher', 'iron', 'lamp', 'phone'),
    ycbv=('master_chef_can', 'cracker_box', 'sugar_box', 'tomato_soup_can', 'mustard_bottle', 'tuna_fish_can', 'pudding_box', 'gelatin_box',
          'potted_meat_can', 'banana', 'pitcher_base', 'bleach_cleanser', 'bowl', 'mug', 'power_drill', 'wood_block', 'scissors', 'large_marker',
          'large_clamp', 'extra_large_clamp', 'foam_brick'),
    hb=tuple([i+1 for i in range(33)]),
    itodd=tuple([i+1 for i in range(28)]),
    tless=tuple([i+1 for i in range(30)]),
)
image_resolution_cfg = dict(
    icbin=(640, 480),
    tudl=(640, 480),
    ycbv=(640, 480),
    lmo=(640, 480),
    hb=(640, 480),
    itodd=(1280, 960),
    tless=(720, 540),  # train_primesense (400, 400), train_pbr (720, 540)
)
def parse_args():
    parser = ArgumentParser(description='Extract ground annotations from BOP format to COCO format')
    parser.add_argument('--images-dir', default='data/hb/train_pbr', type=str)
    parser.add_argument('--images-list', default='data/hb/image_lists/train_pbr.txt', type=str)
    parser.add_argument('--save-path', default='data/hb/detector_annotations/train_pbr.json', type=str)
    parser.add_argument('--segmentation', action='store_true', help='collect segmentation info or not')
    parser.add_argument('--without-gt', action='store_true')
    parser.add_argument('--amodal', action='store_true')  # <= New line added here!!!
    parser.add_argument('--dataset', choices=['icbin', 'tudl', 'tless', 'lmo', 'itodd', 'hb', 'ycbv'])
    args = parser.parse_args()
    return args
def close_contour(contour):
    if not np.array_equal(contour[0], contour[-1]):
        contour = np.vstack((contour, contour[0]))
    return contour
def binary_mask_to_polygon(binary_mask, tolerance=0):
    """Converts a binary mask to COCO polygon representation
    Args:
        binary_mask: a 2D binary numpy array where '1's represent the object
        tolerance: Maximum distance from original points of polygon to approximated
            polygonal chain. If tolerance is 0, the original coordinate array is returned.
    """
    from skimage import measure
    from shapely.geometry import Polygon, MultiPolygon
    # pad mask to close contours of shapes which start and end at an edge
    padded_binary_mask = np.pad(binary_mask, pad_width=1, mode='constant', constant_values=0)
    contours = measure.find_contours(padded_binary_mask, 0.5)
    segmentations = []
    polygons = []
    for contour in contours:
        # Flip from (row, col) representation to (x, y)
        # and subtract the padding pixel
        for i in range(len(contour)):
            row, col = contour[i]
            contour[i] = (col - 1, row - 1)
        # Make a polygon and simplify it
        # if len(contour) < 3:
        #     continue
        poly = Polygon(contour)
        poly = poly.simplify(1.0, preserve_topology=False)
        polygons.append(poly)
        if isinstance(poly, MultiPolygon):
            # keep the largest part; iterate over .geoms because a MultiPolygon
            # is not directly iterable in Shapely 2.x
            poly = max(poly.geoms, key=lambda a: a.area)
        segmentation = np.array(poly.exterior.coords).ravel().tolist()
        segmentations.append(segmentation)
    # contours = np.subtract(contours, 1)
    # for contour in contours:
    #     contour = close_contour(contour)
    #     contour = measure.approximate_polygon(contour, tolerance)
    #     if len(contour) < 3:
    #         continue
    #     contour = np.flip(contour, axis=1)
    #     segmentation = contour.ravel().tolist()
    #     # after padding and subtracting 1 we may get -0.5 points in our segmentation
    #     segmentation = [0 if i < 0 else i for i in segmentation]
    #     polygons.append(segmentation)
    return segmentations
def construct_gt_info(sequence_dir, start_end_anno_id, start_end_img_id):
    sequence_gt_info_path = os.path.join(sequence_dir, 'scene_gt_info.json')
    sequence_gt_path = os.path.join(sequence_dir, 'scene_gt.json')
    with open(sequence_gt_info_path, 'r') as f:
        sequence_gt_info = json.load(f)
    with open(sequence_gt_path, 'r') as f:
        sequence_gt = json.load(f)
    image_id, anno_id = start_end_img_id[0], start_end_anno_id[0]
    annos_info = dict()
    pbar = tqdm(sequence_gt_info.keys())
    pbar.set_description(os.path.basename(sequence_dir))
    for id in pbar:
        image_id += 1
        image_path = os.path.join(sequence_dir, 'rgb', id.zfill(6)+'.jpg')
        if os.path.exists(image_path):
            # relative path
            image_path = os.path.join(sequence_dir.split(data_root)[-1], 'rgb', id.zfill(6)+'.jpg')
        else:
            # check png path
            image_path = image_path.replace('jpg', 'png')
            assert os.path.exists(image_path)
            image_path = os.path.join(sequence_dir.split(data_root)[-1], 'rgb', id.zfill(6)+'.png')
        # filter '/'
        image_path = image_path[1:]
        per_img_info = []
        bbox_info_per_image = sequence_gt_info[id]
        category_info_per_image = sequence_gt[id]
        visib_fract_per_image = [f['visib_fract'] for f in bbox_info_per_image]
        bbox_info_per_image = [b[bbox_key] for b in bbox_info_per_image]
        category_info_per_image = [c['obj_id'] for c in category_info_per_image]
        for obj_id, (bbox_info_per_obj, category_info_per_obj, visib_fract_per_obj) in enumerate(zip(bbox_info_per_image, category_info_per_image, visib_fract_per_image)):
            anno_id += 1
            area = bbox_info_per_obj[2] * bbox_info_per_obj[3]
            if seg_collect:
                mask_path = os.path.join(sequence_dir, 'mask_visib', id.zfill(6)+'_'+str(obj_id).zfill(6)+'.png')
                mask_per_obj = cv2.cvtColor(cv2.imread(mask_path), cv2.COLOR_BGR2GRAY)
                mask_per_obj = (mask_per_obj / 255).astype(np.byte)
                polygons = binary_mask_to_polygon(mask_per_obj)
                polygons = [p for p in polygons if len(p) > 0]
                if len(polygons) == 0:
                    continue
                per_obj_info = dict(
                    id=anno_id,
                    image_id=image_id,
                    category_id=category_info_per_obj,
                    visib_fract=visib_fract_per_obj,
                    bbox=bbox_info_per_obj,
                    area=area,
                    iscrowd=0,
                    segmentation=polygons)
            else:
                per_obj_info = dict(
                    id=anno_id,
                    image_id=image_id,
                    category_id=category_info_per_obj,
                    visib_fract=visib_fract_per_obj,
                    bbox=bbox_info_per_obj,
                    area=area,
                    iscrowd=0)
            per_img_info.append(per_obj_info)
        annos_info[image_path] = dict(id=image_id, gts_info=per_img_info)
    assert anno_id == start_end_anno_id[1]
    assert image_id == start_end_img_id[1]
    return annos_info
    # with open(os.path.join(sequence_dir, 'collect_gt_info.pkl'), 'wb') as f:
    #     pickle.dump(annos_info, f)
def scan_imageid_and_annoid(sequence_dirs):
    image_start_end_ids = []
    anno_start_end_ids = []
    start_image_id, start_anno_id = 0, 0
    for sequence_dir in sequence_dirs:
        sequence_gt_info_path = osp.join(sequence_dir, 'scene_gt_info.json')
        with open(sequence_gt_info_path, 'r') as f:
            sequence_gt_info = json.load(f)
        image_num = len(sequence_gt_info)
        anno_num = [len(v) for v in sequence_gt_info.values()]
        anno_num = sum(anno_num)
        end_image_id = start_image_id + image_num
        end_anno_id = start_anno_id + anno_num
        image_start_end_ids.append((start_image_id, end_image_id))
        anno_start_end_ids.append((start_anno_id, end_anno_id))
        start_anno_id = end_anno_id
        start_image_id = end_image_id
    return image_start_end_ids, anno_start_end_ids
def make_coco_anno(txt_path, collect_annos, coco_annos_dict):
    with open(txt_path, 'r') as f:
        paths = f.read()
    paths = list(paths.split())
    images_info = []
    annos_info = []
    for path in paths:
        path = path[1:]  # <= New line added here!!!
        if path in collect_annos:
            images_info.append(dict(file_name=path, id=collect_annos[path]['id'], width=image_w, height=image_h))
            annos_info.extend(collect_annos[path]['gts_info'])
    coco_annos_dict['images'].extend(images_info)
    coco_annos_dict['annotations'].extend(annos_info)
    return coco_annos_dict
def save_test_annotation(txt_file, save_path, category_info):
    annotation = dict()
    images_info = []
    with open(txt_file, 'r') as f:
        image_paths = f.readlines()
    image_id = 0
    for i in range(len(image_paths)):
        image_path = image_paths[i].strip()
        images_info.append(
            # key spelled 'height' (was 'heigth') to match the COCO image fields
            dict(file_name=image_path, id=image_id, width=image_w, height=image_h)
        )
        image_id += 1
    annotation['images'] = images_info
    annotation['categories'] = category_info
    with open(save_path, 'w') as f:
        json.dump(annotation, f)
if __name__ == '__main__':
    args = parse_args()
    data_root, txt, seg_collect = args.images_dir, args.images_list, args.segmentation  # <= Line updated here!!!
    dataset = args.dataset
    if args.amodal:
        bbox_key = 'bbox_visib'
    else:
        bbox_key = 'bbox_obj'
    class_names = class_names_cfg[dataset]
    image_w, image_h = image_resolution_cfg[dataset]
    category_info = []
    # generate category info
    for category_id, category_name in enumerate(class_names):
        category_info.append(dict(id=category_id+1, name=category_name))
    if args.without_gt:
        save_test_annotation(txt, args.save_path, category_info)
    else:
        coco_annotations = dict(images=list(), annotations=list(), categories=category_info)
        # generate anno
        collect_info = dict()
        image_id, anno_id = 0, 0
        sequences = sorted(os.listdir(data_root))
        sequences = [osp.join(data_root, s) for s in sequences]
        sequences = [s for s in sequences if osp.isdir(s)]
        image_start_end_ids, anno_start_end_ids = scan_imageid_and_annoid(sequences)
        pbar = tqdm(zip(sequences, image_start_end_ids, anno_start_end_ids))
        for seq, image_start_end_id, anno_start_end_id in pbar:
            collect_info.update(construct_gt_info(seq, anno_start_end_id, image_start_end_id))
        # convert annotations to coco format
        coco_annotations = make_coco_anno(txt, collect_info, coco_annotations)
        with open(args.save_path, 'w') as f:
            json.dump(coco_annotations, f)
Thanks for your great work.
How did you combine it with PFA-Pose? Could you give some advice or share the code?
Thanks
Thank you very much for your work!
I encountered the following error while loading train.json. Could you tell me how to resolve it?
sys.platform: linux
Python: 3.6.15 | packaged by conda-forge | (default, Dec 3 2021, 18:49:41) [GCC 9.4.0]
CUDA available: True
GPU 0: NVIDIA GeForce RTX 3060 Laptop GPU
CUDA_HOME: /usr/local/cuda-11.1
NVCC: Build cuda_11.1.TC455_06.29069683_0
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.8.0
PyTorch compiling details: PyTorch built with:
2023-10-25 15:10:33,091 - radet - INFO - Distributed training: False
2023-10-25 15:10:33,396 - radet - INFO - Config:
dataset_type = 'BOPDataset'
data_root = 'data/rcv/'
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='LoadAnnotations', with_bbox=True, with_bop_mask=True),
dict(type='Resize', img_scale=(640, 480), keep_ratio=True),
dict(
type='CosyPoseAug',
p=0.8,
pipelines=[
dict(type='PillowBlur', p=1.0, factor_interval=(1, 3)),
dict(type='PillowSharpness', p=0.3, factor_interval=(0.0, 50.0)),
dict(type='PillowContrast', p=0.3, factor_interval=(0.2, 50.0)),
dict(type='PillowBrightness', p=0.5, factor_interval=(0.1, 6.0)),
dict(type='PillowColor', p=0.3, factor_interval=(0.0, 20.0))
]),
dict(type='RandomFlip', flip_ratio=0.5),
dict(type='GenerateDistanceMap'),
dict(
type='LabelAssignment',
anchor_generator_cfg=dict(
type='AnchorGenerator',
ratios=[1.0],
octave_base_scale=8,
scales_per_octave=1,
strides=[8, 16, 32, 64, 128]),
neg_threshold=0.2,
positive_num=10,
adapt_positive_num=False,
balance_sample=True),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=16),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=[
'img', 'gt_bboxes', 'gt_labels', 'points_to_gt_index',
'points_weight'
])
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(640, 480),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
]
data = dict(
samples_per_gpu=16,
workers_per_gpu=8,
train=dict(
type='BOPDataset',
ann_file='data/rcv/detector_annotations/train1.json',
img_prefix='data/rcv/train/',
seg_prefix='data/rcv/train/',
pipeline=[
dict(type='LoadImageFromFile'),
dict(type='LoadAnnotations', with_bbox=True, with_bop_mask=True),
dict(type='Resize', img_scale=(640, 480), keep_ratio=True),
dict(
type='CosyPoseAug',
p=0.8,
pipelines=[
dict(type='PillowBlur', p=1.0, factor_interval=(1, 3)),
dict(
type='PillowSharpness',
p=0.3,
factor_interval=(0.0, 50.0)),
dict(
type='PillowContrast',
p=0.3,
factor_interval=(0.2, 50.0)),
dict(
type='PillowBrightness',
p=0.5,
factor_interval=(0.1, 6.0)),
dict(
type='PillowColor', p=0.3, factor_interval=(0.0, 20.0))
]),
dict(type='RandomFlip', flip_ratio=0.5),
dict(type='GenerateDistanceMap'),
dict(
type='LabelAssignment',
anchor_generator_cfg=dict(
type='AnchorGenerator',
ratios=[1.0],
octave_base_scale=8,
scales_per_octave=1,
strides=[8, 16, 32, 64, 128]),
neg_threshold=0.2,
positive_num=10,
adapt_positive_num=False,
balance_sample=True),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=16),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=[
'img', 'gt_bboxes', 'gt_labels', 'points_to_gt_index',
'points_weight'
])
],
classes=('obj_000001', 'obj_000002', 'obj_000003', 'obj_000004',
'obj_000005', 'obj_000006', 'obj_000007', 'obj_000008',
'obj_000009', 'obj_000010', 'obj_000011', 'obj_000012',
'obj_000013', 'obj_000014', 'obj_000015', 'obj_000016'),
min_visib_frac=0.1),
val=dict(
type='BOPDataset',
ann_file='data/rcv/detector_annotations/test_targets_bop19.json',
img_prefix='data/rcv/test/',
pipeline=[
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(640, 480),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
],
classes=('obj_000001', 'obj_000002', 'obj_000003', 'obj_000004',
'obj_000005', 'obj_000006', 'obj_000007', 'obj_000008',
'obj_000009', 'obj_000010', 'obj_000011', 'obj_000012',
'obj_000013', 'obj_000014', 'obj_000015', 'obj_000016')),
test=dict(
type='BOPDataset',
ann_file='data/rcv/detector_annotations/test_targets_bop19.json',
img_prefix='data/rcv/test/',
pipeline=[
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(640, 480),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
],
bop_submission=True,
classes=('obj_000001', 'obj_000002', 'obj_000003', 'obj_000004',
'obj_000005', 'obj_000006', 'obj_000007', 'obj_000008',
'obj_000009', 'obj_000010', 'obj_000011', 'obj_000012',
'obj_000013', 'obj_000014', 'obj_000015', 'obj_000016')))
optimizer = dict(
type='AdamW',
lr=0.0004,
betas=(0.9, 0.999),
weight_decay=0.05,
eps=1e-08,
amsgrad=False)
lr_config = dict(
policy='OneCycle',
max_lr=0.0004,
total_steps=100100,
pct_start=0.05,
anneal_strategy='linear')
runner = dict(type='IterBasedRunner', max_iters=100000)
checkpoint_config = dict(interval=10000)
evaluation = dict(interval=10000, metric='bbox')
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')])
dist_params = dict(backend='nccl')
log_level = 'INFO'
workflow = [('train', 1)]
CLASS_NAMES = ('obj_000001', 'obj_000002', 'obj_000003', 'obj_000004',
'obj_000005', 'obj_000006', 'obj_000007', 'obj_000008',
'obj_000009', 'obj_000010', 'obj_000011', 'obj_000012',
'obj_000013', 'obj_000014', 'obj_000015', 'obj_000016')
model = dict(
type='RADet',
pretrained='torchvision://resnet50',
backbone=dict(
type='ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=True),
norm_eval=True,
style='pytorch'),
neck=dict(
type='FPN',
in_channels=[256, 512, 1024, 2048],
out_channels=256,
start_level=1,
add_extra_convs='on_output',
num_outs=5),
bbox_head=dict(
type='RADetHead',
num_classes=16,
in_channels=256,
stacked_convs=4,
feat_channels=256,
strides=[8, 16, 32, 64, 128],
anchor_generator=dict(
type='AnchorGenerator',
ratios=[1.0],
octave_base_scale=8,
scales_per_octave=1,
strides=[8, 16, 32, 64, 128]),
bbox_coder=dict(type='TBLRBBoxCoder', normalizer=0.125),
loss_cls=dict(
type='FocalLoss',
use_sigmoid=True,
gamma=2.0,
alpha=0.25,
loss_weight=1.0),
loss_bbox=dict(type='GIoULoss', loss_weight=2.0),
loss_centerness=dict(
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)))
train_cfg = dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.5,
neg_iou_thr=0.4,
min_pos_iou=0,
ignore_iof_thr=-1),
allowed_border=-1,
pos_weight=-1,
debug=False)
test_cfg = dict(
nms_pre=1000,
min_bbox_size=0,
score_thr=0.05,
nms=dict(
type='vote',
iou_threshold=0.65,
cluster_score=['cls', 'iou'],
vote_score=['iou', 'cls'],
iou_enable=False,
sima=0.025),
max_per_img=100)
work_dir = 'work_dirs/rcv_r50_radet_pbr'
gpu_ids = range(0, 1)
2023-10-25 15:10:33,749 - radet - INFO - load model from: torchvision://resnet50
2023-10-25 15:10:33,749 - radet - INFO - load checkpoint from torchvision path: torchvision://resnet50
2023-10-25 15:10:34,073 - radet - WARNING - The model and loaded state dict do not match exactly
unexpected key in source state_dict: fc.weight, fc.bias
loading annotations into memory...
Done (t=2.42s)
creating index...
index created!
fatal: not a git repository (or any parent up to mount point /media/xiaobingli)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
loading annotations into memory...
Traceback (most recent call last):
File "/home/xiaobingli/anaconda3/envs/RADet-main/lib/python3.6/site-packages/mmcv/utils/registry.py", line 52, in build_from_cfg
return obj_cls(**args)
File "/media/xiaobingli/myself/LXB/RADet-main/radet/datasets/bop.py", line 36, in __init__
filter_empty_gt)
File "/media/xiaobingli/myself/LXB/RADet-main/radet/datasets/custom.py", line 87, in __init__
self.data_infos = self.load_annotations(self.ann_file)
File "/media/xiaobingli/myself/LXB/RADet-main/radet/datasets/coco.py", line 57, in load_annotations
self.coco = COCO(ann_file)
File "/home/xiaobingli/anaconda3/envs/RADet-main/lib/python3.6/site-packages/pycocotools/coco.py", line 89, in __init__
type(dataset))
AssertionError: annotation file format <class 'list'> not supported
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "tools/train.py", line 186, in <module>
main()
File "tools/train.py", line 182, in main
meta=meta)
File "/media/xiaobingli/myself/LXB/RADet-main/radet/apis/train.py", line 139, in train_detector
val_dataset = build_dataset(cfg.data.val, dict(test_mode=True))
File "/media/xiaobingli/myself/LXB/RADet-main/radet/datasets/builder.py", line 78, in build_dataset
dataset = build_from_cfg(cfg, DATASETS, default_args)
File "/home/xiaobingli/anaconda3/envs/RADet-main/lib/python3.6/site-packages/mmcv/utils/registry.py", line 55, in build_from_cfg
raise type(e)(f'{obj_cls.__name__}: {e}')
AssertionError: BOPDataset: annotation file format <class 'list'> not supported
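The final assertion comes from `pycocotools`, which requires the top-level value of the annotation JSON to be a dict. The traceback shows it being raised while building `cfg.data.val`, whose `ann_file` is `test_targets_bop19.json`. One possible explanation, not confirmed in this thread, is that this file holds a JSON list rather than a COCO-style dict. A small pre-flight check along these lines can tell the two apart before training starts. `check_coco_annotation` is a hypothetical helper, and the `images` and `categories` keys it looks for are the ones that the `bop_to_coco.py` script quoted earlier writes with `--without-gt`.

```python
import json


def check_coco_annotation(path):
    """Return a list of problems found in a COCO-style annotation file."""
    with open(path, 'r') as f:
        data = json.load(f)
    if not isinstance(data, dict):
        return ['top-level JSON value is a {}, expected a dict'.format(
            type(data).__name__)]
    problems = []
    for key in ('images', 'categories'):
        if key not in data:
            problems.append("missing key '{}'".format(key))
    return problems


if __name__ == '__main__':
    import sys
    # Usage: python check_ann.py file1.json file2.json ...
    for ann_file in sys.argv[1:]:
        print(ann_file, check_coco_annotation(ann_file) or 'looks OK')
```

Running it on the `train`, `val` and `test` annotation files from the config shows which of them would trip the `pycocotools` assertion.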
Hey @YangHai-1218,
I ran into an issue with the RandomBackground augmentation step specified in configs/base/datasets/bop_detection.py, which appears to look for background images in the data/coco directory by default. Just from looking at the code, it seems you could potentially use any set of images as the background. But just for the sake of it, do you know exactly which images you used and where to download them?
Also, for the time being, I've just commented out the RandomBackground augmentation step in the configs/base/datasets/bop_detection.py file. Would that be acceptable, or is this a crucial augmentation step?
@YangHai-1218
Hi Hai,
Thanks for sharing such nice work. Would you mind providing the script used to produce the data for Fig. 7 (performance w.r.t. different occlusion ratios) in your paper? I cannot find the code to reproduce it.
Hi, congratulations on your excellent work! I have a question about how to train on my own dataset. I have 1000 images with masks, 6D poses, a CAD model and the camera intrinsics. Could you please give me some guidance? Thanks :)
Thanks for your great work
I have some questions about the paper
Thanks,
Best
Hello,
I'm trying to run the training script in tools/train.py and I ran into a few errors with mmcv (using version 1.3.18).
The first problem I ran into complained about mmcv._ext:
$ python tools/train.py --config configs/bop/r50_ycbv_pbr_testing123.py
Traceback (most recent call last):
File "/data2/6d-pose-estimation/radet-attempt-2/RADet/tools/train.py", line 15, in <module>
from radet.apis import set_random_seed, train_detector
File "/data2/6d-pose-estimation/radet-attempt-2/RADet/radet/apis/__init__.py", line 1, in <module>
from .inference import (async_inference_detector, inference_detector,
File "/data2/6d-pose-estimation/radet-attempt-2/RADet/radet/apis/inference.py", line 6, in <module>
from mmcv.ops import RoIPool
File "/home/exx/anaconda3/envs/radet-attempt-2-0/lib/python3.9/site-packages/mmcv/ops/__init__.py", line 2, in <module>
from .assign_score_withk import assign_score_withk
File "/home/exx/anaconda3/envs/radet-attempt-2-0/lib/python3.9/site-packages/mmcv/ops/assign_score_withk.py", line 5, in <module>
ext_module = ext_loader.load_ext(
File "/home/exx/anaconda3/envs/radet-attempt-2-0/lib/python3.9/site-packages/mmcv/utils/ext_loader.py", line 13, in load_ext
ext = importlib.import_module('mmcv.' + name)
File "/home/exx/anaconda3/envs/radet-attempt-2-0/lib/python3.9/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named 'mmcv._ext'
But I was able to find this issue on GitHub (open-mmlab/mmcv#204) that provided a solution for it, which basically boiled down to running MMCV_WITH_OPS=1 pip install -e . in the root directory of the source code for mmcv==1.3.18. But then I ran into another issue:
$ python tools/train.py --config configs/bop/r50_ycbv_pbr_testing123.py
Traceback (most recent call last):
File "/data2/6d-pose-estimation/radet-attempt-2/RADet/tools/train.py", line 15, in <module>
from radet.apis import set_random_seed, train_detector
File "/data2/6d-pose-estimation/radet-attempt-2/RADet/radet/apis/__init__.py", line 1, in <module>
from .inference import (async_inference_detector, inference_detector,
File "/data2/6d-pose-estimation/radet-attempt-2/RADet/radet/apis/inference.py", line 6, in <module>
from mmcv.ops import RoIPool
File "/home/exx/anaconda3/envs/radet-attempt-2-0/lib/python3.9/site-packages/mmcv-1.3.18/mmcv/ops/__init__.py", line 2, in <module>
from .assign_score_withk import assign_score_withk
File "/home/exx/anaconda3/envs/radet-attempt-2-0/lib/python3.9/site-packages/mmcv-1.3.18/mmcv/ops/assign_score_withk.py", line 5, in <module>
ext_module = ext_loader.load_ext(
File "/home/exx/anaconda3/envs/radet-attempt-2-0/lib/python3.9/site-packages/mmcv-1.3.18/mmcv/utils/ext_loader.py", line 13, in load_ext
ext = importlib.import_module('mmcv.' + name)
File "/home/exx/anaconda3/envs/radet-attempt-2-0/lib/python3.9/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
ImportError: /home/exx/anaconda3/envs/radet-attempt-2-0/lib/python3.9/site-packages/mmcv-1.3.18/mmcv/_ext.cpython-39-x86_64-linux-gnu.so: undefined symbol: _Z27points_in_boxes_cpu_forwardN2at6TensorES0_S0_
I found this GitHub issue (open-mmlab/mmcv#1556) in the mmcv repository that claims this problem was fixed in mmcv==1.4.0.
My question is: are you sure that mmcv==1.3.18 is the right version to use? Am I installing it incorrectly? Would it be preferable to use a version lower than 1.3.18, such as 1.3.17, or would it be better to use 1.4.0?
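The two failures above are different in kind: `ModuleNotFoundError: No module named 'mmcv._ext'` means the compiled ops module is not on the import path at all, whereas the `undefined symbol` `ImportError` means a compiled module was found but could not be loaded. The sketch below separates those cases for any module name. `probe_extension` is a hypothetical helper, and it only classifies the failure without saying which mmcv version should be installed.

```python
import importlib


def probe_extension(name):
    """Try to import `name` and classify the outcome.

    Returns a (status, message) pair where status is one of
    'ok', 'missing' (module not found) or 'broken' (found but failed to load).
    """
    try:
        importlib.import_module(name)
    except ModuleNotFoundError as e:
        # Must be caught first: ModuleNotFoundError is a subclass of ImportError.
        return 'missing', str(e)
    except ImportError as e:
        # e.g. a shared object with an undefined symbol
        return 'broken', str(e)
    return 'ok', ''


if __name__ == '__main__':
    for name in ('mmcv', 'mmcv._ext', 'mmcv.ops'):
        print(name, probe_extension(name))
```

A `missing` result for `mmcv._ext` points at an install without compiled ops, while `broken` points at a build that does not match the rest of the environment.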