radet's People

Contributors

yanghai-1218

radet's Issues

Error about 'mmdet'

Good work on 6D object pose estimation!
However, an error occurred when I was running 'python tools/test.py --config configs/bop/r50_ycbv_pbr.py --checkpoint checkpoints/radet_ycbv_pbr.pth --format-only --eval-options jsonfile_prefix=work_dirs/results/radet_ycbv_pbr'. It seems to be related to the 'mmdetection' package. Could you please tell me how to install the correct 'mmdetection' package, or suggest another solution? Thank you!
Detailed information:
Traceback (most recent call last):
  File "/data3/tantao/my_projects/RADet-main/test.py", line 13, in <module>
    from radet.apis import multi_gpu_test, single_gpu_test
  File "/data3/tantao/my_projects/RADet-main/radet/apis/__init__.py", line 1, in <module>
    from .inference import (async_inference_detector, inference_detector,
  File "/data3/tantao/my_projects/RADet-main/radet/apis/inference.py", line 10, in <module>
    from radet.core import get_classes
  File "/data3/tantao/my_projects/RADet-main/radet/core/__init__.py", line 7, in <module>
    from .post_processing import *  # noqa: F401, F403
  File "/data3/tantao/my_projects/RADet-main/radet/core/post_processing/__init__.py", line 1, in <module>
    from .bbox_nms import fast_nms, multiclass_nms, multiclass_vote
  File "/data3/tantao/my_projects/RADet-main/radet/core/post_processing/bbox_nms.py", line 3, in <module>
    from mmdet.ops import vote_nms
ModuleNotFoundError: No module named 'mmdet.ops'
Pip list:
Package Version Editable project location


addict 2.4.0
certifi 2023.5.7
contourpy 1.1.0
cycler 0.11.0
Cython 0.29.35
fonttools 4.41.1
importlib-metadata 6.8.0
importlib-resources 6.0.0
Jinja2 3.1.2
kiwisolver 1.4.4
markdown-it-py 3.0.0
MarkupSafe 2.1.3
matplotlib 3.4.3
mdurl 0.1.2
mkl-fft 1.3.1
mkl-random 1.2.2
mkl-service 2.4.0
mmcv 1.3.18
mmcv-full 1.3.18
mmdet 3.1.0 /data3/xx/download/mmdetection-main
mmengine 0.8.2
mmpycocotools 12.0.3
numpy 1.24.3
opencv-python 4.8.0.74
packaging 23.1
Pillow 10.0.0
pip 23.1.2
platformdirs 3.9.1
pycocotools 2.0.6
Pygments 2.15.1
pyparsing 3.0.9
pyproject 1.3.1
python-dateutil 2.8.2
PyYAML 6.0.1
RADet 1.0.0 /data3/xx/my_projects/RADet-main
rich 13.4.2
scipy 1.11.1
setuptools 67.8.0
shapely 2.0.1
six 1.16.0
termcolor 2.3.0
terminaltables 3.1.10
tomli 2.0.1
torch 1.10.0
torchaudio 0.10.0
torchvision 0.11.0
tornado 6.3.2
typing_extensions 4.7.1
wheel 0.38.4
yapf 0.40.1
zipp 3.16.2
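
A note on the pip list above: it shows mmdet 3.1.0, but the mmdet.ops module only existed in the mmdet 1.x series (its ops moved into mmcv.ops from 2.0 onward), and vote_nms is not an upstream mmdet op in any version, so the import in bbox_nms.py cannot resolve against a stock install. If the repository ships its own vote_nms, one way to keep the import working is a guarded fallback; this is only a sketch, and the radet.ops location below is hypothetical:

try:
    from mmdet.ops import vote_nms  # resolves only with an old or patched mmdet
except ImportError:
    # Hypothetical local fallback; point this at wherever the repo defines vote_nms.
    from radet.ops import vote_nms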

About itodd

The ITODD dataset does not seem to provide ground truth for its test set. What should I do?
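
For context: in the BOP benchmark, ITODD's test ground truth is withheld and results are scored on the BOP evaluation server, so there are no local test labels to compare against. The tools/bop_to_coco.py script pasted in a later issue on this page accepts a --without-gt flag that builds an image-only annotation file for exactly this case; a call along these lines (paths illustrative) should produce a usable test annotation:

$ python tools/bop_to_coco.py --images-dir data/itodd/test --images-list data/itodd/image_lists/test.txt --save-path data/itodd/detector_annotations/test.json --dataset itodd --without-gt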

some issues on data preparation stage

Hello author,
Thanks for your great work and for sharing it.
I tried to prepare data using the ycbv_pbr dataset, but there is an error:

usage: collect_image_list.py [-h] [--source-dir SOURCE_DIR]
[--save-path SAVE_PATH] [--pattern PATTERN]
collect_image_list.py: error: unrecognized arguments: --image_list /home/ivc2411/path/data/image_lists/train_pbr.txt

Can you give me some advice?

Best
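
For reference, the usage string in the error above lists only --source-dir, --save-path, and --pattern, so the unrecognized --image_list flag presumably needs to be --save-path instead. The working command quoted in the next issue matches that pattern:

$ python tools/collect_image_list.py --source-dir data/ycbv/train_pbr --save-path data/ycbv/train_pbr/train_pbr.txt --pattern */rgb/*.jpg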

Some problems with bop_to_coco.py

Hello,

So to begin, I'm trying to get the YCB-V dataset set up with the annotations, and I started off by running python tools/collect_image_list.py --source-dir data/ycbv/train_pbr --save-path data/ycbv/train_pbr/train_pbr.txt --pattern */rgb/*.jpg to get the image list.

Now I'm trying to run the tools/bop_to_coco.py script and I've run into a few errors regarding the command-line arguments. The first problem I ran into was this error:

$ python tools/bop_to_coco.py --images-dir data/ycbv/train_pbr --images-list data/ycbv/train_pbr/train_pbr.txt --save-path data/ycbv/detector_annotations/train_pbr.json --dataset ycbv
Traceback (most recent call last):
  File "/data2/6d-pose-estimation/radet-attempt-2/RADet/tools/bop_to_coco.py", line 238, in <module>
    data_root, txt, seg_collect, thread_num = args.images_dir, args.images_list, args.segmentation, args.thread_num
AttributeError: 'Namespace' object has no attribute 'thread_num'

I then changed the line data_root, txt, seg_collect, thread_num = args.images_dir, args.images_list, args.segmentation, args.thread_num to data_root, txt, seg_collect = args.images_dir, args.images_list, args.segmentation, which solved that first error (I'm not sure whether thread_num is critical, but it isn't used anywhere in the script, so I'm assuming it isn't). But then I ran into a new error:

$ python tools/bop_to_coco.py --images-dir data/ycbv/train_pbr --images-list data/ycbv/train_pbr/train_pbr.txt --save-path data/ycbv/detector_annotations/train_pbr.json --dataset ycbv
Traceback (most recent call last):
  File "/data2/6d-pose-estimation/radet-attempt-2/RADet/tools/bop_to_coco.py", line 241, in <module>
    if args.amodal:
AttributeError: 'Namespace' object has no attribute 'amodal'

So I added the line parser.add_argument('--amodal', action='store_true') to the parse_args() function in bop_to_coco.py:

def parse_args():
    parser = ArgumentParser(description='Extract ground annotations from BOP format to COCO format')
    parser.add_argument('--images-dir', default='data/hb/train_pbr', type=str)
    parser.add_argument('--images-list',default='data/hb/image_lists/train_pbr.txt' ,type=str)
    parser.add_argument('--save-path', default='data/hb/detector_annotations/train_pbr.json', type=str)
    parser.add_argument('--segmentation', action='store_true', help='collect segmentation info or not')
    parser.add_argument('--without-gt', action='store_true')
    parser.add_argument('--amodal', action='store_true') # <= New line added here!!!

    parser.add_argument('--dataset', choices=['icbin', 'tudl', 'tless', 'lmo', 'itodd', 'hb', 'ycbv'])
    args = parser.parse_args()
    return args

This fixed the issue with the script not being aware of the amodal command-line argument. However, I did some further debugging and noticed that the annotations still weren't being generated. I managed to track the issue down to the make_coco_anno() and construct_gt_info() functions.

The construct_gt_info() function removes the leading "/" from the image paths with the line image_path = image_path[1:], so you get paths that look something like 000000/rgb/000000.jpg. It then updates the annos_info dict with the line annos_info[image_path] = dict(id=image_id, gts_info=per_img_info) and finally returns annos_info, which the if __name__ == '__main__': section uses to update the collect_info dict with the line collect_info.update(construct_gt_info(seq, anno_start_end_id, image_start_end_id)).

The problem stems from the make_coco_anno() function, which constructs a list of paths with this block of code:

    with open(txt_path, 'r') as f:
        paths = f.read()
    paths = list(paths.split())

After this, the paths variable will contain a list that looks something like [ ..., '/000049/rgb/000558.jpg', '/000049/rgb/000574.jpg', '/000049/rgb/000575.jpg', ...]. These paths keep the leading "/", which causes the check if path in collect_annos: to fail every time. The solution I have is to add the line path = path[1:] at the beginning of the for loop in make_coco_anno():

def make_coco_anno(txt_path, collect_annos, coco_annos_dict):
    with open(txt_path, 'r') as f:
        paths = f.read()
    paths = list(paths.split())

    images_info = []
    annos_info = []
    for path in paths:
        path = path[1:] # <= New line added here!!!
        if path in collect_annos:
            images_info.append(dict(file_name=path, id=collect_annos[path]['id'], width=image_w, height=image_h))
            annos_info.extend(collect_annos[path]['gts_info'])

    coco_annos_dict['images'].extend(images_info)
    coco_annos_dict['annotations'].extend(annos_info)
    return coco_annos_dict

So overall, here's the updated bop_to_coco.py file with my changes, which I tested to make sure it works:

import os
import json
import cv2
import numpy as np
from os import path as osp
from tqdm import tqdm
from argparse import ArgumentParser





class_names_cfg = dict(
    icbin=('coffee_cup', 'juice_carton'),
    tudl= ('dragon', 'frog', 'can'),
    lmo=('ape', 'benchvise', 'bowl', 'cam', 'can', 'cat', 'cup', 'driller', 'duck', 'eggbox', 'glue', 'holepuncher', 'iron','lamp', 'phone'),
    ycbv= ('master_chef_can', 'cracker_box', 'sugar_box', 'tomato_soup_can', 'mustard_bottle', 'tuna_fish_can', 'pudding_box', 'gelatin_box',
            'potted_meat_can', 'banana', 'pitcher_base', 'bleach_cleanser', 'bowl', 'mug', 'power_drill',  'wood_block', 'scissors', 'large_marker',
            'large_clamp', 'extra_large_clamp', 'foam_brick'),
    hb=tuple([i+1 for i in range(33)]),
    itodd=tuple([i+1 for i in range(28)]),
    tless=tuple([i+1 for i in range(30)]),
)

image_resolution_cfg = dict(
    icbin=(640, 480),
    tudl=(640, 480),
    ycbv=(640, 480),
    lmo=(640, 480),
    hb=(640, 480),
    itodd=(1280, 960),
    tless=(720, 540), # train_primesense (400, 400), train_pbr (720, 540)
)


def parse_args():
    parser = ArgumentParser(description='Extract ground annotations from BOP format to COCO format')
    parser.add_argument('--images-dir', default='data/hb/train_pbr', type=str)
    parser.add_argument('--images-list',default='data/hb/image_lists/train_pbr.txt' ,type=str)
    parser.add_argument('--save-path', default='data/hb/detector_annotations/train_pbr.json', type=str)
    parser.add_argument('--segmentation', action='store_true', help='collect segmentation info or not')
    parser.add_argument('--without-gt', action='store_true')
    parser.add_argument('--amodal', action='store_true') # <= New line added here!!!
   
    parser.add_argument('--dataset', choices=['icbin', 'tudl', 'tless', 'lmo', 'itodd', 'hb', 'ycbv'])
    args = parser.parse_args()
    return args


def close_contour(contour):
    if not np.array_equal(contour[0], contour[-1]):
        contour = np.vstack((contour, contour[0]))
    return contour

def binary_mask_to_polygon(binary_mask, tolerance=0):
    from skimage import measure
    from shapely.geometry import Polygon, MultiPolygon
    """Converts a binary mask to COCO polygon representation
    Args:
        binary_mask: a 2D binary numpy array where '1's represent the object
        tolerance: Maximum distance from original points of polygon to approximated
            polygonal chain. If tolerance is 0, the original coordinate array is returned.
    """
    # pad mask to close contours of shapes which start and end at an edge
    padded_binary_mask = np.pad(binary_mask, pad_width=1, mode='constant', constant_values=0)
    contours = measure.find_contours(padded_binary_mask, 0.5)

    segmentations = []
    polygons = []
    for contour in contours:
        # Flip from (row, col) representation to (x, y)
        # and subtract the padding pixel
        for i in range(len(contour)):
            row, col = contour[i]
            contour[i] = (col - 1, row - 1)

        # Make a polygon and simplify it
        # if len(contour) < 3:
        #     continue
        poly = Polygon(contour)
        poly = poly.simplify(1.0, preserve_topology=False)
        polygons.append(poly)
        if isinstance(poly, MultiPolygon):
            poly = max(poly, key=lambda a: a.area)
        segmentation = np.array(poly.exterior.coords).ravel().tolist()
        segmentations.append(segmentation)

    # contours = np.subtract(contours, 1)
    # for contour in contours:
    #     contour = close_contour(contour)
    #     contour = measure.approximate_polygon(contour, tolerance)
    #     if len(contour) < 3:
    #         continue
    #     contour = np.flip(contour, axis=1)
    #     segmentation = contour.ravel().tolist()
    #     # after padding and subtracting 1 we may get -0.5 points in our segmentation
    #     segmentation = [0 if i < 0 else i for i in segmentation]
    #     polygons.append(segmentation)

    return segmentations

def construct_gt_info(sequence_dir, start_end_anno_id, start_end_img_id):
    sequence_gt_info_path = os.path.join(sequence_dir, 'scene_gt_info.json')
    sequence_gt_path = os.path.join(sequence_dir, 'scene_gt.json')
    with open(sequence_gt_info_path, 'r') as f:
        sequence_gt_info = json.load(f)
    with open(sequence_gt_path, 'r') as f:
        sequence_gt = json.load(f)

    image_id, anno_id = start_end_img_id[0], start_end_anno_id[0]
    annos_info = dict()
    pbar = tqdm(sequence_gt_info.keys())
    pbar.set_description(os.path.basename(sequence_dir))
    for id in pbar:
        image_id += 1

        image_path = os.path.join(sequence_dir, 'rgb', id.zfill(6)+'.jpg')
        if os.path.exists(image_path):
            # relative path
            image_path = os.path.join(sequence_dir.split(data_root)[-1], 'rgb', id.zfill(6)+'.jpg')
        else:
            # check png path
            image_path = image_path.replace('jpg', 'png')
            assert os.path.exists(image_path)
            image_path = os.path.join(sequence_dir.split(data_root)[-1], 'rgb', id.zfill(6)+'.png')

        # filter '/'
        image_path = image_path[1:]

        per_img_info = []
        bbox_info_per_image = sequence_gt_info[id]
        category_info_per_image = sequence_gt[id]
        visib_fract_per_image = [f['visib_fract'] for f in bbox_info_per_image]
        bbox_info_per_image = [b[bbox_key] for b in bbox_info_per_image]
        category_info_per_image = [c['obj_id'] for c in category_info_per_image]
        for obj_id, (bbox_info_per_obj, category_info_per_obj, visib_fract_per_obj) in enumerate(zip(bbox_info_per_image, category_info_per_image, visib_fract_per_image)):
            anno_id += 1
            area = bbox_info_per_obj[2] * bbox_info_per_obj[3]
            if seg_collect:
                mask_path = os.path.join(sequence_dir, 'mask_visib', id.zfill(6)+'_'+str(obj_id).zfill(6)+'.png')
                mask_per_obj = cv2.cvtColor(cv2.imread(mask_path), cv2.COLOR_BGR2GRAY)
                mask_per_obj = (mask_per_obj / 255).astype(np.byte)
                polygons = binary_mask_to_polygon(mask_per_obj)
                polygons = [p for p in polygons if len(p)>0]
                if len(polygons) == 0:
                    continue
                per_obj_info = dict(
                    id=anno_id, 
                    image_id=image_id, 
                    category_id=category_info_per_obj, 
                    visib_fract=visib_fract_per_obj,
                    bbox=bbox_info_per_obj,
                    area=area, 
                    iscrowd=0, 
                    segmentation=polygons)
            else:
                per_obj_info = dict(
                    id=anno_id, 
                    image_id=image_id, 
                    category_id=category_info_per_obj, 
                    visib_fract=visib_fract_per_obj,
                    bbox=bbox_info_per_obj,
                    area=area, 
                    iscrowd=0)
            per_img_info.append(per_obj_info)

        annos_info[image_path] = dict(id=image_id, gts_info=per_img_info)
    assert anno_id == start_end_anno_id[1]
    assert image_id == start_end_img_id[1]
    return annos_info
    # with open(os.path.join(sequence_dir, 'collect_gt_info.pkl'), 'wb') as f:
    #     pickle.dump(annos_info, f)


def scan_imageid_and_annoid(sequence_dirs):
    image_start_end_ids = []
    anno_start_end_ids = []
    start_image_id, start_anno_id = 0, 0
    for sequence_dir in sequence_dirs:
        sequence_gt_info_path = osp.join(sequence_dir, 'scene_gt_info.json')
        with open(sequence_gt_info_path, 'r') as f:
            sequence_gt_info = json.load(f)
        image_num = len(sequence_gt_info)
        anno_num = [len(v) for v in sequence_gt_info.values()]
        anno_num = sum(anno_num)
        end_image_id = start_image_id + image_num
        end_anno_id = start_anno_id + anno_num
        image_start_end_ids.append((start_image_id, end_image_id))
        anno_start_end_ids.append((start_anno_id, end_anno_id))
        start_anno_id = end_anno_id
        start_image_id = end_image_id
    return image_start_end_ids, anno_start_end_ids



def make_coco_anno(txt_path, collect_annos, coco_annos_dict):
    with open(txt_path, 'r') as f:
        paths = f.read()
    paths = list(paths.split())

    images_info = []
    annos_info = []
    for path in paths:
        path = path[1:] # <= New line added here!!!
        if path in collect_annos:
            images_info.append(dict(file_name=path, id=collect_annos[path]['id'], width=image_w, height=image_h))
            annos_info.extend(collect_annos[path]['gts_info'])

    coco_annos_dict['images'].extend(images_info)
    coco_annos_dict['annotations'].extend(annos_info)
    return coco_annos_dict

def save_test_annotation(txt_file, save_path, category_info):
    annotation = dict()
    images_info = []
    with open(txt_file, 'r') as f:
        image_paths = f.readlines()
    image_id = 0
    for i in range(len(image_paths)):
        image_path = image_paths[i].strip()
        images_info.append(
            dict(file_name=image_path, id=image_id, width=image_w, height=image_h)
        )
        image_id += 1
    annotation['images'] = images_info
    annotation['categories'] = category_info
    with open(save_path, 'w') as f:
        json.dump(annotation, f)









if __name__ == '__main__':
    args = parse_args()
    data_root, txt, seg_collect = args.images_dir, args.images_list, args.segmentation # <= Line updated here!!!
    dataset = args.dataset
    if args.amodal:
        bbox_key = 'bbox_visib'
    else:
        bbox_key = 'bbox_obj'
    
    class_names = class_names_cfg[dataset]
    image_w, image_h = image_resolution_cfg[dataset]

    category_info = []
    # generate category info
    for category_id, category_name in enumerate(class_names):
        category_info.append(dict(id=category_id+1, name=category_name))
    
    if args.without_gt:
        save_test_annotation(txt, args.save_path, category_info)
    else:
        coco_annotations = dict(images=list(), annotations=list(), categories=category_info)
        # generate anno
        collect_info = dict()
        image_id, anno_id = 0, 0
        sequences = sorted(os.listdir(data_root))
        sequences = [osp.join(data_root, s) for s in sequences]
        sequences = [s for s in sequences if osp.isdir(s)]
        image_start_end_ids, anno_start_end_ids = scan_imageid_and_annoid(sequences)
        pbar = tqdm(zip(sequences, image_start_end_ids, anno_start_end_ids))
        for seq, image_start_end_id, anno_start_end_id in pbar:
            collect_info.update(construct_gt_info(seq, anno_start_end_id, image_start_end_id))

        # convert annotations to coco format
        coco_annotations = make_coco_anno(txt, collect_info, coco_annotations)

        with open(args.save_path, 'w') as f:
            json.dump(coco_annotations, f)
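
With those changes in place, the two-step conversion from this issue runs end to end for YCB-V PBR data:

$ python tools/collect_image_list.py --source-dir data/ycbv/train_pbr --save-path data/ycbv/train_pbr/train_pbr.txt --pattern */rgb/*.jpg
$ python tools/bop_to_coco.py --images-dir data/ycbv/train_pbr --images-list data/ycbv/train_pbr/train_pbr.txt --save-path data/ycbv/detector_annotations/train_pbr.json --dataset ycbv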

Combining it with PFA-Pose

Thanks for your great work.

How do you combine it with PFA-Pose? Can you give some advice or code?
Thanks

AssertionError: BOPDataset: annotation file format <class 'list'> not supported

Thank you very much for your work!!
I encountered the following error while loading train.json. May I know how to resolve it?

fatal: not a git repository (or any parent up to mount point /media/xiaobingli)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
2023-10-25 15:10:32,798 - radet - INFO - Environment info:

sys.platform: linux
Python: 3.6.15 | packaged by conda-forge | (default, Dec 3 2021, 18:49:41) [GCC 9.4.0]
CUDA available: True
GPU 0: NVIDIA GeForce RTX 3060 Laptop GPU
CUDA_HOME: /usr/local/cuda-11.1
NVCC: Build cuda_11.1.TC455_06.29069683_0
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.8.0
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v1.7.0 (Git Hash 7aed236906b1f7a05c0917e5257a1af05e9ff683)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 11.1
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  • CuDNN 8.0.5
  • Magma 2.5.2
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.8.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.9.0
OpenCV: 3.4.8
MMCV: 1.3.18
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.1
MMDetection: 2.8.0+

2023-10-25 15:10:33,091 - radet - INFO - Distributed training: False
2023-10-25 15:10:33,396 - radet - INFO - Config:
dataset_type = 'BOPDataset'
data_root = 'data/rcv/'
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='LoadAnnotations', with_bbox=True, with_bop_mask=True),
dict(type='Resize', img_scale=(640, 480), keep_ratio=True),
dict(
type='CosyPoseAug',
p=0.8,
pipelines=[
dict(type='PillowBlur', p=1.0, factor_interval=(1, 3)),
dict(type='PillowSharpness', p=0.3, factor_interval=(0.0, 50.0)),
dict(type='PillowContrast', p=0.3, factor_interval=(0.2, 50.0)),
dict(type='PillowBrightness', p=0.5, factor_interval=(0.1, 6.0)),
dict(type='PillowColor', p=0.3, factor_interval=(0.0, 20.0))
]),
dict(type='RandomFlip', flip_ratio=0.5),
dict(type='GenerateDistanceMap'),
dict(
type='LabelAssignment',
anchor_generator_cfg=dict(
type='AnchorGenerator',
ratios=[1.0],
octave_base_scale=8,
scales_per_octave=1,
strides=[8, 16, 32, 64, 128]),
neg_threshold=0.2,
positive_num=10,
adapt_positive_num=False,
balance_sample=True),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=16),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=[
'img', 'gt_bboxes', 'gt_labels', 'points_to_gt_index',
'points_weight'
])
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(640, 480),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
]
data = dict(
samples_per_gpu=16,
workers_per_gpu=8,
train=dict(
type='BOPDataset',
ann_file='data/rcv/detector_annotations/train1.json',
img_prefix='data/rcv/train/',
seg_prefix='data/rcv/train/',
pipeline=[
dict(type='LoadImageFromFile'),
dict(type='LoadAnnotations', with_bbox=True, with_bop_mask=True),
dict(type='Resize', img_scale=(640, 480), keep_ratio=True),
dict(
type='CosyPoseAug',
p=0.8,
pipelines=[
dict(type='PillowBlur', p=1.0, factor_interval=(1, 3)),
dict(
type='PillowSharpness',
p=0.3,
factor_interval=(0.0, 50.0)),
dict(
type='PillowContrast',
p=0.3,
factor_interval=(0.2, 50.0)),
dict(
type='PillowBrightness',
p=0.5,
factor_interval=(0.1, 6.0)),
dict(
type='PillowColor', p=0.3, factor_interval=(0.0, 20.0))
]),
dict(type='RandomFlip', flip_ratio=0.5),
dict(type='GenerateDistanceMap'),
dict(
type='LabelAssignment',
anchor_generator_cfg=dict(
type='AnchorGenerator',
ratios=[1.0],
octave_base_scale=8,
scales_per_octave=1,
strides=[8, 16, 32, 64, 128]),
neg_threshold=0.2,
positive_num=10,
adapt_positive_num=False,
balance_sample=True),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=16),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=[
'img', 'gt_bboxes', 'gt_labels', 'points_to_gt_index',
'points_weight'
])
],
classes=('obj_000001', 'obj_000002', 'obj_000003', 'obj_000004',
'obj_000005', 'obj_000006', 'obj_000007', 'obj_000008',
'obj_000009', 'obj_000010', 'obj_000011', 'obj_000012',
'obj_000013', 'obj_000014', 'obj_000015', 'obj_000016'),
min_visib_frac=0.1),
val=dict(
type='BOPDataset',
ann_file='data/rcv/detector_annotations/test_targets_bop19.json',
img_prefix='data/rcv/test/',
pipeline=[
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(640, 480),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
],
classes=('obj_000001', 'obj_000002', 'obj_000003', 'obj_000004',
'obj_000005', 'obj_000006', 'obj_000007', 'obj_000008',
'obj_000009', 'obj_000010', 'obj_000011', 'obj_000012',
'obj_000013', 'obj_000014', 'obj_000015', 'obj_000016')),
test=dict(
type='BOPDataset',
ann_file='data/rcv/detector_annotations/test_targets_bop19.json',
img_prefix='data/rcv/test/',
pipeline=[
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(640, 480),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
],
bop_submission=True,
classes=('obj_000001', 'obj_000002', 'obj_000003', 'obj_000004',
'obj_000005', 'obj_000006', 'obj_000007', 'obj_000008',
'obj_000009', 'obj_000010', 'obj_000011', 'obj_000012',
'obj_000013', 'obj_000014', 'obj_000015', 'obj_000016')))
optimizer = dict(
type='AdamW',
lr=0.0004,
betas=(0.9, 0.999),
weight_decay=0.05,
eps=1e-08,
amsgrad=False)
lr_config = dict(
policy='OneCycle',
max_lr=0.0004,
total_steps=100100,
pct_start=0.05,
anneal_strategy='linear')
runner = dict(type='IterBasedRunner', max_iters=100000)
checkpoint_config = dict(interval=10000)
evaluation = dict(interval=10000, metric='bbox')
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')])
dist_params = dict(backend='nccl')
log_level = 'INFO'
workflow = [('train', 1)]
CLASS_NAMES = ('obj_000001', 'obj_000002', 'obj_000003', 'obj_000004',
'obj_000005', 'obj_000006', 'obj_000007', 'obj_000008',
'obj_000009', 'obj_000010', 'obj_000011', 'obj_000012',
'obj_000013', 'obj_000014', 'obj_000015', 'obj_000016')
model = dict(
type='RADet',
pretrained='torchvision://resnet50',
backbone=dict(
type='ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=True),
norm_eval=True,
style='pytorch'),
neck=dict(
type='FPN',
in_channels=[256, 512, 1024, 2048],
out_channels=256,
start_level=1,
add_extra_convs='on_output',
num_outs=5),
bbox_head=dict(
type='RADetHead',
num_classes=16,
in_channels=256,
stacked_convs=4,
feat_channels=256,
strides=[8, 16, 32, 64, 128],
anchor_generator=dict(
type='AnchorGenerator',
ratios=[1.0],
octave_base_scale=8,
scales_per_octave=1,
strides=[8, 16, 32, 64, 128]),
bbox_coder=dict(type='TBLRBBoxCoder', normalizer=0.125),
loss_cls=dict(
type='FocalLoss',
use_sigmoid=True,
gamma=2.0,
alpha=0.25,
loss_weight=1.0),
loss_bbox=dict(type='GIoULoss', loss_weight=2.0),
loss_centerness=dict(
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)))
train_cfg = dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.5,
neg_iou_thr=0.4,
min_pos_iou=0,
ignore_iof_thr=-1),
allowed_border=-1,
pos_weight=-1,
debug=False)
test_cfg = dict(
nms_pre=1000,
min_bbox_size=0,
score_thr=0.05,
nms=dict(
type='vote',
iou_threshold=0.65,
cluster_score=['cls', 'iou'],
vote_score=['iou', 'cls'],
iou_enable=False,
sima=0.025),
max_per_img=100)
work_dir = 'work_dirs/rcv_r50_radet_pbr'
gpu_ids = range(0, 1)

2023-10-25 15:10:33,749 - radet - INFO - load model from: torchvision://resnet50
2023-10-25 15:10:33,749 - radet - INFO - load checkpoint from torchvision path: torchvision://resnet50
2023-10-25 15:10:34,073 - radet - WARNING - The model and loaded state dict do not match exactly

unexpected key in source state_dict: fc.weight, fc.bias

loading annotations into memory...
Done (t=2.42s)
creating index...
index created!
fatal: not a git repository (or any parent up to mount point /media/xiaobingli)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
loading annotations into memory...
Traceback (most recent call last):
  File "/home/xiaobingli/anaconda3/envs/RADet-main/lib/python3.6/site-packages/mmcv/utils/registry.py", line 52, in build_from_cfg
    return obj_cls(**args)
  File "/media/xiaobingli/myself/LXB/RADet-main/radet/datasets/bop.py", line 36, in __init__
    filter_empty_gt)
  File "/media/xiaobingli/myself/LXB/RADet-main/radet/datasets/custom.py", line 87, in __init__
    self.data_infos = self.load_annotations(self.ann_file)
  File "/media/xiaobingli/myself/LXB/RADet-main/radet/datasets/coco.py", line 57, in load_annotations
    self.coco = COCO(ann_file)
  File "/home/xiaobingli/anaconda3/envs/RADet-main/lib/python3.6/site-packages/pycocotools/coco.py", line 89, in __init__
    type(dataset))
AssertionError: annotation file format <class 'list'> not supported

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tools/train.py", line 186, in <module>
    main()
  File "tools/train.py", line 182, in main
    meta=meta)
  File "/media/xiaobingli/myself/LXB/RADet-main/radet/apis/train.py", line 139, in train_detector
    val_dataset = build_dataset(cfg.data.val, dict(test_mode=True))
  File "/media/xiaobingli/myself/LXB/RADet-main/radet/datasets/builder.py", line 78, in build_dataset
    dataset = build_from_cfg(cfg, DATASETS, default_args)
  File "/home/xiaobingli/anaconda3/envs/RADet-main/lib/python3.6/site-packages/mmcv/utils/registry.py", line 55, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
AssertionError: BOPDataset: annotation file format <class 'list'> not supported
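
A plausible reading of this failure, given the config above: the val and test ann_file entries point at test_targets_bop19.json, which in BOP format is a top-level JSON list of targets, while pycocotools' COCO loader asserts that the annotation file parses to a dict. A quick check of that hypothesis (the path is taken from the config; regenerating the file with tools/bop_to_coco.py --without-gt, as in the script pasted earlier on this page, is a guess at the intended fix):

# Sanity check: pycocotools.coco.COCO only accepts a top-level dict.
import json

ann_file = 'data/rcv/detector_annotations/test_targets_bop19.json'  # path from the config above
with open(ann_file) as f:
    anno = json.load(f)
# COCO() raises exactly the AssertionError above when this prints <class 'list'>.
print(type(anno))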

Where to download COCO background images for the RandomBackground augmentation step?

Hey @YangHai-1218,

I ran into an issue with the RandomBackground augmentation step specified in configs/_base_/datasets/bop_detection.py, which appears to look for background images in the data/coco directory by default. Just from looking at the code, it seems like you could potentially use any set of images as the background. But just for the sake of it, do you know exactly which images you used and where to download them?

Also, for the time being I've just commented out the RandomBackground augmentation step in the configs/_base_/datasets/bop_detection.py file; would that be acceptable? Or is this a crucial augmentation step?
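
For anyone landing on this: conceptually the augmentation just composites the foreground objects, selected by their masks, onto an arbitrary image, which is why any sufficiently diverse image set should work as backgrounds. A generic sketch of the idea (an illustration only, not the repo's actual RandomBackground implementation):

import cv2
import numpy as np

def swap_background(img, fg_mask, bg_img):
    # Resize the background to the input resolution, then keep foreground
    # pixels from the original image and take everything else from bg_img.
    bg = cv2.resize(bg_img, (img.shape[1], img.shape[0]))
    keep = (fg_mask > 0)[..., None]  # HxWx1 boolean, broadcast over channels
    return np.where(keep, img, bg).astype(img.dtype)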

How to train custom datasets?

Hi, congratulations on your excellent work! I have some questions about how to train on my own dataset. I have 1000 images, masks, 6D poses, a CAD model, and the camera intrinsics. Could you please give me some help with this? Thanks :)
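
Based purely on the tools pasted elsewhere on this page, a plausible starting point (a guess, not an official recipe) is to arrange the data in BOP layout (per-sequence directories containing rgb/ images plus scene_gt.json and scene_gt_info.json), then build the image list and COCO-style detector annotations with the two scripts shown in the other issues:

$ python tools/collect_image_list.py --source-dir data/custom/train --save-path data/custom/image_lists/train.txt --pattern */rgb/*.jpg
$ python tools/bop_to_coco.py --images-dir data/custom/train --images-list data/custom/image_lists/train.txt --save-path data/custom/detector_annotations/train.json --dataset ycbv

The data/custom paths are illustrative, and --dataset would need a custom entry in class_names_cfg rather than ycbv. Note that these conversion scripts only consume bounding boxes and masks; the 6D poses, CAD model, and intrinsics are not used at the detection stage.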

some questions about the paper

Thanks for your great work.
I have some questions about the paper:

  1. In the training process, is the objective to find the optimal distance between the seeds and the target pixel?
  2. In the selection of predicted bboxes, the paper mentions fusing predicted bboxes of the same category to obtain more accurate results. Specifically, by what method is the fusion done? What are the specific differences from NMS?

Thanks,
Best
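
On question 2: elsewhere on this page, the pasted test config selects nms=dict(type='vote', cluster_score=['cls', 'iou'], vote_score=['iou', 'cls'], ...) and the code imports multiclass_vote / vote_nms, which is presumably where that fusion lives. As a generic illustration of the difference from plain NMS (a sketch of box voting in general, not the paper's exact weighting scheme): NMS keeps only the top-scoring box of each overlapping cluster and discards the rest, while voting replaces each cluster with a score-weighted average of its members.

import numpy as np

def iou_one_to_many(a, b):
    # IoU of box a (shape (4,)) against boxes b (shape (N, 4)), boxes as (x1, y1, x2, y2).
    x1 = np.maximum(a[0], b[:, 0]); y1 = np.maximum(a[1], b[:, 1])
    x2 = np.minimum(a[2], b[:, 2]); y2 = np.minimum(a[3], b[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    return inter / (area_a + area_b - inter)

def vote_boxes(boxes, scores, iou_thr=0.65):
    # Greedily cluster boxes around the highest-scoring remaining box, then
    # replace each cluster by the score-weighted average of its members.
    order = np.argsort(scores)[::-1]
    boxes, scores = boxes[order], scores[order]
    used = np.zeros(len(boxes), dtype=bool)
    out_boxes, out_scores = [], []
    for i in range(len(boxes)):
        if used[i]:
            continue
        cluster = (iou_one_to_many(boxes[i], boxes) >= iou_thr) & ~used
        used |= cluster
        w = scores[cluster]
        out_boxes.append((boxes[cluster] * w[:, None]).sum(0) / w.sum())
        out_scores.append(w.max())
    return np.stack(out_boxes), np.array(out_scores)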

Running into problems with mmcv==1.3.18

Hello,

I'm trying to run the training script in tools/train.py and I ran into a few errors with mmcv (using version 1.3.18).

The first problem I ran into complained about mmcv._ext:

$ python tools/train.py --config configs/bop/r50_ycbv_pbr_testing123.py           
Traceback (most recent call last):                                                                                                                            
  File "/data2/6d-pose-estimation/radet-attempt-2/RADet/tools/train.py", line 15, in <module>                                                                 
    from radet.apis import set_random_seed, train_detector                                                                                                    
  File "/data2/6d-pose-estimation/radet-attempt-2/RADet/radet/apis/__init__.py", line 1, in <module>                                                          
    from .inference import (async_inference_detector, inference_detector,                                                                                     
  File "/data2/6d-pose-estimation/radet-attempt-2/RADet/radet/apis/inference.py", line 6, in <module>                                                         
    from mmcv.ops import RoIPool                                                                                                                              
  File "/home/exx/anaconda3/envs/radet-attempt-2-0/lib/python3.9/site-packages/mmcv/ops/__init__.py", line 2, in <module>                                     
    from .assign_score_withk import assign_score_withk                                                                                                        
  File "/home/exx/anaconda3/envs/radet-attempt-2-0/lib/python3.9/site-packages/mmcv/ops/assign_score_withk.py", line 5, in <module>                           
    ext_module = ext_loader.load_ext(                                                                                                                         
  File "/home/exx/anaconda3/envs/radet-attempt-2-0/lib/python3.9/site-packages/mmcv/utils/ext_loader.py", line 13, in load_ext                                
    ext = importlib.import_module('mmcv.' + name)                                                                                                             
  File "/home/exx/anaconda3/envs/radet-attempt-2-0/lib/python3.9/importlib/__init__.py", line 127, in import_module                                           
    return _bootstrap._gcd_import(name[level:], package, level)                                                                                               
ModuleNotFoundError: No module named 'mmcv._ext' 

But I was able to find this issue on GitHub (open-mmlab/mmcv#204) that provided a solution, which basically boiled down to running MMCV_WITH_OPS=1 pip install -e . in the root directory of the mmcv==1.3.18 source tree. But then I ran into another issue:

$ python tools/train.py --config configs/bop/r50_ycbv_pbr_testing123.py 
Traceback (most recent call last):
  File "/data2/6d-pose-estimation/radet-attempt-2/RADet/tools/train.py", line 15, in <module>
    from radet.apis import set_random_seed, train_detector
  File "/data2/6d-pose-estimation/radet-attempt-2/RADet/radet/apis/__init__.py", line 1, in <module>
    from .inference import (async_inference_detector, inference_detector,
  File "/data2/6d-pose-estimation/radet-attempt-2/RADet/radet/apis/inference.py", line 6, in <module>
    from mmcv.ops import RoIPool
  File "/home/exx/anaconda3/envs/radet-attempt-2-0/lib/python3.9/site-packages/mmcv-1.3.18/mmcv/ops/__init__.py", line 2, in <module>
    from .assign_score_withk import assign_score_withk
  File "/home/exx/anaconda3/envs/radet-attempt-2-0/lib/python3.9/site-packages/mmcv-1.3.18/mmcv/ops/assign_score_withk.py", line 5, in <module>
    ext_module = ext_loader.load_ext(
  File "/home/exx/anaconda3/envs/radet-attempt-2-0/lib/python3.9/site-packages/mmcv-1.3.18/mmcv/utils/ext_loader.py", line 13, in load_ext
    ext = importlib.import_module('mmcv.' + name)
  File "/home/exx/anaconda3/envs/radet-attempt-2-0/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ImportError: /home/exx/anaconda3/envs/radet-attempt-2-0/lib/python3.9/site-packages/mmcv-1.3.18/mmcv/_ext.cpython-39-x86_64-linux-gnu.so: undefined symbol: _Z27points_in_boxes_cpu_forwardN2at6TensorES0_S0_

I found this GitHub issue (open-mmlab/mmcv#1556) from the mmcv repository that claims this problem was fixed in mmcv==1.4.0.

My question is, are you guys sure that mmcv==1.3.18 is the right version to be using? Am I not installing it correctly? Would it be preferable to use a version lower than 1.3.18, such as 1.3.17, or would it be better to use 1.4.0?
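
For what it's worth, the undefined-symbol error above is typical of a mismatch between the mmcv build and the local CUDA/PyTorch pair, and the usual alternative to building from source is installing a prebuilt mmcv-full wheel from OpenMMLab's wheel index. The general form is below; cu_version and torch_version must be substituted for your own setup (e.g. cu111/torch1.8.0), and whether a prebuilt 1.3.18 wheel exists for your exact combination is worth checking first:

$ pip install mmcv-full==1.3.18 -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html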
