fuenwang / 3D-BoundingBox
PyTorch implementation for 3D Bounding Box Estimation Using Deep Learning and Geometry
Hello, I am very new to this research. How can we draw the 3D bounding boxes? I have read the paper but still don't understand how to map from the 3D coordinates to the 2D image. From your code we can get the dimensions, center point, and angle, but how do we draw the box on the image using OpenCV?
Thank you!
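For anyone else stuck on this step: projecting a 3D point into the image only needs the 3x4 camera projection matrix (P2 in the KITTI calib files). A minimal sketch with made-up but KITTI-like numbers:

```python
import numpy as np

# Hypothetical values for illustration: a 3x4 KITTI-style projection matrix P2
# and one 3D box corner in camera coordinates (x right, y down, z forward).
P2 = np.array([[721.5, 0.0, 609.6, 44.9],
               [0.0, 721.5, 172.9, 0.2],
               [0.0, 0.0, 1.0, 0.003]])
corner_3d = np.array([1.0, 1.5, 10.0])   # meters, camera frame

corner_hom = np.append(corner_3d, 1.0)   # homogeneous coordinates
pixel = P2 @ corner_hom                  # project: (3x4) @ (4,)
u, v = pixel[:2] / pixel[2]              # divide by depth to get pixel coords
```

Do this for all eight corners and connect them with cv2.line to draw the box.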
Hi, I am trying to test some of my own images with your code, but I found that the input of your model is a cropped image instead of a raw image. Does that mean I have to run a 2D bounding box detector on my raw images first, and then feed each cropped image to your model to get the 3D bounding box? I am new to deep learning, so I hope this isn't a silly question.
Thanks.
Hi @fuenwang! I am new to deep learning. I am trying to produce a 3D box from already detected 2D bounding boxes. I used YOLOv3 to produce the 2D boxes. How do I proceed with your code to generate the 3D bounding boxes? Do I need to extract the coordinates of these 2D boxes and then pass them along with the image to your code?
Thanks.
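In case it helps: yes, the pipeline is two-stage, and the second stage only sees per-object crops. A minimal sketch of the cropping step, using a hypothetical detection and a dummy image:

```python
import numpy as np

# Hypothetical 2D detection (x1, y1, x2, y2) from a detector such as YOLO,
# and a dummy KITTI-sized frame; the crop is what the 3D network consumes.
image = np.zeros((375, 1242, 3), dtype=np.uint8)
x1, y1, x2, y2 = 100, 150, 300, 320      # example box, pixel coordinates

crop = image[y1:y2, x1:x2]               # per-object patch
# resize this patch to the network's expected input size (e.g. 224x224)
# before the forward pass
```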
I found many different angles in your code, the KITTI labels, and the paper, and I have some questions:
1. As shown in the image, theta (red) = theta_ray (black) + theta_local (blue) - 360, but the paper says theta = theta_ray + theta_local. I am very confused about this.
2. Is the alpha angle equal to theta_ray? If not, what is the difference?
3. What is the local angle? What is the difference between the local angle and the local theta?
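My understanding of the convention (not the author's official answer): alpha is the local/observation angle the network regresses, theta_ray is the angle of the viewing ray from the camera to the object center, and the global yaw is their sum; the "-360" is just wrapping the sum back into the valid angle range. A small sketch with illustrative values:

```python
import numpy as np

def wrap_to_pi(a):
    """Wrap any angle into [-pi, pi)."""
    return (a + np.pi) % (2 * np.pi) - np.pi

# Hypothetical object at x = 5 m, z = 10 m in the camera frame.
x, z = 5.0, 10.0
theta_ray = np.arctan2(x, z)              # ray from camera center to object
alpha = 0.3                               # local angle predicted by the net
rot_y = wrap_to_pi(alpha + theta_ray)     # global yaw; the wrap is the "-360"
```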
Hi! What is the procedure for annotating a custom dataset?
Thank you so much for your code, but the loss never converges, and when I test, the results are still bad. May I ask how many epochs you trained for, and how the loss changed over training? That would be very helpful.
Thanks for your great work. I want to know the meaning of intervalAngle / 2 + self.overlap versus intervalAngle / 2, and what the difference is between confidence_multi and confidence in dataset.py. Thanks.
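My reading (not the author's): dataset.py builds MultiBin-style confidence targets, and the overlap lets an angle that falls near a bin boundary activate both neighboring bins instead of just one. A hedged sketch with illustrative values, reusing the intervalAngle/overlap names from dataset.py:

```python
import numpy as np

# Sketch of MultiBin-style confidence labels, assuming 2 bins over [0, 2*pi)
# with an overlap margin; all values here are illustrative.
num_bins = 2
interval = 2 * np.pi / num_bins          # intervalAngle
overlap = 0.1                            # extra margin on each side of a bin

angle = 1.0                              # example local angle in [0, 2*pi)
centers = np.arange(num_bins) * interval
# circular distance from the angle to each bin center
diff = np.abs((angle - centers + np.pi) % (2 * np.pi) - np.pi)
# a bin is "responsible" if the angle lies within half-interval + overlap
confidence = (diff < interval / 2 + overlap).astype(float)
```

Without the overlap term, each angle would fall into exactly one bin; with it, boundary angles produce two positive bins, which is what confidence_multi captures.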
Hi @fuenwang,
Can you explain the function of these two code snippets to me? Does the execution of the second depend on the execution of the first?
Thanks!
I am a student trying to implement a single-camera ranging system. I am getting 2D bounding boxes from the KITTI dataset using YOLOv5 and trying to reduce the error in distance measurement through angle regression.
After training and evaluating on KITTI, I want to get the dimension, angle, and orientation of objects in front of the camera by feeding in video, rather than a custom dataset or the KITTI dataset. Could you help me with which parts to change?
I'm still a college student and haven't studied vision for long, so I'd appreciate any help.
Can someone share a pre-trained model for reproducing some results?
When I use the location from the ground truth together with the dimensions and R_y predicted by the model, I can plot a good 3D bounding box on the test image. However, I want to plot the 3D bounding box without any assistance from the ground truth. How can I infer the location (or the eight corner points) for each object?
Futen, awesome work!
Sorry, one more question: can you explain how to get T? If we don't have the location T = (tx, ty, tz), we can't get rotation_y.
Please help!
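For others hitting this: in the paper, T is not predicted by the network. It is recovered by requiring the projected 3D box to fit tightly inside the 2D detection box, which yields a small linear system. A simplified sketch of that idea (the real method tries every corner-to-edge assignment and keeps the lowest residual; here one assignment is passed in, and all names are illustrative):

```python
import numpy as np

def solve_translation(P, R, corners_obj, box2d, picks):
    """Least-squares T from the constraint that chosen 3D corners project
    onto the four sides of the 2D box.
    P: 3x4 projection matrix, R: 3x3 yaw rotation,
    corners_obj: (8, 3) corners in the object frame,
    box2d: (xmin, ymin, xmax, ymax),
    picks: indices of corners assumed to touch left/top/right/bottom edges."""
    xmin, ymin, xmax, ymax = box2d
    rows = [0, 1, 0, 1]                   # u-rows for x sides, v-rows for y
    sides = [xmin, ymin, xmax, ymax]
    A, b = [], []
    for row, side, idx in zip(rows, sides, picks):
        # the constraint u = (m_row . h)/(m_3 . h) = side is linear in T
        m = P[row] - side * P[2]          # combined row, length 4
        Xc = R @ corners_obj[idx]         # rotated corner, object -> camera
        A.append(m[:3])
        b.append(-m[:3] @ Xc - m[3])
    T, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return T
```

Once T is known, theta_ray = arctan(tx / tz) follows, which is what connects alpha to rotation_y.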
If anyone wants to test how to use a KITTI 3D label to draw a box on an image, or how to draw a final box from the model prediction, here is a script you can reference:
import os
import numpy as np
import cv2

def plot_3d_bbox(img, label_info):
    print('current label info: ', label_info)
    alpha = label_info['alpha']          # local angle, assumed in degrees here
    box_3d = []
    center = label_info['location']      # (x, y, z) of the box bottom center
    dims = label_info['dimension']       # (height, width, length)
    cam_to_img = label_info['calib']     # 3x4 projection matrix (P2)
    # recover the global yaw from the local angle and the viewing ray
    rot_y = alpha / 180 * np.pi + np.arctan(center[0] / center[2])
    # enumerate the 8 corners of the 3D box and project each into the image
    for i in [1, -1]:
        for j in [1, -1]:
            for k in [0, 1]:
                point = np.copy(center)
                point[0] = center[0] + i * dims[1] / 2 * np.cos(-rot_y + np.pi / 2) \
                    + (j * i) * dims[2] / 2 * np.cos(-rot_y)
                point[2] = center[2] + i * dims[1] / 2 * np.sin(-rot_y + np.pi / 2) \
                    + (j * i) * dims[2] / 2 * np.sin(-rot_y)
                point[1] = center[1] - k * dims[0]
                point = np.append(point, 1)          # homogeneous coordinates
                point = np.dot(cam_to_img, point)    # project to image plane
                point = point[:2] / point[2]
                point = point.astype(np.int16)
                box_3d.append(point)
    # draw the vertical edges and collect the front-face corners
    front_mark = []
    for i in range(4):
        point_1_ = box_3d[2 * i]
        point_2_ = box_3d[2 * i + 1]
        cv2.line(img, (point_1_[0], point_1_[1]), (point_2_[0], point_2_[1]), (0, 255, 0), 1)
        if i == 0 or i == 3:
            front_mark.append((point_1_[0], point_1_[1]))
            front_mark.append((point_2_[0], point_2_[1]))
    # cross on the front face marks the heading direction
    cv2.line(img, front_mark[0], front_mark[-1], (255, 0, 0), 1)
    cv2.line(img, front_mark[1], front_mark[2], (255, 0, 0), 1)
    # draw the top and bottom rectangles
    for i in range(8):
        point_1_ = box_3d[i]
        point_2_ = box_3d[(i + 2) % 8]
        cv2.line(img, (point_1_[0], point_1_[1]), (point_2_[0], point_2_[1]), (0, 255, 0), 1)
    return img

def main():
    # image_dir / calib_dir / predi_dir / label_dir / box3d_dir and all_image
    # are assumed to be defined elsewhere
    for f in all_image:
        image_file = os.path.join(image_dir, f)
        calib_file = os.path.join(calib_dir, f.replace('png', 'txt'))
        predi_file = os.path.join(predi_dir, f.replace('png', 'txt'))
        label_file = os.path.join(label_dir, f.replace('png', 'txt'))
        box3d_file = os.path.join(box3d_dir, f.replace('png', 'txt'))
        # note: imread's second argument is a read flag, not a color-conversion
        # code, so cv2.COLOR_BGR2RGB does not belong here
        image = cv2.imread(image_file)
        with open(label_file, 'r') as fp:
            label_info = dict()
            for l in fp.readlines():
                l = l.strip().split(' ')
                # KITTI stores alpha in radians; preprocess it to degrees
                # before use, since plot_3d_bbox divides by 180
                label_info['alpha'] = float(l[3])
                label_info['location'] = np.asarray(l[11:14], dtype=float)
                label_info['dimension'] = np.asarray(l[8:11], dtype=float)
                label_info['calib'] = get_calibration_cam_to_image(calib_file)
                image = plot_3d_bbox(image, label_info)
        cv2.imshow('rr', image)
        cv2.waitKey(0)
@fuenwang I'm not sure whether P2 (left image) in the calibration file is the intrinsic camera matrix or the whole projection matrix K[R|T]?
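As far as I know, P2 is the whole 3x4 projection matrix for camera 2 (the left color camera). Because the KITTI images are rectified, the rotation is identity, so the left 3x3 block is the intrinsic matrix K and the fourth column is K times the camera's offset from the reference camera. A sketch with values typical of a KITTI calib file:

```python
import numpy as np

# KITTI's P2 is the full 3x4 projection for camera 2: P2 = K [I | t].
P2 = np.array([[721.5377, 0.0, 609.5593, 44.85728],
               [0.0, 721.5377, 172.854, 0.2163791],
               [0.0, 0.0, 1.0, 0.002745884]])
K = P2[:, :3]                       # intrinsic matrix (rectified rotation = I)
t = np.linalg.inv(K) @ P2[:, 3]     # camera offset in the rectified frame
fx, fy = K[0, 0], K[1, 1]           # focal lengths in pixels
```

The recovered t[0] of roughly 0.06 m matches camera 2 sitting about 6 cm from the reference camera, which is a quick sanity check that the decomposition is right.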