
3d-boundingbox's People

Contributors

fuenwang


3d-boundingbox's Issues

Visualize the 3D coordinate

Hello, I am very new to this research area. How can we draw the 3D bounding boxes? I have read the paper but still don't understand how to project our 3D coordinates onto the 2D image. From your code we can find the dimensions, center point, and angle, but how can we draw them on the image using OpenCV?
-Thank You-
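Drawing the box with OpenCV comes down to projecting each of the eight 3D corners into the image with KITTI's 3x4 calibration matrix (P2) and then connecting the resulting 2D points with `cv2.line`. A minimal sketch of the projection step; the calibration matrix below is a hypothetical example, not a real KITTI calibration:

```python
import numpy as np

def project_to_image(pt_3d, P):
    """Project a 3D point (camera coordinates) to pixel coordinates
    using a 3x4 projection matrix P (KITTI's P2)."""
    pt_h = np.append(pt_3d, 1.0)   # homogeneous coordinates
    uvw = P @ pt_h                 # (u*w, v*w, w)
    return uvw[:2] / uvw[2]        # perspective divide -> (u, v)

# Hypothetical calibration: focal length 700, principal point (600, 180)
P = np.array([[700.,   0., 600., 0.],
              [  0., 700., 180., 0.],
              [  0.,   0.,   1., 0.]])
u, v = project_to_image(np.array([2.0, 1.0, 10.0]), P)
```

Once all eight corners are projected this way, the box is drawn by connecting the right pairs of 2D points with `cv2.line` (see the plotting script in the last issue below).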

Should I crop image before input my own image to the model?

Hi, I am trying to test some of my own images with your code,
but I found that the input to your model is a cropped image instead of a raw image.
Does that mean I have to run 2D bounding-box detection on my own raw images first,
and then feed each crop to your model to get the 3D bounding box?
I am new to deep learning; I hope you won't think it is a silly question.
Thanks.
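That is indeed the usual pipeline: a 2D detector provides the box, and the crop is resized to the network's input resolution before regression. A rough sketch of that preprocessing, assuming a box in pixel coordinates; the 224x224 size is the common choice for VGG-style backbones and is an assumption here (the resize uses plain index sampling so the sketch stays dependency-free; in practice `cv2.resize` would be used):

```python
import numpy as np

def crop_for_model(image, box_2d, size=224):
    """Crop a detected object and resize it for the 3D regression network.
    box_2d = (xmin, ymin, xmax, ymax) in pixel coordinates."""
    xmin, ymin, xmax, ymax = [int(v) for v in box_2d]
    patch = image[ymin:ymax, xmin:xmax]
    # nearest-neighbor resize by sampling row/column indices
    ys = np.linspace(0, patch.shape[0] - 1, size).astype(int)
    xs = np.linspace(0, patch.shape[1] - 1, size).astype(int)
    return patch[np.ix_(ys, xs)]

# Example: a blank KITTI-sized frame and one hypothetical detection
img = np.zeros((375, 1242, 3), dtype=np.uint8)
crop = crop_for_model(img, (100, 50, 300, 200))
```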

Does it work for YOLOv3?

Hi @fuenwang! I am new to deep learning. I am trying to produce a 3D box from an already detected 2D bounding box.

I used YOLOv3 and produced output with 2D bounding boxes. To generate a 3D bounding box from the image, how do I proceed with your code? Do I need to extract the coordinates of these 2D bounding boxes and then pass them, along with the image, to your code to get a 3D bounding box?
Thanks.

Confused about so many angles

I found so many angles in your code, the KITTI labels, and the paper, and I have some questions:
1. As shown in the image, theta (red) = theta_ray (black) + theta_local (blue) - 360°, but the paper says theta = theta_ray + theta_local. I am very confused about this.
2. Is the alpha angle equal to theta_ray? If not, what is the difference?
3. What is the local angle? What is the difference between the local angle and local theta?
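For reference, the usual convention in this line of work: theta_ray is the angle of the camera ray through the object's 2D box center (it is not alpha); alpha (also called theta_local or the observation angle) is what the network regresses; and the global yaw is rot_y = theta_ray + theta_local, wrapped back into (-pi, pi], which is where the apparent "-360" comes from. A small sketch of both quantities; the focal length and principal point below are illustrative values, not KITTI calibration:

```python
import numpy as np

def theta_ray(x_center, focal_length, principal_x):
    """Angle of the camera ray through the 2D box center (radians)."""
    return np.arctan2(x_center - principal_x, focal_length)

def global_yaw(theta_local, t_ray):
    """rot_y = theta_ray + theta_local, wrapped into (-pi, pi]."""
    r = theta_local + t_ray
    return np.arctan2(np.sin(r), np.cos(r))  # the wrap replaces any '-360'

# Example: box center 700 px right of a principal point at x=600, f=700
t = theta_ray(1300.0, 700.0, 600.0)   # 45 degrees
```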

Bad result, training loss error

Thank you so much for your code,
but the loss never converges,

and when I test, the result is still bad.

So, may I ask how many epochs you trained and how the loss changed?

That would be very helpful.

Confused about confidence_multi and confidence

Thanks for your great work. I want to know the meaning of intervalAngle / 2 + self.overlap versus intervalAngle / 2, and what the difference is between confidence_multi and confidence in dataset.py. Thanks.
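I can't speak for the author, but the pattern matches the MultiBin setup: the orientation range is split into bins of width intervalAngle that overlap by self.overlap. `confidence` flags only the single bin closest to the angle, while `confidence_multi` uses the wider threshold intervalAngle / 2 + self.overlap, so an angle near a bin boundary gets a positive label in both neighboring bins. A sketch of that idea (the bin count and overlap are illustrative, not the repo's actual values):

```python
import numpy as np

def bin_confidences(angle, num_bins=2, overlap=0.1):
    """MultiBin-style targets: 'confidence' flags only the closest bin;
    'confidence_multi' also flags bins within interval/2 + overlap."""
    interval = 2 * np.pi / num_bins
    centers = np.arange(num_bins) * interval
    # smallest absolute angular distance to each bin center
    diff = np.abs(np.arctan2(np.sin(angle - centers), np.cos(angle - centers)))
    confidence = (diff == diff.min()).astype(float)
    confidence_multi = (diff < interval / 2 + overlap).astype(float)
    return confidence, confidence_multi
```

With two bins (centers at 0 and pi), an angle of 1.6 rad sits near the boundary: `confidence` picks only the pi bin, but `confidence_multi` flags both, which is exactly the difference the two thresholds create.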

Question for the

Hi @fuenwang,
Can you explain the function of these two pieces of code to me? Does the execution of the second depend on the execution of the first?
Thanks, man!

insert video

I am a student trying to implement a single-camera ranging system. I am getting 2D bounding-box information on the KITTI dataset using YOLOv5, and I want to reduce the distance-measurement error through angle regression.
After training and evaluating on the KITTI data, I want to get the dimension, angle, and orientation of the object in front by feeding in a video, rather than a custom dataset or the KITTI dataset. Could you help me with which parts to change?
I'm still a college student and haven't studied vision for long, so I don't know how. I'd like to ask for help if you don't mind.
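One way to structure this is to keep the per-image pipeline unchanged and just iterate over video frames, running the 2D detector and then the 3D network on each crop. A hypothetical sketch — `detect_2d` and `regress_3d` are placeholders for YOLOv5 and this repo's network, and any iterable of frames works (in practice you would pull frames from `cv2.VideoCapture('drive.mp4')` with `ok, frame = cap.read()`):

```python
import numpy as np

def run_on_video(frames, detect_2d, regress_3d):
    """Per-frame pipeline: 2D detection -> crop -> 3D regression.
    detect_2d(frame) yields (xmin, ymin, xmax, ymax) boxes;
    regress_3d(crop) stands in for the dimension/orientation network."""
    results = []
    for idx, frame in enumerate(frames):
        for box in detect_2d(frame):
            xmin, ymin, xmax, ymax = box
            crop = frame[ymin:ymax, xmin:xmax]
            results.append((idx, box, regress_3d(crop)))
    return results

# Tiny demo with stubbed detector/regressor on two blank frames
frames = [np.zeros((10, 10, 3)), np.zeros((10, 10, 3))]
dets = run_on_video(frames,
                    detect_2d=lambda f: [(2, 2, 6, 6)],
                    regress_3d=lambda crop: crop.shape)
```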

pre-trained model

Can someone share a pre-trained model for reproducing some results?

I COME BACK, hahaha

Sorry to come back again... can you explain how to get T?
If we don't get the location T = (tx, ty, tz), we can't get rotation_y.
Help me...
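For what it's worth, in the Deep3DBox paper T is not regressed directly: it is recovered by requiring that the 3D box (built from the estimated dimensions and yaw) projects tightly into the 2D detection box, which yields four linear constraints on T solvable by least squares. A simplified sketch of that step, assuming we already know which 3D corner touches each 2D side (the full method enumerates these correspondences):

```python
import numpy as np

def solve_translation(P, R, corners_obj, box_2d, touching):
    """Recover T from the tight-fit constraint: for each 2D side
    (xmin, ymin, xmax, ymax), the touching 3D corner's projection
    must land exactly on that side. `touching` gives the index of
    that corner for each side."""
    M, p4 = P[:, :3], P[:, 3]
    xmin, ymin, xmax, ymax = box_2d
    sides = [(0, xmin), (1, ymin), (0, xmax), (1, ymax)]
    A, b = [], []
    for (axis, value), ci in zip(sides, touching):
        Xc = R @ corners_obj[ci]                 # rotated corner (no T yet)
        a = M[axis] - value * M[2]               # linear constraint row on T
        A.append(a)
        b.append(-(a @ Xc) - (p4[axis] - value * p4[2]))
    T, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return T
```

Once T is known, rotation_y follows from alpha via rot_y = alpha + arctan2(tx, tz), as discussed in the angle issue above.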

Script for drawing the 3D box; can be tested with ground truth only

If anyone wants to draw a box on the image from the KITTI 3D labels, or from the model's predictions, here is a script you can use as a reference:

import os

import cv2
import numpy as np

# `all_image`, the *_dir paths, and get_calibration_cam_to_image() come from
# the poster's setup and are assumed to be defined elsewhere.


def plot_3d_bbox(img, label_info):
    print('current label info: ', label_info)
    alpha = label_info['alpha']
    # theta_ray = label_info['theta_ray']
    box_3d = []
    center = label_info['location']
    dims = label_info['dimension']
    cam_to_img = label_info['calib']
    # KITTI's alpha is already in radians, so no degree conversion is needed:
    # rot_y = alpha + theta_ray, with theta_ray recovered from the center.
    rot_y = alpha + np.arctan2(center[0], center[2])

    # build the 8 corners of the box in camera coordinates
    for i in [1, -1]:
        for j in [1, -1]:
            for k in [0, 1]:
                point = np.copy(center)
                point[0] = center[0] + i * dims[1] / 2 * np.cos(-rot_y + np.pi / 2) \
                    + (j * i) * dims[2] / 2 * np.cos(-rot_y)
                point[2] = center[2] + i * dims[1] / 2 * np.sin(-rot_y + np.pi / 2) \
                    + (j * i) * dims[2] / 2 * np.sin(-rot_y)
                point[1] = center[1] - k * dims[0]

                # project the corner to pixel coordinates
                point = np.append(point, 1)
                point = np.dot(cam_to_img, point)
                point = point[:2] / point[2]
                point = point.astype(np.int32)
                box_3d.append(point)

    front_mark = []
    for i in range(4):
        point_1_ = box_3d[2 * i]
        point_2_ = box_3d[2 * i + 1]
        cv2.line(img, (point_1_[0], point_1_[1]), (point_2_[0], point_2_[1]), (0, 255, 0), 1)
        if i == 0 or i == 3:
            front_mark.append((point_1_[0], point_1_[1]))
            front_mark.append((point_2_[0], point_2_[1]))

    # mark the front face with a blue cross
    cv2.line(img, front_mark[0], front_mark[-1], (255, 0, 0), 1)
    cv2.line(img, front_mark[1], front_mark[2], (255, 0, 0), 1)

    for i in range(8):
        point_1_ = box_3d[i]
        point_2_ = box_3d[(i + 2) % 8]
        cv2.line(img, (point_1_[0], point_1_[1]), (point_2_[0], point_2_[1]), (0, 255, 0), 1)

    return img


def main():
    for fname in all_image:
        image_file = os.path.join(image_dir, fname)
        calib_file = os.path.join(calib_dir, fname.replace('png', 'txt'))
        predi_file = os.path.join(predi_dir, fname.replace('png', 'txt'))
        label_file = os.path.join(label_dir, fname.replace('png', 'txt'))
        box3d_file = os.path.join(box3d_dir, fname.replace('png', 'txt'))

        # cv2.imread's second argument is a read flag, not a color-conversion
        # code, so just read the image as-is
        image = cv2.imread(image_file)

        with open(label_file, 'r') as fp:
            label_info = dict()
            for line in fp.readlines():
                line = line.strip().split(' ')
                # DontCare entries carry no valid location/dimensions
                if line[0] == 'DontCare':
                    continue

                label_info['alpha'] = float(line[3])
                label_info['location'] = np.asarray(line[11:14], dtype=float)
                label_info['dimension'] = np.asarray(line[8:11], dtype=float)
                label_info['calib'] = get_calibration_cam_to_image(calib_file)

                image = plot_3d_bbox(image, label_info)
        cv2.imshow('rr', image)
        cv2.waitKey(0)


if __name__ == '__main__':
    main()
