fuenwang / 3D-BoundingBox
PyTorch implementation for 3D Bounding Box Estimation Using Deep Learning and Geometry
Hello, I am very new to this research. How can we draw the 3D bounding boxes? I have read the paper but still don't understand how to map from the 3D coordinates to the 2D image. From your code we can get the dimensions, center point, and angle, but how do we draw the box on the image using OpenCV?
Thank you!
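For anyone else stuck on this step: projecting a 3D point into the image only needs the 3x4 camera projection matrix (P2 in the KITTI calib files). A minimal sketch with made-up but KITTI-like numbers:

```python
import numpy as np

# Hypothetical values for illustration: a 3x4 KITTI-style projection matrix P2
# and one 3D box corner in camera coordinates (x right, y down, z forward).
P2 = np.array([[721.5, 0.0, 609.6, 44.9],
               [0.0, 721.5, 172.9, 0.2],
               [0.0, 0.0, 1.0, 0.003]])
corner_3d = np.array([1.0, 1.5, 10.0])   # meters, camera frame

corner_hom = np.append(corner_3d, 1.0)   # homogeneous coordinates
pixel = P2 @ corner_hom                  # project: (3x4) @ (4,)
u, v = pixel[:2] / pixel[2]              # divide by depth to get pixel coords
```

Do this for all eight corners and connect them with cv2.line to draw the box.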
Hi, I am trying to test some of my own images with your code, but I found that the input of your model is a cropped image instead of a raw image. Does that mean I have to run a 2D bounding box detector on my raw images first, and then feed each cropped image to your model to get the 3D bounding box? I am new to deep learning, so I hope this isn't a silly question.
Thanks.
Hi @fuenwang! I am new to deep learning. I am trying to produce a 3D box from already detected 2D bounding boxes. I used YOLOv3 to produce the 2D boxes. How do I proceed with your code to generate the 3D bounding boxes? Do I need to extract the coordinates of these 2D boxes and then pass them along with the image to your code?
Thanks.
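In case it helps: yes, the pipeline is two-stage, and the second stage only sees per-object crops. A minimal sketch of the cropping step, using a hypothetical detection and a dummy image:

```python
import numpy as np

# Hypothetical 2D detection (x1, y1, x2, y2) from a detector such as YOLO,
# and a dummy KITTI-sized frame; the crop is what the 3D network consumes.
image = np.zeros((375, 1242, 3), dtype=np.uint8)
x1, y1, x2, y2 = 100, 150, 300, 320      # example box, pixel coordinates

crop = image[y1:y2, x1:x2]               # per-object patch
# resize this patch to the network's expected input size (e.g. 224x224)
# before the forward pass
```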
I found many different angles in your code, the KITTI labels, and the paper, and I have some questions:
1. As shown in the image, theta (red) = theta_ray (black) + theta_local (blue) - 360, but the paper says theta = theta_ray + theta_local. I am very confused about this.
2. Is the alpha angle equal to theta_ray? If not, what is the difference?
3. What is the local angle? What is the difference between the local angle and the local theta?
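My understanding of the convention (not the author's official answer): alpha is the local/observation angle the network regresses, theta_ray is the angle of the viewing ray from the camera to the object center, and the global yaw is their sum; the "-360" is just wrapping the sum back into the valid angle range. A small sketch with illustrative values:

```python
import numpy as np

def wrap_to_pi(a):
    """Wrap any angle into [-pi, pi)."""
    return (a + np.pi) % (2 * np.pi) - np.pi

# Hypothetical object at x = 5 m, z = 10 m in the camera frame.
x, z = 5.0, 10.0
theta_ray = np.arctan2(x, z)              # ray from camera center to object
alpha = 0.3                               # local angle predicted by the net
rot_y = wrap_to_pi(alpha + theta_ray)     # global yaw; the wrap is the "-360"
```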
Hi! What is the procedure for annotating a custom dataset?
Thank you so much for your code, but the loss never converges, and when I test, the results are still bad. May I ask how many epochs you trained for, and how the loss changed over training? That would be very helpful.
Thanks for your great work. I want to know the meaning of intervalAngle / 2 + self.overlap versus intervalAngle / 2, and what the difference is between confidence_multi and confidence in dataset.py. Thanks.
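My reading (not the author's): dataset.py builds MultiBin-style confidence targets, and the overlap lets an angle that falls near a bin boundary activate both neighboring bins instead of just one. A hedged sketch with illustrative values, reusing the intervalAngle/overlap names from dataset.py:

```python
import numpy as np

# Sketch of MultiBin-style confidence labels, assuming 2 bins over [0, 2*pi)
# with an overlap margin; all values here are illustrative.
num_bins = 2
interval = 2 * np.pi / num_bins          # intervalAngle
overlap = 0.1                            # extra margin on each side of a bin

angle = 1.0                              # example local angle in [0, 2*pi)
centers = np.arange(num_bins) * interval
# circular distance from the angle to each bin center
diff = np.abs((angle - centers + np.pi) % (2 * np.pi) - np.pi)
# a bin is "responsible" if the angle lies within half-interval + overlap
confidence = (diff < interval / 2 + overlap).astype(float)
```

Without the overlap term, each angle would fall into exactly one bin; with it, boundary angles produce two positive bins, which is what confidence_multi captures.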
Hi @fuenwang,
Can you explain the function of these two code snippets to me? Does the execution of the second depend on the execution of the first?
Thanks!
I am a student trying to implement a single-camera ranging system. I am getting 2D bounding boxes from the KITTI dataset using YOLOv5 and trying to reduce the error in distance measurement through angle regression.
After training and evaluating on KITTI, I want to get the dimension, angle, and orientation of objects in front of the camera by feeding in video, rather than a custom dataset or the KITTI dataset. Could you help me with which parts to change?
I'm still a college student and haven't studied vision for long, so I'd appreciate any help.
Can someone share a pre-trained model for reproducing some results?
When I use the location from the ground truth together with the dimensions and R_y predicted by the model, I can plot a good 3D bounding box on the test image. However, I want to plot the 3D bounding box without any assistance from the ground truth. How can I infer the location (or the eight corner points) for each object?
Futen, awesome work!
Sorry, one more question: can you explain how to get T? If we don't have the location T = (tx, ty, tz), we can't get rotation_y.
Please help!
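For others hitting this: in the paper, T is not predicted by the network. It is recovered by requiring the projected 3D box to fit tightly inside the 2D detection box, which yields a small linear system. A simplified sketch of that idea (the real method tries every corner-to-edge assignment and keeps the lowest residual; here one assignment is passed in, and all names are illustrative):

```python
import numpy as np

def solve_translation(P, R, corners_obj, box2d, picks):
    """Least-squares T from the constraint that chosen 3D corners project
    onto the four sides of the 2D box.
    P: 3x4 projection matrix, R: 3x3 yaw rotation,
    corners_obj: (8, 3) corners in the object frame,
    box2d: (xmin, ymin, xmax, ymax),
    picks: indices of corners assumed to touch left/top/right/bottom edges."""
    xmin, ymin, xmax, ymax = box2d
    rows = [0, 1, 0, 1]                   # u-rows for x sides, v-rows for y
    sides = [xmin, ymin, xmax, ymax]
    A, b = [], []
    for row, side, idx in zip(rows, sides, picks):
        # the constraint u = (m_row . h)/(m_3 . h) = side is linear in T
        m = P[row] - side * P[2]          # combined row, length 4
        Xc = R @ corners_obj[idx]         # rotated corner, object -> camera
        A.append(m[:3])
        b.append(-m[:3] @ Xc - m[3])
    T, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return T
```

Once T is known, theta_ray = arctan(tx / tz) follows, which is what connects alpha to rotation_y.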
If anyone wants to test how to use a KITTI 3D label to draw a box on an image, or how to draw a final box from the model prediction, here is a script you can reference:
import os
import numpy as np
import cv2

def plot_3d_bbox(img, label_info):
    print('current label info: ', label_info)
    alpha = label_info['alpha']          # local angle, assumed in degrees here
    box_3d = []
    center = label_info['location']      # (x, y, z) of the box bottom center
    dims = label_info['dimension']       # (height, width, length)
    cam_to_img = label_info['calib']     # 3x4 projection matrix (P2)
    # recover the global yaw from the local angle and the viewing ray
    rot_y = alpha / 180 * np.pi + np.arctan(center[0] / center[2])
    # enumerate the 8 corners of the 3D box and project each into the image
    for i in [1, -1]:
        for j in [1, -1]:
            for k in [0, 1]:
                point = np.copy(center)
                point[0] = center[0] + i * dims[1] / 2 * np.cos(-rot_y + np.pi / 2) \
                    + (j * i) * dims[2] / 2 * np.cos(-rot_y)
                point[2] = center[2] + i * dims[1] / 2 * np.sin(-rot_y + np.pi / 2) \
                    + (j * i) * dims[2] / 2 * np.sin(-rot_y)
                point[1] = center[1] - k * dims[0]
                point = np.append(point, 1)          # homogeneous coordinates
                point = np.dot(cam_to_img, point)    # project to image plane
                point = point[:2] / point[2]
                point = point.astype(np.int16)
                box_3d.append(point)
    # draw the vertical edges and collect the front-face corners
    front_mark = []
    for i in range(4):
        point_1_ = box_3d[2 * i]
        point_2_ = box_3d[2 * i + 1]
        cv2.line(img, (point_1_[0], point_1_[1]), (point_2_[0], point_2_[1]), (0, 255, 0), 1)
        if i == 0 or i == 3:
            front_mark.append((point_1_[0], point_1_[1]))
            front_mark.append((point_2_[0], point_2_[1]))
    # cross on the front face marks the heading direction
    cv2.line(img, front_mark[0], front_mark[-1], (255, 0, 0), 1)
    cv2.line(img, front_mark[1], front_mark[2], (255, 0, 0), 1)
    # draw the top and bottom rectangles
    for i in range(8):
        point_1_ = box_3d[i]
        point_2_ = box_3d[(i + 2) % 8]
        cv2.line(img, (point_1_[0], point_1_[1]), (point_2_[0], point_2_[1]), (0, 255, 0), 1)
    return img

def main():
    # image_dir / calib_dir / predi_dir / label_dir / box3d_dir and all_image
    # are assumed to be defined elsewhere
    for f in all_image:
        image_file = os.path.join(image_dir, f)
        calib_file = os.path.join(calib_dir, f.replace('png', 'txt'))
        predi_file = os.path.join(predi_dir, f.replace('png', 'txt'))
        label_file = os.path.join(label_dir, f.replace('png', 'txt'))
        box3d_file = os.path.join(box3d_dir, f.replace('png', 'txt'))
        # note: imread's second argument is a read flag, not a color-conversion
        # code, so cv2.COLOR_BGR2RGB does not belong here
        image = cv2.imread(image_file)
        with open(label_file, 'r') as fp:
            label_info = dict()
            for l in fp.readlines():
                l = l.strip().split(' ')
                # KITTI stores alpha in radians; preprocess it to degrees
                # before use, since plot_3d_bbox divides by 180
                label_info['alpha'] = float(l[3])
                label_info['location'] = np.asarray(l[11:14], dtype=float)
                label_info['dimension'] = np.asarray(l[8:11], dtype=float)
                label_info['calib'] = get_calibration_cam_to_image(calib_file)
                image = plot_3d_bbox(image, label_info)
        cv2.imshow('rr', image)
        cv2.waitKey(0)
@fuenwang I'm not sure whether P2 (left image) in the calibration file is the intrinsic camera matrix or the whole projection matrix K[R|T]?
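As far as I know, P2 is the whole 3x4 projection matrix for camera 2 (the left color camera). Because the KITTI images are rectified, the rotation is identity, so the left 3x3 block is the intrinsic matrix K and the fourth column is K times the camera's offset from the reference camera. A sketch with values typical of a KITTI calib file:

```python
import numpy as np

# KITTI's P2 is the full 3x4 projection for camera 2: P2 = K [I | t].
P2 = np.array([[721.5377, 0.0, 609.5593, 44.85728],
               [0.0, 721.5377, 172.854, 0.2163791],
               [0.0, 0.0, 1.0, 0.002745884]])
K = P2[:, :3]                       # intrinsic matrix (rectified rotation = I)
t = np.linalg.inv(K) @ P2[:, 3]     # camera offset in the rectified frame
fx, fy = K[0, 0], K[1, 1]           # focal lengths in pixels
```

The recovered t[0] of roughly 0.06 m matches camera 2 sitting about 6 cm from the reference camera, which is a quick sanity check that the decomposition is right.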