
vita-epfl / monoloco

416 stars · 22 watchers · 80 forks · 106.47 MB

A 3D vision library from 2D keypoints: monocular and stereo 3D detection for humans, social distancing, and body orientation.

Home Page: https://vita.epfl.ch/monoloco

License: Other

Languages: Python 80.34%, C++ 19.61%, CMake 0.05%
Topics: 3d-object-detection, 3d-detection, 3d-deep-learning, pytorch, computer-vision, deep-learning, machine-learning, pose-estimation, human-pose-estimation, uncertainty

monoloco's People

Contributors

bertoni9, charlesbvll, george-adaimi, hcv1027


monoloco's Issues

Need help with generating the ground truth file

Lorenzo, good afternoon.
Help me figure out a few points about preparing data so that monoloco can work with other cameras.

  1. The description of the Prediction section says that, for more accurate results, a ground-truth file should be used, and that it can be obtained by following the instructions in the Preprocessing section. But this is where it becomes unclear, because that section describes how to obtain the ground-truth file from the KITTI / nuScenes datasets. How do I obtain a ground-truth file for my own scene? Do we understand correctly that we need to collect our own dataset from our scene and train the model on it, and that the required ground-truth file can then be produced from that dataset?

  2. The project also mentions using a calibration matrix, which gives more accurate results. Your project uses a standard calibration matrix designed for a resolution of 1600 × 900, but my video has a resolution of 1280 × 720, so monoloco's results differ considerably from the real ones. How can I obtain the calibration matrix for my image size? (See the sketch below.)

additional question from issue #45

Hello bertoni,

I have an additional question, following up on issue #45, regarding the training label file.

The file name in the training instruction command is "joints-kitti-201202-1743.json", but the linked file is named "joints-kitti-200604-0939.json". Do the two files (200604 vs. 201202) have the same contents?

thank you

Reading an Image from ros-topic

I am using Monoloco in a mobile-robot application based on ROS that I am currently developing; if you don't mind, I need some help with Monoloco.
I am quite familiar with running Monoloco on pictures using the --glob argument, as mentioned in the repo. Now I am trying to apply Monoloco to an image coming from a ROS topic.

  1. Can Monoloco be used in this scenario, or must the image exist as a file on the system?
  2. From the help output I noticed that the glob argument is used to pass image(s) to Monoloco. What is the image argument then, and how can I use it? (A conversion sketch follows below.)

This is not an issue with the network, which works very well; it is just a problem I have been struggling with around Monoloco, so any hint would be great.
Thanks in advance.
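A minimal sketch, assuming a ROS 1 node with cv_bridge available: convert an incoming sensor_msgs/Image into an in-memory NumPy/OpenCV array, so the pipeline can consume frames without files on disk. The run_monoloco_on_array hook is hypothetical, standing in for however you invoke the model on an in-memory image:

import rospy
from sensor_msgs.msg import Image
from cv_bridge import CvBridge

bridge = CvBridge()

def image_callback(msg):
    # sensor_msgs/Image -> NumPy array in OpenCV's BGR channel order
    frame = bridge.imgmsg_to_cv2(msg, desired_encoding='bgr8')
    # run_monoloco_on_array(frame)  # hypothetical hook into your pipeline

rospy.init_node('monoloco_listener')
rospy.Subscriber('/camera/image_raw', Image, image_callback)
rospy.spin()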

Using monoloco with Openpifpaf 0.11.6

Hi, I tried to integrate openpifpaf 0.11.6 into monoloco, since 0.11.6 is much faster than 0.9.0, but the monoloco results were way off. The key difference I observed between the two versions is that 0.9.0 provides estimates for all keypoints even when certain joints are invisible or occluded, whereas 0.11.6 simply returns zeros for those joints.

Can you confirm whether this is the reason? Or are there other reasons why the results are so far off, and are there ways around it?

By the way, awesome work. I really like the quality of the projects.

Question about [geometric_baseline]

Hi. Thank you for your open-source program.
I have one question about the geometric_baseline code:
in the geometric method, how were the following matrix and vector (matrix, bb) obtained?

import numpy as np


def compute_distance(xyz_norm_1, xyz_norm_2, average_y, mode='average', dy_met=0):
    """
    Compute the distance Z of a mask annotation (solving a linear system) for 2 possible cases:
    1. knowing the specific height of the annotation (head-ankle), dy_met
    2. using the mean height of people (average_y)
    """
    assert mode in ('average', 'real')

    x1 = float(xyz_norm_1[0])
    y1 = float(xyz_norm_1[1])
    x2 = float(xyz_norm_2[0])
    y2 = float(xyz_norm_2[1])
    xx = (x1 + x2) / 2

    # Choose whether to solve for the provided height or the average one.
    if mode == 'average':
        cc = -average_y  # Y axis points down
    else:
        cc = -dy_met

    # Solve the linear system Ax = b in the least-squares sense
    matrix = np.array([[y1, 0, -xx],
                       [0, -y1, 1],
                       [y2, 0, -xx],
                       [0, -y2, 1]])

    bb = np.array([cc * xx, -cc, 0, 0]).reshape(4, 1)
    sol = np.linalg.lstsq(matrix, bb, rcond=None)
    z_met = abs(float(sol[0][1]))  # abs accounts for specularity behind the observer

    return z_met
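A hypothetical usage sketch: xyz_norm_1 / xyz_norm_2 are the normalized camera coordinates of the head and ankle keypoints (the pixel multiplied by the inverse intrinsics), and 1.70 m is an assumed average head-to-ankle height:

head = [0.05, -0.20, 1.0]
ankle = [0.05, 0.15, 1.0]
z = compute_distance(head, ankle, average_y=1.70)
print(round(z, 2))  # estimated depth in metres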

--mode keypoints Error

python3 -m monoloco.run predict m2.jpeg --mode keypoints

Error message:

File "monoloco/predict.py", line 261, in predict
    avg_time = int(np.mean(timing))

ValueError: cannot convert float NaN to integer

How to apply monoloco to each frame of a video?

Hello,
I am trying to use monoloco for a video file. At the moment my code has the following simple structure:
(screenshot of the script's structure omitted)
I would like to call monoloco where it says "detect pose". I am looking for a way to apply monoloco to each frame of the video, and it should be as fast as possible. That is why I tried, for example, loading the model before looping through the frames, which did not work. Is there an easy way of calling monoloco within my script for each frame, similar to the openpifpaf API described here (see the sketch below)? I don't want to save each frame as a .png file and then run monoloco. Thanks :)
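A minimal sketch of the intended structure, assuming the model can be loaded once and applied per frame; load_model and predict_frame are hypothetical stand-ins, since monoloco does not document a per-frame video API here:

import cv2

model = load_model()  # hypothetical: load the weights once, outside the loop
cap = cv2.VideoCapture('video.mp4')
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # detect pose + distance on the in-memory frame, with no .png round-trip
    predictions = predict_frame(model, frame)  # hypothetical
cap.release()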

How to use output of PifPaf plugins (Crowdpose)

Hi,
Great Work with Monoloco and Pifpaf.

I want to train Monoloco on my custom dataset, where I don't have all 17 keypoints: I will have 15 keypoints and want to estimate depth from them. So I was thinking of using a PifPaf plugin for training and then feeding those keypoints into Monoloco.

How can I pass the output of PifPaf on a custom dataset? Can you please explain how the entire process would work?

Thanks,
Jagdish Bhanushali

Pose track plugin for video

Hello, the project is exciting. I would like to ask whether it is possible to integrate the pose-track plugin of OpenPifPaf (https://github.com/openpifpaf/openpifpaf_posetrack) into Monoloco, so that it could assign a person tracking ID across a video clip. I have tried integrating a piece of OpenPifPaf's core code into Monoloco to extract each person's pose data per frame, but Monoloco could not find the PoseTrack decoder, even though I installed openpifpaf-posetrack in the conda environment. Could you give some suggestions? Thank you in advance.

Problem about --keypoints command

I apologize for my ignorance.
When I try to use --keypoints, I get the following error:

ValueError: cannot convert float NaN to integer

It was running fine until two months ago.

Here is the log.

python3 -m monoloco.run predict sample.jpg --mode keypoints

INFO:monoloco.predict:Force complete pose is active
INFO:openpifpaf.predictor:neural network device: cuda (CUDA available: True, count: 1)
INFO:openpifpaf.decoder.cifcaf:5 annotations: [12, 12, 11, 6, 0]
INFO:openpifpaf.predictor:batch 0: sample.jpg
0 image sample.jpg saved as out_sample.jpg
/home/fuchsia/.local/lib/python3.6/site-packages/numpy/core/fromnumeric.py:3373: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)
/home/fuchsia/.local/lib/python3.6/site-packages/numpy/core/_methods.py:170: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.6/dist-packages/monoloco/run.py", line 218, in <module>
    main()
  File "/usr/local/lib/python3.6/dist-packages/monoloco/run.py", line 147, in main
    predict(args)
  File "/usr/local/lib/python3.6/dist-packages/monoloco/predict.py", line 259, in predict
    avg_time = int(np.mean(timing))
ValueError: cannot convert float NaN to integer
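A minimal workaround sketch (my reading of the log, not a maintainer fix): in keypoints mode no timing samples are recorded, so np.mean([]) returns NaN and int(NaN) raises. Guarding the average in predict.py around the failing line avoids the crash:

import numpy as np

timing = []  # stays empty in --mode keypoints
avg_time = int(np.mean(timing)) if timing else 0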

Installing monoloco on Jetson

Hi,

I am new to the area and tried to install monoloco as recommended by running
pip3 install monoloco
However, it failed with the error:
"Collecting torch>=1.7.1 (from openpifpaf>=v0.12.1->monoloco)
Could not find a version that satisfies the requirement torch>=1.7.1 (from openpifpaf>=v0.12.1->monoloco) (from versions: 0.1.2, 0.1.2.post1, 0.1.2.post2)
No matching distribution found for torch>=1.7.1 (from openpifpaf>=v0.12.1->monoloco)
"

I am running the command on my Jetson Xavier NX module with PyTorch 1.8.0 installed;
when I run pip3 freeze, torch==1.8.0 is shown, and import torch succeeds in python3.

I guess that means the torch requirement is satisfied, right?
Could you please give some advice on how to fix this?
Thanks in advance.
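A possible workaround sketch (an assumption about the cause, not an official fix): on Jetson, PyPI has no matching torch wheel for the platform, so pip's resolver fails even though a compatible torch is already installed; skipping dependency resolution and installing any remaining requirements by hand may work:

python3 -m pip install --no-deps monoloco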

Question about [training process of the prediction network]

When training your prediction network, where do the ground-truth distance μ and spread b come from?

From Section 5 of the original paper, I am still unclear about how to obtain input-output pairs of 2D joints and distances, especially how to get the ground-truth distance μ and spread b for the various kinds of pedestrians.

This is really important for training a prediction network. I hope you can give more tips, explanations or references.
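For reference, a minimal sketch of a Laplace negative log-likelihood of the kind the paper describes: the ground-truth distance comes from the dataset's 3D labels, while the spread b is not a label at all but a second network output learned by the loss itself; the log-b parametrization here is my own assumption, used for numerical stability:

import torch

def laplace_nll(d_gt, mu, log_b):
    """L = |d - mu| / b + log(2b), with b = exp(log_b) predicted by the network."""
    b = torch.exp(log_b)
    return (torch.abs(d_gt - mu) / b + torch.log(2 * b)).mean()

d_gt = torch.tensor([12.0, 25.0])           # ground-truth distances in metres
mu = torch.tensor([11.5, 26.0])             # predicted distances
log_b = torch.zeros(2, requires_grad=True)  # predicted log-spread
loss = laplace_nll(d_gt, mu, log_b)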

Evaluation data (KITTI) link is down ?

Hi,
I am trying to download the evaluation data, but it seems the links don't work. Could you check the links below?

Mono3D: download validation files from here and save them into data/kitti/m3d
3DOP: download validation files from here and save them into data/kitti/3dop

inverse of camera intrinsic matrix kk

Hi, I have 2 questions:

  1. I understand that the pixel_to_camera function back-projects the image keypoints to camera coordinates before they are fed into the monoloco network to estimate z. But I'm having trouble reconciling the code with the formula in section 4.1, eq. (3) of your paper: the order of the matrix multiplication between uv and the inverse of kk doesn't seem to match (see the sketch below).

  2. I'm looking at a use case with a camera mounted at a tilted angle. I wonder whether multiplying by the inverse of the extrinsic camera matrix (rotation + translation) would improve the depth estimation. Has your team tested this scenario before?

Appreciate your advice. Thanks
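A minimal sketch of the eq. (3)-style back-projection under the standard pinhole model, which may explain the apparent mismatch: a pixel (u, v) maps to the normalized ray inv(K) [u, v, 1]^T, and whether code writes this as inv(K) @ uv or as uv @ inv(K).T is only a column-vector vs. row-vector convention; both produce the same ray:

import numpy as np

kk = np.array([[718.3351, 0., 600.3891],
               [0., 718.3351, 181.5122],
               [0., 0., 1.]])  # KITTI-style intrinsics

uv1 = np.array([640., 200., 1.])     # homogeneous pixel coordinates
ray_col = np.linalg.inv(kk) @ uv1    # column-vector convention
ray_row = uv1 @ np.linalg.inv(kk).T  # row-vector convention
assert np.allclose(ray_col, ray_row)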

FileNotFoundError: [Errno 2] No such file or directory: '/root/.cache/torch/hub/checkpoints/shufflenet

We tried to test the monoloco following code repo on google colab.

!python -m monoloco.run predict 'docs/002282.png' \
    --path_gt '/content/monoloco/docs/names-kitti-200615-1022.json' \
    -o '/content/monoloco' \
    --long-edge 2500 \
    --n_dropout 0

We faced the following error.

INFO:monoloco.predict:Downloading OpenPifPaf model in /root/.cache/torch/hub/checkpoints
Downloading...
From: https://drive.google.com/uc?id=1b408ockhh29OLAED8Tysd2yGZOo0N_SQ
To: /root/.cache/torch/hub/checkpoints/shufflenetv2k30-201104-224654-cocokp-d75ed641.pkl
Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.7/dist-packages/monoloco/run.py", line 202, in <module>
    main()
  File "/usr/local/lib/python3.7/dist-packages/monoloco/run.py", line 131, in main
    predict(args)
  File "/usr/local/lib/python3.7/dist-packages/monoloco/predict.py", line 149, in predict
    args, dic_models = factory_from_args(args)
  File "/usr/local/lib/python3.7/dist-packages/monoloco/predict.py", line 104, in factory_from_args
    dic_models = download_checkpoints(args)
  File "/usr/local/lib/python3.7/dist-packages/monoloco/predict.py", line 64, in download_checkpoints
    DOWNLOAD(OPENPIFPAF_MODEL, pifpaf_model, quiet=False)
  File "/usr/local/lib/python3.7/dist-packages/gdown/download.py", line 90, in download
    f = open(tmp_file, 'wb')
FileNotFoundError: [Errno 2] No such file or directory: '/root/.cache/torch/hub/checkpoints/shufflenetv2k30-201104-224654-cocokp-d75ed641.pkl9kpqm5tgtmp'
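A possible workaround sketch (my assumption about the root cause, not a confirmed fix): gdown fails to open its temporary file because the checkpoint cache directory does not exist yet on the Colab machine, so creating it before running predict may help:

import os
os.makedirs('/root/.cache/torch/hub/checkpoints', exist_ok=True)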

Incorrect bird's eye view plot

Hi,

Please see the attached images, where the bird's-eye view seems to be incorrect. I wonder whether this is because of the plot settings.

(attached images: front view and bird's-eye view of frame 028)

Thanks for your nice work.

How do you learn body orientation?

Thanks for your great work

I want to study body orientation further, but I cannot find in your papers how your model learns body orientation.

Could you tell me how body orientation is learned in your model?
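For context, a common way to regress an angle, sketched below, is to predict its sine and cosine and recover the angle with atan2, which avoids the wrap-around discontinuity at ±π; whether this matches the repository's exact implementation is my assumption, not a quote of the code:

import torch

raw = torch.randn(4, 2)  # hypothetical network outputs: one (sin, cos) pair per person
sin_cos = raw / raw.norm(dim=1, keepdim=True)      # project onto the unit circle
theta = torch.atan2(sin_cos[:, 0], sin_cos[:, 1])  # orientation angle in radians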

Problem about pifpaf model

First of all, thank you for your work, but I have a problem running your predict code.

I ran into this error:

assert DOWNLOAD is not None, "install gdown to download pifpaf model, or pass it as --checkpoint"
AssertionError: install gdown to download pifpaf model, or pass it as --checkpoint

How can I solve it?

Import processor_factory and preprocess_factory from openpifpaf

Hi Lorenzo,

Sorry to disturb you again, and forgive me if this is a naive problem; I am a beginner and much appreciate your help.

Since I failed to set up monoloco on my Jetson devices, I turned to a desktop computer this time. However, there seems to be an inconsistency between the new monoloco and openpifpaf. When I run the monoloco demo, it says ImportError: cannot import name 'processor_factory' from 'openpifpaf.predict'.
processor_factory and preprocess_factory are called on lines 164 and 165 of predict.py and imported at the top. However, I couldn't find their definitions in openpifpaf.predict or elsewhere.

Best,
Pheo

evaluation result question from monoloco++ article

Hello,
thanks for the feedback. I can now train the model with the command from the instructions,
"python -m monoloco.run train --joints data/arrays/joints-kitti-201202-1743.json --save --monocular"
and run eval mode as described in the instructions
(a few minor modifications were needed to run it, but I got results).

When I compare my results with TABLE I of the article (https://arxiv.org/pdf/2009.00984.pdf), I see the following differences:

From the article, TABLE I:
Method     | Dataset | Instances | ALE: Easy / Mod / Hard / All                      | ALA: <0.5m / <1m / <2m
MonoLoco++ | KITTI   | 1799      | 0.69 [90%] / 0.71 [66%] / 1.37 [31%] / 0.76 [70%] | 37.4 / 53.2 / 63.6

From my own run, following the instructions:
Method     | Dataset | Instances | ALE: Easy / Mod / Hard / All                      | ALA: <0.5m / <1m / <2m
MonoLoco++ | KITTI   | 1799      | 0.65 [80%] / 0.61 [52%] / 1.07 [17%] / 0.67 [58%] | 32.8 / 46.31 / 54.34

Could you please confirm whether the reproduction should give the same results as TABLE I?

thank you

Draw skeleton with pifpaf fails

Hi! First of all, thanks for your great work! The code quality is much higher than what I saw in other DL projects. Congratulations on the ICCV paper. 👍

I am trying to save the skeleton and keypoints prediction and I got the following error:

Ground-truth file not found
Using a standard calibration matrix...
Traceback (most recent call last):
  File "/Users/pliu/anaconda3/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Users/pliu/anaconda3/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/pliu/github/monoloco/monoloco/run.py", line 157, in <module>
    main()
  File "/Users/pliu/github/monoloco/monoloco/run.py", line 100, in main
    predict.predict(args)
  File "/Users/pliu/github/monoloco/monoloco/predict.py", line 78, in predict
    factory_outputs(args, images_outputs, output_path, pifpaf_outputs, dic_out=dic_out, kk=kk)
  File "/Users/pliu/github/monoloco/monoloco/predict.py", line 105, in factory_outputs
    keypoint_painter.keypoints(ax, keypoint_sets)

The command I used is:

python3 -m monoloco.run predict --glob /Users/pliu/Downloads/1000289627384.jpeg --output_types combined json keypoints skeleton --scale 2 --model data/models/monoloco-190513-1437.pkl  --z_max 50 --networks monoloco pifpaf

I looked at the argument list of keypoint_painter.keypoints and I am quite puzzled:

def keypoints(self, ax, keypoint_sets, *,
              skeleton, scores=None, color=None, colors=None, texts=None):

Why is there a * in the middle, even before the skeleton argument? How do I make saving keypoints and skeletons work?

Thanks for your help in advance!
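For reference, a short sketch of what the bare * in that signature means: every parameter after it is keyword-only, so skeleton has no default and must be passed by name (the values below are hypothetical):

def keypoints(ax, keypoint_sets, *, skeleton, scores=None):
    """skeleton is keyword-only: no default value, but it must be named."""
    return skeleton

keypoints(None, [], skeleton=[(1, 2)])  # OK
# keypoints(None, [], [(1, 2)])         # TypeError: too many positional arguments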

"additional minimization objective" for epistemic uncertainty modeling

Hi, thank you for sharing the code of your work!
I read your paper and have a question about the epistemic uncertainty modeling.

In the paper, you formulate an additional minimization objective in eq. 5 for modeling the epistemic uncertainty.
How is this term used in the training (or inference) phase?
I mean, is this term added to the loss used to train the model?
I also looked into your code but could not figure out where this term is used.

Thank you in advance.
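For context, a minimal sketch of how epistemic uncertainty is commonly estimated with MC dropout at inference time: keep dropout active, run several stochastic forward passes, and read the spread of the predictions. The toy model and pass count below are hypothetical placeholders, not the repository's actual API:

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(34, 256), nn.ReLU(),
                      nn.Dropout(p=0.2), nn.Linear(256, 1))  # toy stand-in
model.train()  # keep dropout stochastic at inference time

x = torch.randn(1, 34)  # e.g. 17 keypoints x (u, v)
with torch.no_grad():
    samples = torch.stack([model(x) for _ in range(50)])
epistemic_std = samples.std(dim=0)  # spread across the stochastic passes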

How to use the GPU for faster predictions?

Hello, I installed monoloco and ran a few tests with images. I don't get any errors, and output images with bounding boxes are created. Since the prediction took a little while, I checked the Task Manager and found that my GPU is not being used. To verify, I added a line to predict.py below line 114, right after the device is set:
print(str(args.device))
When running the prediction again, the console prints "cpu". I checked that CUDA and cuDNN are installed correctly, but I am still not able to make use of the graphics card. What can I do to make CUDA available?
I am using Windows 10, CUDA 11.1.1 and cuDNN 8.1.0.
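A quick diagnostic sketch (my assumption about the likely cause: a CPU-only torch wheel, which is common on Windows): check whether PyTorch itself can see the GPU before suspecting monoloco:

import torch

print(torch.__version__)          # a "+cpu" suffix means a CPU-only build
print(torch.cuda.is_available())  # must be True for monoloco to select "cuda"
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))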

Training Issue (Shape mismatch)

Hi,
I'm having a shape error when I run the training process for Kitti dataset.

  _""
  python -m monoloco.run train --joints data/arrays/joints-kitti-201202-1743.json --print_loss --monocular
  ""_

Do I have to pre-process the dataset, I think the preprocessing is already embeded in the code provided ?

Thanks !

Error:

File "/monoloco/monoloco/run.py", line 197, in
main()
File "
/monoloco/monoloco/run.py", line 151, in main
_ = training.train()
File "/monoloco/monoloco/train/trainer.py", line 160, in train
outputs = self.model(inputs)
File "/home/student/miniconda3/envs/pytorch_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(input, **kwargs)
File "
/architectures.py", line 50, in forward
y = self.w1(x)
File "
/miniconda3/envs/pytorch_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(input, **kwargs)
File "/home/student/miniconda3/envs/pytorch_env/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 94, in forward
return F.linear(input, self.weight, self.bias)
File "
/miniconda3/envs/pytorch_env/lib/python3.8/site-packages/torch/nn/functional.py", line 1753, in linear
return torch._C._nn.linear(input, weight, bias)
RuntimeError: mat1 dim 1 must match mat2 dim 0

RuntimeError: CUDA error: invalid device ordinal

Hi,

When I want to run the model on NVIDIA Jetson Nano GPU device, it turns this error:

python -m monoloco.run predict --predict --glob "data/input/*.png" --networks monoloco --output_type front bird --model data/models/monoloco-191018-1459.pkl -o data/output --z_max 22
/home/pyimagesearch/.virtualenvs/py3cv4/lib/python3.6/site-packages/torch/serialization.py:454: SourceChangeWarning: source code of class 'openpifpaf.network.nets.Shell' has changed. you can retrieve the original source code by accessing the object's source attribute or set torch.nn.Module.dump_patches = True and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
INFO:root:selected decoder: PifPaf
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/pyimagesearch/Downloads/monoloco/monoloco/run.py", line 166, in <module>
    main()
  File "/home/pyimagesearch/Downloads/monoloco/monoloco/run.py", line 109, in main
    predict.predict(args)
  File "/home/pyimagesearch/Downloads/monoloco/monoloco/predict.py", line 19, in predict
    monoloco = MonoLoco(model=args.model, device=args.device, n_dropout=args.n_dropout, p_dropout=args.dropout)
  File "/home/pyimagesearch/Downloads/monoloco/monoloco/network/net.py", line 50, in __init__
    self.model.to(self.device)
  File "/home/pyimagesearch/.virtualenvs/py3cv4/lib/python3.6/site-packages/torch/nn/modules/module.py", line 386, in to
    return self._apply(convert)
  File "/home/pyimagesearch/.virtualenvs/py3cv4/lib/python3.6/site-packages/torch/nn/modules/module.py", line 193, in _apply
    module._apply(fn)
  File "/home/pyimagesearch/.virtualenvs/py3cv4/lib/python3.6/site-packages/torch/nn/modules/module.py", line 199, in _apply
    param.data = fn(param.data)
  File "/home/pyimagesearch/.virtualenvs/py3cv4/lib/python3.6/site-packages/torch/nn/modules/module.py", line 384, in convert
    return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
RuntimeError: CUDA error: invalid device ordinal
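A minimal defensive-device sketch (my assumption about the cause: a hard-coded or saved GPU index that does not exist on the Jetson Nano, which exposes its single GPU as cuda:0):

import torch
import torch.nn as nn

model = nn.Linear(3, 1)  # stand-in for the loaded network
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
model.to(device)
print(device)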

set up failed

Hello, I have encountered the following problem; please help me solve it.

ERROR: Could not find a version that satisfies the requirement nuscenes-devkit (from monoloco==0.4.6) (from versions: none)
ERROR: No matching distribution found for nuscenes-devkit (from monoloco==0.4.6)

All outputs in one run?

Dear all,

I need two kinds of results:

  1. The bird and front views of monoloco
  2. The pose estimation views, generated with the shufflenetv2k30-wholebody model.

Is there a way of obtaining these without having to run every video twice?

Any help is appreciated!
Peter

Kitti ground truth files

I am new to deep learning and CV. I am not able to find the exact KITTI ground-truth files for the evaluation. Please direct me to the website where I can find them, and please mention which files need to be downloaded for the KITTI ground-truth evaluation.
(attached screenshot: monoloco kitti error message)

Show 3d boxes

How can I plot 3D boxes? There is a script, "plot_3d_box.py", but its function "correct_boxes" requires some arguments. How should I provide them?

Unexpected results on processed image

Hello!

First of all, thanks for sharing your work; I found it helpful and really interesting. I have a problem with one of the images I processed. In the image there are a woman and a child, and they are really close to each other in the scene. However, when I process the image I get the following results:
(attached image: combined output for frame 01506)
I thought it was because I did not use an intrinsic matrix, so I followed issue #18 and changed the matrix returned by the factory_for_gt function in process.py, but obtained similar results, shown in the following image:
(attached image: combined output for frame 01506, with the modified matrix)
You said in the README that the relative distances are still meaningful, but from these results the relative distances also seem strange. Is there anything I am doing wrong?

Thank you

JLD

about the predict demo

When I run the demo command
python3 -m monoloco.run predict docs/frame0032.jpg --activities social_distance --output_types front bird
there is a loader_workers error. I then set args.loader_workers = 1 to test, and got a batch_size argument error.
What happened? The installation was successful, and I am running monoloco.run's test code.
(attached screenshot of the error)

nuScenes pre-trained model

Hello,

first of all, thank you very much for your paper and for this repository. I see that you provide pre-trained models on google drive. Can you please tell me which model was pre-trained on the nuScenes dataset?

Thank you very much!

About webcam settings

Can you explain the method used for real-time webcam inference?
I remember being able to use it before.

detection on closer distances

Thanks for your great work. When I run the algorithm on my own images (from a RealSense camera) or even with a webcam, the depth estimate (especially for the RealSense images) seems incorrect and is usually too high. Do you know whether I need to set any parameters? Thanks!

About distortion

I have a question about distortion.

The KITTI dataset uses rectified images, from which distortion (such as radial distortion) has been removed.

But your work does not mention distortion.

So why did you use unrectified (uncalibrated) images?

Is distortion a minor thing that can be ignored?

The objects is stored on the edges when using webcams

First of all, you did a GREAT job here. As for my issue: I followed the instructions and downloaded the model/array files and folders as mentioned, but when I use the model with the webcam via python3 -m monoloco.run predict --webcam --scale 0.2 --output_types combined --z_max 10 --checkpoint resnet50 --model data/models/monoloco-190719-0923.pkl, the camera detects me well at first; but when I move out of the FOV, the last seen position is kept and I still appear as detected, but frozen (it keeps showing the last frame with the last pedestrian detected)! Can you provide any help?

How to load our own camera intrinsic matrix in webcam mode?

Is there a way to load my own camera intrinsic matrix in webcam mode?
From the source code, it looks like the intrinsic matrix is computed by a default formula in process.py:

def factory_for_gt(im_size, name=None, path_gt=None):
    ...
    except KeyError:
        dic_gt = None
        x_factor = im_size[0] / 1600
        y_factor = im_size[1] / 900
        pixel_factor = (x_factor + y_factor) / 2  # TODO remove and check it
        if im_size[0] / im_size[1] > 2.5:
            kk = [[718.3351, 0., 600.3891], [0., 718.3351, 181.5122],
                  [0., 0., 1.]]  # KITTI calibration
        else:
            kk = [[1266.4 * pixel_factor, 0., 816.27 * x_factor],
                  [0., 1266.4 * pixel_factor, 491.5 * y_factor],
                  [0., 0., 1.]]  # nuScenes calibration
        print("Using a standard calibration matrix...")

Installing monoloco using pip

When I try to install monoloco using the command pip3 install monoloco, it tries to install torch but fails. Is there any mistake on my side in the installation process? I am a beginner in the field of deep learning and AI.
(attached screenshots of the pip error output)
