
consistent_depth's Introduction

[SIGGRAPH 2020] Consistent Video Depth Estimation

Open in Colab

We present an algorithm for reconstructing dense, geometrically consistent depth for all pixels in a monocular video. We leverage a conventional structure-from-motion reconstruction to establish geometric constraints on pixels in the video. Unlike the ad-hoc priors in classical reconstruction, we use a learning-based prior, i.e., a convolutional neural network trained for single-image depth estimation. At test time, we fine-tune this network to satisfy the geometric constraints of a particular input video, while retaining its ability to synthesize plausible depth details in parts of the video that are less constrained. We show through quantitative validation that our method achieves higher accuracy and a higher degree of geometric consistency than previous monocular reconstruction methods. Visually, our results appear more stable. Our algorithm is able to handle challenging hand-held captured input videos with a moderate degree of dynamic motion. The improved quality of the reconstruction enables several applications, such as scene reconstruction and advanced video-based visual effects.

Consistent Video Depth Estimation
Xuan Luo, Jia-Bin Huang, Richard Szeliski, Kevin Matzen, and Johannes Kopf
In SIGGRAPH 2020.

Prerequisite

Quick Start

You can run the following demo without installing COLMAP. The demo takes 37 min when tested on one NVIDIA GeForce RTX 2080 GPU.

  • Download models and the demo video together with its precomputed COLMAP results.
    ./scripts/download_model.sh
    ./scripts/download_demo.sh results/ayush
    
  • Run
    python main.py --video_file data/videos/ayush.mp4 --path results/ayush \
      --camera_params "1671.770118, 540, 960" --camera_model "SIMPLE_PINHOLE" \
      --make_video
    
    where 1671.770118, 540, 960 are the camera intrinsics (f, cx, cy) and SIMPLE_PINHOLE is the camera model.
  • You can inspect the test-time training process by running
    tensorboard --logdir results/ayush/R_hierarchical2_mc/B0.1_R1.0_PL1-0_LR0.0004_BS4_Oadam/tensorboard/ 
    
  • You can find your results organized as follows.
    results/ayush/R_hierarchical2_mc
      videos/
        color_depth_mc_depth_colmap_dense_B0.1_R1.0_PL1-0_LR0.0004_BS4_Oadam.mp4    # comparison of disparity maps from mannequin challenge, COLMAP and ours
      B0.1_R1.0_PL1-0_LR0.0004_BS4_Oadam/
        depth/                      # final disparity maps
        checkpoints/0020.pth        # final checkpoint
        eval/                       # disparity maps and losses after each epoch of training
    
    Expected output can be found here. Your results may differ slightly due to randomness in the test-time training process.

For quick demonstration and ease of installation, the demo runs the full pipeline (flow estimation, test-time training, etc.) except the COLMAP step. To also exercise the COLMAP step, delete results/ayush/colmap_dense and results/ayush/depth_colmap_dense, then run the python command above again.
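If you prefer to inspect the final disparity maps programmatically instead of through the rendered comparison video, the following is a minimal sketch (not part of the pipeline). It assumes the helper utils.image_io.load_raw_float32_image is available for reading the repository's .raw float32 format; if your checkout names this helper differently, adapt the import, and adjust depth_dir if the .raw files sit in a nested subfolder.

import os
import numpy as np
import cv2  # used here only for color mapping and PNG output

# Assumption: the repository exposes a loader for its custom .raw float32 format.
from utils.image_io import load_raw_float32_image

depth_dir = "results/ayush/R_hierarchical2_mc/B0.1_R1.0_PL1-0_LR0.0004_BS4_Oadam/depth"

for name in sorted(os.listdir(depth_dir)):
    if not name.endswith(".raw"):
        continue
    disparity = load_raw_float32_image(os.path.join(depth_dir, name))
    # Normalize per frame for visualization only; the stored values are relative disparities.
    vis = (disparity - disparity.min()) / (disparity.max() - disparity.min() + 1e-8)
    vis = cv2.applyColorMap((vis * 255).astype(np.uint8), cv2.COLORMAP_JET)
    cv2.imwrite(os.path.join(depth_dir, name.replace(".raw", "_vis.png")), vis)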

Customized Run:

Please refer to params.py or run python main.py --help for the full list of parameters. Below we demonstrate some examples of common usage of the system.

Run on Your Own Videos

  • Place your video file at $video_file_path.
  • [Optional] Calibrate the camera using the PINHOLE (fx, fy, cx, cy) or SIMPLE_PINHOLE (f, cx, cy) model. Camera intrinsics calibration is optional but recommended for more accurate and faster camera registration. We typically calibrate by capturing a video of a textured plane with very slow camera motion, trying to let target features cover the full field of view, then selecting non-blurry frames and running COLMAP on these images. (If you only know your camera's field of view, see the focal-length sketch after this list.)
  • Run
    • Run without camera calibration.
      python main.py --video_file $video_file_path --path $output_path --make_video
      
    • Run with camera calibration. For instance, run with PINHOLE model and fx, fy, cx, cy = 1660.161322, 1600, 540, 960
      python main.py --video_file $video_file_path --path $output_path \
        --camera_model "PINHOLE" --camera_params "1660.161322, 1600, 540, 960" \
        --make_video
      
    • You can also specify the backend monocular depth estimation network:
      python main.py --video_file $video_file_path --path $output_path \
        --camera_model "PINHOLE" --camera_params "1660.161322, 1600, 540, 960" \
        --make_video --model_type "${model_type}"
      
      The supported model types are mc (Mannequin Challenge by Zhang et al. 2019), midas2 (MiDaS by Ranftl et al. 2019) and monodepth2 (Monodepth2 by Godard et al. 2019).
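If you cannot run a full COLMAP calibration but know (or can estimate) your camera's horizontal field of view, you can derive an approximate SIMPLE_PINHOLE focal length in pixels from standard pinhole geometry. The helper below is a hypothetical convenience script, not part of this repository, and the field-of-view value is an assumption you must supply.

import math

def simple_pinhole_params(width, height, horizontal_fov_deg):
    """Return an approximate "f, cx, cy" string for --camera_params (SIMPLE_PINHOLE).

    Pinhole geometry: f = (width / 2) / tan(horizontal_fov / 2), assuming the
    principal point sits at the image center.
    """
    f = 0.5 * width / math.tan(math.radians(horizontal_fov_deg) / 2.0)
    return "{:.6f}, {:.0f}, {:.0f}".format(f, width / 2, height / 2)

# Example: a 1080x1920 portrait video with a ~36 degree horizontal field of view
# gives parameters close to the ones used in the demo above.
print(simple_pinhole_params(1080, 1920, 36.0))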

Run with Precomputed Camera Poses

We rely on COLMAP for camera pose registration. If you have precomputed camera poses instead, you can provide them to the system in the folder $path as follows. (See here for an example file structure of $path.)

  • Save your color images as color_full/frame_%06d.png.
  • Create frames.txt in the following format (see here for an example):
    number_of_frames
    width
    height
    frame_000000_timestamp_in_seconds
    frame_000001_timestamp_in_seconds
    ...
    
  • Convert your camera poses to the COLMAP sparse reconstruction format following this; a minimal writer sketch is given after this list. Put your images.txt, cameras.txt and points3D.txt (or .bin) under colmap_dense/pose_init/. Note that the POINTS2D in images.txt and the points3D.txt can be empty.
  • Run.
    python main.py --path $path --initialize_pose
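As a rough illustration of the conversion above, the sketch below writes cameras.txt, images.txt and an empty points3D.txt in COLMAP's text format from one shared PINHOLE camera and a list of poses. It assumes your rotations are quaternions (qw, qx, qy, qz) and translations (tx, ty, tz) already expressed in COLMAP's world-to-camera convention; all variable names are illustrative and not part of the repository.

import os

def write_colmap_text_model(out_dir, width, height, fx, fy, cx, cy, poses):
    """Write cameras.txt, images.txt and an empty points3D.txt in COLMAP text format.

    `poses` is a list of (image_name, (qw, qx, qy, qz), (tx, ty, tz)) tuples,
    where the quaternion/translation map world coordinates to camera coordinates
    (COLMAP's convention).
    """
    os.makedirs(out_dir, exist_ok=True)

    # One shared PINHOLE camera: CAMERA_ID MODEL WIDTH HEIGHT fx fy cx cy
    with open(os.path.join(out_dir, "cameras.txt"), "w") as f:
        f.write("1 PINHOLE {} {} {} {} {} {}\n".format(width, height, fx, fy, cx, cy))

    # Each image: IMAGE_ID QW QX QY QZ TX TY TZ CAMERA_ID NAME, followed by a
    # line of 2D point observations, which the README says may be left empty.
    with open(os.path.join(out_dir, "images.txt"), "w") as f:
        for i, (name, (qw, qx, qy, qz), (tx, ty, tz)) in enumerate(poses, start=1):
            f.write("{} {} {} {} {} {} {} {} 1 {}\n\n".format(i, qw, qx, qy, qz, tx, ty, tz, name))

    # points3D.txt may be empty as well.
    open(os.path.join(out_dir, "points3D.txt"), "w").close()

# Hypothetical usage with a single identity pose:
write_colmap_text_model(
    "colmap_dense/pose_init", 1080, 1920, 1660.161322, 1600.0, 540.0, 960.0,
    [("frame_000000.png", (1.0, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0))],
)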
    

Mask out Dynamic Object for Camera Pose Estimation

To get better poses for dynamic scenes, you can mask out dynamic objects when extracting features with COLMAP. Note that COLMAP >= 3.6 is required to support feature extraction with masks.

  • Extract frames

    python main.py --video_file $video_file_path --path $output_path --op extract_frames
    
  • Run your favourite segmentation method (e.g., Mask R-CNN) on the images in $output_path/color_full to extract binary masks for dynamic objects (e.g., humans); a hedged example sketch follows this list. No features will be extracted in regions where the mask image is black (pixel intensity value 0 in grayscale). Following the COLMAP documentation, save the mask of frame $output_path/color_full/frame_000010.png, for instance, at $output_path/mask/frame_000010.png.png.

  • Run the rest of the pipeline.

    python main.py --path $output_path --mask_path $output_path/mask \
      --camera_model "${camera_model}" --camera_params "${camera_intrinsics}" \
      --make_video
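For reference, here is a minimal sketch of the segmentation step using torchvision's pretrained Mask R-CNN to black out people, which are the most common dynamic objects. It is one possible implementation rather than the one the authors used; the score threshold, the COCO class id for "person" (1), and the paths are assumptions you may need to adjust.

import os
import numpy as np
import torch
import torchvision
from PIL import Image

# Pretrained COCO Mask R-CNN from torchvision; class id 1 is "person".
model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True).eval()

color_dir = "results/my_video/color_full"   # $output_path/color_full (assumed path)
mask_dir = "results/my_video/mask"          # $output_path/mask (assumed path)
os.makedirs(mask_dir, exist_ok=True)

with torch.no_grad():
    for name in sorted(os.listdir(color_dir)):
        image = Image.open(os.path.join(color_dir, name)).convert("RGB")
        tensor = torchvision.transforms.functional.to_tensor(image)
        output = model([tensor])[0]

        # Start with an all-white mask (features allowed everywhere) and
        # black out confident person detections.
        mask = np.full((image.height, image.width), 255, dtype=np.uint8)
        for label, score, m in zip(output["labels"], output["scores"], output["masks"]):
            if label.item() == 1 and score.item() > 0.5:
                mask[m[0].numpy() > 0.5] = 0

        # COLMAP expects the mask to be named <image name>.png (hence ".png.png").
        Image.fromarray(mask).save(os.path.join(mask_dir, name + ".png"))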
    

Result Folder Structure

The result folder has the following structure. Many files are saved only for debugging purposes.

frames.txt              # metadata about the number of frames, image resolution and timestamps for each frame
color_full/             # extracted frames in the original resolution
color_down/             # extracted frames in the resolution for disparity estimation 
color_down_png/      
color_flow/             # extracted frames in the resolution for flow estimation
flow_list.json          # indices of frame pairs to finetune the model with
flow/                   # optical flow 
mask/                   # mask of consistent flow estimation between frame pairs.
vis_flow/               # optical flow visualization. Green regions contain inconsistent flow. 
vis_flow_warped/        # visualizing flow accuracy by warping one frame to another using the estimated flow, e.g., frame_000000_000032_warped.png warps frame_000032 to frame_000000.
colmap_dense/           # COLMAP results
    metadata.npz        # camera intrinsics and extrinsics converted from COLMAP sparse reconstruction.
    sparse/             # COLMAP sparse reconstruction
    dense/              # COLMAP dense reconstruction
depth_colmap_dense/     # COLMAP dense depth maps converted to disparity maps in .raw format
depth_${model_type}/    # initial disparity estimation using the original monocular depth model before test-time training
R_hierarchical2_${model_type}/ 
    flow_list_0.20.json                 # indices of frame pairs passing overlap ratio test of threshold 0.2. Same content as ../flow_list.json.
    metadata_scaled.npz                 # camera intrinsics and extrinsics after scale calibration. These are the camera parameters used in the test-time training process.
    scales.csv                          # frame indices and corresponding scales between initial monocular disparity estimation and COLMAP dense disparity maps.
    depth_scaled_by_colmap_dense/       # monocular disparity estimation scaled to match COLMAP disparity results
    vis_calibration_dense/              # for debugging scale calibration. frame_000000_warped_to_000029.png warps frame_000000 to frame_000029 by scaled camera translations and disparity maps from initial monocular depth estimation.
    videos/                             # video visualization of results 
    B0.1_R1.0_PL1-0_LR0.0004_BS4_Oadam/
        checkpoints/                    # checkpoint after each epoch
        depth/                          # final disparity map results after finishing test-time training
        eval/                           # intermediate losses and disparity maps after each epoch 
        tensorboard/                    # tensorboard log for the test-time training process
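If you want to consume the calibrated camera parameters or per-frame scales downstream, the sketch below shows one way to inspect them with NumPy. The array key names "intrinsics" and "extrinsics" are assumptions based on the metadata description above, so check np.load(...).files on your own output first.

import csv
import numpy as np

base = "results/ayush/R_hierarchical2_mc"

# Assumed keys: "intrinsics" (per-frame fx, fy, cx, cy) and "extrinsics"
# (per-frame camera transforms). Print .files first if your file uses
# different names.
meta = np.load(base + "/metadata_scaled.npz")
print("arrays stored:", meta.files)
intrinsics = meta["intrinsics"]
extrinsics = meta["extrinsics"]
print("intrinsics shape:", intrinsics.shape, "extrinsics shape:", extrinsics.shape)

# scales.csv: frame index and the scale between the initial monocular
# disparity and the COLMAP dense disparity for that frame.
with open(base + "/scales.csv") as f:
    for row in csv.reader(f):
        print(row)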

Citation

If you find our code useful, please consider citing our paper:

@article{Luo-VideoDepth-2020,
  author    = {Luo, Xuan and Huang, Jia{-}Bin and Szeliski, Richard and Matzen, Kevin and Kopf, Johannes},
  title     = {Consistent Video Depth Estimation},
  journal   = {ACM Transactions on Graphics (Proceedings of ACM SIGGRAPH)},
  publisher = {ACM},
  volume = {39},
  number = {4},
  year = {2020}
}

License

This work is licensed under the MIT License. See LICENSE for details.

Acknowledgments

We would like to thank Patricio Gonzales Vivo, Dionisio Blanco, and Ocean Quigley for creating the artistic effects in the accompanying video. We thank True Price for his practical and insightful advice on reconstruction and Ayush Saraf for his suggestions in engineering.


consistent_depth's Issues

colmap issue

Hi,

Thanks for the excellent framework!

When testing without camera calibration:


python main.py --video_file $video_file_path --path $output_path --make_video


There is an error that seems to ask for camera information:


File "consistent_depth/process.py", line 62, in pipeline
  valid_frames = calibrate_scale(self.video, self.out_dir, frame_range, params)
File "consistent_depth/scale_calibration.py", line 184, in calibrate_scale
  video.path, colmap.sparse_dir(colmap_dir, 0)
File "consistent_depth/scale_calibration.py", line 74, in make_camera_params_from_colmap
  cameras, images, points3D = load_colmap.read_model(path=sparse_dir, ext=".bin")
File "consistent_depth/third_party/colmap/scripts/python/read_write_model.py", line 416, in read_model
  cameras = read_cameras_binary(os.path.join(path, "cameras" + ext))
File "consistent_depth/third_party/colmap/scripts/python/read_write_model.py", line 135, in read_cameras_binary
  with open(path_to_model_file, "rb") as fid:
FileNotFoundError: [Errno 2] No such file or directory: './results/tmp/colmap_dense/sparse/0/cameras.bin'


Can you kindly advise?

Thanks so much!

Masking out dynamic object is not working.

The provided option for masking out dynamic objects during camera pose estimation is not working.

main.py: error: unrecognized arguments: --mask_path

It seems that the project does not provide a mask option.
Also, the masks in the final results are based on pairs of frames, whereas the mask required in the README is based on a single frame.

Are there other ways to mask out dynamic objects?

Experiments on ScanNet

Hi,
Thank you for sharing the code. Could I know which method is used for estimating the camera pose on ScanNet?

I checked the paper, but I cannot find the details.

So sorry for my troubles.

Thank you for your attention

Customized Run is not working.

Great paper and thank you for making the source code public. The quick start code works for me in a conda environment, but it would be nice if the code could also support a virtual environment.
I am running into an error when trying the customized run on a random internet video. The video is 1920x1080 and I don't provide the camera model. The command I used is:
python main.py --video_file ./data/videos/1943413.mp4 --path ./results/pexels_video1 --make_video --model_type "midas2"
The error I got is:
Traceback (most recent call last):
  File "main.py", line 13, in <module>
    dp.process(params)
  File "/home/owen/Dev/consistent_depth/process.py", line 117, in process
    return self.pipeline(params)
  File "/home/owen/Dev/consistent_depth/process.py", line 88, in pipeline
    ft.fine_tune(writer=self.writer)
  File "/home/owen/Dev/consistent_depth/depth_fine_tuning.py", line 257, in fine_tune
    validate(0, 0)
  File "/home/owen/Dev/consistent_depth/depth_fine_tuning.py", line 248, in validate
    criterion, val_data_loader, suffix(epoch, niters)
  File "/home/owen/Dev/consistent_depth/depth_fine_tuning.py", line 323, in eval_and_save
    for _, data in zip(range(N), data_loader):
  File "/home/owen/anaconda3/envs/consistent_depth/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/home/owen/anaconda3/envs/consistent_depth/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "/home/owen/anaconda3/envs/consistent_depth/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/home/owen/anaconda3/envs/consistent_depth/lib/python3.6/site-packages/torch/_utils.py", line 394, in reraise
    raise self.exc_type(msg)
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/owen/anaconda3/envs/consistent_depth/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/owen/anaconda3/envs/consistent_depth/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/owen/anaconda3/envs/consistent_depth/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/owen/Dev/consistent_depth/loaders/video_dataset.py", line 172, in __getitem__
    intrinsics = torch.stack([self.intrinsics[k] for k in pair], dim=0)
  File "/home/owen/Dev/consistent_depth/loaders/video_dataset.py", line 172, in <listcomp>
    intrinsics = torch.stack([self.intrinsics[k] for k in pair], dim=0)
IndexError: index 64 is out of bounds for dimension 0 with size 13

I suspect that the camera parameters derived from COLMAP are not compatible with the dataloader.

AssertionError: assert frame_names is not None

When I try to run main.py as described in the documentation with the provided data, I get the assertion error. I found that the depth_colmap_dense folder is not there. After the banner Compute per-frame scales it shows
assert frame_names is not None. Can anyone tell me what the issue is here? Do I need to create a folder named depth_colmap_dense?

Stereo input???

First of all, FANTASTIC WORK! And thanks for sharing!
How would one go about inputting a stereo pair (or more) to increase the accuracy of the resulting depth maps? Is this something that can be done easily?

ffmpeg check moov atom not found ayush.mp4?

Hi, I tried to run:
python main.py --video_file data/videos/ayush.mp4 --path results/ayush --camera_params "1671.770118, 540, 960" --camera_model "SIMPLE_PINHOLE" --make_video

----------- Parameters -------------
align: 16
batch_size: 4
camera_model: SIMPLE_PINHOLE
camera_params: 1671.770118, 540, 960
colmap_bin_path: colmap
configure: default
dense_frame_ratio: 0.95
dense_pixel_ratio: 0.3
display_freq: 100
ffmpeg: ffmpeg
flow_checkpoint: FlowNet2
flow_ops: ['hierarchical2']
frame_range: ''
initialize_pose: False
lambda_parameter: 0
lambda_reprojection: 1.0
lambda_view_baseline: 0.1
learning_rate: 0.0004
log_dir: None
make_video: True
matcher: exhaustive
model_type: mc
num_epochs: 20
op: all
optimizer: Adam
overlap_ratio: 0.2
path: results/ayush
print_freq: 1
refine_intrinsics: False
save_epoch_freq: 1
size: 384
sparse: False
val_epoch_freq: 1
video_file: data/videos/ayush.mp4


Processing dataset 'results/ayush'

Output directory: results/ayush/R_hierarchical2_mc


**** Extracting PTS ****


ffmpeg -i data/videos/ayush.mp4 -vframes 1 /tmp/tmpu_mf371z.png
ffmpeg version 3.4.8-0ubuntu0.2 Copyright (c) 2000-2020 the FFmpeg developers
built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
configuration: --prefix=/usr --extra-version=0ubuntu0.2 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared
libavutil 55. 78.100 / 55. 78.100
libavcodec 57.107.100 / 57.107.100
libavformat 57. 83.100 / 57. 83.100
libavdevice 57. 10.100 / 57. 10.100
libavfilter 6.107.100 / 6.107.100
libavresample 3. 7. 0 / 3. 7. 0
libswscale 4. 8.100 / 4. 8.100
libswresample 2. 9.100 / 2. 9.100
libpostproc 54. 7.100 / 54. 7.100
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55cd130988c0] Format mov,mp4,m4a,3gp,3g2,mj2 detected only with low score of 1, misdetection possible!
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55cd130988c0] moov atom not found
data/videos/ayush.mp4: Invalid data found when processing input



Traceback (most recent call last):
File "main.py", line 13, in
dp.process(params)

fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpu34wuoeh.png'

Thanks!

Cannot find an image when running demo

Hello. I followed your instructions to run the demo. However, when I was running the command

"python main.py --video_file data/videos/ayush.mp4 --path results/ayush --camera_params "1671.770118, 540, 960" --camera_model "SIMPLE_PINHOLE" --make_video",

a FileNotFoundError occurs:

FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmprh9hg3hs.png'

I wonder where I can obtain these images? Or am I missing something?

RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

I installed all the requirements as you listed on the website. However, I met this error while running the demo.

Error:

Fine-tuning directory: 'results/ayush/R_hierarchical2_mc/B0.1_R1.0_PL1-0_LR0.0004_BS4_Oadam'
Found cache checkpoints/mc.pth
Using 1 GPUs.
Traceback (most recent call last):
  File "main.py", line 13, in <module>
    dp.process(params)
  File "/home/ubuntu/Documents/consistent_depth/process.py", line 117, in process
    return self.pipeline(params)
  File "/home/ubuntu/Documents/consistent_depth/process.py", line 60, in pipeline
    ft.save_depth(initial_depth_dir)
  File "/home/ubuntu/Documents/consistent_depth/depth_fine_tuning.py", line 190, in save_depth
    depth = self.model.forward(stacked_images, metadata)
  File "/home/ubuntu/Documents/consistent_depth/monodepth/depth_model.py", line 23, in forward
    depth = self.estimate_depth(images)
  File "/home/ubuntu/Documents/consistent_depth/monodepth/mannequin_challenge_model.py", line 60, in estimate_depth
    self.model.prediction_d, _ = self.model.netG.forward(images)
  File "/home/ubuntu/anaconda3/envs/depth/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/ubuntu/anaconda3/envs/depth/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/Documents/consistent_depth/monodepth/mannequin_challenge/models/hourglass.py", line 176, in forward
    pred_feature = self.seq(input_)
  File "/home/ubuntu/anaconda3/envs/depth/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/anaconda3/envs/depth/lib/python3.6/site-packages/torch/nn/modules/container.py", line 100, in forward
    input = module(input)
  File "/home/ubuntu/anaconda3/envs/depth/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/anaconda3/envs/depth/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 345, in forward
    return self.conv2d_forward(input, self.weight)
  File "/home/ubuntu/anaconda3/envs/depth/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 342, in conv2d_forward
    self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

python main.py --video_file data/videos/ayush.mp4 --path results/ayush --camera_params "1671.770118, 540, 960" --camera_model "SIMPLE_PINHOLE" --make_video

Here is my setup

Cuda compilation tools, release 10.1, V10.1.243

print(torch.__version__) 1.4.0+cu100

Thanks for helping!

Support for MobileNetV3 or similar to run on mobile devices

Is it possible to run the model on a mobile device (iOS, Android), or to make it compatible with neural architectures like MobileNetV3 or EfficientNet-B0/1/3?

I would like to see the performance and accuracy in action on real-time captured video streams from the device.

Scale calibration

In Section 4, under scale calibration, the global scale adjustment factor used to align the COLMAP scale with the network is multiplied with the translation vector, as shown in Eq. 3. In the implementation, the translation vector is instead divided by the scale factor. Is there a reason for this difference, or am I missing something here?

No module named 'networks'

By running the

python main.py --video_file data/videos/ayush.mp4 --path results/ayush \
  --camera_params "1671.770118, 540, 960" --camera_model "SIMPLE_PINHOLE" \
  --make_video

I get the following error:
Traceback (most recent call last):
  File "/home/Projects/consistent_depth/consistent_depth/third_party/flownet2/models.py", line 12, in <module>
    from networks.resample2d_package.resample2d import Resample2d
ModuleNotFoundError: No module named 'networks'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 5, in <module>
    from process import DatasetProcessor
  File "/home/Projects/consistent_depth/consistent_depth/process.py", line 10, in <module>
    from flow import Flow
  File "/home/Projects/consistent_depth/consistent_depth/flow.py", line 15, in <module>
    import optical_flow_flownet2_homography
  File "/home/Projects/consistent_depth/consistent_depth/optical_flow_flownet2_homography.py", line 11, in <module>
    from third_party.flownet2.models import FlowNet2
  File "/home/Projects/consistent_depth/consistent_depth/third_party/flownet2/models.py", line 23, in <module>
    from .networks.channelnorm_package.channelnorm import ChannelNorm
  File "/home/Projects/consistent_depth/consistent_depth/third_party/flownet2/networks/channelnorm_package/channelnorm.py", line 5, in <module>
    import channelnorm_cuda
ModuleNotFoundError: No module named 'channelnorm_cuda'

Failed to read image file format

So today I'm having a different issue. It's failing a lot sooner in the pipeline with an image read error. I've checked the images and they are all there, and yet it can't read them. Any ideas?
[screenshot]

How can I get the real depth from the output?

Hi, thanks for your brilliant work, the result is really amazing. I am new to this field and I see the visualization of the result is the disparity map, not the depth map. I understand that to convert the disparity to depth you need to know the baseline first. But since this is a monocular depth estimation job, is the baseline the distance of the camera between two frames in world coordinates? Can I get the real depth from the output directly? How can I convert the output to real depth? Looking forward to your answer.

MIDAS V.3 update

Any plan to update to MIDAS V.3 with "dpt_large-midas-2f21e586.pt" model?

Customized run result not reasonable

Thanks for sharing the code!

I ran python main.py --video_file $video_file_path --path $output_path --make_video without camera calibration, but the output results do not look reasonable.

Any suggestions for this case? I see the fine-tuning reaches disparity ~0.1 and loss ~0.5. Must I provide calibrated intrinsics to run it well?

[screenshot]

Demo keeps getting KILLED!

I've been trying to run the demo for a few days now.

It keeps getting killed every time.
If you could think of the possible reason, please help!
The last part of the output is :


**** COLMAP reconstruction ****


Checked metadata file exists.


**** Convert COLMAP depth maps ****



**** Compute per-frame scales ****


Existing scales file loaded.
Scaled metadata file exists.
Filtered out frames []


**** Compute flow ****


Sampled 522 frame pairs.
Resizing flow to (224, 384)
Killed

Sometimes it finishes. Sometimes not.

I was able to get one video to go all the way through the pipeline and output successful video results. Since then, I get this error on every other video I try. All my clips are shot with the same camera and are roughly 4 seconds long. Any advice would be appreciated.

"Elapsed time: 16.860 [minutes]
2020-08-26 19:16:37,511 - INFO - #models = 0
Traceback (most recent call last):
  File "main.py", line 13, in <module>
    dp.process(params)
  File "/content/consistent_depth/process.py", line 117, in process
    return self.pipeline(params)
  File "/content/consistent_depth/process.py", line 62, in pipeline
    valid_frames = calibrate_scale(self.video, self.out_dir, frame_range, params)
  File "/content/consistent_depth/scale_calibration.py", line 184, in calibrate_scale
    video.path, colmap.sparse_dir(colmap_dir, 0)
  File "/content/consistent_depth/scale_calibration.py", line 74, in make_camera_params_from_colmap
    cameras, images, points3D = load_colmap.read_model(path=sparse_dir, ext=".bin")
  File "/content/consistent_depth/third_party/colmap/scripts/python/read_write_model.py", line 416, in read_model
    cameras = read_cameras_binary(os.path.join(path, "cameras" + ext))
  File "/content/consistent_depth/third_party/colmap/scripts/python/read_write_model.py", line 135, in read_cameras_binary
    with open(path_to_model_file, "rb") as fid:
FileNotFoundError: [Errno 2] No such file or directory: 'results/GH017205/colmap_dense/sparse/0/cameras.bin'
Done. Your results are saved at /content/consistent_depth/.
Video results are at /content/consistent_depth/results/GH017205/R_hierarchical2_mc/videos/.
Disparity maps are at /content/consistent_depth/results/GH017205/R_hierarchical2_mc/B0.1_R1.0_PL1-0_LR0.0004_BS4_Oadam/depth"

No module named "monodepth.mannequin_challenge.models"

After following your instructions step by step, I ran the ayush demo and got the following error:

No module named "monodepth.mannequin_challenge.models"

I'm guessing something would have to be downloaded, but from where, and where do I put it?

segment fault

Hello, I hit the segmentation fault shown in the screenshot below when running python main.py --video_file data/videos/ayush.mp4 --path results/ayush:
[screenshot]

Dockerfile request

Hi, could you provide us with a dockerized version of the code? Thanks!

cannot extract model file from FlowNet2_checkpoint.pth.tar

Hi, I downloaded the model file directly from the Chrome web browser and got the FlowNet2_checkpoint.pth.tar file.
Then I tried to extract flownet2.pth from it as described in ./scripts/download_model.sh, but it turned out I couldn't extract it.
I cannot use gdown to download the model file.

Application to KITTI Dataset

Hi,

Please could you explain the steps / configuration required for application to the Eigen split of the KITTI dataset and evaluation of the results, as mentioned in the paper?

Thanks.

AssertionError on "Compute per-frame scales" step (frame_names is None)

The Problem

While doing a customized run with python main.py --video_file {...} --path {...} --batch_size 2 --make_video, I'm getting the following error in terminal:

************************************
****  Compute per-frame scales  ****
************************************
Traceback (most recent call last):
  File "main.py", line 13, in <module>
    dp.process(params)
  File "/home/aeverless/Desktop/content/consistent_depth/process.py", line 117, in process
    return self.pipeline(params)
  File "/home/aeverless/Desktop/content/consistent_depth/process.py", line 62, in pipeline
    valid_frames = calibrate_scale(self.video, self.out_dir, frame_range, params)
  File "/home/aeverless/Desktop/content/consistent_depth/scale_calibration.py", line 242, in calibrate_scale
    os.path.dirname(scaled_depth_fmt), ".raw"
  File "/home/aeverless/Desktop/content/consistent_depth/scale_calibration.py", line 142, in check_frames
    assert frame_names is not None
AssertionError

Setup

I'm running Pop!_OS 20.10 with the following setup:

  • CPU: i5-8400
  • GPU: GTX 1660 Super
  • RAM: 16GB DDR4
  • CUDA: 11.2.0
  • Graphic Driver: Nvidia 460.32.03

Context

I have followed the steps in Readme and Google Colab, successfully reproducing the result (although with lowered batch size) as shown in the demo — thus I think that the issue lies in Colmap or the way that I have it installed, which I installed by running install_colmap_ubuntu.sh.

Installing it was painful, however: when I tried to reinstall it by a different method, my system seemed to hang forever on the make -j step (the same happened when building ceres-solver) and I had to cold reboot (REISUB) my system. At that point I scrapped this method and stuck with the script provided with this repository.

I have also run the setup/install shell scripts in flownet2, and even recloned it from the flownet2-pytorch repo afterwards, trying to fix this problem.

Right now I am at the point where Colmap is installed and it seems to work fine (at least it responds to -h and gui opens the GUI), but frame_names still seems to be None. I have reviewed some of the code to try to figure out where the problem lies, and I found that it checks that the frame_names from the given frame_range (which is unspecified in my case) are not None, and based on the result of this check it then creates several folders in the hierarchical directory and proceeds with the work. My current result folder looks like this:
colmap_dense color_down color_down_png color_flow color_full depth_mc frames.txt R_hierarchical2_mc

R_hierarchical2_mc contains B0.1_R1.0_PL1-0_LR0.0004_BS2_Oadam folder, inside which is an empty checkpoints directory.

Running "demo" error

Trying to run demo without installing COLMAP, following "Quick Start" directions.
After successfully extracting the video frames and creating the corresponding subfolders inside results/ayush, it throws an error:
"frame= 92 fps= 21 q=-0.0 Lsize=N/A time=00:00:03.06 bitrate=N/A speed=0.711x
video:252330kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
ERROR: 92 frames extracted, but 0 PTS entries." ,
Folder "R_hierarchical2_mc" stays empty.
"ayush_colmap.zip" content was extracted to results/ayush/.
"frames.txt" consists of 3 lines:
0
1080
1920

COLMAP STEP ERROR

The COLMAP step in the pipeline was not working at all, so I fixed this by running COLMAP separately from the GUI and then integrating the data into the project folders, and it worked.

How to make the same effect as in the demo

Hi, many thanks for the work done. The question is: do you provide any code that can produce the same effect as shown in the header of your README? Thank you for the answer.

ModuleNotFoundError: No module named 'third_party.flownet2.networks.resample2d_package'

Hello, when I run
python main.py --video_file data/videos/ayush.mp4 --path results/ayush --camera_params "1671.770118, 540, 960" --camera_model "SIMPLE_PINHOLE" --make_video
there are some errors:
ModuleNotFoundError: No module named 'networks'
and
ModuleNotFoundError: No module named 'third_party.flownet2.networks.resample2d_package'

The errors are shown in the screenshot below, but I have installed flownet2.
[screenshot]

Is this real time/real time alternatives

I know this isn't an issue, but didn't know where else to ask so:

Is this real-time depth estimation given a monocular lens? If not, does anyone know of any methods that are quite accurate and don't require an extensive GPU?

Thanks

Using ORB_SLAM for camera pose computation

Hi,
I've been trying to precompute camera poses for the TUM_RGBD dataset with the ORB_SLAM2 tool and then run the consistent_depth algorithm. Ideally, this should result in a faster and more accurate reconstruction, since SLAM is better at predicting poses. However, I haven't been able to get consistent_depth to work with these poses, because COLMAP fails to reconstruct depth maps.

In the paper you mention that you have tested the algorithm on TUM_RGBD dataset, which is close to what I'm trying to do, since ORB_SLAM outputs poses in TUM dataset format. Could you share the steps you took to achieve a successful evaluation on this dataset?

AssertionError during custom run with changed camera model

During the Image Undistortion step of the process, I get the following error when attempting to use the simple pinhole camera model and the same camera parameters used in the demo:

Traceback (most recent call last):
  File "main.py", line 13, in <module>
    dp.process(params)
  File "<path to>/consistent_depth/process.py", line 117, in process
    return self.pipeline(params)
  File "<path to>/consistent_depth/process.py", line 62, in pipeline
    valid_frames = calibrate_scale(self.video, self.out_dir, frame_range, params)
  File "<path to>/consistent_depth/scale_calibration.py", line 184, in calibrate_scale
    video.path, colmap.sparse_dir(colmap_dir, 0)
  File "<path to>/consistent_depth/scale_calibration.py", line 79, in make_camera_params_from_colmap
    cameras, images, size_new
  File "<path to>/consistent_depth/utils/load_colmap.py", line 175, in convert_calibration
    intrinsics = cameras_to_intrinsics(cameras, sorted_cam_ids, size_new)
  File "<path to>/consistent_depth/utils/load_colmap.py", line 116, in cameras_to_intrinsics
    for c in cameras.values()))
AssertionError

All the line throwing the error checks is whether the camera model is:

assert all(
    (c.model == "SIMPLE_PINHOLE" or c.model == "PINHOLE"
        or c.model == "SIMPLE_RADIAL"
     for c in cameras.values()))

which it seems like I am complying with? My full command is sudo python3 main.py --video_file <path_to_avi> --path <path_to_results> --camera_model "SIMPLE_PINHOLE" --camera_params "1671.770118, 540, 960" --make_video, where the video being used in my command is not the same video as the one in the demo. Am I missing something? I know the camera params are probably off for my use case, but I imagine it should run to completion regardless?

Documentation request

It would be really useful and time-saving to have more appropriate documentation for this project or at least comments in the code. Please consider.

[Windows 10] No module named 'resample2d_cuda'

First of all thank you for this fantastic program.
I'm confident it's going to be a game-changer.

I'm trying to use it on Windows 10 and I've managed to install the needed packages using Anaconda.
However, I'm getting this error message: No module named 'resample2d_cuda'.

Reading up different solutions I've found out that I need to go to "third_party\flownet2\networks\resample2d_package"
and enter:

python setup.py build
python setup.py install

(Source: NVIDIA/vid2vid#30 )

However, executing the first line gives me the following:

No CUDA runtime is found, using CUDA_HOME='C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.4'
running build
running build_ext
C:\Users\Rafi\anaconda3\lib\site-packages\torch\utils\cpp_extension.py:304: UserWarning: Error checking compiler version for cl: [WinError 2] The system cannot find the file specified
  warnings.warn(f'Error checking compiler version for {compiler}: {error}')
building 'resample2d_cuda' extension
creating H:\Program Files\consistent_depth-master\third_party\flownet2\networks\resample2d_package\build\temp.win-amd64-3.8
creating H:\Program Files\consistent_depth-master\third_party\flownet2\networks\resample2d_package\build\temp.win-amd64-3.8\Release
Emitting ninja build file H:\Program Files\consistent_depth-master\third_party\flownet2\networks\resample2d_package\build\temp.win-amd64-3.8\Release\build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/2] cl /showIncludes /nologo /Ox /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /EHsc -IC:\Users\Rafi\anaconda3\lib\site-packages\torch\include -IC:\Users\Rafi\anaconda3\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\Rafi\anaconda3\lib\site-packages\torch\include\TH -IC:\Users\Rafi\anaconda3\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.4\include" -IC:\Users\Rafi\anaconda3\include -IC:\Users\Rafi\anaconda3\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30037\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" -c "H:\Program Files\consistent_depth-master\third_party\flownet2\networks\resample2d_package\resample2d_cuda.cc" /Fo"H:\Program Files\consistent_depth-master\third_party\flownet2\networks\resample2d_package\build\temp.win-amd64-3.8\Release\resample2d_cuda.obj" -std=c++11 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=resample2d_cuda -D_GLIBCXX_USE_CXX11_ABI=0 /std:c++14
FAILED: H:/Program Files/consistent_depth-master/third_party/flownet2/networks/resample2d_package/build/temp.win-amd64-3.8/Release/resample2d_cuda.obj
cl /showIncludes /nologo /Ox /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /EHsc -IC:\Users\Rafi\anaconda3\lib\site-packages\torch\include -IC:\Users\Rafi\anaconda3\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\Rafi\anaconda3\lib\site-packages\torch\include\TH -IC:\Users\Rafi\anaconda3\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.4\include" -IC:\Users\Rafi\anaconda3\include -IC:\Users\Rafi\anaconda3\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30037\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" -c "H:\Program Files\consistent_depth-master\third_party\flownet2\networks\resample2d_package\resample2d_cuda.cc" /Fo"H:\Program Files\consistent_depth-master\third_party\flownet2\networks\resample2d_package\build\temp.win-amd64-3.8\Release\resample2d_cuda.obj" -std=c++11 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=resample2d_cuda -D_GLIBCXX_USE_CXX11_ABI=0 /std:c++14
cl : Command line warning D9002 : ignoring unknown option "-std=c++11"
Note: including file: C:\Users\Rafi\anaconda3\lib\site-packages\torch\include\ATen/ATen.h
Note: including file:  C:\Users\Rafi\anaconda3\lib\site-packages\torch\include\c10/core/Allocator.h
C:\Users\Rafi\anaconda3\lib\site-packages\torch\include\c10/core/Allocator.h(3): fatal error C1083: Cannot open include file: "stddef.h": No such file or directory
[2/2] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.4\bin\nvcc  --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -IC:\Users\Rafi\anaconda3\lib\site-packages\torch\include -IC:\Users\Rafi\anaconda3\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\Rafi\anaconda3\lib\site-packages\torch\include\TH -IC:\Users\Rafi\anaconda3\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.4\include" -IC:\Users\Rafi\anaconda3\include -IC:\Users\Rafi\anaconda3\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30037\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" -c "H:\Program Files\consistent_depth-master\third_party\flownet2\networks\resample2d_package\resample2d_kernel.cu" -o "H:\Program Files\consistent_depth-master\third_party\flownet2\networks\resample2d_package\build\temp.win-amd64-3.8\Release\resample2d_kernel.obj" -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=resample2d_cuda -D_GLIBCXX_USE_CXX11_ABI=0
FAILED: H:/Program Files/consistent_depth-master/third_party/flownet2/networks/resample2d_package/build/temp.win-amd64-3.8/Release/resample2d_kernel.obj
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.4\bin\nvcc  --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -IC:\Users\Rafi\anaconda3\lib\site-packages\torch\include -IC:\Users\Rafi\anaconda3\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\Rafi\anaconda3\lib\site-packages\torch\include\TH -IC:\Users\Rafi\anaconda3\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.4\include" -IC:\Users\Rafi\anaconda3\include -IC:\Users\Rafi\anaconda3\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30037\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" -c "H:\Program Files\consistent_depth-master\third_party\flownet2\networks\resample2d_package\resample2d_kernel.cu" -o "H:\Program Files\consistent_depth-master\third_party\flownet2\networks\resample2d_package\build\temp.win-amd64-3.8\Release\resample2d_kernel.obj" -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=resample2d_cuda -D_GLIBCXX_USE_CXX11_ABI=0
C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.29.30037/include\crtdefs.h(10): fatal error C1083: Cannot open include file: "corecrt.h": No such file or directory
nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
resample2d_kernel.cu
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "C:\Users\Rafi\anaconda3\lib\site-packages\torch\utils\cpp_extension.py", line 1667, in _run_ninja_build
    subprocess.run(
  File "C:\Users\Rafi\anaconda3\lib\subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "setup.py", line 19, in <module>
    setup(
  File "C:\Users\Rafi\anaconda3\lib\site-packages\setuptools\__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "C:\Users\Rafi\anaconda3\lib\distutils\core.py", line 148, in setup
    dist.run_commands()
  File "C:\Users\Rafi\anaconda3\lib\distutils\dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "C:\Users\Rafi\anaconda3\lib\distutils\dist.py", line 985, in run_command
    cmd_obj.run()
  File "C:\Users\Rafi\anaconda3\lib\distutils\command\build.py", line 135, in run
    self.run_command(cmd_name)
  File "C:\Users\Rafi\anaconda3\lib\distutils\cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "C:\Users\Rafi\anaconda3\lib\distutils\dist.py", line 985, in run_command
    cmd_obj.run()
  File "C:\Users\Rafi\anaconda3\lib\site-packages\setuptools\command\build_ext.py", line 79, in run
    _build_ext.run(self)
  File "C:\Users\Rafi\anaconda3\lib\site-packages\Cython\Distutils\old_build_ext.py", line 186, in run
    _build_ext.build_ext.run(self)
  File "C:\Users\Rafi\anaconda3\lib\distutils\command\build_ext.py", line 340, in run
    self.build_extensions()
  File "C:\Users\Rafi\anaconda3\lib\site-packages\torch\utils\cpp_extension.py", line 708, in build_extensions
    build_ext.build_extensions(self)
  File "C:\Users\Rafi\anaconda3\lib\site-packages\Cython\Distutils\old_build_ext.py", line 195, in build_extensions
    _build_ext.build_ext.build_extensions(self)
  File "C:\Users\Rafi\anaconda3\lib\distutils\command\build_ext.py", line 449, in build_extensions
    self._build_extensions_serial()
  File "C:\Users\Rafi\anaconda3\lib\distutils\command\build_ext.py", line 474, in _build_extensions_serial
    self.build_extension(ext)
  File "C:\Users\Rafi\anaconda3\lib\site-packages\setuptools\command\build_ext.py", line 196, in build_extension
    _build_ext.build_extension(self, ext)
  File "C:\Users\Rafi\anaconda3\lib\distutils\command\build_ext.py", line 528, in build_extension
    objects = self.compiler.compile(sources,
  File "C:\Users\Rafi\anaconda3\lib\site-packages\torch\utils\cpp_extension.py", line 681, in win_wrap_ninja_compile
    _write_ninja_file_and_compile_objects(
  File "C:\Users\Rafi\anaconda3\lib\site-packages\torch\utils\cpp_extension.py", line 1354, in _write_ninja_file_and_compile_objects
    _run_ninja_build(
  File "C:\Users\Rafi\anaconda3\lib\site-packages\torch\utils\cpp_extension.py", line 1683, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension

Any idea what I'm doing wrong?

I must say I'm not a programmer and I'm barely familiar with the Python/Linux stuff.
So any help is very much appreciated.

Inconsistent initial depth values with the same image of different formats

Thank you for sharing this great work!

I've been playing with the demo and found this difference by making the following changes:

  1. comment this line
    color_fmt = pjoin(self.base_dir, "color_down", "frame_{:06d}.raw")
  2. replace it with color_fmt = pjoin(self.base_dir, "color_down_png", "frame_{:06d}.png") # load .png file
  3. This triggers the else branch here to load the image with the OpenCV method; originally, the .raw format is loaded.
  4. The depth is then collected as print(f'{torch.min(depth)} {torch.max(depth)} {torch.mean(depth)}') after this line:
    depth = self.model.forward(stacked_images, metadata)

CMD

python main.py --video_file data/videos/ayush.mp4 --path results/ayush --camera_params "1671.770118, 540, 960" --camera_model "SIMPLE_PINHOLE" --make_video --model_type monodepth2

Results (only the first 5 frames are attached because of space limit):

(The order is dmin, dmax, dmean in each row)

.raw format

6.328142166137695 47.315696716308594 16.583988189697266
5.077469825744629 48.07398986816406 16.365726470947266
5.110544204711914 48.893699645996094 16.422765731811523
5.226498126983643 45.44618606567383 16.323867797851562
5.302231311798096 40.730411529541016 16.2956600189209

.png format

6.994554042816162 84.51655578613281 26.014997482299805
6.27308988571167 100.8250503540039 27.64916229248047
6.835447311401367 94.29505157470703 26.967622756958008
7.421243190765381 95.34587097167969 27.17084312438965
7.507992744445801 91.67366027832031 26.8772029876709

Does it make sense?

pytorch/cuda version

I've noticed that pytorch version 1.4 with cuda 10.0 is specified in the requirements (torch==1.4.0+cu100). Is there any specific reason for this requirement or can I run the code with more up to date pytorch and/or cuda?

Thanks,
Matt
