yvanyin / diversedepth

The code and data of DiverseDepth

License: Other

Python 98.23% Shell 1.77%

monocular-depth-estimation generalization-on-diverse-scenes single-image-depth-prediction depth-estimation depth-prediction dataset

diversedepth's Issues

No such file or directory: './annotations.json'

test_any_diversedepth.py requires an annotations.json file.

What is this file and can it be downloaded from somewhere for the pre-trained model?

Traceback (most recent call last):
  File "/Users/foobar/opt/anaconda3/envs/diversedepth/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/foobar/opt/anaconda3/envs/diversedepth/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/foobar/Documents/Projects/DiverseDepth/tools/test_any_diversedepth.py", line 55, in <module>
    f = open(anno_path, 'r')
FileNotFoundError: [Errno 2] No such file or directory: './annotations.json'

The maximum depth of 80 meters

Hello, I see that many repositories that use the KITTI dataset set a maximum depth of 80 meters. How do you determine the maximum depth of 80 meters?

Any plan to provide the original stereo image pairs of DiverseDepth dataset?

If the original stereo image pairs are provided, then we can make use of the recent optical flow models to improve the GT of the depths.

Do you have plans to release that data?

it seems the test_any_diversedepth doesn't work well

Hi @YvanYin
Thanks for your excellent work, but when I test your model on my own images, the results are bad.
I hope you can release the training code.

About retraining

Thanks for your amazing work!
I ran into a problem. I tried to retrain the network according to the information you provided, but I did not get results similar to those of the pre-trained model you provided. My results are much worse than yours. I can't locate the problem; can you give me some advice? The results are as follows:

The code of converting the depth map into a point cloud image

Hello author, I converted the depth map into a point cloud. After converting with my code, the result visualized in MeshLab looks very different from the figures in your paper. Could you share your code for converting a depth map to a point cloud?
The following is my code:

from PIL import Image
import numpy as np
import time


class point_cloud_generator():

    def __init__(self, rgb_file, depth_file, pc_file, focal_length, scalingfactor):
        self.rgb_file = rgb_file
        self.depth_file = depth_file
        self.pc_file = pc_file
        self.focal_length = focal_length
        self.scalingfactor = scalingfactor
        # Force three channels so the colour slicing below also works for grayscale or RGBA input.
        self.rgb = Image.open(rgb_file).convert('RGB')
        self.depth = Image.open(depth_file).convert('I')
        self.width = self.rgb.size[0]
        self.height = self.rgb.size[1]

    def calculate(self):
        t1 = time.time()
        # Transpose so the arrays are indexed as [x, y], i.e. shape (width, height).
        depth = np.asarray(self.depth).T
        self.Z = depth / self.scalingfactor
        X = np.zeros((self.width, self.height))
        Y = np.zeros((self.width, self.height))
        for i in range(self.width):
            X[i, :] = np.full(X.shape[1], i)

        # Pinhole back-projection with the principal point at the image centre.
        self.X = ((X - self.width / 2) * self.Z) / self.focal_length
        for i in range(self.height):
            Y[:, i] = np.full(Y.shape[0], i)
        self.Y = ((Y - self.height / 2) * self.Z) / self.focal_length

        # Rows of df: x, y, z, r, g, b; one column per pixel in row-major order.
        df = np.zeros((6, self.width * self.height))
        df[0] = self.X.T.reshape(-1)
        df[1] = -self.Y.T.reshape(-1)
        df[2] = -self.Z.T.reshape(-1)
        img = np.array(self.rgb)
        df[3] = img[:, :, 0:1].reshape(-1)
        df[4] = img[:, :, 1:2].reshape(-1)
        df[5] = img[:, :, 2:3].reshape(-1)
        self.df = df
        t2 = time.time()
        print('calculate 3d point cloud Done.', t2 - t1)

    def write_ply(self):
        t1 = time.time()
        float_formatter = lambda x: "%.4f" % x
        points = []
        for i in self.df.T:
            points.append("{} {} {} {} {} {} 0\n".format
                          (float_formatter(i[0]), float_formatter(i[1]), float_formatter(i[2]),
                           int(i[3]), int(i[4]), int(i[5])))

        # PLY header lines must start at column 0, so build the header without indentation.
        header = "\n".join([
            "ply",
            "format ascii 1.0",
            "element vertex %d" % len(points),
            "property float x",
            "property float y",
            "property float z",
            "property uchar red",
            "property uchar green",
            "property uchar blue",
            "property uchar alpha",
            "end_header",
        ]) + "\n"
        with open(self.pc_file, "w") as file:
            file.write(header + "".join(points))
        t2 = time.time()
        print("Write into .ply file Done.", t2 - t1)


a = point_cloud_generator('rgb1.png', 'depth1.png', '1.ply', focal_length=50, scalingfactor=1)
a.calculate()
a.write_ply()
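
For comparison, here is a minimal vectorized sketch of the same pinhole back-projection. It is not the author's code. The function name depth_to_points and its parameters are hypothetical, the principal point is assumed to sit at the image centre unless given, and focal_length is assumed to be expressed in pixels. A focal length far smaller than the image width (such as 50 for a typical photo) stretches the cloud sideways, which may be one reason for the mismatch; a depth map that is only defined up to scale and shift would distort it as well.

```python
import numpy as np


def depth_to_points(depth, focal_length, cx=None, cy=None):
    """Back-project an (H, W) depth map to an (H*W, 3) array of XYZ points.

    Assumes a pinhole camera with focal_length in pixels. If cx / cy are not
    given, the principal point is assumed to be the image centre.
    """
    depth = np.asarray(depth, dtype=np.float64)
    height, width = depth.shape
    cx = width / 2.0 if cx is None else cx
    cy = height / 2.0 if cy is None else cy

    # u holds the column index and v the row index of every pixel.
    u, v = np.meshgrid(np.arange(width), np.arange(height))
    x = (u - cx) * depth / focal_length
    y = (v - cy) * depth / focal_length

    # Stack to (H, W, 3), then flatten in row-major pixel order.
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)


# Toy example: a flat 2x2 depth map at distance 1 with a focal length of 1 pixel.
points = depth_to_points(np.ones((2, 2)), focal_length=1.0)
```

The resulting array can be paired with the flattened RGB values and written out with the write_ply method above.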

Problem about the depth in sky region.

DiverseDepth/Train/data/diverse_dataset.py

Line 101 in b88c380

depth_resize[sky_mask_resize.astype(np.bool)] = 100

Should the depth of the sky region be set to 10 or 100? The code and the comments here are inconsistent.
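
A side note on the quoted line, independent of the 10-versus-100 question: the np.bool alias was deprecated in NumPy 1.20 and removed in NumPy 1.24, so the line raises an AttributeError on recent NumPy versions. The sketch below shows the same masking with the built-in bool. The toy arrays and the SKY_DEPTH name are made up for illustration.

```python
import numpy as np

# Value taken from the quoted line; whether 10 or 100 is intended is the open question.
SKY_DEPTH = 100

# Toy stand-ins for the arrays used in diverse_dataset.py.
depth_resize = np.array([[1.0, 2.0], [3.0, 4.0]])
sky_mask_resize = np.array([[0, 1], [0, 0]], dtype=np.uint8)

# The built-in bool works on every NumPy version, unlike the removed np.bool alias.
depth_resize[sky_mask_resize.astype(bool)] = SKY_DEPTH
```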

Final result/output

In test_any_diversedepth.py, a commented-out line passes the output through a tanh function:

pred_depth = torch.nn.functional.tanh(pred_depth) + 1

Should this be done with every output of the network to get the right disparity scaling?

Thanks in advance.

Change shape for exact integer upsampling

Hi,
I am trying to convert the pretrained model to iOS Core ML. I first converted your pretrained model to ONNX and then used onnx-coreml to convert it to an iOS model. Unfortunately, resize operations in Core ML are performed by upsampling with an integer scale factor. I added some logging when running your model in Python and noticed that some of the resizes are not exact integer scales (in this case they are about 1.98):

# ====fcn_topdown_block resizing  torch.Size([1, 256, 35, 27]) to size  70 53
# ====fcn_topdown_block resizing  torch.Size([1, 256, 70, 53]) to size 140 105

If I round them in the Core ML conversion script as a hack, the model does not compile.

I wanted to know whether I can change those sizes in your model class so that they are exactly divisible by 2. I looked at the configuration file /lib/core/config.py but could not figure it out.
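
One workaround that does not touch the model class is to resize or pad the input so that every downsampling stage halves it exactly. The sketch below checks this for a given input side. It assumes each encoder stage halves the size with ceiling rounding (which would produce the 53 to 27 step in the log above) and that five halvings occur in total. Both are assumptions about the network, not values read from the repository, and all function names are hypothetical.

```python
import math


def round_up_to_multiple(size, multiple=32):
    """Smallest value >= size that is divisible by `multiple`."""
    return math.ceil(size / multiple) * multiple


def pyramid_sizes(size, levels=5):
    """Sizes after each assumed stride-2 stage, using ceiling division."""
    sizes = [size]
    for _ in range(levels):
        sizes.append(math.ceil(sizes[-1] / 2))
    return sizes


def upsampling_is_exact(size, levels=5):
    """True if every level is exactly twice the size of the next one."""
    sizes = pyramid_sizes(size, levels)
    return all(big == 2 * small for big, small in zip(sizes, sizes[1:]))


# A side of 420 passes through 105 -> 53 -> 27, so the top-down 2x steps are inexact.
print(pyramid_sizes(420), upsampling_is_exact(420))
# Rounding 420 up to the next multiple of 32 gives 448, which halves cleanly.
print(round_up_to_multiple(420), upsampling_is_exact(448))
```

If the input side is a multiple of 2 raised to the number of halvings, every top-down resize becomes an exact 2x upsampling.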

ModuleNotFoundError

I keep getting ModuleNotFoundError: No module named 'Minist_Test' when I run python ./Minist_Test/tools/test_depth.py --load_ckpt model.pth. Why?

Missing Pretrained model

Hi, I can't download the pretrained model from the link you provided. Could you share a new one, please? Much appreciated.

Looking forward to your dataset

Great work. Looking forward to your dataset.

Trained weights

Hello, thanks for sharing this great research.
I'm currently studying depth estimation - can you share your trained weights with me?

Part Fore Depth Format

Hello, thank you for posting this dataset. I am excited to start working with it, but I have a question about the part fore depth format. Specifically, all of the part fore depth images are 8-bit png files, when depths are typically stored as 16-bit in mm (i.e. multiply by 1e-3 to get to meters). So, I am wondering what the scale is for these 8-bit files? Also, is the scale for part_in and part_out actually 1e-3 like normal? It seems after some inspection that all of the 16-bit depth files in part in and part out have a max value of around 255 like the 8-bit files as well, so it would seem impossible that these depths are in mm. Without an example dataloader here, these things are impossible to determine.

Virtual camera parameters?

In the paper, the author mentioned that the true metric depth is transformed into the perspective of a virtual camera via an affine transformation.

Just wondering which virtual camera intrinsic parameters you used for the paper?

Using GT surface normals to improve the depth prediction

Thanks for the interesting work. I have found from the paper that the Virtual Normal loss is computed as a difference between the normals of the point clouds generated from the GT depth maps and predicted depth maps.
I would like to pass the ground truth surface normals instead and compute the loss. Could you give some directions to make these changes?

Scaling and translation factors

Hi, thanks for sharing the code of the paper.
How did you get the scaling and translation factors when evaluating on zero-shot datasets?
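
For readers with the same question: a common way to obtain such factors is a least-squares fit of one scale and one shift between the prediction and the ground truth over the valid pixels. The sketch below illustrates that general technique only. It is not confirmed to be the procedure used in the paper, and the function name fit_scale_shift is hypothetical.

```python
import numpy as np


def fit_scale_shift(pred, gt, mask=None):
    """Least-squares scale s and shift t such that s * pred + t approximates gt.

    `mask` optionally selects the valid ground-truth pixels.
    """
    pred = np.asarray(pred, dtype=np.float64).ravel()
    gt = np.asarray(gt, dtype=np.float64).ravel()
    if mask is None:
        valid = np.ones(pred.shape, dtype=bool)
    else:
        valid = np.asarray(mask, dtype=bool).ravel()

    # Solve [pred, 1] @ [s, t]^T = gt in the least-squares sense.
    design = np.stack([pred[valid], np.ones(int(valid.sum()))], axis=1)
    solution, _, _, _ = np.linalg.lstsq(design, gt[valid], rcond=None)
    scale, shift = solution
    return float(scale), float(shift)


# Toy check: the ground truth is an exact affine transform of the prediction.
pred = np.array([1.0, 2.0, 3.0, 4.0])
gt = 2.0 * pred + 0.5
scale, shift = fit_scale_shift(pred, gt)
```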

typo in config.py

DiverseDepth/lib/core/config.py

Line 54 in 4d5b285

__C.MODEL.ENCODER = 'resnext50_body_32x4d'

Should this be resnext50_32x4d_body_stride16?

Failed to find function: lateral_net.lateral_resnext50_body_32x4d while loading the model

ERROR net_tools.py: 32: Failed to f1ind function: lateral_net.lateral_resnext50_body_32x4d

AttributeError Traceback (most recent call last)
in ()
----> 1 test_any_diversedepth.RelDepthModel()

2 frames
/content/DiverseDepth/lib/models/diverse_depth_model.py in __init__(self)
11 super(RelDepthModel, self).__init__()
12 self.loss_names = ['Virtual_Normal']
---> 13 self.depth_model = DepthModel()
14
15 def forward(self, data):

/content/DiverseDepth/lib/models/diverse_depth_model.py in __init__(self)
153 #bottom_up_model = 'lateral_net.lateral_' + cfg.MODEL.ENCODER
154 bottom_up_model = network.__name__.split('.')[-1] + '.lateral_' + cfg.MODEL.ENCODER
--> 155 self.encoder_modules = get_func(bottom_up_model)()
156 #self.decoder_modules = lateral_net.fcn_topdown(cfg.MODEL.ENCODER)
157 self.decoder_modules = network.fcn_topdown()

/content/DiverseDepth/lib/utils/net_tools.py in get_func(func_name)
28 # Otherwise, assume we're referencing a module under modeling
29 module_name = 'lib.models.' + '.'.join(parts[:-1])
---> 30 module = importlib.import_module(module_name)
31 return getattr(module, parts[-1])
32 except Exception:

AttributeError: module 'lib.models.lateral_net' has no attribute 'lateral_resnext50_body_32x4d'
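
To see which encoder names the module actually exposes, one option is to list its callables that share the expected prefix. This is a generic diagnostic sketch. The helper name available_functions is hypothetical, and the module path in the comment is taken from the traceback above.

```python
import importlib


def available_functions(module_name, prefix):
    """Sorted names of callables in `module_name` that start with `prefix`."""
    module = importlib.import_module(module_name)
    return sorted(
        name for name in dir(module)
        if name.startswith(prefix) and callable(getattr(module, name))
    )


# Inside the repository one would call, for example:
#   available_functions('lib.models.lateral_net', 'lateral_')
# and set cfg.MODEL.ENCODER to a listed name without the 'lateral_' prefix.
# Demonstrated here on a standard-library module so the snippet runs anywhere:
print(available_functions('math', 'sq'))
```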

Problems of dataset downloading

Your work is excellent, and I appreciate that you shared your RGB-D dataset. I am wondering whether there are other download links, such as a Google Drive link, because the current link seems to point to your school's cloud drive, which is hard to download from in mainland China.

When will you release the training code ?

Hello,
Firstly, thanks for sharing your work!

As in my title, do you have a plan to release the training code?

Missing DIML Teacher Curriculum

Hello,

The teacher_curriculum.npy file is missing from the provided DIML annotations zip file (gdrive and cloudstor). I am wondering if you had a teacher curriculum for that split or not? If so, and this is an error, could you please provide it?

Adding more RGB-D data to training

Thanks for your work and publishing your models. I have tested your pre-trained models with some of my images for a different use case. The depth prediction looks promising, but not accurate. I want to train a model by adding some data from my dataset so that it could predict accurate depth. Any suggestions on how to add custom data? Do I need any other files other than the train_annotations.json and valid_annotations.json?
What is the importance of teacher_curriculum.npy? How do I create this for the custom data?

when will you release the training dataset?

Great work. I just want to check the time of releasing the dataset. Any schedule?

Diverse Data Dataset

Hi,
I cannot find the diverse data dataset you mention in your paper. Can you please share it?

missing code in diverse_depth_model.py

Hello, it seems that the code for the losses (Ranking loss, WCEL loss, SSIL loss, MSGL_Loss, YouTube3D_Ranking_Loss) defined in ModelLoss is missing. Could you provide it?

About evaluation for relative depth estimation

Hi, thank you for your work! I am new to relative depth estimation. I want to know why the evaluation of depth estimation only uses metric depth. Can I evaluate with relative depth, for example by computing RMSE and AbsRel on relative depth? Looking forward to your reply. Thank you very much!
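
As background for this question: metrics such as AbsRel and RMSE compare values in the units of the ground truth, so a relative prediction is usually aligned to the ground truth before they are computed. The sketch below uses median scaling, one common convention. It is an illustration, not the evaluation protocol of this repository, and the function names are hypothetical.

```python
import numpy as np


def median_scale(pred, gt):
    """Rescale pred so that its median matches the median of gt."""
    return pred * (np.median(gt) / np.median(pred))


def abs_rel(pred, gt):
    """Mean absolute relative error; gt must be strictly positive."""
    return float(np.mean(np.abs(pred - gt) / gt))


def rmse(pred, gt):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((pred - gt) ** 2)))


# Toy example: the prediction equals the ground truth up to an unknown scale.
gt = np.array([1.0, 2.0, 4.0])
pred = 0.5 * gt
aligned = median_scale(pred, gt)
errors = (abs_rel(aligned, gt), rmse(aligned, gt))
```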

Couple of questions regarding how to use this

Hi Wei Yin,

I'll give a quick paragraph of introduction before I pose my questions. I'm a computer science student working on a little side project. I made a fairly small script that converts video frames into audio in real time. I'm looking to test its feasibility for object detection through audio. Currently it works on grayscale images, but I'm trying to change that to depth estimates. That's why I started looking at your code. I'm a bit at a loss about how to use it, though. I tried looking into your paper, but I can't seem to open it because I'm not allowed on campus due to COVID.

Say I have extracted a video frame in my own Python module. How would I convert that image to a depth estimate? Which imports do I need, and which functions would I need to call?

How would the execution change at a lower resolution? Due to hardware limitations, I'm downscaling my frames before rendering my waveforms. Should I run your code on my frames before or after downscaling?

Lastly, if I don't use it commercially, am I allowed to use this as part of my program if I want to write up research with it?

Kind regards,
Koen

Convert depth map to point cloud map

Hello author, first of all, thank you for your outstanding work. I used your repository to test on my own dataset, and the results are very good. I see that your paper also shows figures of depth maps converted to point clouds. How did you achieve this, and what software did you use to visualize the point clouds?

generate the pointcloud

Very nice job!

Could you provide the code to get absolute depth values in meters, and camera parameters to generate the pointcloud?

Thanks!

Running Quick Start on Windows

I'm attempting to run the Quick Start code on Windows but keep getting the same error. Since "export" doesn't exist on Windows, I tried using "set" instead, but that didn't seem to help.

I get this error:
Traceback (most recent call last):
  File "./Minist_Test/tools/test_depth.py", line 1, in <module>
    from Minist_Test.lib.diverse_depth_model import RelDepthModel
ModuleNotFoundError: No module named 'Minist_Test'

Am I missing a step in setting up the environment in Conda or is it something else?
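
The error means Python cannot find the directory that contains the Minist_Test folder on its import path. In cmd.exe, "set PYTHONPATH=..." sets an environment variable, but in PowerShell "set" is an alias for Set-Variable and does not, which may explain why it had no effect. A shell-independent workaround is to put the repository root on sys.path at the top of the script, before the failing import. The helper below is a hypothetical sketch that assumes the script sits two directories below the repository root, as in ./Minist_Test/tools/test_depth.py.

```python
import sys
from pathlib import Path


def add_repo_root_to_path(script_path, levels_up=2):
    """Insert the directory `levels_up` levels above the script's folder into sys.path."""
    script = Path(script_path).resolve()
    # parents[0] is tools/, parents[1] is Minist_Test/, parents[2] is the repository root.
    root = script.parents[levels_up]
    if str(root) not in sys.path:
        sys.path.insert(0, str(root))
    return root


# Inside test_depth.py this would be add_repo_root_to_path(__file__).
root = add_repo_root_to_path("./Minist_Test/tools/test_depth.py")
```

After this call, "from Minist_Test.lib.diverse_depth_model import RelDepthModel" should resolve regardless of which shell launched Python, provided the script really sits at that depth.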

error in resnext_weights_helper.py

using the model from here and a simple:

from lib.models.diverse_depth_model import RelDepthModel
model = RelDepthModel()


---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

<ipython-input-1-75d68ac643e0> in <module>()
      5 print("__ ", cfg.MODEL.ENCODER)
      6 
----> 7 model = RelDepthModel()
      8 model.eval()
      9 """checkpoint = torch.load("resnext50_32x4d.pth")

6 frames

/content/DiverseDepth/lib/utils/resnext_weights_helper.py in convert_state_dict(src_dict)
     39         print(k)
     40         toks = k.split('.')
---> 41         if int(toks[0]) == 0:
     42             name = 'res%d.' % res_id + 'conv1.' + toks[-1]
     43         elif int(toks[0]) == 1:

ValueError: invalid literal for int() with base 10: 'model_state_dict'

k == "model_state_dict"

even when trying like this:

for k, v in src_dict["model_state_dict"].items():

k == "depth_model.encoder_modules.topdown_lateral_modules.0.lateral.conv1.weight"
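
One possible reading of this traceback: convert_state_dict expects keys whose first dot-separated token is an integer, while the file being loaded is a full checkpoint that nests its weights under "model_state_dict" and uses names such as "depth_model.encoder_modules...". If so, the file may not be the backbone-only weights the helper was written for. The sketch below checks for that using plain dictionaries. Both function names are hypothetical and the toy keys are made up for illustration.

```python
def unwrap_state_dict(checkpoint, key="model_state_dict"):
    """Return the nested state dict if the checkpoint wraps it under `key`."""
    if isinstance(checkpoint, dict) and key in checkpoint:
        return checkpoint[key]
    return checkpoint


def has_integer_leading_keys(state_dict):
    """True if every key starts with an integer token, as int(toks[0]) requires."""
    return all(name.split(".")[0].isdigit() for name in state_dict)


# Toy stand-ins for the two kinds of files; the values would normally be tensors.
full_checkpoint = {"model_state_dict": {"depth_model.encoder_modules.conv1.weight": 0}}
backbone_only = {"0.weight": 0, "1.weight": 0}

full_ok = has_integer_leading_keys(unwrap_state_dict(full_checkpoint))
backbone_ok = has_integer_leading_keys(unwrap_state_dict(backbone_only))
```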
