the-learning-and-vision-atelier-lava / pam

[TPAMI 2020] Parallax Attention for Unsupervised Stereo Correspondence Learning

Python 97.77% Shell 2.23%

super-resolution parallax attention-mechanism stereo-matching

pam's Issues

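Is it still unsupervised if left_disp and right_disp take part in training?

Hello, thank you for your work and for open-sourcing the code; I have learned a lot from it.
I have one question I would like to ask:

According to the code in train.py,
mask_left = ((disp_left > 0) & (disp_left < 192)).float()
mask_right = ((disp_right > 0) & (disp_right < 192)).float()
and
loss_PAM_P = loss_pam_photometric(img_left, img_right, att, valid_mask, [mask_left, mask_right])
information from disp_left and disp_right is used during training. Does this conflict with unsupervised training?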

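[Test interface for a single image pair]

Hello, and thank you for your outstanding work! test.py does not seem to include an interface for testing a single image pair. Could you provide one when you have time? Also, can PAM output disparity maps for image pairs of arbitrary resolution?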

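Generalization performance and fine-tuning

Hello, thank you very much for your work. How well does this model generalize? Does it give good results in real-world scenes? If I want to train on my own real data, do I still need to label it?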

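Why is the CPU fully loaded while the GPU stays idle at runtime?

CUDA and the related dependencies are installed, and I have confirmed that the default device is the GPU. How can I solve this?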

a problem with the default learning configuration

lr = cfg.lr * (cfg.gamma ** -(epoch // cfg.n_steps))
With cfg.gamma = 0.1, the negative exponent makes the scaling factor greater than 1, so wouldn't the lr become larger during training?
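The concern can be checked numerically. The following is a minimal sketch, not repository code: the helper names `lr_as_quoted` and `lr_decay` are hypothetical, and the values chosen for the base rate, `gamma`, and `n_steps` are assumptions made only for illustration.

```python
def lr_as_quoted(base_lr, gamma, n_steps, epoch):
    # Expression exactly as quoted in the issue (note the negative exponent).
    return base_lr * (gamma ** -(epoch // n_steps))


def lr_decay(base_lr, gamma, n_steps, epoch):
    # Conventional step decay: positive exponent, so gamma < 1 shrinks the rate.
    return base_lr * (gamma ** (epoch // n_steps))


if __name__ == "__main__":
    # Assumed values for illustration only; not taken from the repository config.
    base_lr, gamma, n_steps = 1e-3, 0.1, 10
    for epoch in (0, 10, 20):
        print(epoch,
              lr_as_quoted(base_lr, gamma, n_steps, epoch),
              lr_decay(base_lr, gamma, n_steps, epoch))
```

With `gamma` below 1, the quoted expression multiplies the rate by `1 / gamma` at every step boundary, while the positive-exponent form multiplies it by `gamma`.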

Small mistake about the metric.

Hi, thank you very much for your great work!
There may be a small mistake in your metric code.
According to the KITTI benchmark, 'D1' is the percentage of stereo disparity outliers in the first frame.
Outliers are defined as pixels whose disparity error is larger than max(3 px, 0.05 d_truth), where d_truth denotes the ground-truth disparity.
I guess your 'D3' refers to 'bad3', which is the percentage of "bad" pixels whose error is greater than 3 px.
If I am wrong, please tell me.
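The difference between the two metrics, as defined in this issue, can be sketched as follows. This is an illustration only and not the repository's metric code: the function names `bad3` and `d1_outliers` are hypothetical, disparities are plain Python lists, and treating pixels with ground truth <= 0 as invalid is an assumption.

```python
def _valid_pairs(pred, gt):
    # Assumption: pixels with non-positive ground truth carry no label.
    return [(p, g) for p, g in zip(pred, gt) if g > 0]


def bad3(pred, gt):
    # Fraction of valid pixels whose absolute error exceeds a fixed 3 px.
    pairs = _valid_pairs(pred, gt)
    return sum(abs(p - g) > 3.0 for p, g in pairs) / len(pairs)


def d1_outliers(pred, gt):
    # Fraction of valid pixels whose absolute error exceeds
    # max(3 px, 0.05 * ground truth), as stated in the issue.
    pairs = _valid_pairs(pred, gt)
    return sum(abs(p - g) > max(3.0, 0.05 * g) for p, g in pairs) / len(pairs)


if __name__ == "__main__":
    pred = [10.0, 54.0, 104.0, 20.0]
    gt = [10.5, 50.0, 100.0, 0.0]  # last pixel has no ground truth
    print("bad3:", bad3(pred, gt))
    print("D1:  ", d1_outliers(pred, gt))
```

The two values diverge for large ground-truth disparities: an error of 4 px at a true disparity of 100 px counts as bad under the fixed 3 px threshold but not under the relative 5% threshold.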

the release time

@LongguangWang
Mr Wang,
I am very interested in your work, and given your excellent disparity results I would like to build on your code for some unsupervised stereo matching work. When will the code be released? Thank you very much!

Can the code run without CUDA?

Hello, I would like to ask whether the code can be run without CUDA.

Where can the images used by this code be downloaded?

parser.add_argument('--datapath', default='D:/LongguangWang/Data/SceneFlow', help='data path')
Thanks.

[performance comparison] Is PAM performing better than MonoDepthV2?

Thank you for your excellent work! Do you provide performance comparisons with other unsupervised (or self-supervised) depth estimation methods such as MonoDepthV2?

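Questions about the epipolar line and disparity in parallax attention

Hello, and sorry to bother you when you are busy. I am working on a project that I hope can build on your work, and I have two questions:
1. Does the algorithm still apply if the offset between the left image and the right image is not horizontal, for example when the two views come from a person observing an object with a tilted head? As I understand it, the epipolar line is horizontal, and the algorithm multiplies the matrices of Q and K row by row, which ensures that attention is computed only along the epipolar line.
2. Does the algorithm apply if the disparity between the left image and the right image is small, for example an offset of only 2 pixels? Thank you!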
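The row-by-row multiplication of Q and K described in question 1 can be sketched in plain Python as below. This follows the questioner's description only and is not the repository's implementation: the function name `rowwise_attention`, the `[H][W][C]` nested-list layout, and the plain dot-product score followed by a softmax are all assumptions made for illustration.

```python
import math


def softmax(scores):
    # Numerically stable softmax over one list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rowwise_attention(q, k):
    """q, k: nested lists shaped [H][W][C]; returns attention shaped [H][W][W].

    Each image row is handled independently, so a query pixel in row h
    is only compared with key pixels in the same row h.
    """
    att = []
    for q_row, k_row in zip(q, k):
        row_att = []
        for q_vec in q_row:
            scores = [sum(a * b for a, b in zip(q_vec, k_vec)) for k_vec in k_row]
            row_att.append(softmax(scores))
        att.append(row_att)
    return att


if __name__ == "__main__":
    q = [[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
         [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]]]
    k = [[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
         [[2.0, 0.0], [0.0, 1.0], [1.0, 0.0]]]
    for row in rowwise_attention(q, k)[0]:
        print(row)
```

Because no score is ever computed between pixels in different rows, a sketch like this can only match along horizontal lines, which is the restriction the question is asking about.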

[Question about --max_disp in training]

What is the difference between setting max_disp to 192 and setting it to 0?

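Why does the loss computation for unsupervised stereo matching involve so many ground-truth disparity values?

Hello, thank you very much for your work and for sharing the code. The paper is about unsupervised stereo matching, so why are mask_left and mask_right in the loss computation derived from ground-truth disparity values and then used in the loss? I find this confusing. Thank you for your reply.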

Can the code be trained on the KITTI dataset?

Hello, can the model be trained on KITTI? train.py only contains train_set = SceneFlowDatset(), but you also provide KITTIDataset(), so can I replace SceneFlowDatset with KITTIDataset for training?

forward() missing 2 required positional arguments: 'x_left' and 'x_right'

Mr Wang,
I'm very interested in your work, but I have run into some problems when training on the SceneFlow dataset.
When debugging with two GPUs:
if the batch size is set to 1, I get "TypeError: forward() missing 2 required positional arguments: 'x_left' and 'x_right'";
if the batch size is set to 2, I get "RuntimeError: CUDA error: out of memory".
How can I solve this?

debug

@LongguangWang
Mr Wang,

File "/home/PAM-master/PASMnet/models/modules.py", line 130, in forward
cost_right2left = torch.tril(cost_right2left)
RuntimeError: invalid argument 1: expected a matrix at /opt/conda/conda-bld/pytorch_1544202130060/work/aten/src/THC/generic/THCTensorMathPairwise.cu:174

I'm very interested in your work. The error above occurs while training on the SceneFlow dataset. Could you tell me what causes it and how to solve it?

How to calculate the initial disparity map?

Thank you for your wonderful work and codes.
Could you help me understand the code below?
https://github.com/LongguangWang/PAM/blob/7200dd201047ceb2d98eeea8215b773d11fff8d4/PASMnet/models/modules.py#L199

Why not directly use the sum of all disparity candidates weighted by the parallax-attention map?
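For reference, the alternative proposed in the question (a sum of disparity candidates weighted by the attention map) can be sketched as below. This is not what the linked line does, which is not reproduced here. The function name `attention_to_disparity` is hypothetical, and the convention that left pixel `i` matching right pixel `j` corresponds to a disparity of `i - j` is an assumption.

```python
def attention_to_disparity(att_row):
    """Attention-weighted disparity for one image row.

    att_row[i][j] is the weight with which left pixel i attends to right
    pixel j; each att_row[i] is assumed to sum to 1.
    Assumed convention: matching (i, j) implies disparity i - j.
    """
    disparity = []
    for i, weights in enumerate(att_row):
        disparity.append(sum(w * (i - j) for j, w in enumerate(weights)))
    return disparity


if __name__ == "__main__":
    # Left pixels 1 and 2 each match the right pixel one position to their left.
    att_row = [
        [1.0, 0.0, 0.0],
        [1.0, 0.0, 0.0],
        [0.0, 1.0, 0.0],
    ]
    print(attention_to_disparity(att_row))
```

When the attention for a pixel is spread over several candidates, this weighted sum returns a value between them, so it is sensitive to how the attention behaves in occluded or ambiguous regions.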