hlwang1124 / sne-roadseg
SNE-RoadSeg for Freespace Detection in PyTorch, ECCV 2020
Home Page: https://sites.google.com/view/sne-roadseg
License: MIT License
Hi Hengli,
Could you provide some instructions for generating dense depth maps? Using depth completion networks or simply using interpolation? If we don't have perfect depth maps, could SNE and RoadSeg Net generalize well?
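For reference, a minimal sketch of one interpolation-based densification (assuming a KITTI-style uint16 depth map in millimetres with 0 marking missing pixels; densify_depth is an illustrative name, not the repository's code):

```python
import numpy as np
from scipy.interpolate import griddata

def densify_depth(sparse_depth):
    """Fill the zero (missing) pixels of a sparse depth map by
    linear interpolation over the valid measurements."""
    h, w = sparse_depth.shape
    valid = sparse_depth > 0
    ys, xs = np.nonzero(valid)
    points = np.stack([ys, xs], axis=1)
    values = sparse_depth[valid].astype(np.float32)
    grid_y, grid_x = np.mgrid[0:h, 0:w]
    dense = griddata(points, values, (grid_y, grid_x), method='linear')
    # linear interpolation leaves NaNs outside the convex hull of the
    # valid points; fall back to nearest-neighbour there
    nearest = griddata(points, values, (grid_y, grid_x), method='nearest')
    dense[np.isnan(dense)] = nearest[np.isnan(dense)]
    return dense
```

Whether SNE and RoadSeg generalize under imperfect depth is a separate question for the authors; the sketch above only fills holes, it does not correct noisy measurements.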
Hello,
I'm very new to training/experimenting with models, so I apologize if I made some obvious mistakes when going through your instructions.
I'm trying to run the example you provided with ./scripts/run_example.sh. The only file I downloaded in addition to the repo content is the "checkpoints" archive. I don't have the KITTI road dataset or the "depth_u16" data, as I believe they are not necessary for testing on the first example.
Here are the steps I followed:
1- I cloned the repository to my desktop.
2- I added the "--gpu_ids -1" flag to run_example.sh as I got some errors otherwise. I think this runs the model on CPU.
3- I downloaded the "checkpoints.zip" and extracted it to the root directory.
4- I created a docker container from the provided Dockerfile (with docker build etc.)
5- I ran the docker container.
6- I copied the repo content (with the changes I made, e.g. the additional flag and the "checkpoints" folder) to the container with the "docker cp" command.
7- I exec into the docker container and view the contents as:
8- I run the ./scripts/run_example.sh command and see:
Can you spot where I'm making a mistake? I would appreciate any help. Thank you in advance.
Thank you for your excellent work!
I ran this script:
bash ./scripts/test.sh, and got these results:
Epoch kitti test loss: 0.568 loss: 0.376
Epoch kitti glob acc : 0.820, pre : 0.000, recall : nan, F_score : nan, IoU : 0.000
Is this global accuracy useful for judging whether the model is good or bad?
Thanks for sharing your work!
I tried to use the SNE module to produce a normal image from a depth image, but the quality of the result does not seem good.
The depth image I used is from YCB_Video.
import cv2
import numpy as np
import torch

depth_image = cv2.imread(depth_path, cv2.IMREAD_ANYDEPTH)  # uint16 depth in mm
sne_model = SNE()
camParam = torch.tensor([[1.066778e+03, 0.000000e+00, 3.129869e+02],
                         [0.000000e+00, 1.067487e+03, 2.413109e+02],
                         [0.000000e+00, 0.000000e+00, 1.000000e+00]], dtype=torch.float32)  # camera intrinsics
# convert millimetres to metres before estimating normals
normal = sne_model(torch.tensor(depth_image.astype(np.float32) / 1000), camParam)
normal_image = normal.cpu().numpy()
normal_image = np.transpose(normal_image, [1, 2, 0])  # CHW -> HWC
# map normals from [-1, 1] to [0, 255] for visualization
cv2.imwrite(normal_path, cv2.cvtColor(255 * (1 + normal_image) / 2, cv2.COLOR_RGB2BGR))
Could you give me some advice? Thanks!
Hi. I would like to train this model on the R2D dataset. However, I could not find any information about the calibration files, even after referring to the CARLA homepage and searching for 'calibration'. I would like to know how to obtain the calibration files for the R2D dataset.
Thanks in advance.
Hello, I tried to run "run_example.py" to test it on an image in the test dataset, and it works very well. Great work!! But when I tried to run "test.py", an error appeared.
Traceback (most recent call last):
  File "test.py", line 31, in <module>
    for i, data in enumerate(dataset):
  File "C:\Users\sf995\iCloudDrive\DL\SNE-RoadSeg\data\__init__.py", line 71, in __iter__
    for i, data in enumerate(self.dataloader):
  File "D:\anaconda\envs\pytorch\lib\site-packages\torch\utils\data\dataloader.py", line 193, in __iter__
    return _DataLoaderIter(self)
  File "D:\anaconda\envs\pytorch\lib\site-packages\torch\utils\data\dataloader.py", line 469, in __init__
    w.start()
  File "D:\anaconda\envs\pytorch\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "D:\anaconda\envs\pytorch\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "D:\anaconda\envs\pytorch\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "D:\anaconda\envs\pytorch\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "D:\anaconda\envs\pytorch\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'CustomDatasetDataLoader.initialize.<locals>.<lambda>'
(pytorch) C:\Users\sf995\iCloudDrive\DL\SNE-RoadSeg>Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\anaconda\envs\pytorch\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "D:\anaconda\envs\pytorch\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
It seems to be about the lambda function. Could you help me?
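The diagnosis is right: on Windows, DataLoader workers are started with the "spawn" method, so everything handed to them must be picklable, and a lambda defined inside a method is not. A hedged sketch of the two usual workarounds (TinyDataset and identity_collate are illustrative stand-ins, not the repository's classes):

```python
import torch
from torch.utils.data import DataLoader, Dataset

class TinyDataset(Dataset):
    """Minimal stand-in dataset, for illustration only."""
    def __len__(self):
        return 4
    def __getitem__(self, idx):
        return torch.tensor([idx])

def identity_collate(batch):
    """Module-level replacement for the un-picklable lambda."""
    return torch.stack(batch)

# Workaround 1: a module-level collate_fn can be pickled, so worker
# processes start fine under Windows' "spawn" start method.
loader = DataLoader(TinyDataset(), batch_size=2,
                    collate_fn=identity_collate)

# Workaround 2: num_workers=0 runs loading in the main process, so
# nothing has to be pickled at all.
loader_single = DataLoader(TinyDataset(), batch_size=2, num_workers=0)
```

In this repository the corresponding change would be to lift the offending lambda out of CustomDatasetDataLoader.initialize into a top-level function, or to set the worker count to 0 when running on Windows.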
I cannot download this dataset; the downloads are frequently interrupted!
Hello,
do you have an already trained model on the R2D dataset? Can you please provide a link?
Thank you
Hello!
I downloaded the R2D dataset you published, and I want to use it further. Where can I get more information about this dataset? For example, which category each label corresponds to, the related settings, and so on.
Thank you very much!!!
Hi @hlwang1124 ,
Congratulations for your paper acceptance! I'm a newbie in road segmentation and for now I want to ask some questions regarding your freespace segmentation.
Check the following two plots,
Comparing the two plots, the ground truth (bottom) has only one lane labeled, while your prediction (top) labels two lanes. This is impressive. Can you tell me how this works? I assume you have labels for only one lane (namely, the ground-truth one), so how is SNE able to generalize and infer both lanes?
Thank you if you can leave me some hints.
I used the example script for the KITTI dataset test.
The depth image is from the KITTI stereo 2015 ground truth (a disparity image, not depth) and corresponds to the RGB image.
I kept the camera parameters unchanged and computed depth from disparity as below:
# KITTI stereo GT stores disparity * 256 as uint16
depth_image = depth_image.astype(np.float32)
row = depth_image.shape[0]
col = depth_image.shape[1]
depth_image = np.true_divide(np.ones((row, col)), depth_image)  # 1 / raw disparity
depth_image = depth_image * 721.5377 * 0.537 * 256.0  # Z = f * B / d, with d = raw / 256
normal = sne_model(torch.tensor(depth_image.astype(np.float32)), camParam)
This is the example depth:
The example result:
My disparity image:
And my result (it does not look smooth and is filled with noise):
I wonder what the green upper half of the result means?
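The green upper half is most likely the invalid region: KITTI stereo ground truth stores disparity × 256 as uint16, with 0 meaning "no measurement" (which covers the sky and most of the upper image), so dividing by it produces inf/NaN and the SNE output there is meaningless. A sketch of the same conversion with the invalid pixels masked explicitly (disparity_to_depth is an illustrative helper, not repository code):

```python
import numpy as np

def disparity_to_depth(disp_u16, focal=721.5377, baseline=0.537):
    """Convert a KITTI uint16 disparity map (disparity * 256) to metric
    depth, leaving pixels with no measurement at 0 instead of inf."""
    disp = disp_u16.astype(np.float32) / 256.0   # back to pixel disparity
    valid = disp > 0
    depth = np.zeros_like(disp)
    depth[valid] = focal * baseline / disp[valid]  # Z = f * B / d
    return depth, valid
```

The `valid` mask can then be used to ignore (or inpaint) the invalid region before feeding depth into the SNE module, which should remove most of the noise outside the measured area.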
Hi, I have some well-labeled 0/1 masks and the corresponding training images. I would like to know how to generate the depth_u16.png images required for training.
Hi @hlwang1124 ,
I just checked your source code and have one question. You defined a variable called "Lambda L1 loss" in the parser, but I am not able to find where you apply it. Can you tell me if you use it to train your model?
Thank you,
Thank you for your excellent work!
I got a result like the one in #17 (comment).
The following are the disparity, normal, and prob_map images.
As you can see, the final result is worse than yours in #17 (comment).
Is the quality of my results normal?
Could you give me some advice on how to improve it?
Thank you in advance!
Hello!! I tried to run the training program. My GPU is an NVIDIA GTX 1080 with 8 GB of memory, but it always reports CUDA out of memory during training. I lowered the batch size to 1, but it still reports an error like this:
RuntimeError: CUDA out of memory. Tried to allocate 118.00 MiB (GPU 0; 7.92 GiB total capacity; 6.24 GiB already allocated; 139.69 MiB free; 304.82 MiB cached)
I then ran the program on a card with more memory (11 GB). It starts training successfully, but after several epochs it always makes the computer restart. What is going wrong? Please help! Thanks!
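A generic sketch of the usual memory-saving knobs when training exhausts GPU memory even at batch size 1 (these are illustrative helpers, not repository flags; shrinking the input resolution cuts activation memory roughly quadratically with the scale factor):

```python
import torch
import torch.nn.functional as F

def report_gpu_memory(device=0):
    """Print allocated vs. cached CUDA memory in MiB (requires a GPU)."""
    alloc = torch.cuda.memory_allocated(device) / 2**20
    cached = torch.cuda.memory_reserved(device) / 2**20
    print(f"allocated: {alloc:.1f} MiB, cached: {cached:.1f} MiB")

def downscale(batch, scale=0.5):
    """Downscale an NCHW batch before the forward pass to reduce
    activation memory."""
    return F.interpolate(batch, scale_factor=scale,
                         mode='bilinear', align_corners=False)
```

A smaller backbone (e.g. ResNet-18 instead of ResNet-152) is another common lever. The machine restarting under sustained GPU load is usually a power-supply or thermal issue rather than a software one, so monitoring temperatures is worth a try.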
I am wondering if this model can handle the segmentation task without the depth image, or whether some components could be changed to achieve this.
Hello,
I'm planning to initialize the network with the pretrained model (on KITTI) that you have published and further train the network with R2D dataset.
1- Do you think this is a good idea?
2- In an issue you said that the network is very likely to converge in 200 epochs when training with the R2D dataset from scratch. In my case (training with R2D in addition to KITTI), how many epochs do you think I need to get good results?
Thank you
Hello Hengli, is the calib folder necessary? Is it suitable for KITTI odometry sequences 00-10?
I tried to generate the normal feature image from my own depth data (captured by a RealSense camera) with your SNE module, but the upper half of the resulting normal image is NaN.
Hi @hlwang1124 ,
Would you mind if I ask whether you used any pretrained weights to train this model? For example, ImageNet pretrained weights for the backbone.
Thank you if you can provide me with some insights.
Wish you a great day,
Hi Hengli, is the pre-trained KITTI road model suitable for KITTI sequences 00-10? The color images in KITTI sequences 00-10 are 1241x376 pixels, while those in the KITTI road dataset are 1242x375.
I've generated some depth maps using MiDaS for non-KITTI images, then inverted the depth image and fed it into the SNE-RoadSeg network, but the result is hardly usable. Is there a way to segment the road without using a depth image? The use case is to segment out the ground region from arbitrary RGB images only.
Hi,
Thank you for your work. Could I ask how to split the R2D road dataset?
Which part belongs to training and which part to evaluation (the split files)?
Many thanks
Would you please share your evaluation code (devkit_road) with us?
The original code downloaded from http://www.cvlibs.net/datasets/kitti/eval_road.php uses very old dependencies (Python 2.7 / numpy 1.51 / OpenCV 2.4.6); it is almost impossible to rebuild that environment.
Do you also use the original code? If not, would you please share your new code with us?
The evaluation method you use during training is different from the evaluation code in devkit_road, right?
Therefore, the evaluation results shown during training can only be regarded as an indicator and cannot be used as the result for submission to the server. Am I correct?
Thanks in advance!
Thanks again for your previous reply!
After the environment was set up (datasets, pretrained model), I ran the test script:
bash ./scripts/test.sh
The F_score is nan, as shown below:
model [RoadSegModel] was created
loading the model from ./checkpoints/kitti/kitti_net_RoadSeg.pth
---------- Networks initialized -------------
[Network RoadSeg] Total number of parameters : 201.325 M
-----------------------------------------------
Epoch kitti test loss: 0.838 loss: 0.569
Epoch kitti glob acc : 0.821, pre : 0.000, recall : nan, F_score : nan, IoU : 0.000
No error occurred.
Would you please enlighten me on which part probably went wrong?
Thanks in advance!
After training with the default hyperparameters, my evaluation results from the KITTI server are not as good as those in your report:
MaxF
UM_Road: 95.12%
UMM_Road: 97.05%
UU_Road : 94.39%
urban_road: 95.76%
Could you please give me any suggestions to improve the performance?
Thanks in advance.
The following error occurs while loading the model:
RuntimeError: Error(s) in loading state_dict for RoadSeg:
size mismatch for encoder_another_conv1.weight: copying a param with shape torch.Size([64, 3, 7, 7]) from checkpoint, the shape in current model is torch.Size([64, 1, 7, 7]).
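The mismatch suggests the checkpoint was trained with the SNE branch enabled (3-channel surface normals as the second input), while the current model was built to take 1-channel depth. The cleaner fix is to construct the model the same way the checkpoint was trained; if the checkpoint really must be loaded into a 1-channel model, one common (but lossy) trick is to collapse the input dimension of that single weight (collapse_stem_to_1ch is an illustrative helper, not repository code):

```python
import torch

def collapse_stem_to_1ch(state, key='encoder_another_conv1.weight'):
    """Sum the input-channel dimension of a 3-channel stem conv so the
    checkpoint fits a 1-channel model. Whether this makes semantic
    sense depends on the two inputs being comparable; rebuilding the
    model to accept 3-channel normals is the cleaner fix."""
    state[key] = state[key].sum(dim=1, keepdim=True)
    return state
```

Applied to the loaded state dict, the [64, 3, 7, 7] tensor becomes [64, 1, 7, 7], matching the current model's expectation.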
In the R2D dataset, I read the depth image as an RGB image, followed the instructions in the README to convert the 3-channel image into depth values, and stored the result as uint16. But the result seems incorrect, because the depth values are not continuous on the road. Could you tell me how to use the depth images correctly? Thanks a lot.
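Since R2D was generated with CARLA, the depth images most likely follow CARLA's 24-bit RGB encoding, in which the three 8-bit channels form one value normalised to a 1000 m range; note that cv2.imread returns BGR, so unpacking the channels in the wrong order would produce exactly the kind of discontinuous road depth described above. A hedged sketch (verify against the README; decode_carla_depth is an illustrative name):

```python
import numpy as np

def decode_carla_depth(bgr_image):
    """Decode a CARLA-style RGB-encoded depth image to metres.
    CARLA: normalized = (R + G*256 + B*256*256) / (256**3 - 1),
    depth_m = 1000 * normalized. The input is assumed BGR as
    returned by cv2.imread."""
    b, g, r = [bgr_image[:, :, i].astype(np.float64) for i in range(3)]
    normalized = (r + g * 256.0 + b * 256.0 * 256.0) / (256.0 ** 3 - 1)
    return 1000.0 * normalized

def to_depth_u16(depth_m, scale=1000.0):
    """Store as uint16 (here assumed millimetres); values beyond
    the uint16 range are clipped, so far-away pixels saturate."""
    return np.clip(depth_m * scale, 0, 65535).astype(np.uint16)
```

If the repository's depth_u16 files use a different scale than millimetres, only the `scale` argument needs to change; the channel unpacking is the part most likely to explain a discontinuous road.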
Hello Wang:
@hlwang1124 I found that the generated dense depth image may influence the inference performance. Can you provide the dense depth image generation code for us? I tried your method mentioned in #7, but the generated result differs from your provided depth_u16 data. My email: [email protected]. Thanks a lot!
Hi, nice work! I'm confused with file palette.txt, can you please explain the usage of this file?
What I understand is that when SNE finds the optimal normal direction, it ignores the fluctuations in the x and y directions and only optimizes the z direction. Is this understanding correct? If so, why is this possible?
Hi,
Could you please help me with the conversion from the test results to the BEV format before submitting to the KITTI server?
I tried the devkit_road from the KITTI website, and I believe it needs to run under Python 2.7; unfortunately, there is always some problem running it.
Would you please share your conversion code and execution steps with us if possible? It would be a great help!
Thanks in advance!
I'm working on a road estimation project for self-study. I'm trying to convert the depth images in R2D, but even when I use the converted depth image with the run_example code, the accuracy is bad. So I'm also interested in the conversion script you described. Could you please share it with me? Thank you!
Email : [email protected]
Hi, I am a newbie in semantic segmentation. Can images annotated with labelme be used as a dataset? If so, how do I create and use such a dataset, and do I need to retrain the model on it? How can the trained model then be used for testing?
I look forward to your response.
Hi, after reading your paper (https://arxiv.org/pdf/2107.14599.pdf) I have a question on SNE-RoadSeg+ you proposed.
The solution to the extreme problem in equation (3) is not so trivial. Can you give some derivations? Thanks so much.
Hello
My input is a single RGB image, and I would like to get its road-segmented output, but I still have some problems running the code.
It seems that, in addition to the input image, the code also needs a depth image. I was wondering what this depth image is. How can I generate it for my input RGB image?
I am running the following script. the path "datasets/kitti/testing" contains my input image only.
python3 test.py --dataroot datasets/kitti --dataset kitti --name kitti --use_sne --prob_map --no_label --epoch kitti
I really appreciate it if you can help.
Best regards
Behnam
Hello, is SNE-RoadSeg suitable for a campus dataset? In particular, when the road is not smooth enough, what is the effect on road detection?
Hi, the link you provided for downloading depth_u16 is inaccessible.
Could you please provide another address? Thank you so much.
Hello, the F-score in your paper refers to MaxF like KITTI benchmark, is it right?
Hello! I tried to run the example script, but the predicted image is black without any data. I also tried to run the test script and got nan results. What is wrong? Can anyone help, please? Thanks!!!
From where can I download it?
Hello, I tried to use data captured by a ZED 2 depth camera, shown below, to obtain road predictions, but the network doesn't work at all. Do you have any ideas about the results? I also noticed that in your paper three datasets are used to train the model, while the code here is only for the KITTI dataset. Thank you in advance!
Traceback (most recent call last):
  File "train.py", line 58, in <module>
    model.optimize_parameters()
  File "/content/gdrive/My Drive/SNE-RoadSeg-master/models/roadseg_model.py", line 59, in optimize_parameters
    self.forward()
  File "/content/gdrive/My Drive/SNE-RoadSeg-master/models/roadseg_model.py", line 50, in forward
    self.output = self.netRoadSeg(self.rgb_image, self.another_image)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/gdrive/My Drive/SNE-RoadSeg-master/models/networks.py", line 236, in forward
    x3_2 = self.conv3_2(torch.cat([x3_0, x3_1, self.up4_1(x4_1)], dim=1))
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/gdrive/My Drive/SNE-RoadSeg-master/models/networks.py", line 94, in forward
    x = self.up(x)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/upsampling.py", line 131, in forward
    return F.interpolate(input, self.size, self.scale_factor, self.mode, self.align_corners)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py", line 2563, in interpolate
    return torch._C._nn.upsample_bilinear2d(input, _output_size(2), align_corners)
RuntimeError: CUDA out of memory. Tried to allocate 60.00 MiB (GPU 0; 11.17 GiB total capacity; 10.34 GiB already allocated; 30.81 MiB free; 489.75 MiB cached)
I can't download checkpoints.zip; the downloads are frequently interrupted!
Hello,
when training the model from scratch on the KITTI road dataset, how many epochs should I run for the network to converge without overfitting?
Note that I divided the 289 training images into 173 training, 58 validation, and 58 test images. If you have a better configuration, I would appreciate it if you could share it.
Thanks.
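For reference, the 173/58/58 split described above can be sketched with a fixed seed so it stays reproducible across runs (split_dataset is an illustrative helper, not repository code):

```python
import random

def split_dataset(filenames, n_train=173, n_val=58, seed=0):
    """Shuffle deterministically, then slice into train/val/test."""
    names = sorted(filenames)          # sort first so input order is irrelevant
    random.Random(seed).shuffle(names)
    train = names[:n_train]
    val = names[n_train:n_train + n_val]
    test = names[n_train + n_val:]
    return train, val, test
```

One caveat with the KITTI road data: the 289 images span three scene categories (um/umm/uu), so a stratified split per category may be preferable to a single global shuffle.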
(R2D dataset)
How many epochs did you train RoadSeg-152 for?
The default maximum epoch value is 1000, but that seems too large.
Thanks.