[TPAMI 2018] Predicting the Driver’s Focus of Attention: the DR(eye)VE Project. A deep neural network learnt to reproduce the human driver focus of attention (FoA) in a variety of real-world driving scenarios.

Home Page: https://arxiv.org/pdf/1705.03854.pdf

License: MIT License

C++ 5.22% MATLAB 22.04% Makefile 2.69% Shell 0.10% C 55.36% Roff 0.52% Objective-C 0.95% Python 8.32% HTML 3.72% CSS 0.77% Clean 0.16% TeX 0.16%

driver-focus attention autonomous-vehicles autonomous-driving deep-learning convolutional-neural-networks adas automotive computer-vision

dreyeve's Introduction

DR(eye)VE Project: code repository

A deep neural network trained to reproduce the human driver focus of attention.

Results (video)

How-To

This repository was used throughout the work presented in the paper, so it contains a fairly large amount of code. Nonetheless, it should be easy to navigate. In particular:

docs: project supplementary website, holding additional information about the paper.
dreyeve-tobii: C++ code to acquire gaze over DR(eye)VE sequences with a Tobii EyeX.
semseg: Python project to compute semantic segmentation over all frames of a DR(eye)VE sequence.
experiments: Python project holding the code for the experimental section.
matlab: MATLAB code to compute optical flow and blends, and to create the new fixation ground truth.

The experiments section is probably the one of most interest to the reader, since it contains the code used to develop and train our model as well as the baselines and competitors. More detailed documentation is available there.

All Python code has been developed and tested with Keras 1, using Theano as the backend.

Pre-trained weights:

Pre-trained weights of the multi-branch model can be downloaded from this link.

The code accompanies the following paper:

  @article{palazzi2018predicting,
  title={Predicting the Driver's Focus of Attention: the DR (eye) VE Project},
  author={Palazzi, Andrea and Abati, Davide and Solera, Francesco and Cucchiara, Rita},
  journal={IEEE transactions on pattern analysis and machine intelligence},
  volume={41},
  number={7},
  pages={1720--1733},
  year={2018},
  publisher={IEEE}
}

dreyeve's Issues

how to download the DR(eye)VE data?

I'm sorry to bother you here, but I can't find the link to the DR(eye)VE dataset in the TPAMI18 paper.
Can you tell me how to download the DR(eye)VE dataset? Thanks very much!

Merging branch predictions

Hello,

I am using this model for a similar experiment in predicting eye movement fixation maps, and was wondering what the reasoning is for simply summing and normalizing predictions from separate branches instead of doing some sort of weighted sum or concatenation? I am still in the early stages of training this network but my image branch seems to be significantly outperforming my optical flow branch in terms of validation and test set loss - if my final multi branch architecture simply averages the outputs of the two, will it not just perform worse than if only the better performing branch is used?
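
To make the question concrete, here is a minimal NumPy sketch contrasting the two merging strategies mentioned above: an unweighted sum followed by normalization, and a weighted sum. The function names, the example weights, and the choice of normalizing to unit sum are illustrative assumptions, not the repository's code.

```python
import numpy as np


def normalize_map(saliency):
    """Scale a non-negative map so that its entries sum to 1."""
    total = saliency.sum()
    if total <= 0:
        # Degenerate case: fall back to a uniform map.
        return np.full_like(saliency, 1.0 / saliency.size, dtype=np.float64)
    return saliency / total


def merge_sum(branch_maps):
    """Unweighted merge: sum the branch predictions, then normalize."""
    return normalize_map(np.sum(branch_maps, axis=0))


def merge_weighted(branch_maps, weights):
    """Weighted merge with hypothetical per-branch weights."""
    weights = np.asarray(weights, dtype=np.float64)
    stacked = np.stack(branch_maps).astype(np.float64)
    # Contract the branch axis against the weight vector.
    return normalize_map(np.tensordot(weights, stacked, axes=1))


# Toy predictions: a peaked "image" map and a flat "optical flow" map.
image_pred = np.array([[0.0, 4.0], [0.0, 0.0]])
flow_pred = np.array([[1.0, 1.0], [1.0, 1.0]])

merged_equal = merge_sum([image_pred, flow_pred])
merged_weighted = merge_weighted([image_pred, flow_pred], weights=[0.9, 0.1])
```

In this toy case the flat map dilutes the peak more under the unweighted sum than under the weighted one. Whether that translates into worse metrics on real data is an empirical question that would have to be checked on a validation set.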

Error while loading pre-trained C3D weights

When I use this in the code:

    if pretrained:
        weights_path = get_file('w_up2_conv4_new.h5', C3D_WEIGHTS_URL)  # , cache_subdir='models')
        model.load_weights(weights_path, by_name=True)

I get:

TypeError: load_weights() got an unexpected keyword argument 'by_name'

When I change it to:

    if pretrained:
        weights_path = get_file('w_up2_conv4_new.h5', C3D_WEIGHTS_URL)  # , cache_subdir='models')
        model.load_weights(weights_path)

the error I get is:

Exception: You are trying to load a weight file containing 10 layers into a model with 13 layers.

can't find computer_vision_utils file

I have recently been trying to reproduce your experiment, but I can't find the computer_vision_utils file, so I defined the read_image function myself as follows:

def read_image(file_path,resize_dim,channels_first=True,color=True):
    if color:
        raw_image = cv2.imread(file_path)
    else:
        raw_image = cv2.imread(file_path,0)
    raw_image = raw_image.astype(np.float32)
    resized_image = cv2.resize(raw_image,resize_dim)
    if channels_first:
        return resized_image.transpose(2,0,1)
    else:
        return resized_image

But the prediction performance is really bad, so I'm wondering whether I have missed some important operation.
By the way, is 'dreyeve_mean_frame.png' generated as the mean of the first 37 runs?
The semseg_branch also performs badly, so I think there must be something wrong with my code. I attach my code here; please help me.
Here is my models.py:
models.py.txt

ValueError: total size of new array must be unchanged

Hi, I'm trying to run your code on my own server. But when I entered the /dreyeve-master/experiments/train directory and typed "python train.py", it shows "ValueError: total size of new array must be unchanged". The full traceback is as follows:

Traceback (most recent call last):
File "train.py", line 114, in <module>
fine_tuning()
File "train.py", line 20, in fine_tuning
model = DreyeveNet(frames_per_seq=frames_per_seq, h=h, w=w)
File "/data/aaa/dreyeve-master/experiments/train/models.py", line 123, in DreyeveNet
im_net = SaliencyBranch(input_shape=(3, frames_per_seq, h, w), c3d_pretrained=True, branch='image')
File "/data/aaa/dreyeve-master/experiments/train/models.py", line 72, in SaliencyBranch
coarse_predictor = CoarseSaliencyModel(input_shape=(c, fr, h // 4, w // 4), pretrained=c3d_pretrained, branch=branch)
File "/data/aaa/dreyeve-master/experiments/train/models.py", line 47, in CoarseSaliencyModel
H = Reshape(target_shape=(512, h // 8, w // 8))(H) # squeeze out temporal dimension
File "/data/aaa/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 569, in call
self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
File "/data/aaa/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 632, in add_inbound_node
Node.create_node(self, inbound_layers, node_indices, tensor_indices)
File "/data/aaa/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 168, in create_node
output_shapes = to_list(outbound_layer.get_output_shape_for(input_shapes[0]))
File "/data/aaa/.local/lib/python2.7/site-packages/keras/layers/core.py", line 336, in get_output_shape_for
self.target_shape)
File "/data/aaa/.local/lib/python2.7/site-packages/keras/layers/core.py", line 330, in _fix_unknown_dimension
raise ValueError(msg)

It seems that the bug comes from line 47 of models.py:
H = Reshape(target_shape=(512, h // 8, w // 8))(H) # squeeze out temporal dimension

I have searched for this error but still have no clue. Could you be so kind as to help me with this issue?
Thank you!
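
For context, this error is raised when the number of elements before and after a reshape differ. The sketch below reproduces that check in plain Python. The tensor shapes are assumed examples (a feature map whose temporal dimension has, or has not, been pooled down to 1), not values read from the traceback.

```python
from functools import reduce
from operator import mul


def num_elements(shape):
    """Total number of elements of a tensor with the given shape."""
    return reduce(mul, shape, 1)


def can_reshape(input_shape, target_shape):
    """A reshape is valid only if both shapes hold the same number of elements."""
    return num_elements(input_shape) == num_elements(target_shape)


h, w = 112, 112  # assumed input size of the coarse model
target = (512, h // 8, w // 8)

# Temporal dimension already reduced to 1: squeezing it out is fine.
ok = can_reshape((512, 1, h // 8, w // 8), target)

# Temporal dimension still 2 (assumed example): element counts differ.
bad = can_reshape((512, 2, h // 8, w // 8), target)

print(ok, bad)
```

If the shape entering the Reshape layer is not the one models.py expects, one thing worth checking (an assumption, since the issue does not show the configuration) is whether Keras is set to the Theano-style channels-first ordering, that is, "image_dim_ordering": "th" in keras.json for Keras 1.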

Demo

Hi. Thank you for your work. Is there any way that I can produce these visual attention maps on my own images please? Thank you

CUDA_ERROR_OUT_OF_MEMORY: out of memory

Hi, when I run train.py in fine_tuning() mode, I encounter an error:
pygpu.gpuarray.GpuArrayException: cuMemAlloc: CUDA_ERROR_OUT_OF_MEMORY: out of memory
Inputs types: [TensorType(int64, scalar), TensorType(int64, scalar), TensorType(int64, scalar), TensorType(int64, scalar), TensorType(int64, scalar)]
Inputs shapes: [(), (), (), (), ()]
Inputs strides: [(), (), (), (), ()]
Inputs values: [array(16), array(64), array(16), array(112), array(112)]
Outputs clients: [[GpuDnnConv{algo='none', inplace=True}(GpuContiguous.0, GpuContiguous.0, GpuAllocEmpty{dtype='float32', context_name=None}.0, GpuDnnConvDesc{border_mode='half', subsample=(1, 1, 1), conv_mode='conv', precision='float32'}.0, Constant{1.0}, Constant{0.0})]]
Can you please tell me how to deal with this problem? Thank you!
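
As a rough aid for reading the log above, the sketch below computes the size of the buffer whose allocation failed, assuming the five "Inputs values" are the dimensions of a float32 tensor (the dtype matches the GpuAllocEmpty entry in the log). The helper name is hypothetical.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Size in bytes of a dense tensor; 4 bytes per element for float32."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


# Dimensions taken from "Inputs values" in the error message.
shape = (16, 64, 16, 112, 112)
size_mib = tensor_bytes(shape) / (1024 ** 2)
print("{:.0f} MiB".format(size_mib))  # 784 MiB

# If the first dimension is the batch size (an assumption), halving it
# halves this particular buffer.
half_mib = tensor_bytes((8,) + shape[1:]) / (1024 ** 2)
```

This is only one activation buffer; training keeps many such buffers alive at once, so reducing the batch size is a common first thing to try when this error appears.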

Coarse2FineTwoFrames.m source code

I am trying to convert the optical flow code to C/C++. However, the core of the code is the Coarse2FineTwoFrames function, which is written in C++ and called from MATLAB only through a MEX file.
Because of this MEX file, MATLAB Coder cannot convert the code to C/C++, as it cannot find the Coarse2FineTwoFrames function.

Can you please tell me how to get Coarse2FineTwoFrames.m written entirely in MATLAB?

re-create the DrEYEve dataset ground truth

I am trying to run the MATLAB code to "re-create the DrEYEve dataset ground truth". However, I can't find script_for_fixation_images.m.
Can you please tell me where to find it?

Package dependency list

Hello,
Is there a package dependency list available for this code, specifically the semseg part? I am currently using Python 3.6.12 and having some dependency and compatibility issues with tensorflow and keras.
Thank you!

Where is your gt_fix?

Hello, when I read your code in dreyeve/experiments/metrics/compute_metrics.py, I found gt_fix and gt_sal used as ground truth.
But I can't find gt_fix in your dataset, so where can I find it?
Besides, video_saliency.avi in the provided DR(eye)VE dataset may be gt_sal.

semseg predictions

Hi, I am trying to perform semantic segmentation on the images and for 1 sequence it takes approx 51 hrs. Can you let me know how to speed this up?

Differences between provided dataset and repo code

Hi, I'm trying to train this model from scratch using the dataset provided, however it seems the dataset provided doesn't quite match what the code requires, e.g. it has avi files instead of frame jpegs so when I try to run:

python2 train.py --which_branch image,

I end up with an error like:

ValueError: Provided path "/home/amakri/DREYEVE_DATA/23/frames/004465.jpg" does NOT exist.

Is there code somewhere in this repo that I've missed which does this sort of preprocessing and sets up the dataset to be run by the code? Or are these things I will just have to do myself?
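
One possible way to obtain per-frame JPEGs from the provided videos is to call ffmpeg once per sequence. The sketch below only builds and prints the command. The video file name video_garmin.avi, the zero-based frame numbering, and the helper name are assumptions that should be checked against the data loader before use.

```python
import os


def build_ffmpeg_command(sequence_dir, video_name="video_garmin.avi", start_number=0):
    """Build an ffmpeg command that dumps a video to frames/%06d.jpg.

    video_name and start_number are assumptions, not values confirmed by the repo.
    """
    frames_dir = os.path.join(sequence_dir, "frames")
    return [
        "ffmpeg",
        "-i", os.path.join(sequence_dir, video_name),
        "-start_number", str(start_number),  # index of the first written frame
        "-q:v", "2",                          # high JPEG quality
        os.path.join(frames_dir, "%06d.jpg"),
    ]


command = build_ffmpeg_command(os.path.join("DREYEVE_DATA", "23"))
print(" ".join(command))

# To actually run it (requires ffmpeg on the PATH):
# import subprocess
# os.makedirs(os.path.join("DREYEVE_DATA", "23", "frames"))
# subprocess.check_call(command)
```

The six-digit pattern mirrors the file name in the error message (004465.jpg). Whether the loader counts frames from 0 or 1 is not stated in the issue, so start_number may need adjusting.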

dreyeve_mean_frame.png

Hi there,
Is it possible to make the dreyeve_mean_frame.png file available?

Many thanks,
Tobias

dataset

Hello, could you tell me the meaning of "cineca", and whether I should run the MATLAB code before the semseg code to process DREYEVE_DATA?

SVISINIT problem

Hi all,
I have a problem running the test file dreyeve/matlab/assessment/test/test_multifovea.m on my data. SVISINIT keeps raising a "too many input arguments" error, and when I tried to debug it, the file says not to edit it because it is apparently automatically generated. I don't know whether I'm missing something. It would be great if you could point me toward a solution.

Thanks for your time!

how to use GPU? which version of Python, cuda, keras, and Theano should I use

I want to run inference with the pre-trained model. Which versions of Python, CUDA, Keras, and Theano should I use? Also, do you use the system-level CUDA or the CUDA in the virtual environment?
I have tried python=2.7, keras=1.1.0, cuda=10.0, theano=1.0.4. If I run inference on the GPU, the result is obviously wrong (as shown in the figure), while it is correct on the CPU. How can I fix it?

TODO: Implement mirroring in data loader

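The code that implemented data augmentation by mirroring in the old project follows, for inspiration:

import random
import numpy as np

mirror = bool(random.getrandbits(1))
if augment_data and mirror:
    # flip along the last axis (assigning the result back is assumed)
    frames_cropped = np.array(frames_cropped)[..., ::-1]
    frames_full = np.array(frames_full)[..., ::-1]
    frame_single = np.array(frame_single)[..., ::-1]
    frame_gt_cropped = np.array(frame_gt_cropped)[..., ::-1]
    frame_gt_frame = np.array(frame_gt_frame)[..., ::-1]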
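
A self-contained version of the idea might look like the sketch below, which flips every array of a sample along its last axis (assumed to be the width axis) with probability 0.5, so that frames and ground truth stay aligned. The function name and the explicit rng argument are illustrative additions, not part of the old project.

```python
import random

import numpy as np


def maybe_mirror(arrays, augment_data=True, rng=random):
    """Horizontally flip all arrays together with probability 0.5.

    Assumes the last axis of every array is the image width.
    Returns the (possibly flipped) arrays and whether a flip happened.
    """
    mirror = bool(rng.getrandbits(1))
    if augment_data and mirror:
        # Reverse the last axis; copy() yields contiguous arrays.
        return [np.asarray(a)[..., ::-1].copy() for a in arrays], True
    return [np.asarray(a) for a in arrays], False


frame = np.arange(6).reshape(1, 2, 3)
gt = np.arange(6).reshape(2, 3)

(out_frame, out_gt), mirrored = maybe_mirror([frame, gt], rng=random.Random(0))
```

Drawing the random bit once per sample, rather than once per array, is what keeps the input frames and the ground-truth map consistent with each other.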

Is it possible to project the images and attention points into BEV?

Pre-trained coarse weights for segmentation branch?

Hi,

I'm currently working on this project to reproduce the gaze prediction with the pre-trained model from this work (I only need inference).

And I just found that there are no pre-trained weights of the coarse part of the segmentation branch.
That is, it seems that the weight file w_up2_conv4_new.h5 is only used in the image and optical flow branch.

dreyeve/experiments/train/models.py

Lines 123 to 125 in 5ba3217

    
    im_net = SaliencyBranch(input_shape=(3, frames_per_seq, h, w), c3d_pretrained=True, branch='image')
    of_net = SaliencyBranch(input_shape=(3, frames_per_seq, h, w), c3d_pretrained=True, branch='optical_flow')
    seg_net = SaliencyBranch(input_shape=(19, frames_per_seq, h, w), c3d_pretrained=False, branch='segmentation')

My question is: is the segmentation branch supposed to have no pre-trained coarse weights?
(If not, where could I get the pre-trained coarse weights for the segmentation branch?)

Thanks for your time!

Could you release the bazzani.h5 in the rmdn_comparison?

Hi, I am trying to use your model as well as your implementation of the rmdn for my dataset. I saw that in the predict_dreyeve_sequence.py file, an h5 file called bazzani.h5 is loaded. Could you release this weight file so that I can test the model on my dataset as well? Thanks a lot!

the use of GPU

Hi, thank you for sharing! When I run train.py on the GPU, errors occur.
Could you tell me how to deal with them?
ERROR (theano.gof.opt): Optimization failure due to: LocalOptGroup(local_abstractconv_cudnn,local_abstractconv_gw_cudnn,local_abstractconv_gi_cudnn,local_abstractconv_gemm,local_abstractconv3d_gemm,local_abstractconv_gradweights_gemm,local_abstractconv3d_gradweights_gemm,local_abstractconv_gradinputs_gemm,local_abstractconv3d_gradinputs_gemm)
ERROR (theano.gof.opt): node: AbstractConv3d_gradWeights{convdim=3, border_mode='half', subsample=(1, 1, 1), filter_flip=True, imshp=(None, None, None, None, None), kshp=(512, 512, 3, 3, 3), filter_dilation=(1, 1, 1), num_groups=1, unshared=False}(GpuElemwise{mul,no_inplace}.0, GpuElemwise{add,no_inplace}.0, MakeVector{dtype='int64'}.0)
ERROR (theano.gof.opt): TRACEBACK:

frames extraction and path error

I am trying to implement your code but it gives the following error:
File "/content/drive/My Drive/computer_vision_utils/io_helper.py", line 33, in read_image
raise ValueError('Provided path "{}" does NOT exist.'.format(img_path)) ######1
ValueError: Provided path "/content/drive/My Drive/DREYEVE Unzipped/29/frames/004126.jpg" does NOT exist.
I checked the folders myself and there is no folder called frames, and I didn't find this folder or the images in any other directory. I wonder whether there is a missing piece of code that extracts frames from the videos and creates that missing path?

Saliency map and saliency_fix map files could not be found

Hi, excuse me. The saliency map and saliency_fix map files are used in predict_dreyeve_sequence.py, but I could not find them. I hope you can reply after reading this. Thank you very much.

I_ff[0, :, 0, :, :] = x
OF_ff[0, :, 0, :, :] = of
SEG_ff[0, :, 0, :, :] = seg

Y_sal[0, 0] = read_image(join(sequence_dir, 'saliency', '{:06d}.png'.format(sample)), channels_first=False,
                         color=False, resize_dim=(h, w))
Y_fix[0, 0] = read_image(join(sequence_dir, 'saliency_fix', '{:06d}.png'.format(sample)), channels_first=False,
                         color=False, resize_dim=(h, w))

return [I_ff, I_s, I_c, OF_ff, OF_s, OF_c, SEG_ff, SEG_s, SEG_c], [Y_sal, Y_fix]

This code comes from predict_dreyeve_sequence.py.

	im_net = SaliencyBranch(input_shape=(3, frames_per_seq, h, w), c3d_pretrained=True, branch='image')
	of_net = SaliencyBranch(input_shape=(3, frames_per_seq, h, w), c3d_pretrained=True, branch='optical_flow')
	seg_net = SaliencyBranch(input_shape=(19, frames_per_seq, h, w), c3d_pretrained=False, branch='segmentation')