xucong-zhang / eth-xgaze

Official implementation of ETH-XGaze dataset baseline

Python 100.00%

eth-xgaze's Issues

Is gc the coordinate of the target or a calculated vector?

gc_normalized = gc - face_center # gaze vector
I am not sure whether gc is the coordinate of the target in the camera coordinate system or a value already computed in the head coordinate system. Also, why is this step needed?
Please explain this, thanks!
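The subtraction in the question can be illustrated with a minimal sketch. It assumes (the issue does not confirm this) that gc is the 3D gaze target and face_center the 3D face center, both expressed in the camera coordinate system; the numeric values are hypothetical.

```python
import numpy as np

def gaze_vector(gc, face_center):
    """Unit vector pointing from the face center to the gaze target."""
    g = np.asarray(gc, dtype=float) - np.asarray(face_center, dtype=float)
    norm = np.linalg.norm(g)
    if norm == 0:
        raise ValueError("gaze target coincides with the face center")
    return g / norm

# Hypothetical values in millimetres, camera coordinate system.
target = [0.0, 0.0, 0.0]    # gaze target at the camera origin
face = [0.0, 0.0, 600.0]    # face 600 mm in front of the camera
direction = gaze_vector(target, face)
```

Under that reading, the subtraction turns a point (where the person looks) into a direction (the line of sight starting at the face), which is the quantity a gaze estimator regresses.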

How are rvec and tvec calculated? Are they computed by solvePnP, with face_model_3d_coordinates and ldmk68s from the CSV as parameters?

I computed rvec and tvec with the call below, but the result does not match the rvec and tvec in the label CSV.

_, rvec, tvec = cv2.solvePnP(face_model_3d_coordinates, ldmk50,
                             camera_matrix,
                             distortion_coefficients,
                             rvec,
                             tvec,
                             useExtrinsicGuess=True,
                             flags=cv2.SOLVEPNP_ITERATIVE)

trying to use other images to test

Hello Xucong Zhang,

I enjoyed reading your paper and code. I am trying to verify whether I can use your pre-trained model on my own dataset. One problem is that I do not have the camera information (XML file), and I am wondering whether that will affect the quality of the output.

I look forward to your feedback.

Questions about the baseline structure

Hi Xucong,

Excellent work on ETH-XGaze! It really provides a diverse dataset for gaze estimation. I have a question regarding the baseline structure. Why not pass the FC outputs through a tanh/sigmoid activation function to normalize the output a bit? Is there an intuition behind using the raw outputs?

Additionally, I suggest adding model.eval() to the demo code before running the forward pass.

ETH-XGaze/demo.py

Lines 158 to 164 in ca2d991

    
    model.load_state_dict(ckpt['model_state'], strict=True)
    input_var = img_normalized[:, :, [2, 1, 0]]  # from BGR to RGB
    input_var = trans(input_var)
    input_var = torch.autograd.Variable(input_var.float().cuda())
    input_var = input_var.view(1, input_var.size(0), input_var.size(1), input_var.size(2))  # the input must be 4-dimension
    pred_gaze = model(input_var)
    pred_gaze = pred_gaze[0]  # here we assume there is only one face inside the image, then the first one is the prediction

Looking forward to your reply!

Best,
Yijun

Inquiry on Gaze Coordinate System Definition in ETH Dataset Compared to Gaze360

Hello ETH Dataset Team,

Thank you for open-sourcing the ETH dataset and methods!

I am currently working with the Gaze360 dataset and noticed its specific definition of the gaze coordinate system. In Gaze360, the eye coordinate system is defined as a right-handed coordinate system with the y-axis pointing up and the origin at the center of the Ladybug camera system. The positive x-axis points to the left, positive y-axis points up, and positive z-axis points away from the camera. For example, a gaze vector of [-1,0,0] represents looking to the right, and [0,0,-1] represents looking straight into the camera.

Could you please provide details on how the gaze coordinate system is defined in the ETH dataset? Specifically, how are the x, y, and z axes oriented in your gaze coordinate system, and what is the origin point of this system?

Understanding these differences will greatly assist in integrating and comparing data from both sources accurately.

Best regards,
HuAndrew

json file

Hi, the file train_test_split.json seems to be missing. How can I get this JSON file? @xucong-zhang

about the pre-trained model

Hi Xucong,

Excellent work on ETH-XGaze! I have run into an issue with the pre-trained model. I downloaded the model, but it cannot be uncompressed. I don't know whether something is wrong with the source file.
Looking forward to your reply!

Best,
LiuGang

Few questions about the dataset (gaze, pose)

Hello, I am trying to understand the data structure of the ETH-XGaze dataset.
In 'OnePersonDataset', three values are returned when it is called: image, pose, and gaze.
It seems that gaze is a combination of pitch and yaw, both in radians (please correct me if I am wrong).
I am a little confused about what pose does during training. If pose represents the direction the face is pointing in, how can it be defined with a single number (unlike pitch and yaw)?
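To make the pitch/yaw representation mentioned above concrete, here is a minimal sketch of converting between a (pitch, yaw) pair in radians and a 3D unit vector. The sign convention (zero pitch and yaw meaning a gaze along -z, toward the camera) is an assumption for illustration and should be checked against the repository's own code before use.

```python
import numpy as np

def pitchyaw_to_vector(pitch, yaw):
    """Assumed convention: (0, 0) looks along -z, i.e. toward the camera."""
    return np.array([
        -np.cos(pitch) * np.sin(yaw),
        -np.sin(pitch),
        -np.cos(pitch) * np.cos(yaw),
    ])

def vector_to_pitchyaw(v):
    """Inverse of pitchyaw_to_vector under the same assumed convention."""
    v = np.asarray(v, dtype=float)
    v = v / np.linalg.norm(v)
    pitch = np.arcsin(-v[1])
    yaw = np.arctan2(-v[0], -v[2])
    return pitch, yaw

vec = pitchyaw_to_vector(0.1, -0.3)
pitch, yaw = vector_to_pitchyaw(vec)
```

Two angles are enough because a direction is a unit vector, which has only two degrees of freedom.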

Question about training dataset

Hi, Zhang. Thanks for your project! My question is why you used the original normalized images in the training loop. In other words, why didn't you use any image augmentation, such as random grayscale or random color distortion?
Looking forward to your reply! :)

ModuleNotFoundError: No module named 'imutils'

When I run demo.py, I get the error "ModuleNotFoundError: No module named 'imutils'". Maybe an imutils.py is missing from the project.
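For context, imutils is a third-party package distributed on PyPI rather than a file inside the repository, so this error usually means it is not installed in the active environment. A small sketch for checking this before running the demo is given below; the list of required modules contains only the one named in this issue, and the helper name is hypothetical.

```python
import importlib.util

def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

required = ["imutils"]  # only the module named in the error message
missing = missing_modules(required)
if missing:
    print("Missing modules; try: pip install " + " ".join(missing))
else:
    print("All required modules are importable.")
```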

what are those values in face_model.txt?

Hi, I don't quite understand what the values in face_model.txt are. Is there any difference between them and the landmarks?

cam_id or solvePnPRansac issue

Hello, I ran into this problem:

In demo.py, if I use HeadPoseEstimator() instead of estimateHeadPose(), this error occurs:

--> 29     hr, ht, o_l, o_r, _ = head_pose_estimator(image, landmarks, camera_matrix[cam_id])
     30     ## the easy way to get head pose information, fast and simple
     31 #     facePts = face_model.reshape(6, 1, 3)

NameError: name 'cam_id' is not defined

If I replace camera_matrix[cam_id] with camera_matrix, this error occurs:

---------------------------------------------------------------------------
error                                     Traceback (most recent call last)
<ipython-input-13-a4e70c1ea3f5> in <module>
     27     landmarks = landmarks.reshape(-1, 2)
     28     head_pose_estimator = HeadPoseEstimator()
---> 29     hr, ht, o_l, o_r, _ = head_pose_estimator(image, landmarks, camera_matrix)
     30     ## the easy way to get head pose information, fast and simple
     31 #     facePts = face_model.reshape(6, 1, 3)

~/ETH-XGaze/head_pose.py in __call__(self, frame, landmarks, intrinsics, target_io_dist, visualize)
    136         # Do PnP-based head pose fitting
    137         rvec, tvec, reprojected_points, o_l, o_r, face_model = \
--> 138             self.head_pose_fit(landmarks, eos_mesh, intrinsics, scaling_factor)
    139         o_r_2D = cv.projectPoints(o_r, rvec, tvec, intrinsics, None)[0].reshape(2)
    140         o_l_2D = cv.projectPoints(o_l, rvec, tvec, intrinsics, None)[0].reshape(2)

~/ETH-XGaze/head_pose.py in head_pose_fit(self, landmarks_2D, deformed_mesh, intrinsics, scaling_factor)
     95         # Initial fit
     96         camera_matrix = intrinsics
---> 97         success, rvec, tvec, inliers = cv.solvePnPRansac(
     98             sfm_points_ibug_subset, landmarks_2D, camera_matrix, None,
     99             flags=cv.SOLVEPNP_EPNP)

error: OpenCV(4.5.2) /tmp/pip-req-build-dccdjyga/opencv/modules/calib3d/src/solvepnp.cpp:241: error: (-215:Assertion failed) npoints >= 4 && npoints == std::max(ipoints.checkVector(2, CV_32F), ipoints.checkVector(2, CV_64F)) in function 'solvePnPRansac'

Could you please help?
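The assertion in the traceback requires at least four points and an image-point array of shape N x 2 (float32 or float64) whose N equals the number of 3D object points. One plausible cause, not confirmed in this thread, is that the number of 2D landmarks passed in does not match the number of 3D model points. The helper below is a hypothetical pre-check that surfaces such a mismatch with a readable message before the OpenCV call.

```python
import numpy as np

def prepare_pnp_inputs(object_points, image_points):
    """Validate and reshape PnP inputs: N x 3 object points and N x 2 image points."""
    obj = np.ascontiguousarray(object_points, dtype=np.float64).reshape(-1, 3)
    img = np.ascontiguousarray(image_points, dtype=np.float64).reshape(-1, 2)
    if len(obj) < 4:
        raise ValueError(f"need at least 4 points, got {len(obj)}")
    if len(obj) != len(img):
        raise ValueError(
            f"point count mismatch: {len(obj)} object points vs {len(img)} image points"
        )
    return obj, img

# Hypothetical example: 6 model points paired with 6 matching landmarks.
model_points = np.zeros((6, 3))
landmarks = np.zeros((6, 2))
obj, img = prepare_pnp_inputs(model_points, landmarks)
```

If the check fails, selecting the subset of landmarks that corresponds to the 3D model points before calling the solver is the usual remedy.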

Data download issues

Hi,
Sorry to bother you. I am a student at Guangdong University of Technology in China.
I want to download the dataset for research and submitted the registration on the website according to the guidelines.
But I haven't received the download link.
My e-mail is [email protected], could you please send me the download link? I have submitted the application again.
Thanks for your great work and help!

Estimated gaze without the normalization

Hello,
I would like to ask whether there is an easy way to get the real gaze values. I understand from the code that the returned gaze values are computed after normalization, but I need the gaze values for the original image, and I do not know which operation to apply to the predicted gaze to obtain the estimate in the original image's coordinate system. I hope you understand what I mean. Thank you.
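A minimal sketch of one way to undo the normalization is shown below. It assumes the normalized gaze was obtained by rotating the camera-space gaze with a rotation matrix R (as in gc_normalized = np.dot(R, gc_src)), so the inverse is a multiplication by the transpose of R. The pitch/yaw sign convention and the example R are assumptions for illustration only.

```python
import numpy as np

def pitchyaw_to_vector(pitch, yaw):
    """Assumed convention: (0, 0) looks along -z, i.e. toward the camera."""
    return np.array([
        -np.cos(pitch) * np.sin(yaw),
        -np.sin(pitch),
        -np.cos(pitch) * np.cos(yaw),
    ])

def denormalize_gaze(pitch, yaw, R):
    """Map a predicted (pitch, yaw) from the normalized space back to camera space."""
    g_norm = pitchyaw_to_vector(pitch, yaw)
    g_cam = R.T @ g_norm  # R is orthonormal, so its inverse is its transpose
    return g_cam / np.linalg.norm(g_cam)

# Hypothetical normalization rotation: 20 degrees about the x-axis.
a = np.deg2rad(20.0)
R = np.array([
    [1.0, 0.0, 0.0],
    [0.0, np.cos(a), -np.sin(a)],
    [0.0, np.sin(a), np.cos(a)],
])
g_cam = denormalize_gaze(0.1, -0.2, R)
```

The R used here must be the one computed for that specific frame during normalization, since it depends on the head position in the image.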

How to calculate 3d gaze vector by subject*.csv?

Hi, one question: in raw/data/annotation_train/subject*.csv, columns 3-4 and columns 5-7 are both described as the gaze point. How can the 3D gaze vector be calculated from columns 3-4 and columns 5-7?

Person Specific Evaluation

Hi,

How to perform a person-specific evaluation if the ground truth data for test_person_specific is not present in the dataset?

Best Regards

XML files for camera parameters

Hi,

I noticed that your code requires camera parameters (see below for details). However, I could not find the folder or the files in the repository. Could you please let me know where I can download these parameter/XML files?

file_name = './calibration/cam_calibration/' + 'cam' + str(cam_id).zfill(2) + '.xml' (l.195 in normalization_example.py)

Your help is very much appreciated.

Thanks.

What does the face_gaze mean in the annotation file?

What does the face_gaze mean in the annotation file?
It is not the left/right eye gaze or the mean of both eyes' gaze, right?
Which 3D landmark do you use to calculate the face gaze?

pitch and yaw (raw outputs of network) are not in HCS (head coordinates system)

Hi, thanks for this great paper and dataset and also all of your previous valuable works in the field of appearance-based gaze estimation.

I recently tried to use the raw output of the network, trained on the ETH-XGaze dataset, to estimate the PoG (Point of Gaze) in the CCS (Camera Coordinate System). I used your normalization method and computed the normalizing rotation matrix to transform the normalized gaze vector, which is in the HCS, into the 3D gaze vector in the CCS.

However, pitch and yaw do not seem to be in the HCS: when everything stays unchanged except the camera position, the network output changes. If that is correct, we need an extra step beyond the normalizing rotation matrix that compensates for head pose. I cannot find this step, and it is unclear to me.

Dataset structure

Hi Xucong,

Thank you so much for making this dataset, and code available. I wanted to ask, is there any way we can get the structure of the dataset? That's to say what each tar file contains, and so on, similar to what you did with MPIIFaceGaze. That was very helpful :)
I was able to download the 448 dataset, but it's missing the json file for train_test split, and the test set. You mentioned in #9 (comment) that it's possible to get a needed missing file from the raw data, could you please clarify how?

Many thanks :)

Calibrating gaze vector to screen point?

I'm trying to map the pred_gaze_np output to a 2D screen point. Is this already implemented? If not, can you please suggest which approach to follow?
I tried a simple polynomial regression from the vector to screen points and the results are decent, but I'm wondering whether there is a better approach.
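A geometric alternative to regression is to intersect the gaze ray with the screen plane. The sketch below is not part of the repository; it assumes the screen pose relative to the camera is known as a rotation R_screen and translation t_screen such that p_cam = R_screen @ p_screen + t_screen, and that the gaze origin and direction are already expressed in camera coordinates. All names and values are hypothetical.

```python
import numpy as np

def gaze_to_screen_point(origin, direction, R_screen, t_screen):
    """Intersect a gaze ray with the screen plane; return (x, y) in screen units or None."""
    origin = np.asarray(origin, dtype=float)
    direction = np.asarray(direction, dtype=float)
    normal = R_screen[:, 2]  # screen z-axis expressed in camera coordinates
    denom = normal @ direction
    if abs(denom) < 1e-9:
        return None  # ray is parallel to the screen plane
    s = normal @ (t_screen - origin) / denom
    if s <= 0:
        return None  # screen plane lies behind the gaze origin
    hit_cam = origin + s * direction
    hit_screen = R_screen.T @ (hit_cam - t_screen)
    return hit_screen[:2]

# Hypothetical setup: screen plane coincides with the camera's z = 0 plane.
R_screen = np.eye(3)
t_screen = np.zeros(3)
point = gaze_to_screen_point([10.0, 20.0, 500.0], [0.0, 0.0, -1.0], R_screen, t_screen)
```

The result is in the same metric units as the inputs; converting to pixels needs the physical pixel size of the screen. A per-user regression can still be layered on top to correct residual bias.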

Request for Gaze Data Download Link

Hello,

I hope this message finds you well. I am a graduate student at UESTC (University of Electronic Science and Technology of China), and I am currently conducting research that requires access to the gaze data. I have submitted the registration form as required, but I have not yet received the download link.

Could you please assist me by providing the download link at your earliest convenience? My registered email is [email protected], and all the necessary information has been provided in the registration form.

I apologize for any inconvenience, and I appreciate your prompt attention to this matter. Time is of the essence for my research, and I would be grateful for your assistance.

Thank you very much for your help.

Best regards,
Hong Hu
UESTC, University of Electronic Science and Technology of China
[email protected]

Data downloading issue

Hi,
Sorry to bother you. I am a student at Jimei University in China.
I want to download the dataset for research and submitted the registration on the website according to the guidelines.
But I haven't received the download link. I'm sorry to bother you, but I urgently need this dataset to conduct my research.
My e-mail is ([email protected]), could you please send me the download link? I have submitted the application again.
Thanks for your great work and help!

How did you get the groundtruth gaze direction?

After reading your paper, I still don't know how the ground-truth gaze direction was obtained when collecting the data.
Could you explain this?
Thanks!

Data download issues

Hi,
Sorry to bother you. I am a student at Guangzhou University in China.
I want to download the dataset for research and submitted the registration on the website according to the guidelines.
But I haven't received the download link.
My e-mail is [email protected], could you please send me the download link? I have submitted the application again.
Thanks for your great work and help!

dataset download issues

Is there any way to get the eye patches given we have `xgaze_224` (cropped face version) and `annotations` (lmks in original frame)?

Hi everyone,
I wonder whether there is any way to get the eye patches given that we have

xgaze_224 (cropped face version) and
annotations (lmks in original frame).
The full-frame version is 7T, which would take a long time to download.

Thanks, I would appreciate any help.

How to crop eye in the pre-processed datasets

Hello, I downloaded the 224x224 pre-processed dataset but found that only normalized face images are provided. Is there any way to extract cropped eyes? I need eye patches for my model. Thanks!
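One possible approach, sketched below, is to map the eye-corner landmarks from the original frame into the normalized face image and crop around them. It assumes the 3 x 3 perspective matrix W that was used to warp the frame into the face crop can be recomputed or is otherwise available, which the pre-processed download alone may not provide. The function names, margin, and aspect ratio are hypothetical choices.

```python
import numpy as np

def warp_points(points, W):
    """Apply a 3 x 3 perspective transform W to an N x 2 array of pixel coordinates."""
    pts = np.asarray(points, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    warped = homog @ W.T
    return warped[:, :2] / warped[:, 2:3]

def crop_eye(face_img, eye_corners, margin=0.4, aspect=0.6):
    """Crop an axis-aligned patch around two eye-corner points in the face image."""
    (x1, y1), (x2, y2) = eye_corners
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    width = np.hypot(x2 - x1, y2 - y1) * (1.0 + 2.0 * margin)
    height = width * aspect
    h, w = face_img.shape[:2]
    left = int(max(0, round(cx - width / 2.0)))
    right = int(min(w, round(cx + width / 2.0)))
    top = int(max(0, round(cy - height / 2.0)))
    bottom = int(min(h, round(cy + height / 2.0)))
    return face_img[top:bottom, left:right]

# Hypothetical example: identity warp and made-up eye corners on a blank 224 x 224 image.
face = np.zeros((224, 224, 3), dtype=np.uint8)
corners = warp_points([[60.0, 100.0], [100.0, 100.0]], np.eye(3))
patch = crop_eye(face, corners)
```

If W is not available, running a landmark detector directly on the 224 x 224 face images and cropping from its output avoids the need for the original frames.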

about the normalized gaze

I found that the normalized gaze is computed as gc_normalized = np.dot(R, gc_src) in normalization_example.py. Does this mean that when I test a model trained on ETH-XGaze on MPIIFaceGaze, the normalized gaze of MPIIFaceGaze should be computed in the same way? I ask because the normalized gaze in MPIIFaceGaze is always computed as gc_normalized = np.dot(M, gc_src).
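The difference between the two formulas can be seen numerically. The sketch below assumes M = S @ R, where S is a diagonal scaling matrix diag(1, 1, s) that moves the face to a fixed distance; the specific R, S, and gaze values are hypothetical. With rotation only, the gaze direction is preserved up to the rotation, while the scaling in M changes the direction itself.

```python
import numpy as np

def unit(v):
    """Return v scaled to unit length."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

def angle_deg(a, b):
    """Angle between two vectors in degrees."""
    return np.degrees(np.arccos(np.clip(unit(a) @ unit(b), -1.0, 1.0)))

R = np.eye(3)                 # hypothetical: no rotation needed
S = np.diag([1.0, 1.0, 0.5])  # hypothetical: face at twice the normalized distance
M = S @ R

g_cam = unit([0.3, 0.0, -1.0])      # hypothetical gaze direction in camera coordinates
g_rot_only = unit(R @ g_cam)        # rotation-only normalization
g_with_scale = unit(M @ g_cam)      # normalization including the scaling
difference = angle_deg(g_rot_only, g_with_scale)
```

Because the two conventions yield different labels for the same image, labels from one dataset generally need to be recomputed with the convention the model was trained on before a cross-dataset evaluation.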

Full dataset with raw face images release time

Thank you for your amazing work! One question: when will you release the full dataset with raw face images?

Data Downloading issue

Hi,
Sorry to bother you. I am a student at Beijing Jiaotong University in China.
I want to download the dataset for research and submitted the registration on the website according to the guidelines.
But I haven't received the download link.
My e-mail is [email protected], could you please send me the download link? I submitted the application yesterday, and the information was filled in exactly as required.
Again, sorry to bother you here, but I really need to start my research as soon as possible.
Thanks for your great work and help! :)

Data download request

Hello,
I am a student at Tsinghua University in China, and I want to download the data for research.
I submitted the registration form but haven't received the download link.
Could you please send me a download link? My email is [email protected]; the other information is in the registration form.
Sorry to bother you. I am just in a bit of a hurry.
Thanks a lot!

Datasets Request Download

Hi, I am a Master's student, and I requested the download for this dataset two months ago. I filled in all the required information at that time, but there has been no response so far. Could you please send the download URL for the dataset, or some response, to my mailbox? Thank you very much.

Questions about mirror_position.xml

Hello, thank you for providing such a good gaze estimation dataset, it is really a great job!

I have a few questions to ask:

1. What do the parameters pix2mm, screen_translation, and screen_rotation in mirror_position.xml mean, and how are they obtained? How to ensure the measurement accuracy?

2. How are the coordinate points in the screen coordinate system converted to the camera coordinate system? Do you use the matrix in mirror_position.xml? Can you provide code to show this calculation process?

3. How do you get the cam_translation and cam_rotation in the calibration parameters(camxx.xml) of different cameras? Are you using a ruler to measure the distance between different cameras? How to ensure the measurement accuracy?

4. How are the parameters in the face_model.txt file calculated? How to ensure accuracy?
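As an illustration of the second question only, the sketch below shows one plausible reading of how such parameters could be combined. It assumes, without confirmation from the dataset authors, that pix2mm is a scalar millimetres-per-pixel factor, that screen_rotation is a 3 x 3 rotation matrix and screen_translation a 3-vector, and that together they map screen coordinates into camera coordinates. The function name and values are hypothetical.

```python
import numpy as np

def screen_pixel_to_camera(pixel_xy, pix2mm, screen_rotation, screen_translation):
    """Map a 2D screen pixel to a 3D point in camera coordinates (assumed convention)."""
    u, v = pixel_xy
    p_screen = np.array([u * pix2mm, v * pix2mm, 0.0])  # point on the screen plane, in mm
    R = np.asarray(screen_rotation, dtype=float).reshape(3, 3)
    t = np.asarray(screen_translation, dtype=float).reshape(3)
    return R @ p_screen + t

# Hypothetical values: 0.25 mm per pixel, screen axes aligned with the camera axes.
p_cam = screen_pixel_to_camera((400, 200), 0.25, np.eye(3), [100.0, -50.0, 0.0])
```

If the stored rotation is a Rodrigues vector rather than a matrix, or the transform runs from camera to screen, the conversion would need to be adapted accordingly.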
