xucong-zhang / eth-xgaze
Official implementation of ETH-XGaze dataset baseline
gc_normalized = gc - face_center # gaze vector
I am not sure whether gc is the coordinate of the gaze target in the camera coordinate system or has already been converted to the head coordinate system. And why is this step done?
Please explain this, thanks!
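For readers with the same question: subtracting two 3D points gives a direction only if both points are expressed in the same coordinate system. The sketch below assumes that gc is the 3D gaze target and face_center the 3D face center, both in camera coordinates. That is an assumption, not something confirmed in this thread, and the numeric values are hypothetical.

```python
import math

def gaze_direction(gaze_target, face_center):
    """Unit vector pointing from face_center to gaze_target.

    Both points are assumed to be 3D coordinates in the same frame
    (here: the camera coordinate system, which is an assumption).
    """
    diff = [t - c for t, c in zip(gaze_target, face_center)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("gaze target coincides with face center")
    return [d / norm for d in diff]

# Hypothetical values in millimetres: the face is 600 mm in front of the
# camera and the target lies 100 mm to the side of the camera origin.
gc = [100.0, 0.0, 0.0]
face_center = [0.0, 0.0, 600.0]
g = gaze_direction(gc, face_center)
```

The result has a negative z component, because the line of sight runs from the face back toward the camera plane.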
I got rvec and tvec from the call below, but the result does not correspond to the rvec and tvec in the label CSV.
_, rvec, tvec = cv2.solvePnP(face_model_3d_coordinates, ldmk50,
camera_matrix,
distortion_coefficients,
rvec,
tvec,
useExtrinsicGuess=True,
flags=cv2.SOLVEPNP_ITERATIVE)
Hello Xucong Zhang,
I am glad to read your paper and code. I am trying to verify whether I can use your pre-trained model on my own dataset. But one problem is that I do not have the camera information (XML file). I am wondering whether that will influence the quality of the output.
I am looking forward to your feedback.
Hi Xucong,
Excellent work on ETH-XGaze! It really provides a diverse dataset for gaze estimation. I have a question regarding the baseline structure. Why don't we pass the FC outputs through a tanh/sigmoid activation function to normalize the output a bit? Is there an intuition behind using the raw outputs?
Additionally, I suggest that a model.eval() call be added in the demo code before running the forward pass.
Lines 158 to 164 in ca2d991
Looking forward to your reply!
Best,
Yijun
Hello ETH Dataset Team,
Thank you for open-sourcing the ETH dataset and methods!
I am currently working with the Gaze360 dataset and noticed its specific definition of the gaze coordinate system. In Gaze360, the eye coordinate system is defined as a right-handed coordinate system with the y-axis pointing up and the origin at the center of the Ladybug camera system. The positive x-axis points to the left, positive y-axis points up, and positive z-axis points away from the camera. For example, a gaze vector of [-1,0,0] represents looking to the right, and [0,0,-1] represents looking straight into the camera.
Could you please provide details on how the gaze coordinate system is defined in the ETH dataset? Specifically, how are the x, y, and z axes oriented in your gaze coordinate system, and what is the origin point of this system?
Understanding these differences will greatly assist in integrating and comparing data from both sources accurately.
Best regards,
HuAndrew
Hi, it seems the file train_test_split.json is missing. How can I get this JSON file? @xucong-zhang
Hi Xucong,
Excellent work on ETH-XGaze! I have run into an issue with the pre-trained model. I have downloaded the model, but it can't be uncompressed. I don't know whether there is something wrong with the source file.
Looking forward to your reply!
Best,
LiuGang
Hello, I am trying to understand the data structure of eth-xgaze dataset.
In 'OnePersonDataset', three values are returned when it is called: image, pose, and gaze.
It seems that gaze is a combination of pitch and yaw, which are in radians (please correct me if I am wrong).
I am a little confused about what pose does during training. If the pose represents 'what direction the face is pointing at', then how can the pose be defined with one number (unlike pitch and yaw)?
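To make the pitch/yaw representation concrete, here is a minimal sketch of converting between a (pitch, yaw) pair in radians and a 3D unit vector. The axis and sign convention used here (pitch about the x-axis, yaw about the y-axis, zero angles mapping to +z) is one common choice and is an assumption; the repository's own convention should be checked before relying on it. The function names are hypothetical.

```python
import math

def angles_to_unit_vector(pitch, yaw):
    """Convert (pitch, yaw) in radians to a 3D unit vector.

    Assumed convention: zero pitch and zero yaw map to (0, 0, 1).
    """
    x = math.cos(pitch) * math.sin(yaw)
    y = math.sin(pitch)
    z = math.cos(pitch) * math.cos(yaw)
    return (x, y, z)

def unit_vector_to_angles(vector):
    """Inverse of angles_to_unit_vector under the same assumed convention."""
    x, y, z = vector
    norm = math.sqrt(x * x + y * y + z * z)
    pitch = math.asin(y / norm)
    yaw = math.atan2(x, z)
    return (pitch, yaw)

# Round trip with hypothetical angles.
vec = angles_to_unit_vector(0.2, -0.5)
pitch_back, yaw_back = unit_vector_to_angles(vec)
```

Two angles are enough because a direction has only two degrees of freedom once its length is fixed to one.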
Hi, Zhang. Thanks for your project! My question is: why did you use the original normalized images in the training loop? In other words, why didn't you use any image augmentation, like random grayscale, random color distortion, ...?
Looking forward to your reply! :)
When I run demo.py, an error occurs: "ModuleNotFoundError: No module named 'imutils'". Maybe an imutils.py is missing from the project.
Hi, I don't quite understand what are those values in face_model.txt? Is there any difference between it and landmarks?
Hello, I ran into this problem:
In demo.py, if I use HeadPoseEstimator() instead of estimateHeadPose(), this error occurs:
--> 29 hr, ht, o_l, o_r, _ = head_pose_estimator(image, landmarks, camera_matrix[cam_id])
30 ## the easy way to get head pose information, fast and simple
31 # facePts = face_model.reshape(6, 1, 3)
NameError: name 'cam_id' is not defined
If I replace camera_matrix[cam_id] with camera_matrix, this problem occurs:
---------------------------------------------------------------------------
error Traceback (most recent call last)
<ipython-input-13-a4e70c1ea3f5> in <module>
27 landmarks = landmarks.reshape(-1, 2)
28 head_pose_estimator = HeadPoseEstimator()
---> 29 hr, ht, o_l, o_r, _ = head_pose_estimator(image, landmarks, camera_matrix)
30 ## the easy way to get head pose information, fast and simple
31 # facePts = face_model.reshape(6, 1, 3)
~/ETH-XGaze/head_pose.py in __call__(self, frame, landmarks, intrinsics, target_io_dist, visualize)
136 # Do PnP-based head pose fitting
137 rvec, tvec, reprojected_points, o_l, o_r, face_model = \
--> 138 self.head_pose_fit(landmarks, eos_mesh, intrinsics, scaling_factor)
139 o_r_2D = cv.projectPoints(o_r, rvec, tvec, intrinsics, None)[0].reshape(2)
140 o_l_2D = cv.projectPoints(o_l, rvec, tvec, intrinsics, None)[0].reshape(2)
~/ETH-XGaze/head_pose.py in head_pose_fit(self, landmarks_2D, deformed_mesh, intrinsics, scaling_factor)
95 # Initial fit
96 camera_matrix = intrinsics
---> 97 success, rvec, tvec, inliers = cv.solvePnPRansac(
98 sfm_points_ibug_subset, landmarks_2D, camera_matrix, None,
99 flags=cv.SOLVEPNP_EPNP)
error: OpenCV(4.5.2) /tmp/pip-req-build-dccdjyga/opencv/modules/calib3d/src/solvepnp.cpp:241: error: (-215:Assertion failed) npoints >= 4 && npoints == std::max(ipoints.checkVector(2, CV_32F), ipoints.checkVector(2, CV_64F)) in function 'solvePnPRansac'
could you please help?
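The assertion in the traceback says that the number of 3D model points must be at least 4 and must equal the number of 2D image points, with both arrays being float32 or float64. A small pre-check like the sketch below can show which condition fails before calling solvePnPRansac. The helper name and the example point counts are hypothetical; the check covers counts and shapes only, not dtypes.

```python
def check_pnp_inputs(object_points, image_points):
    """Return a list of problems that would trip the PnP point-count assertion."""
    problems = []
    n_obj, n_img = len(object_points), len(image_points)
    if n_obj < 4:
        problems.append(f"need at least 4 3D points, got {n_obj}")
    if n_obj != n_img:
        problems.append(f"{n_obj} 3D points but {n_img} 2D landmarks")
    if any(len(p) != 3 for p in object_points):
        problems.append("every 3D point must have 3 coordinates")
    if any(len(p) != 2 for p in image_points):
        problems.append("every 2D landmark must have 2 coordinates")
    return problems

# Hypothetical example: a 6-point face model against 68 detected landmarks.
model_points = [[0.0, 0.0, 0.0]] * 6
landmarks = [[0.0, 0.0]] * 68
issues = check_pnp_inputs(model_points, landmarks)
```

If the counts differ, the landmark array has to be reduced to the subset of points that the 3D model actually contains, in the same order.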
Hi,
Sorry to bother you. I am a student from Guangdong University of Technology in China.
I want to download the dataset for research and submitted the registration on the website according to the guidelines.
But I haven't received the download link.
My e-mail is [email protected]. Could you please send me the download link? I have submitted the application again.
Thanks for your great work and help!
Hello,
I would like to ask whether there is an easy way to get the real gaze values. I understand from the code that the returned gaze values are calculated after normalization, but I need the real gaze values for the original image, and I do not know what operation I should apply to the resulting gaze to get the real estimate at the original image position. I hope you understand what I mean. Thank you.
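One way to think about this, as a sketch rather than the repository's confirmed procedure: if normalization maps a camera-space gaze vector with a pure rotation, as in gc_normalized = np.dot(R, gc_src) from normalization_example.py, then the mapping can be undone with the transpose of R, because the inverse of a rotation matrix is its transpose. The predicted angles would first need converting to a 3D vector. The rotation used below is a hypothetical example.

```python
import math

def transpose(matrix):
    return [list(row) for row in zip(*matrix)]

def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def denormalize_gaze(R, g_norm):
    """Undo g_norm = R @ g_cam for a rotation matrix R (inverse = transpose)."""
    return matvec(transpose(R), g_norm)

# Hypothetical normalization rotation: 30 degrees about the y-axis.
a = math.radians(30.0)
R = [[math.cos(a), 0.0, math.sin(a)],
     [0.0, 1.0, 0.0],
     [-math.sin(a), 0.0, math.cos(a)]]

g_cam = [0.0, 0.0, -1.0]              # gaze in camera coordinates
g_norm = matvec(R, g_cam)             # gaze after normalization
g_back = denormalize_gaze(R, g_norm)  # recovered camera-space gaze
```

This requires keeping the R that was computed for each image during normalization.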
Hi, one question: in raw/data/annotation_train/subject*.csv, columns 3-4 and columns 5-7 are gaze point and gaze point. How do I calculate the 3D gaze vector from columns 3-4 and columns 5-7?
Hi,
How can I perform a person-specific evaluation if the ground truth data for test_person_specific is not present in the dataset?
Best Regards
Hi,
I noticed that camera parameters are required in your code (see below for details). However, I did not find the folder or the files in the repository. Could you please let me know where I can download these parameters/XML files?
file_name = './calibration/cam_calibration/' + 'cam' + str(cam_id).zfill(2) + '.xml' (l.195 in normalization_example.py)
Your help is very much appreciated.
Thanks.
What does face_gaze mean in the annotation file?
It is not the left/right eye gaze or the mean of both eyes' gaze, right?
Which 3D landmark do you use to calculate the face gaze?
Hi, thanks for this great paper and dataset and also all of your previous valuable works in the field of appearance-based gaze estimation.
I recently tried to use the raw output of the network, which is trained on the ETH-XGaze dataset, to estimate the PoG (Point of Gaze) in the CCS (Camera Coordinate System). So I used your normalization method and found the normalizing rotation matrix to transform the normalized gaze vector, which is in the HCS, into the 3D gaze vector in the CCS.
But it seems that pitch and yaw are not in the HCS, because when everything is unchanged except the camera position, the network output changes. So if that is correct and pitch and yaw are not in the HCS, we need an extra step beyond the normalizing rotation matrix that compensates for head pose. But I can't find this step, and it is unclear to me.
Hi Xucong,
Thank you so much for making this dataset, and code available. I wanted to ask, is there any way we can get the structure of the dataset? That's to say what each tar file contains, and so on, similar to what you did with MPIIFaceGaze. That was very helpful :)
I was able to download the 448 dataset, but it's missing the json file for train_test split, and the test set. You mentioned in #9 (comment) that it's possible to get a needed missing file from the raw data, could you please clarify how?
Many thanks :)
I'm trying to map the pred_gaze_np output to a 2D screen point. Is this something already implemented? If not, can you please tell me which approach should be followed?
I tried a simple polynomial regression from the vector to screen points and the results are decent but I'm wondering if there's a better approach.
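A geometric alternative to regression is to intersect the gaze ray with the screen plane. The sketch below assumes that the gaze origin, the gaze direction, and the screen plane are all known in the same camera coordinate system, which in practice requires a camera-to-screen calibration that is not described in this thread. All numeric values are hypothetical.

```python
def intersect_ray_plane(origin, direction, plane_point, plane_normal):
    """Point where the ray origin + t * direction (t >= 0) hits the plane.

    Returns None if the ray is parallel to the plane or points away from it.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    denom = dot(direction, plane_normal)
    if abs(denom) < 1e-9:
        return None  # ray is parallel to the plane
    offset = [p - o for p, o in zip(plane_point, origin)]
    t = dot(offset, plane_normal) / denom
    if t < 0:
        return None  # plane is behind the gaze origin
    return [o + t * d for o, d in zip(origin, direction)]

# Hypothetical setup in millimetres: the screen lies in the camera's z = 0
# plane and the face center is 600 mm in front of the camera.
origin = [0.0, 0.0, 600.0]
direction = [0.1, -0.05, -1.0]
hit = intersect_ray_plane(origin, direction, [0.0, 0.0, 0.0], [0.0, 0.0, 1.0])
```

The resulting 3D point is still in millimetres in camera coordinates; turning it into pixels needs the screen's position, orientation, and pixel pitch.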
Hello,
I hope this message finds you well. I am a graduate student at the UESTC(University of Electronic Science and Technology of China), and I am currently conducting research that requires access to Gaze data. I have submitted the registration form as required, but I have not yet received the download link.
Could you please assist me by providing the download link at your earliest convenience? My registered email is [email protected], and all the necessary information has been provided in the registration form.
I apologize for any inconvenience, and I appreciate your prompt attention to this matter. Time is of the essence for my research, and I would be grateful for your assistance.
Thank you very much for your help.
Best regards,
Hong Hu
UESTC, University of Electronic Science and Technology of China
[email protected]
Hi,
Sorry to bother you. I am a student from Jimei University in China.
I want to download the dataset for research and submitted the registration on the website according to the guidelines.
But I haven't received the download link. I'm sorry to bother you, but I am in urgent need of this dataset to conduct my research.
My e-mail is ([email protected]). Could you please send me the download link? I have submitted the application again.
Thanks for your great work and help!
After reading your paper, I still don't understand how the ground-truth gaze direction was obtained when collecting the data.
Could you explain this?
Thanks!
Hi,
Sorry to bother you. I am a student from Guangzhou University in China.
I want to download the dataset for research and submitted the registration on the website according to the guidelines.
But I haven't received the download link.
My e-mail is [email protected]. Could you please send me the download link? I have submitted the application again.
Thanks for your great work and help!
Hi everyone,
I wonder whether there is any way to get the eye patches, given that we have xgaze_224 (cropped face version) and annotations (lmks in original frame).
Thanks, and I would appreciate it if anyone can help.
Hello, I downloaded the 224x224 pre-processed dataset but found that only normalized face images are provided. Is there any way to extract cropped eyes? Since I need eye patches for my model. Thanks!
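One possible approach, sketched under assumptions: if the 3x3 perspective warp that produced the normalized face image can be recomputed (it is not clear from this thread whether it is stored with the dataset), the eye-corner landmarks from the original frame can be mapped into the normalized image and a box cropped around them. The matrix W, the landmark coordinates, and the aspect and margin parameters below are all hypothetical.

```python
import math

def warp_point(W, point):
    """Apply a 3x3 homography W to a 2D point (with homogeneous division)."""
    x, y = point
    u = W[0][0] * x + W[0][1] * y + W[0][2]
    v = W[1][0] * x + W[1][1] * y + W[1][2]
    w = W[2][0] * x + W[2][1] * y + W[2][2]
    return (u / w, v / w)

def eye_box(corner_a, corner_b, aspect=0.6, margin=1.5):
    """Axis-aligned box (left, top, right, bottom) around two eye corners."""
    cx = (corner_a[0] + corner_b[0]) / 2.0
    cy = (corner_a[1] + corner_b[1]) / 2.0
    width = margin * math.hypot(corner_b[0] - corner_a[0],
                                corner_b[1] - corner_a[1])
    height = aspect * width
    return (int(round(cx - width / 2)), int(round(cy - height / 2)),
            int(round(cx + width / 2)), int(round(cy + height / 2)))

# Hypothetical warp (pure 0.5x scaling) and eye corners in the original frame.
W = [[0.5, 0.0, 0.0],
     [0.0, 0.5, 0.0],
     [0.0, 0.0, 1.0]]
inner = warp_point(W, (200.0, 300.0))
outer = warp_point(W, (280.0, 300.0))
box = eye_box(inner, outer)
```

The box can then be used to slice the normalized face image, after clamping it to the image bounds.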
I find that the normalized gaze is computed by gc_normalized = np.dot(R, gc_src) in normalization_example.py. Does that mean that when I test a model trained on ETH-XGaze on MPIIFace, the normalized gaze of MPIIFace should be computed the same way? I ask because the normalized gaze is always computed by gc_normalized = np.dot(M, gc_src) in MPIIFace.
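To see why the choice matters, the sketch below compares the two variants numerically. It assumes that M = S R, where S = diag(1, 1, s) is a scaling along the z-axis; that reading of M is an assumption not confirmed in this thread, and the rotation, gaze vector, and scale are hypothetical. With s different from 1, the two normalized gaze directions differ, so labels produced one way are not directly comparable to labels produced the other way.

```python
import math

def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def unit(vector):
    norm = math.sqrt(sum(c * c for c in vector))
    return [c / norm for c in vector]

def angle_rotation_vs_scaled(R, g, z_scale):
    """Angle (radians) between unit(R g) and unit(S R g), S = diag(1, 1, z_scale)."""
    S = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, z_scale]]
    g_rot = unit(matvec(R, g))
    g_scaled = unit(matvec(S, matvec(R, g)))
    dot = sum(a * b for a, b in zip(g_rot, g_scaled))
    return math.acos(max(-1.0, min(1.0, dot)))  # clamp against rounding

# Hypothetical rotation: 10 degrees about the x-axis.
a = math.radians(10.0)
R = [[1.0, 0.0, 0.0],
     [0.0, math.cos(a), -math.sin(a)],
     [0.0, math.sin(a), math.cos(a)]]
g = unit([0.3, 0.1, -1.0])

angle_with_scaling = angle_rotation_vs_scaled(R, g, 0.5)
angle_without_scaling = angle_rotation_vs_scaled(R, g, 1.0)
```

Only when the scale factor is 1 do both formulas give the same direction.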
Thank you for your amazing work! One question: when will you release the full dataset with raw face images?
Hi,
Sorry to bother you. I am a student from Beijing Jiaotong University in China.
I want to download the dataset for research and submitted the registration on the website according to the guidelines.
But I haven't received the download link.
My e-mail is [email protected]. Could you please send me the download link? I submitted the application yesterday, and the information was filled in exactly as required.
Again, sorry to bother you here, but I really need to start my research as soon as possible.
Thanks for your great work and help! :)
Hello,
I am a student from Tsinghua University in China, and I want to download the data for research.
I submitted the registration form but haven't received the download link.
Could you please send me a download link? My email is [email protected]; the other information is in the registration form.
Sorry to bother you. I am just in a bit of a hurry.
Thanks a lot!
Hi, I am a Master's student, and I requested a download link for this dataset 2 months ago. I filled in all the required information at that time, but so far there has been no response. Could you please send the download URL of the dataset, or some response, to my mailbox? Thank you very much.
Hello, thank you for providing such a good gaze estimation dataset, it is really a great job!
I have a few questions to ask:
1. What do the parameters pix2mm, screen_translation, and screen_rotation in mirror_position.xml mean, and how are they obtained? How to ensure the measurement accuracy?
2. How are the coordinate points in the screen coordinate system converted to the camera coordinate system? Do you use the matrix in mirror_position.xml? Can you provide code to show this calculation process?
3. How do you get the cam_translation and cam_rotation in the calibration parameters(camxx.xml) of different cameras? Are you using a ruler to measure the distance between different cameras? How to ensure the measurement accuracy?
4. How are the parameters in the face_model.txt file calculated? How to ensure accuracy?
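For question 2, a generic sketch of how such a conversion is usually written: scale the pixel coordinates to millimetres, place the point on the screen plane, then apply a rigid transform. It assumes that pix2mm is the size of one pixel in millimetres and that screen_rotation and screen_translation map screen coordinates into camera coordinates. Both are assumptions about the meaning of the fields in mirror_position.xml; the stored transform could equally point in the opposite direction. All values are hypothetical.

```python
def screen_pixel_to_camera(pixel, pix2mm, rotation, translation):
    """Map a 2D screen pixel to a 3D point in camera coordinates.

    Assumes p_cam = rotation @ p_screen_mm + translation, with the screen
    surface at z = 0 in its own coordinate system.
    """
    u, v = pixel
    p_screen = [u * pix2mm, v * pix2mm, 0.0]
    rotated = [sum(r * p for r, p in zip(row, p_screen)) for row in rotation]
    return [a + b for a, b in zip(rotated, translation)]

# Hypothetical calibration: 0.25 mm per pixel, screen rotated 180 degrees
# about the y-axis relative to the camera, offset given in millimetres.
pix2mm = 0.25
rotation = [[-1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, -1.0]]
translation = [200.0, 50.0, 0.0]
p_cam = screen_pixel_to_camera((400, 200), pix2mm, rotation, translation)
```

If the calibration file stores the camera-to-screen transform instead, the rotation must be transposed and the translation adjusted accordingly before using this function.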