
eth-xgaze's People

Contributors

shreshthsaxena, xucong-zhang


eth-xgaze's Issues

Is gc the coordinate of the target or a calculated vector?

gc_normalized = gc - face_center # gaze vector
I am not sure whether gc is the coordinate of the target in the camera coordinate system, or a vector already calculated in the head coordinate system. And why is this step done? My current reading is sketched below; it may well be wrong.
Please explain; thanks!
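A minimal sketch of how I currently read this step, assuming both points are 3D coordinates in the camera coordinate system (the values are made up):

import numpy as np

gc = np.array([120.0, -80.0, 10.0])          # 3D gaze target on the screen, in mm
face_center = np.array([15.0, 40.0, 550.0])  # 3D face center from the fitted face model, in mm
gc_normalized = gc - face_center             # gaze direction: from the face towards the target
gc_normalized /= np.linalg.norm(gc_normalized)  # unit gaze vector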

trying to use other images for testing

Hello Xucong Zhang,

I was glad to read your paper and code. I am trying to verify whether I can use your pre-trained model on my own dataset, but one problem is that I do not have the camera information (xml file). I am wondering whether that will influence the quality of the output.

I am looking forward to your feedback.

Questions about the baseline structure

Hi Xucong,

Excellent work on ETH-XGaze! It really provides a diverse dataset for gaze estimation. I have a question regarding the baseline structure: why don't we compress the FC outputs through a tanh/sigmoid activation function to normalize the output a bit? Is there an intuition for using the raw outputs?

Additionally, I suggest that a model.eval() call be added to the demo code before running the forward pass:

ETH-XGaze/demo.py

Lines 158 to 164 in ca2d991

model.load_state_dict(ckpt['model_state'], strict=True)
input_var = img_normalized[:, :, [2, 1, 0]] # from BGR to RGB
input_var = trans(input_var)
input_var = torch.autograd.Variable(input_var.float().cuda())
input_var = input_var.view(1, input_var.size(0), input_var.size(1), input_var.size(2)) # the input must be 4-dimension
pred_gaze = model(input_var)
pred_gaze = pred_gaze[0] # here we assume there is only one face inside the image, then the first one is the prediction
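Concretely, I mean something along these lines (a sketch only, not the repository's code):

model.load_state_dict(ckpt['model_state'], strict=True)
model.eval()             # disable dropout and use running BatchNorm statistics
with torch.no_grad():    # no gradients are needed at inference time
    pred_gaze = model(input_var)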

Looking forward to your reply!

Best,
Yijun

Inquiry on Gaze Coordinate System Definition in ETH Dataset Compared to Gaze360

Hello ETH Dataset Team,

Thank you for open-sourcing the ETH dataset and methods!

I am currently working with the Gaze360 dataset and noticed its specific definition of the gaze coordinate system. In Gaze360, the eye coordinate system is defined as a right-handed coordinate system with the y-axis pointing up and the origin at the center of the Ladybug camera system. The positive x-axis points to the left, positive y-axis points up, and positive z-axis points away from the camera. For example, a gaze vector of [-1,0,0] represents looking to the right, and [0,0,-1] represents looking straight into the camera.

Could you please provide details on how the gaze coordinate system is defined in the ETH dataset? Specifically, how are the x, y, and z axes oriented in your gaze coordinate system, and what is the origin point of this system?

Understanding these differences will greatly assist in integrating and comparing data from both sources accurately.
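For reference, this is the pitch/yaw-to-vector conversion I currently assume (the MPIIGaze-style convention used in many gaze works; whether ETH-XGaze uses exactly these signs is part of what I would like to confirm):

import numpy as np

def pitchyaw_to_vector(pitch, yaw):
    # Assumed convention: the camera looks along +z, so a gaze of
    # (0, 0) maps to (0, 0, -1), i.e. looking straight into the camera.
    x = -np.cos(pitch) * np.sin(yaw)
    y = -np.sin(pitch)
    z = -np.cos(pitch) * np.cos(yaw)
    return np.array([x, y, z])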

Best regards,
HuAndrew

about the pre-trained model

Hi Xucong,

Excellent work on ETH-XGaze! I have now run into an issue with the pre-trained model: I downloaded it, but it can't be uncompressed. I don't know whether there is something wrong with the source file.
Looking forward to your reply!

Best,
LiuGang

Few questions about the dataset (gaze, pose)

Hello, I am trying to understand the data structure of the ETH-XGaze dataset.
In 'OnePersonDataset', three values are returned when it is called: image, pose, and gaze.
It seems that gaze is a combination of pitch and yaw in radians (please correct me if I am wrong).
I am a little confused about what pose does during training. If the pose represents the direction the face is pointing, then how can the pose be defined with one number (unlike pitch and yaw)?
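For reference, this is how I assume a 2D head pose could be derived from a head rotation vector in MPIIGaze-style preprocessing (a sketch with hypothetical inputs; I am not sure this matches the dataset's actual pose field):

import cv2
import numpy as np

hr = np.array([[0.1], [0.2], [0.0]])  # hypothetical 3x1 head rotation vector (e.g. from solvePnP)
hR = cv2.Rodrigues(hr)[0]             # 3x3 rotation matrix
Zv = hR[:, 2]                         # z-axis of the head coordinate system
head_pose = np.array([np.arcsin(Zv[1]), np.arctan2(Zv[0], Zv[2])])  # (pitch, yaw) pair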

Question about training dataset

Hi Zhang, thanks for your project! My question is: why did you use the original normalized images in the training loop? In other words, why didn't you use any image augmentation, like random grayscale or random color distortion (see the sketch below for the kind of thing I mean)?
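A minimal sketch of the augmentations I have in mind, using torchvision (the parameter values are just placeholders):

from torchvision import transforms

train_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),  # random color distortion
    transforms.RandomGrayscale(p=0.1),                                     # random grayscale
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])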
Looking forward to your reply! :)

cam_id or solvePnPRansac issue

Hello, I ran into this problem:

In demo.py, if I use HeadPoseEstimator() instead of estimateHeadPose(), this error occurs:

--> 29     hr, ht, o_l, o_r, _ = head_pose_estimator(image, landmarks, camera_matrix[cam_id])
     30     ## the easy way to get head pose information, fast and simple
     31 #     facePts = face_model.reshape(6, 1, 3)

NameError: name 'cam_id' is not defined

If I replace camera_matrix[cam_id] with camera_matrix, this problem occurs:

---------------------------------------------------------------------------
error                                     Traceback (most recent call last)
<ipython-input-13-a4e70c1ea3f5> in <module>
     27     landmarks = landmarks.reshape(-1, 2)
     28     head_pose_estimator = HeadPoseEstimator()
---> 29     hr, ht, o_l, o_r, _ = head_pose_estimator(image, landmarks, camera_matrix)
     30     ## the easy way to get head pose information, fast and simple
     31 #     facePts = face_model.reshape(6, 1, 3)

~/ETH-XGaze/head_pose.py in __call__(self, frame, landmarks, intrinsics, target_io_dist, visualize)
    136         # Do PnP-based head pose fitting
    137         rvec, tvec, reprojected_points, o_l, o_r, face_model = \
--> 138             self.head_pose_fit(landmarks, eos_mesh, intrinsics, scaling_factor)
    139         o_r_2D = cv.projectPoints(o_r, rvec, tvec, intrinsics, None)[0].reshape(2)
    140         o_l_2D = cv.projectPoints(o_l, rvec, tvec, intrinsics, None)[0].reshape(2)

~/ETH-XGaze/head_pose.py in head_pose_fit(self, landmarks_2D, deformed_mesh, intrinsics, scaling_factor)
     95         # Initial fit
     96         camera_matrix = intrinsics
---> 97         success, rvec, tvec, inliers = cv.solvePnPRansac(
     98             sfm_points_ibug_subset, landmarks_2D, camera_matrix, None,
     99             flags=cv.SOLVEPNP_EPNP)

error: OpenCV(4.5.2) /tmp/pip-req-build-dccdjyga/opencv/modules/calib3d/src/solvepnp.cpp:241: error: (-215:Assertion failed) npoints >= 4 && npoints == std::max(ipoints.checkVector(2, CV_32F), ipoints.checkVector(2, CV_64F)) in function 'solvePnPRansac'
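For completeness, this is how I would have expected cam_id and the camera matrix to be set up, assuming the calibration xml format read elsewhere in the repository (the node names follow the OpenCV calibration sample format; this is my guess, not tested):

import cv2

cam_id = 0  # index of the camera whose calibration file is available
fs = cv2.FileStorage('./calibration/cam_calibration/cam%02d.xml' % cam_id,
                     cv2.FILE_STORAGE_READ)
camera_matrix = fs.getNode('Camera_Matrix').mat()                # 3x3 intrinsics
camera_distortion = fs.getNode('Distortion_Coefficients').mat()  # distortion coefficients
fs.release()
# Note: the solvePnPRansac assertion also suggests the landmarks may need
# to be a float32/float64 array of shape (N, 2) with N >= 4.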

Could you please help?

Data download issues

Hi,
Sorry to bother you; I am a student from Guangdong University of Technology, China.
I want to download the dataset for research and submitted the registration on the website according to the guidelines,
but I haven't received the download link.
My e-mail is [email protected]; could you please send me the download link? I have submitted the application again.
Thanks for your great work and help!

Estimated gaze without the normalization

Hello,
I would like to ask whether there is an easy way to get the real (un-normalized) gaze values. I understand from the code that the returned gaze values are calculated after normalization, but I do not know what operation I should apply to the predicted gaze to get the real estimate for the original image. I hope you understand what I mean. Thank you.
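To make the question concrete, this is what I have been trying (a sketch; pred_gaze is the network's (pitch, yaw) output and R is the normalization rotation matrix from my own preprocessing):

import numpy as np

pitch, yaw = pred_gaze  # network output, in the normalized space
gaze_n = np.array([-np.cos(pitch) * np.sin(yaw),
                   -np.sin(pitch),
                   -np.cos(pitch) * np.cos(yaw)])  # unit gaze vector, normalized space
# undo the normalizing rotation (R is orthonormal, so inv(R) == R.T)
gaze_cam = R.T @ gaze_n  # gaze direction in the original camera coordinates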

Person Specific Evaluation

Hi,

How can I perform a person-specific evaluation if the ground-truth data for test_person_specific is not present in the dataset?

Best Regards

XML files for camera parameters

Hi,

I noticed that your code requires camera parameters (see below for details). However, I did not find the folder or the files in your code. Could you please let me know where I can download these parameters/xml files?

file_name = './calibration/cam_calibration/' + 'cam' + str(cam_id).zfill(2) + '.xml' (l.195 in normalization_example.py)

Your help is very much appreciated.

Thanks.

pitch and yaw (raw outputs of the network) are not in the HCS (head coordinate system)

Hi, thanks for this great paper and dataset and also all of your previous valuable works in the field of appearance-based gaze estimation.

I recently tried to use the raw output of the network trained on the ETH-XGaze dataset to estimate the PoG (point of gaze) in the CCS (camera coordinate system). So I used your normalization method and found the normalizing rotation matrix to transform the normalized gaze vector, which is in the HCS, into the 3D gaze vector in the CCS.

But it seems that pitch and yaw are not in the HCS, because when everything is unchanged except the camera position, the network output changes. If this is correct and pitch and yaw are indeed not in the HCS, we need an extra step, beyond the normalizing rotation matrix, that compensates for the head pose. I can't find this step, and it is ambiguous to me.

Dataset structure

Hi Xucong,

Thank you so much for making this dataset and code available. I wanted to ask: is there any way we can get the structure of the dataset, that is, what each tar file contains and so on, similar to what you provided for MPIIFaceGaze? That was very helpful :)
I was able to download the 448 dataset, but it's missing the json file for the train/test split, as well as the test set. You mentioned in #9 (comment) that it's possible to recover a missing file from the raw data; could you please clarify how?

Many thanks :)

Calibrating the gaze vector to a screen point?

I'm trying to map the pred_gaze_np output to a 2D screen point. Is this already implemented? If not, could you please advise on what approach to follow?
I tried a simple polynomial regression from the vector to screen points and the results are decent, but I'm wondering whether there's a better approach.
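For reference, the geometric alternative I am considering: intersect the 3D gaze ray with the screen plane, assuming the screen pose in camera coordinates is known from calibration (all names below are from my own sketch, not the repository):

import cv2
import numpy as np

def gaze_to_screen(origin, direction, rvec_screen, tvec_screen):
    # origin/direction: 3D gaze ray in camera coordinates (mm);
    # rvec_screen/tvec_screen: screen pose in camera coordinates.
    R_s = cv2.Rodrigues(rvec_screen)[0]
    n = R_s[:, 2]                    # screen plane normal in camera coordinates
    p0 = tvec_screen.reshape(3)      # a point on the screen plane
    t = np.dot(n, p0 - origin) / np.dot(n, direction)
    hit = origin + t * direction     # ray/plane intersection in camera coordinates
    xy = R_s.T @ (hit - p0)          # express the hit point in screen coordinates
    return xy[:2]                    # 2D screen point in mm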

Request for Gaze Data Download Link

Hello,

I hope this message finds you well. I am a graduate student at UESTC (University of Electronic Science and Technology of China), and I am currently conducting research that requires access to the Gaze data. I have submitted the registration form as required, but I have not yet received the download link.

Could you please assist me by providing the download link at your earliest convenience? My registered email is [email protected], and all the necessary information has been provided in the registration form.

I apologize for any inconvenience, and I appreciate your prompt attention to this matter. Time is of the essence for my research, and I would be grateful for your assistance.

Thank you very much for your help.

Best regards,
Hong Hu
UESTC, University of Electronic Science and Technology of China
[email protected]

Data downloading issue

Hi,
Sorry to bother you; I am a student from Jimei University, China.
I want to download the dataset for research and submitted the registration on the website according to the guidelines,
but I haven't received the download link. I'm sorry to bother you again, but I am in urgent need of this dataset for my research.
My e-mail is ([email protected]); could you please send me the download link? I have submitted the application again.
Thanks for your great work and help!

Data download issues

Hi,
Sorry to bother you; I am a student from Guangzhou University, China.
I want to download the dataset for research and submitted the registration on the website according to the guidelines,
but I haven't received the download link.
My e-mail is [email protected]; could you please send me the download link? I have submitted the application again.
Thanks for your great work and help!

How to crop eyes in the pre-processed datasets

Hello, I downloaded the 224x224 pre-processed dataset but found that only normalized face images are provided. Is there any way to extract cropped eyes? I need eye patches for my model. Thanks!
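For context, the workaround I am considering (a sketch, assuming eye-corner landmarks can be obtained on the normalized face with any detector, e.g. dlib or MediaPipe; the sizing heuristic is made up):

import numpy as np

def crop_eye(face_img, eye_corners):
    # face_img: normalized face image (e.g. 224x224);
    # eye_corners: 2x2 array with the inner and outer corner of one eye.
    cx, cy = np.mean(eye_corners, axis=0).astype(int)               # eye centre
    w = int(1.6 * np.linalg.norm(eye_corners[0] - eye_corners[1]))  # patch size heuristic
    x0, y0 = max(cx - w // 2, 0), max(cy - w // 2, 0)
    return face_img[y0:y0 + w, x0:x0 + w]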

about the normalized gaze

I see that the normalized gaze is computed as gc_normalized = np.dot(R, gc_src) in normalization_example.py. Does this mean that when I test a model trained on ETH-XGaze on MPIIFace, the normalized gaze of MPIIFace should be computed the same way? I ask because the normalized gaze has always been computed as gc_normalized = np.dot(M, gc_src) in MPIIFace.
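My current understanding, sketched below, follows the revised data normalization of Zhang et al.; S and R here are the scaling and rotation matrices of the normalization, and this may not match what the authors intend:

import numpy as np

# M = S @ R is used to warp the image, but the scaling S would distort
# a direction, so only the rotation is applied to the gaze vector:
gc_normalized = R @ gc_src
gc_normalized /= np.linalg.norm(gc_normalized)  # keep it a unit vector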

Data Downloading issue

Hi,
Sorry to bother you; I am a student from Beijing Jiaotong University, China.
I want to download the dataset for research and submitted the registration on the website according to the guidelines,
but I haven't received the download link.
My e-mail is [email protected]; could you please send me the download link? I submitted the application yesterday, and the information was filled in exactly as required.
Again, sorry to bother you here, but I really need to start my research as soon as possible.
Thanks for your great work and help! :)

Data download request

Hello,
I am a student from Tsinghua University, China, and I want to download the data for my research.
I submitted the registration form but haven't received the download link.
Could you please send me a download link? My email is [email protected]; the other information is in the registration form.
Sorry to bother you; I am just in a bit of a hurry.
Thanks a lot!

Dataset Download Request

Hi, I am a Master's student, and I requested download access for this dataset two months ago. I filled in all the required information at that time, but so far there has been no response. Could you please send the download URL of the dataset, or a reply to my mailbox? Thank you very much.

Questions about mirror_position.xml

Hello, thank you for providing such a good gaze estimation dataset; it is really great work!

I have a few questions to ask:

1. What do the parameters pix2mm, screen_translation, and screen_rotation in mirror_position.xml mean, and how are they obtained? How do you ensure the measurement accuracy?

2. How are coordinate points in the screen coordinate system converted to the camera coordinate system? Do you use the matrix in mirror_position.xml? Could you provide code showing this calculation? (My own guess is sketched after this list.)

3. How did you obtain the cam_translation and cam_rotation in the calibration parameters (camxx.xml) of the different cameras? Did you use a ruler to measure the distance between the cameras? How do you ensure the measurement accuracy?

4. How are the parameters in the face_model.txt file calculated? How do you ensure their accuracy?
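Regarding question 2, this is my current guess (a sketch only; it assumes pix2mm holds per-axis pixel-to-millimetre factors and that screen_rotation/screen_translation give the screen pose in camera coordinates, which I have not verified):

import numpy as np

def screen_pixel_to_camera(px, py, pix2mm, screen_rotation, screen_translation):
    # pixel -> millimetres on the screen plane (z = 0 in screen coordinates)
    p_screen = np.array([px * pix2mm[0], py * pix2mm[1], 0.0])
    # rigid transform into the camera coordinate system
    return screen_rotation @ p_screen + screen_translation.reshape(3)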
