Comments (9)
Hi Yijun,
Thank you for your interest in our work.
I did not use tanh/sigmoid to normalize the output gaze direction because I don't think it is necessary. The gaze labels of the training data are constrained to a certain range, and a model trained on such data will output values inside that range. I personally don't think tanh normalization would help, but you can give it a try.
Thank you for the suggestion of adding eval mode. I have just made that change to the file.
Best,
Xucong
from eth-xgaze.
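For anyone who wants to try the tanh variant discussed above, here is a minimal sketch. The helper name and the `max_angle` bound are my own assumptions; the released ETH-XGaze model regresses pitch/yaw directly without such a head.

```python
import numpy as np

def bound_gaze_output(raw_pitchyaw, max_angle=np.pi / 2):
    # Squash unbounded regression outputs into [-max_angle, max_angle].
    # This is only the alternative being discussed, not the authors' code:
    # the released model outputs pitch/yaw without any squashing.
    return max_angle * np.tanh(np.asarray(raw_pitchyaw, dtype=float))
```

With `max_angle = pi/2` the output can never leave the label range, at the cost of saturating gradients for extreme raw values.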
Hi Xucong,
Thanks for the explanation of the network structure. Additionally, can you share more details about how the gaze vectors (pitch and yaw) are generated? Does it follow a pipeline similar to this:
- Generate 3D key points of the participant's face
- Connect the point-of-regard with the 3D eye-center to formulate the gaze vector
- Normalize/Off-set the gaze vector using the head pose
Thanks in advance.
Best,
Yijun
from eth-xgaze.
Hi Yijun,
You are right about the pipeline of data normalization. For step 2, we take the 3D face centre as the gaze origin since the input image is a face patch. The face centre is defined as "mean( mean(4 eye corners), mean(2 nose corners) )". We will release the code for the data normalization pipeline soon.
Best,
Xucong
from eth-xgaze.
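As a sketch of the face-centre definition above: the function name and input layout here are my own, and it assumes the 3D landmarks are already expressed in camera coordinates.

```python
import numpy as np

def face_center_3d(eye_corners, nose_corners):
    # eye_corners: (4, 3) array, the four eye-corner landmarks
    # nose_corners: (2, 3) array, the two nose-corner landmarks
    # centre = mean( mean(4 eye corners), mean(2 nose corners) )
    eye_mean = np.mean(np.asarray(eye_corners, dtype=float), axis=0)
    nose_mean = np.mean(np.asarray(nose_corners, dtype=float), axis=0)
    return (eye_mean + nose_mean) / 2.0
```

Note this is a mean of two means, not a mean of all six points, so the eye and nose landmarks are weighted equally regardless of count.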
Brother, he has already published several papers on this and you still don't know how he normalizes the data?
Isn't the usage written clearly enough in the demo?
1. The gaze origin is the 6-point face centre: the four eye corners plus the two nose corners. Previous papers did it this way.
2. The 3D gaze vector obviously comes from a 3D target point the subject stares at during capture; the line from the face centre to that point gives the vector, which is then converted into two 2D angles. The MPIIGaze computation method covers the principle.
3. Why ask about "Normalize/Off-set the gaze vector using the head pose"? The data was captured from multiple camera angles, the subjects were allowed to rotate their heads, and rotation ground truth is provided, so why wouldn't it be used?
Not reading the paper and not checking the data: I think that is very unprofessional.
Also, the network output is an angle in [-pi/2, pi/2]; just try it with and without tanh and you'll see.
And why not normalize/tanh the input as well? You should try all of these.
from eth-xgaze.
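The 3D-vector-to-2D-angle step mentioned in the comment above can be sketched like this. The sign convention assumed here follows common gaze-estimation code (e.g. the MPIIGaze utilities); verify it against the dataset's annotations before relying on it.

```python
import numpy as np

def vector_to_pitchyaw(g):
    # g: 3D gaze direction in camera coordinates (x right, y down, z forward).
    # Assumed convention: pitch = arcsin(-y), yaw = arctan2(-x, -z),
    # so a gaze straight into the camera maps to (0, 0).
    g = np.asarray(g, dtype=float)
    g = g / np.linalg.norm(g)
    pitch = np.arcsin(-g[1])
    yaw = np.arctan2(-g[0], -g[2])
    return pitch, yaw
```

For gaze directions in front of the camera, both angles stay within [-pi/2, pi/2], matching the output range discussed above.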
Hi,
Thank you very much for the comment. This is valuable feedback.
I personally think it was reasonable for @Yijun88 to ask these questions since he was confused, and he was actually asking for confirmation rather than why we did the data normalization. I am happy to answer any question related to our project. I hope we can keep a friendly environment in this Q&A section so that no one will be afraid to ask questions or post issues. Thank you.
Best,
Xucong
from eth-xgaze.
Hello, I would like to ask about the collection of the 3D fixation points. Your paper mentions using a projector to generate the gaze targets, but it does not explain how these are converted into 3D point data. Were they obtained by physically measuring the actual dimensions? In "Appearance-Based Gaze Estimation in the Wild", the 3D data is obtained by calibrating the display screen relative to the camera. This paper is not very clear on this point. Thanks for the clarification!
from eth-xgaze.
Hi, please open a new issue when asking a different question next time. I will answer in English so the public can understand.
To obtain the 3D gaze direction, we did the same thing as in "Appearance-Based Gaze Estimation in the Wild": we performed camera-screen calibration so that we can convert any point in the 3D screen coordinate system to the 3D camera coordinate system.
from eth-xgaze.
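A minimal sketch of that conversion and of forming the gaze vector from it, assuming the calibration yields a rotation `R` and translation `t` mapping screen coordinates to camera coordinates. The function names are my own, not the released code.

```python
import numpy as np

def screen_to_camera(p_screen, R, t):
    # Map a 3D point in the screen coordinate system into the camera
    # coordinate system using the camera-screen extrinsics (R, t).
    return (np.asarray(R, dtype=float) @ np.asarray(p_screen, dtype=float)
            + np.asarray(t, dtype=float))

def gaze_direction(face_center_cam, target_screen, R, t):
    # Gaze vector: from the 3D face centre to the fixation target,
    # both in camera coordinates, normalized to unit length.
    target_cam = screen_to_camera(target_screen, R, t)
    g = target_cam - np.asarray(face_center_cam, dtype=float)
    return g / np.linalg.norm(g)
```

In practice `(R, t)` would come from a camera-screen calibration such as the mirror-based procedure used in prior gaze datasets.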
I am happy with Xucong Zhang's reply. Sometimes the more we read or try, the more confused we get, and we ask "stupid" questions; but when we get confirmation or positive feedback, we may suddenly understand everything. My supervisor always tells me there is no stupid question.
from eth-xgaze.
There is no need to be toxic, man. IMO what he asked actually means a lot to whoever comes to this issue with questions. I am sure this question and the confirmation from the authors have helped many people, including me.
A Q&A board exists to speed up our progress, not to judge one's professionalism.
from eth-xgaze.
Related Issues (20)
- How to crop eye in the pre-processed datasets HOT 3
- How to calculate the rvec and tvec? Are they calculated by solvepnp, which paras are face_model_3d_coordinates and ldmk68s from csv?
- pitch and yaw (raw outputs of network) are not in HCS (head coordinates system) HOT 2
- Dataset structure HOT 1
- Few questions about the dataset (gaze, pose)
- What does the face_gaze mean in the annotation file? HOT 1
- Data download issues HOT 1
- about the normalized gaze HOT 1
- Datasets Request Download HOT 1
- dataset download issues HOT 1
- Questions about mirror_position.xml
- Request for Gaze Data Download Link
- Data download issues
- Data Downloading issue
- Data downloading issue
- Inquiry on Gaze Coordinate System Definition in ETH Dataset Compared to Gaze360 HOT 2
- Question about training dataset
- Person Specific Evaluation
- Syntax Error in main.py L23 HOT 2
- annotation_test/subject0001.csv is incomplete?