
Comments (9)

xucong-zhang commented on September 17, 2024

Hi Yijun,

Thank you for your interest in our work.

I did not use tanh/sigmoid to normalize the output gaze direction because I don't think it is necessary. The gaze labels of the training data are constrained to a certain range, and a model trained on such data will output values inside that range. I personally don't think tanh normalization would help, but you can give it a try.

Thank you for the suggestion of adding eval mode. I have just made that change to the file.

Best,
Xucong

from eth-xgaze.

Yijun88 commented on September 17, 2024

Hi Xucong,

Thanks for the explanation of the network structure. Additionally, can you share more details about how the gaze vectors (pitch and yaw) are generated? Does it follow a pipeline similar to this:

  1. Generate 3D key points of the participant's face
  2. Connect the point-of-regard with the 3D eye-center to formulate the gaze vector
  3. Normalize/Off-set the gaze vector using the head pose
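
A rough sketch of step 2 above, under a common sign convention (camera coordinates with x right, y down, z forward; the exact convention used by the dataset is an assumption here, so verify against the released code):

```python
import numpy as np

def gaze_vector_to_pitch_yaw(origin, target):
    """Connect the 3D point-of-regard to the 3D gaze origin and
    convert the unit vector to (pitch, yaw).

    Assumed convention: pitch = arcsin(-g_y), yaw = arctan2(-g_x, -g_z),
    as commonly used for appearance-based gaze estimation.
    """
    g = np.asarray(target, dtype=float) - np.asarray(origin, dtype=float)
    g /= np.linalg.norm(g)               # unit gaze vector
    pitch = np.arcsin(-g[1])             # vertical angle
    yaw = np.arctan2(-g[0], -g[2])       # horizontal angle
    return pitch, yaw

# A target straight in front of the face (along -z) gives (0, 0):
print(gaze_vector_to_pitch_yaw([0.0, 0.0, 0.0], [0.0, 0.0, -1.0]))
```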

Thanks in advance.

Best,
Yijun


xucong-zhang commented on September 17, 2024

Hi Yijun,

You are right about the pipeline of data normalization. For step 2, we take the 3D face centre as the gaze origin, since the input image is a face patch. The face centre is defined as "mean( mean(4 eye corners), mean(2 nose corners) )". We will release the code for the data normalization pipeline soon.
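
That face-centre definition can be sketched as follows (the landmark arrays are placeholders for the actual detected 3D landmarks):

```python
import numpy as np

def face_center(eye_corners, nose_corners):
    """Face centre as defined above:
    mean( mean(4 eye corners), mean(2 nose corners) ).

    eye_corners:  (4, 3) array of 3D eye-corner landmarks
    nose_corners: (2, 3) array of 3D nose-corner landmarks
    """
    eye_mid = np.mean(eye_corners, axis=0)
    nose_mid = np.mean(nose_corners, axis=0)
    return (eye_mid + nose_mid) / 2.0

# Toy landmarks (units arbitrary):
eyes = np.array([[-3.0, 0, 0], [-1, 0, 0], [1, 0, 0], [3, 0, 0]])
nose = np.array([[-1.0, 2, 0], [1, 2, 0]])
print(face_center(eyes, nose))  # [0. 1. 0.]
```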

Best,
Xucong


lucaskyle commented on September 17, 2024

[quoted @Yijun88's question above]

Dude, he has already published several papers and you still don't know how he normalizes the data? Isn't the usage written clearly enough in the demo?

1. The gaze origin is the face centre based on 6 points: the four eye corners and the two nose corners ---> that is how the earlier papers did it.
2. The 3D gaze vector is obviously obtained during capture: there is a 3D point for the subject to stare at, the line from the face centre to that point forms the vector, and the vector is then converted to 2D angles. The computation is all described in their MPIIGaze work.
3. Why "Normalize/Off-set the gaze vector using the head pose"? The data was captured from multiple camera angles, the subjects were allowed to rotate their heads, and rotation-related ground truth is provided. How could it not be used?

You don't read the papers and you don't check the data. I think you are very unprofessional.

Also, the network output is an angle in [-pi/2, pi/2]. Just try it with and without tanh and you will know.
Besides, why not also normalize the input with tanh? You have to try these things.


xucong-zhang commented on September 17, 2024

[quoted @Yijun88's question and @lucaskyle's reply above]

Hi,

Thank you very much for the comment. This is valuable feedback.

I personally think @Yijun88 was right to ask these questions since he was unsure, and he was actually asking for confirmation rather than questioning why we did the data normalization. I am happy to answer any question related to our project. I hope we can keep a friendly environment in this Q&A section so that no one will be afraid to ask questions or post issues. Thank you.

Best,
Xucong


JohnsenJiang commented on September 17, 2024

Hello, I would like to ask about the collection of the 3D gaze targets. Your paper mentions using a projector to generate the fixation points, but it does not explain how those are converted into 3D point data. Were they obtained through physical measurement? In "Appearance-Based Gaze Estimation in the Wild", the display screen is calibrated against the camera so that 3D data can be obtained, but this paper is not very clear on that point. Thanks for clarifying!


xucong-zhang commented on September 17, 2024

Hi, please open a new issue when asking a different question next time. I will reply in English so that the public can follow.

To obtain the 3D gaze direction, we did a similar thing to what we did in "Appearance-Based Gaze Estimation in the Wild": we performed camera-screen calibration so that we can convert any point in the 3D screen coordinate system to the 3D camera coordinate system.
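
A minimal sketch of that conversion, assuming the camera-screen calibration yields a rotation R and translation t mapping the screen coordinate system into the camera coordinate system (the symbols and values are illustrative, not from the actual calibration files):

```python
import numpy as np

def screen_to_camera(p_screen, R, t):
    """Map a 3D point from the screen coordinate system to the
    camera coordinate system using calibrated extrinsics (R, t)."""
    return R @ np.asarray(p_screen, dtype=float) + t

# Toy calibration: identity rotation, screen origin 10 cm to the
# right of the camera origin (values are made up for illustration).
R = np.eye(3)
t = np.array([0.1, 0.0, 0.0])
print(screen_to_camera([0.0, 0.0, 0.0], R, t))
```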


nalibjchn commented on September 17, 2024

[quoted the exchange between @Yijun88, @lucaskyle, and @xucong-zhang above]

I am happy with Xucong Zhang's reply. Sometimes the more we read or try, the more confused we become and we ask "stupid" questions, but when we get confirmation or positive feedback, we may suddenly understand everything. My supervisor always tells me there is no stupid question.


kevinsu628 commented on September 17, 2024

[quoted @Yijun88's question and @lucaskyle's reply above]

There is no need to be toxic, man. In my opinion, what he asked actually means a lot to whoever comes to this issue with questions. I am sure this question and the confirmation from the authors have helped many people, including me.
A Q&A board exists to speed up our progress, not to judge anyone's professionalism.

