
eye-tracker's Introduction

Eye Tracker

This repository implements and improves the iTracker model proposed in the paper Eye Tracking for Everyone.

Figure 1: iTracker architecture

Figure 2: modified iTracker architecture

Figures 1 and 2 show the architectures of the iTracker model and the modified model. The only difference between the two is that the modified model first concatenates the face layer FC-F1 and the face mask layer FC-FG1, applies a fully connected layer FC-F2, and only then concatenates the eye layer FC-E1 with the FC-F2 layer.

We claim that this modified architecture is superior to the iTracker architecture. Intuitively, concatenating the face mask information directly with the eye information may confuse the model, since the face mask information is irrelevant to the eye information. Even though the iTracker model managed to learn this fact from the data, the modified model outperforms it by having this knowledge explicitly encoded in its architecture. In experiments, the modified model converged faster (28 epochs vs. 40+ epochs) and achieved a better validation error (2.19 cm vs. 2.514 cm).

The iTracker model is implemented in itracker.py and the modified one in itracker_adv.py. Note that a smaller dataset (i.e., a subset of the full dataset in the original paper) was used in the experiments and no data augmentation was applied. This smaller dataset contains 48,000 training samples and 5,000 validation samples. You can download this smaller dataset here.
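For concreteness, the modified fusion head can be sketched in Keras as follows. This is a minimal sketch only: the layer widths and the shapes of the incoming feature vectors are illustrative assumptions, not the exact values used in itracker_adv.py.

    import tensorflow as tf
    from tensorflow.keras import layers

    # Assumed feature vectors produced by the convolutional towers
    # (dimensions are placeholders, not the repo's actual sizes).
    eye_feat = layers.Input(shape=(128,))   # FC-E1: combined left/right eye features
    face_feat = layers.Input(shape=(128,))  # FC-F1: face features
    mask_feat = layers.Input(shape=(128,))  # FC-FG1: face mask features

    # Modified model: fuse the face and face mask features first ...
    fc_f2 = layers.Dense(64, activation='relu')(
        layers.Concatenate()([face_feat, mask_feat]))  # FC-F2

    # ... then concatenate the eye features with FC-F2 and regress the gaze point.
    fused = layers.Concatenate()([eye_feat, fc_f2])
    out = layers.Dense(2)(layers.Dense(128, activation='relu')(fused))  # (x, y)

    model = tf.keras.Model([eye_feat, face_feat, mask_feat], out)
    model.summary()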

Get started

To train the model: run python itracker_adv.py --train -i input_data -sm saved_model

To test the trained model: run python itracker_adv.py -i input_data -lm saved_model

You can find a pretrained (on the smaller dataset) model under the pretrained_models/itracker_adv/ folder.
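For example, to evaluate the bundled checkpoint (assuming model-23 is the checkpoint prefix in that folder, as referenced in the issues below): run python itracker_adv.py -i input_data -lm pretrained_models/itracker_adv/model-23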

FAQ

  1. What are the datasets?

The original dataset comes from the GazeCapture project. It involves over 1,400 subjects and results in more than 2 million face images. Due to limited computation power, a much smaller dataset with 48,000 training samples and 5,000 validation samples was used here. Each sample contains five items: face, left eye, right eye, face mask, and labels.
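A quick way to check what the archive actually contains is sketched below, assuming the dataset is distributed as a single NumPy .npz archive (as suggested by the load_data snippet in the issues below); the file name here is hypothetical.

    import numpy as np

    # List each array in the archive along with its shape; the exact key
    # names (face, eyes, face mask, labels) are whatever the download provides.
    data = np.load('eye_tracker_data.npz')  # hypothetical file name
    for key in data.files:
        print(key, data[key].shape)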

Other implementations

For a PyTorch implementation, see GazeCapture.


eye-tracker's Issues

How can I train on the dataset?

I am trying to run this in cmd, but there is an error:

python itracker_adv.py --train -i input_data -sm saved_model

How can I give the path of the dataset?

import numpy as np
def load_data(file):
    npzfile = np.load(file)
    return npzfile

Thank you.

What is the meaning of y in the dataset?

At first I thought they were coordinates, but then I found both positive and negative values... Does anyone know what these fields mean?
[[ 1.70154930e+00 -1.93875000e+00]
[-6.85229867e+00 7.72708985e-01]
[-1.08925000e+01 5.00000000e-04]
...
[-7.52597535e+00 9.69375000e-01]
[ 5.78865727e+00 -4.60561372e-01]
[-3.64302465e+00 -9.69375000e-01]]

Enquiry about the nature of the smaller dataset

Hi Hugo Chan,

I'm a research staff member at Nanyang Technological University in Singapore.

I came across your version of iTracker and I'm really interested in studying it and seeing whether I can use it in the computer games I'm working on.

In the original Eye Tracking for Everyone paper, it was mentioned that the number of subjects in the training set matters a lot for the generalization of the neural network. May I check: when processing the smaller subset of the original data, how many subjects (out of the original 1,437) were included, and how many instances per subject?

Does the validation set share subjects (and instances per subject) with the training set, or does it only include subjects not seen by the model during training?

Would it be possible to share the code you used to process the original dataset? :)

That aside: I built an inference engine using the OpenCV library, but unfortunately, using your pre-trained model as the deployed model and my face (captured through my laptop's webcam) as the inference input, the accuracy is horrible. The pre-trained model-23 was not able to detect my gaze at all; it gave random values.

What do you think may be the problem?

Guang Way

I have a question regarding input_data

Hello. We are running this code in PyCharm. I know that there are four files created when saving a model in TensorFlow. To train the model, we tried running the program by putting

"--train -i pretrained-models/itracker_adv/model-23.data-00000-of-00001 -sm saved model" in the paremeter value of the configuration, but the following error occurs.
KakaoTalk_20210125_213216836

If you know how to solve it, I would be grateful for an answer.
Thanks for reading my question. Have a nice day!

Load_model needed

Hi Hugo, I'm interested in studying your work for fun. I want to try running it first, but I found no saved model that could be passed into the program. Could you please show me how to sort this out? Thanks.

Link for smaller dataset

Thank you for your work. Could you please share an updated link for the smaller dataset? The existing link seems to be broken. Thank you for your time.

Capturing face parts

Hi!

Do you know a good method for capturing the face parts from a video or an image?

Thanks

Training on Custom Dataset

@hugochan Hi, thanks for the wonderful eye tracking and detection code. I just wanted to know: what is the procedure for training on a custom dataset? Can you briefly give some pointers for training on a new/custom dataset?

About the label y

Hello, are the train_y.npy and val_y.npy files simply (x, y) coordinates, or do they also encode the correspondence with the four image inputs above? I'd appreciate an expert's answer.

How does one specify device type and orientation?

Hi, I remember reading this paper some time back, and as I understand it, the device type and orientation mattered a lot, because the position of the camera relative to the screen makes a difference.

In this code, I don't see any parameters to specify device type or orientation. Could you please explain what I'm missing? Does this model only work for a specific device type and orientation? If so, which device and orientation would that be?

Thanks in advance!

How to use?

Hello, I don't have an ML background, but I would like to try this out.

How do I get the x and y gaze prediction output, given a face image, the width and height of the screen, and maybe some camera details (e.g., the distance of the camera from the screen and the camera size)?

An example Python implementation would be great, thanks a lot :D

Cheers!

P.S. Can this be used for a different screen size? E.g., a 24-inch monitor instead of an iPhone screen?

What are train_y and val_y in the dataset?

I expect both train_y and val_y to be the 2D coordinates of eye gaze.

But the result is quite strange. Most of the coordinates are very small or negative, which would mean most gaze points fall at the top-left side. But that isn't actually the case; most of them are wrong.

(screenshot: Capture000)

Please note the circle and its center; the center is the predicted eye gaze point.

(screenshots: Capture001, Capture002)

About the better prediction error

Hi Hugo Chan, nice to meet you. I have recently been focusing on this technology, which might help us learn more about our end users' thoughts. I read the MIT paper; they claim a prediction error of 1.71 cm on mobile and 2.53 cm on a tablet.

And this repo's README says "better validation error (2.19 cm vs. 2.514 cm)". Is this a comparison to the MIT <eye_tracking_for_everyone> result?

Thanks!
