We want to develop a cool feature in the smart-TV that can recognise five different gestures performed by the user which will help users control the TV without using a remote. The gestures are continuously monitored by the webcam mounted on the TV. Each gesture corresponds to a specific command:
- Thumbs up: Increase the volume
- Thumbs down: Decrease the volume
- Left swipe: 'Jump' backwards 10 seconds
- Right swipe: 'Jump' forward 10 seconds
- Stop: Pause the movie
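Downstream, the predicted gesture label has to be translated into a TV command. A minimal sketch of that mapping is below; the label ordering and command names are illustrative assumptions, since the actual TV control API is outside the scope of this project.

```python
# Hypothetical mapping from predicted class labels to TV commands.
# Label order (0-4) and command names are assumptions for illustration.
GESTURE_COMMANDS = {
    0: ("thumbs_up", "volume_up"),
    1: ("thumbs_down", "volume_down"),
    2: ("left_swipe", "seek_backward_10s"),
    3: ("right_swipe", "seek_forward_10s"),
    4: ("stop", "pause"),
}

def command_for(label: int) -> str:
    """Return the TV command string for a predicted class label."""
    gesture, command = GESTURE_COMMANDS[label]
    return command
```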
We tried both Conv3D and Conv2D+LSTM architectures. With Conv3D, validation accuracy never exceeded 20% even though training accuracy reached about 80%, and it also demanded considerably more compute and training time.
We therefore chose Conv2D+LSTM, which gave us better validation accuracy and reduced the overfitting problem.
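A minimal Keras sketch of the Conv2D+LSTM idea is shown below: `TimeDistributed` applies the same 2D convolutions to every frame of a clip, and an LSTM then models the temporal order of the per-frame features. The clip shape (30 frames of 120x120 RGB) and all layer sizes are assumptions for illustration, not the exact hyperparameters used in the project.

```python
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (TimeDistributed, Conv2D, MaxPooling2D,
                                     Flatten, LSTM, Dense, Dropout)

# Assumed input: clips of 30 frames, each a 120x120 RGB image.
FRAMES, H, W, C = 30, 120, 120, 3
NUM_CLASSES = 5  # the five gestures

model = Sequential([
    Input(shape=(FRAMES, H, W, C)),
    # Spatial feature extraction, applied identically to each frame
    TimeDistributed(Conv2D(16, (3, 3), activation="relu", padding="same")),
    TimeDistributed(MaxPooling2D((2, 2))),
    TimeDistributed(Conv2D(32, (3, 3), activation="relu", padding="same")),
    TimeDistributed(MaxPooling2D((2, 2))),
    TimeDistributed(Flatten()),
    # Temporal modelling across the frame sequence
    LSTM(64),
    Dropout(0.25),
    Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

Compared with a full Conv3D stack, this layout has far fewer parameters in the spatial path (the same 2D filters are reused for all frames), which is one common way to curb overfitting on small gesture datasets.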
- Python - 3.9.12
- numpy - 1.21.5
- pandas - 1.4.2
- matplotlib
- seaborn - 0.11.2
- sklearn
- keras
- This project was inspired by a case study from the upGrad AI & ML course
Created by @sandipanp - feel free to contact me!