LipChat is a lipreading application that decodes text from the movement of a speaker's mouth. It is a reimplementation of the LipNet model, which performs end-to-end sentence-level lipreading.
- Converts a sequence of video frames to text
- Uses spatiotemporal convolutions, a recurrent network, and connectionist temporal classification (CTC) loss
- Trained entirely end-to-end
- Achieves high accuracy (95.2%) when transcribing speech from video to text
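To illustrate the CTC part of the pipeline: the network emits one character distribution per video frame, and a CTC decoder collapses repeated symbols and removes blanks to produce the sentence. Below is a minimal greedy (best-path) decoding sketch; the vocabulary and scores are illustrative, not LipChat's actual character set.

```python
import numpy as np

# Illustrative vocabulary; LipChat's real charset may differ.
VOCAB = [' ', 'a', 'b', 'c']   # indices 0..3
BLANK = len(VOCAB)             # CTC blank symbol (index 4)

def ctc_greedy_decode(logits):
    """Best-path CTC decoding: pick the top symbol per frame,
    collapse consecutive repeats, then drop blanks."""
    best = np.argmax(logits, axis=1)  # most likely symbol per frame
    collapsed = [s for i, s in enumerate(best) if i == 0 or s != best[i - 1]]
    return ''.join(VOCAB[s] for s in collapsed if s != BLANK)

# Toy per-frame scores for 6 frames over 5 classes (4 chars + blank):
# frames emit "a", "a", blank, "b", "b", "c" -> decodes to "abc"
frames = np.array([
    [0, 9, 0, 0, 0],
    [0, 9, 0, 0, 0],
    [0, 0, 0, 0, 9],
    [0, 0, 9, 0, 0],
    [0, 0, 9, 0, 0],
    [0, 0, 0, 9, 0],
], dtype=float)

print(ctc_greedy_decode(frames))  # -> abc
```

In practice a beam-search decoder (often with a language model) gives better sentences than greedy decoding, but the collapse-and-drop-blank rule is the same.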
-
Clone the repository:
git clone https://github.com/abuzar0013/First-Minor-Project.git
-
First, create a Conda environment to easily manage all the required libraries.
conda create --name LipChat
Activate the LipChat environment:
conda activate LipChat
-
Install dependencies:
pip install -r requirements.txt
Before running the app, adjust the file paths in the code to match your system.
-
Run the LipChat app:
python streamlitapp.py
OR
streamlit run /FIRST-MINOR-PROJECT/app/streamlitapp.py
-
Follow the on-screen instructions to input video files for lipreading.
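Input videos are typically turned into a fixed-size tensor of mouth-region frames before being fed to the model. The sketch below assumes a fixed crop box and simple standardization; the box coordinates and frame shape are hypothetical, not LipChat's exact preprocessing.

```python
import numpy as np

def preprocess(frames, box=(30, 76, 20, 120)):
    """Crop each frame to an assumed mouth region and standardize pixels.

    frames: array of shape (num_frames, height, width, channels).
    box: (top, bottom, left, right) crop coordinates -- illustrative values.
    Returns a float32 tensor with zero mean and unit variance.
    """
    top, bottom, left, right = box
    cropped = frames[:, top:bottom, left:right, :].astype(np.float32)
    return (cropped - cropped.mean()) / (cropped.std() + 1e-8)

# 75 random grayscale frames standing in for a decoded video clip.
video = np.random.randint(0, 256, size=(75, 120, 160, 1))
clip = preprocess(video)
print(clip.shape)  # -> (75, 46, 100, 1)
```

A real pipeline would locate the mouth with a face/landmark detector rather than a fixed crop, but the output shape and normalization step are the same idea.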
Contributions are welcome! Please fork the repository and submit a pull request with your changes.
This project is licensed under the MIT License - see the LICENSE file for details.
- Wand et al., 2016; Chung & Zisserman, 2016a for pioneering work in end-to-end lipreading
- Gergen et al., 2016 for their state-of-the-art word-level accuracy
- Easton & Basala, 1982 for studies on human lipreading performance