Using Tesseract OCR
- Converts an image to text from a pasted image link
- Converts the most recent screenshot to text
- Create a virtual environment using Python >= 3.8
py -3.8 -m venv env
- Activate the env
env\Scripts\activate.bat
- Installing the Python libraries
pip install -r requirements.txt
- Download unofficial Tesseract for Windows: tesseract-ocr-w64-setup-v5.1.0.20220510.exe (64 bit)
- Add the Tesseract installation folder to the PATH environment variable
- For convenience, you can create a desktop shortcut that runs the script
- For example:
- Target:
C:\Windows\System32\cmd.exe /K "D:\Project\Python\2022\Screenshot2Text\env\Scripts\activate.bat" && ocr.py
- Start in:
D:\Project\Python\2022\Screenshot2Text
- For more languages, download the files from https://github.com/tesseract-ocr/tessdata and put them into C:\Program Files\Tesseract-OCR\tessdata
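If the script cannot find Tesseract after the PATH step above, a quick check like the following may help. It is only a sketch: the fallback location is an assumed default install path, and `find_tesseract` is a hypothetical helper name, not something taken from ocr.py.

```python
import shutil
from pathlib import Path

# Assumed default install location on Windows; adjust it if Tesseract
# was installed somewhere else.
DEFAULT_CANDIDATES = [Path(r"C:\Program Files\Tesseract-OCR\tesseract.exe")]


def find_tesseract(candidates=DEFAULT_CANDIDATES):
    """Return the path to the tesseract executable, or None if not found."""
    # Prefer whatever is already reachable through PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return Path(on_path)
    # Otherwise fall back to the known install locations.
    for candidate in candidates:
        if Path(candidate).is_file():
            return Path(candidate)
    return None


if __name__ == "__main__":
    print(find_tesseract() or "tesseract not found; add its folder to PATH")
```

If the project uses pytesseract (an assumption), the returned path can be assigned to `pytesseract.pytesseract.tesseract_cmd` instead of editing PATH.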
- Install Tesseract
sudo add-apt-repository ppa:alex-p/tesseract-ocr-devel
sudo apt update
sudo apt install -y tesseract-ocr
tesseract --version
- Create Python virtual environment and activate it
python3.10 -m venv env
. env/bin/activate
- Install dependencies
pip install -r requirements.txt
- For more languages
- Method 1: Download at https://github.com/tesseract-ocr/tessdata, then copy the file to
/usr/share/tesseract-ocr/<version>/tessdata
- Method 2:
sudo apt-get install tesseract-ocr-[lang]
- For quick access, add a new keyboard shortcut with your own custom key that runs this command
gnome-terminal --window-with-profile=Mini -x bash -c 'cd /home/<username>/Desktop/personal/Screenshot2Text ; source <venv-name>/bin/activate ; python ocr.py ; deactivate'
- Optional: --window-with-profile=Mini (opens the window using the gnome-terminal profile named Mini)
- Reference: https://askubuntu.com/questions/1072688/what-is-the-difference-between-the-e-and-x-options-for-gnome-terminal
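After copying language files into the tessdata directory, it can be useful to confirm which languages are actually present. The sketch below simply lists the `.traineddata` files in a folder passed on the command line; `installed_languages` is a hypothetical helper, and the directory must be supplied by you since it depends on the installed version.

```python
import sys
from pathlib import Path


def installed_languages(tessdata_dir):
    """Return sorted language codes that have a .traineddata file in tessdata_dir."""
    folder = Path(tessdata_dir)
    if not folder.is_dir():
        return []
    # "chi_sim.traineddata" -> "chi_sim"
    return sorted(p.stem for p in folder.glob("*.traineddata"))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the tessdata directory as the first argument.
    print(installed_languages(sys.argv[1]))
```

Running `tesseract --list-langs` gives the same information from Tesseract itself; the script is only a convenience for checking a specific folder.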
- Install Tesseract
brew install tesseract
tesseract --list-langs
- Create Python virtual environment and activate it
python3.12 -m venv env
. env/bin/activate
- Install dependencies
pip3 install -r requirements.txt
- To install all languages
brew install tesseract-lang
OR copy the required files from this folder to /opt/homebrew/share/tessdata/
- Add an alias for running the script from the terminal, since macOS has no keyboard shortcut like the Linux one that opens a terminal and runs the script
# Add this line into ~/.zshrc
alias ocr='cd ~/PATH/TO/Screenshot2Text ; env/bin/python3 ocr.py'
- Run the script
- Press Enter to convert the most recent screenshot, OR paste the image file path
- The output will be copied to your clipboard directly
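The input step above can be sketched roughly as follows. This is not the code from ocr.py: it assumes screenshots are saved as image files in a known folder, and `newest_image`, `resolve_input`, and `IMAGE_SUFFIXES` are hypothetical names used for illustration.

```python
from pathlib import Path

# Assumed set of screenshot file extensions.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def newest_image(folder):
    """Return the most recently modified image in folder, or None if there is none."""
    images = [
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    ]
    return max(images, key=lambda p: p.stat().st_mtime, default=None)


def resolve_input(user_input, screenshot_dir):
    """Empty input selects the latest screenshot; anything else is treated as a path."""
    # Pasted paths are often wrapped in quotes, so strip them.
    text = user_input.strip().strip('"').strip("'")
    if not text:
        return newest_image(screenshot_dir)
    return Path(text)
```

A caller would pass the result of `input()` together with the screenshot folder, then hand the resolved path to the OCR step.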
- URL images
- Google drive link
- Compatible with Linux
- Compatible with macOS
- A simple PyQt UI
- Google Chrome and Firefox extension for extracting the text
- Preprocess dark images by inverting them into bright images for better Tesseract extraction.
- Multimodal image description generation
- LLM for answering simple questions in PyQt
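The dark-image preprocessing item can be illustrated with a small pure-Python sketch. It works on a flat list of 8-bit grayscale values and inverts them when the average brightness is low; the threshold of 128 and the function names are assumptions, not the project's actual implementation.

```python
def mean_brightness(pixels):
    """Average of 8-bit grayscale values (0 = black, 255 = white)."""
    return sum(pixels) / len(pixels)


def invert_if_dark(pixels, threshold=128):
    """Invert the image when it is mostly dark, otherwise return a copy unchanged."""
    if not pixels:
        return []
    if mean_brightness(pixels) < threshold:
        # Dark background with light text: flip to dark text on light background.
        return [255 - p for p in pixels]
    return list(pixels)
```

With Pillow, the same idea can be applied to a real image by converting it to grayscale and calling `ImageOps.invert` when the mean pixel value falls below the threshold.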