

Screenshot2Text

Using Tesseract OCR

Main Functionality

  • Convert an image into text from a pasted image link
  • Convert the most recent screenshot into text

Windows Setup

  1. Create a virtual environment using Python >= 3.8: py -3.8 -m venv env
  2. Activate the environment: env\Scripts\activate.bat
  3. Install the Python libraries: pip install -r requirements.txt
  4. Download the unofficial Tesseract installer for Windows: tesseract-ocr-w64-setup-v5.1.0.20220510.exe (64 bit)
  5. Add the path of Tesseract
  6. For simplicity, you can create a desktop shortcut that runs this script
    1. For example:
    • Target: C:\Windows\System32\cmd.exe /K "D:\Project\Python\2022\Screenshot2Text\env\Scripts\activate.bat" && ocr.py
    • Start in: D:\Project\Python\2022\Screenshot2Text
  7. For more languages, download the language files from https://github.com/tesseract-ocr/tessdata and put them into C:\Program Files\Tesseract-OCR\tessdata
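
Step 5 does not say where the path is added. Assuming it means the PATH environment variable, the following minimal sketch checks from Python whether Tesseract can be found. The helper name find_tesseract is hypothetical (it is not part of ocr.py), and the fallback location is an assumption based on the C:\Program Files\Tesseract-OCR folder mentioned in step 7.

```python
import os
import shutil
from typing import Optional

# Assumed install location, derived from the tessdata folder in step 7.
# Adjust it if you picked a different folder during setup.
DEFAULT_WINDOWS_PATH = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def find_tesseract(fallback: str = DEFAULT_WINDOWS_PATH) -> Optional[str]:
    """Return the path to the tesseract executable, or None if not found."""
    # shutil.which searches the directories listed in PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise accept the fallback location, but only if the file exists.
    if os.path.isfile(fallback):
        return fallback
    return None


if __name__ == "__main__":
    location = find_tesseract()
    print(location or "tesseract not found; check step 5")
```

If the function returns None after step 5, opening a new terminal window usually helps, because a window that was already open keeps the old PATH value.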

Linux Setup

  1. Install Tesseract
sudo apt update
sudo add-apt-repository ppa:alex-p/tesseract-ocr-devel
sudo apt install -y tesseract-ocr
sudo apt update
tesseract --version
  2. Create a Python virtual environment and activate it
python3.10 -m venv env
. env/bin/activate
  3. Install dependencies
pip install -r requirements.txt
  4. For more languages

  5. For quick access, add a new keyboard shortcut with your own custom key
gnome-terminal --window-with-profile=Mini -x bash -c 'cd /home/<username>/Desktop/personal/Screenshot2Text ; source <venv-name>/bin/activate ; python ocr.py ; deactivate'

Mac OS Setup

  1. Install Tesseract
brew install tesseract
tesseract --list-langs
  2. Create a Python virtual environment and activate it
python3.12 -m venv env
. env/bin/activate
  3. Install dependencies
pip3 install -r requirements.txt
  4. To install all languages
brew install tesseract --all-languages

OR copy the required files from this folder to /opt/homebrew/share/tessdata/
  5. Add an alias for running the script from the terminal, since macOS has no keyboard shortcut like the one on Linux that opens a terminal and runs the script

# Add this line into ~/.zshrc
alias ocr='cd ~/PATH/TO/Screenshot2Text ; env/bin/python3 ocr.py'
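
After installing extra language data, the tesseract --list-langs command from step 1 shows what Tesseract can see. The sketch below reads that list from Python. The function names are hypothetical, and the parser assumes the usual output shape: one header line starting with "List of available languages", followed by one language code per line.

```python
import subprocess
from typing import List


def parse_list_langs(output: str) -> List[str]:
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # Skip the header line, for example:
    # 'List of available languages in "/opt/homebrew/share/tessdata/" (2):'
    return [line for line in lines
            if not line.startswith("List of available languages")]


def installed_languages() -> List[str]:
    """Run tesseract and return the language codes it reports."""
    result = subprocess.run(["tesseract", "--list-langs"],
                            capture_output=True, text=True, check=True)
    # Assumption: if stdout is empty, the list was written to stderr instead.
    return parse_list_langs(result.stdout or result.stderr)
```

For example, "chi_sim" in installed_languages() would tell you whether the Simplified Chinese data file was picked up.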

How to use?

  1. Run the script
  2. Press Enter to convert the most recent screenshot, OR paste the image file path
  3. The output is copied directly to your clipboard
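
The choice in step 2 can be sketched as follows. This is not the code from ocr.py: the function names, the list of image suffixes, and the idea that screenshots live in a single folder are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional

# Assumed set of image file types; extend as needed.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def latest_image(folder: Path) -> Optional[Path]:
    """Return the most recently modified image file in folder, if any."""
    images = [p for p in folder.iterdir()
              if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES]
    # max() keyed on modification time picks the newest file.
    return max(images, key=lambda p: p.stat().st_mtime, default=None)


def resolve_image(user_input: str, screenshot_dir: Path) -> Optional[Path]:
    """Empty input selects the latest screenshot; anything else is a file path."""
    # Strip whitespace and the quotes some file managers add when copying a path.
    text = user_input.strip().strip('"')
    if not text:
        return latest_image(screenshot_dir)
    path = Path(text).expanduser()
    return path if path.is_file() else None
```

A call such as resolve_image(input("Image path: "), Path.home() / "Pictures" / "Screenshots") would then return the file to pass to Tesseract, or None if nothing usable was found. The folder in that call is only an example.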

Future work

  • URL images
  • Google Drive links
  • Compatibility with Linux
  • Compatibility with Mac OS
  • A simple PyQt UI
  • Google Chrome and Firefox extensions for extracting text
  • Preprocess images by inverting dark images into bright ones for better Tesseract extraction
  • Multimodal image description generation
  • LLM for answering simple questions in PyQt
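
The preprocessing item above could look roughly like this. It is only a sketch of the idea, not planned project code: it works on plain lists of 0-255 grayscale values so that it runs without any image library, and the threshold of 127.5 (the midpoint of the range) is an assumption.

```python
from typing import List

Gray = List[List[int]]  # rows of grayscale pixel values in the range 0-255


def mean_brightness(pixels: Gray) -> float:
    """Average pixel value of the image, or 0.0 for an empty image."""
    values = [v for row in pixels for v in row]
    return sum(values) / len(values) if values else 0.0


def invert_if_dark(pixels: Gray, threshold: float = 127.5) -> Gray:
    """Return an inverted copy when the image is mostly dark, else a plain copy."""
    if mean_brightness(pixels) >= threshold:
        return [row[:] for row in pixels]
    # Inversion maps 0 to 255 and 255 to 0, so light text on a dark
    # background becomes dark text on a light background.
    return [[255 - v for v in row] for row in pixels]
```

In a real pipeline the pixel values would come from whatever image library the script already uses, and the result would be handed to Tesseract in place of the original image.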

screenshot2text3's People

Contributors

jwtanx
