
pyugt's Introduction

pyugt - Python Universal Game Translator


pyugt is a universal game translator written in Python: it takes screenshots of a region you select on your screen, uses OCR (via Tesseract v5) to extract the characters, then feeds them to a machine translator (multiple backends are included) and shows you the translated text.

Since it works directly on images, there is no need to hack the game or otherwise access the text. It is also cross-platform (Windows and Linux are supported, with experimental support for MacOSX).
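This capture → OCR → translate loop can be sketched in a few lines (a minimal illustration of the approach, not pyugt's actual implementation; it assumes the third-party packages `mss`, `pytesseract` and `translators` are installed):

```python
# Sketch of the screenshot -> OCR -> machine-translation loop.
# Illustration only, not pyugt's code; assumes mss, pytesseract and
# the translators package are installed and Tesseract is on the PATH.

def region_to_bbox(x1, y1, x2, y2):
    """Convert two screen corners into the {left, top, width, height}
    dictionary format that mss.grab() expects (pure helper)."""
    return {"left": min(x1, x2), "top": min(y1, y2),
            "width": abs(x2 - x1), "height": abs(y2 - y1)}

def capture_ocr_translate(bbox, lang_ocr="jpn", lang_src="ja", lang_tgt="en"):
    # Imports are local so the pure helper above stays importable everywhere.
    import mss                 # cross-platform screenshots
    import pytesseract         # wrapper around the Tesseract binary
    import translators         # free online translation APIs
    from PIL import Image

    with mss.mss() as sct:
        shot = sct.grab(bbox)
        img = Image.frombytes("RGB", shot.size, shot.rgb)
    text = pytesseract.image_to_string(img, lang=lang_ocr)
    return translators.translate_text(text, translator="google",
                                      from_language=lang_src,
                                      to_language=lang_tgt)
```
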

Here is a demo:

demo

Several machine translation backends are available: from free online APIs such as Google Translate, to DeepL with a free or paid subscription, and also an offline translator that runs directly on your computer without needing internet access thanks to Argos Translate and OpenNMT.

Of course, since the translation is done by a machine, don't expect a very nice translation, but for games where no translation is available, it can be sufficient to understand the gist and be able to play.

The software can also be useful to human translators, as it is possible to enable logging of OCR'ed text in the config file, so that all captured text will be saved in a log file that can later be used as the source for a manual translation.

The software is also not limited to games, but can be applied to anything that displays text on screen.

This software was inspired by the amazing work of Seth Robinson on UGT (Universal Game Translator).

How to install & update

  1. First, install Tesseract v5 (an open-source OCR engine); installers are provided by UB Mannheim. Make sure to install the additional languages you want to translate from (e.g., Japanese; there is support for both horizontal and vertical Kanji). Alternatively, on most platforms, Tesseract can be installed with the default package manager, e.g. on Debian/Ubuntu: apt-get install tesseract-ocr. On Windows, Chocolatey can be used: choco install --pre tesseract.

  2. Then install pyugt:

    • On Windows, there is a prepackaged binary you can download here (look for pyugt_vx.x.x_bin_win64.zip for 64-bit or win32 for 32-bit).

    • On other platforms, or if you want to run from source code, you need to install a Python interpreter. Anaconda is a good one, and Miniconda3 is a smaller alternative that works too.

      Then, install this software:

      pip install --upgrade pyugt

      Or, for developers who want to run it locally: after downloading the archive from Github, unzip it anywhere, cd into the folder and type:

      pip install py3make
      py3make installdev
      

      Note the software was tested on Windows 10 x64 with Python 3.10 (Anaconda) and 3.11. It was tested on Linux by other users but is not regularly tested.

Language packs for Tesseract are downloadable directly from the installer. Language packs for the offline machine translator Argos are downloaded on the fly when required, but can also be downloaded beforehand from this index, which also provides IPFS links that are future-proof in case the on-the-fly downloads fail.
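For the offline backend, a language pair can be pre-installed from Python with Argos Translate's package API (a sketch; the function names follow the argostranslate documentation and should be checked against the installed version):

```python
# Sketch: pre-installing an Argos Translate language pack from Python,
# so the offline translator works without any on-the-fly download.
# API names are taken from the argostranslate documentation; treat them
# as an assumption to verify against your installed version.

def pick_package(packages, from_code, to_code):
    """Pure helper: find the package matching a language pair, or None."""
    for pkg in packages:
        if pkg.from_code == from_code and pkg.to_code == to_code:
            return pkg
    return None

def install_pair(from_code="ja", to_code="en"):
    import argostranslate.package
    argostranslate.package.update_package_index()       # fetch the index
    available = argostranslate.package.get_available_packages()
    pkg = pick_package(available, from_code, to_code)
    if pkg is None:
        raise RuntimeError(f"No Argos package for {from_code}->{to_code}")
    argostranslate.package.install_from_path(pkg.download())
```
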

How to use

  • First, you need to configure the config file config.ini. A sample config file is provided with the software that should work fine on Windows, but on other platforms or in some cases you may need to edit it, particularly to set up the path to the Tesseract binaries. The config file also allows you to change the hotkeys and the monitor to screenshot from, and a few other things such as the source and target languages (by default, the source is Japanese and the target is English).

  • Then, you can launch the script from a terminal/console:

pyugt

or:

python -m pyugt

  • Then, use the hotkey to select a region to capture from (default hotkey: CTRL+SHIFT+F3). The selected region does not need to be very precise, but it needs to contain the text to translate.

  • Finally, use the hotkey to translate what is shown in the region (default: CTRL+F3). This will display a window with the original text and the translated text. Repeat as many times as you need, you don't need to reselect the region to translate again.

  • Tip: if the software has difficulty recognizing the characters (you get gibberish and non-letter characters instead of words), first try to redefine the region with CTRL+F2 and make sure the region includes all the text with some margin but not too much of the background (the tighter around the text, the less the OCR will be confused by the background; this can help a lot!). You can use the combined region-selection-and-translation hotkey to do both in a streamlined fashion (default: CTRL+F2).

  • Tip2: Try to make the game screen bigger. The bigger the characters, the easier for the OCR to work.

  • Tip3: You can specify the path to a config file by using the -c or --config argument: pyugt -c <path_to_config_file>

  • Tip4: In the translation box, it's possible to manually edit the OCR'ed text and force a new translation by clicking on the "Translate again" button. This can be useful when the OCR has wrongly detected non-letters characters.

  • Tip5: If you use blue light filtering software, disable it before using the OCR; this will improve the contrast and hence the accuracy.
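The hotkey wiring described above can be sketched with the keyboard module (which pyugt uses for global hotkeys); a minimal illustration where the callbacks are placeholders, not pyugt's actual handlers:

```python
# Sketch: wiring config-defined hotkeys to callbacks with the keyboard
# module, as pyugt does. The callback bodies are placeholders.

DEFAULT_HOTKEYS = {
    "hotkey_set_region_capture": "ctrl+shift+F3",
    "hotkey_translate_region_capture": "ctrl+F3",
    "hotkey_set_and_translate_region_capture": "ctrl+F2",
}

def resolve_hotkey(config, name):
    """Pure helper: read a hotkey from a config dict, falling back to defaults."""
    return config.get(name, DEFAULT_HOTKEYS[name]).lower()

def run(config):
    import keyboard  # global hotkeys; note this needs root on Linux
    keyboard.add_hotkey(resolve_hotkey(config, "hotkey_set_region_capture"),
                        lambda: print("select region"))
    keyboard.add_hotkey(resolve_hotkey(config, "hotkey_translate_region_capture"),
                        lambda: print("translate region"))
    keyboard.wait()  # block until the process is interrupted
```
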

IMPORTANT NOTE: The software is still in alpha stage (and may forever stay in this state). It IS working, but sometimes the hotkeys glitch and stop working. If this happens, simply focus the Python console and hit CTRL+C to force-quit the app, then launch it again. The selected region is saved in the config file, so you don't have to redo this step every time.

Options

Here is a sample configuration file, with comments for the various additional options (such as logs for the OCR'ed text and translated text):

[USER]
# User parameters to configure the pyUGT program.
# Note that parameters can be modified on-the-fly while the app is running, and changes will be reflected in realtime. For example, translators can be changed on-the-fly.

# Path to Tesseract v5 binary. Easily install it from UB Mannheim installers: https://github.com/UB-Mannheim/tesseract/wiki
path_tesseract_bin = C:\Program Files\Tesseract-OCR\tesseract.exe
# Source language to translate from, for OCR. Both the Optical Character Recognition engine and the translator will search specifically for strings in this language, which reduces the amount of false positives (eg, translating strings in other languages that are more prominent or bigger on-screen). Language codes can be found inside Tesseract's tessdata folder (depends on what languages you chose in the installer).
lang_source_ocr = jpn
# Source language to translate from.
lang_source_trans = ja
# Target language to translate to. Must be a language code for the target machine translator: either Google Translate language code (NOT a Tesseract code! See: https://readthedocs.org/projects/py-googletrans/downloads/pdf/latest/ ) or DeepL code (eg, en for Google Translator, Argos and most others, or EN-US for DeepL).
lang_target = en
# Machine translator library to use. Can be online_free to use free online APIs but which can be throttled (eg, Google Translate, DeepL free, Baidu, etc) ; deepl to use DeepL API with your own authkey (not throttled but limited number of translations in free plan, then need to pay for more, but it's best in class japanese->english machine translator) ; offline_argos for offline translation using Argos based on OpenNMT, which produces less accurate translations but is free, unlimited and does not require an internet connection.
translator_lib = online_free
# If online_free is the selected translator_lib, we can specify here the translator service to use. For a list of available services, see: https://github.com/UlionTse/translators#more-about-translators
translator_lib_online_free_service = google
# If translator_lib is set to deepl, the API authorization key must be set here
translator_lib_deepl_authkey = fa14ef6c-d...
# Hotkey to set the region on screen to capture future screenshots from. The region does not need to be precise, but must contain the region where text is likely to be found.
hotkey_set_region_capture = ctrl+shift+F3
# Hotkey to translate from the selected region
hotkey_translate_region_capture = ctrl+F3
# Hotkey to set a region AND translate it directly on mouse click release. This is useful for games where the contrast between the text and background is bad (eg, transparent dialog box), so reselecting a tight region for each dialogue may yield better results; this is a faster way to do that with one shortcut instead of 2.
hotkey_set_and_translate_region_capture = ctrl+F2
# Hotkey to preview in a window the postprocessed screenshot that is fed to the OCR; this helps with tweaking the parameters here and seeing how they improve the text contrast
hotkey_show_ocr_preview = ctrl+p
# Which monitor should the screen region capture be taken from? If you have only one screen, leave this at 0 (first monitor)
monitor = 0
# Save all OCR'ed text into a log file? Set a path or file name different from None to activate (example: log_ocr = log_ocr.txt). This can be very useful for human translators to gather game text data.
log_ocr = None
# Save all translated text into a log file? Set a path or file name different from None to activate (example: log_translation = log_trans.txt).
log_translation = None
# Only capture text by OCR without translating (set this value to True; to also translate, set it to False). This is useful if you only want to use pyugt as an OCR tool, or don't want to send your OCR'ed text to Google.
ocr_only = False
# Remove line returns automatically, so that we consider all sentences to be one (this can help the translator make more sense because it will have more context to work with).
remove_line_returns = True
# Preprocess screenshots to improve OCR?
preprocessing = True
# Preprocessing filters to apply (set to None to disable, else input a list of strings, each being a filter from PIL.ImageFilter). Example: ['SMOOTH', 'SHARPEN', 'UnsharpMask']
preprocessing_filters = ['SHARPEN']
# Preprocessing binarization of image? Set to None to disable, else set a value between 0-255 (255 being white)
preprocessing_binarize_threshold = 180
# Preprocessing invert image (if text is white, it's better to invert to get black text, Tesseract OCR will be more accurate). Set to False to disable.
preprocessing_invert = True
# Show debug information
debug = False
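The preprocessing options above (filters, binarization threshold, inversion) roughly correspond to a Pillow pipeline like this (a sketch under the assumption that the filter names map to PIL.ImageFilter attributes; not pyugt's exact code):

```python
# Sketch of the screenshot preprocessing described by the config options
# above: optional PIL ImageFilter passes, binarization at a 0-255
# threshold, and inversion so white-on-dark text ends up black on white.
from PIL import Image, ImageFilter, ImageOps

def preprocess(img, filters=("SHARPEN",), binarize_threshold=180, invert=True):
    img = img.convert("L")  # grayscale first
    for name in filters or ():
        # assumption: each filter name is an attribute of PIL.ImageFilter
        img = img.filter(getattr(ImageFilter, name))
    if binarize_threshold is not None:
        # point() maps every pixel: white above the threshold, black below
        img = img.point(lambda p: 255 if p > binarize_threshold else 0)
    if invert:
        img = ImageOps.invert(img)
    return img
```
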

Shortcomings & advantages

Compared to UGT, the translated text is not overlaid over a screenshot of the original text. This could maybe be done as Tesseract provides some functions (image_to_boxes() or image_to_osd() or image_to_data()). PRs are very welcome if anyone would like to give it a try!

UGT can also directly select and translate the active window. We dropped this feature because it's platform dependent, so a region selection seemed like the most reliable and cross-platform way to implement screen capture, even if it adds one additional step.

On the other hand, there are several advantages to our approach:

  • it's cross-platform (Windows & Linux currently; MacOSX should also be supported experimentally, but global hotkeys may sometimes fail because the keyboard module only has experimental support for this platform),

  • we use Tesseract, so OCR is done locally (instead of via the Google Cloud Vision API) and we only send text, which has a much smaller footprint and is thus less expensive (generally free). A big advantage is that it's possible to freely resize the game window to a bigger size, with bigger characters improving OCR recognition; also, no downscaling/quality reduction is necessary since there is no image transfer,

  • Regions can be selected, so that unnecessary screen objects that may confuse the OCR can be eliminated with a carefully selected region (update: UGT now also implements this feature 🎉),

  • We enforce the source and target languages, so that both the OCR and translator know what to expect, instead of trying to autodetect, which may fail particularly when there are names that may be written in another language or character form (eg, not in Kanji).

Similar projects

  • Universal Game Translator (Windows, opensource) - the inspiration for this project.

  • pyUGT fork by ByJacob (Windows, Linux, MacOS, opensource) - fork of this project implementing the Marian Machine Translation offline translator instead of Argos-Translate. MMT is sometimes more accurate than Argos-Translate, but is slower and consumes more space. Both offline translators however remain less accurate than the online DeepL translator.

  • Capture2Text (Windows, opensource) - OCR on-screen text, but no translation.

  • OwlOCR (MacOSX, freeware) - Similar to Capture2Text, OCR on-screen text, no translation.

  • Sugoi Japanese Translator combined with [Visual Novel OCR](https://visual-novel-ocr.sourceforge.io/) by the same author: provides similar features to pyUGT but is closed-source, so it cannot be improved upon. Also, it is unclear which OCR backend is used.

  • Translator++, a free but closed-source app to translate non-emulated games with access to the text; it can leverage automatic machine translation.

  • ocrTranslator, a Python 3 desktop application that can be used like pyugt to OCR the same text zone (it supports local OCR engines such as Tesseract and RapidOCR!) and translate it (only online translation services are supported, not local ones, but these include online GPT LLM services). Includes a "game mode" to show the translated text as an overlay (similar to Seth Robinson's original UGT). Tested only on Windows 10; support for other platforms is unclear (although it uses tkinter, so it should be easy to make cross-platform).

License

This software is made by Stephen Larroque and is published under the MIT Public License.

pyugt's People

Contributors

lrq3000


pyugt's Issues

Fix random crashes and memory leak due to Tkinter being called in a thread

Currently the app's architecture relies on a thread calling tkinter.Toplevel() to show the TranslationBox. But this sometimes raises issues, in particular when first using CTRL+F3 at the program's launch and then CTRL+F2. Indeed, Tkinter should never be used from multiple threads, so the app needs a rewrite.

I tried to fix it by various means but failed. My most advanced attempt used mttkinter for Python 3 (note there is another Python 3 port here I did not try), but screen capture (CTRL+F2) works only once, then stops working. See attached file (based on v0.4.9).
pyugt_mttkinter.txt

At this point, I don't know what to do, if anyone can give me a hand, it would be tremendously useful! So for now, please expect random crashes from time to time, with the only workable solution in the meantime being to close and relaunch the app.

Here are some additional resources:

/UPDATE: this also leads to a memory leak of about 1-2MB per translation. That's not a lot, but over the course of a gaming session it will certainly add up! This is a critical bug.
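For the record, the usual fix for this class of bug is to keep all Tkinter calls in the main thread and let worker threads communicate through a queue that the main loop polls with root.after(). A sketch of that pattern (not pyugt's current code; the 100 ms poll interval is arbitrary):

```python
# Sketch of the usual fix: Tkinter stays in the main thread; worker
# threads (e.g. hotkey callbacks) only put messages on a queue, and the
# main loop polls it with root.after(). Not pyugt's current code.
import queue

requests = queue.Queue()

def request_translation(text):
    """Safe to call from any thread: just enqueue the work."""
    requests.put(text)

def drain(q):
    """Pure helper: pop everything currently queued (used by the poller)."""
    items = []
    while not q.empty():
        items.append(q.get_nowait())
    return items

def main():
    import tkinter as tk
    root = tk.Tk()

    def poll():
        for text in drain(requests):
            tk.Toplevel(root)  # build the TranslationBox here, in the main thread
        root.after(100, poll)  # re-check the queue every 100 ms

    poll()
    root.mainloop()
```
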

pyugt maybe doesn't support python3

On Linux, when I try to run pyugt as root (which seems to be the only way to run it), I get this message: "ImportError: No module named configparser".

On some blogs I've read that this problem arises because ConfigParser was renamed to configparser (or config-parser) in Python 3.

Right now I cannot get rid of this problem, but I'm trying to solve it.
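For reference, the rename is real: Python 2's ConfigParser module became configparser in Python 3. A common compatibility shim:

```python
# The module was renamed between Python 2 and 3: ConfigParser -> configparser.
# This shim imports whichever exists, so the same code runs on both.
try:
    import configparser  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2 fallback

# Minimal usage, mirroring how pyugt reads its config.ini:
config = configparser.ConfigParser()
config.read_string("[USER]\nmonitor = 0\n")
```
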

Unable to start completely offline

I tried starting the program completely offline, but with docker+LibreTranslate started locally.
It doesn't open the interface. I was able to take a screenshot
offline
and you can see that the program is blocked because it can't find the website used to determine the geographic location.
Can this be avoided if you impose Argos in the config file?

error

C:\Users\User>pyugt
pyugt - Python Universal Game Translator v0.5.3 - started
Languages available for OCR: ['afr', 'amh', 'ara', 'asm', 'aze', 'aze_cyrl', 'bel', 'ben', 'bod', 'bos',
'bre', 'bul', 'cat', 'ceb', 'ces', 'chi_sim', 'chi_sim_vert', 'chi_tra', 'chi_tra_vert', 'chr', 'cos',
'cym', 'dan', 'deu', 'div', 'dzo', 'ell', 'eng', 'enm', 'epo', 'equ', 'est', 'eus', 'fao', 'fas', 'fil',
'fin', 'fra', 'frk', 'frm', 'fry', 'gla', 'gle', 'glg', 'grc', 'guj', 'hat', 'heb', 'hin', 'hrv', 'hun',
'hye', 'iku', 'ind', 'isl', 'ita', 'ita_old', 'jav', 'jpn', 'jpn_vert', 'kan', 'kat', 'kat_old', 'kaz',
'khm', 'kir', 'kmr', 'kor', 'lao', 'lat', 'lav', 'lit', 'ltz', 'mal', 'mar', 'mkd', 'mlt', 'mon', 'mri',
'msa', 'mya', 'nep', 'nld', 'nor', 'oci', 'ori', 'osd', 'pan', 'pol', 'por', 'pus', 'que', 'ron', 'rus',
'san', 'sin', 'slk', 'slv', 'snd', 'spa', 'spa_old', 'sqi', 'srp', 'srp_latn', 'sun', 'swa', 'swe', 'syr',
'tam', 'tat', 'tel', 'tgk', 'tha', 'tir', 'ton', 'tur', 'uig', 'ukr', 'urd', 'uzb', 'uzb_cyrl', 'vie',
'yid', 'yor']
Hit ctrl+shift+F3 to set the region to capture.
Hit ctrl+F3 to translate the region (make sure to close the translation window before requesting ano
ther one).
Hit ctrl+F2 to set AND translate a region.
Press CTRL+C or close this window to quit.
Exception in thread Thread-4:
Traceback (most recent call last):
  File "c:\users\user\appdata\local\programs\python\python39\lib\threading.py", line 954, in _bootstrap_inner
    self.run()
  File "c:\users\user\appdata\local\programs\python\python39\lib\threading.py", line 892, in run
    self._target(*self._args, **self._kwargs)
  File "c:\users\user\appdata\local\programs\python\python39\lib\site-packages\keyboard\_generic.py", line 58, in process
    if self.pre_process_event(event):
  File "c:\users\user\appdata\local\programs\python\python39\lib\site-packages\keyboard\__init__.py", line 218, in pre_process_event
    callback(event)
  File "c:\users\user\appdata\local\programs\python\python39\lib\site-packages\keyboard\__init__.py", line 649, in <lambda>
    handler = lambda e: (event_type == KEY_DOWN and e.event_type == KEY_UP and e.scan_code in _logically_pressed_keys) or (event_type == e.event_type and callback())
  File "c:\users\user\appdata\local\programs\python\python39\lib\site-packages\keyboard\__init__.py", line 637, in <lambda>
    callback = lambda callback=callback: callback(*args)
  File "c:\users\user\appdata\local\programs\python\python39\lib\site-packages\pyugt\pyugt.py", line 433, in translateRegion
    transtext = gtranslate(ocrtext, langsource_trans, langtarget)
  File "c:\users\user\appdata\local\programs\python\python39\lib\site-packages\pyugt\pyugt.py", line 350, in gtranslate
    transobj = translator.translate(ocrtext, src=langsource_trans, dest=langtarget)
  File "c:\users\user\appdata\local\programs\python\python39\lib\site-packages\googletrans\client.py", line 182, in translate
    data = self._translate(text, dest, src, kwargs)
  File "c:\users\user\appdata\local\programs\python\python39\lib\site-packages\googletrans\client.py", line 78, in _translate
    token = self.token_acquirer.do(text)
  File "c:\users\user\appdata\local\programs\python\python39\lib\site-packages\googletrans\gtoken.py", line 194, in do
    self._update()
  File "c:\users\user\appdata\local\programs\python\python39\lib\site-packages\googletrans\gtoken.py", line 62, in _update
    code = self.RE_TKK.search(r.text).group(1).replace('var ', '')
AttributeError: 'NoneType' object has no attribute 'group'

Error trying to translate text

After selecting a region and trying to translate it, I get this error:

pyugt - Python Universal Game Translator v0.5.2 - started
Languages available for OCR: ['eng', 'jpn', 'jpn_vert', 'osd']
Hit ctrl+shift to set the region to capture.
Hit shift+x to translate the region (make sure to close the translation window before requesting another one).
Hit ctrl+x to set AND translate a region.
Press CTRL+C or close this window to quit.
Exception in thread Thread-4:
Traceback (most recent call last):
  File "threading.py", line 926, in _bootstrap_inner
  File "threading.py", line 870, in run
  File "site-packages\keyboard\_generic.py", line 58, in process
  File "site-packages\keyboard\__init__.py", line 218, in pre_process_event
  File "site-packages\keyboard\__init__.py", line 649, in <lambda>
  File "site-packages\keyboard\__init__.py", line 637, in <lambda>
  File "pyugt\pyugt.py", line 444, in selectAndTranslateRegion
  File "pyugt\pyugt.py", line 420, in translateRegion
  File "pyugt\pyugt.py", line 337, in gtranslate
  File "site-packages\googletrans\client.py", line 172, in translate
  File "site-packages\googletrans\client.py", line 75, in _translate
  File "site-packages\googletrans\gtoken.py", line 200, in do
  File "site-packages\googletrans\gtoken.py", line 65, in _update
AttributeError: 'NoneType' object has no attribute 'group'

I saw on a similar thread that running this command in the Anaconda console fixed the issue: pip install googletrans==4.0.0rc1
But after installing it I still get the same error.

AttributeError: '_thread._local' object has no attribute 'display'

When I press ctrl+shift+F3 I get this traceback.
kubuntu 22.04
Python 3.10.12

~/.local/bin/pyugt -c ~/.local/lib/python3.10/site-packages/pyugt/config.ini

pyugt - Python Universal Game Translator v1.0.6 - started
Languages available for OCR: []
Hit ctrl+shift+F3 to set the region to capture.
Hit ctrl+F3 to translate the region (make sure to close the translation window before requesting another one).
Hit ctrl+F2 to set AND translate a region.
Hit ctrl+p to show/hide OCR preview.
Press CTRL+C or close this window to quit.
selectRegion triggered
Exception in thread Thread-15 (process):
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/root/.local/lib/python3.10/site-packages/keyboard/_generic.py", line 58, in process
    if self.pre_process_event(event):
  File "/root/.local/lib/python3.10/site-packages/keyboard/__init__.py", line 218, in pre_process_event
    callback(event)
  File "/root/.local/lib/python3.10/site-packages/keyboard/__init__.py", line 649, in <lambda>
    handler = lambda e: (event_type == KEY_DOWN and e.event_type == KEY_UP and e.scan_code in _logically_pressed_keys) or (event_type == e.event_type and callback())
  File "/root/.local/lib/python3.10/site-packages/keyboard/__init__.py", line 637, in <lambda>
    callback = lambda callback=callback: callback(*args)
  File "/root/.local/lib/python3.10/site-packages/pyugt/pyugt.py", line 237, in selectRegion
    sct_img = sct.grab(sct.monitors[int(config['USER']['monitor'])])
  File "/root/.local/lib/python3.10/site-packages/mss/base.py", line 118, in monitors
    self._monitors_impl()
  File "/root/.local/lib/python3.10/site-packages/mss/linux.py", line 385, in _monitors_impl
    display = self._handles.display
AttributeError: '_thread._local' object has no attribute 'display'
^CTraceback (most recent call last):
  File "/root/.local/bin/pyugt", line 8, in <module>
    sys.exit(main())
  File "/root/.local/lib/python3.10/site-packages/pyugt/pyugt.py", line 691, in main
    time.sleep(1)

I tried putting in the config: monitor = 0 / monitor = :0 / monitor = #0

echo $DISPLAY
:0
tesseract  --version
tesseract 5.3.2
 leptonica-1.82.0
  libgif 5.1.9 : libjpeg 8d (libjpeg-turbo 2.1.1) : libpng 1.6.37 : libtiff 4.3.0 : zlib 1.2.11 : libwebp 1.2.2 : libopenjp2 2.4.0
 Found AVX512BW
 Found AVX512F
 Found AVX512VNNI
 Found AVX2
 Found AVX
 Found FMA
 Found SSE4.1
 Found OpenMP 201511
 Found libarchive 3.6.0 zlib/1.2.11 liblzma/5.2.5 bz2lib/1.0.8 liblz4/1.9.3 libzstd/1.4.8
 Found libcurl/7.81.0 OpenSSL/3.0.2 zlib/1.2.11 brotli/1.0.9 zstd/1.4.8 libidn2/2.3.2 libpsl/0.21.0 (+libidn2/2.3.2) libssh/0.9.6/openssl/zlib nghttp2/1.43.0 librtmp/2.3 OpenLDAP/2.5.16
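For context, mss keeps its X11 display handle in thread-local storage on Linux, so an mss.mss() instance created in one thread cannot be used from a keyboard-hotkey thread. A common workaround, sketched here with an injectable grab function for illustration, is to create the instance inside the thread that actually grabs:

```python
# On Linux, mss stores its X display handle in thread-local storage, so
# an mss.mss() instance created in the main thread breaks when used from
# a hotkey thread (as in the traceback above). Workaround sketch: create
# the instance inside the grabbing thread. grab_fn is a hypothetical
# injection point so the pattern can be demonstrated without a display.
import threading

def capture_in_worker(monitor_index, grab_fn=None):
    result = {}

    def worker():
        if grab_fn is not None:
            result["img"] = grab_fn(monitor_index)  # test/injection path
            return
        import mss
        with mss.mss() as sct:  # created in THIS thread, so thread-locals match
            result["img"] = sct.grab(sct.monitors[monitor_index])

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return result["img"]
```
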

information and suggestions

Hi, first of all congratulations on this program; it works well on Windows [Tesseract + Docker with LibreTranslate (Argos)].
I would like to point out that with the latest Tesseract version (5.4.0) you get errors, so I reinstalled the old version (5.1.0) and it works very well.
I would like to have a floating window above the text to be translated, so I can see the result above it. For example, CTRL+F6 takes the translation and puts it in a window of the size I set with CTRL+F2 (with scroll if the translated text is longer than the original).
I hope it can be added.
Thank you for your work!
