attranslate's Introduction

attranslate - Semi-automated Text Translator for Websites and Apps

attranslate is a tool for syncing translation-files, including JSON/YAML/XML and other formats. In contrast to paid services, any developer can integrate attranslate in a matter of minutes. attranslate will leave existing translations unchanged and only synchronize new translations.

Optionally, attranslate works with automated translation-services. For example, let's say that a translation-service achieves 80% correct translations. With attranslate, fixing the remaining 20% may be faster than translating everything by hand. Apart from that, attranslate supports purely manual translations or even file-format conversions without changing the language.

Features

Preserve Manual Translations

attranslate recognizes that machine translations are not perfect. Therefore, whenever you are unhappy with the produced text, attranslate allows you to simply overwrite text in your target-files. attranslate will never overwrite any manual corrections in subsequent runs.

Available Services

attranslate supports the following services; many of them are free of charge:

  • openai: Uses a model like ChatGPT; free up to a limit
  • google-translate: Needs a GCloud account; free up to a limit
  • azure: Needs a Microsoft account; costs money
  • sync-without-translate: Does not change the language. This can be useful for converting between file formats, or for maintaining region-specific differences.
  • manual: Translates text with manual typing

Usage Examples

Translating a single file is as simple as the following line:

attranslate --srcFile=json-simple/en.json --srcLng=English --srcFormat=nested-json --targetFile=json-simple/es.json --targetLng=Spanish --targetFormat=nested-json --service=openai

If you have multiple target-languages, then you will need multiple calls to attranslate. You can write something like the following script:

# This example translates an English JSON-file into Spanish and German.
BASE_DIR="json-advanced"
COMMON_ARGS=( "--srcLng=en" "--srcFormat=nested-json" "--targetFormat=nested-json" "--service=google-translate" "--serviceConfig=gcloud/gcloud_service_account.json" )

# install attranslate if it is not installed yet
attranslate --version || npm install --global attranslate

attranslate --srcFile=$BASE_DIR/en/fruits.json --targetFile=$BASE_DIR/es/fruits.json --targetLng=es "${COMMON_ARGS[@]}"
attranslate --srcFile=$BASE_DIR/en/fruits.json --targetFile=$BASE_DIR/de/fruits.json --targetLng=de "${COMMON_ARGS[@]}"

Similarly, you can use attranslate to convert between file-formats. See sample scripts for more examples.
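
With more than two target languages, the repeated calls in the script above can be folded into a loop. The following is a minimal sketch and not an official sample: the `TARGET_LANGS` list, the `run` helper and its `DRY_RUN` switch are assumptions introduced here for illustration, and only flags that already appear in the script above are used. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Sketch: one attranslate call per target language.
BASE_DIR="json-advanced"
TARGET_LANGS="es de"   # hypothetical list; add more language codes as needed

# Hypothetical helper: print the command when DRY_RUN=1 (default), run it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

translate_all() {
  for lng in $TARGET_LANGS; do
    run attranslate \
      --srcFile="$BASE_DIR/en/fruits.json" \
      --targetFile="$BASE_DIR/$lng/fruits.json" \
      --targetLng="$lng" \
      --srcLng=en --srcFormat=nested-json --targetFormat=nested-json \
      --service=google-translate \
      --serviceConfig=gcloud/gcloud_service_account.json
  done
}

translate_all
```

Keeping the dry run as the default makes it easy to review the generated commands before any translation service is invoked.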

Integration Guide

First, ensure that Node.js is installed on your machine. Once you have Node.js, you can install attranslate via:

npm install --global attranslate

Alternatively, if you are a JavaScript-developer, then you can install attranslate via:

npm install --save-dev attranslate

Next, you should write a project-specific script that invokes attranslate for your specific files. See sample scripts for guidance on how to translate your project-specific files.
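
A project-specific script can start by checking its prerequisites before it calls attranslate. The sketch below is only an illustration: `require_cmd` is a hypothetical helper name, and `command -v` is the standard shell way to test whether a program is on the PATH.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given program is on the PATH.
require_cmd() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "error: '$1' is not installed or not on the PATH" >&2
    return 1
  fi
}

# Node.js and npm are needed to install and run attranslate.
if require_cmd node && require_cmd npm; then
  echo "prerequisites found"
else
  echo "please install Node.js first"
fi
```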

Usage Options

Run attranslate --help to see a list of available options:

Usage: attranslate [options]

Options:
  --srcFile <sourceFile>              The source file to be translated
  --srcLng <sourceLanguage>           A language code for the source language
  --srcFormat <sourceFileFormat>      One of "flat-json", "nested-json",
                                      "yaml", "po", "xml", "ios-strings",
                                      "arb", "csv"
  --targetFile <targetFile>           The target file for the translations
  --targetLng <targetLanguage>        A language code for the target language
  --targetFormat <targetFileFormat>   One of "flat-json", "nested-json",
                                      "yaml", "po", "xml", "ios-strings",
                                      "arb", "csv"
  --service <translationService>      One of "openai", "manual",
                                      "sync-without-translate",
                                      "google-translate", "azure"
  --serviceConfig <serviceKey>        supply configuration for a translation
                                      service (either a path to a key-file or
                                      an API-key)
  --matcher <matcher>                 An optional feature for string
                                      replacements. One of "none", "icu",
                                      "i18next", "sprintf" (default: "none")
  -v, --version                       output the version number

attranslate's People

Contributors

abichinger, cryptodev100, dependabot[bot], fkirc, johnfelipe, leolabs, oleliabo, onionhammer, paulxuca, ste74

attranslate's Issues

Fix problematic sync combinations

For most combinations of file formats, everything works as expected.
However, if the target file is XML or YAML and the source file has a different format, then the sync does not work correctly.
Details will follow.

Reduce outdated lint-packages

Currently, there is a huge number of outdated linting packages.
While linting is a great tool, it is not really worth updating all those packages to the latest versions.
attranslate should be simple to fork and simple to throw in new code, since the architecture allows for features to be isolated.

Target is up-to-date no matter what

Describe the bug
Hello,
I would love to use this tool, but apparently I cannot. I work with .po files (but I have also tried csv and nested-json and got the same problem).
It is installed correctly and it runs (if I set a target location different from the source, the file is generated with the proper keys).
BUT nothing is translated. Even if I use service=manual, I don't get any prompt asking me to translate manually. I always get "Target is up-to-date".

To Reproduce
This is what I use:
attranslate --srcFile=/home/max/projects/yoschool_repo/main/locale/de/LC_MESSAGES/django.po --srcLng=en --srcFormat=po --targetFile=/home/max/projects/yoschool_repo/main/locale/de/LC_MESSAGES/django.po --targetLng=de --targetFormat=po --service=manual

Expected behavior
I expect a prompt asking me to enter a translation for each entry in the .po file

Files
Here is the beginning of my .po file:


#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2023-06-27 13:39+0200\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"Language: \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=2; plural=(n != 1);\n"

#: enums/EmailSubject.py:10
msgid "$0 invites you to join $1 on YoPlanning"
msgstr ""

#: enums/EmailSubject.py:11
msgid "Welcome to Yoplanning!"
msgstr ""

Error: 8 RESOURCE_EXHAUSTED: Quota Error: RESOURCE_EXHAUSTED

I tried to translate the content of an XML file and ended up with this error:

"8 RESOURCE_EXHAUSTED: Quota Error: RESOURCE_EXHAUSTED Quota exceeded for quota metric 'v2 and v3 general model characters' and limit 'v2 and v3 general model characters per minute' of service 'translate.googleapis.com' for consumer"

I invoked 'google-translate' from 'pl' to 'sk' with 223559 inputs. Does anyone know what I should do? The limit in GCloud is set to 6,000,000 characters per minute by default. Any increase of the limit has to be approved by the service provider. If I should increase the limit, to what number? Can anyone help? Thanks.

Azure Info

Great project!

I'm trying to configure attranslate to work with the Azure translator. Unfortunately, I'm not getting very far and I can't find any documentation that explains how to configure it.

Any pointers / documentation would be appreciated.

Thanks!

Using ICU matcher with multiple interpolations will only handle the first interpolation

Describe the bug

As described by the title, the ICU matcher handles only the first interpolation when a string contains multiple interpolations.

npx attranslate --srcFile=path/to/translations/en.json --srcLng=en --srcFormat=nested-json --targetFile sv.json --targetLng=sv --targetFormat=nested-json --matcher=icu --service google-translate

Example:

{
  "pageXofX": "{currentPage} of {numberOfPages}"
}

is generated into:

{
  "pageXofX": "{currentPage} av <span> 1 </span>"
}

To Reproduce

Translate the JSON example above.

Expected behavior

The generated file should not change/remove any of the interpolation values, i.e. {value}.
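
One way to detect this kind of damage is to compare the placeholders of the source and target strings after a run. The sketch below is an illustration under stated assumptions, not part of attranslate: `placeholders` and `same_placeholders` are hypothetical helper names, and the pattern only recognizes simple `{name}` placeholders, not nested ICU constructs such as `select` or `plural`.

```shell
#!/bin/sh
# Hypothetical helper: list the simple {name} placeholders of a string, sorted.
placeholders() {
  printf '%s\n' "$1" | grep -oE '[{][A-Za-z0-9_]+[}]' | sort
}

# Hypothetical helper: succeed only if both strings contain the same placeholders.
same_placeholders() {
  [ "$(placeholders "$1")" = "$(placeholders "$2")" ]
}

src='{currentPage} of {numberOfPages}'
bad='{currentPage} av <span> 1 </span>'

if same_placeholders "$src" "$bad"; then
  echo "placeholders preserved"
else
  echo "placeholders changed"
fi
```

For the two strings from this report, the check takes the "placeholders changed" branch, because `{numberOfPages}` is missing from the translated string.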

Files

N/A

Additional context

N/A

Double quotes in Localizable.strings

Describe the bug
If I run a script to convert Android XML to iOS Localizable.strings, I get doubled quotes:

Android source:
"Incorrect id"

Actual results:
"incorrect_id" = ""Incorrect id.";

Expected results:
"incorrect_id" = "Incorrect id. ";

To Reproduce

ANDROID_EN="app/src/main/res/external/strings/values/strings.xml"
iOS_EN="/Base.lproj/Localizable.strings"

ANDROID_TO_iOS=( "--srcFormat=xml" "--targetFormat=ios-strings" "--service=sync-without-translate" "--cacheDir=android")

attranslate "${ANDROID_TO_iOS[@]}" --srcFile=$ANDROID_EN --targetFile=$iOS_EN --srcLng="en" --targetLng="en"

Expected behavior
Expected results:
"incorrect_id" = "Incorrect id. ";

Change of the license

Users of attranslate have nothing to worry about.
attranslate can still be used in any closed-source-projects.

However, if attranslate itself is modified, then those modifications must become open source under the GNU GPL.

Remove --deleteStale=false option

By default, attranslate deletes all stale translations in target-files. This is fine because it helps to keep translation-files in sync.

However, if you set the option --deleteStale=false, then the outcome depends on the exact target file format.
For JSON targets and some other target formats, stale translations will remain as is.
But for yaml and xml targets, the --deleteStale flag will be completely ignored.

Arrays are converted to objects

Describe the bug
Currently, when translating JSON files from source to target, attranslate converts any arrays inside the source JSON file into object representations. For example, from:

"address": {
    "lines": [
        "Line 1",
        "Line 2",
        "Line 3"
    ]
},

to

"address": {
  "lines": {
    "0": "Line 1",
    "1": "Line 2",
    "2": "Line 3"
  }
},

To Reproduce

  1. Create a new source JSON file containing an array
  2. Convert the JSON file to another language

Expected behavior
The expected behavior would be that the datatype stays the same.

Files
This could be done using any simple JSON file containing the above example content.

Additional context
N/A

cant translate pot file en to es?

https://gist.github.com/0b4ecd4380f47794c6f5a04f54d85325

attranslate --srcFile=wp-erp/i18n/languages/erp.pot --srcLng=en --srcFormat=po --targetFile=wp-erp/i18n/languages/erp-es.pot --targetLng=es --targetFormat=po --service=google-translate --serviceConfig=traducciones-352415-79659df1c6f1.json

and it shows me this:

Bypass 7129 strings because they are empty...
Invoke 'google-translate' from 'en' to 'es' with 11 inputs...
Add 7140 new translations
Write target '/root/attranslate/wp-erp/i18n/languages/erp-es.pot'

https://gist.github.com/1b715430007ff5aec99af3594a3b7010

Thanks for your help with this one.

Before this, I was using https://github.com/sourcecodeit/po-gtranslator

Support FormatJS ICU

Describe the feature

Support FormatJS ICU Messages Format

Example https://formatjs.io/docs/react-intl/components#formattedmessage

Source (flat-json, icu, en)

{
  "active_value": "{value, select, true {active} false {inactive} other {INVALID_ACTIVE}}"
}

What Happens now (current attranslate behavior) [target lang de]

{
    "active_value": "{value, select, true {active} falsch {inaktiv} andere {unbekannt}}"
}

Desired attranslate behavior [target lang de]

{
    "active_value": "{value, select, true {active} false {inaktiv} other {unbekannt}}"
}

Basically, with FormatJS message syntax, the lib shouldn't mutate the keys that are not wrapped in inner braces ({})...

Shutdown of zero-config translations

zero-config made it easy to use attranslate without configuring any API-keys.
However, the financial cost of zero-config was increasing rapidly over the last months.
Since I do not currently have any way of generating revenue, this cost is going to be unsustainable.
Therefore, I am shutting down zero-config and kindly request all existing users to migrate to google-translate or other serviceConfigs.

YAML variable name is being translated in language ES

Description

Using the following YAML file definition:

users:
  welcome(String value): "Hello $value!"

Running the following script:

COMMON_ARGS=("--srcFile=$INTL_DIR/messages.i18n.yaml" "--srcLng=en" "--srcFormat=yaml" "--targetFormat=yaml" "--service=google-translate" "--serviceConfig=$SERVICE_ACCOUNT_KEY" "--cacheDir=$INTL_DIR/cache" "--overwriteOutdated=true")

attranslate "${COMMON_ARGS[@]}" --targetFile=$TRANSLATION_DIR/messages_de.i18n.yaml --targetLng=de
attranslate "${COMMON_ARGS[@]}" --targetFile=$TRANSLATION_DIR/messages_es.i18n.yaml --targetLng=es

Generates the following, correct YAML in German...

users:
  welcome(String value): "Hallo $value!"

...and the following, faulty YAML in Spanish, where the variable $value suddenly shows up as $valor:

users:
  welcome(String value): "¡Hola $valor!"

Expected behavior
The Spanish YAML should look as follows:

users:
  welcome(String value): "¡Hola $value!"

SyntaxError: Unexpected end of JSON input

I have a properly formatted en.json translation file in the following format:

{
  "LOGIN": {
    "FORGOT_USERNAME": "Forgot your username?",
    "FORGOT_PASSWORD": "Forgot your password?",
  },
  "CUSTOMERSEARCH": {
    "CONTACT_EMAIL": "Contact Email",
    "BILLING_EMAIL": "Billing Email",
   }
}

I ran the command as:

attranslate --srcFile=en.json --srcLng=English --srcFormat=nested-json --targetFile=fr.json --targetLng=French --service=openai --serviceConfig=[my OpenAI key here] --targetFormat=nested-json

This fails with:

SyntaxError: Unexpected end of JSON input
at JSON.parse (<anonymous>)
at readRawJson (C:\Users\user\AppData\Roaming\npm\node_modules\attranslate\dist\file-formats\common\managed-json.js:30:31)
at readManagedJson (C:\Users\user\AppData\Roaming\npm\node_modules\attranslate\dist\file-formats\common\managed-json.js:22:36)
at NestedJson.readTFile (C:\Users\user\AppData\Roaming\npm\node_modules\attranslate\dist\file-formats\nested-json\nested-json.js:11:57)
at readTFileCore (C:\Users\user\AppData\Roaming\npm\node_modules\attranslate\dist\core\core-util.js:43:34)
at async resolveOldTarget (C:\Users\user\AppData\Roaming\npm\node_modules\attranslate\dist\core\translate-cli.js:20:16)
at async translateCli (C:\Users\user\AppData\Roaming\npm\node_modules\attranslate\dist\core\translate-cli.js:72:23)
error: Failed to parse 'C:\Users\user\source\repos\portal\portal\src\assets\i18n\fr.json'

I assume nested-json is proper here. I tried flat-json but only get:

error: Failed to parse 'C:\Users\user\source\repos\portal\portal\src\assets\i18n\en.json' with expected format 'flat-json': Property 'LOGIN' is not a string or null
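
The stack trace above points at the existing target file (`fr.json`), not at the source. In Node.js, `JSON.parse` reports "Unexpected end of JSON input" when it is given empty or truncated text, so one possible cause, offered here as an assumption and not a confirmed diagnosis, is an empty `fr.json` left over from an earlier attempt. A quick check could look like the following sketch; `is_empty_file` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file exists but has no content.
is_empty_file() {
  [ -e "$1" ] && [ ! -s "$1" ]
}

target="fr.json"
if is_empty_file "$target"; then
  echo "$target is empty; consider deleting it before the next run"
else
  echo "$target is missing or has content"
fi
```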

Watson IBM

It would be great to have a translator backed by IBM Watson.

yaml file first line language key and recognition of existing already translated lines

For yaml files, the first line is the language code, such as 'en:'

en:
  hello: "Hello world"

The translation to another language such as 'es' creates a new file es.yml where the first line language code is still 'en:'

en:
  hello: "Hola Mundo"

It would be good if i18n-auto-translation recognized the first-line language code and updated it in the translated output file.

The issue is that once the first-line language code has been updated manually, attranslate no longer recognizes the nested keys below it as already translated when it is run again for updates:

Example:

First run:

Invoke 'google-translate' from 'en' to 'es' with 1 inputs...
Add 1 new translations
Write target 'C:\Users\KG\app\config\locales\es-test.yml'

Manually edit first line language code to es:

es:
  hello: "Hola Mundo"

Expected on second run with edited existing file es-test.yml:

Target is up-to-date: '.\config\locales\es-test.yml'

Actual:

Invoke 'google-translate' from 'en' to 'es' with 1 inputs...
Add 1 new translations
Delete 1 stale translations
Write target 'C:\Users\KG\app\config\locales\es-test.yml'

Please advise how I can handle yaml file updates, thanks.

cant translate with azure

File to translate
https://gist.github.com/164b623aadc5c2442005d5342f3c9c6b

root@ubuntu20portatil:~/attranslate# attranslate --srcFile=general.yml --srcLng=en --srcFormat=yaml --targetFile=general-es.yml --targetLng=es --targetFormat=yaml --service=azure --serviceConfig=blablablabla

Bypass 1 strings because they are empty...
Invoke 'azure' from 'en' to 'es' with 727 inputs...
An error occurred:
Azure Translation failed: {"error":{"code":401000,"message":"The request is not authorized because credentials are missing or invalid."}}
Error: Azure Translation failed: {"error":{"code":401000,"message":"The request is not authorized because credentials are missing or invalid."}}
    at AzureTranslator.translateBatch (/usr/local/lib/node_modules/attranslate/dist/services/azure-translator.js:26:19)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at async Promise.all (index 0)
    at async AzureTranslator.translateStrings (/usr/local/lib/node_modules/attranslate/dist/services/azure-translator.js:40:25)
    at async runTranslationService (/usr/local/lib/node_modules/attranslate/dist/core/invoke-translation-service.js:62:24)
    at async invokeTranslationService (/usr/local/lib/node_modules/attranslate/dist/core/invoke-translation-service.js:29:28)
    at async translateCore (/usr/local/lib/node_modules/attranslate/dist/core/translate-core.js:106:29)
    at async translateCli (/usr/local/lib/node_modules/attranslate/dist/core/translate-cli.js:84:20)

Maybe I need to configure the key and region with some command, something like this?

attranslate set --key blablablabla
attranslate  set --region eastus

greetings

Error 16 UNAUTHENTICATED when invoking Google Cloud Translate

Describe the bug
I get an UNAUTHENTICATED error, despite having registered for Google Cloud Translate API and getting a key. I suspect my json file has the wrong structure.

To Reproduce
When I attempt to run the script below:

#!/bin/bash
set -e # abort on errors

# This example translates an english JSON-file into spanish, chinese and german. It uses Google Cloud Translate.
BASE_DIR="static/locales"
SERVICE_ACCOUNT_KEY="gcloud_translation_key.json"
COMMON_ARGS=( "--srcLng=en" "--srcFormat=nested-json" "--targetFormat=nested-json" "--service=google-translate" "--serviceConfig=$SERVICE_ACCOUNT_KEY" )

# install attranslate if it is not installed yet
attranslate --version || npm install -g attranslate

attranslate --srcFile=$BASE_DIR/en/translation.json --targetFile=$BASE_DIR/de/translation.json --targetLng=de "${COMMON_ARGS[@]}"

I get the following error

1.8.1
Invoke 'google-translate' from 'en' to 'de' with 8 inputs...
An error occurred:
16 UNAUTHENTICATED: Failed to retrieve auth metadata with error: key must be a string, a buffer or an object
Error: 16 UNAUTHENTICATED: Failed to retrieve auth metadata with error: key must be a string, a buffer or an object

I suspect it is because my gcloud_translation_key.json file is malformed:

{
  "key": "<the key I got from 'Enable APIs and services' on GCloud for translation>",
  "project_id": "<my project id>"
}
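
The README example passes a file named `gcloud_service_account.json` as `--serviceConfig`, which suggests that a Google Cloud service-account key file is expected and not a hand-written file holding an API key. Service-account key files downloaded from Google Cloud contain, among other fields, `"type": "service_account"`, `private_key` and `client_email`. The following sketch checks for those fields with plain `grep`; `looks_like_service_account_key` is a hypothetical helper name, and the check is only a heuristic, not a full validation.

```shell
#!/bin/sh
# Hypothetical helper: heuristic check for the fields of a service-account key file.
looks_like_service_account_key() {
  grep -q '"type"[[:space:]]*:[[:space:]]*"service_account"' "$1" &&
    grep -q '"private_key"' "$1" &&
    grep -q '"client_email"' "$1"
}

key_file="gcloud_translation_key.json"
if [ -f "$key_file" ] && looks_like_service_account_key "$key_file"; then
  echo "$key_file looks like a service-account key"
else
  echo "$key_file is missing or does not look like a service-account key"
fi
```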

Expected behavior
I expect the translated files to be output.

Files
I've added the script and config I use inline.

Additional context
Nothing else.

Thank you in advance for your help.
