language-translation-integration's Introduction

Language Translation Integration

This script was last tested in Nuix 9.0

View the GitHub project here or download the latest release here.

View the Java Docs here.

Overview

The Language Translation Integration project integrates with third-party translation services like Google Cloud Translation or Microsoft Cognitive Services. Translated text can be added as custom metadata, or appended to (or cleared from) an item's text.

Getting Started

Setup

Begin by downloading the latest release of this code. Extract the contents of the archive into your Nuix scripts directory. On Windows, the script directory is likely one of the following:

  • %appdata%\Nuix\Scripts - User level script directory
  • %programdata%\Nuix\Scripts - System level script directory

Prerequisites for LibreTranslate

This translator makes use of the LibreTranslate project. LibreTranslate provides a translation server which can be run locally. You will need to install and run the LibreTranslate server.

Prerequisites for Google Cloud Translation

Google Cloud Translation API Access

You will need a Google Cloud Platform account to access the Google Cloud Translation API. Use the following steps to sign up for an account:

  1. Sign up for an account here, https://cloud.google.com/translate/
  2. From the Google Cloud Platform Console select APIs & Services
  3. From the API Dashboard select Enable APIs and Services
  4. Search for and enable the Google Cloud Translation API
  5. On the Google Cloud Translation API overview select Credentials
  6. Click Create credentials and select API Key
  7. Copy the API key provided

Easy Translate Gem

The script makes use of a RubyGem, which must be installed by running the following command in Command Prompt from your Nuix Workstation installation directory:

c:\Program Files\Nuix\Nuix 7.8>jre\bin\java -Xmx500M -classpath lib\* org.jruby.Main --command gem install easy_translate --user-install

Prerequisites for Microsoft Cognitive Services

Microsoft Translator Text API Access

You will need a key for Microsoft's Translator Text API. To obtain one:

  1. Sign in to Microsoft Azure.
  2. Navigate to Cognitive Services.
  3. Add and configure a Text-Translation service.
  4. Once the service is created, copy the API key from the console.

Running the Script

The script requires that a case is open and items are selected.

Once you have selected items and started the script, an input dialog will prompt you to select from the available NuixTranslator options. The selected NuixTranslator will prompt for settings (if required) and present a progress dialog as it runs through the currently selected items.

The script uses sticky settings which are kept within the script's directory. Each NuixTranslator will have its own sticky settings, and settings can be saved/loaded through JSON.
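
As a rough illustration of per-translator sticky settings stored as JSON, here is a minimal sketch. The `StickySettings` class, the file naming scheme, and the setting keys are hypothetical and are not taken from the script itself.

```ruby
require 'json'
require 'tmpdir'

# Hypothetical helper: keeps one JSON settings file per translator.
# The file name and layout are assumptions, not the script's real scheme.
class StickySettings
  attr_reader :path

  def initialize(directory, translator_name)
    # Replace characters that are awkward in file names.
    safe_name = translator_name.gsub(/[^A-Za-z0-9_-]+/, '_')
    @path = File.join(directory, "#{safe_name}_settings.json")
  end

  # Write the settings hash as pretty-printed JSON.
  def save(settings)
    File.write(@path, JSON.pretty_generate(settings))
  end

  # Return saved settings merged over the defaults, or just the defaults
  # when no file has been saved yet.
  def load(defaults = {})
    return defaults unless File.exist?(@path)

    defaults.merge(JSON.parse(File.read(@path)))
  end
end

# Usage: round-trip settings through a temporary directory.
Dir.mktmpdir do |dir|
  store = StickySettings.new(dir, 'Libre Translate')
  store.save({ 'language' => 'de' })
  puts store.load({ 'language' => 'en' }).inspect
end
```

Keeping a separate file per translator means that changing one translator's settings cannot overwrite another's.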

The NuixTranslator Class

NuixTranslator is the base class. It initializes the settings and progress dialogs used while running through the currently selected items, and it contains methods for getting an item's original text, appending translated text, or adding translations as custom metadata.

Translation options are implemented as NuixTranslator subclasses, each defining a constant NAME string for itself (mostly used when showing dialogs), and a public method .run(items) to get the input settings and iterate over the selected items.

New translation options can be added by creating a NuixTranslator subclass .rb file in the script's "Translators" subdirectory.
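
The subclass convention can be sketched as follows. The `NuixTranslator` class here is only a stand-in so the example runs on its own; the real base class also handles dialogs and Nuix items. The `registry` helper and the `ShoutingTranslator` class are hypothetical, and plain strings stand in for Nuix items.

```ruby
# Stand-in for the script's real base class, which is not reproduced here.
# Only NAME and run(items) come from the README; registry is hypothetical.
class NuixTranslator
  NAME = 'Base'.freeze

  # Every subclass that has been defined so far.
  def self.registry
    @registry ||= []
  end

  # Ruby calls this hook whenever a subclass is created.
  def self.inherited(subclass)
    super
    NuixTranslator.registry << subclass
  end

  def run(items)
    raise NotImplementedError, "#{self.class} must implement run(items)"
  end
end

# Hypothetical translator that upper-cases text instead of calling a service.
class ShoutingTranslator < NuixTranslator
  NAME = 'Shouting (demo)'.freeze

  # items: plain strings stand in for Nuix items in this sketch.
  def run(items)
    items.map(&:upcase)
  end
end

# A selection dialog could list each registered translator by its NAME.
NuixTranslator.registry.each { |klass| puts klass::NAME }
```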

Common Translation Settings

  • Language - The translation target language.
  • Operation - Append Text or Add Custom Metadata.
    • Append Text will use the separator: \n----------Translation to <Language>---------\n
    • Add Custom Metadata will use the field name Translation to <Language>
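
To make the two operations concrete, the sketch below builds the separator and field name from the list above. The function names are hypothetical; only the separator and field name formats come from this README.

```ruby
# Separator placed between the original text and the translation.
def translation_separator(language)
  "\n----------Translation to #{language}---------\n"
end

# Custom metadata field name used by the Add Custom Metadata operation.
def translation_field_name(language)
  "Translation to #{language}"
end

# Append Text: original text, then the separator, then the translation.
def append_translation(original_text, translated_text, language)
  original_text + translation_separator(language) + translated_text
end

puts append_translation('Bonjour', 'Hello', 'English')
puts translation_field_name('English')
```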

Translation Options

Google Cloud Translation

Uses Google Cloud Translation through the EasyTranslate gem.

Adds the ability to detect an item's language, annotating the item's language as a tag or custom metadata.

Detection Settings

  • Apply detected language as custom metadata
    • Custom Metadata Field Name - Custom metadata field name to use
  • Tag items with detected language?
    • Tag Name - Applied tag will be <Tag Name>|<Detected Language>
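
The tag format above can be sketched like this. The function names are hypothetical, and plain strings stand in for Nuix items and for the language codes returned by detection.

```ruby
# Builds the tag in the <Tag Name>|<Detected Language> form.
def detected_language_tag(tag_name, detected_language)
  "#{tag_name}|#{detected_language}"
end

# Groups items by the tag they should receive, so each tag can be
# applied once to a whole batch. detections is a list of [item, language].
def group_by_language_tag(detections, tag_name)
  detections
    .group_by { |_item, language| detected_language_tag(tag_name, language) }
    .transform_values { |pairs| pairs.map(&:first) }
end

detections = [['doc1', 'fr'], ['doc2', 'de'], ['doc3', 'fr']]
group_by_language_tag(detections, 'Detected Language').each do |tag, items|
  puts "#{tag}: #{items.join(', ')}"
end
```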

Microsoft Cognitive Services

Uses the Microsoft Translator Text API.
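
For orientation, a request to the Translator service might be assembled as in the sketch below. It assumes the public Translator v3 REST endpoint and its `Ocp-Apim-Subscription-Key` header; the script itself may target a different version or host, and the function names are hypothetical. Nothing is sent over the network here.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Assumed endpoint: the public Translator v3 REST API.
MS_TRANSLATE_ENDPOINT = 'https://api.cognitive.microsofttranslator.com/translate'.freeze

# Builds (but does not send) a translation request for one piece of text.
def build_microsoft_request(text, target_language, api_key, region = nil)
  uri = URI(MS_TRANSLATE_ENDPOINT)
  uri.query = URI.encode_www_form('api-version' => '3.0', 'to' => target_language)

  request = Net::HTTP::Post.new(uri)
  request['Content-Type'] = 'application/json'
  request['Ocp-Apim-Subscription-Key'] = api_key
  # Regional resources also need the region header.
  request['Ocp-Apim-Subscription-Region'] = region if region
  request.body = JSON.generate([{ 'Text' => text }])
  request
end

# Extracts the first translation from a v3-style response body.
def parse_microsoft_response(body)
  JSON.parse(body).first['translations'].first['text']
end

request = build_microsoft_request('Hello', 'de', 'YOUR_API_KEY')
puts request.path
```

The request could then be sent with `Net::HTTP.start(host, port, use_ssl: true) { |http| http.request(request) }`.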

LibreTranslate

Once you have installed the LibreTranslate server and have it running, open a Nuix case, select the items you would like to translate and run the script. When prompted, select the choice Libre Translate.

If your LibreTranslate server is running on localhost, port 5000, enter http://localhost:5000/translate as the API URL. Then choose the source language, the target language, and any other options.
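
To show what a call to that API URL looks like, here is a minimal sketch based on LibreTranslate's documented `/translate` endpoint (JSON fields `q`, `source`, `target`, `format`, and a `translatedText` response field). The function names and timeout values are hypothetical, and this is not the connector's actual code.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Builds (but does not send) a POST request for the /translate endpoint.
def build_libretranslate_request(api_url, text, source, target, api_key = nil)
  payload = { 'q' => text, 'source' => source, 'target' => target, 'format' => 'text' }
  payload['api_key'] = api_key if api_key

  request = Net::HTTP::Post.new(URI(api_url))
  request['Content-Type'] = 'application/json'
  request.body = JSON.generate(payload)
  request
end

# Pulls the translated text out of a response body.
def parse_libretranslate_response(body)
  JSON.parse(body)['translatedText']
end

# Sends the request; requires a running server, so it is not called here.
def send_libretranslate_request(api_url, request, timeout_seconds = 120)
  uri = URI(api_url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https',
                  open_timeout: 10, read_timeout: timeout_seconds) do |http|
    http.request(request)
  end
end

request = build_libretranslate_request('http://localhost:5000/translate', 'Bonjour', 'fr', 'en')
puts request.body
```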

Special thanks to @Trekky12 for contributing the LibreTranslate connector!

Clear Translations

Removes translation text from selected items, obtaining an item's original text using methods from NuixTranslator.
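
One way to recover original text is to cut everything from the first translation separator onward. The sketch below assumes the original text always precedes the first separator; the script's own NuixTranslator methods may work differently, and the names used here are hypothetical.

```ruby
# Matches the Append Text separator for any language name.
SEPARATOR_PATTERN = /\n----------Translation to .+?---------\n/.freeze

# Returns the text before the first separator, or the text unchanged
# when no separator is present.
def strip_translations(text)
  index = text.index(SEPARATOR_PATTERN)
  index ? text[0...index] : text
end

combined = "Bonjour\n----------Translation to English---------\nHello"
puts strip_translations(combined)
```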

Cloning this Repository

This script relies on code from Nx to present a settings dialog and progress dialog. This JAR file is not included in the repository (although it is included in release downloads). If you clone this repository, you will also want to obtain a copy of Nx.jar by either:

  1. Building it from the source
  2. Downloading an already built JAR file from the Nx releases

Once you have a copy of Nx.jar, make sure to include it in the same directory as the script.

License

Copyright 2019 Nuix

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

language-translation-integration's People

Contributors

juicydragon, nuix-mrk, trekky12

Forkers

trekky12

language-translation-integration's Issues

Timeout Issues Libretranslate-Docker

Hey there !

My team and I are seeing quite a few timeouts when the Ruby script, running in Nuix 9.10, accesses a LibreTranslate (v1.4.0) Docker container for translations.
The issue we're facing is that although you can set a timeout when executing the LanguageTranslationIntegration script in Nuix, this timeout is not passed on to the Docker container.
For example, we set the timeout (120 s) and start translating. The first items were tagged "success" because there is no processed text in Nuix. The next one, a big Excel sheet, runs into the timeout and should be skipped for being too large for the short timeout. But when the timeout expires, script execution continues in Nuix with the next item, while the LibreTranslate container keeps processing the first translation request, causing the queue to grow. Eventually the queue grows to about 94 elements and then stops accepting new jobs. How can I effectively pass the timeout to the Docker container?

The batch is nearly 800 items, including some Excel files with over 1 million lines, about 2 MB to 10 MB in size.

Is anyone facing similar issues who may be able to help?

Thanks !

Greetings, Rezo

Suggested update of languages

Not so much an issue as a suggested update to the code: list the supported languages (as of Jan 2023) alphabetically, with Auto language detection included as an option (and English near the top as a preference).

LANGUAGES = {
'auto' => 'Auto',
'en' => 'English',
'sq' => 'Albanian',
'ar' => 'Arabic',
'az' => 'Azerbaijani',
'zh' => 'Chinese',
'cs' => 'Czech',
'da' => 'Danish',
'nl' => 'Dutch',
'eo' => 'Esperanto',
'fi' => 'Finnish',
'fr' => 'French',
'gl' => 'Galician',
'de' => 'German',
'el' => 'Greek',
'he' => 'Hebrew',
'hi' => 'Hindi',
'hu' => 'Hungarian',
'id' => 'Indonesian',
'ga' => 'Irish',
'it' => 'Italian',
'ja' => 'Japanese',
'kab' => 'Kabyle',
'ko' => 'Korean',
'nb' => 'Norwegian Bokmål',
'oc' => 'Occitan',
'fa' => 'Persian',
'pl' => 'Polish',
'pt' => 'Portuguese',
'ru' => 'Russian',
'sk' => 'Slovak',
'es' => 'Spanish',
'sv' => 'Swedish',
'zgh' => 'Tamazight-Standard Moroccan',
'tr' => 'Turkish',
'uk' => 'Ukrainian',
'vi' => 'Vietnamese'
}.freeze

While there may be reason to not do so on large data sets due to delays, I found removing the ".freeze" enabled the "Auto" selection to identify different languages in files and translate as encountered.

LibreTranslator on NUIX

Hello everyone,
I have Nuix version 9.10.18 and use Language-Translation-Integration 1.3.0. LibreTranslate runs on an Ubuntu virtual machine.
LibreTranslate is reachable at its IP address, and I can use it from the browser.
For example, I selected a WhatsApp chat in French. I started the script and the translation began.
The script tells me "No response received! Please try again later"
What's the problem? It looks like the script can't find the HTTP response.
Can someone help me?
Thanks,
regards
Steffen

Easy_translate gem won't install in Nuix Workstation 8.2

Trying to install the easy_translate gem in Nuix 8.2 results in the error:

uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:1: warning: It seems your ruby installation is missing psych (for YAML output).

The issue seems to be a problem with the bundled jruby 9.2.7.0. Swapping the bundled jruby .jar files to version 9.2.0.0 fixes the problem - although I don't know if it is going to break other features.
