Comments (14)

ZhymabekRoman commented on May 28, 2024

And yes something is happening with the caches

It's not a bug, it's a feature. When I designed the V2 translatepy architecture, I made a single cache instance available to all BaseTranslator class instances. In practice, that doesn't seem to have been a good idea. If required, I can make a PR to fix this and integrate the new LRU cache logic (#58).

class BaseTranslator(ABC):
    """
    Base abstract class for a translate service
    """
    _translations_cache = LRUDictCache()
    _transliterations_cache = LRUDictCache()
    _languages_cache = LRUDictCache()
    _spellchecks_cache = LRUDictCache()
    _examples_cache = LRUDictCache()
    _dictionaries_cache = LRUDictCache()
    _text_to_speeches_cache = LRUDictCache(8)

The caches are initialized as class attributes, not instance attributes, so every instance (and every subclass) shares the same cache objects. More info: https://stackoverflow.com/a/207128/13452914
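To make the class-attribute point concrete, here is a minimal sketch. It does not use translatepy at all: SharedCacheTranslator, PerInstanceCacheTranslator and the `detected` argument are hypothetical stand-ins, and a plain dict replaces LRUDictCache. It only shows how a cache declared on the class is shared by every instance, while one created in __init__ is not.

```python
class SharedCacheTranslator:
    # Class attribute: a single dict shared by all instances and subclasses.
    _languages_cache = {}

    def __init__(self, detected):
        # Hypothetical stand-in for whatever a real service would answer.
        self.detected = detected

    def language(self, text):
        # Only ask the "service" when the text is not cached yet.
        if text not in self._languages_cache:
            self._languages_cache[text] = self.detected
        return self._languages_cache[text]


class PerInstanceCacheTranslator(SharedCacheTranslator):
    def __init__(self, detected):
        super().__init__(detected)
        # Instance attribute: shadows the class-level dict for this object only.
        self._languages_cache = {}


# Two separate instances, yet the second one gets the first one's cached answer.
shared_first = SharedCacheTranslator("eng").language("casa")
shared_second = SharedCacheTranslator("spa").language("casa")

# With a per-instance cache, each object keeps its own answer.
own_first = PerInstanceCacheTranslator("eng").language("casa")
own_second = PerInstanceCacheTranslator("spa").language("casa")

print(shared_first, shared_second)  # eng eng
print(own_first, own_second)        # eng spa
```

Moving the cache creation into __init__ is only one possible way to get per-instance caches; the actual change made in the library may differ.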

from translate.

ZhymabekRoman commented on May 28, 2024

New PR done: #76

translate git:(main) ipython
Python 3.9.2 (default, Feb 28 2021, 17:03:44) 
Type 'copyright', 'credits' or 'license' for more information
IPython 8.8.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from translatepy.translators.google import GoogleTranslate

In [2]: g = GoogleTranslate(service_url="translate.google.es")

In [3]: g.language("casa")
Out[3]: LanguageResult(service=Google, source=casa, result=spa)

In [4]: g = GoogleTranslate(service_url="translate.google.fr")

In [5]: g.language("casa")
Out[5]: LanguageResult(service=Google, source=casa, result=por)

Animenosekai commented on May 28, 2024

When I designed the V2 translatepy architecture, I made a single cache instance available to all BaseTranslator class instances.

Yes, I think this should be changed because people using translators separately expect different results from each instance.

Moreover, if they want a shared cache, they might just use the Translate class.

Also, yeah, you can PR the new LRU logic anytime you want!

ZhymabekRoman commented on May 28, 2024

Thanks for reporting this! This is strange: in my case, even the GoogleTranslate class doesn't recognize the language correctly. The problem seems to be on Google's server side.

translate git:(main) ipython3
Python 3.9.2 (default, Feb 28 2021, 17:03:44) 
Type 'copyright', 'credits' or 'license' for more information
IPython 8.8.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import translatepy

In [2]:  translatepy.translators.google.GoogleTranslateV1().language("casa")
Out[2]: LanguageResult(service=Google, source=casa, result=eng)
translate git:(main) ipython3
Python 3.9.2 (default, Feb 28 2021, 17:03:44) 
Type 'copyright', 'credits' or 'license' for more information
IPython 8.8.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import translatepy

In [2]:  translatepy.translators.google.GoogleTranslateV2().language("casa")
Out[2]: LanguageResult(service=Google, source=casa, result=eng)
translate git:(main) ipython3
Python 3.9.2 (default, Feb 28 2021, 17:03:44) 
Type 'copyright', 'credits' or 'license' for more information
IPython 8.8.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import translatepy

In [2]:  translatepy.translators.google.GoogleTranslate().language("casa")
Out[2]: LanguageResult(service=Google, source=casa, result=eng)

joeperpetua commented on May 28, 2024

Thanks for the response!
I experimented a little more, and it does seem that Google Translate is the issue.
Also, it seems that the first response will influence the subsequent results. For example:
I used GoogleTranslate() first and got result=eng. When I then used Reverso, the result was the same as the one from Google:

Python 3.11.1 (tags/v3.11.1:a7a450f, Dec  6 2022, 19:58:39) [MSC v.1934 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import translatepy
>>> translatepy.translators.google.GoogleTranslate().language("casa")
LanguageResult(service=Google, source=casa, result=eng)
>>> translatepy.translators.reverso.ReversoTranslate().language("casa")
LanguageResult(service=Reverso, source=casa, result=eng)

But, if you use Reverso first, then the result will be correct when using Google Translate:

Python 3.11.1 (tags/v3.11.1:a7a450f, Dec  6 2022, 19:58:39) [MSC v.1934 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import translatepy
>>> translatepy.translators.reverso.ReversoTranslate().language("casa")
LanguageResult(service=Reverso, source=casa, result=spa)
>>> translatepy.translators.google.GoogleTranslate().language("casa")
LanguageResult(service=Google, source=casa, result=spa)

Could this be related to the cache mechanism?

Animenosekai commented on May 28, 2024

(I guess because of the cache, but I may be wrong)

Yes, I would guess the same !

But interestingly enough, then, in the same session, using the base translator with the method translate(), the detection was off again

This is normal, because some translators, such as Google Translate, already return the source language from their translation endpoint, while others first need to call the language endpoint.

So, even if you called the language endpoint first with Google Translate, the source language would be the one returned by the translation endpoint.
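As a rough illustration of that difference, the sketch below uses two hypothetical classes, DetectThenTranslate and TranslateReportsSource. They are not translatepy classes, and the endpoint methods return made-up values instead of calling any service. The point is only that a translator whose translation endpoint reports the source language never consults the language cache during translate().

```python
class DetectThenTranslate:
    """Hypothetical service with a separate language-detection endpoint."""

    def __init__(self):
        self._languages_cache = {}

    def _language_endpoint(self, text):
        return "spa"  # made-up answer, no network call

    def _translate_endpoint(self, text, source=None):
        return {"translation": "house"}  # made-up answer, no network call

    def language(self, text):
        # Cached detection, in the spirit of the language cache in this thread.
        if text not in self._languages_cache:
            self._languages_cache[text] = self._language_endpoint(text)
        return self._languages_cache[text]

    def translate(self, text):
        # The source language comes from language(), so the cache is used.
        source = self.language(text)
        payload = self._translate_endpoint(text, source)
        return source, payload["translation"]


class TranslateReportsSource(DetectThenTranslate):
    """Hypothetical service whose translation endpoint also reports the source."""

    def _translate_endpoint(self, text, source=None):
        return {"translation": "house", "detected": "eng"}  # made-up answer

    def translate(self, text):
        # The source language comes straight from the translation payload,
        # so whatever language() cached earlier is ignored here.
        payload = self._translate_endpoint(text)
        return payload["detected"], payload["translation"]


service = TranslateReportsSource()
detected_by_language = service.language("casa")
detected_by_translate, translation = service.translate("casa")
print(detected_by_language, detected_by_translate)  # spa eng
```

In this toy setup, language() and translate() disagree within the same session, which matches the behavior described above.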

The weirdest thing is that Google Translate returned Spanish though.

Looking at the official website, we see that indeed the detected language is English

Screenshot 0005-01-23 at 21 18 59

Animenosekai commented on May 28, 2024

Also, it seems that the first response will influence the subsequent results

Now this is weird, because it shouldn't lol

This is the part where the GET cache is returned

_cache_key = str(url) + str(kwargs)
if _cache_key in self.GETCACHE and time() - self.GETCACHE[_cache_key]["timestamp"] < self.cache_duration:
    return self.GETCACHE[_cache_key]["response"]
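For readers unfamiliar with this pattern, here is a self-contained sketch of a time-limited GET cache built around the same key and timestamp check. TimedGetCache, its `fetch` callback and the injectable `clock` are assumptions made for this example and are not part of translatepy.

```python
from time import time


class TimedGetCache:
    """Hypothetical cache keyed by URL plus keyword arguments, with expiry."""

    def __init__(self, cache_duration=2.0, clock=time):
        self.cache_duration = cache_duration
        self._clock = clock  # injectable so the example needs no sleeping
        self._entries = {}

    def get(self, url, fetch, **kwargs):
        cache_key = str(url) + str(kwargs)
        now = self._clock()
        entry = self._entries.get(cache_key)
        # Serve the stored response only while it is younger than cache_duration.
        if entry is not None and now - entry["timestamp"] < self.cache_duration:
            return entry["response"]
        response = fetch(url, **kwargs)
        self._entries[cache_key] = {"timestamp": now, "response": response}
        return response


calls = []


def fake_fetch(url, **kwargs):
    # Stand-in for a real HTTP request; it just counts how often it is called.
    calls.append(url)
    return f"response #{len(calls)}"


now = [0.0]
cache = TimedGetCache(cache_duration=2.0, clock=lambda: now[0])

first = cache.get("https://example.com", fake_fetch, q="casa")
second = cache.get("https://example.com", fake_fetch, q="casa")  # fresh entry: cache hit
now[0] = 5.0  # pretend five seconds have passed
third = cache.get("https://example.com", fake_fetch, q="casa")   # expired: fetched again

print(first, second, third)
```

Because the key is built from the URL and the keyword arguments, requests to different service URLs get different entries in this sketch.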

For the translator cache, here is the part where it gets the cache

if _cache_key in self._languages_cache:
    # Taking the values from the cache
    language = self._languages_cache[_cache_key]

But that's weird because we clearly see that you are creating two different instances of the Translator class

>>> translatepy.translators.reverso.ReversoTranslate().language("casa")
LanguageResult(service=Reverso, source=casa, result=spa)
>>> translatepy.translators.google.GoogleTranslate().language("casa")
LanguageResult(service=Google, source=casa, result=spa)

joeperpetua commented on May 28, 2024

Well, just found a very interesting behavior (or bug) from Google Translate.
It seems that it will detect a different language depending on the language of your Google account. For example:
GA - English | detects English:
image
GA - Spanish | detects Spanish:
image
GA - French and German | detect Portuguese:
image
image
From this, I guess the best approach would be to just clean the cache on the production server, then use Reverso to get the language and pass it explicitly.

Animenosekai commented on May 28, 2024

Well, just found a very interesting behavior (or bug) from Google Translate.
It seems that it will detect a different language depending on the language of your Google account. For example:

Wow now that's interesting...

I guess it might be a feature to better guess the expected result.

Animenosekai commented on May 28, 2024

But then it might change the result based on the service URL used 🤔

Animenosekai commented on May 28, 2024

Just confirmed it:

>>> from translatepy.translators.google import GoogleTranslate
>>> g = GoogleTranslate(service_url="translate.google.es")
>>> g.language("casa")
LanguageResult(service=Google, source=casa, result=spa)
>>> g = GoogleTranslate(service_url="translate.google.fr")
>>> g.language("casa")
LanguageResult(service=Google, source=casa, result=spa)
>>> g.clean_cache()
>>> g.language("casa")
LanguageResult(service=Google, source=casa, result=por)

And yes something is happening with the caches

joeperpetua commented on May 28, 2024

But that's weird because we clearly see that you are creating two different instances of the Translator class

Well, that is interesting indeed. I would have totally blamed it on the cache, to be honest lol

I guess it might be a feature to better guess the expected result.
But then it might change the result based on the service URL used 🤔

Yeah, but I think it kinda makes sense for words that are the same in different languages. For example, casa is the same in Spanish, Portuguese and Italian, so if your GA is set to Italian, the detection will go with Italian:
image

joeperpetua commented on May 28, 2024

And yes something is happening with the caches

Well, that is something lol. I tried checking the source code before, but my Python skills are not that sharp 😅 Maybe you have a better eye to catch what's going on lol

joeperpetua commented on May 28, 2024

Thank you all guys for the help 🙌🙌
