vocabulary's Introduction

1   Vocabulary

Join the chat at https://gitter.im/prodicus/vocabulary

A dictionary magician in the form of a module!

Author: Tasdik Rahman

1.1   What is it

For a given word, using Vocabulary, you can get its

  • Meaning
  • Synonyms
  • Antonyms
  • Part of speech : whether the word is a noun, interjection, adverb, etc.
  • Translate : Translate a phrase from a source language to the desired language.
  • Usage example : a quick example on how to use the word in a sentence
  • Pronunciation
  • Hyphenation : shows the particular stress points (if any)

1.2   Features

  • Written in uncomplicated Python
  • Returns JSON objects, Python dictionaries and lists
  • Minimal dependencies (uses only the requests module)
  • Easy to install
  • A decent substitute for Wordnet (well, almost!). Want to see? Here is a small comparison
  • Stupidly easy to use
  • Fast!
  • Supports
    • both Python 2.* and Python 3.*
    • Mac, Linux and Windows

1.3   Why should I use Vocabulary

Wordnet is a great resource. No doubt about it! So why should you use Vocabulary when we already have Wordnet out there?

1.3.1   Wordnet Comparison

Let's say you want to find out the synonyms for the word car.

  • Using Wordnet
>>> from nltk.corpus import wordnet
>>> syns = wordnet.synsets('car')
>>> # In NLTK 3, lemmas() and name() are methods rather than attributes
>>> syns[0].lemmas()[0].name()
'car'
>>> [s.lemmas()[0].name() for s in syns]
['car', 'car', 'car', 'car', 'cable_car']

>>> [l.name() for s in syns for l in s.lemmas()]
['car', 'auto', 'automobile', 'machine', 'motorcar', 'car', 'railcar', 'railway_car', 'railroad_car', 'car', 'gondola', 'car', 'elevator_car', 'cable_car', 'car']
  • Doing the same using Vocabulary
>>> from vocabulary.vocabulary import Vocabulary as vb
>>> vb.synonym("car")
'[{
  "seq": 0,
  "text": "automobile"
}, {
  "seq": 1,
  "text": "cart"
}, {
  "seq": 2,
  "text": "automotive"
}, {
  "seq": 3,
  "text": "wagon"
}, {
  "seq": 4,
  "text": "motor"
}]'
>>> ## load the json data
>>> import json
>>> car_synonyms = json.loads(vb.synonym("car"))
>>> type(car_synonyms)
<class 'list'>
>>>

So there you go. You get the data in an easy JSON format.

You can compare the other methods in the same way.
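Once the JSON string is parsed, pulling out just the words is a one-liner. A minimal sketch: to stay runnable offline it uses a hard-coded copy of the sample response from the comparison above rather than a live call to vb.synonym(), and the variable names are only illustrative.

```python
import json

# Hard-coded copy of the sample response shown in the comparison
# (stands in for the string returned by vb.synonym("car")).
raw = (
    '[{"seq": 0, "text": "automobile"}, {"seq": 1, "text": "cart"}, '
    '{"seq": 2, "text": "automotive"}, {"seq": 3, "text": "wagon"}, '
    '{"seq": 4, "text": "motor"}]'
)

entries = json.loads(raw)                     # list of dicts
words = [entry["text"] for entry in entries]  # keep only the synonym text
print(words)
```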

1.4   Installation

1.4.1   Option 1: Installing through pip (Suggested way)

pypi package link

$ pip install vocabulary

If you are behind a proxy

$ pip --proxy [username:password@]domain_name:port install vocabulary

Note: If you get "command not found" on Debian or Ubuntu, then $ sudo apt-get install python-pip should fix that.

1.4.2   Option 2: Installing from source (Only if you must)

$ git clone https://github.com/tasdikrahman/vocabulary.git
$ cd vocabulary/
$ pip install -r requirements.txt
$ python setup.py install

1.4.3   Demo

Demo link

1.5   Documentation

For a detailed usage example, refer to the documentation at Read the Docs.

1.6   Contributing

Please refer to the Contributing page for details.

1.6.1   Discuss

Join us on our Gitter channel if you want to chat or have any questions.

1.6.2   Contributors

  • Huge shoutout to @tenorz007 for adding the ability to return the API response as different data structures.
  • Thanks to Anton Relin for adding the translate module.
  • And a big shout-out to all the contributors for their contributions.

1.7   Changelog

Please refer to the Changelog page for details.

1.8   Bugs

Please report bugs at the issue tracker.

1.9   Similar

Other similar software inspired by Vocabulary

  • Vocabulary : The Go port of this Python package
  • woordy : Gives back word translations
  • guile-words : The Guile Scheme port of this Python package

1.9.1   Known Issues

  • In Python 2, when using the method Vocabulary.synonym() or Vocabulary.pronunciation():
>>> vb.synonym("car")
[{
  "seq": 0,
  "text": "automotive"
}, {
  "seq": 1,
  "text": "motor"
}, {
  "seq": 2,
  "text": "wagon"
}, {
  "seq": 3,
  "text": "cart"
}, {
  "seq": 4,
  "text": "automobile"
}]
>>> type(vb.pronunciation("hippopotamus"))
<class 'list'>
>>> json.dumps(vb.pronunciation("hippopotamus"))
'[{"raw": "(h\\u012dp\\u02cc\\u0259-p\\u014ft\\u02c8\\u0259-m\\u0259s)", "rawType": "ahd-legacy", "seq": 0}, {"raw": "HH IH2 P AH0 P AA1 T AH0 M AH0 S", "rawType": "arpabet", "seq": 1}]'
>>>

You are being returned a list object instead of a JSON object. When returning the latter, there are some Unicode issues. A fix for this will be released soon.

In the meantime, python-ftfy may help with this.
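If you serialize such a list yourself, the standard library's json.dumps escapes non-ASCII characters by default, and passing ensure_ascii=False keeps them readable. A minimal Python 3 sketch, using a hard-coded copy of the first pronunciation entry from the output above instead of a live call; it illustrates the escaping behaviour only and is not the library's pending fix.

```python
import json

# Hard-coded copy of one entry from the sample output above
# (stands in for vb.pronunciation("hippopotamus")).
pronunciation = [
    {
        "raw": "(h\u012dp\u02cc\u0259-p\u014ft\u02c8\u0259-m\u0259s)",
        "rawType": "ahd-legacy",
        "seq": 0,
    },
]

escaped = json.dumps(pronunciation)                       # non-ASCII written as \uXXXX
readable = json.dumps(pronunciation, ensure_ascii=False)  # non-ASCII kept as characters

print(escaped)
print(len(escaped) > len(readable))  # the escaped form is longer
```

Both strings parse back to the same list with json.loads; only the textual form differs.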

1.10   License

Built with ♥ by Tasdik Rahman under the MIT License ©

You can find a copy of the License at http://prodicus.mit-license.org/

1.11   Donation

Paypal badge

Instamojo

gratipay

patreon

vocabulary's Issues

Contribute more backends, extend API

I've looked over your project and it seems to be very interesting. In fact I've done something similar for my own purposes. My goal, however, was to build a small set of tools to 1) extract words from my dictionary 2) obtain translation/usage/examples/synonyms etc 3) show that in conky 4) convert that to Anki format for future learning. Take a look if you're interested: https://github.com/balta2ar/yandex-slovari-tetradki/

I've implemented a small set of backends for slightly different sites. The implementation might not be very robust or production-ready (it scrapes the HTML rather than using an API), but I still think our implementations share common ideas. In this issue I wanted to ask whether you think support for additional sites (along the lines of those my script supports) would be appropriate in your project.

Return False always

Hi. I installed this package successfully (no errors). I tried your examples:

from vocabulary import Vocabulary as vb
print(vb.meaning('car'))
print(vb.synonym('car'))

Python version 3.5.2 (2.7.12) - I tried it on Windows and Linux, with the same result.
It always returns False, no matter what word I use.

Only this works:
print(vb.hyphenation('car'))

[{"text": "car", "seq": 0}]

Thanks for help

Blazing fast, but a significant issue

Hi,

Love this library and thank god it exists! However, the synonyms don't seem to be very accurate. For example, "small" returns "fly" as the first result, and "bridge" as the second. Whilst I can see how "fly" can be a synonym (at a stretch), I'm not sure how "bridge" is synonymous with "small". Is there any way to improve the accuracy of the synonyms?

Cheers

A lot of duplicates from meaning()

When using the Vocabulary.meaning() function there is an enormous number of duplicates. In some cases, like the word cat, some meanings are duplicated 4 or even 5 times. Is there any way to display only one of the similar meanings?
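Until the library filters these itself, duplicates can be dropped on the caller's side. A minimal sketch, assuming the meaning() output has already been parsed into a list of dicts with a "text" key (the same shape as the synonym output in the README); dedupe_meanings is a hypothetical helper, not part of Vocabulary, and the sample data is made up.

```python
def dedupe_meanings(entries):
    """Keep the first occurrence of each meaning, ignoring case and outer whitespace."""
    seen = set()
    unique = []
    for entry in entries:
        key = entry["text"].strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(entry)
    # Renumber so "seq" stays contiguous after removals.
    return [{"seq": i, "text": e["text"]} for i, e in enumerate(unique)]


# Made-up sample data for illustration only.
sample = [
    {"seq": 0, "text": "A small domesticated carnivorous mammal."},
    {"seq": 1, "text": "a small domesticated carnivorous mammal. "},
    {"seq": 2, "text": "Any of various related wild felines."},
]
print(dedupe_meanings(sample))
```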

Installation issues on Mac

Hey,
I tried to install the package using pip but I'm getting an error. I have added the traceback below:

$ pip3 install vocabulary
Collecting vocabulary
  Downloading Vocabulary-1.0.2.tar.gz
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/private/var/folders/bd/tp7n9v393ml9src1bt4m70180000gn/T/pip-build-4zb4lgql/vocabulary/setup.py", line 22, in <module>
        with open(path.join(here, 'requirements.txt'), encoding='utf-8') as f:
    FileNotFoundError: [Errno 2] No such file or directory: '/private/var/folders/bd/tp7n9v393ml9src1bt4m70180000gn/T/pip-build-4zb4lgql/vocabulary/requirements.txt'

    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/bd/tp7n9v393ml9src1bt4m70180000gn/T/pip-build-4zb4lgql/vocabulary/

I wasn't able to install Vocabulary in a clean virtual environment (with no installations of requests and mock) either.

$ pip install vocabulary
Collecting vocabulary
  Using cached Vocabulary-1.0.2.tar.gz
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/private/var/folders/bd/tp7n9v393ml9src1bt4m70180000gn/T/pip-build-erniz75j/vocabulary/setup.py", line 12, in <module>
        from vocabulary.version import VERSION
      File "/private/var/folders/bd/tp7n9v393ml9src1bt4m70180000gn/T/pip-build-erniz75j/vocabulary/vocabulary/__init__.py", line 5, in <module>
        from .vocabulary import Vocabulary
      File "/private/var/folders/bd/tp7n9v393ml9src1bt4m70180000gn/T/pip-build-erniz75j/vocabulary/vocabulary/vocabulary.py", line 25, in <module>
        import requests
    ImportError: No module named 'requests'

    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/bd/tp7n9v393ml9src1bt4m70180000gn/T/pip-build-erniz75j/vocabulary/

As the traceback shows, the failure comes from the import statement in the __init__.py module, which pulls in vocabulary.py and, through it, requests before it has been installed.
How about we define the version and release in setup.py itself, so as to keep things condensed?

On a separate note, I believe that we are using rather constrained dependency package versions.
For example, any version of requests above 2.0.0 should work fine as a dependency.
What's your take on this?
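One common way to avoid importing the package from setup.py is to read the version string out of the file as plain text. A minimal sketch, assuming vocabulary/version.py contains a line such as VERSION = "1.0.2" (the module and variable names come from the traceback; the exact line format is an assumption, and read_version is a hypothetical helper):

```python
import io
import re


def read_version(path):
    """Return the VERSION string from a file without importing it."""
    with io.open(path, encoding="utf-8") as f:
        source = f.read()
    # Match a top-level assignment like: VERSION = "1.0.2"
    match = re.search(r"^VERSION\s*=\s*['\"]([^'\"]+)['\"]", source, re.MULTILINE)
    if match is None:
        raise RuntimeError("VERSION not found in " + path)
    return match.group(1)
```

In setup.py this would be called as read_version("vocabulary/version.py"), and the looser pin suggested above could be expressed with setuptools as install_requires=["requests>=2.0.0"].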

Odd requirements.txt

The requirements.txt file should be a list of packages required by your package, for parsing by pip.

In your requirements.txt, we have the following:

requests==2.8.1
Vocabulary==0.0.3
wheel==0.26.0
  • requests==2.8.1 - Why this version? Is there some quirk you require? If you're just using the normal API, you can leave out the explicit version and just use requests here.
  • Vocabulary==0.0.3 - This is your package, right? This could cause some nasty dependency loops in pip, it shouldn't be here.
  • wheel==0.26.0 - Why? Where are you using wheel?

docs autogeneration

For generating docs using sphinx, one has to manually run the make html command if some changes are made.

Using CI/CD would remove this manual work

Installation issue on Ubuntu 16.04

Faced the following error while installing via pip:

Collecting vocabulary
  Using cached Vocabulary-1.0.2.tar.gz
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-build-PWXQ1D/vocabulary/setup.py", line 19, in <module>
        with open(path.join(here, 'requirements.txt')) as f:
    IOError: [Errno 2] No such file or directory: '/tmp/pip-build-PWXQ1D/vocabulary/requirements.txt'


Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-PWXQ1D/vocabulary/

I ran pip install requests before this since it failed with the error "No such module : requests" earlier.

Unusual docstring format

While this isn't really a huge problem, you may wish to reformat your docstrings according to the established formats.

PEP 287 recommends that reST (reStructuredText) format be used, but Epytext is popular as well. These formats are used because Python IDEs can often parse them to get more information about the things they document, for example for auto-completion.
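For reference, a reST-style docstring using Sphinx field lists looks roughly like this. The function is a made-up stand-in for illustration, not taken from Vocabulary:

```python
def first_text(entries):
    """Return the ``text`` field of the first entry.

    :param entries: parsed response entries, each with a ``text`` key
    :type entries: list of dict
    :returns: the first entry's text, or ``None`` if the list is empty
    :rtype: str or None
    """
    return entries[0]["text"] if entries else None
```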

Rate Limit?

This is similar to issue #23.
I installed vocabulary on 3 different machines. On each machine, I could obtain the meaning, synonym, etc. of a word for only a limited number of function calls (around 800 requests). After that, I receive 'False' for all subsequent calls except hyphenation().

Is there a rate limit? If so, can we include it in the docs?
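Whether the upstream API enforces a limit is not confirmed in this thread. If it does, a caller could back off and retry when a method returns False. A minimal Python 3 sketch; call_with_backoff is a hypothetical helper and the retry count and delays are arbitrary:

```python
import time


def call_with_backoff(func, *args, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call func(*args); if it returns False, wait and retry with doubling delays."""
    delay = base_delay
    for attempt in range(retries + 1):
        result = func(*args)
        if result is not False:
            return result
        if attempt < retries:
            sleep(delay)  # wait before the next attempt
            delay *= 2
    return False  # every attempt returned False
```

With the library installed, this would be used as call_with_backoff(vb.meaning, "car"). The sleep parameter exists only so the waiting can be swapped out in tests.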

JSON vs Python objects

Why do all the methods return a string of JSON? What advantage does this have over returning plain Python objects (dicts, lists, etc)?

It seems a little backwards to return JSON as part of the API that's supposed to be easy to use - now I also have to parse that myself. If you require JSON strings, perhaps you could provide a keyword argument, eg json=True or json=False that will handle this accordingly.
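The keyword-argument idea could look roughly like this. A sketch only: format_response and as_json are hypothetical names, not the library's actual API.

```python
import json


def format_response(data, as_json=False):
    """Return data as plain Python objects, or as a JSON string when as_json is True."""
    return json.dumps(data) if as_json else data


entries = [{"seq": 0, "text": "automobile"}]
print(format_response(entries))                # Python list of dicts
print(format_response(entries, as_json=True))  # JSON string
```

Defaulting to Python objects means callers who want JSON opt in explicitly, and nobody has to parse a string they did not ask for.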

Use nose for finding tests

The current way of running tests in the CI is kind of hacky. Using nose to find and run the tests would be preferable.

[Problems] Inconsistent data

  • Pronunciation 'seq' key is always set to 0
  • PartOfSpeech doesn't return all data gotten from API, only returns the first
  • Antonym value is always a list (Known Issue) and doesn't return all data received from the API; this is because the API returns a list and the package handles it as if it's text and just sets it without parsing it first

Initial Update

Hi 👊

This is my first visit to this fine repo, but it seems you have been working hard to keep all dependencies updated so far.

Once you have closed this issue, I'll create separate pull requests for every update as soon as I find one.

That's it for now!

Happy merging! 🤖

Installation fails on Windows

On Windows using Python 3.4, I get an error during installation.
The error happens during install through pip, as well as installation through git-clone / setup.py.

See #1 for fix - explicitly passing encoding="utf8" to the open method when reading the README file.
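The fix described amounts to naming the encoding when reading the file, so the result does not depend on the platform default. A minimal sketch; read_text is a hypothetical helper, and io.open is used because it accepts encoding= on both Python 2 and Python 3:

```python
import io


def read_text(path):
    """Read a text file as UTF-8, independent of the platform default encoding."""
    with io.open(path, encoding="utf-8") as f:
        return f.read()
```

In setup.py this might be used as long_description = read_text("README.rst"), where the file name is only an example.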
