Comments (2)
It would be nice to have a standard interface for accessing a corpus's license and citation information, as we do for the readme.
I suggest adding to the API something like:
    import os

    def license(self):
        """
        Return the contents of the corpus LICENSE file, if it exists.
        """
        # Build the path to LICENSE under the corpus root and check that it exists.
        if os.path.exists(self._root.join("LICENSE")):
            with self.open("LICENSE") as fp:
                return fp.read()
        else:
            return "No LICENSE found for this corpus (maybe check the README)"
And the same for citation(), which would look for "citation.bib".
Then we need to add the LICENSE and citation info for each corpus. I will do it for the Open Multilingual Wordnet and WordNet.
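A minimal sketch of how the two accessors could sit side by side. The `SketchCorpusReader` class and its `_read_optional` helper are hypothetical stand-ins invented for illustration, not NLTK's actual reader class, and the corpus root is assumed to be a plain directory path:

```python
import os


class SketchCorpusReader:
    """Hypothetical stand-in for a corpus reader; not NLTK's actual class."""

    def __init__(self, root):
        # Assumption: the corpus root is a plain directory path.
        self._root = root

    def open(self, name):
        # Open a file that lives directly under the corpus root.
        return open(os.path.join(self._root, name), encoding="utf-8")

    def _read_optional(self, name, fallback):
        # Hypothetical helper: return the file contents, or a fallback message.
        if os.path.exists(os.path.join(self._root, name)):
            with self.open(name) as fp:
                return fp.read()
        return fallback

    def license(self):
        """Return the contents of the corpus LICENSE file, if it exists."""
        return self._read_optional(
            "LICENSE",
            "No LICENSE found for this corpus (maybe check the README)",
        )

    def citation(self):
        """Return the contents of the corpus citation.bib file, if it exists."""
        return self._read_optional(
            "citation.bib",
            "No citation.bib found for this corpus (maybe check the README)",
        )
```

Sharing one helper keeps license() and citation() identical apart from the file name and the fallback message.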
Good idea. I've added this, but without the conditional, since I think it's fine to let Python generate an error message. I see that there are many corpora containing README.txt instead of README, so that also needs to be standardized.
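For comparison, a rough sketch of what the version without the conditional looks like. The function name `read_corpus_file` and the plain-directory root are assumptions made for illustration, not the code that was actually committed:

```python
import os


def read_corpus_file(root, name):
    """Return the contents of a file under the corpus root (hypothetical helper)."""
    # No existence check: if the file is missing, open() raises
    # FileNotFoundError and the caller sees Python's own error message.
    with open(os.path.join(root, name), encoding="utf-8") as fp:
        return fp.read()
```

With this shape, a corpus that ships no LICENSE file produces an exception naming the missing path rather than a fallback string, so callers who want a softer failure have to catch FileNotFoundError themselves.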