wzbsocialsciencecenter / germalemma
A lemmatizer for German language text
License: Apache License 2.0
Thanks for this lemmatizer! It is exactly what I have been looking for. At first, I was a bit disappointed to read in the README that the pattern package would only support Python 2.
Fortunately, version 3.6 of pattern, which was released on PyPI in August 2018, supports Python 3. I was able to install it by just typing pip install pattern==3.6 and it works together with germalemma without any problems.
Thus, this passage in the README could be corrected.
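One way to confirm which pattern release is installed before relying on its Python 3 support is to compare its version against 3.6. The following is a minimal sketch using only the standard library. The helper names parse_version, supports_python3 and installed_pattern_version are hypothetical, and the distribution name "Pattern" is an assumption about how the package is registered in the local environment.

```python
import itertools
from importlib import metadata


def parse_version(version):
    """Turn a version string such as '3.6.0' into a tuple of ints."""
    parts = []
    for piece in version.split("."):
        # Keep only the leading digits of each component.
        digits = "".join(itertools.takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def supports_python3(version):
    """Assumption from this thread: 3.6 is the first release usable on Python 3."""
    return parse_version(version) >= (3, 6)


def installed_pattern_version(dist_name="Pattern"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_pattern_version()
    if found is None:
        print("pattern does not appear to be installed")
    else:
        print(found, "Python 3 capable:", supports_python3(found))
```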
When I try to install using the $ pip3 install -U germalemma command, I get this error:
Collecting germalemma
Using cached https://files.pythonhosted.org/packages/ff/f9/9fb28336e480b0e3744a8633813f9e1bc3f49a4eb3d7f6ad23e923a5a5b4/germalemma-0.1.1-py3-none-any.whl
Requirement already satisfied, skipping upgrade: Pyphen>=0.9.5 in /usr/local/lib/python3.7/site-packages (from germalemma) (0.9.5)
Collecting Pattern>=3.6 (from germalemma)
Using cached https://files.pythonhosted.org/packages/1e/07/b0e61b6c818ed4b6145fe01d1c341223aa6cfbc3928538ad1f2b890924a3/Pattern-3.6.0.tar.gz
Collecting future (from Pattern>=3.6->germalemma)
Using cached https://files.pythonhosted.org/packages/90/52/e20466b85000a181e1e144fd8305caf2cf475e2f9674e797b222f8105f5f/future-0.17.1.tar.gz
Collecting backports.csv (from Pattern>=3.6->germalemma)
Using cached https://files.pythonhosted.org/packages/71/f7/5db9136de67021a6dce4eefbe50d46aa043e59ebb11c83d4ecfeb47b686e/backports.csv-1.0.6-py2.py3-none-any.whl
Collecting mysqlclient (from Pattern>=3.6->germalemma)
Using cached https://files.pythonhosted.org/packages/f4/f1/3bb6f64ca7a429729413e6556b7ba5976df06019a5245a43d36032f1061e/mysqlclient-1.4.2.post1.tar.gz
Complete output from command python setup.py egg_info:
/bin/sh: mysql_config: command not found
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/private/var/folders/_b/h6xt6cb97hx66z_101nmhmqc0000gp/T/pip-install-qgfzyd98/mysqlclient/setup.py", line 16, in <module>
metadata, options = get_config()
File "/private/var/folders/_b/h6xt6cb97hx66z_101nmhmqc0000gp/T/pip-install-qgfzyd98/mysqlclient/setup_posix.py", line 51, in get_config
libs = mysql_config("libs")
File "/private/var/folders/_b/h6xt6cb97hx66z_101nmhmqc0000gp/T/pip-install-qgfzyd98/mysqlclient/setup_posix.py", line 29, in mysql_config
raise EnvironmentError("%s not found" % (_mysql_config_path,))
OSError: mysql_config not found
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/_b/h6xt6cb97hx66z_101nmhmqc0000gp/T/pip-install-qgfzyd98/mysqlclient/
Version: '0.1.1'
Python: 3.7
Solution: Install MySQL using brew or pip:
$ brew install mysql
$ pip3 install mysql
I'll open a PR for this.
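The traceback above stops at "mysql_config: command not found", so it may help to check for that tool before running pip again. This is a minimal sketch under that assumption. The helper names find_build_tool and describe are hypothetical; shutil.which is the standard library call doing the lookup.

```python
import shutil


def find_build_tool(name="mysql_config"):
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


def describe(name="mysql_config"):
    """Build a short, human-readable status line for the lookup."""
    path = find_build_tool(name)
    if path is None:
        return f"{name} not found on PATH"
    return f"{name} found at {path}"


if __name__ == "__main__":
    # If this reports "not found", the mysqlclient build is likely to fail
    # in the same way as in the log above.
    print(describe())
```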
Hello there, I am relatively new to programming in general and language processing in particular, so please excuse if this seems like a silly problem:
After installing germalemma using pip, downloading the TIGER corpus in the CONLL09 format and also installing Pyphen, I am struggling to convert the TIGER corpus into pickle format. (I used the Anaconda prompt for all of this, which seemed useful as I use Jupyter Notebooks to code.)
When I apply the code specified in the README, I get this error:
"python: can't open file 'germalemma/__init__.py': [Errno 2] No such file or directory"
I suspect I might have made a mistake with a detail a more experienced person would find too obvious to specify in the document?
Any help, suggestion or hint would be greatly appreciated! Thank you.
(Maybe there is even a more detailed tutorial on using germalemma somewhere? So far I haven't found anything)
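A possible cause, which is an assumption and not confirmed in this thread, is that the relative path in the command only resolves from inside a checkout of the repository, while pip places the package in the environment's site-packages directory. The sketch below shows how one might look up where an installed package actually lives. The helper name package_init_path is hypothetical.

```python
import importlib.util


def package_init_path(package_name):
    """Return the file a top-level package is loaded from, or None.

    For a regular package this is the path to its __init__.py.
    """
    try:
        spec = importlib.util.find_spec(package_name)
    except (ImportError, ValueError):
        return None
    if spec is None or spec.origin is None:
        return None
    return spec.origin


if __name__ == "__main__":
    path = package_init_path("germalemma")
    if path is None:
        print("germalemma is not importable from this interpreter")
    else:
        # This absolute path could be passed to python instead of a
        # relative one that depends on the current working directory.
        print(path)
```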
Hi,
I am not sure if I understood the code right, but I am guessing that this library uses pattern to find the lemmas in the TIGER corpus and then generates a file lemmata.pickle that can be loaded and used directly.
My question is: is it possible for me to just have the lemmata.pickle and not have to do all these steps? This would make life much easier (I just need a simple lemmatizer, and this looked perfect).
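For readers unfamiliar with pickle files, the general idea of saving a lemma table once and loading it later can be sketched as follows. The dictionary layout and the helper names save_lemmata and load_lemmata are hypothetical and chosen for illustration only; the actual structure of lemmata.pickle is not described in this thread. Note that pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.

```python
import io
import pickle


def save_lemmata(lemmata, fileobj):
    """Serialize a lemma table to a binary file-like object."""
    pickle.dump(lemmata, fileobj)


def load_lemmata(fileobj):
    """Read a lemma table back from a binary file-like object."""
    return pickle.load(fileobj)


# Toy data with a made-up layout: part-of-speech tag -> word form -> lemma.
toy = {"N": {"Häuser": "Haus"}, "V": {"ging": "gehen"}}

# An in-memory buffer stands in for a file opened with open(path, "wb").
buffer = io.BytesIO()
save_lemmata(toy, buffer)
buffer.seek(0)
restored = load_lemmata(buffer)
```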