emilmont / pystatparser
Simple Python Statistical Parser
License: Apache License 2.0
This is a great parser, thanks for sharing it!
Just wondering if you could include or link to a list of all the part-of-speech tags used and their descriptions (similar to this). I've been asking the internet, but haven't yet found a list that has all the tags that show up in pyStatParser, so I have to look up each one individually. Thank you!
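As a starting point, the labels in the sample output further down this page match the Penn Treebank tag set, so a small lookup table covers them. This is a sketch, not an official list from the project: the table only includes labels seen on this page, and treating `+` as a separator between collapsed labels (as in `S+VP`) is an assumption.

```python
# Penn Treebank descriptions for the labels that appear in the sample
# output on this page. Not an exhaustive or project-supplied list.
PENN_TAGS = {
    "S": "simple declarative clause",
    "SBAR": "clause introduced by a subordinating conjunction",
    "NP": "noun phrase",
    "VP": "verb phrase",
    "PP": "prepositional phrase",
    "ADVP": "adverb phrase",
    "UCP": "unlike coordinated phrase",
    "CC": "coordinating conjunction",
    "DT": "determiner",
    "IN": "preposition or subordinating conjunction",
    "JJ": "adjective",
    "MD": "modal",
    "NN": "noun, singular or mass",
    "NNP": "proper noun, singular",
    "POS": "possessive ending",
    "PRP": "personal pronoun",
    "RB": "adverb",
    "VB": "verb, base form",
    "VBD": "verb, past tense",
    "VBG": "verb, gerund or present participle",
    "VBN": "verb, past participle",
    "VBZ": "verb, third person singular present",
    ",": "comma",
    ".": "sentence-final punctuation",
}


def describe(label):
    """Describe a tree label; '+' is assumed to join collapsed labels."""
    parts = label.split("+")
    return " + ".join(PENN_TAGS.get(part, "unknown tag") for part in parts)


print(describe("S+VP"))
```

Labels missing from the table come back as "unknown tag", which makes it easy to spot the ones that still need looking up.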
OH MY GOSH THANK YOU!!! I have been looking for a tool like this for the past few days and I am so relieved that someone made this. Thank you again. I am also hoping to enter the field of NLP, and this is certainly a good step towards learning more. Thank you for your good work :)
Hi, I am trying to port this into another language (GDScript). The other language deals differently with file opening, writing, etc, and also implements data types a bit differently (lists and tuples are Arrays). So some adaptations were required during porting.
I have the parser working fine as long as the models are already built. If I build them in the Python version (this repo) and copy the files from the TEMP folder, everything works. But if I have the GDScript code try to build the models, it fails.
I don't have the theoretical knowledge to troubleshoot the model building. Some files mention a Coursera course which no longer exists.
Is there any material around I could be reading to understand how to build the PCFG model and populate the TEMP dir?
Noticed when my installation tried to write to /usr/local/lib/python2.7/dist-packages/stat_parser/temp
Building the Grammar Model
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
<ipython-input-2-29f6a952f468> in <module>()
4 os.environ['DISPLAY'] = 'localhost:10.0'
5 sent = "Each of us is full of shit in our own special way"
----> 6 parser = Parser()
7 parser.parse(sent)
8 tree = parser.parse(sent) # returns nltk Tree instance
/usr/local/lib/python2.7/dist-packages/stat_parser/parser.pyc in __init__(self, pcfg)
78 def __init__(self, pcfg=None):
79 if pcfg is None:
---> 80 pcfg = build_model()
81
82 self.pcfg = pcfg
/usr/local/lib/python2.7/dist-packages/stat_parser/learn.pyc in build_model()
26
27 if not exists(TEMP_DIR):
---> 28 makedirs(TEMP_DIR)
29
30 # Normalise the treebanks
/usr/lib/python2.7/os.pyc in makedirs(name, mode)
155 if tail == curdir: # xxx/newdir/. exists if xxx/newdir exists
156 return
--> 157 mkdir(name, mode)
158
159 def removedirs(name):
OSError: [Errno 13] Permission denied: '/usr/local/lib/python2.7/dist-packages/stat_parser/temp'
Wouldn't it be better to use tempfile (https://docs.python.org/2/library/tempfile.html) or something else?
As a workaround, I manually created that directory and chmod'ed it.
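One possible shape for that suggestion, sketched in Python 3: try the directory inside the package first and fall back to a location under the system temp directory when it cannot be created or written. The function name `writable_model_dir` and the `stat_parser_models` folder name are hypothetical and not part of the library.

```python
import os
import tempfile


def writable_model_dir(preferred, app_name="stat_parser"):
    """Return `preferred` if it is usable, else a directory under the temp dir.

    Hypothetical helper: the name and the fallback folder are illustrative only.
    """
    try:
        os.makedirs(preferred, exist_ok=True)
        if os.access(preferred, os.W_OK):
            return preferred
    except OSError:
        # Covers permission errors such as Errno 13 in the traceback above.
        pass
    fallback = os.path.join(tempfile.gettempdir(), app_name + "_models")
    os.makedirs(fallback, exist_ok=True)
    return fallback
```

With something like this, a system-wide install under dist-packages would build its models in the temp location instead of failing, at the cost of rebuilding them if that location is cleared.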
Hi
Please could anyone help with why I get this error when I run the example code?
Would REALLY appreciate a response
Thank you!
Supposed to have
from nltk.grammar import CFG
grammar = CFG.fromstring("""
# Grammatical productions.
S -> NP VP
NP -> Det N PP | Det N
VP -> V NP PP | V NP | V
PP -> P NP
# Lexical productions.
NP -> 'I'
Det -> 'the' | 'a'
N -> 'man' | 'park' | 'dog' | 'telescope'
V -> 'ate' | 'saw'
P
Will pyStatParser output a CFG string for the grammar?
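Whether the library exposes its learned grammar in this form is not answered on this page. Since `parser.parse` is described above as returning an nltk Tree instance, one rough alternative is to collect the productions that the parse trees actually use, via the `productions()` method of nltk trees. The helper name `grammar_text` is hypothetical, and the result may not load back through `CFG.fromstring` if labels contain characters such as `+` or `,`.

```python
def grammar_text(trees):
    """Collect the distinct productions used by parse trees, one per line.

    `trees` is any iterable of objects with a `productions()` method,
    such as nltk Tree instances. Order of first appearance is kept.
    """
    seen = []
    for tree in trees:
        for production in tree.productions():
            line = str(production)  # e.g. "S -> NP VP"
            if line not in seen:
                seen.append(line)
    return "\n".join(seen)
```

This only recovers the rules that occur in the trees passed in, not the full grammar or its probabilities.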
The program threw an error while parsing a sentence with brackets in it. If the part in brackets is removed, the sentence gets parsed successfully.
print parser.parse ("(CCC 2313) Defending one's country against aggression is permitted, but we should never forget that every human life, from the moment of conception, is sacred because it is made in God's image and likeness.")
Traceback (most recent call last):
File "<pyshell#224>", line 1, in
print parser.parse ("(CCC 2313) Defending one's country against aggression is permitted, but we should never forget that every human life, from the moment of conception, is sacred because it is made in God's image and likeness.")
File "stat_parser\parser.py", line 111, in nltk_parse
return nltk_tree(self.raw_parse(sentence))
File "stat_parser\parser.py", line 106, in raw_parse
tree = self.norm_parse(sentence)
File "stat_parser\parser.py", line 92, in norm_parse
if is_cap_word(words[0]):
File "stat_parser\word_classes.py", line 6, in is_cap_word
return CAP.match(word) is not None
TypeError: expected string or buffer
print parser.parse ("Defending one's country against aggression is permitted, but we should never forget that every human life, from the moment of conception, is sacred because it is made in God's image and likeness.")
(S+VP
  (VBG defending)
  (NP
    (NP (PRP one) (POS 's))
    (NN country)
    (SBAR
      (IN against)
      (S
        (VP
          (VB aggression)
          (VBZ is)
          (UCP
            (VP (JJ permitted))
            (, ,)
            (CC but)
            (S
              (NP (PRP we))
              (VP
                (MD should)
                (ADVP (RB never))
                (VB forget)
                (PP (IN that) (NP (DT every) (JJ human) (NN life)))))
            (, ,)
            (PP
              (IN from)
              (NP
                (NP (DT the) (NN moment))
                (PP (IN of) (NP (NN conception)))))
            (, ,)
            (VP
              (VBZ is)
              (VBD sacred)
              (SBAR
                (IN because)
                (S
                  (NP (PRP it))
                  (VP
                    (VBZ is)
                    (VBN made)
                    (PP (IN in) (NP (NNP God) (POS 's)))))))
            (NN image)
            (CC and)
            (JJ likeness)))
        (. .)))))
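Until the bracket handling is fixed in the parser itself, a workaround consistent with the report above is to strip parenthesised spans before calling `parser.parse`. This is only a sketch of a preprocessing step: `strip_parentheticals` is a hypothetical helper, it discards the bracketed text entirely, and it does not address the underlying TypeError.

```python
import re

# Matches an innermost "( ... )" span with no nested parentheses inside it.
_PAREN = re.compile(r"\([^()]*\)")


def strip_parentheticals(sentence):
    """Remove parenthesised spans (including nested ones) from a sentence."""
    previous = None
    while previous != sentence:
        # Repeat so that nested spans are removed from the inside out.
        previous = sentence
        sentence = _PAREN.sub(" ", sentence)
    sentence = re.sub(r"\s+", " ", sentence).strip()
    # Tidy spaces left in front of punctuation, e.g. "life , is".
    return re.sub(r"\s+([,.;:!?])", r"\1", sentence)


print(strip_parentheticals("(CCC 2313) Defending one's country is permitted."))
```

The cleaned string can then be passed to the parser in place of the original sentence.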
Hi, I cloned the master branch and tried the example in the README, but encountered the following error. Does anyone know how to fix it? Thanks!
Python 2.7.12 (default, Jul 18 2016, 15:02:52)
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
from stat_parser import Parser
parser = Parser()
print parser.parse("How can the net amount of entropy of the universe be massively decreased?")
Traceback (most recent call last):
File "", line 1, in
File "stat_parser/parser.py", line 112, in nltk_parse
return nltk_tree(self.raw_parse(sentence))
File "stat_parser/parser.py", line 107, in raw_parse
tree = self.norm_parse(sentence)
File "stat_parser/parser.py", line 104, in norm_parse
return CKY(self.pcfg, norm_words)
File "stat_parser/parser.py", line 74, in CKY
_, top = max([(pi[1, n, X], bp[1, n, X]) for X in pcfg.N])
ValueError: max() arg is an empty sequence
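Reading the failing line, the list passed to `max()` has one entry per element of `pcfg.N`, so it can only be empty when `pcfg.N` itself is empty. One plausible cause, which is an assumption and not confirmed here, is that the model was not built or loaded correctly. A sketch of the same selection step with a clearer error message follows; `select_root` is a hypothetical name, and `pi` and `bp` are taken to be dictionaries keyed by `(start, end, nonterminal)` as in the traceback.

```python
def select_root(pi, bp, nonterminals, n):
    """Pick the best-scoring root for the span 1..n, failing with a clear message."""
    nonterminals = list(nonterminals)
    if not nonterminals:
        raise ValueError(
            "grammar has no nonterminals; the model may not have been "
            "built or loaded"
        )
    # Compare on the score only, then return the score with its backpointer.
    score, top = max(
        ((pi[1, n, X], bp[1, n, X]) for X in nonterminals),
        key=lambda pair: pair[0],
    )
    return score, top
```

If the error does turn out to come from an empty model, deleting the generated model files and letting the parser rebuild them may be worth trying.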
Hi,
I can't seem to install the library from PyPI using pip3. I think I've tried every possible variation of the pyStatParser name.
What is the package name when installing with pip3? Cloning the repository and installing with python setup.py install --user (python3) works fine.
Hi,
I have a dataset of speech that has been converted to text. The resulting sentences contain errors and need to be corrected. Can your parser repair the incorrect text?
parser.norm_parse(text) returns None for some sentences, hence parser.parse(text) crashes because it uses norm_parse internally.
The string "this is bad" falls into that category and cannot be parsed with norm_parse, but "this is very bad" works well.
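A caller-side guard for this, sketched with the two method names mentioned above: check `norm_parse` first and skip `parse` when it returns None. `parse_or_none` is a hypothetical wrapper, and it parses each sentence twice, so it trades speed for safety.

```python
def parse_or_none(parser, text):
    """Return parser.parse(text), or None when norm_parse finds no parse."""
    if parser.norm_parse(text) is None:
        # Reported above for inputs such as "this is bad".
        return None
    return parser.parse(text)
```

Callers can then treat a None result as "no parse found" instead of handling a crash.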
It looks like this package is currently unavailable on PyPI. Would you consider uploading it for ease of use?