java-probabilistic-earley-parser's People

Contributors

digitalheir

java-probabilistic-earley-parser's Issues

Do not allow malformed grammars

Ensure that the probabilities in a SCFG are proper and consistent as defined in Booth and Thompson (1973), and that the grammar contains no useless nonterminals (ones that can never appear in a derivation).

Also check that no rule is listed twice with different probabilities (in which case we either have undefined behaviour or conflate the rules?).
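A minimal sketch of two of these checks, assuming a hypothetical `Rule` type invented here for illustration (it is not the library's own rule class). It tests only properness (the probabilities of all rules sharing a left-hand side sum to 1) and duplicated rules; consistency in the sense of Booth and Thompson and the search for useless nonterminals are not covered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class GrammarCheck {
    /** Hypothetical rule type for illustration; not the library's Rule class. */
    static final class Rule {
        final String lhs;
        final List<String> rhs;
        final double probability;

        Rule(String lhs, List<String> rhs, double probability) {
            this.lhs = lhs;
            this.rhs = rhs;
            this.probability = probability;
        }
    }

    /** Returns a human-readable list of problems; an empty list means both checks passed. */
    static List<String> validate(List<Rule> rules, double tolerance) {
        List<String> problems = new ArrayList<>();
        Map<String, Double> massPerLhs = new LinkedHashMap<>();
        Set<String> seen = new HashSet<>();

        for (Rule rule : rules) {
            String key = rule.lhs + " -> " + String.join(" ", rule.rhs);
            if (rule.probability <= 0.0 || rule.probability > 1.0) {
                problems.add("probability out of range (0, 1]: " + key);
            }
            // The same left-hand side and right-hand side seen before: a doubled rule.
            if (!seen.add(key)) {
                problems.add("duplicate rule: " + key);
            }
            massPerLhs.merge(rule.lhs, rule.probability, Double::sum);
        }

        // Properness: the rules for each left-hand side must sum to 1.
        for (Map.Entry<String, Double> entry : massPerLhs.entrySet()) {
            if (Math.abs(entry.getValue() - 1.0) > tolerance) {
                problems.add("probabilities for " + entry.getKey()
                        + " sum to " + entry.getValue());
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<Rule> rules = Arrays.asList(
                new Rule("S", Arrays.asList("a"), 0.5),
                new Rule("S", Arrays.asList("a"), 0.3));
        validate(rules, 1e-9).forEach(System.out::println);
    }
}
```

The example in `main` trips both checks at once, since the doubled rule also leaves the probability mass for `S` short of 1.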

EXAMPLE PROJECT

Could you please create an example project from this code that I can run from the terminal on macOS? I am not familiar with Maven.
Thanks a lot in advance.

P.S. If you make this example project, please tell me how to run it. Thanks!

left-recursive grammar breaks the parser

I tried the following grammar:

S -> a
S -> S a

Reading it like this:

Grammar<String> grammar = Grammar.parse(
                Paths.get("/some/path/test.cfg"), Charset.forName("UTF-8"));

Results in:

java.lang.RuntimeException: Matrix is singular.

	at org.leibnizcenter.cfg.algebra.matrix.LUDecomposition.solve(LUDecomposition.java:140)
	at org.leibnizcenter.cfg.algebra.matrix.Matrix.solve(Matrix.java:346)
	at org.leibnizcenter.cfg.algebra.matrix.Matrix.inverse(Matrix.java:357)
	at org.leibnizcenter.cfg.grammar.Grammar.getReflexiveTransitiveClosure(Grammar.java:134)
	at org.leibnizcenter.cfg.grammar.Grammar.<init>(Grammar.java:102)
	at org.leibnizcenter.cfg.grammar.Grammar$Builder.build(Grammar.java:416)
	at org.leibnizcenter.cfg.grammar.Grammar.parse(Grammar.java:183)
	at org.leibnizcenter.cfg.grammar.Grammar.parse(Grammar.java:166)
	at com.vision4j.internal.cli.PlayTest.cfg(PlayTest.java:48)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
	at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
	at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
	at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51)
	at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:237)
	at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)

Converting the grammar to right-recursive avoids this issue:

S -> a
S -> a S

I am using the latest version in Maven: 0.9.12
Is there something I misunderstood about the behaviour of the grammar, or is this a bug?
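One possible reading of the stack trace, offered as a guess rather than a confirmed diagnosis: Stolcke's algorithm precomputes the reflexive transitive closure of the left-corner relation as R_L = (I - P_L)^-1. If both rules end up with probability 1.0 (an assumption about how rules without an explicit probability are read), then P_L(S, S) = 1, so I - P_L is the 1x1 zero matrix and cannot be inverted. In the right-recursive version the left corner of S is the terminal a, so P_L(S, S) = 0 and the inverse exists. The sketch below works through the one-nonterminal case; `LeftCornerClosure` and `closure` are names made up for this illustration.

```java
final class LeftCornerClosure {
    /**
     * Closure for a grammar whose only nonterminal is S, where the
     * left-recursive rule S -> S a has probability p. The 1x1 matrix
     * I - P_L is (1 - p), and its inverse is the geometric series
     * 1 + p + p^2 + ... = 1 / (1 - p).
     */
    static double closure(double p) {
        double denominator = 1.0 - p; // the single entry of I - P_L
        if (denominator == 0.0) {
            // Mirrors the situation in which a matrix inverse has no solution.
            throw new ArithmeticException("Matrix is singular.");
        }
        return 1.0 / denominator;
    }

    public static void main(String[] args) {
        // S -> a (0.5), S -> S a (0.5): the series converges.
        System.out.println(closure(0.5)); // prints 2.0

        // S -> a (1.0), S -> S a (1.0): I - P_L is zero.
        try {
            closure(1.0);
        } catch (ArithmeticException e) {
            System.out.println(e.getMessage()); // prints Matrix is singular.
        }
    }
}
```

If this reading is right, giving the two rules probabilities that sum to 1 should let the left-recursive grammar load, but that has not been verified against the library.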

Question regarding grammar

Is it possible to write a grammar to parse the following pattern:

...anything RULE1 anything RULE2 anything...

What I want is to match the rules defined in the sentence and ignore the noise ("anything" may be any characters).

Writing cf-grammars without probabilities

Dear colleagues,

I'm looking for a good Earley parser for writing cf-grammars, and this one seems friendly to me. Could you please tell me whether this parser allows writing cf-grammars without setting probabilities?

P.S. I need a Java parser like Lark (Python) for writing rules directly.

Thanks,
Daria

Example of drawing a parse tree when using JPEP as a library?

Hi again,

The example of how to use JPEP as a library is for "Parser.recognize". It would be nice to add a "println" of a parse tree, just like the command-line app does.

PS: in a previous issue, I mentioned I'm struggling with CommandLine's "argument magic". What I meant is that CommandLine draws the parse by calling "System.out.println(parse.parseTree);", where "parse" is an object of class "ParseTreeWithScore", and the arguments to the ParseTreeWithScore are taken from the command-line arguments in a somewhat complex way (to me, at least). So I guess the question is how to build an object of type "ParseTreeWithScore" when using JPEP as a library, given a particular grammar and a set of tokens (as you would from the command line).

Again, thanks and regards!

Implement ε-rules (empty rules)

The parser currently can't handle rules of the form

X → ε            (p)

where ε is the empty string.

See section 4.7 Null Productions on page 19 of Stolcke's paper.

We have the choice of extending prediction and completion to work with ε-rules, but this is a bit complicated. Another possibility is to rewrite the grammar to eliminate these productions, described at the end of page 20, 4.7.4 Eliminating null productions.

Best to implement the simpler solution first, and implement the philosophically correct version later.
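As a rough illustration of the first step of the grammar-rewriting route, here is a sketch that finds which nonterminals can derive ε at all. It assumes a plain map from each left-hand side to its right-hand sides (a representation invented for this example, not the library's grammar class) and ignores probabilities entirely; a full probabilistic elimination would also need the probability with which each nonterminal derives ε, which is not shown.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class NullableSymbols {
    /**
     * Fixed-point computation of the nullable nonterminals.
     * An empty right-hand side stands for an ε-rule.
     */
    static Set<String> nullable(Map<String, List<List<String>>> rules) {
        Set<String> result = new HashSet<>();
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Map.Entry<String, List<List<String>>> entry : rules.entrySet()) {
                if (result.contains(entry.getKey())) {
                    continue;
                }
                for (List<String> rhs : entry.getValue()) {
                    // Nullable if every symbol on the right-hand side is nullable;
                    // this holds trivially for the empty right-hand side.
                    if (result.containsAll(rhs)) {
                        result.add(entry.getKey());
                        changed = true;
                        break;
                    }
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<List<String>>> rules = new HashMap<>();
        // S -> A B | a
        rules.put("S", Arrays.asList(Arrays.asList("A", "B"), Arrays.asList("a")));
        // A -> ε | a
        rules.put("A", Arrays.asList(Collections.<String>emptyList(), Arrays.asList("a")));
        // B -> A A
        rules.put("B", Arrays.asList(Arrays.asList("A", "A")));

        System.out.println(new TreeSet<>(nullable(rules))); // prints [A, B, S]
    }
}
```

Terminals never enter the result set, so any right-hand side containing a terminal is correctly treated as non-nullable.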

Error: "The method parse(Path, Charset) is undefined for the type Grammar"

Hello,

I'm trying to use java-probabilistic-earley-parser as a library. Following the instructions:

You can parse .cfg files as follows:

Grammar<String> g = Grammar.parse(Paths.get("path", "to", "grammar.cfg"), Charset.forName("UTF-8"));

I get (in Eclipse) the error:
Error: "The method parse(Path, Charset) is undefined for the type Grammar"

I'm not using the Maven dependency, I'm just adding the latest jar to my project.

From the command line, everything works and I get a nice parse tree based on my grammar file, but the CommandLine class does some "magic" with the arguments and I'm struggling to figure out how to do the equivalent thing without command-line arguments.

Thanks in advance!
