
Comments (10)

digitalheir commented on June 15, 2024

You don't need Maven strictly, you can just include the jar on the classpath.
https://github.com/digitalheir/java-probabilistic-earley-parser/releases

I only just realized you might not know how to compile Java code. I had assumed that anyone who wants to use this project is a Java developer, so I'm quite interested in your use case. Could you share it?

Anyway, it's a good idea to make the product runnable for anyone who can at least write a context-free grammar. I'll work on that at some point. It shouldn't be much work.

I don't have access to a dev device currently and I'm working full time, so I can make a project for you probably next week.

from java-probabilistic-earley-parser.

 avatar commented on June 15, 2024

I meant whether your project requires any special settings to be executed. Anyway, I know how to run Java code :-) I'm a computer engineer. Thanks a lot! I'll wait for your example project next week, if you can. Thanks again! :-)


digitalheir commented on June 15, 2024

I released a new version with usage details, check it out: https://github.com/digitalheir/java-probabilistic-earley-parser/releases

Thanks for reporting this. It forced me to run through a command-line example and fix a bug!

Let me know how it works for you.


 avatar commented on June 15, 2024

Hi! Your code works, but there is a small problem.
If I type:

java -jar probabilistic-earley-parser-0.9.11-jar-with-dependencies.jar -i grammar.cfg -goal S the heavy heave

The result is

0.128
└──
    └── S
        ├── NP
        │   ├── Det
        │   │   └── the (the)
        │   └── N
        │       └── heavy (heavy)
        └── VP
            └── V
                └── heave (heave)

In this case the input string belongs to the grammar. But if I type a string containing a word that doesn't belong to the input grammar, like:

java -jar probabilistic-earley-parser-0.9.11-jar-with-dependencies.jar -i grammar.cfg -goal S the heavy ball

("ball" doesn't belong to the grammar), the result is:

Exception in thread "main" java.lang.NullPointerException
at org.leibnizcenter.cfg.earleyparser.CommandLine.main(CommandLine.java:44)

There is a problem in the CommandLine class.


digitalheir commented on June 15, 2024

Well, what do you expect? It's an illegal sentence, so the code throws an exception. I'll make the output a bit prettier, I guess, as part of the error handling issue #5.


 avatar commented on June 15, 2024

The output should be an error such as: symbol "ball" doesn't belong to the input CFG.
We should recover from this error as explained at the end of issue #5, for example by inserting a word with high probability.


digitalheir commented on June 15, 2024

Hi Dan. Additional functionality is coming. I'll try to finish it this evening. I'll add command line options for different scan modes, either:

  1. throw an error (strict mode as it is now, with better logging);
  2. ignore the unfound word (act as if it didn't exist);
  3. replace the unfound word with a wildcard that matches any category

It is difficult to pick a word from the lexicon, because the parser does not know (and should not know) the words that follow, and so cannot tell which words would make a correct sentence.

For the wildcard option, you'll find the most likely category with the Viterbi parse. In post-processing you can select a random word from your lexicon, but I don't know if that should be a task for this library, because it seems pretty application-specific to me.
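
To make the three modes concrete, here is a minimal Java sketch of how they could treat a word that is missing from the lexicon. It is only an illustration: the names `ScanModes`, `ScanMode`, `prepare` and the `<*>` placeholder are made up for this example and are not the library's API. It also works on the token list before parsing, whereas a real parser would make this decision while scanning.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these names come from the library.
final class ScanModes {
    enum ScanMode { STRICT, DROP, WILDCARD }

    // Placeholder standing in for "matches any category" (an assumption of this sketch).
    static final String WILDCARD_TOKEN = "<*>";

    /** Applies the chosen mode to every token that is missing from the lexicon. */
    static List<String> prepare(List<String> tokens, Set<String> lexicon, ScanMode mode) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            if (lexicon.contains(token)) {
                out.add(token);
                continue;
            }
            switch (mode) {
                case STRICT:
                    // Mode 1: fail with a readable message.
                    throw new IllegalArgumentException(
                            "Token \"" + token + "\" does not occur in the grammar");
                case DROP:
                    // Mode 2: act as if the word was never there.
                    break;
                case WILDCARD:
                    // Mode 3: keep the position, but let it match any category.
                    out.add(WILDCARD_TOKEN);
                    break;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> lexicon = new HashSet<>(Arrays.asList("the", "heavy", "heave"));
        List<String> sentence = Arrays.asList("the", "heavy", "ball");
        System.out.println(prepare(sentence, lexicon, ScanMode.DROP));     // [the, heavy]
        System.out.println(prepare(sentence, lexicon, ScanMode.WILDCARD)); // [the, heavy, <*>]
    }
}
```

With `ScanMode.STRICT` the same input throws an `IllegalArgumentException` that names the offending token, which is roughly the kind of message asked for earlier in this thread.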


 avatar commented on June 15, 2024

Thanks a lot! You're the best 😃


digitalheir commented on June 15, 2024

I am building the new version right now. 0.9.12 will be available soon.

You can set the parse mode to lenient using either -scanmode drop or -scanmode wildcard.

I'm still thinking about how to communicate error events following an incident like this.

I'm thinking of passing a list of Exceptions to the ParseTree object that the parser ends up with.
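
A rough sketch of that idea, assuming the result object simply carries the non-fatal problems next to the tree. `ParseOutcome` and its methods are hypothetical names for illustration only, and the tree is a plain `String` stand-in rather than the library's `ParseTree`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: ParseOutcome is an illustrative name, not the library's API.
final class ParseOutcome<T> {
    private final T tree;                   // may be null if nothing could be parsed
    private final List<Exception> problems; // non-fatal issues collected while scanning

    ParseOutcome(T tree, List<Exception> problems) {
        this.tree = tree;
        // Defensive copy so later changes to the caller's list do not leak in.
        this.problems = Collections.unmodifiableList(new ArrayList<>(problems));
    }

    T getTree() { return tree; }

    List<Exception> getProblems() { return problems; }

    boolean isClean() { return problems.isEmpty(); }

    public static void main(String[] args) {
        List<Exception> problems = new ArrayList<>();
        problems.add(new IllegalArgumentException("Unknown token \"ball\" at position 2"));

        // A String stands in for the real tree type in this sketch.
        ParseOutcome<String> outcome = new ParseOutcome<>("tree placeholder", problems);

        if (!outcome.isClean()) {
            for (Exception problem : outcome.getProblems()) {
                System.err.println("warning: " + problem.getMessage());
            }
        }
    }
}
```

The caller still gets a tree in lenient modes and can decide whether any collected problem should be treated as fatal.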


 avatar commented on June 15, 2024

Thanks again!
Yes, that could be a great solution. To resolve an error caused by a wrong word in the input string, you can substitute it with another word (from a production with the same head) that has a higher probability.
Have you implemented the "synchronizing token" method for error recovery?
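
The substitution idea could look roughly like this in post-processing, assuming the category of the unknown word is already known (for example from a wildcard parse) and the lexical rule probabilities for that category are available as a map. `LexiconFallback`, `mostLikelyWord`, and the probabilities below are made up for illustration and are not taken from the library or from any real grammar.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: not part of the library.
final class LexiconFallback {

    /**
     * Returns the word with the highest rule probability among the lexical
     * rules of one category, or an empty Optional if the category has no words.
     */
    static Optional<String> mostLikelyWord(Map<String, Double> wordProbabilities) {
        return wordProbabilities.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }

    public static void main(String[] args) {
        // Made-up probabilities for rules of the form N -> word.
        Map<String, Double> nouns = new LinkedHashMap<>();
        nouns.put("heavy", 0.3);
        nouns.put("man", 0.7);

        // "ball" is unknown; replace it with the most likely word of category N.
        String replacement = mostLikelyWord(nouns).orElse("(no candidate)");
        System.out.println("ball -> " + replacement); // ball -> man
    }
}
```

This only picks a word within a single category. It does not look at the surrounding words, so the result can still be an unlikely sentence.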
