nltk_tgrep's Issues

Fix regexp node names

The node name /SBJ/ should match the node names SBJ, SBJ1, and NP-SBJ. Currently, it doesn't match the last one, because I'm using re.match, which only matches at the start of the string. This should be changed to re.search.
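
The difference can be seen with the standard library alone. This is a minimal sketch of the two calls on the labels mentioned above, not the nltk_tgrep code itself; the extra label NP is added here only as a negative case.

```python
import re

pattern = re.compile(r"SBJ")
labels = ["SBJ", "SBJ1", "NP-SBJ", "NP"]

# re.match only succeeds if the pattern matches at the start of the string.
matched_at_start = [lab for lab in labels if pattern.match(lab)]

# re.search succeeds if the pattern matches anywhere in the string.
matched_anywhere = [lab for lab in labels if pattern.search(lab)]

print(matched_at_start)   # ['SBJ', 'SBJ1']
print(matched_anywhere)   # ['SBJ', 'SBJ1', 'NP-SBJ']
```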

Implement tgrep2 labeled nodes and segmented patterns

Sections 4.6 and 4.7 of the tgrep2 manual.

  1. How complicated is the distinction between back-links and cross-links?
  2. Is it sufficient to greedily assign the first matching node to a given label, or would a proper implementation of linking potentially mean back-tracking during matching?

Implement case insensitive node names with i@

This is implemented in _tgrep_node_action, but is not defined in the search string grammar. It could probably be hacked in by tweaking the regexp for node names, though perhaps it should be done properly, with a constant 'i@' prefix allowed only before node literal and node regexp values (so that one can't write i@*).

There is a commented-out test case for this.
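
One way the "proper" variant might behave is sketched below. The helper make_node_predicate is hypothetical and is not part of nltk_tgrep; it assumes that a node spec is either the wildcard *, a /regexp/, or a literal, and that i@ is accepted only before the latter two.

```python
import re


def make_node_predicate(spec):
    """Hypothetical helper: turn a node-name spec into a predicate on labels."""
    ignore_case = spec.startswith("i@")
    body = spec[2:] if ignore_case else spec

    if body == "*":
        # Assumption from the issue text: i@* should not be writable.
        if ignore_case:
            raise ValueError("i@ cannot prefix the wildcard *")
        return lambda label: True

    if len(body) >= 2 and body.startswith("/") and body.endswith("/"):
        # Node regexp: compile with or without the IGNORECASE flag.
        regex = re.compile(body[1:-1], re.IGNORECASE if ignore_case else 0)
        return lambda label: regex.search(label) is not None

    if ignore_case:
        # Node literal, compared case-insensitively.
        return lambda label: label.lower() == body.lower()
    return lambda label: label == body
```

With this sketch, make_node_predicate("i@np") accepts the label NP, while make_node_predicate("np") does not.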

Checking for terminals?

Hi @wroberts, just wanted to say this tool is incredibly valuable! Really a big time saver.

This question is more of a tgrep question, but I thought I might pose it here... I'm trying to develop a tregex for A with no children, or, in other words A as a leaf.

I've tried

"A < {}", "A !< *", and a few others...

Do you know how I can form this tregex or will I need to check the ParsedTree for subtrees of size 0?
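
If it comes to checking the tree directly, a walk like the following might do. It is only a sketch: MiniTree is a stand-in class so the example runs without NLTK, and childless_positions is a hypothetical helper. It assumes nodes are list-like and expose label(), as nltk.tree.Tree does in NLTK 3, and that leaves are plain strings. Whether a tgrep pattern can express the same thing is not settled here.

```python
class MiniTree(list):
    """Stand-in for nltk.tree.Tree so the sketch runs without NLTK installed."""

    def __init__(self, label, children=()):
        super().__init__(children)
        self._label = label

    def label(self):
        return self._label


def childless_positions(tree, name, pos=()):
    """Yield tree positions of nodes named `name` that have no children."""
    if isinstance(tree, str):
        # A string leaf has no children by definition.
        if tree == name:
            yield pos
        return
    if tree.label() == name and len(tree) == 0:
        # A subtree node with an empty child list.
        yield pos
    for i, child in enumerate(tree):
        yield from childless_positions(child, name, pos + (i,))


tree = MiniTree("S", [MiniTree("A", [MiniTree("B", ["x"])]), MiniTree("A"), "A"])
print(list(childless_positions(tree, "A")))  # [(1,), (2,)]
```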

Fix quoted node names

Node name quotation permits node names to contain quotes (these must be escaped with a backslash).

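Currently, the code doesn't check for this, or perform un-escaping. The current code calls .strip('"'), which is wrong: it removes every leading and trailing quote character, including an escaped one at the end of the name, and leaves the backslashes in place.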
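
A possible replacement is sketched below. The helper unquote_node_name is hypothetical, and the rule that a backslash followed by any character stands for that character is an assumption; the exact escaping rules should be checked against the tgrep2 manual.

```python
import re


def unquote_node_name(token):
    """Hypothetical helper: drop the surrounding quotes, then un-escape."""
    if len(token) < 2 or token[0] != '"' or token[-1] != '"':
        raise ValueError("not a quoted node name: %r" % (token,))
    inner = token[1:-1]  # remove exactly one quote from each end
    # Assumption: a backslash followed by any character means that character.
    return re.sub(r"\\(.)", r"\1", inner)


token = r'"say \"hi\""'
print(token.strip('"'))          # say \"hi\
print(unquote_node_name(token))  # say "hi"
```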

Write unit tests for remaining untested code

  • catching AttributeError in various places
  • node name formats:
    • tgrep2 print command (' prefix to node name)
    • quoted node names
    • i@ case insensitive node names
  • link relations:
    • <<, <<1
    • ,
    • <<'
    • '
    • undefined link relation (raises AssertionError right now)
  • treepositions_no_leaves function
  • tgrep_positions with search_leaves set to False

See, e.g., first coveralls build

NLTK 3.0.0 compatibility

Hello,

Your readme example won't work in NLTK 3.0.0:

tree = ParentedTree('(S (NP (DT the) (JJ big) (NN dog)) (VP bit) (NP (DT a) (NN cat)))')
ParentedTree: Expected a node value and child list

This works, however:

tree = ParentedTree.fromstring('(S (NP (DT the) (JJ big) (NN dog)) (VP bit) (NP (DT a) (NN cat)))')
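
Code that has to run against both behaviours could try fromstring first. The sketch below is an assumption about how such a shim might look, not something from the readme; parse_tree is a hypothetical helper, and tree_class is meant to be ParentedTree.

```python
def parse_tree(tree_class, text):
    """Hypothetical shim: parse a bracketed string with whichever API exists."""
    fromstring = getattr(tree_class, "fromstring", None)
    if fromstring is not None:
        # NLTK 3.0.0 style: dedicated classmethod for bracketed strings.
        return fromstring(text)
    # Older style, as in the readme example: the constructor parses the string.
    return tree_class(text)
```

It would be called as parse_tree(ParentedTree, '(S (NP (DT the) (NN dog)) (VP bit))').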
