
Comments (8)

MegaIng commented on May 29, 2024

@munching We aren't talking about your issue directly. If lark behaved correctly, you would have gotten an AttributeError from accessing those names on .data, which probably would have clued you in faster. That is what we are talking about fixing.

For collecting comments we normally suggest terminal callbacks, but that doesn't easily bring them into the tree; that takes a bit of extra work.

from lark.

MegaIng commented on May 29, 2024

node.data isn't part of the input, but of the grammar. Use node.meta instead.

@erezsh We really need to change it back so that Tree doesn't take Tokens, but just their values. Having those be Tokens causes many issues.


munching commented on May 29, 2024

Hi @MegaIng
Thank you for the quick response. I totally misunderstood the "data" thing, now it works fine. Thank you very much for helping!


erezsh commented on May 29, 2024

@MegaIng Yeah, makes sense. The tokens of the parsed grammar aren't relevant to the output tree.


munching commented on May 29, 2024

Not sure what you mean by that; I'm relatively new to Lark. But trying to access start_pos / end_pos of tokens is my attempt to bring comments that were ignored at the parsing stage into the tree. I've done some research and it looks like there isn't an easy way to do that. I'm parsing a Pascal-like language and must use the LALR parser because of its speed: my usual input is roughly 350 MB of source code, and Earley is unacceptably slow. I tried to rewrite my grammar to not ignore the comments, but that's probably not possible with LALR. So in my case, having tokens with information on where they were found in the input is the last hope of bringing in the comments.


erezsh commented on May 29, 2024

@munching You might find this useful: https://lark-parser.readthedocs.io/en/latest/recipes.html#collect-all-comments-with-lexer-callbacks


munching commented on May 29, 2024

@erezsh @MegaIng
Thank you very much for the explanations!
I'm actually already using terminal callbacks to collect comments, and then, knowing their start/end, I go through the tree and try to find the "tightest" token that fully encloses my comment. Then the task is to figure out between which children to put it. And that's where I was stuck, because data.start_pos was giving me seemingly irrelevant numbers :)
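The search described above could look roughly like this. It is only a sketch under assumptions: the helper names (`span`, `tightest_enclosing`, `insertion_index`) are hypothetical, positions are read from `.meta` on subtrees and directly from tokens, and the demo uses plain stand-in objects with hand-written offsets instead of a real Lark parse so that it runs on its own:

```python
from types import SimpleNamespace as NS

def span(node):
    """Return (start_pos, end_pos), read from .meta for subtrees or from the token itself."""
    src = getattr(node, "meta", node)
    return getattr(src, "start_pos", None), getattr(src, "end_pos", None)

def tightest_enclosing(node, start, end):
    """Descend into the smallest subtree whose span contains [start, end)."""
    for child in getattr(node, "children", ()):
        s, e = span(child)
        if hasattr(child, "children") and s is not None and s <= start and end <= e:
            return tightest_enclosing(child, start, end)
    return node

def insertion_index(node, start):
    """Index of the first child that begins at or after `start`."""
    for i, child in enumerate(node.children):
        s, _ = span(child)
        if s is not None and s >= start:
            return i
    return len(node.children)

# Stand-ins shaped like Lark trees/tokens (hypothetical offsets for
# "begin a := 1; { note } b := 2 end", with the comment at 14..22).
def tok(start, end):
    return NS(start_pos=start, end_pos=end)

def tree(data, children, start, end):
    return NS(data=data, children=children, meta=NS(start_pos=start, end_pos=end))

stmt1 = tree("assign", [tok(6, 7), tok(11, 12)], 6, 12)
stmt2 = tree("assign", [tok(23, 24), tok(28, 29)], 23, 29)
root = tree("start", [tree("block", [stmt1, stmt2], 0, 33)], 0, 33)

parent = tightest_enclosing(root, 14, 22)
idx = insertion_index(parent, 14)
print(parent.data, idx)
```

With a real parse, the same functions would need `propagate_positions=True` so that `meta.start_pos` and `meta.end_pos` are filled in.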


erezsh commented on May 29, 2024

There is some code that I wrote once that did something like it.

Maybe I should clean it up and add it to Lark, as a utility function.

I don't know if it will be helpful for you, but this is the code:

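from lark.utils import classify  # assumed to be the helper meant here; groups items into a dict of lists

def assign_comments(tree, comments):
    # Assumes the tree was parsed with propagate_positions=True, so that meta is filled in.
    # Group subtrees by starting line; nodes without position info land under None and are dropped.
    nodes_by_line = classify(tree.iter_subtrees(), lambda t: getattr(t.meta, 'line', None))
    nodes_by_line.pop(None, None)
    # Per line: the node that starts furthest right, and the widest node that starts furthest left.
    rightmost_nodes = {line: max(nodes, key=lambda n: n.meta.column) for line, nodes in nodes_by_line.items()}
    leftmost_nodes  = {line: min(nodes, key=lambda n: (n.meta.column, -(n.meta.end_pos - n.meta.start_pos))) for line, nodes in nodes_by_line.items()}

    for c in comments:
        if c.line == c.end_line:
            # Single-line comment: attach it to the rightmost node starting on that line.
            # Note: raises KeyError if no node starts on that line.
            n = rightmost_nodes[c.end_line]
            assert not hasattr(n.meta, 'inline_comment')
            n.meta.inline_comment = c
        else:
            # Multi-line comment: attach it as a header to the leftmost node on its last line.
            if c.end_line not in leftmost_nodes:
                # Probably past the end of the file
                # XXX verify this is the case
                continue

            n = leftmost_nodes[c.end_line]
            header_comments = getattr(n.meta, 'header_comments', [])
            n.meta.header_comments = header_comments + [c]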

P.S. classify() is basically like itertools.groupby but it returns a dict.
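A minimal stand-in for such a helper, for readers who want to try the function above without it. This is a sketch of the behavior described in the P.S., not necessarily how Lark implements it:

```python
from collections import defaultdict

def classify(seq, key):
    """Group items of seq into a dict mapping key(item) -> list of items."""
    groups = defaultdict(list)
    for item in seq:
        groups[key(item)].append(item)
    return dict(groups)

words = ["begin", "end", "be", "else"]
by_initial = classify(words, lambda w: w[0])
print(by_initial)
```

Unlike itertools.groupby, which only groups consecutive items with equal keys, this collects all items with the same key regardless of their order in the input.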

