I understand this is WIP - but I noticed missing closing parens cause problems. Is thi

Hey <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url=

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

unmatched parentheses about tree-sitter-haskell HOT 4 CLOSED

tree-sitter commented on June 12, 2024

unmatched parentheses

from tree-sitter-haskell.

Comments (4)

maxbrunsfeld commented on June 12, 2024 1

Basically, when there's an error - use indentation level as a heuristic to resume at the next declaration.

Yeah, Tree-sitter doesn't do this. I don't think that would be a helpful approach in this particular case anyway, because the error would not be detected until the middle of the second declaration, unless I'm missing something.

from tree-sitter-haskell.

maxbrunsfeld commented on June 12, 2024

Hey @dpren, it'd be great to have another set of 👀 on this parser. Glad you're trying it out!

Could you explain the behavior you expect in a bit more detail? A better error recovery?

It looks like right now, the parser can't detect an error until the end of file, because it sees the second = token as a valid variable_operator.

@rewinfrey Is it correct that a single = can be used as an operator? I think we might want to restrict the definition of variable_symbol so that this won't happen.

If I hack that change in, then the parser can detect the error a bit earlier, at the second = instead of at the EOF. We then get a better recovery:

(module [0, 0] - [3, 0]
  (function_declaration [0, 0] - [2, 9]
    (variable_identifier [0, 0] - [0, 5])
    (ERROR [0, 6] - [2, 5]
      (function_application [0, 9] - [2, 5]
        (parenthesized_expression [0, 9] - [0, 12]
          (integer [0, 10] - [0, 11]))
        (variable_identifier [2, 0] - [2, 5])))
    (function_body [2, 6] - [2, 9]
      (integer [2, 8] - [2, 9]))))

Basically, we treat the = ((1) test as an error and parse the remainder as a single declaration. That's now a pretty reasonable recovery, seeing as how the error cannot theoretically be detected until the second =.

from tree-sitter-haskell.

maxbrunsfeld commented on June 12, 2024

@dpren For some context on Tree-sitter's error recovery algorithm - It currently works by skipping one or more tokens and inserting at most one token, in some sequence that overlaps or is adjacent to the error detection point. Tree-sitter tries several ways of recovering from the error and chooses the one that ends up minimizing a pre-defined 'cost' metric.

Because it uses LR parsing, Tree-sitter always detects an error at the earliest possible point in the file, which in this case is the second = sign. So it can only perform sequences of skipping / insertion operations that touch that token. That's why it can't simply skip the redundant open-paren or insert a closing paren to match it: those operations are not adjacent to the point of error detection, so they would be prohibitively expensive to consider.

from tree-sitter-haskell.

dpren commented on June 12, 2024

@maxbrunsfeld Thanks for the detailed reply! I'm looking for better recovery. For context, I'm considering tree-sitter for an IDE focused parser for the Elm language.

This last section "Recovery" of this HN comment sums up what I have in mind: https://news.ycombinator.com/item?id=13918175

Basically, when there's an error - use indentation level as a heuristic to resume at the next declaration.

I'm guessing tree-sitter can't support this sort of thing? I could always re-parse sections manually, but at that point I wonder if I should just be using a different tool or going handwritten.

from tree-sitter-haskell.

unmatched parentheses about tree-sitter-haskell HOT 4 CLOSED

Comments (4)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent