Comments (10)

tek commented on June 12, 2024

yep, already have that tab open 😉

from tree-sitter-haskell.

tek commented on June 12, 2024

right! I added some basics now, but there are some more missing.

maralorn commented on June 12, 2024

Thank you for the quick reaction. Yeah, those are probably the most important, nice.

Here is the list of all symbols, not so many are missing:

https://downloads.haskell.org/ghc/latest/docs/users_guide/exts/unicode_syntax.html

timtro commented on June 12, 2024

Thanks in advance. I wish I could help solve and not merely report the issue. But I'm getting errors when I simply use unicode characters in/as identifiers.

Possibly helpful links:

Example below, and please, don't judge me for the quality of this code. It's my first Haskell program, and it's fit for a very specific purpose which is not production. (It mirrors a theoretical construction in my PhD thesis in systems theory.)

{-# LANGUAGE InstanceSigs  #-}
{-# LANGUAGE UnicodeSyntax #-}

module PrAlgebra where

import           Data.Fix (Fix (Fix), foldFix, unFix)

(▽) :: (a → c) → (b → c) → Either a b → c
(▽) = either

(△) :: (b → c) → (b → c') → b → (c, c')
(△) f g x = (f x, g x)

newtype 𝘗ᵣ hd tl = Pᵣ (Maybe (tl, hd))

instance Functor (𝘗ᵣ hd) where
  fmap :: (a → b) → 𝘗ᵣ hd a → 𝘗ᵣ hd b
  fmap f (Pᵣ Nothing)         = Pᵣ Nothing
  fmap f (Pᵣ (Just (tl, hd))) = Pᵣ (Just (f tl, hd))

type 𝘗ᵣAlgebra state value =  𝘗ᵣ value state → state

type Snoc hd = Fix(𝘗ᵣ hd)

snoc :: Snoc a → a → Snoc a
snoc xs x = Fix (Pᵣ (Just (xs, x)))

In terms of syntax highlighting, everything is coloured as a type. Here is a screenshot where a constructor is being called a type.
[Screenshot from 2022-12-22 09-50-47]

The tree is listed below.

pragma [0, 0] - [0, 30]
pragma [1, 0] - [1, 30]
module: module [3, 7] - [3, 16]
where [3, 17] - [3, 22]
ERROR [5, 0] - [25, 37]
  import [5, 0] - [5, 53]
    qualified_module [5, 17] - [5, 25]
      module [5, 17] - [5, 21]
      module [5, 22] - [5, 25]
    import_list [5, 26] - [5, 53]
      import_item [5, 27] - [5, 36]
        type [5, 27] - [5, 30]
        import_con_names [5, 31] - [5, 36]
          constructor [5, 32] - [5, 35]
      comma [5, 36] - [5, 37]
      import_item [5, 38] - [5, 45]
        variable [5, 38] - [5, 45]
      comma [5, 45] - [5, 46]
      import_item [5, 47] - [5, 52]
        variable [5, 47] - [5, 52]
  pat_literal [7, 0] - [7, 5]
    con_unit [7, 0] - [7, 5]
      ERROR [7, 1] - [7, 4]
        ERROR [7, 1] - [7, 4]
  type_parens [7, 9] - [7, 18]
    fun [7, 10] - [7, 17]
      type_name [7, 10] - [7, 11]
        type_variable [7, 10] - [7, 11]
      type_name [7, 16] - [7, 17]
        type_variable [7, 16] - [7, 17]
  type_parens [7, 23] - [7, 32]
    fun [7, 24] - [7, 31]
      type_name [7, 24] - [7, 25]
        type_variable [7, 24] - [7, 25]
      type_name [7, 30] - [7, 31]
        type_variable [7, 30] - [7, 31]
  type_apply [7, 37] - [7, 47]
    type_name [7, 37] - [7, 43]
      type [7, 37] - [7, 43]
    type_name [7, 44] - [7, 45]
      type_variable [7, 44] - [7, 45]
    type_name [7, 46] - [7, 47]
      type_variable [7, 46] - [7, 47]
  constraint [7, 52] - [25, 37]
    class: class_name [7, 52] - [7, 53]
      type_variable [7, 52] - [7, 53]
    type_literal [8, 0] - [8, 5]
      con_unit [8, 0] - [8, 5]
        ERROR [8, 1] - [8, 4]
          ERROR [8, 1] - [8, 4]
    ERROR [8, 6] - [8, 7]
    type_name [8, 8] - [8, 14]
      type_variable [8, 8] - [8, 14]
    type_literal [10, 0] - [10, 5]
      con_unit [10, 0] - [10, 5]
        ERROR [10, 1] - [10, 4]
          ERROR [10, 1] - [10, 4]
    ERROR [10, 6] - [10, 8]
    type_parens [10, 9] - [10, 18]
      fun [10, 10] - [10, 17]
        type_name [10, 10] - [10, 11]
          type_variable [10, 10] - [10, 11]
        type_name [10, 16] - [10, 17]
          type_variable [10, 16] - [10, 17]
    ERROR [10, 19] - [10, 22]
    type_parens [10, 23] - [10, 33]
      fun [10, 24] - [10, 32]
        type_name [10, 24] - [10, 25]
          type_variable [10, 24] - [10, 25]
        type_name [10, 30] - [10, 32]
          type_variable [10, 30] - [10, 32]
    ERROR [10, 34] - [10, 37]
    type_name [10, 38] - [10, 39]
      type_variable [10, 38] - [10, 39]
    ERROR [10, 40] - [10, 43]
    type_tuple [10, 44] - [10, 51]
      type_name [10, 45] - [10, 46]
        type_variable [10, 45] - [10, 46]
      comma [10, 46] - [10, 47]
      type_name [10, 48] - [10, 50]
        type_variable [10, 48] - [10, 50]
    type_literal [11, 0] - [11, 5]
      con_unit [11, 0] - [11, 5]
        ERROR [11, 1] - [11, 4]
          ERROR [11, 1] - [11, 4]
    type_name [11, 6] - [11, 7]
      type_variable [11, 6] - [11, 7]
    type_name [11, 8] - [11, 9]
      type_variable [11, 8] - [11, 9]
    type_name [11, 10] - [11, 11]
      type_variable [11, 10] - [11, 11]
    ERROR [11, 12] - [11, 13]
    type_tuple [11, 14] - [11, 24]
      type_apply [11, 15] - [11, 18]
        type_name [11, 15] - [11, 16]
          type_variable [11, 15] - [11, 16]
        type_name [11, 17] - [11, 18]
          type_variable [11, 17] - [11, 18]
      comma [11, 18] - [11, 19]
      type_apply [11, 20] - [11, 23]
        type_name [11, 20] - [11, 21]
          type_variable [11, 20] - [11, 21]
        type_name [11, 22] - [11, 23]
          type_variable [11, 22] - [11, 23]
    type_name [13, 0] - [13, 7]
      type_variable [13, 0] - [13, 7]
    ERROR [13, 8] - [13, 15]
      ERROR [13, 8] - [13, 15]
    type_name [13, 16] - [13, 18]
      type_variable [13, 16] - [13, 18]
    type_name [13, 19] - [13, 21]
      type_variable [13, 19] - [13, 21]
    ERROR [13, 22] - [13, 23]
    type_name [13, 24] - [13, 25]
      type [13, 24] - [13, 25]
    ERROR [13, 25] - [13, 28]
      ERROR [13, 25] - [13, 28]
    type_parens [13, 29] - [13, 45]
      type_apply [13, 30] - [13, 44]
        type_name [13, 30] - [13, 35]
          type [13, 30] - [13, 35]
        type_tuple [13, 36] - [13, 44]
          type_name [13, 37] - [13, 39]
            type_variable [13, 37] - [13, 39]
          comma [13, 39] - [13, 40]
          type_name [13, 41] - [13, 43]
            type_variable [13, 41] - [13, 43]
    type_name [15, 0] - [15, 8]
      type_variable [15, 0] - [15, 8]
    type_name [15, 9] - [15, 16]
      type [15, 9] - [15, 16]
    type_parens [15, 17] - [15, 29]
      ERROR [15, 18] - [15, 25]
        ERROR [15, 18] - [15, 25]
      type_name [15, 26] - [15, 28]
        type_variable [15, 26] - [15, 28]
    type_name [15, 30] - [15, 35]
      type_variable [15, 30] - [15, 35]
    type_name [16, 2] - [16, 6]
      type_variable [16, 2] - [16, 6]
    ERROR [16, 7] - [16, 9]
    type_parens [16, 10] - [16, 19]
      fun [16, 11] - [16, 18]
        type_name [16, 11] - [16, 12]
          type_variable [16, 11] - [16, 12]
        type_name [16, 17] - [16, 18]
          type_variable [16, 17] - [16, 18]
    ERROR [16, 20] - [16, 31]
      ERROR [16, 24] - [16, 31]
    type_name [16, 32] - [16, 34]
      type_variable [16, 32] - [16, 34]
    type_name [16, 35] - [16, 36]
      type_variable [16, 35] - [16, 36]
    ERROR [16, 37] - [16, 48]
      ERROR [16, 41] - [16, 48]
    type_name [16, 49] - [16, 51]
      type_variable [16, 49] - [16, 51]
    type_name [16, 52] - [16, 53]
      type_variable [16, 52] - [16, 53]
    type_name [17, 2] - [17, 6]
      type_variable [17, 2] - [17, 6]
    type_name [17, 7] - [17, 8]
      type_variable [17, 7] - [17, 8]
    type_parens [17, 9] - [17, 23]
      type_apply [17, 10] - [17, 22]
        type_name [17, 10] - [17, 11]
          type [17, 10] - [17, 11]
        ERROR [17, 11] - [17, 14]
          ERROR [17, 11] - [17, 14]
        type_name [17, 15] - [17, 22]
          type [17, 15] - [17, 22]
    ERROR [17, 32] - [17, 33]
    type_name [17, 34] - [17, 35]
      type [17, 34] - [17, 35]
    ERROR [17, 35] - [17, 38]
      ERROR [17, 35] - [17, 38]
    type_name [17, 39] - [17, 46]
      type [17, 39] - [17, 46]
    type_name [18, 2] - [18, 6]
      type_variable [18, 2] - [18, 6]
    type_name [18, 7] - [18, 8]
      type_variable [18, 7] - [18, 8]
    type_parens [18, 9] - [18, 31]
      type_apply [18, 10] - [18, 30]
        type_name [18, 10] - [18, 11]
          type [18, 10] - [18, 11]
        ERROR [18, 11] - [18, 14]
          ERROR [18, 11] - [18, 14]
        type_parens [18, 15] - [18, 30]
          type_apply [18, 16] - [18, 29]
            type_name [18, 16] - [18, 20]
              type [18, 16] - [18, 20]
            type_tuple [18, 21] - [18, 29]
              type_name [18, 22] - [18, 24]
                type_variable [18, 22] - [18, 24]
              comma [18, 24] - [18, 25]
              type_name [18, 26] - [18, 28]
                type_variable [18, 26] - [18, 28]
    ERROR [18, 32] - [18, 33]
    type_name [18, 34] - [18, 35]
      type [18, 34] - [18, 35]
    ERROR [18, 35] - [18, 38]
      ERROR [18, 35] - [18, 38]
    type_parens [18, 39] - [18, 56]
      type_apply [18, 40] - [18, 55]
        type_name [18, 40] - [18, 44]
          type [18, 40] - [18, 44]
        type_tuple [18, 45] - [18, 55]
          type_apply [18, 46] - [18, 50]
            type_name [18, 46] - [18, 47]
              type_variable [18, 46] - [18, 47]
            type_name [18, 48] - [18, 50]
              type_variable [18, 48] - [18, 50]
          comma [18, 50] - [18, 51]
          type_name [18, 52] - [18, 54]
            type_variable [18, 52] - [18, 54]
    type_name [20, 0] - [20, 4]
      type_variable [20, 0] - [20, 4]
    ERROR [20, 5] - [20, 12]
      ERROR [20, 5] - [20, 12]
    type_name [20, 12] - [20, 19]
      type [20, 12] - [20, 19]
    type_name [20, 20] - [20, 25]
      type_variable [20, 20] - [20, 25]
    type_name [20, 26] - [20, 31]
      type_variable [20, 26] - [20, 31]
    ERROR [20, 32] - [20, 42]
      ERROR [20, 35] - [20, 42]
    type_name [20, 43] - [20, 48]
      type_variable [20, 43] - [20, 48]
    type_name [20, 49] - [20, 54]
      type_variable [20, 49] - [20, 54]
    ERROR [20, 55] - [20, 58]
    type_name [20, 59] - [20, 64]
      type_variable [20, 59] - [20, 64]
    type_name [22, 0] - [22, 4]
      type_variable [22, 0] - [22, 4]
    type_name [22, 5] - [22, 9]
      type [22, 5] - [22, 9]
    type_name [22, 10] - [22, 12]
      type_variable [22, 10] - [22, 12]
    ERROR [22, 13] - [22, 14]
    type_name [22, 15] - [22, 18]
      type [22, 15] - [22, 18]
    type_parens [22, 18] - [22, 30]
      ERROR [22, 19] - [22, 26]
        ERROR [22, 19] - [22, 26]
      type_name [22, 27] - [22, 29]
        type_variable [22, 27] - [22, 29]
    type_name [24, 0] - [24, 4]
      type_variable [24, 0] - [24, 4]
    ERROR [24, 5] - [24, 7]
    type_name [24, 8] - [24, 12]
      type [24, 8] - [24, 12]
    type_name [24, 13] - [24, 14]
      type_variable [24, 13] - [24, 14]
    ERROR [24, 15] - [24, 18]
    type_name [24, 19] - [24, 20]
      type_variable [24, 19] - [24, 20]
    ERROR [24, 21] - [24, 24]
    type_name [24, 25] - [24, 29]
      type [24, 25] - [24, 29]
    type_name [24, 30] - [24, 31]
      type_variable [24, 30] - [24, 31]
    type_name [25, 0] - [25, 4]
      type_variable [25, 0] - [25, 4]
    type_name [25, 5] - [25, 7]
      type_variable [25, 5] - [25, 7]
    type_name [25, 8] - [25, 9]
      type_variable [25, 8] - [25, 9]
    ERROR [25, 10] - [25, 11]
    type_name [25, 12] - [25, 15]
      type [25, 12] - [25, 15]
    type_parens [25, 16] - [25, 37]
      type_apply [25, 17] - [25, 36]
        type_name [25, 17] - [25, 18]
          type [25, 17] - [25, 18]
        ERROR [25, 18] - [25, 21]
          ERROR [25, 18] - [25, 21]
        type_parens [25, 22] - [25, 36]
          type_apply [25, 23] - [25, 35]
            type_name [25, 23] - [25, 27]
              type [25, 23] - [25, 27]
            type_tuple [25, 28] - [25, 35]
              type_name [25, 29] - [25, 31]
                type_variable [25, 29] - [25, 31]
              comma [25, 31] - [25, 32]
              type_name [25, 33] - [25, 34]
                type_variable [25, 33] - [25, 34]

tek commented on June 12, 2024

I added three more symbols for built-in syntax.

I also took a look at the symbolic operator situation, and it's a little bit more difficult.
Legal characters for these varsyms are determined by membership in unicode categories, which contain about 6000 code points in noncontiguous intervals.

We are parsing varsyms in the scanner, which means we don't have access to the unicode category regex classes that are provided by tree-sitter.
I couldn't find a method to do this in standard C, but maybe someone knows better?
For what it's worth, I tried adding a switch with 6k cases and performance only degraded by about 1%.
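
One alternative to a 6k-case switch that stays within standard C is a sorted table of code point ranges searched by bisection. This is a minimal sketch, not what the scanner does: the helper name `is_unicode_symbol` is hypothetical, and the table holds only three Unicode blocks (Arrows, Mathematical Operators, Geometric Shapes) for illustration. A real table would have to be generated from the Unicode category data.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive range of code points. */
typedef struct { uint32_t lo, hi; } cp_range;

/* Illustrative subset only, sorted and non-overlapping.
   The full set of symbol/punctuation ranges is NOT listed here. */
static const cp_range symbol_ranges[] = {
  {0x2190, 0x21FF}, /* Arrows */
  {0x2200, 0x22FF}, /* Mathematical Operators */
  {0x25A0, 0x25FF}, /* Geometric Shapes */
};

/* Binary search: true if c falls inside any range in the table. */
static bool is_unicode_symbol(uint32_t c) {
  size_t lo = 0;
  size_t hi = sizeof symbol_ranges / sizeof symbol_ranges[0];
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (c < symbol_ranges[mid].lo) {
      hi = mid;          /* c lies before this range */
    } else if (c > symbol_ranges[mid].hi) {
      lo = mid + 1;      /* c lies after this range */
    } else {
      return true;       /* lo <= c <= hi */
    }
  }
  return false;
}
```

With this table, `is_unicode_symbol(0x25BD)` (the `▽` from the example above) is true and `is_unicode_symbol('a')` is false. Lookup cost grows with the logarithm of the number of ranges, so a table with a few hundred ranges needs only a handful of comparisons per character.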

maralorn commented on June 12, 2024

I am not sure what the rules are here, but would it be terrible to over-approximate? (I also don't know if it would simplify things.) I would assume that allowing a larger class of Unicode symbols, one that is easier to check, would be unlikely to misparse valid Haskell?

tek commented on June 12, 2024

possibly, but I'm absolutely uncertain. 6k code points in a range of 130k seems quite disproportionate, and they are spaced out pretty widely.
We could try accepting everything > N for some value and test all smaller ones explicitly.
But since performance doesn't take a significant hit, we could also just put the 6k cases in a separate file in a switch and be done with it 🙃
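
The `> N` idea could look roughly like the following. This is a sketch only: the cutoff `EXPLICIT_LIMIT`, the function names, and the three explicit cases (taken from the example program earlier in the thread) are hypothetical placeholders, not values from the grammar.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cutoff N: code points above it are accepted unchecked. */
#define EXPLICIT_LIMIT 0x2FFFu

/* Explicit test for code points at or below the cutoff.
   Only three sample operators are listed; a real list would be much longer. */
static bool is_symbol_explicit(uint32_t c) {
  switch (c) {
    case 0x2192: /* RIGHTWARDS ARROW */
    case 0x25B3: /* WHITE UP-POINTING TRIANGLE */
    case 0x25BD: /* WHITE DOWN-POINTING TRIANGLE */
      return true;
    default:
      return false;
  }
}

/* Over-approximation: everything above the cutoff counts as a symbol. */
static bool is_symbol_overapprox(uint32_t c) {
  if (c > EXPLICIT_LIMIT) {
    return true;
  }
  return is_symbol_explicit(c);
}
```

The trade-off is visible directly: `is_symbol_overapprox(0x4E2D)` returns true even though U+4E2D is a letter, so any letter above the cutoff would be treated as an operator character.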

maralorn commented on June 12, 2024

Your call. I would also wonder a bit how much bigger the grammar would become …

tek commented on June 12, 2024

the haskell.so grows by 10kB. (total 3.6MB)

tek commented on June 12, 2024

the arrow notation operators appear not to be within the categories used for the PR we just merged. also unsure about those banana brackets, they would probably need special treatment.
