Comments (10)
yep, already have that tab open
from tree-sitter-haskell.
right! I added some basics now, but there are some more missing.
from tree-sitter-haskell.
Thank you for the quick reaction. Yeah, those are probably the most important, nice.
Here is the list of all symbols, not so many are missing:
https://downloads.haskell.org/ghc/latest/docs/users_guide/exts/unicode_syntax.html
from tree-sitter-haskell.
Thanks in advance. I wish I could help solve and not merely report the issue. But I'm getting errors when I simply use unicode characters in/as identifiers.
Possibly helpful links:
- This SE post which links to the chapter on Lexical structure in the Haskell 98 report
- The chapter itself
Example below, and please, don't judge me for the quality of this code. It's my first Haskell program, and it's fit for a very specific purpose which is not production. (It mirrors a theoretical construction in my PhD thesis in systems theory.)
{-# LANGUAGE InstanceSigs #-}
{-# LANGUAGE UnicodeSyntax #-}
module PrAlgebra where
import Data.Fix (Fix (Fix), foldFix, unFix)
(▽) :: (a → c) → (b → c) → Either a b → c
(▽) = either
(△) :: (b → c) → (b → c') → b → (c, c')
(△) f g x = (f x, g x)
newtype 𝓟ᵣ hd tl = Pᵣ (Maybe (tl, hd))
instance Functor (𝓟ᵣ hd) where
  fmap :: (a → b) → 𝓟ᵣ hd a → 𝓟ᵣ hd b
  fmap f (Pᵣ Nothing) = Pᵣ Nothing
  fmap f (Pᵣ (Just (tl, hd))) = Pᵣ (Just (f tl, hd))
type 𝓟ᵣAlgebra state value = 𝓟ᵣ value state → state
type Snoc hd = Fix(𝓟ᵣ hd)
snoc :: Snoc a → a → Snoc a
snoc xs x = Fix (Pᵣ (Just (xs, x)))
In terms of syntax highlighting, everything is coloured as a type. Here is a screenshot where a constructor is being called a type.
The tree is listed below.
pragma [0, 0] - [0, 30]
pragma [1, 0] - [1, 30]
module: module [3, 7] - [3, 16]
where [3, 17] - [3, 22]
ERROR [5, 0] - [25, 37]
import [5, 0] - [5, 53]
qualified_module [5, 17] - [5, 25]
module [5, 17] - [5, 21]
module [5, 22] - [5, 25]
import_list [5, 26] - [5, 53]
import_item [5, 27] - [5, 36]
type [5, 27] - [5, 30]
import_con_names [5, 31] - [5, 36]
constructor [5, 32] - [5, 35]
comma [5, 36] - [5, 37]
import_item [5, 38] - [5, 45]
variable [5, 38] - [5, 45]
comma [5, 45] - [5, 46]
import_item [5, 47] - [5, 52]
variable [5, 47] - [5, 52]
pat_literal [7, 0] - [7, 5]
con_unit [7, 0] - [7, 5]
ERROR [7, 1] - [7, 4]
ERROR [7, 1] - [7, 4]
type_parens [7, 9] - [7, 18]
fun [7, 10] - [7, 17]
type_name [7, 10] - [7, 11]
type_variable [7, 10] - [7, 11]
type_name [7, 16] - [7, 17]
type_variable [7, 16] - [7, 17]
type_parens [7, 23] - [7, 32]
fun [7, 24] - [7, 31]
type_name [7, 24] - [7, 25]
type_variable [7, 24] - [7, 25]
type_name [7, 30] - [7, 31]
type_variable [7, 30] - [7, 31]
type_apply [7, 37] - [7, 47]
type_name [7, 37] - [7, 43]
type [7, 37] - [7, 43]
type_name [7, 44] - [7, 45]
type_variable [7, 44] - [7, 45]
type_name [7, 46] - [7, 47]
type_variable [7, 46] - [7, 47]
constraint [7, 52] - [25, 37]
class: class_name [7, 52] - [7, 53]
type_variable [7, 52] - [7, 53]
type_literal [8, 0] - [8, 5]
con_unit [8, 0] - [8, 5]
ERROR [8, 1] - [8, 4]
ERROR [8, 1] - [8, 4]
ERROR [8, 6] - [8, 7]
type_name [8, 8] - [8, 14]
type_variable [8, 8] - [8, 14]
type_literal [10, 0] - [10, 5]
con_unit [10, 0] - [10, 5]
ERROR [10, 1] - [10, 4]
ERROR [10, 1] - [10, 4]
ERROR [10, 6] - [10, 8]
type_parens [10, 9] - [10, 18]
fun [10, 10] - [10, 17]
type_name [10, 10] - [10, 11]
type_variable [10, 10] - [10, 11]
type_name [10, 16] - [10, 17]
type_variable [10, 16] - [10, 17]
ERROR [10, 19] - [10, 22]
type_parens [10, 23] - [10, 33]
fun [10, 24] - [10, 32]
type_name [10, 24] - [10, 25]
type_variable [10, 24] - [10, 25]
type_name [10, 30] - [10, 32]
type_variable [10, 30] - [10, 32]
ERROR [10, 34] - [10, 37]
type_name [10, 38] - [10, 39]
type_variable [10, 38] - [10, 39]
ERROR [10, 40] - [10, 43]
type_tuple [10, 44] - [10, 51]
type_name [10, 45] - [10, 46]
type_variable [10, 45] - [10, 46]
comma [10, 46] - [10, 47]
type_name [10, 48] - [10, 50]
type_variable [10, 48] - [10, 50]
type_literal [11, 0] - [11, 5]
con_unit [11, 0] - [11, 5]
ERROR [11, 1] - [11, 4]
ERROR [11, 1] - [11, 4]
type_name [11, 6] - [11, 7]
type_variable [11, 6] - [11, 7]
type_name [11, 8] - [11, 9]
type_variable [11, 8] - [11, 9]
type_name [11, 10] - [11, 11]
type_variable [11, 10] - [11, 11]
ERROR [11, 12] - [11, 13]
type_tuple [11, 14] - [11, 24]
type_apply [11, 15] - [11, 18]
type_name [11, 15] - [11, 16]
type_variable [11, 15] - [11, 16]
type_name [11, 17] - [11, 18]
type_variable [11, 17] - [11, 18]
comma [11, 18] - [11, 19]
type_apply [11, 20] - [11, 23]
type_name [11, 20] - [11, 21]
type_variable [11, 20] - [11, 21]
type_name [11, 22] - [11, 23]
type_variable [11, 22] - [11, 23]
type_name [13, 0] - [13, 7]
type_variable [13, 0] - [13, 7]
ERROR [13, 8] - [13, 15]
ERROR [13, 8] - [13, 15]
type_name [13, 16] - [13, 18]
type_variable [13, 16] - [13, 18]
type_name [13, 19] - [13, 21]
type_variable [13, 19] - [13, 21]
ERROR [13, 22] - [13, 23]
type_name [13, 24] - [13, 25]
type [13, 24] - [13, 25]
ERROR [13, 25] - [13, 28]
ERROR [13, 25] - [13, 28]
type_parens [13, 29] - [13, 45]
type_apply [13, 30] - [13, 44]
type_name [13, 30] - [13, 35]
type [13, 30] - [13, 35]
type_tuple [13, 36] - [13, 44]
type_name [13, 37] - [13, 39]
type_variable [13, 37] - [13, 39]
comma [13, 39] - [13, 40]
type_name [13, 41] - [13, 43]
type_variable [13, 41] - [13, 43]
type_name [15, 0] - [15, 8]
type_variable [15, 0] - [15, 8]
type_name [15, 9] - [15, 16]
type [15, 9] - [15, 16]
type_parens [15, 17] - [15, 29]
ERROR [15, 18] - [15, 25]
ERROR [15, 18] - [15, 25]
type_name [15, 26] - [15, 28]
type_variable [15, 26] - [15, 28]
type_name [15, 30] - [15, 35]
type_variable [15, 30] - [15, 35]
type_name [16, 2] - [16, 6]
type_variable [16, 2] - [16, 6]
ERROR [16, 7] - [16, 9]
type_parens [16, 10] - [16, 19]
fun [16, 11] - [16, 18]
type_name [16, 11] - [16, 12]
type_variable [16, 11] - [16, 12]
type_name [16, 17] - [16, 18]
type_variable [16, 17] - [16, 18]
ERROR [16, 20] - [16, 31]
ERROR [16, 24] - [16, 31]
type_name [16, 32] - [16, 34]
type_variable [16, 32] - [16, 34]
type_name [16, 35] - [16, 36]
type_variable [16, 35] - [16, 36]
ERROR [16, 37] - [16, 48]
ERROR [16, 41] - [16, 48]
type_name [16, 49] - [16, 51]
type_variable [16, 49] - [16, 51]
type_name [16, 52] - [16, 53]
type_variable [16, 52] - [16, 53]
type_name [17, 2] - [17, 6]
type_variable [17, 2] - [17, 6]
type_name [17, 7] - [17, 8]
type_variable [17, 7] - [17, 8]
type_parens [17, 9] - [17, 23]
type_apply [17, 10] - [17, 22]
type_name [17, 10] - [17, 11]
type [17, 10] - [17, 11]
ERROR [17, 11] - [17, 14]
ERROR [17, 11] - [17, 14]
type_name [17, 15] - [17, 22]
type [17, 15] - [17, 22]
ERROR [17, 32] - [17, 33]
type_name [17, 34] - [17, 35]
type [17, 34] - [17, 35]
ERROR [17, 35] - [17, 38]
ERROR [17, 35] - [17, 38]
type_name [17, 39] - [17, 46]
type [17, 39] - [17, 46]
type_name [18, 2] - [18, 6]
type_variable [18, 2] - [18, 6]
type_name [18, 7] - [18, 8]
type_variable [18, 7] - [18, 8]
type_parens [18, 9] - [18, 31]
type_apply [18, 10] - [18, 30]
type_name [18, 10] - [18, 11]
type [18, 10] - [18, 11]
ERROR [18, 11] - [18, 14]
ERROR [18, 11] - [18, 14]
type_parens [18, 15] - [18, 30]
type_apply [18, 16] - [18, 29]
type_name [18, 16] - [18, 20]
type [18, 16] - [18, 20]
type_tuple [18, 21] - [18, 29]
type_name [18, 22] - [18, 24]
type_variable [18, 22] - [18, 24]
comma [18, 24] - [18, 25]
type_name [18, 26] - [18, 28]
type_variable [18, 26] - [18, 28]
ERROR [18, 32] - [18, 33]
type_name [18, 34] - [18, 35]
type [18, 34] - [18, 35]
ERROR [18, 35] - [18, 38]
ERROR [18, 35] - [18, 38]
type_parens [18, 39] - [18, 56]
type_apply [18, 40] - [18, 55]
type_name [18, 40] - [18, 44]
type [18, 40] - [18, 44]
type_tuple [18, 45] - [18, 55]
type_apply [18, 46] - [18, 50]
type_name [18, 46] - [18, 47]
type_variable [18, 46] - [18, 47]
type_name [18, 48] - [18, 50]
type_variable [18, 48] - [18, 50]
comma [18, 50] - [18, 51]
type_name [18, 52] - [18, 54]
type_variable [18, 52] - [18, 54]
type_name [20, 0] - [20, 4]
type_variable [20, 0] - [20, 4]
ERROR [20, 5] - [20, 12]
ERROR [20, 5] - [20, 12]
type_name [20, 12] - [20, 19]
type [20, 12] - [20, 19]
type_name [20, 20] - [20, 25]
type_variable [20, 20] - [20, 25]
type_name [20, 26] - [20, 31]
type_variable [20, 26] - [20, 31]
ERROR [20, 32] - [20, 42]
ERROR [20, 35] - [20, 42]
type_name [20, 43] - [20, 48]
type_variable [20, 43] - [20, 48]
type_name [20, 49] - [20, 54]
type_variable [20, 49] - [20, 54]
ERROR [20, 55] - [20, 58]
type_name [20, 59] - [20, 64]
type_variable [20, 59] - [20, 64]
type_name [22, 0] - [22, 4]
type_variable [22, 0] - [22, 4]
type_name [22, 5] - [22, 9]
type [22, 5] - [22, 9]
type_name [22, 10] - [22, 12]
type_variable [22, 10] - [22, 12]
ERROR [22, 13] - [22, 14]
type_name [22, 15] - [22, 18]
type [22, 15] - [22, 18]
type_parens [22, 18] - [22, 30]
ERROR [22, 19] - [22, 26]
ERROR [22, 19] - [22, 26]
type_name [22, 27] - [22, 29]
type_variable [22, 27] - [22, 29]
type_name [24, 0] - [24, 4]
type_variable [24, 0] - [24, 4]
ERROR [24, 5] - [24, 7]
type_name [24, 8] - [24, 12]
type [24, 8] - [24, 12]
type_name [24, 13] - [24, 14]
type_variable [24, 13] - [24, 14]
ERROR [24, 15] - [24, 18]
type_name [24, 19] - [24, 20]
type_variable [24, 19] - [24, 20]
ERROR [24, 21] - [24, 24]
type_name [24, 25] - [24, 29]
type [24, 25] - [24, 29]
type_name [24, 30] - [24, 31]
type_variable [24, 30] - [24, 31]
type_name [25, 0] - [25, 4]
type_variable [25, 0] - [25, 4]
type_name [25, 5] - [25, 7]
type_variable [25, 5] - [25, 7]
type_name [25, 8] - [25, 9]
type_variable [25, 8] - [25, 9]
ERROR [25, 10] - [25, 11]
type_name [25, 12] - [25, 15]
type [25, 12] - [25, 15]
type_parens [25, 16] - [25, 37]
type_apply [25, 17] - [25, 36]
type_name [25, 17] - [25, 18]
type [25, 17] - [25, 18]
ERROR [25, 18] - [25, 21]
ERROR [25, 18] - [25, 21]
type_parens [25, 22] - [25, 36]
type_apply [25, 23] - [25, 35]
type_name [25, 23] - [25, 27]
type [25, 23] - [25, 27]
type_tuple [25, 28] - [25, 35]
type_name [25, 29] - [25, 31]
type_variable [25, 29] - [25, 31]
comma [25, 31] - [25, 32]
type_name [25, 33] - [25, 34]
type_variable [25, 33] - [25, 34]
from tree-sitter-haskell.
I added three more symbols for built-in syntax.
I also took a look at the symbolic operator situation, and it's a little bit more difficult.
Legal characters for these varsyms are determined by membership in unicode categories, which contain about 6000 code points in noncontiguous intervals.
We are parsing varsyms in the scanner, which means we don't have access to the unicode category regex classes that are provided by tree-sitter.
I couldn't find a method to do this in standard C, but maybe someone knows better?
For what it's worth, I tried adding a switch with 6k cases and performance only degraded by about 1%.
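As a point of comparison with the 6k-case switch, membership in noncontiguous intervals can also be checked with a sorted range table and a binary search. The following is only a minimal sketch in standard C, not the scanner's actual code: the names `CodeRange`, `SYMBOL_RANGES` and `is_symbol_char` are hypothetical, and the table holds just three illustrative Unicode blocks (Arrows, Mathematical Operators, Geometric Shapes) rather than the full set of roughly 6000 code points.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive range of code points. Hypothetical type, for illustration only. */
typedef struct {
  int32_t lo;
  int32_t hi;
} CodeRange;

/* Illustrative subset only, sorted by `lo` and non-overlapping.
 * A real table would be generated from the Unicode category data. */
static const CodeRange SYMBOL_RANGES[] = {
  {0x2190, 0x21FF}, /* Arrows */
  {0x2200, 0x22FF}, /* Mathematical Operators */
  {0x25A0, 0x25FF}, /* Geometric Shapes */
};

/* Binary search over the sorted ranges: true if `c` lies in any of them. */
static bool is_symbol_char(int32_t c) {
  size_t lo = 0;
  size_t hi = sizeof(SYMBOL_RANGES) / sizeof(SYMBOL_RANGES[0]);
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (c < SYMBOL_RANGES[mid].lo) {
      hi = mid;
    } else if (c > SYMBOL_RANGES[mid].hi) {
      lo = mid + 1;
    } else {
      return true;
    }
  }
  return false;
}
```

With this subset, `is_symbol_char(0x2192)` (the arrow `→`) and `is_symbol_char(0x25BD)` (the operator `▽` from the example above) are accepted, while an ASCII letter such as `'a'` is rejected. Whether this beats a compiler-generated switch in practice would have to be measured.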
from tree-sitter-haskell.
I am not sure what the rules here are, but would it be terrible to over-approximate? (I also don't know if it would simplify things.) I would assume that allowing a larger class of Unicode symbols, one that is easier to check, would be unlikely to misparse valid Haskell?
from tree-sitter-haskell.
possibly, but I'm absolutely uncertain. 6k code points in a range of 130k seems quite disproportionate, and they are spaced out pretty wide.
We could try > N for some value and test all smaller ones explicitly.
But since performance doesn't take a significant hit, we could also just put the 6k cases in a separate file in a switch and be done with it
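To make the "> N" idea concrete, here is a rough sketch of what such an over-approximation could look like. Everything in it is an assumption for illustration: the threshold `APPROX_THRESHOLD` is arbitrarily set to 0x80, the function names are hypothetical, and the explicit test below the threshold uses the ASCII symbol characters listed in the Haskell 2010 report.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical threshold N: code points at or above it are accepted blindly. */
#define APPROX_THRESHOLD 0x80

/* Exact test for the ASCII symbol characters of the Haskell 2010 report. */
static bool is_ascii_symbol(int32_t c) {
  /* Guard against 0: strchr would otherwise match the string terminator. */
  return c > 0 && c < 0x80 &&
         strchr("!#$%&*+./<=>?@\\^|-~:", (int)c) != NULL;
}

/* Over-approximation: exact below the threshold, permissive above it. */
static bool is_symbol_char_approx(int32_t c) {
  if (c < APPROX_THRESHOLD) {
    return is_ascii_symbol(c);
  }
  return true;
}
```

The cost of the shortcut shows up immediately: `is_symbol_char_approx(0x2192)` accepts `→` as intended, but `is_symbol_char_approx(0xE9)` also accepts the letter `é`, which is not a symbol character. A larger N with more explicit cases below it would narrow that gap.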
from tree-sitter-haskell.
Your call. I would also wonder a bit how much bigger the grammar would become …
from tree-sitter-haskell.
the haskell.so grows by 10kB. (total 3.6MB)
from tree-sitter-haskell.
the arrow notation operators appear not to be within the categories used for the PR we just merged. also unsure about those banana brackets, they would probably need special treatment.
from tree-sitter-haskell.