Comments (7)
Hey, thanks for the feedback and suggestions! I've already started working on lookahead/backtracking combinators and playing around with non-consuming parsers in general, so your use cases and thoughts will be a great help for sure.
I have to take care of some personal stuff first, though, so I'll get back to this and the other issues on weekends and holidays.
from sigma.
I took a glance at your examples and can tell you right away what the problem is: the missing `g` flag in your regular expressions. This "caveat" is mentioned in the docs, as well as the other one.
Under the hood, the `regexp` parser uses `RegExp.exec` (for speed) and resets `lastIndex` to the parsing position, so yeah, use the `g` flag.
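To illustrate why the flag matters, here is a minimal plain-JavaScript sketch. The `matchFrom` helper is hypothetical and only mimics the "set `lastIndex`, then call `exec`" approach described above; it is not sigma's actual implementation.

```javascript
// Hypothetical helper (not part of sigma): set `lastIndex` to the current
// parsing position, then call `exec`, as described in the comment above.
function matchFrom(re, input, pos) {
  re.lastIndex = pos
  const match = re.exec(input)
  return match ? { text: match[0], index: match.index } : null
}

const input = 'foo 123 bar 456'

// Without the `g` (or `y`) flag, `exec` ignores `lastIndex` and always
// searches from the start of the string.
const withoutG = matchFrom(/\d+/, input, 8)

// With the `g` flag, the search starts at `lastIndex`.
const withG = matchFrom(/\d+/g, input, 8)

console.log(withoutG) // { text: '123', index: 4 }
console.log(withG)    // { text: '456', index: 12 }
```

Note that with `g` the match may still begin after the requested position, since the flag only moves the starting point of the search and does not anchor it.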
As for the word boundaries... Well, it's sad that they don't work, but I believe this should be handled on the library consumer's side, as it seems to be a rather specific issue with regular expressions in JS. Besides, nothing stops you from either wrapping the `regexp` parser and providing it with a fixed regular expression, or even writing a custom parser (this really should be documented; hopefully I'll be able to spare some time for that).
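As a rough idea of what a hand-written word-boundary check could look like, here is a standalone sketch. The function signature and the result objects are assumptions that only mirror the `{ isOk, pos, value }` / `{ isOk, pos, expected }` shapes shown elsewhere in this thread; this is not sigma's custom parser API.

```javascript
// Hypothetical sketch: the function signature and result shape are
// assumptions for illustration, not sigma's actual custom parser API.
const isWordChar = (ch) => ch !== undefined && /[A-Za-z0-9_]/.test(ch)

// Succeeds without consuming input when `pos` sits between a word
// character and a non-word character (or a string edge), like `\b`.
function wordBoundary(input, pos) {
  const before = isWordChar(input[pos - 1])
  const after = isWordChar(input[pos])

  return before !== after
    ? { isOk: true, pos, value: null }
    : { isOk: false, pos, expected: 'word boundary' }
}

console.log(wordBoundary('foo bar', 0).isOk) // true
console.log(wordBoundary('foo bar', 1).isOk) // false
console.log(wordBoundary('foo bar', 3).isOk) // true
```

Because the check looks at the characters on both sides of `pos` directly, it does not depend on regexp flags or on `lastIndex`.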
Ultimately, I would like to keep the core of the library as small as possible.
As for the other stuff in the issue, I'll reply later and link a PR.
from sigma.
> The `not` combinator is kinda tricky, not only because it can't be typed correctly without ugly casts, but also because I have no clear understanding of how to properly integrate it into the existing wiring. I'll leave it for now along with `context`; both will be added to the backlog as separate issues.
This idea may be silly, but what if we restricted the `not` parser to work only with contexts (`lookahead/behind`) and made it return a `boolean`? Its only responsibility would then be to answer whether the input matched or not, which is all that `lookahead/behind` requires in the regexp world.
I know this conflicts with the overall API design, but maybe it doesn't need to work the way other parsers do. It may be just fine for it to act as a support function for the `lookahead/behind` functionality only.
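A minimal sketch of that idea, under stated assumptions: the parser shape (`{ parse(input, pos) }`), the `literal` helper and the `notFollowedBy` combinator are all hypothetical names made up for illustration, not sigma's API.

```javascript
// Hypothetical parser shape: { parse(input, pos) -> result }.
const literal = (text) => ({
  parse: (input, pos) =>
    input.startsWith(text, pos)
      ? { isOk: true, pos: pos + text.length, value: text }
      : { isOk: false, pos, expected: text }
})

// `not` as a plain predicate: true when `parser` does NOT match at `pos`.
// It returns a boolean instead of a result and never consumes input.
const not = (parser) => (input, pos) => !parser.parse(input, pos).isOk

// Negative-lookahead-style helper built on top of the predicate.
const notFollowedBy = (target, forbidden) => ({
  parse(input, pos) {
    const result = target.parse(input, pos)
    if (!result.isOk) return result

    return not(forbidden)(input, result.pos)
      ? result
      : { isOk: false, pos: result.pos, expected: 'no forbidden continuation' }
  }
})

const fooNotBar = notFollowedBy(literal('foo'), literal('bar'))

console.log(fooNotBar.parse('foobaz', 0).isOk) // true
console.log(fooNotBar.parse('foobar', 0).isOk) // false
```

Keeping `not` as a predicate sidesteps the typing problem, since it never has to produce a parsed value.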
from sigma.
Thank you for your work, I really appreciate it!
I've just realized that I forgot to save my changes in the codesandbox and gave you incorrect samples. Now it should work: codesandbox.
from sigma.
Oh, adding the `g` flag actually resolved the issue. All this time I was thinking that I must not use the `g` flag; seems like I misread it.
As for word boundaries, my goal was to migrate away from regexps completely, and I'm fine with implementing that myself! I don't expect this library to cover all the features of regular expressions. And if such a need ever arises, I guess it should be done through an extension package, e.g. `sigma/extra` or something like that.
from sigma.
Returning to the first part of the issue...
- I believe the `context` combinator you suggested will be essentially `attempt(takeMid(before, target, after))` or `lookahead(takeMid(before, target, after))`. Both will be rip-offs of `try` and `lookAhead` from Parsec respectively.
- The `not` combinator is kinda tricky, not only because it can't be typed correctly without ugly casts, but also because I have no clear understanding of how to properly integrate it into the existing wiring. I'll leave it for now along with `context`; both will be added to the backlog as separate issues.
- The `when` combinator was originally conceived to allow creating parsers dynamically, not statically (like `sequence`). See the example below. In `when(context, parser)`, the combinator is consuming if `context` is consuming, and it will call the `parser` callback only if `context` succeeds. Why? Because allowing it to produce something on failure would virtually be a way of performing recovery, which I want to be a separate combinator's responsibility. I also kind of regret not giving it a better name, like `chain`, but I already had `chainl` at the time, which provides completely orthogonal functionality, so it would have been confusing. Naming is damn hard...

Also, I have to admit that the example in the docs is hilariously bad; it will be fixed. A more elaborate example would look like this:


    const Parser = when(takeLeft(letters(), whitespace()), ({ value }) => {
      switch (value) {
        case 'integer':
          return integer()

        case 'string':
          return letters()

        case 'bracketed':
          return takeMid(string('('), letters(), string(')'))

        // TODO: Add `success` and `failure` helpers.
        default:
          return failure('Expected integer, string or bracketed string.')
      }
    })

    console.log(run(Parser).with('integer 42'))
    // {
    //   isOk: true,
    //   pos: 10,
    //   value: 42
    // }

    console.log(run(Parser).with('string Something'))
    // {
    //   isOk: true,
    //   pos: 16,
    //   value: 'Something'
    // }

    console.log(run(Parser).with('bracketed (Something)'))
    // {
    //   isOk: true,
    //   pos: 21,
    //   value: 'Something'
    // }

    console.log(run(Parser).with('unknown input'))
    // {
    //   isOk: false,
    //   pos: 8,
    //   expected: 'Expected integer, string or bracketed string.'
    // }

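For readers who want to see the semantics described above in isolation, here is a standalone sketch of a `lookahead`-style and a `when`-style combinator. It assumes a hypothetical parser shape (`{ parse(input, pos) }`) and a made-up `literal` helper; it only approximates the behaviour described in this comment and is not sigma's implementation.

```javascript
// Hypothetical parser shape: { parse(input, pos) -> result }.
const literal = (text) => ({
  parse: (input, pos) =>
    input.startsWith(text, pos)
      ? { isOk: true, pos: pos + text.length, value: text }
      : { isOk: false, pos, expected: text }
})

// Runs `parser` and keeps its value, but reports the original position
// on success, so no input is consumed.
const lookahead = (parser) => ({
  parse(input, pos) {
    const result = parser.parse(input, pos)
    return result.isOk ? { ...result, pos } : result
  }
})

// Runs `context`; only on success is `callback` called to build the next
// parser, which then continues from where `context` stopped.
const when = (context, callback) => ({
  parse(input, pos) {
    const result = context.parse(input, pos)
    if (!result.isOk) return result // callback is never called on failure
    return callback(result).parse(input, result.pos)
  }
})

const peeked = lookahead(literal('let')).parse('let x', 0)
const chained = when(literal('let '), ({ value }) =>
  literal(value === 'let ' ? 'x' : 'y')
).parse('let x', 0)

console.log(peeked)  // { isOk: true, pos: 0, value: 'let' }
console.log(chained) // { isOk: true, pos: 5, value: 'x' }
```

In this sketch a failing `context` short-circuits before the callback runs, which matches the point that recovery belongs to a separate combinator.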
from sigma.
Hm-m, that could work, thanks for the idea! I'll have to add names or some other means of identification for parsers and combinators, but I'll need that anyway for future tracing/debugging features.
from sigma.