Comments (9)

LeeTibbert commented on May 24, 2024

fenginsc,

Thank you for reporting this defect. Sorry that you are experiencing it.

The provided code made it easier to take a first look at this.

I confirm that the provided code works using scala-cli with Scala 3.3.1 on the JVM (17) and
fails with Scala Native 4.16 (and probably others, including 5.0-SNAPSHOT).

The difference in my configuration is that I am using Linux (6.6.8).

I suggest removing the "on windows" from the topic line now that the defect
has been reproduced on another OS. I do not have the privs necessary to do that edit.

I am off to see if I can minimize the regex.

from scala-native.

ekrich commented on May 24, 2024

I confirmed this works in Scala JVM as well.

LeeTibbert commented on May 24, 2024

Eric, thank you for the data. OS? I presume macOS.

LeeTibbert commented on May 24, 2024

Next Step:

  • I need to check the RE2 GitHub website to see if there are fixes there which might resolve this Issue.

  • Looks like SN regex was last updated to re2j version 1.3 about two years ago. re2j is now at 1.7.

  • There is one fix in re2j to a bug in its original Go code which deals with parsing OR conditions.
    google/re2j#93 "Incorrect match found when capturing groups are not used"
    Next sprint I may try creating a reproducer which creates groups for only the common prefix
    in the original regex. That may tell me/us if we are on the trail of a solution or just a path-to-nowhere.

  • Later: original regex with grouping of only the OR clauses with a common prefix works. So next
    step is to study the re2j fix and possibly port it.
    "^(\\-|\\+)?(0\\.[0-9]+|([1-9][0-9]*\\.[0-9]+)|([1-9][0-9]*)|0)$"

Progress:

  • I have been able to isolate a simpler reproducer. This may not be the simplest reproducer
    but it is what I have now. The presence and placement of the grouping parentheses and
    the presence of the OR operator seem to be important.

  • The working goal is for SN to parse the presented Regex.
    I have been able to create a slightly altered Regex which works for both SN & JVM.
    The idea is to understand the problem by perturbing the inputs: here manually grouping the
    clauses of OR operators.
    ^(\\-|\\+)?(0\\.[0-9]+)|([1-9][0-9]*\\.[0-9]+)|([1-9][0-9]*)|0$

  • I think I have run out of cleverness for this year.
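
The two patterns quoted in this comment can be compared side by side on the JVM. The following is a minimal sketch, assuming `java.util.regex.Pattern` and whole-string matching via `matches()`; the object name `GroupingComparison`, the helper `fullMatch`, and the sample inputs are illustrative and not taken from the issue.

```scala
import java.util.regex.Pattern

object GroupingComparison {
  // Pattern from "Next Step": the OR clauses sit inside one outer group.
  val grouped: Pattern = Pattern.compile(
    "^(\\-|\\+)?(0\\.[0-9]+|([1-9][0-9]*\\.[0-9]+)|([1-9][0-9]*)|0)$")

  // Pattern from "Progress": each OR clause is grouped on its own,
  // so the alternation happens at the top level of the regex.
  val perturbed: Pattern = Pattern.compile(
    "^(\\-|\\+)?(0\\.[0-9]+)|([1-9][0-9]*\\.[0-9]+)|([1-9][0-9]*)|0$")

  // True only when the whole input matches the pattern.
  def fullMatch(p: Pattern, input: String): Boolean =
    p.matcher(input).matches()

  def main(args: Array[String]): Unit = {
    // Illustrative inputs, chosen here and not taken from the issue.
    val samples = Seq("0", "12", "12.5", "-0.5", "-12.5", "00")
    for (s <- samples)
      println(s"$s grouped=${fullMatch(grouped, s)} perturbed=${fullMatch(perturbed, s)}")
  }
}
```

On the JVM the two patterns are not interchangeable: `-12.5` fully matches the first but not the second, because in the second the `^` and the optional sign belong only to the first alternative and the `$` only to the last. Running the same sketch under Scala Native is one way to see whether a given release agrees with the JVM on these inputs.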

LeeTibbert commented on May 24, 2024

So that I understand, is the original regex just to illustrate the problem or is it intended to be
used in production?

If the latter, is the intent that inputs have been put in some canonical form so that the
regex need not recognize otherwise valid numbers such as "00" or "02.3"?

fenginsc commented on May 24, 2024

Thank you, I've changed the title. It was just an exercise to test whether a string is a legitimate number; obviously "00" or "02.3" is not a mathematically legitimate number.

LeeTibbert commented on May 24, 2024

re: obviously "00" or "02.3" is not a mathematically legitimate number.

That is not so obvious to Java. Running each of the lines below in scastie
succeeds and returns the expected value.

Granted, those strings do not represent the canonical or most reduced
form of a Platonic number.

java.lang.Integer.parseInt("00")
java.lang.Double.parseDouble("01.618")
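
The contrast between what Java parses and what the regex accepts can be sketched as follows. This is an illustration only: the object `LeadingZeros` and its helpers are hypothetical names, and the pattern is the grouped variant quoted earlier in this thread, used here as a stand-in for a "canonical form" check.

```scala
import java.util.regex.Pattern
import scala.util.Try

object LeadingZeros {
  // Grouped pattern quoted earlier in the thread; it rejects leading zeros.
  val strict: Pattern = Pattern.compile(
    "^(\\-|\\+)?(0\\.[0-9]+|([1-9][0-9]*\\.[0-9]+)|([1-9][0-9]*)|0)$")

  // True when java.lang.Double.parseDouble accepts the text.
  def parsesAsDouble(s: String): Boolean =
    Try(java.lang.Double.parseDouble(s)).isSuccess

  // True when the whole text matches the strict pattern.
  def isCanonical(s: String): Boolean =
    strict.matcher(s).matches()

  def main(args: Array[String]): Unit = {
    for (s <- Seq("00", "02.3", "01.618", "2.3"))
      println(s"$s parses=${parsesAsDouble(s)} canonical=${isCanonical(s)}")
  }
}
```

On the JVM, "00" and "02.3" parse successfully but are rejected by the pattern, which is the distinction being drawn above.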

LeeTibbert commented on May 24, 2024

@fenginsc

At first blush, this defect looked like the defect reported fixed in PRs #2410 & #1701. They were merged into Scala Native
4.n & 5.n two plus years ago.

Further study narrowed the defect down to a mistake I made porting from the original re2j code five or so
years ago. I apologize for the time you have spent on my error and the inconvenience it caused.

Thank you, once again, for flushing it out and reporting it.

PR #3642 contains a fix and corresponding test.

I have marked that PR for backporting to the 4.n stream.

I do not know when the next 4.n release will be. A release of Scala itself is coming up, and
a Scala Native 4.n release that supports the new Scala usually follows very soon after.

Until then, can you use the workaround of putting parentheses around any OR clause which
has more than two elements? In "(a+b+c+)| z" the first clause has three elements. Without
the parentheses, I believe that will provoke the reported defect. With them, I believe that
Scala Native will handle the string.
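
One way to check the workaround is a small differential test: run the parenthesized and unparenthesized forms over the same inputs and report any disagreement. This is a sketch under assumptions: `WorkaroundCheck`, `fullMatch`, and `disagreements` are hypothetical names, the unparenthesized string is derived here by dropping the parentheses from the example above, and the inputs are illustrative.

```scala
import java.util.regex.Pattern

object WorkaroundCheck {
  // Example from the comment above, with the three-element clause grouped.
  val withParens = "(a+b+c+)| z"
  // Assumed ungrouped counterpart: the same regex with the parentheses removed.
  val withoutParens = "a+b+c+| z"

  // True only when the whole input matches the regex.
  def fullMatch(regex: String, input: String): Boolean =
    Pattern.compile(regex).matcher(input).matches()

  // Inputs on which the two forms give different answers.
  def disagreements(inputs: Seq[String]): Seq[String] =
    inputs.filter(s => fullMatch(withParens, s) != fullMatch(withoutParens, s))

  def main(args: Array[String]): Unit = {
    val inputs = Seq("abc", "aabbcc", " z", "ab", "z")
    println(s"disagreements: ${disagreements(inputs)}")
  }
}
```

On the JVM the two forms agree on all of these inputs, so the list is empty. If the same program reports disagreements, or fails to compile the ungrouped pattern, under a Scala Native release without the fix, that would be consistent with the defect described here.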

If you cannot use the workaround, let me know here and I will see what else I might be able
to figure out.

This one was a pretty good brain teaser.

fenginsc commented on May 24, 2024

@LeeTibbert
Thank you very much for fixing this issue. I worked around it by simplifying the regular expression, and my program now runs successfully:
"^[-+]?(0|[1-9][0-9]*)(\\.[0-9]+)?$"
So I don't need the previous regular expression anymore; it was just an example to demonstrate the bug.
Happy new year!
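
For reference, a minimal sketch of how the simplified pattern behaves on the JVM. The object `SimplifiedNumber`, the helper `isNumber`, and the sample inputs are illustrative names and values, not part of the original program.

```scala
import java.util.regex.Pattern

object SimplifiedNumber {
  // Simplified pattern quoted above: optional sign, an integer part with
  // no leading zeros, and an optional fractional part.
  val pattern: Pattern = Pattern.compile("^[-+]?(0|[1-9][0-9]*)(\\.[0-9]+)?$")

  // True only when the whole input matches the pattern.
  def isNumber(s: String): Boolean =
    pattern.matcher(s).matches()

  def main(args: Array[String]): Unit = {
    val samples = Seq("0", "12", "-12.5", "+0.5", "00", "02.3", "12.", ".5")
    for (s <- samples)
      println(s"$s -> ${isNumber(s)}")
  }
}
```

This form contains a single two-way alternation inside one group, so it does not rely on the multi-clause OR construct that the thread identifies as the trigger.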
