Comments (4)
After looking through the jsonparse code and reading up on UTF-8 a little more, I think I mischaracterized the issue. n = 174 seems to be the second byte, the continuation byte, of the 2-byte "registered trademark" character. For some reason, the value of i isn't getting properly incremented when the first byte (0xc2) is read. This should happen at https://github.com/creationix/jsonparse/blob/master/jsonparse.js#L142
I'm working on creating a test that replicates this issue outside of my use case.
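For reference, here is a minimal Node.js sketch of the byte layout described above. It is not taken from the jsonparse code; the names utf8Bytes and isContinuationByte are made up for illustration.

```javascript
// '®' is U+00AE; UTF-8 encodes it as the two bytes 0xC2 0xAE (194, 174).
const utf8Bytes = Array.from(Buffer.from('\u00ae', 'utf8'));
console.log(utf8Bytes); // [ 194, 174 ]

// Hypothetical helper: a UTF-8 continuation byte always has the bit pattern 10xxxxxx.
const isContinuationByte = (b) => (b & 0xc0) === 0x80;

console.log(isContinuationByte(174)); // true: 0xAE can only follow a lead byte
console.log(isContinuationByte(194)); // false: 0xC2 is the lead byte of a 2-byte sequence
```

So a parser that sees 174 without a preceding 194 is looking at a continuation byte with no lead byte.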
from jsonparse.
Thanks for looking into this. I'm currently super busy with other projects, but I'll happily review a pull request when you get one.
from jsonparse.
The string that's causing the problem in my JSON stream is "At The Learning Experience® (TLE®) ...".
I fired up a debugger and set a breakpoint inside the code block that handles the STRING1 state, and this is what I saw:
When it hits the '®' character, n = buffer[i] = 174, buffer[i-1] = 101 and buffer[i-2] = 99.
So, buffer[i-2] is 99, the 'c' character, buffer[i-1] is 101, the 'e' character, and buffer[i] is 174, the second byte of the two-byte UTF-8 '®' character. It seems like n = buffer[i] should be the first byte of a UTF-8 character, 194 (0xc2). It seems like this byte is skipped entirely in the buffer.
I wasn't really sure how the leading UTF-8 byte got dropped from the buffer until I realized that the JSON is encoded as ISO-8859-1, not UTF-8.
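A minimal sketch of why the bytes look this way, assuming Node.js, where 'latin1' is the Buffer encoding name for ISO-8859-1. The helper latin1ToUtf8 is hypothetical and is not part of jsonparse; it only illustrates one way to re-encode a chunk before handing it to a UTF-8 parser.

```javascript
const text = 'Experience\u00ae';

// ISO-8859-1 stores '®' as the single byte 0xAE (174), so the buffer ends 'c', 'e', 174.
const latin1 = Buffer.from(text, 'latin1');
console.log(Array.from(latin1.subarray(-3))); // [ 99, 101, 174 ]

// UTF-8 stores '®' as two bytes, 0xC2 0xAE (194, 174).
const utf8 = Buffer.from(text, 'utf8');
console.log(Array.from(utf8.subarray(-4))); // [ 99, 101, 194, 174 ]

// Hypothetical helper: decode an ISO-8859-1 chunk and re-encode it as UTF-8.
// Each ISO-8859-1 character is exactly one byte, so chunk boundaries cannot split a character.
function latin1ToUtf8(chunk) {
  return Buffer.from(chunk.toString('latin1'), 'utf8');
}

console.log(latin1ToUtf8(latin1).equals(utf8)); // true
```

The ISO-8859-1 bytes match what the debugger showed: 99, 101, 174, with no 194 anywhere in the buffer.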
from jsonparse.
Voting to reopen due to this SO question: the issue is still prevalent, and according to the OP it occurs in later versions as well. I'm not sure how much credence to give that, but given that there was no PR, the issue was closed by the issue owner, and there are no referenced commits, it's possible the bug is still present.
A quick fix might be to ensure that no non-ASCII characters are passed to the stream, but something more permanent would be nice.
I'm not sure if you've fixed the issue in later releases, @creationix, but OP is using 0.0.5 which predates this issue.
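One possible reading of the quick fix mentioned above, as a rough sketch only: escape every character outside the ASCII range as a JSON \uXXXX sequence before the text reaches the stream. The function name escapeNonAscii is hypothetical, and this assumes the JSON text has already been decoded into a JavaScript string.

```javascript
// Hypothetical helper: replace every UTF-16 code unit above 0x7F with a \uXXXX escape.
// In valid JSON, such characters can only appear inside string literals, so the
// escaped text parses to the same value. Surrogate pairs become two escapes.
function escapeNonAscii(jsonText) {
  return jsonText.replace(/[\u0080-\uffff]/g, (ch) =>
    '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0'));
}

const original = '{"name":"At The Learning Experience\u00ae (TLE\u00ae)"}';
const escaped = escapeNonAscii(original);

console.log(/^[\x00-\x7f]*$/.test(escaped)); // true: only ASCII remains
console.log(JSON.parse(escaped).name === JSON.parse(original).name); // true
```

This only works around the problem on the producer side; it does not address whatever the parser does with multi-byte input.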
from jsonparse.