Comments (8)
Well, they use different code paths (the `String` constructor for byte arrays, and `CharsetDecoder` for streams), but obviously I'd expect them to be equivalent. Can you provide a failing test case?
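For reference, the two code paths mentioned above can be sketched in plain Java. This is only an illustration of the JDK calls involved, not byte-streams' actual code; the class and method names are made up for the example. On well-formed UTF-8 given in one piece, both paths should produce the same string.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class TwoDecodePaths {
    // Path 1: the String constructor decodes a whole byte array in one call.
    static String viaStringConstructor(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Path 2: a CharsetDecoder decodes a ByteBuffer. The one-shot
    // decode(ByteBuffer) method reports malformed input as an exception,
    // which is rethrown unchecked here to keep the call site simple.
    static String viaCharsetDecoder(byte[] bytes) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Any difference between the two would therefore have to come from how the stream path feeds bytes to the decoder, not from the decoder itself.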
from byte-streams.
I'm afraid I'm struggling to create a minimal example here. I have 4k of text that demonstrates the problem, but I can't share it. So far I have failed to minimize it further or to find a good random string that demonstrates the same problem.
Could you list the conversion steps from ByteArrayInputStream to String? I'm having a hard time working out the high-level conversions chosen by the graph just from step-debugging byte-streams. Maybe I can take my private example through these manually to see where the errors arise.
The `InputStream` is turned into a `ByteSource` [1], then the `ByteSource` is turned into a `CharSequence` [2]. Hope that helps, let me know if you have any other questions.
[1] https://github.com/ztellman/byte-streams/blob/master/src/byte_streams.clj#L526
[2] https://github.com/ztellman/byte-streams/blob/master/src/byte_streams/char_sequence.clj#L81
My hunch is that this is an issue of single characters spanning a chunk-size boundary in `byte-streams.char-sequence/lazy-char-buffer-sequence`, causing a Unicode replacement character to be emitted by the decoder.
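To make the suspected failure mode concrete, here is a minimal Java sketch. It is an assumption about the kind of bug involved, not a reproduction of byte-streams' code; the class and method names are hypothetical. The first method decodes every chunk on its own, so a character cut by a chunk boundary is seen as malformed fragments. The second keeps one `CharsetDecoder` for all chunks and carries the undecoded tail of each chunk into the next round.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
class ChunkBoundaryDemo {
    // Naive: each chunk is decoded in isolation, so a multi-byte character
    // that straddles a boundary turns into U+FFFD replacement characters.
    static String decodeChunksIndependently(byte[] bytes, int chunkSize) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < bytes.length; i += chunkSize) {
            int end = Math.min(bytes.length, i + chunkSize);
            out.append(new String(bytes, i, end - i, StandardCharsets.UTF_8));
        }
        return out.toString();
    }

    // Streaming: one decoder for all chunks; bytes of an incomplete
    // character stay in the input buffer until the next chunk arrives.
    static String decodeChunksStreaming(byte[] bytes, int chunkSize) {
        if (bytes.length == 0) {
            return "";
        }
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        StringBuilder out = new StringBuilder();
        // Room for one chunk plus the few bytes left over from the previous one.
        ByteBuffer in = ByteBuffer.allocate(chunkSize + 8);
        CharBuffer chars = CharBuffer.allocate(chunkSize + 8);
        for (int i = 0; i < bytes.length; i += chunkSize) {
            int end = Math.min(bytes.length, i + chunkSize);
            in.put(bytes, i, end - i);
            in.flip();
            decoder.decode(in, chars, end == bytes.length);
            chars.flip();
            out.append(chars);
            chars.clear();
            in.compact(); // keep the incomplete character for the next round
        }
        decoder.flush(chars);
        chars.flip();
        out.append(chars);
        return out.toString();
    }
}
```

With the euro sign (three bytes in UTF-8) and a chunk size of 2, the naive version yields replacement characters while the streaming version returns the original text.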
My understanding is that the `CharsetDecoder` should handle that properly, but if your hunch is right, we'd expect the malformed characters to show up at the 4096th byte, since that's the default chunk size. To get a more minimal test case, maybe you can try specifying `{:chunk-size 16}` or something in the `convert` call?
I need to sleep now, but yes, I can easily create a minimal test case now as I'm pretty certain that the problem is as described above. I confirmed this using the method you described. If I simply create a string of 3-byte chars I can see the error on the 4096th byte boundary. If I reduce the chunk size I see a lot more errors.
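The observation fits simple arithmetic: 4096 = 3 × 1365 + 1, so in a string made only of 3-byte characters the 4096-byte boundary falls one byte into a character. A small Java sketch (the helper name is hypothetical, and it assumes every character encodes to the same number of bytes) counts how many chunk boundaries land inside a character:

```java
// Hypothetical helper class, for illustration only.
class SplitCharCounter {
    // Counts interior chunk boundaries that fall inside a character,
    // assuming each character takes exactly bytesPerChar bytes.
    static int countSplitChars(int charCount, int bytesPerChar, int chunkSize) {
        int totalBytes = charCount * bytesPerChar;
        int split = 0;
        for (int boundary = chunkSize; boundary < totalBytes; boundary += chunkSize) {
            if (boundary % bytesPerChar != 0) {
                split++;
            }
        }
        return split;
    }
}
```

For 2000 three-byte characters (6000 bytes), a chunk size of 4096 splits one character, while a chunk size of 16 splits 250, which matches seeing many more errors with smaller chunks.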
Okay, I'll see if I can track down what's happening.
Fixed by 29f50f7