Comments (5)
Yes, I was using an old version, thanks for helping me debug!
from jcodings.
Assuming the encoding of these characters is the same in Ruby, the GB18030 bytes appear to match:
[] ~/projects/jruby $ ruby -e 'p "Lašas".encode("gb18030").bytes'
[76, 97, 129, 48, 148, 56, 97, 115]
I do not have an explanation for why the Java GB18030 encoder produces different output.
Based on this online converter, we also match:
$ ruby -e 'p "Lašas".encode("gb18030").bytes.map{|i| i.to_s(16)}'
["4c", "61", "81", "30", "94", "38", "61", "73"]
I would say the Java encoder is in error here.
Actually, now I see that the Java getBytes matches Ruby, but the manually transcoded result in your example is not correct.
I made this into a test class and I believe the latest jcodings should match. Perhaps you are running against an old version?
$ java -cp ../jcodings/target/jcodings.jar:. Blah
[76, 97, -127, 48, -108, 56, 97, 115]
[76, 97, -127, 48, -108, 56, 97, 115]
import org.jcodings.*;
import org.jcodings.transcode.*;
import java.util.*;

public class Blah {
    public static void main(String[] args) throws Throwable {
        // Transcode manually through jcodings.
        EConv econv = TranscoderDB.open("UTF-8", "gb18030", 0);
        byte[] src = "Lašas".getBytes("UTF-8");
        // Size the destination from the JDK's own GB18030 encoding of the same string.
        byte[] dest = new byte["Lašas".getBytes("gb18030").length];
        econv.convert(src, new Ptr(0), src.length, dest, new Ptr(0), dest.length, 0);
        System.out.println(Arrays.toString(dest));
        // reported: [76, 97, -127, 48, 18, 56, 97, 115]

        // Reference result from the JDK encoder.
        System.out.println(Arrays.toString("Lašas".getBytes("gb18030")));
        // [76, 97, -127, 48, -108, 56, 97, 115]
    }
}
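A note for anyone comparing the outputs in this thread: Java bytes are signed, so -127 and -108 in the Java output are the same bytes as 129 and 148 in the Ruby output. Here is a minimal sketch of that conversion using only the JDK. The class name SignedBytes and the helper toUnsigned are made up for illustration, and the byte values are copied from the output above rather than produced by an encoder.

```java
import java.util.Arrays;

class SignedBytes {
    // Hypothetical helper: widen each signed Java byte to its unsigned value (0..255).
    static int[] toUnsigned(byte[] bytes) {
        int[] out = new int[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            out[i] = bytes[i] & 0xFF; // mask off the sign extension
        }
        return out;
    }

    public static void main(String[] args) {
        // GB18030 bytes for "Lašas" as printed by Java in this thread.
        byte[] javaBytes = {76, 97, -127, 48, -108, 56, 97, 115};
        System.out.println(Arrays.toString(toUnsigned(javaBytes)));
        // [76, 97, 129, 48, 148, 56, 97, 115]
    }
}
```

The masked values line up with what Ruby's String#bytes printed earlier, so the two encoders agree.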
Possibly fixed by @k77ch7 in 408210c. In any case, it's no longer broken.