orkon / base-x-rs
Encode/decode any base
License: MIT License
Considering the nature of base-x, it might be worth limiting the encoder input and returning an error if the limit is reached (yes, I know I just had a PR merged that removed the Result type from encode, bear with me).
The way the encoder works now is by continuously reallocating a Vec as new digits are pushed onto it (Rust will grow the Vec exponentially in this case, so it's not super terrible), then reversing the Vec to obtain the encoded value. Since what we are trying to do is, effectively, treat the input byte buffer as a single integer and modulo/divide it by the base to obtain digits, it might be worth considering an optimization usually done when stringifying integers:
Instead of a Vec, create a reasonably big byte array buffer on the stack. When stringifying u64 to decimal the buffer would need to be 20 bytes long, since that's how many decimal digits a u64 can produce. In our case a buffer of 1024 bytes could be enough for virtually all use cases. The buffer can be left uninitialized via unsafe { mem::uninitialized() }, at which point it really doesn't matter how big it is.
Pros:
Cons:
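The stack-buffer idea from this issue could look roughly like the sketch below. It is not the crate's implementation: the names encode_on_stack, InputTooLong and BUF_LEN are hypothetical, the alphabet is assumed to be a byte slice of 2 to 256 ASCII symbols, and leading zero bytes get no special treatment. The buffer is zero-initialized rather than left uninitialized, since mem::uninitialized is deprecated in current Rust in favor of MaybeUninit.

```rust
/// Hypothetical error returned when the encoded form does not fit the buffer.
#[derive(Debug, PartialEq)]
pub struct InputTooLong;

/// Size of the on-stack digit buffer (the 1024 bytes suggested in the issue).
const BUF_LEN: usize = 1024;

/// Encodes `input` (read as one big-endian integer) using `alphabet`.
/// Digits are written into a fixed stack buffer from the end backwards,
/// so no reversal and no intermediate Vec are needed.
pub fn encode_on_stack(alphabet: &[u8], input: &[u8]) -> Result<String, InputTooLong> {
    assert!((2..=256).contains(&alphabet.len()), "alphabet must have 2..=256 symbols");
    let base = alphabet.len() as u32;

    // Digit values (not symbols yet); the least significant digit sits at the end.
    let mut buf = [0u8; BUF_LEN];
    // Index of the most significant digit written so far.
    let mut start = BUF_LEN;

    for &byte in input {
        // number = number * 256 + byte, carried through the existing digits.
        let mut carry = byte as u32;
        for digit in buf[start..].iter_mut().rev() {
            carry += (*digit as u32) << 8;
            *digit = (carry % base) as u8;
            carry /= base;
        }
        // Any remaining carry becomes new, more significant digits.
        while carry > 0 {
            if start == 0 {
                return Err(InputTooLong);
            }
            start -= 1;
            buf[start] = (carry % base) as u8;
            carry /= base;
        }
    }

    // Map digit values to alphabet symbols (assumes an ASCII alphabet).
    Ok(buf[start..].iter().map(|&d| alphabet[d as usize] as char).collect())
}
```

The only heap allocation left is the final String; the price is that inputs whose encoding needs more than BUF_LEN digits are rejected, which is the limit-and-error trade-off the issue proposes.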
Per https://github.com/cryptocoinjs/base-x
WARNING: This module is NOT RFC3548 compliant, it cannot be used for base16 (hex), base32, or base64 encoding in a standards compliant manner.
This is because standards-compliant base64 aligns the start of processing with the start of the data (hence implicitly padding the bottom with zeroes), whereas base-x aligns with the end (therefore padding the top). It will work sometimes, namely when the number of bits in the data is divisible by 6 (for base64), as there are then no bits to pad. (I found https://www.lucidchart.com/techblog/2017/10/23/base64-encoding-a-visual-explanation/ useful to help my understanding.)
(As it happens, I think it might be OK for base16, because bytes always divide evenly into 4-bit groups, but base32 with its 5-bit groups surely has the same problem.)
Concretely, encoding 'A' with a standards-compliant base64 encoder results in QQ (QQ== once padded), but results in BB with base-x.
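To make the alignment difference concrete, here is a small sketch that computes both results by hand. The function names start_aligned and end_aligned are made up for illustration; end_aligned only mimics the integer-style approach for inputs of up to 16 bytes and is not the base-x code itself, and start_aligned omits the trailing = padding.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard-style grouping: 6-bit groups are taken from the first bit onward,
/// and the final group is padded with zero bits at the bottom.
fn start_aligned(input: &[u8]) -> String {
    let mut out = String::new();
    let mut acc: u32 = 0; // leftover bits not yet emitted
    let mut bits = 0; // how many bits of `acc` are valid
    for &b in input {
        acc = (acc << 8) | b as u32;
        bits += 8;
        while bits >= 6 {
            bits -= 6;
            out.push(ALPHABET[((acc >> bits) & 0x3f) as usize] as char);
        }
        acc &= (1 << bits) - 1; // drop the bits already emitted
    }
    if bits > 0 {
        // Pad the last partial group with zero bits on the right.
        out.push(ALPHABET[((acc << (6 - bits)) & 0x3f) as usize] as char);
    }
    out
}

/// Integer-style grouping: the whole input is one big-endian number that is
/// repeatedly divided by 64, so any padding ends up at the top.
fn end_aligned(input: &[u8]) -> String {
    assert!(input.len() <= 16, "sketch only handles inputs that fit in a u128");
    let mut n = input.iter().fold(0u128, |acc, &b| (acc << 8) | b as u128);
    let mut digits = Vec::new();
    while n > 0 {
        digits.push(ALPHABET[(n % 64) as usize] as char);
        n /= 64;
    }
    digits.iter().rev().collect()
}
```

For the single byte 'A' the two functions return QQ and BB respectively, while for the 24-bit input "Man" both return TWFu, because 24 is divisible by 6 and nothing needs padding.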
Hi, I am scanning the latest version of base-x with my own static analyzer tool.
Unsafe conversion found at: src/bigint.rs:115:30: 115:65.
let mut bytes = Vec::with_capacity(len);
unsafe {
    bytes.set_len(len);
    let chunks_ptr = (self.chunks.as_ptr() as *const u8).offset(skip as isize);
    ptr::copy_nonoverlapping(chunks_ptr, bytes.as_mut_ptr(), len);
}
This unsound implementation could create misalignment issues between different integer types.
The problematic value is further manipulated at: src/bigint.rs:115:30: 115:87
This could potentially cause undefined behavior in Rust. If the problematic converted value is manipulated further, it could lead to different consequences. I am reporting this issue for your attention.
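One possible way to avoid the raw-pointer copy altogether is to go through the safe to_ne_bytes conversion. The sketch below assumes the chunks are u32 values (the quoted snippet only shows self.chunks, so the element type is a guess), and chunk_bytes is a hypothetical free function, not something the crate provides.

```rust
/// Copies `len` bytes of the chunks' native-endian byte representation,
/// starting `skip` bytes in, without any unsafe code.
/// Assumption: chunks are u32; adjust the element type if the real one differs.
fn chunk_bytes(chunks: &[u32], skip: usize, len: usize) -> Vec<u8> {
    chunks
        .iter()
        .flat_map(|chunk| chunk.to_ne_bytes()) // same byte order a pointer cast would see
        .skip(skip)
        .take(len)
        .collect()
}
```

Unlike the pointer-based version, a skip or len that runs past the end of the chunks simply yields fewer bytes instead of reading out of bounds. Whether the optimizer makes this as fast as copy_nonoverlapping would need to be measured.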
Using an array of bytes as the alphabet is fine if all we are using is ASCII characters, which are one byte each. However, Rust strings are UTF-8 by default, meaning characters in strings can span multiple bytes (the char type is 4 bytes).
I might do a pull request on this later.
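As a rough illustration of the problem and of one possible direction, the sketch below treats the alphabet as a list of char values instead of bytes. Both helper names are hypothetical and the digit values are assumed to come from some existing base conversion step; this is not the crate's API.

```rust
/// Splits an alphabet string into its characters, so that a multi-byte
/// UTF-8 symbol still counts as a single digit.
fn alphabet_chars(alphabet: &str) -> Vec<char> {
    alphabet.chars().collect()
}

/// Maps already-computed digit values to symbols of a char-based alphabet.
/// Panics if a digit is not smaller than the alphabet length.
fn digits_to_string(alphabet: &[char], digits: &[usize]) -> String {
    digits.iter().map(|&d| alphabet[d]).collect()
}
```

With the alphabet "αβγδ", the string is 8 bytes long but holds only 4 characters, so indexing it by byte would split symbols in half, whereas digits_to_string(&alphabet_chars("αβγδ"), &[1, 0, 3]) gives "βαδ".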
no_alloc support seems to be possible, but would require a change in the API. Instead of constructing a String or Vec, just write into a mutable slice passed in by the caller. This change would require either a separate section of the API dedicated to no_alloc, or breaking changes in the current one.
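A slice-based entry point along these lines might look like the following sketch. The name encode_into, the BufferTooSmall error and the choice to return the number of bytes written are all assumptions made for illustration; the alphabet is again assumed to be 2 to 256 ASCII bytes, and leading zero bytes are not treated specially.

```rust
/// Hypothetical error for an output slice that cannot hold the result.
#[derive(Debug, PartialEq)]
pub struct BufferTooSmall;

/// Encodes `input` into `out` without allocating and returns how many
/// bytes of `out` were written. Uses nothing outside of `core`.
pub fn encode_into(
    alphabet: &[u8],
    input: &[u8],
    out: &mut [u8],
) -> Result<usize, BufferTooSmall> {
    assert!((2..=256).contains(&alphabet.len()), "alphabet must have 2..=256 symbols");
    let base = alphabet.len() as u32;
    // out[..len] holds digit values, least significant first.
    let mut len = 0;

    for &byte in input {
        // number = number * 256 + byte
        let mut carry = byte as u32;
        for digit in out[..len].iter_mut() {
            carry += (*digit as u32) << 8;
            *digit = (carry % base) as u8;
            carry /= base;
        }
        while carry > 0 {
            let slot = out.get_mut(len).ok_or(BufferTooSmall)?;
            *slot = (carry % base) as u8;
            carry /= base;
            len += 1;
        }
    }

    // Put the most significant digit first, then swap digit values for symbols.
    out[..len].reverse();
    for digit in out[..len].iter_mut() {
        *digit = alphabet[*digit as usize];
    }
    Ok(len)
}
```

A caller would pass a stack array such as [0u8; 64] and read back &buf[..n]. Keeping the existing String-returning functions next to something like this would avoid a breaking change, at the cost of a larger API surface.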