Comments (10)
The compliance baseline for this library should be the relevant RFC 4648. I never liked Postel's law much ("be strict in what you send, and liberal in what you accept"). Being liberal just invites abuse. All your examples should throw an error when they violate the preconditions.
from base64.
I'll check RFC 4648 and try to make a PR to update the test suite and library to handle invalid input properly.
from base64.
Thanks! Let's have a library that can at least strictly validate. If at any later moment we want to relax the validation, we could create a flag for it.
(Personally, I think that any string that violates the base64 spec as given in the RFC is not a valid base64 string, and should not be given the benefit of the doubt.)
from base64.
Maybe one comment: it's quite common for base64 input to contain whitespace (newlines and such), and it would be helpful to have a flag to tell the library to ignore it. This does not negate the fact that a bona-fide base64 string with such characters is invalid, but it would indicate that it would be useful to have a flag that tells the library to strip out those characters before processing. (See also #15).
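A minimal sketch of what such a pre-processing step could look like, assuming the caller strips whitespace into a scratch buffer before handing the data to the decoder. The name `strip_whitespace` is hypothetical and not part of the library's API; which characters count as whitespace is also an assumption.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the library: copy src to dst while
 * dropping ASCII space, tab, CR and LF. dst must have room for srclen
 * bytes. Returns the number of bytes written to dst. */
static size_t
strip_whitespace (const char *src, size_t srclen, char *dst)
{
	size_t n = 0;

	for (size_t i = 0; i < srclen; i++) {
		char c = src[i];

		if (c == ' ' || c == '\t' || c == '\r' || c == '\n') {
			continue;
		}
		dst[n++] = c;
	}
	return n;
}
```

The stripped buffer can then be validated strictly, so the flag only changes what is fed to the decoder, not what the decoder accepts.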
from base64.
So, after reviewing RFC 4648, there is not much to add to what you're saying.
First step, report all violations.
Second step, allow whitespace. See #33
from base64.
For now, I added validation for the base64_decode function only.
The streaming implementation would require a final step (as can be seen in the validation for base64_decode). This requires either adding a new function to the API or asking integrators to do the same check that is done for base64_decode.
I'm more in favor of adding a new API function. Could be something like:
int
base64_stream_decode_final (struct base64_state *state)
{
	state->eof = BASE64_EOF;
	if (state->bytes == 0) {
		return 1;
	}
	return 0;
}
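To illustrate why a separate final step is needed for streaming, here is a self-contained toy model. It does not use the library's real `struct base64_state`; `toy_state`, `toy_feed` and `toy_final` are hypothetical names, and the model only tracks how many input characters are left over from an incomplete 4-character group.

```c
#include <stddef.h>

/* Toy stand-in for a decoder state (an assumption for illustration):
 * `pending` counts input characters of a not-yet-complete group. */
struct toy_state {
	int pending;
};

/* Feed a chunk of chunklen characters. A per-chunk call cannot tell
 * whether more input will follow, so it cannot report truncation. */
static void
toy_feed (struct toy_state *st, size_t chunklen)
{
	st->pending = (int) (((size_t) st->pending + chunklen) % 4);
}

/* Final step: returns 1 if the input ended on a group boundary,
 * 0 if characters were left over. */
static int
toy_final (const struct toy_state *st)
{
	return st->pending == 0;
}
```

Feeding chunks of 3 and then 5 characters ends on a group boundary, while stopping after the first chunk does not; only the caller knows which of the two happened, hence the explicit final call.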
from base64.
Hm, I'd like to avoid changing the API if possible. Your suggested addition basically checks if there are no trailing characters in the decode buffer. I'm not sure I consider that something that we must intercept and note. My attitude would be "as long as we honour the EOF marker and output no further bytes, who cares about trailing bytes". As far as decoding is concerned, those bytes are neither here nor there, and it seems pedantic to point out to the user that his buffer length calculation is "optimistic".
What do you think? Maybe the right way is to be pedantic and technically correct.
from base64.
Well, if I quote what you said in an earlier comment:
The compliance baseline for this library should be the relevant RFC 4648. I never liked Postel's law much ("be strict in what you send, and liberal in what you accept"). Being liberal just invites abuse. All your examples should throw an error when they violate the preconditions.
I'd say that the library should throw an error even when there are trailing bytes remaining. For sure, that should be the case for the non-streaming API. Since there's no API to get the number of unread bytes, I'd say the library should return an error in this case (it might be possible to compute this number with the current version, but adding whitespace stripping will remove that possibility).
I did not understand this:
As far as decoding is concerned, those bytes are neither here nor there, and it seems pedantic to point out to the user that his buffer length calculation is "optimistic".
The only thing that will be pointed out to the user is that he's feeding too much (or not enough) data in the decoder.
The API change might be considered an enhancement only. Backward compatibility is preserved. If the user wants stricter behavior, he can integrate the new function into his code.
from base64.
The viewpoint changes depending on how you conceptualize the buffer. If you define it as being strictly a Base64-encoded string and nothing more, you get a different viewpoint than when you define it as a buffer that contains a properly-terminated Base64-encoded string starting at offset 0.
In the first case trailing bytes are an error, while in the second case trailing bytes after a valid string are simply irrelevant (which is what I meant with "neither here nor there": they don't matter).
I've been holding more or less the second view of the buffer, whereas you are closer to the first. I don't think my view contradicts what I said earlier about violating the preconditions, because the preconditions are defined for the Base64-encoded string itself and not the buffer it's embedded in. Of your four examples, I believe the first three should throw errors, whereas the fourth is probably OK (but no bytes should be output after seeing the EOF marker).
Concretely, I've assumed that it's not an error to pass an oversized buffer (of "optimistic" length :), as long as the Base64 string starting at offset 0 is valid.
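The two views can be sketched as two predicates over the same scan. This is an illustration under stated assumptions, not the library's code: the names `b64_scan`, `valid_strict` and `valid_prefix` are hypothetical, only the standard alphabet is handled, and the check that unused bits in the last data character are zero is left out for brevity.

```c
#include <stddef.h>

enum b64_scan { B64_INVALID, B64_TERMINATED, B64_UNTERMINATED };

/* Returns 1 if c is in the standard Base64 alphabet (RFC 4648). */
static int
is_b64_char (char c)
{
	return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
	    || (c >= '0' && c <= '9') || c == '+' || c == '/';
}

/* Scan the Base64 string starting at offset 0 of buf and store its
 * length in *len. A string is "terminated" if it ends in padding. */
static enum b64_scan
b64_scan (const char *buf, size_t buflen, size_t *len)
{
	size_t i = 0;

	while (i < buflen && is_b64_char(buf[i])) {
		i++;
	}
	/* Two data characters followed by "==". */
	if (i % 4 == 2 && i + 2 <= buflen && buf[i] == '=' && buf[i + 1] == '=') {
		*len = i + 2;
		return B64_TERMINATED;
	}
	/* Three data characters followed by "=". */
	if (i % 4 == 3 && i + 1 <= buflen && buf[i] == '=') {
		*len = i + 1;
		return B64_TERMINATED;
	}
	/* No padding: the data must fill the whole buffer in full groups. */
	if (i == buflen && i % 4 == 0) {
		*len = i;
		return B64_UNTERMINATED;
	}
	return B64_INVALID;
}

/* First view: the buffer is exactly one Base64 string, nothing more. */
static int
valid_strict (const char *buf, size_t buflen)
{
	size_t len = 0;

	return b64_scan(buf, buflen, &len) != B64_INVALID && len == buflen;
}

/* Second view: a valid string at offset 0 is enough; anything after
 * the padding is ignored. */
static int
valid_prefix (const char *buf, size_t buflen)
{
	size_t len = 0;

	return b64_scan(buf, buflen, &len) != B64_INVALID;
}
```

With these definitions, `"Zg==junk"` passes `valid_prefix` but fails `valid_strict`. An unpadded input such as `"Zm9vjunk"` passes both, because without a terminator there is no way to tell trailing bytes from data.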
Maybe a stricter check would be a good idea, though. I do still think that it's better to be strict than liberal in these cases.
from base64.
OK, I see what you meant now.
What bothers me with that viewpoint is that, if I'm not mistaken or missing something else, it's only valid for "terminated" Base64-encoded strings. That is, for Base64-encoded strings whose length is a multiple of 4 characters (and which therefore carry no padding), whatever remains in the buffer will be decoded (or cause an error at some point).
Given what I just said, I don't know what the real use cases for an "optimistic" length are. It seems more error-prone than anything else. Still, I might be missing something.
from base64.