Comments (6)
Validating user input makes sense to me, but I worry about cases where we can't foresee exactly what validity properties are needed at the point of receiving input. For example, it wasn't immediately obvious to me when reviewing #459 that some multiplications at type size_t couldn't overflowm, where the quantities being multiplied are not related in any remotely transparent way to user input. Mathieu convinced me that those particular multiplications are fine, but if we could reach for __builtin_mul_overflow it would be clear from the code that if overflow did occur we wouldn't continue operating in an unsafe state.
In general, I agree with doing validation up-front where that's practical, but I'm strongly in favor of making liberal use of checked arithmetic for (1) cases where early validation just isn't practical, and (2) even where we can and do validate the inputs, to provide a second line of defense in the event of faulty reasoning on our part, refactorings that introduce unanticipated states, etc. With unchecked arithmetic, the worst case for those kinds of subtle bugs is that we introduce UB; with checked arithmetic, the worst case is that we get a predictable crash that at least points to a call site.
I see your point. To me it sounds like the check should be some kind of assertion (possibly with a macro if that's more convenient), which makes the program abort, in the same vein of all the assertions we have all around the code. Especially at the beginning of functions we place a number of assertions to make sure that the assumptions we make locally about current state or function arguments actually hold, precisely because the reasons you mention, it's hard sometimes to reason through all code and it helps against regressions in refactorings. It could be as easy as placing a couple of assertions before a multiplication, to validate that the terms of the multiplications meet the expectations and assumptions.
Our testing code (unit tests all the way up to jenkins) should be good enough to trigger those assertions in case we screw up something.
Note that in some sense these are not genuine failure modes for which we should really return errors, they are more "bugs" in the code, e.g. we didn't perform validation or we broke some assumptions or similar things. Those are really software defects that should be fixed rather than bumped to the user.
FWIW SQLite uses this kind of technique very extensively (via assertions indeed). There's also an interesting take about assertions here: https://www.sqlite.org/assert.html, which I find quite sensible.
from dqlite.
In the specific case that is mentioned, I think that the value of c->request.words
should be checked at the time the request received by the server from the client over the wire gets decoded, basically right after this line:
Line 194 in e618e62
The other fields of c->request
that are decoded by message__decode
should be checked too in that spot, so that subsequent code and rely on them to be sane.
Validating the values that the client sends us over the wire is something we probably don't always do, so it'd be nice to check that too.
I'm not entirely sure what macros can do in a case like the mentioned one, what would the macro do?
from dqlite.
My thinking was that we could have a set of macros like
#define CHECKED_ADD(TY, A, B) \
{ \
/* If the (infinite-precision) result of adding A to B fits in the type TY, return that value; otherwise, abort/return an error code. */ \
}
(Or, more likely, with an int return value and out parameter for the result to allow more flexibility for what happens on overflow/wrapping.) It would be a function rather than a macro to handle a variety of input/output types, but we could also just write monomorphic functions for particular sets of types as needed. Then in a case like the one I linked we'd use CHECKED_MUL instead of a regular multiplication, erroring out if overflow/wrapping is detected.
Rather than reinventing the wheel, we could use the GCC/Clang builtins that exist for this purpose:
https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html
https://clang.llvm.org/docs/LanguageExtensions.html#checked-arithmetic-builtins
(There was some kind of proposal to standardize these in C23, but it seems not to have made it into the final cut.)
from dqlite.
The drawback I see is that it's difficult to understand what went wrong, because the check does not happen when the user provides the input, but at some point later down the road, so it'd be difficult to produce a meaningful error message (e.g. "invalid message words count: %d", to stick with the mentioned example).
Given that input should be ideally validated, I'd suggest to do check all our assumptions as soon as user input is received. After that point we can safely assume that those assumptions are met (e.g. the words count is not greater than X). Note that there might be other types of issues related to invalid user input that don't fall under the overflow category, so it feels to me that anyway the generic solution would be to validate the input. That should cover the overflow problem and any other ones where we make some sort of assumption on the input.
from dqlite.
Validating user input makes sense to me, but I worry about cases where we can't foresee exactly what validity properties are needed at the point of receiving input. For example, it wasn't immediately obvious to me when reviewing #459 that some multiplications at type size_t couldn't overflowm, where the quantities being multiplied are not related in any remotely transparent way to user input. Mathieu convinced me that those particular multiplications are fine, but if we could reach for __builtin_mul_overflow it would be clear from the code that if overflow did occur we wouldn't continue operating in an unsafe state.
In general, I agree with doing validation up-front where that's practical, but I'm strongly in favor of making liberal use of checked arithmetic for (1) cases where early validation just isn't practical, and (2) even where we can and do validate the inputs, to provide a second line of defense in the event of faulty reasoning on our part, refactorings that introduce unanticipated states, etc. With unchecked arithmetic, the worst case for those kinds of subtle bugs is that we introduce UB; with checked arithmetic, the worst case is that we get a predictable crash that at least points to a call site.
from dqlite.
Note also that placing individual assertions about the individual arguments of a multiplication just before the multiplication takes place helps in identifying what exactly is wrong, as opposed to just checking the result of the multiplication, in which case you don't know what factor is to blame. This is probably irrelevant for the specific case that was outlined above, for the words count (since one of the two factors is a constant), but it might be true in general.
from dqlite.
Related Issues (20)
- Multi threading bug in tracef() HOT 1
- Jepsen: assertion failure in vfs.c HOT 1
- Nested barriers
- 1.16.0: test suite fails HOT 17
- Method to be informed about latest change in table HOT 2
- Several singular nodes joined later together HOT 2
- Run dqlite tests against most recent SQLite release. HOT 1
- Clarification on dqlite read operations HOT 1
- 1.16.2: build fails with raft 0.22.0 HOT 3
- Fix Coverity
- Use execinfo.h instead of libbacktrace HOT 3
- Set up tmate action
- Set up a canary PPA to get feedback from LP builders before merge HOT 2
- no known leader error forever
- Node.js bindings HOT 1
- Fix cases where the server can return a FAILURE message with result code 0
- Investigate what causes LXD's dqlite to create lots of singleton closed segments
- Flaky test: cluster/restart HOT 5
- 1.16.4: build fails on linking with missing `raft_register_state_cb` symbol HOT 1
- posix_fallocate extremely slow on ext3 file system HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from dqlite.