Comments (3)
This resolves $1.
Now why are we still getting segfaults here and there? Well turns out we still stumble upon scenarios where
hash != NULL
finfo[i].hash == NULL
finfo[i].was_hash_printed == 1
(potentially should rename was_hash_printed
to was_hash_computed
)
Uncertain how this happens. Only folders should be getting a NULL
hash. Will look into it.
from build-recorder.
@zvr Now that I've come to understand sync(2)
and syncfs(2)
for that matter, do you assume that the scenario mentioned above is only to be expected if and only if there is a sync(2)
or syncfs(2)
in there?
Like this:
open f1 = "/some/file" for writing
//write
syncfs(f1)
open f2 = "/some/file" for reading
close f1(write)
close f2(read)
And if yes, do you propose that we should trace sync(2)
calls as well and store the hash of the file they sync upon meeting? That's quite easy in terms of syncfs(2)
but if we were to talk about sync(2)
we would need to know which files have pending file operations to avoid re-hashing everything, which will be a performance hit.
from build-recorder.
@zvr Okay story time!
The Bug
Remember this issue? The bug $1
i supposedly fixed? Well turns out it was also the root of bug $2
(who would have thought)!
Here is the code for handle open:
static void
handle_open(PROCESS_INFO *pi, int fd, int dirfd, void *path, int purpose)
{
path = get_str_from_process(pi->pid, path);
char *abspath = absolutepath(pi->pid, dirfd, path);
if (abspath == NULL)
error(EXIT_FAILURE, errno, "on handle_open absolutepath");
char *hash = NULL;
FILE_INFO *f = NULL;
if ((purpose & O_ACCMODE) == O_RDONLY) {
hash = get_file_hash(abspath);
f = find_finfo(abspath, hash);
}
if (!f) {
f = next_finfo();
finfo_new(f, path, abspath, hash);
} else {
free(path);
free(abspath);
free(hash);
}
*finfo_at(pi, fd) = f - finfo;
record_fileuse(pi->outname, f->outname, purpose);
if (!f->was_hash_printed && (purpose & O_ACCMODE) == O_RDONLY) {
f->was_hash_printed = 1;
record_hash(f->outname, hash);
}
}
And also find_finfo
FILE_INFO *
find_finfo(char *abspath, char *hash)
{
int i = numfinfo;
while (i >= 0 && !(!strcmp(abspath, finfo[i].abspath)
&& (!(finfo[i].was_hash_printed)
|| (hash == NULL && finfo[i].hash == NULL)
|| !strcmp(hash, finfo[i].hash)))) {
--i;
}
if (i < 0) {
return NULL;
}
return finfo + i;
}
To find out the bug we simply have to follow the path of handle_open(3)
given that f1
is now opened for reading.
We get that find_finfo(3)
thanks to the !(finfo[i].was_hash_printed)
line actually returns the half-completed file used for writing. Here:
if ((purpose & O_ACCMODE) == O_RDONLY) {
hash = get_file_hash(abspath);
f = find_finfo(abspath, hash);
}
Some code later we reach here:
if (!f->was_hash_printed && (purpose & O_ACCMODE) == O_RDONLY) {
f->was_hash_printed = 1;
record_hash(f->outname, hash);
}
f
, which was initially a file opened for writing but not yet close(2)
d. Actually has its was_hash_printed
field set to 1
. As a result f
is now an entry with its hash
field equal to NULL
and its was_hash_printed
set to 1
. Recipe for disaster if f
was to be re-used by a future read.
Solution
Given our discussion about how to handle the sync(2)
family of system calls, the way i handled bug $1
was wrong.
The way it should be handled is that if an entry has its was_hash_printed
field set to 0(that is, on writes), it should be skipped altogether from the re-usage
find_finfo(3)
call.
from build-recorder.
Related Issues (20)
- configure should check everything HOT 1
- `rename(2)` entry handler is invalid HOT 1
- `PROCESS_INFO` structure issues, missing free on PROCESS_INFO::entry_info in case of syscall failure
- Missing man page HOT 1
- Tests makefile integration HOT 2
- Standardized benchmarking HOT 16
- Extended functionality for build systems that interact with remote repositories.
- Implement tests HOT 1
- `build-recorder` build failure. HOT 2
- Output has to use the schema HOT 2
- `time(1)` is confused with the shell's `time` built-in keyword.
- Invalid use of `AC_CHECK_PROGS` HOT 4
- Rebuilding stuff after SOME changes.
- `time` `awk` and `wget/curl` unnecessarily required to build `build-recorder` HOT 2
- Failure building `build-recorder` with automated build & packaging `xbps-src` tool. HOT 2
- possible useful background information HOT 1
- Cannot build from first release
- xxd -n option HOT 8
- `build-recorder` executable placed under `src/` instead of toplevel. HOT 2
- First release tarball isn't updated after #213
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from build-recorder.