Using csmith (a generator of random valid C programs), we can stress the compiler with random programs. When we find an interesting case (an ICE or an execution mismatch), we can use creduce or cvise to reduce the testcase.
I (Patrick) have been doing this with success for a bit now, and it has helped find issues with the RISC-V vector targets (and a generic issue too!).
I recommend focusing on ISA strings with "clean" testsuites (no ICEs or execution failures), since that means every new failure will be novel.
There is a docker image if you just want to start fuzzing riscv-gcc.
Example command:
export RUNNER_NAME="local"
sudo docker pull ghcr.io/patrick-rivos/compiler-fuzz-ci:latest && sudo docker run -v ~/csmith-discoveries:/compiler-fuzz-ci/csmith-discoveries ghcr.io/patrick-rivos/compiler-fuzz-ci:latest sh -c "date > /compiler-fuzz-ci/csmith-discoveries/$RUNNER_NAME && nice -n 15 parallel --link \"./scripts/fuzz-qemu.sh $RUNNER_NAME-{1} {2}\" ::: $(seq 1 $(nproc) | tr '\n' ' ') ::: '-march=rv64gcv -ftree-vectorize -O3' '-march=rv64gcv_zvl256b -ftree-vectorize -O3' '-march=rv64gcv -O3' '-march=rv64gcv_zvl256b -O3' '-march=rv64gcv -ftree-vectorize -O3 -mtune=generic-ooo' '-march=rv64gcv_zvl256b -ftree-vectorize -O3 -mtune=generic-ooo' '-march=rv64gcv -O3 -mtune=generic-ooo' '-march=rv64gcv_zvl256b -O3 -mtune=generic-ooo'"
Command structure:
sudo docker pull ghcr.io/patrick-rivos/compiler-fuzz-ci:latest \ # Clone most recent container
&& sudo docker run \
-v ~/csmith-discoveries:/compiler-fuzz-ci/csmith-discoveries \ # Map the container's output directory with the user's desired output. Follows the format -v <SELECTED DIR>:<CONTAINER OUTPUT DIR>
ghcr.io/patrick-rivos/compiler-fuzz-ci:latest \ # Run this container
sh -c "date > /compiler-fuzz-ci/csmith-discoveries/$RUNNER_NAME \ # Record the start time
&& nice -n 15 \ # Run at a low priority so other tasks preempt the fuzzer
parallel --link \ # Gnu parallel. Link the args so they get mapped to the core enumeration
\"./scripts/fuzz-qemu.sh $RUNNER_NAME-{1} {2}\" \ # For each core provide a set of args
::: $(seq 1 $(nproc) | tr '\n' ' ') \ # Enumerate cores
::: '-march=rv64gcv -ftree-vectorize -O3' '-march=rv64gcv_zvl256b -ftree-vectorize -O3' '-march=rv64gcv -O3' '-march=rv64gcv_zvl256b -O3' '-march=rv64gcv -ftree-vectorize -O3 -mtune=generic-ooo' '-march=rv64gcv_zvl256b -ftree-vectorize -O3 -mtune=generic-ooo' '-march=rv64gcv -O3 -mtune=generic-ooo' '-march=rv64gcv_zvl256b -O3 -mtune=generic-ooo'"
# ^ All the compiler flags we're interested in
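To make the `--link` pairing concrete, here is a minimal sketch in plain POSIX shell that emulates how GNU parallel matches the core enumeration against the flag sets: job N receives the Nth value from each input source, and the shorter source wraps around. The function name `link_pairs` is made up for illustration and is not part of the repo's scripts; it only prints the pairs and runs nothing.

```shell
# Emulate the pairing done by `parallel --link` for two input sources.
# Usage: link_pairs <runner-name> <core-count> <flag-set>...
# (link_pairs is a hypothetical helper, not one of the repo's scripts.)
link_pairs() {
  runner=$1
  cores=$2
  shift 2
  nflags=$#
  # The number of jobs is the length of the longer input source.
  jobs=$cores
  if [ "$nflags" -gt "$jobs" ]; then
    jobs=$nflags
  fi
  i=0
  while [ "$i" -lt "$jobs" ]; do
    core=$(( i % cores + 1 ))    # wraps when there are more flag sets than cores
    idx=$(( i % nflags + 1 ))    # wraps when there are more cores than flag sets
    eval "flags=\${$idx}"        # fetch the idx-th flag set
    printf '%s-%s %s\n' "$runner" "$core" "$flags"
    i=$(( i + 1 ))
  done
}

link_pairs local 4 '-march=rv64gcv -O3' '-march=rv64gcv_zvl256b -O3'
```

With 4 cores and 2 flag sets, each flag set is used twice. With more flag sets than cores, the core numbers repeat instead.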
git submodule update --init csmith
sudo apt install -y g++ cmake m4
mkdir csmith-build
cd csmith
cmake -DCMAKE_INSTALL_PREFIX=../csmith-build .
make && make install
Bump GCC to use tip-of-tree & build:
git submodule update --init riscv-gnu-toolchain
cd riscv-gnu-toolchain
git submodule update --init gcc
cd gcc
git checkout master
cd ..
cd ..
mkdir build-riscv-gnu-toolchain
cd build-riscv-gnu-toolchain
../riscv-gnu-toolchain/configure --prefix=$(pwd) --with-arch=rv64gcv --with-abi=lp64d
make linux -j32
make build-qemu -j32
Update compiler.path, qemu.path, and scripts.path in the scripts directory with the absolute paths to each of those components.
./scripts/fuzz-ice.sh csmith-tmp-1 "-march=rv64gcv -mabi=lp64d -ftree-vectorize -O3"
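As a rough illustration of what counts as an "interesting case", the sketch below classifies one fuzz iteration as an ICE, an execution mismatch, or uninteresting. It is not the logic of fuzz-ice.sh or fuzz-qemu.sh, which is not shown here. The function name, the three input files, and the idea of comparing against a reference build are all assumptions made for the example.

```shell
# Hypothetical triage of a single fuzz iteration.
# Usage: classify_run <compile-log> <reference-output> <target-output>
classify_run() {
  compile_log=$1     # assumed: captured stdout/stderr of the compiler under test
  reference_out=$2   # assumed: program output from a trusted reference build
  target_out=$3      # assumed: program output from the build under test (e.g. run in QEMU)
  if grep -q 'internal compiler error' "$compile_log"; then
    echo ice
  elif ! cmp -s "$reference_out" "$target_out"; then
    echo mismatch
  else
    echo ok
  fi
}

# Example call (file names are placeholders):
#   classify_run compile.log native.out qemu.out
```

Because csmith programs print a checksum of their global state, a byte-for-byte comparison of the two outputs is usually enough to spot a mismatch.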
Running a single script is good, but if you have multiple cores (you probably do!) you can use them all!
parallel --lb "nice -n 15 ./fuzz-qemu.sh csmith-tmp-{} '-march=rv64gcv -mabi=lp64d -ftree-vectorize -O3'" ::: $(seq 1 $(nproc))
GNU parallel makes running multiple copies of a script easy.
nice -n 15 basically tells Linux "this process is low priority".
By setting this, we can leave the fuzzer running in the background and Linux will automatically de-prioritize it when more important tasks come up (building GCC, running a testsuite, terminal sessions, anything).
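If you want to see the effect of nice for yourself, this small check prints the niceness of the current shell next to the niceness a child process gets when launched through nice -n 15. It relies on nice printing the current niceness when run with no arguments, which is the GNU coreutils behavior and may differ on other systems.

```shell
# Niceness of the current shell (nice with no arguments prints it on GNU coreutils).
base=$(nice)
# Niceness seen by a command launched the same way the fuzzer is launched.
lowered=$(nice -n 15 nice)
echo "shell niceness: $base, fuzzer niceness: $lowered"
```

A higher number means a lower scheduling priority, so the second value should never be smaller than the first.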
Once you've found a bug you could submit it directly to bugzilla, but it's pretty big and can probably be reduced in size!
Here's what your bug could look like after reducing it (pr112561):
int printf(char *, ...);
int a, b, c, e;
short d[7][7] = {};
void main() {
short f;
c = 0;
for (; c <= 6; c++) {
e |= d[c][c] & 1;
b &= f & 3;
}
printf("%X\n", a);
}
- Set up scripts directory
Fill out compiler.path, csmith.path, qemu.path, and scripts.path. More info.
- Create triage directory & copy over the testcase
This will hold the initial testcase (rename it to raw.c) and the reduced testcase (red.c)
- cd into the triage folder
- Preprocess the initial testcase (raw.c)
../scripts/preprocess.sh '<gcc-opts>'
- Edit cred-ice.sh or cred-qemu.sh to use the correct compilation options
Ensure the behavior is present by running the script:
../scripts/cred-ice.sh
or ../scripts/cred-qemu.sh
This is a great time to try to reduce the command line args/ISA string. Edit compiler-opts.txt and see if removing some extensions still causes the issue to show up.
- Reduce!
You can use creduce or cvise for this. I prefer creduce so that's what I'll use for the examples, but I use them interchangeably. I believe the CLI options are the same for both.
creduce ../scripts/cred-ice.sh red.c compiler-opts.txt
and let it reduce!
Some helpful options:
creduce ../scripts/cred-ice.sh red.c compiler-opts.txt --n 12
- Use 12 cores instead of the default 4
creduce ../scripts/cred-ice.sh red.c compiler-opts.txt --sllooww
- Try harder to reduce the testcase. This typically takes longer, so I'll reduce without --sllooww first and then use --sllooww after the initial reduction is done.
cvise can be run with a subset of passes. This is helpful for testcases that tend to reduce to undefined behavior. More info can be found in /cvise-passes
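For readers who have not written an interestingness test before, the sketch below shows the general shape that creduce and cvise expect: a check that exits 0 while the testcase still triggers the bug and non-zero once it no longer does. This is not the contents of cred-ice.sh. The function name, its arguments, and the grep pattern are assumptions chosen for illustration.

```shell
# Hypothetical ICE check in the style of a creduce/cvise interestingness test.
# Usage: still_ices <compiler> <testcase> <options-file>
still_ices() {
  compiler=$1    # assumed: path to the compiler driver under test
  testcase=$2    # e.g. red.c
  opts_file=$3   # e.g. compiler-opts.txt, holding flags separated by whitespace
  # Word splitting of the options is intentional here.
  # The function's exit status is grep's: 0 only if the ICE text appears.
  "$compiler" $(cat "$opts_file") -c "$testcase" -o /dev/null 2>&1 \
    | grep -q 'internal compiler error'
}

# Inside a real interestingness script the last line might look like this
# (the compiler path is a placeholder):
#   still_ices /path/to/riscv64-unknown-linux-gnu-gcc red.c compiler-opts.txt
```

The reducer runs the script in a scratch directory containing copies of the files being reduced, which is why the testcase and options file are referred to by bare file name.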
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112855
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112801
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112561
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113087
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112929
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112988
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112932
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113206
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113209
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113431
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113796
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114027
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114028
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114200
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114202
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114247
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114396
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114485
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114665
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114666
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114668
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114733
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114734
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112481
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112535
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112554
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112552
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112733
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112773
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112813
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112852
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112851
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112871
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112854
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112872
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112469
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112971
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113001
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113210
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113228
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113603
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114195
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114196
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114197
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114198
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114386
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114749
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114314
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114608
- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114671
- RISCV64 miscompile at -O1
- RISCV64 miscompile at -O2/-O1
- RISCV64 vector miscompile at -O2
- RISCV vector zvl256b miscompile at -O2
- [RISC-V] Miscompile at -O2
- [RISC-V] Miscompile at -O2
- [RISC-V] Vector -flto -O2 miscompile
- [RISC-V][SLP] Sign extension miscompile
- [SLP] Missing sign extension of demoted type before zero extension
- [RISC-V][SLPVectorizer] rv64gcv miscompile
- RISCV64 backend segfault in RISC-V Merge Base Offset
- RISCV64 backend "Invalid size request on a scalable vector"
- [LSR][term-fold] Ensure the simple recurrence is reachable from the current loop
- [InstCombine] Infinite loop/hang
- [Pass Manager] Infinite loop of scheduled passes
- [DAGCombiner][RISC-V] DAGCombiner.cpp:8692: Assertion `Index < ByteWidth && "invalid index requested"' failed.
- [RISC-V] Segfault during pass 'RISC-V DAG->DAG Pattern Instruction Selection'
- [InstCombine][RISC-V] UNREACHABLE executed at InstCombineCompares.cpp:2788
- [LoopVectorize] Assertion `OpType == OperationType::DisjointOp && "recipe cannot have a disjoing flag"' failed.
- [SLP] Attempted invalid cast from VectorType to FixedVectorType
- [LoopVectorize][VPlan] Unreachable executed "Unhandled opcode!"
- [LoopVectorize][VPlan] Assertion `MinBWs.size() == NumProcessedRecipes && "some entries in MinBWs haven't been processed"' failed.
- [LoopVectorize][VPlan] Assertion "Trying to access a single scalar per part but has multiple scalars per part." failed.
- [Inline] Assert getOperand() out of range! failed.
- [RISC-V] Error in backend: Invalid size request on a scalable vector.
- [VectorCombine] Assertion 'isa(Val) && "cast() argument of incompatible type!"' failed.
- [CodeGen][RISC-V] Assertion `(!MMO->getSize().hasValue() || !getSize().hasValue() || MMO->getSize() == getSize()) && "Size mismatch!"' failed.
- [LoopVectorize] Assertion 'VecTy.SimpleTy != MVT::INVALID_SIMPLE_VALUE_TYPE && "Simple vector VT not representable by simple integer vector VT!"' failed.
- [LoopVectorize][VPlan] Found non-header PHI recipe in header - Assertion `verifyVPlanIsValid(*Plan) && "VPlan is invalid"' failed.
- [Clang] Assertion isCurrentFileAST() && "dumping non-AST?" failed. with -module-file-info
- [Clang][Interp] Assertion 'Offset + sizeof(T) <= Pointee->getDescriptor()->getAllocSize()' failed. with -fexperimental-new-constant-interpreter
- [Clang] Segfault with -fcoverage-mapping -fcs-profile-generate -fprofile-instr-generate
- [X86][RISC-V][AARCH64] fatal error: error in backend: Can only embed the module once with -fembed-bitcode -ffat-lto-objects -flto
- [RISC-V] Unhandled encodeInstruction length! at RISCVMCCodeEmitter.cpp:338 with -fglobal-isel -fstack-protector-all
- [RISC-V] LLVM ERROR: unable to legalize instruction with -fglobal-isel -finstrument-functions -flto -fuse-ld=lld
- [X86] LLVM ERROR: cannot select with -fglobal-isel -finstrument-functions -flto
- [LLD] Unreachable executed with -fsplit-stack
- [RISC-V] Unresolvable relocation with -fdirect-access-external-data -fstack-protector-all
- [Clang] Assertion 'Symbol' failed. with -fdebug-macro -gline-directives-only
- [CodeGen] Assertion 'Offset >= Size' failed. with -mms-bitfields
Have an improvement? PRs are welcome!