rust-lang / crater

Run experiments across parts of the Rust ecosystem!

crater's Introduction

Crater

Crater is a tool to run experiments across parts of the Rust ecosystem. Its primary purpose is to detect regressions in the Rust compiler, and it does this by building a large number of crates, running their test suites and comparing the results between two versions of the Rust compiler.

It can operate locally (with Docker as the only dependency) or distributed in the cloud. It only works on Linux at the moment, and it's licensed under both the MIT and Apache 2.0 licenses.

The current features of Crater are:

Discover Rust codebases on crates.io and GitHub
Execute experiments on custom Rust toolchains
Run cargo build and cargo test over all the discovered codebases
Build and test without dependency updates or network access
Run arbitrary tests over all the discovered codebases
Generate HTML reports with results and logs
Isolate tests in Docker containers

Crater is a successor to taskcluster-crater. It was subsequently named cargobomb before resuming the Crater name.

⚠️ DO NOT RUN CRATER IN AN UNSANDBOXED ENVIRONMENT ⚠️
Crater executes malicious code that will destroy what you love.

Documentation

Want to contribute to Crater? Check out the contribution guide.

User documentation:

Operations documentation:

Technical documentation:

crater's Issues

Support running custom toolchains.

Filter failures by timeout / compiler error / LLVM assertion.

From the original issue (rust-lang/rust#33638), by eddyb:

We could be missing out on huge bugs just because they aren't obvious regressions.
And on the other hand, timeouts just fudge the stats without being relevant most of the time.

Failure without an error message

Of the latest cargobomb run by @tomprince , the following regressions were reported without any error message:

Blacklist update needed

Here are some flaky/buggy tests:

rust-lang/rust#43957
rust-lang/rust#43958
rust-lang/rust#43959
rust-lang/rust#43960
rust-lang/rust#43961
rust-lang/rust#43962
rust-lang/rust#43963

Crates which create files in build scripts can't be cargobombed

For instance, any crate using LALRPOP needs to generate a .rs file from the grammar file, but this fails because the file system is read-only.

Filter failures by whether the dependencies passed.

Hi,

It often happens that a crate breaks, causing many downstream build failures. It would be nice to know in the UI whether a crate fails to build because of its own code or because of a dependency. It may be possible to find this information in the logs; if not, since we are also testing all the dependencies, a crate could be marked build-dependencies-fail whenever any of its dependencies is build-fail.
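
The reclassification suggested above could be sketched roughly as follows. This is only an illustration: the `Outcome` enum, the `classify` function, and the shape of the two maps are assumptions made for the example, not Crater's actual data model.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Outcome {
    Pass,
    BuildFail,
    /// Hypothetical extra status: the crate failed to build and so did a dependency.
    BuildDependenciesFail,
}

/// Turn `BuildFail` into `BuildDependenciesFail` for every crate that has at
/// least one direct dependency whose own raw result is `BuildFail`.
fn classify(
    results: &HashMap<&str, Outcome>,
    deps: &HashMap<&str, Vec<&str>>,
) -> HashMap<String, Outcome> {
    results
        .iter()
        .map(|(name, outcome)| {
            // Look at the raw results of the direct dependencies, if any are known.
            let dep_failed = deps.get(*name).map_or(false, |ds| {
                ds.iter()
                    .any(|d| results.get(*d) == Some(&Outcome::BuildFail))
            });
            let refined = if *outcome == Outcome::BuildFail && dep_failed {
                Outcome::BuildDependenciesFail
            } else {
                *outcome
            };
            (name.to_string(), refined)
        })
        .collect()
}
```

Because a dependency that fails transitively is itself a raw `BuildFail`, checking only direct dependencies is enough to propagate the label down a chain.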

Just my unrequested 2cents.

Getting Servo tested by crater

rust-lang/rust#43880 proposed a change that could potentially break some code, and cargobomb was used to evaluate the impact. This found a couple of issues that were fixed, and then the PR was landed. It was only after this change reached the Nightly channel that the Travis-CI cron job at https://github.com/servo/servo-with-rust-nightly/ found that this change broke Servo.

I’d like Servo, or parts of it, to be added to the set of crates that are tested by cargobomb for PRs like rust-lang/rust#43880. Is there some list we can add to?

The two main entry points in https://github.com/servo/servo/ are:

Servo itself. Build with ./mach build --dev or (cd ports/servo && cargo build). This uses unstable features and is known to compile with the Nightly version (date) specified in ./rust-toolchain. That version is generally updated within days of breakage reaching the Nightly channel.
The parts going into Firefox (currently only Stylo a.k.a. Quantum CSS). Build with ./mach build-geckolib or (cd ports/geckolib && cargo build). This is known to compile with the release specified in ./rust-stable-version and should work with any later version or Nightly.

By default mach will download appropriate versions of Rust and Cargo and not use those in $PATH. This can be changed with a config file, but running cargo directly might be easier.

We can definitely change or add things to the repository to make it easier for cargobomb to discover what to build/run.

CC @metajack @jonathandturner

Limit size of logs being created by default

We truncate logs manually to 100MB.

This needs doing automatically (actually truncating it, not just partially uploading it, so we save disk space) and reducing it to 5MB. Maybe even 2MB (which, for context, is about how big a log for a successful build of Rust in travis is)!

Any thoughts on the size off the top of your head @Mark-Simulacrum?
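
As a rough illustration of the automatic truncation proposed above, the sketch below caps a log held in memory at a byte limit. The function name `truncate_log` and the marker text are made up for the example; for a log that already lives on disk, something like `std::fs::File::set_len` would be the closer equivalent.

```rust
/// Keep at most `max_bytes` of `log`. The cut is moved back to the nearest
/// UTF-8 character boundary, and a marker line is appended when anything
/// was dropped.
fn truncate_log(log: &str, max_bytes: usize) -> String {
    if log.len() <= max_bytes {
        return log.to_string();
    }
    let mut end = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !log.is_char_boundary(end) {
        end -= 1;
    }
    format!("{}\n[log truncated at {} bytes]", &log[..end], max_bytes)
}
```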

Report generation no longer uploads `shas.json`.

link to the source in the ui

Hi,

I was trying to do some drive-by triaging. It's a small thing, but it would be easier if the website had a link to the source code for each crate, i.e. a link to the repository for things from crates.io, and the GitHub URL for things from GitHub.

Just my unrequested 2cents.

Reconsider running generate-lockfiles in docker

Containerization seems to add significant time to prepare-ex, and there's no sandboxing requirement here.

auto-bisect regressions based on the pr-by-pr builds

Since we now have builds available from every PR, when cargobomb detects a regression, it could in principle bisect to find the precise PR that caused it. This would be amazing.
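
In principle the bisection is a plain binary search over the ordered list of per-PR builds. The sketch below assumes the builds are sorted oldest to newest and that a regression, once introduced, stays present in every later build; `first_regressed` and the `is_regressed` callback are hypothetical names, and the callback stands in for "run the crate against this build".

```rust
/// Return the index of the first build for which `is_regressed` is true,
/// or `None` if no build regresses. Calls `is_regressed` O(log n) times.
fn first_regressed<T>(
    builds: &[T],
    mut is_regressed: impl FnMut(&T) -> bool,
) -> Option<usize> {
    let (mut lo, mut hi) = (0, builds.len());
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        if is_regressed(&builds[mid]) {
            // The culprit is at `mid` or earlier.
            hi = mid;
        } else {
            // Everything up to and including `mid` is fine.
            lo = mid + 1;
        }
    }
    if lo < builds.len() { Some(lo) } else { None }
}
```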

Cap lints

It would be cool if cargobomb would cap the lints so that crates which did #![deny(warnings)] don't get any regressions reported.
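
One possible way to do this is rustc's `--cap-lints` flag, passed through the `RUSTFLAGS` environment variable. The sketch below only builds the command; the helper names are hypothetical and this is not necessarily how Crater invokes cargo.

```rust
use std::process::Command;

/// Append `--cap-lints warn` to any flags that are already set, so that
/// `#![deny(warnings)]` can no longer turn a new warning into a build failure.
fn capped_rustflags(existing: Option<&str>) -> String {
    match existing {
        Some(flags) if !flags.trim().is_empty() => {
            format!("{} --cap-lints warn", flags.trim())
        }
        _ => "--cap-lints warn".to_string(),
    }
}

/// Build (but do not run) a `cargo build` command with capped lints.
fn cargo_build_with_capped_lints(existing: Option<&str>) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.arg("build").env("RUSTFLAGS", capped_rustflags(existing));
    cmd
}
```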

Tracking issue: cargobomb -> crater rename

We're bringing all things 'ecosystem testing' back under the crater name, since the precise mechanism for performing the tests isn't important, just that they get done. Possibly incomplete list of things that should be done (or at least thought about):

User-facing

Rename repo
Update readme in this repo
Update references on doc sites if relevant, e.g. https://forge.rust-lang.org/
Update tag names on rust-lang repo

Code

Update code in this repo

Infra

Update instance hostnames
Change s3 bucket

These are in approximate order of priority - docs, then code, then infra. This is in rough difficulty order as well, since (for example) we should make at least some effort to keep existing links alive when changing s3 bucket.

For now I'd like to get the docs section done, then see how things look after impl period.

Ignore blacklisted crates

Similar to #99, we maintain a blacklist in blacklist.md of crates with unreliable test suites, but it is not used by cargobomb. Teach cargobomb to ignore these, perhaps by reporting their result as 'blacklisted'.

Allow some file system usage by tests of a crate

Hi,

Some crates may need file system access for their tests or examples. I have a crate whose tests use the file system, and they fail (here is an example log: https://gist.github.com/tglman/6347a78ea3695ad0b2bbfa1158c6f3f1). In this specific crate the tests operate on the root folder of the crate, but that is not a requirement; access to a tmp file system would probably be good enough.

include error messages in reports w/o requiring clicks

Going through cargobomb reports is rather tedious and involves quite a lot of clicking. I think the reports could summarize some information that would make them so much more useful. For example, they could grep through the output for the word "error" and include, with each entry, the error lines (and ideally a link into the log to see more).

Once we have that, we could start grouping instances of the same error message (ideally, "similar" error messages), but that's probably another feature.
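
The "grep for error" step described above might look something like this minimal sketch. The function name `error_lines` is an assumption, and the match is deliberately as naive as the proposal: any line containing the word "error" is kept, together with its line number so a report could link into the full log.

```rust
/// Return `(1-based line number, line)` for every log line containing "error".
fn error_lines(log: &str) -> Vec<(usize, &str)> {
    log.lines()
        .enumerate()
        .filter(|(_, line)| line.contains("error"))
        .map(|(i, line)| (i + 1, line))
        .collect()
}
```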

Building the docker container takes longer than the timeout, so fails.

This should probably be fixed by making the timeout for commands configurable.

Windows support

There's a large number of crates out there which are Windows specific, so it is essential that crater be able to test those crates.

CLI usage on Windows

#267: Have Crater successfully build on windows
#275: Run cargo check on Appveyor
#280: Run cargo build on Appveyor
- C dependencies need to be downloaded in the Appveyor image
Implement and test all the functions in src/native/windows.rs
- #332: It would be nice to isolate all platform-specific code into a native module
Create the Docker image used by Crater to test crates
Successfully execute a demo run on Windows
Run cargo test on Appveyor
- The test suite needs to be working on Windows
Successfully execute a full run on Windows
- This probably needs to be done on a server, since full runs take ages

Craterbot support for Windows

Add a way to categorize agents by platform
Add support for scheduling runs on multiple platforms at the same time
Add support for rendering multiple platforms together in the report
Ensure the agent works on Windows

Reference material

https://docs.microsoft.com/en-us/virtualization/windowscontainers/about/

prepare-ex command has extreme latency due to null registry updates

Each call to cargo generate-lockfile does a registry update, so this step takes a great amount of time. Cargo itself does not expose a way to avoid this. It is possible to do with the cargo API, but I'd prefer to just use cargo from the command line if possible.

cc rust-lang/cargo#3479

Switch to using xz packages when installing custom toolchains.

This depends on using a version of rustup newer than 1.3.

Replace shelling out git in `src/git.rs` with usage of `git2` crate.

demo list is not working

$ cargo run -- define-ex beta nightly
    Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
     Running `target/debug/cargobomb define-ex beta nightly`
May 13 21:36:25.611 INFO program args: define-ex beta nightly
thread 'main' panicked at 'assertion failed: `(left == right)` (left: `1`, right: `2`)', src/ex.rs:92
note: Run with `RUST_BACKTRACE=1` for a backtrace.
May 13 21:36:25.865 ERRO panicked: assertion failed: `(left == right)` (left: `1`, right: `2`)
May 13 21:36:25.866 INFO command failed
May 13 21:36:25.866 INFO logs: ./work/logs/2017-05-13T21-36-25.601166413
May 13 21:36:25.866 INFO duration: 0s

logging to file not working correctly after slog changes

there should be lines here about what failed: http://35.184.54.15/ex/default/res/beta/reg/serde_json-1.0.2/log.txt

broke after #28

Crater fails to build/run on macOS

I am trying to run crater on macOS 10.13.1 (17B48) and getting the following failure:

error: failed to run custom build command for `libgit2-sys v0.6.12`

Here is the full output from running cargo run -- prepare-local --docker-env mini.

I have installed libgit2, libssh2, and openssl via home brew.

Add a "broken list" of crates that always fail

According to a recent run there were > 3000 crates that failed to build. Some are certainly due to cargobomb issues, but I bet a lot are just plain broken. That's a lot of CPU time to waste. Add another type of list, probably just hardcoded in a text file, of crate / versions known to not work, and have cargobomb skip them.
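
A hardcoded text file of this kind could be consumed along these lines. The file format (one `name-version` entry per line, `#` for comments) and both function names are assumptions made for the example.

```rust
use std::collections::HashSet;

/// Parse a skip list: one `name-version` entry per line, with blank lines
/// and lines starting with `#` ignored.
fn parse_skip_list(text: &str) -> HashSet<&str> {
    text.lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .collect()
}

/// Return the crates that are not on the skip list, preserving order.
fn crates_to_run<'a>(all: &[&'a str], skip: &HashSet<&str>) -> Vec<&'a str> {
    all.iter().copied().filter(|c| !skip.contains(*c)).collect()
}
```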

Track build/test times of individual crates across runs.

The current estimation of time is based on taking the average time for the crates that have been run in the current process. Since crates can vary significantly in how long they take to build (or fail to build), the estimates can vary wildly (I have seen variations of at least 2x). This could be improved by averaging over previous runs of the remaining crates, perhaps adjusted based on the current run of the already completed crates.
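
One way to read the proposal above: sum the historical times of the remaining crates, then scale by how much faster or slower the current run has been on the crates already completed. The sketch below follows that reading; the function name, the use of plain seconds as `f64`, and the fallback for crates with no history are all assumptions.

```rust
use std::collections::HashMap;

/// Estimate the seconds left in a run.
/// `history`: seconds each crate took in a previous run.
/// `done`: (crate, seconds) for crates finished in the current run.
/// `remaining`: crates still to be run.
fn estimate_remaining_secs(
    history: &HashMap<&str, f64>,
    done: &[(&str, f64)],
    remaining: &[&str],
) -> f64 {
    // Ratio of current to historical time, over crates present in both.
    let (mut now, mut before) = (0.0, 0.0);
    for (name, secs) in done {
        if let Some(prev) = history.get(*name) {
            now += secs;
            before += prev;
        }
    }
    let ratio = if before > 0.0 { now / before } else { 1.0 };

    // Crates without history fall back to the current run's average.
    let fallback = if done.is_empty() {
        0.0
    } else {
        done.iter().map(|(_, s)| s).sum::<f64>() / done.len() as f64
    };

    remaining
        .iter()
        .map(|name| match history.get(*name) {
            Some(prev) => prev * ratio,
            None => fallback,
        })
        .sum()
}
```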

is_rust_app loop could be parallelized

This could easily be parallelized with rayon, GitHub rate limiting permitting.

https://github.com/brson/cargobomb/blob/bbb044725c2cf32206d075c9439b29808c043ac4/src/lists.rs#L252oiy

Error message references invalid command

corey@mac ~/d/cargobomb (master) [1]> cargo run -- define-ex stable beta --crate-select top-100
    Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
     Running `target/debug/cargobomb define-ex stable beta --crate-select top-100`
boom! program args: define-ex stable beta --crate-select top-100
kaboom! unable to read pop list. run `cargobomb create-full-lists`?
kaboom! caused by: No such file or directory (os error 2)
boom! command failed
boom! logs: 2017-05-13T13-14-24.427538000
boom! duration: 0s
corey@mac ~/d/cargo

create-full-lists is not a command

TSA won't let me board plane

They asked if I had any explosives and I told them "nothing except for cargobomb on my laptop." Now I'm on the No Fly List and can't get to the rust conference.

Thanks @brson for making such an explosive testing package. Now everyone is going to think rust devs are terrorists trying to board planes with bombs in their cargo. I don't think this was the kind of publicity rust wanted.

Tracking issue: crater needs a test suite

There are aspects of crater we should have some confidence in through tests, but our coverage today is minimal (zero tests, so all current confidence comes from us doing runs and them not breaking).

Basic functionality:

Running the readme instructions should always work
The blacklist should work

Security/isolation:

A process in a hot output loop should be killed after a timeout (#138)
A process generating tons of garbage should be killed after a certain size is hit (#160)
Network access should be disabled during tests
Network access should be disabled in build scripts

Issues that are fixed by PR should be considered for addition to this list as well.

New boilerplate for crater results

Initial minimal tweak:

Hi X (crater requester), Y (reviewer)! Crater results are at: <url>. 'Blacklisted' crates (spurious failures etc) can be found here. If you see any spurious failures not on the list, please make a PR against that file.

Cargobomb leaks processes when they time out.

Tetanus is dead, echran is indev.

As I said in the title. I am the maintainer of these repos, and unless I get help they won't be modified to newer rust versions.
https://github.com/LogoiLab/echran
https://github.com/LogoiLab/Tetanus

Crater should do runs against builds with llvm assertions in (by default)

This used to be the case, but changed. This will mean splitting up the set of builds perf and crater use.

See

Error message references non-existent command

https://github.com/brson/cargobomb/blob/4b6e1c178794cfb36caa5cf93381d6bc0eff9fc4/src/lists.rs#L352

Could not read log due to not being able to open result file

@aidanhs says that they've seen these as well.

[cargobomb-prod cargobomb]$ cargo run --release -- publish-report --ex stable-1.20-beta-1.21.0-beta.1 s3://cargobomb-reports/stable-1.20-beta-1.21.0-beta.1
    Finished release [optimized] target(s) in 0.1 secs
     Running `target/release/cargobomb publish-report --ex stable-1.20-beta-1.21.0-beta.1 's3://cargobomb-reports/stable-1.20-beta-1.21.0-beta.1'`
Sep 08 22:38:46.067 INFO program args: publish-report --ex stable-1.20-beta-1.21.0-beta.1 s3://cargobomb-reports/stable-1.20-beta-1.21.0-beta.1
Sep 08 22:38:59.673 INFO writing results to s3://cargobomb-reports/stable-1.20-beta-1.21.0-beta.1
Sep 08 22:44:45.946 ERRO Could not read log for glium_macros-0.0.1 1.20.0: Couldn't open result file.
Sep 08 22:44:45.947 ERRO Could not read log for glium_macros-0.0.1 beta-2017-09-02: Couldn't open result file.
Sep 08 22:47:11.178 ERRO Could not read log for libbreakpad-client-sys-0.1.0 1.20.0: Couldn't open result file.
Sep 08 22:47:11.178 ERRO Could not read log for libbreakpad-client-sys-0.1.0 beta-2017-09-02: Couldn't open result file.

Install libreadline-dev

See for example http://cargobomb-reports.s3.amazonaws.com/stable-1.21.0-beta-1.22.0-beta.3/beta-2017-11-13/gh/SirVer.shell_grunt2.75a90cd9a915bed8b27d472c1185d44087299b92/log.txt. Lua needs libreadline headers to build.

Links to test logs should be URL-escaped, otherwise will get AccessDenied

Repro steps:

Go to http://cargobomb-reports.s3.amazonaws.com/pr-45333/index.htm
Click "regressed"
Open either of the two logs for the crate google-dfareporting2d2-cli-1.0.6+20160803
Got an AccessDenied error ← unexpected
Change the URL's + sign to %2B
Now you can read the log.
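
The fix implied by the repro steps is to percent-encode each path segment when generating the links. A minimal sketch, assuming a hypothetical helper `encode_segment` that leaves only the RFC 3986 unreserved characters untouched:

```rust
/// Percent-encode one URL path segment. ASCII letters, digits and `-._~`
/// pass through; every other byte becomes `%XX`.
fn encode_segment(segment: &str) -> String {
    let mut out = String::with_capacity(segment.len());
    for byte in segment.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}
```

Applied to the crate name from the repro, the `+` becomes `%2B`, which matches the manual workaround in the steps above.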

Container time restrictions don't always apply

On the cargobomb-prod instance there's a crate (pleingres) that generates huge amounts of output due to erroring in a tight loop, but doesn't get terminated after 30min. Could be a bug in our timer code where if stdout is always saturated it never gets round to checking if the timer has expired?
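
If the hypothesis above is right, one way to avoid the problem is to check the clock on every piece of output, so a process that never stops printing cannot starve the timer. The sketch below is not Crater's timer code; `drain_output`, the `Stop` enum and the byte cap are hypothetical, and the output is modelled as an iterator of lines.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
enum Stop {
    Finished,
    TimedOut,
    TooMuchOutput,
}

/// Consume output lines until the stream ends, the timeout elapses, or more
/// than `max_bytes` of output has been seen.
fn drain_output<I>(lines: I, timeout: Duration, max_bytes: usize) -> Stop
where
    I: Iterator<Item = String>,
{
    let start = Instant::now();
    let mut seen = 0usize;
    for line in lines {
        // Checked once per line, even when output arrives without pause.
        if start.elapsed() >= timeout {
            return Stop::TimedOut;
        }
        seen += line.len();
        if seen > max_bytes {
            return Stop::TooMuchOutput;
        }
    }
    Stop::Finished
}
```

A loop like this only notices the deadline when a line arrives, so a real implementation would still need a separate timer for processes that go silent.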

`cargobomb create-lists-full` results in panic

https://github.com/brson/cargobomb/blob/bbb044725c2cf32206d075c9439b29808c043ac4/src/model.rs#L222-L223

corey@mac ~/d/cargobomb (master) [1]> cargo run -- create-lists-full
    Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
     Running `target/debug/cargobomb create-lists-full`
boom! program args: create-lists-full
thread 'main' panicked at 'unimplemented args_to_cmd create-lists-full', src/model.rs:314
stack backtrace:
   0: std::sys::imp::backtrace::tracing::imp::unwind_backtrace
   1: std::panicking::default_hook::{{closure}}
   2: std::panicking::default_hook
   3: std::panicking::rust_panic_with_hook
   4: std::panicking::begin_panic
   5: std::panicking::begin_panic_fmt
   6: cargobomb::model::conv::clap_args_to_cmd
   7: cargobomb::main_
   8: core::ops::FnOnce::call_once
   9: std::panicking::try::do_call
  10: __rust_maybe_catch_panic
  11: std::panicking::try
  12: std::panic::catch_unwind
  13: cargobomb::main
  14: __rust_maybe_catch_panic
  15: std::rt::lang_start
  16: main
kaboom! panicked: unimplemented args_to_cmd create-lists-full
boom! command failed
boom! logs: 2017-05-13T13-12-44.816960000
boom! duration: 0s

Some `build-fail`s are wrongly treated as `test-fail`s.

From the most recent cargobomb report at rust-lang/rust#44287, the following should be considered build-fail, but are shown as test-fail:

generator-0.5.0 failed to build an example.
treeflection_derive-0.1.16, ditto.
whiteread-0.4.3 failed to build a doctest.

Failed git pull caused by force push

Oct 16 12:09:38.738 INFO pulling existing url git://github.com/DenialAdams/bamegoy into ./work/local/gh-mirrors/DenialAdams.bamegoy
Oct 16 12:09:38.738 INFO running `"git" "pull"`
Oct 16 12:09:38.860 INFO blam! Auto-merging src/rom.rs
Oct 16 12:09:38.860 INFO blam! CONFLICT (add/add): Merge conflict in src/rom.rs
Oct 16 12:09:38.860 INFO blam! Auto-merging src/ppu.rs
Oct 16 12:09:38.860 INFO blam! CONFLICT (add/add): Merge conflict in src/ppu.rs
Oct 16 12:09:38.860 INFO blam! Auto-merging src/main.rs
Oct 16 12:09:38.860 INFO blam! CONFLICT (add/add): Merge conflict in src/main.rs
Oct 16 12:09:38.860 INFO blam! Auto-merging src/cpu.rs
Oct 16 12:09:38.860 INFO blam! CONFLICT (add/add): Merge conflict in src/cpu.rs
Oct 16 12:09:38.860 INFO blam! Auto-merging Cargo.toml
Oct 16 12:09:38.860 INFO blam! CONFLICT (add/add): Merge conflict in Cargo.toml
Oct 16 12:09:38.860 INFO blam! Auto-merging Cargo.lock
Oct 16 12:09:38.860 INFO blam! CONFLICT (add/add): Merge conflict in Cargo.lock
Oct 16 12:09:38.860 INFO blam! Automatic merge failed; fix conflicts and then commit the result.
Oct 16 12:09:38.860 ERRO unable to pull git://github.com/DenialAdams/bamegoy
Oct 16 12:09:38.860 ERRO caused by: command `"git" "pull"` failed

Tracking issue: crater is slow

Crater has a number of inefficiencies that can be resolved to make life better for everyone.

Impactful

#96 - generate-lockfile does a registry update
#101 - reconsider doing generate-lockfile inside a docker container
Parallelise prepare-local
Parallelise run-tc (currently relies on inherent parallelism of cargo and builds everything into a huge target directory - some complexity here to figure how to manage the shared resources correctly).
Investigate using codegen-units: experiments show it speeds up rustc build a lot, but slows down tests. ThinLTO may also help.

Nice-to-have

Consider offering a build-only mode (probably build tests as well)

Metrics

Monitor which crates are taking the longest and what the long tail of slow crates looks like
Get a breakdown of what phases are taking time (e.g. build, test build, testing) to inform future work

Speculative

allow cargobomb to distribute over multiple machines

Crater should use the same cargo version to generate lockfiles as to do the build

rust-lang/cargo#4563 has made generating with stable and building with beta and --frozen break.

cc @Mark-Simulacrum

Reconsider CLI design

i haven't thought too hard about this, but i think the CLI could be improved. for example, it's unclear to me when i have a certain experiment if it's already been run, what toolchain it used, etc. another painpoint is if i specify define-ex nightly beta, and it's been a few days, i'm unsure which exact nightly the experiment ran on

toolchains (are these necessary?):

cargobomb toolchain --list
cargobomb toolchain --add nightly

experiments:

cargobomb experiment new --type comparison <toolchain1> <toolchain2>
cargobomb experiment new --type query <toolchain>
cargobomb experiment prepare
cargobomb experiment run
cargobomb experiment gen-report

i'll probably update this more as i'll probably have further thoughts about this. curious if anyone else has opinions

Retry on failure to upload to s3

When uploading, I've seen an error from s3 that said <Error><Code>InternalError</Code><Message>.... We should probably try at least one more time.
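
"Try at least one more time" can be expressed as a small generic retry helper. This is a sketch with a hypothetical `retry` function; the closure stands in for the actual upload call, and no backoff is applied between attempts.

```rust
/// Run `op` until it succeeds or `max_attempts` attempts have been made,
/// returning the last error on failure. `op` receives the 1-based attempt
/// number and is always called at least once.
fn retry<T, E>(
    max_attempts: u32,
    mut op: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => attempt += 1,
        }
    }
}
```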

Tracking issue: running crater locally

We would like people interested in crater to be able to clone the repository and run a short series of commands to get a quick taste of the full crater process.

Known papercuts/issues here include:

#71 - building the docker image for the first time takes a while, so the prepare-local command fails due to a timeout imposed on the command (prepare-local possibly shouldn't have a timeout at all). Suggested workaround is to cd docker && docker build ., so the image is built by the time you try and run prepare-local.
#38 - the 'demo' set of crates doesn't work, so prepare-local will fail.

These are just the known issues so far.

Applications from github often have out-of-date lock files, which causes the build to fail.

@brson: I'm seeing this with master, is this typical?

Represent github URLs as a struct, so formatting them can be error-free.

gh_url_to_org_and_name and repo_dir shouldn't be able to fail
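
A struct along these lines would move the fallible step into a single parse, after which formatting cannot fail. The type `GitHubRepo` and its methods are hypothetical, only `https://github.com/` URLs are handled, and the `org.name` directory form is taken from the mirror paths that appear in the logs quoted earlier.

```rust
use std::fmt;

#[derive(Debug, Clone, PartialEq, Eq)]
struct GitHubRepo {
    org: String,
    name: String,
}

impl GitHubRepo {
    /// The only fallible operation: parse `https://github.com/<org>/<name>`,
    /// tolerating a trailing slash or `.git` suffix.
    fn parse(url: &str) -> Option<GitHubRepo> {
        let rest = url.strip_prefix("https://github.com/")?;
        let rest = rest.trim_end_matches('/');
        let rest = rest.strip_suffix(".git").unwrap_or(rest);
        let mut parts = rest.split('/');
        let (org, name) = (parts.next()?, parts.next()?);
        if org.is_empty() || name.is_empty() || parts.next().is_some() {
            return None;
        }
        Some(GitHubRepo { org: org.to_string(), name: name.to_string() })
    }

    /// Directory name for a local mirror, in `org.name` form. Infallible.
    fn mirror_dir_name(&self) -> String {
        format!("{}.{}", self.org, self.name)
    }
}

impl fmt::Display for GitHubRepo {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "https://github.com/{}/{}", self.org, self.name)
    }
}
```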