`test161`: A Testing Tool for OS/161

test161 is a command line tool for automated testing of OS/161 instructional operating system kernels that run inside the sys161 (System/161) MIPS R3000 simulator. You are probably not interested in test161 unless you are a student taking or an instructor teaching a course that uses OS/161.

Installing Go

Many Linux distributions package fairly out-of-date versions of Go. Instead, we encourage you to install the Go Version Manager (GVM):

sudo apt-get install -y curl bison # Install requirement
bash < <(curl -s -S -L https://raw.githubusercontent.com/moovweb/gvm/master/binscripts/gvm-installer)
source $HOME/.gvm/scripts/gvm

At this point you are ready to start using GVM. We are currently building and testing test161 with Go version go1.5.3. However, because the Go compiler is now written in Go, installing versions of Go past 1.5 require install Go version 1.4 first.

gvm install go1.4
gvm use 1.4
gvm install go1.5.3
gvm use 1.5.3 --default

`GOPATH`

Note that gvm will set your GOPATH and PATH variables properly to allow you to run Go binaries that you install. However, if you are interested in writing Go code you should set a more accessible GOPATH as described as described here.

Installing `test161`

Once you have Go installed, upgrading or installing test161 is simple:

go get -u github.com/ops-class/test161/test161

Configuration

Out of the box, test161 can’t do much without a set of Tests, Targets, and Commands, as well as an OS/161 root directory which contains your kernel. If you are starting from the ops-class OS/161 sources, as soon as you configure, compile, and install your userland binaries and kernel, test161 will work in either your root or source trees.

Submission

In order to submit to the test161 server, you need to configure your username/token, which can be done with:

test161 config add-user <username> <token>

Removing users and changing configured tokens can be done with:

test161 config del-user <username>              // Delete user information
test161 config change-token <username> <token>  // Change token

Custom Configuration

The ops-class sources create symlinks between your OS/161 source directory and root directory, which may not always be what you want. To support partial environments where either source or root cannot be inferred from the other, you can use test161 config to set the directory containing tests, targets, and commands:

test161 config test161dir <path>

This allows you to run test161 run … from your root directory.

Secure Overlay

The test161 server uses an overlay directory, containing an overlay for each assignment, which it rsyncs over student submissions before compiling. The overlay contains files that we don’t trust from students, e.g. make files, anything that prints SUCCESS. To enable an overlay for local testing, set the TEST161_OVERLAY environment variable to the overlay path.

Local Server

Set the TEST161_SERVER environment variable for testing submissions with a local server.

Printing Configuration

To view the current test161 configuration, simply run test161 config.

Usage

Running Tests

To run a tests, group of tests, or target, use the test161 run <names> sub-command. Here, names can be a single target, a list of tests, or a list of [tags].^[1] For test files, <names> is a list of http://www.linuxjournal.com/content/globstar-new-bash-globbing-option [globstar] style file names. The following are all valid commands:

test161 run sync/*.t         # Run all tests in the tests/sync
test161 run **/l*.t          # Run all tests in all sub-directories beginning with 'l'
test161 run syncprobs/sp*.t  # Run the syncprobs
test161 run sync/lt1.t       # Run lock test 1
test161 run locks            # Run all lock tests (tests tagged with 'locks')
test161 run asst1            # Run the asst1 target

Command Line Flags

There are several command line flags that can be specified to customize how test161 runs tests.

-dry-run (-r): Show the tests that would be run, but don’t run them.
-explain (-x): Show test detail, such as name, descriptions, sys161 configuration, commands, and expected output.
-sequential (-s): By default the output of all tests are interleaved, which can be hard to debug. Specify this option to run tests one at a time.
-no-dependencies (-n): Run the given tests without also running their dependencies. In test files, you can specify a test’s dependencies, and the test will be skipped if the dependency fails. The default behevior is to run all dependencies along with each test specified on the command line. This option does not apply for targets.
-verbose (-v): There are 3 levels of output: loud (default), quiet (no test output), and whisper (only final summary, no per test status).

Requirements

sys161 and disk161 in the path.

Default Settings

conf:
  cpus: 8
  ram: 1M
  disk1: # disabled by default, but should be enabled when you want swap disk
    sectors: min(2 * ram, 8000) # 8000 is a minimum due to a current sys161 bug
    rpm: 7200
    nodoom: true
  disk2: # disabled by default, but uses these defaults if configured
    sectors: min(2 * ram, 8000)
    rpm: 7200
    nodoom: false
  random: seed=random # random number generated at configuration time
stat:
  resolution: 0.01
  window: 1
monitor:
  enabled: "true
  intervals: 10
  kernel:
    enabled: false
    min: 0.001
    max: 1.0
  user:
    enabled: false
    min: 0.0001
    max: 1.0
  timeouts:
    prompt: 300
    progress: 10

Commands, Tests, and Targets

Commands

The basic unit in test161 is a command. such as lt1 for running Lock Test 1, or sp1 to run the whalemating test. Information about what to expect when running these commands, as well as what input/ouput they expect is specified in the commands/ directory in your test161 root directory. All .tc (test command) file in this directory will be loaded and commands must only be specified once.

Tests

Test files (*.t) are located in the tests/ directory in your test161 root directory. This directory can contain subdirectories to help organize tests. Each test consists of one or more commands, and each test can have its own sys161 configuration. Tests are run in their own sandboxed envrionment, but commands within the test are executed within the same sys161 session. Some tests will consist of multiple commands, and such tests are designed to stress test your system.

Targets

Target files (*.tt) are located in the targets/ directory in your test161 root directory. Targets specify which tests are run for each assignment, and how the scoring is distributed. When you test161 submit your assignments, you will specify which target to submit to.

Features

Testfile Syntactic Sugar

A line starting with $ will be run in the shell and start the shell as needed. Lines not starting with $ are run from the kernel prompt and get there if necessary by exiting the shell. sys161 shuts down cleanly without requiring the test manually exit the shell and kernel, as needed.

So this test:

$ /bin/true

Expands to:

s
/bin/true
exit
q

Note that commands run in the shell must be prefixed with $. Otherwise test161 will consider them a kernel command and exit the shell before running them. For example:

This test is probably not what you want:

s
/bin/true

Because it will expand to:

s
exit
/bin/true # not a kernel command

But this is so much simpler, right?

$ /bin/true

Test Tags

Optionally, tests can have one or more tags. test161 can be invoked to run these tests as a group with test161 run <tag>.

Progress Tracking Using `stat161` Output

test161 uses the collected stat161 output produced by the running kernel to detect deadlocks, livelocks, and other forms of stalls. We do this using several different strategies:

Progress and prompt timeouts. Testfiles can configure both progress (monitorconf.timeouts.progress) and prompt (monitorconf.timeouts.prompt) timeouts. The former is used to kill the test if no output has appeared, while the latter is passed to expect and used to kill the test of the prompt is delayed. Ideally OS/161 tests should produce some output while they run to help keep the progress timeout from firing, but the other progress tracking strategies described below should also help.
User and kernel maximum and minimum cycles. test161 maintains a buffer of statistics over a configurable number of stat161 intervals. Limits on the minimum and maximum number of kernel and user cycles (expressed as fractions) over this buffer can help detect deadlocks (minimum) and livelocks (maximum). User limits are only applied when running in userspace.
Note that test161 also checks to ensure that there are no user cycles generated when we are running in kernel mode, which could be caused by a hung progress.

Running multiple tests and dependencies

Correctness vs. Grading

Security

Multiple output strategies

test161 supports different output strategies through its PersistenceManager interface. Each TestEnvironment as a PersistenceManager which receives callbacks when events happen, like when scores changes, status change, or when output lines are added. This allows multiple implementations to handle output as they wish. The test161 client utility implements the interface through its ConsolePersistence type, which writes all input to stdout. The server uses a MongoPersistence type which outputs JSON data to our mongo backend server.

TODOs

Nits

Colored test output on terminals that support it? (Particularly for correct/incorrect.)
Handle missing newline correctly. Test with shll for lossy shell support.
sys161 version checks
Order the test output in some meaningful way, probably by depth in the dependency graph. (That way all skipped tests should be shown last.)
- Not necessarily true. You could have a long, unrelated branch that succeeds (even leaf nodes), but some unrelated depencency fails early. I made the default print order topological sort, but it’s still confusing. I added the reason a test is skipped, which helps. Maybe a nice ASCII art tree would work here…
Check for repository problems:
- Check and fail if it has inappropriate files (.o), or is too large. (Prevent backend storage DOS attacks.)
Use URL associated with the tree-ish id provided to test161 submit
Fix directory bash completion for test161 config test161dir. It’s unfortunately adding a space instead of /.

Performance Tracking

Most of the infrastructure is in place to handle performance targets, but we still need finish this and test it. Specifically, we need set the performance number in the Test object and use it properly in the Submission.

Parallel Testing Output

It would be cool to be able to print serial output from one test while queuing the output from other tests. Maybe using curses to maintain a line at the end of the screen showing the tests that are being run.

Tag Descriptions and Querying

It would be nice to be able to add descriptions to tags and have test161 print all tests that fall into a tag (or target) along with the description.

Output Frequency

For long running tests, OS/161 tests generate periodic output, usually in the form of a string of '.' characters. This output is used as a keep-alive mechanism, resetting test161’s progress timeout. Because this output is in a single line, and it would create more unnecessary DB output and server load to break these into multiple lines, it would be nice to refactor things in such a way that the current output line is periodically persisted. This would give students a better indication of progress, as opposed to tests looking "stuck".

Key saving

Now that we are having students save a key through the web interface we need to make sure that these keys get saved, associated with each successful submission, and not destroyed even if they are changed later.

Configuration override

It would be great if test161 run boot.t --sys161-cpus=1 worked properly. I think that there is a library for this.

Support for GDB backtraces on error

It should be possible to automate the process of hooking up a debugger and running BT on panics.

Server

Environment inference with environment variable overrides, similar to the test161 client
- test161-server config to both show the configuration and modify it
Log the configuration on startup
More usablility cleanup
- Usage
- Help
- Bash completion
Moving window for stats API
Periodically persist server stats, either in MongoDB or through the logger. We currently lose these on restart.
Move collaboration messages into their own files instead of hard-coding

1. In the case that tag and target names conflict, specify -tag if you mean tag.

kushwiz / test161 Goto Github PK

test161's Introduction

test161: A Testing Tool for OS/161