SAAS Autograder

Travis C. I.

####Berkeley SAAS w/ edX

Berkeley folks
- There is an AMI (id ami-df77a8b6) that includes latest version of the autograder.
Other folks
- the autograder code is at github:saasbook/rag.

Usage

Launch an EC2 instance (micro is fine) with autograder AMI (Ubuntu 14.04). If you are using solutions from a private repo, make sure you put the private key in /home/ubuntu/.ssh/id_rsa
You will need to provide a config file that provides login information. See configuration below.
The autograder should be run from CLI in the background once properly configured.

Configuration and setup

The ubuntu_install.sh script is provided in the repo for easy set up on Amazon machines. You can also refer to it to set up it locally.

There is one config file hosted locally on the autograder required for setup with edX: config/conf.yml.

conf.yml includes the following:

 default:
   adapter: XQueue  #Name of the submission interface being used.
   queue_uri: 'https://xqueue.edx.org/'
   queue_name: 'cs169x-development'
   django_auth:
     username: 'username'
     password: 'password'
   user_auth:
     user_name: 'username'
     user_pass: 'password'
 	  log_to_file: true
 	  log_level: 1

default defines the current strategy being used by the autograder.
The rest of the information should be filled in appropriately. Currently only supports XQueue as a submission interface

Remote edX configuration

If using edX, you must configure the homework to point to where the autograder can retrieve the spec files for the homework.
The grader payload is specified as XML and is passed to the autograder as JSON. It contains the following:

```
assignment_name: 'assign1'  # the name of the assignment
autograder_type: 'HerokuRspecGrader'  # type of grader to use on the submission. Will be deprecated and moved into hw repo.
assignment_spec_uri: '[email protected]:zhangaaron/example_hw1_sol.git'  #  a homework directory containing spec files for the autograder to run against HW
due_dates: {'20150619000000': 1.00, '20150620000000': 0.30}  # a hash that defines time brackets and grade scaling based on submission time. If date < key, then will receive scaling value associated with key. 
version: 1.0.0  # the version of RAG configured to use with this homework
```

Execution and tests

####To run the autograder program: while true; do bundle exec ruby run_autograder.rb path/to/configfile ; done

Under normal execution it is possible for the autograder to crash. The autograder is resilent against student code submission but network interruptions can cause the autograder to crash, this will allow it to start up again.

Autograder Refactoring Summer 2015

The rag autograder was completely refactored in summer of 2015. The only legacy code that remains is FeatureGrader and HW4Grader, which were minimally refactored to work with the new autograder structure. One of the main goals of the refactoring project were to increase ease of setup and configuration of the autograder. We eliminated several set up steps required to get the autograder started, and simplified config files.

We also wanted to make the autograder easier to extend. This was accomplished by doing major refactoring and adopting software design patterns. We split up the autograder into three basic classes, assignments, submission systems, and autograder engines. The hope is that those three classes can be easily subclassed for future extension.

We made the autograder less brittle by eliminating almost all cases of regex parsing to grade submissions and opted to pass information through classes when possible. This should help future proof the autograder from external changes.

In respect to the edX submission system, which is the only submission system that the autograder currently uses, we heavily refactored it to use an external gem xqueue_ruby which handles the low-level API calls to edX's XQueue platform and started using the grader_payload as a JSON dictionary configuring how the assignment is graded. This paves the way for "grading as a service" since anyone should be able to create an assignment on edX to be graded by our grader without having to go through the setup of the autograder themselves.

One thing that we wish we were able to do was increase autograder resilence against malicious code but this turns out to be rather hard to do. Options considered for making the autograder more secure were chrooting the grading subprocess, setting the $SAFE ENV value for the subprocess, and disabling the Kernel library to prevent calls to fork or killing the main grading process. These efforts were confounded by the fact that all of these interfere with the operation of external libraries such as RSpec.

The mutation testing/feature grader (HW 3)

At a high level HW3 and others that use FeatureGrader work by running student-submitted Cucumber scenarios against modified versions of a instructor-provided app.

The Following Diagram roughly describes the flow of the autograder :

Each step defined in the .yml file can have scenarios to run iff that step passes.

Example from hw3.yml:

- &step1-1
      FEATURE: features/filter_movie_list.feature
      pass: true
      weight: 0.2
      if_pass:
      - &step1-3
      FEATURE: features/filter_movie_list.feature
      version: 2
      weight: 0.075
      desc: "results = [G, PG-13] movies"
      failures:
      - *restrict_pg_r
      - *all_x_selected

In this case if step1-1 passes, Step 1-3 will be run. If step1-1 fails then step1-3 will not run and the student will not receive points for it. It is important that the outer step be less restrictive than the inner step (If the outer one fails, there should be no way that the inner one could pass).

Step1-3 has two scenarios specified as failures; this indicates that when the cucumber features are run, both of those scenarios should fail. In other words, when the mutation for this step is added to the app, the student’s tests should detect the change and fail. (Example: If the test is to ensure that data is sorted, and the mutation is that the data is in reverse order, the student’s test should fail because the app is not behaving as expected)

Defining a new step:

In order to add a new step the following must be done:

Add an entry to the .yml file.
The new entry should be a hash with the following properties:
1. FEATURE, a relative path to the Cucumber feature that will be run for this step.
2. weight, the fraction of total points on this homework represented by this feature
3. version: This sets an environment variable that the mutation-test app can use to add any modifications desired to the app before the feature is run.
4. desc: A string describing the step, used when providing feedback
Optional properties:
1. failures (list): scenarios that should fail in this step
2. if_pass (list): steps to run iff this step passes.

Defining a new Scenario:

To define a new scenario add a new entry to the "scenarios" hash in the .yml file. It is a good idea to set an alias for the scenario so it can be referenced later inside of steps.

The entry should contain:

match: A regular expression that will identify the name of this scenario. (Used when parsing cucumber output to see if this scenario passed or failed)
desc: A description of the scenario. (Used to give feedback to the student)

Adding a mutation to the app:

When a feature is run, the environment variable version will be set to the value of the version property for that feature. Use this as a feature flag in the app (by checking ENV["version"]) to trigger a "bug", e.g. reversing sort order/not returning all data.

codlife / rag Goto Github PK

rag's Introduction

SAAS Autograder

Usage

Configuration and setup

Remote edX configuration

Execution and tests

Autograder Refactoring Summer 2015

The mutation testing/feature grader (HW 3)

Defining a new step:

Defining a new Scenario:

Adding a mutation to the app:

rag's People

Contributors

Watchers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent