aqua's People

Contributors

agustinvallejo, chrisklus, denz1994, jbphet, jessegreenberg, jonathanolson, marlitas, mattpen, phet-dev, phet-steele, pixelzoom, samreid, zepumph


aqua's Issues

CT race condition assigns results to the wrong repo

These errors are showing up for dot today; note that the URLs come from friction. @jonathanolson, can you please take a look?

dot : top-level-unit-tests : require.js
74 out of 74 tests passed. 0 failed.

Approximately 9/18/2018, 10:38:48 PM
dot : top-level-unit-tests : require.js?brand=phet-io
74 out of 74 tests passed. 0 failed.

Approximately 9/18/2018, 10:38:48 PM
dot : top-level-unit-tests : require.js?ea
74 out of 74 tests passed. 0 failed.

Approximately 9/18/2018, 10:38:48 PM
dot : top-level-unit-tests : require.js?ea&brand=phet-io
74 out of 74 tests passed. 0 failed.

Approximately 9/18/2018, 10:38:48 PM
dot : top-level-unit-tests : require.js?ea&brand=phet-io : load
Uncaught TypeError: false is not a function
TypeError: false is not a function
    at e.value (https://bayes.colorado.edu/continuous-testing/snapshot-1537331928780/friction/build/phet/friction_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzzMouse&fuzzTouch:924:636475)
    at end (https://bayes.colorado.edu/continuous-testing/snapshot-1537331928780/friction/build/phet/friction_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzzMouse&fuzzTouch:924:652204)
    at endDrag (https://bayes.colorado.edu/continuous-testing/snapshot-1537331928780/friction/build/phet/friction_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzzMouse&fuzzTouch:924:641844)
    at Object.up (https://bayes.colorado.edu/continuous-testing/snapshot-1537331928780/friction/build/phet/friction_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzzMouse&fuzzTouch:924:639809)
    at e.dispatchToListeners (https://bayes.colorado.edu/continuous-testing/snapshot-1537331928780/friction/build/phet/friction_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzzMouse&fuzzTouch:924:855958)
    at e.dispatchEvent (https://bayes.colorado.edu/continuous-testing/snapshot-1537331928780/friction/build/phet/friction_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzzMouse&fuzzTouch:924:855664)
    at e.upEvent (https://bayes.colorado.edu/continuous-testing/snapshot-1537331928780/friction/build/phet/friction_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzzMouse&fuzzTouch:924:854281)
    at e.mouseUp (https://bayes.colorado.edu/continuous-testing/snapshot-1537331928780/friction/build/phet/friction_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzzMouse&fuzzTouch:924:850148)
    at e.mouseToggle (https://bayes.colorado.edu/continuous-testing/snapshot-1537331928780/friction/build/phet/friction_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzzMouse&fuzzTouch:924:910189)
    at mouseToggleAction (https://bayes.colorado.edu/continuous-testing/snapshot-1537331928780/friction/build/phet/friction_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzzMouse&fuzzTouch:924:907342)
Approximately 9/18/2018, 10:38:48 PM
dot : top-level-unit-tests : require.js?ea&brand=phet-io : run
Uncaught TypeError: false is not a function
TypeError: false is not a function
    at e.value (https://bayes.colorado.edu/continuous-testing/snapshot-1537331928780/friction/build/phet/friction_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzzMouse&fuzzTouch:924:636475)
    at end (https://bayes.colorado.edu/continuous-testing/snapshot-1537331928780/friction/build/phet/friction_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzzMouse&fuzzTouch:924:652204)
    at endDrag (https://bayes.colorado.edu/continuous-testing/snapshot-1537331928780/friction/build/phet/friction_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzzMouse&fuzzTouch:924:641844)
    at Object.up (https://bayes.colorado.edu/continuous-testing/snapshot-1537331928780/friction/build/phet/friction_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzzMouse&fuzzTouch:924:639809)
    at e.dispatchToListeners (https://bayes.colorado.edu/continuous-testing/snapshot-1537331928780/friction/build/phet/friction_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzzMouse&fuzzTouch:924:855958)
    at e.dispatchEvent (https://bayes.colorado.edu/continuous-testing/snapshot-1537331928780/friction/build/phet/friction_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzzMouse&fuzzTouch:924:855664)
    at e.upEvent (https://bayes.colorado.edu/continuous-testing/snapshot-1537331928780/friction/build/phet/friction_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzzMouse&fuzzTouch:924:854281)
    at e.mouseUp (https://bayes.colorado.edu/continuous-testing/snapshot-1537331928780/friction/build/phet/friction_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzzMouse&fuzzTouch:924:850148)
    at e.mouseToggle (https://bayes.colorado.edu/continuous-testing/snapshot-1537331928780/friction/build/phet/friction_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzzMouse&fuzzTouch:924:910189)
    at mouseToggleAction (https://bayes.colorado.edu/continuous-testing/snapshot-1537331928780/friction/build/phet/friction_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzzMouse&fuzzTouch:924:907342)
Approximately 9/18/2018, 10:38:48 PM

Bayes CT now requires a *.js suffix to start CT?

pm2 start continuous-server reports

script not found : /data/share/phet/continuous-testing/aqua/js/continuous-server

But adding the *.js suffix seems to work. Why did this change? Shall we update continuous-testing-management.md?

npm used up all /tmp space?

Continuous testing was failing with:

1|continuo | 2017-05-05T16:23:28.018Z [ERROR] Failure to npm update /data/share/phet/continuous-testing//snapshot-1494001255097/area-builder:
1|continuo | npm ERR! Linux 3.10.0-514.2.2.el7.x86_64
1|continuo | npm ERR! argv "/usr/bin/node" "/usr/bin/npm" "update" "--cache=../npm-caches/area-builder"
1|continuo | npm ERR! node v6.9.1
1|continuo | npm ERR! npm  v3.10.8
1|continuo | npm ERR! path /tmp/npm-80366-949b5586
1|continuo | npm ERR! code ENOSPC
1|continuo | npm ERR! errno -28
1|continuo | npm ERR! syscall mkdir
1|continuo | npm ERR! nospc ENOSPC: no space left on device, mkdir '/tmp/npm-80366-949b5586'
1|continuo | npm ERR! nospc This is most likely not a problem with npm itself
1|continuo | npm ERR! nospc and is related to insufficient space on your system.
1|continuo | npm ERR! Please include the following file with any support request:
1|continuo | npm ERR!     /data/share/phet/continuous-testing/snapshot-1494001255097/area-builder/npm-debug.log

npm had created ~230 MB of temporary files in /tmp.

I'm looking into having npm use a separate temporary directory (which we'll delete anyway).

@markmorlino, I didn't see anything mounted to /tmp directly; is there any way we can avoid this in the future?

2 files named test-sims.js

There are currently 2 files in aqua named test-sims.js. This is confusing; I recommend renaming one of them. Assigning to @samreid since he created the second one (aqua/js/test-sims.js).

Review how to create and run new unit tests

At today's meeting @jbphet asked if developers should be in the loop about how to use the new QUnit testing, and I suggested we should discuss it at developer meeting.

Briefly, to add unit tests to a new repo:

cd repo
grunt generate-test-harness
touch js/{{REPO}}-tests.js

Then populate js/{{REPO}}-tests.js with tests; see dot/js/dot-tests.js for a good example, which looks like this:

// Copyright 2017, University of Colorado Boulder

/**
 * Unit tests for dot. Please run once in phet brand and once in brand=phet-io to cover all functionality.
 *
 * @author Sam Reid (PhET Interactive Simulations)
 */
define( function( require ) {
  'use strict';

  // modules
  require( 'DOT/BinPackerTests' );
  require( 'DOT/Bounds2Tests' );
  require( 'DOT/ComplexTests' );
  require( 'DOT/DampedHarmonicTests' );
  require( 'DOT/Matrix3Tests' );
  require( 'DOT/MatrixOps3Tests' );
  require( 'DOT/UtilTests' );
  require( 'DOT/Transform3Tests' );
  require( 'DOT/LinearFunctionTests' );
  require( 'DOT/Vector2Tests' );

  // Since our tests are loaded asynchronously, we must direct QUnit to begin the tests
  QUnit.start();
} );

Tests should be placed adjacent to the file they test (if applicable), like Vector2Tests.js is adjacent to Vector2.js. For completeness, here is Vector2Tests.js:

// Copyright 2017, University of Colorado Boulder

/**
 * Vector2 tests
 *
 * @author Jonathan Olson (PhET Interactive Simulations)
 * @author Sam Reid (PhET Interactive Simulations)
 */
define( function( require ) {
  'use strict';

  // modules
  var Vector2 = require( 'DOT/Vector2' );

  QUnit.module( 'Vector2' );

  function approximateEquals( assert, a, b, msg ) {
    assert.ok( Math.abs( a - b ) < 0.00000001, msg + ' expected: ' + b + ', result: ' + a );
  }

  QUnit.test( 'distance', function( assert ) {
    approximateEquals( assert, new Vector2( 2, 0 ).distance( Vector2.ZERO ), 2 );
    approximateEquals( assert, new Vector2( 2, 0 ).distanceSquared( Vector2.ZERO ), 4 );
    approximateEquals( assert, new Vector2( 4, 7 ).distance( new Vector2( 6, 9 ) ), 2 * Math.sqrt( 2 ) );
    approximateEquals( assert, new Vector2( 4, 7 ).distanceSquared( new Vector2( 6, 9 ) ), 8 );
  } );
} );

To add a new repo for testing on Bayes, add it to the list in continuous-server.js which is currently:

// repo-specific Unit tests (require.js mode) from `grunt generate-test-harness`
[ 'axon', 'circuit-construction-kit-common', 'dot', 'kite', 'phetcommon', 'phet-core', 'phet-io', 'query-string-machine', 'scenery' ].forEach( function( repo ) {

Then make a request to @jonathanolson to restart Bayes. The tests automatically run on Bayes with the following query parameters:

'', '?ea', '?brand=phet-io', '?ea&brand=phet-io'

removing a repository causes issues

The 'exemplar' repo was recently deleted, and after this happened, Continuous Testing (CT) started failing. It wasn't just failing for the exemplar repo - it was failing to update the web page at all.

I worked with @phet-steele to fix this, and we did it by shutting CT down, manually deleting the local copy of the exemplar directory used by CT, doing a git pull, and restarting CT.

Is there something that can be done to make CT handle this automatically?

Use the simulation-main.js and config file as the harness to start up unit testing

One of the barriers to setting up unit testing for a new repo is creating and maintaining the scaffolding. For instance, dot has this code:

<!DOCTYPE html>
<html>
<!-- To run these tests, please launch this HTML file in a browser. It will use require.js to dynamically load the required files -->
<head>
  <meta charset="utf-8">
  <meta http-equiv="X-UA-Compatible" content="IE=edge"/>

  <title>Unit tests for Dot</title>
  <link rel="stylesheet" href="../../../sherpa/lib/qunit-1.14.0.css">

  <script src="../../../sherpa/lib/jquery-2.1.0.min.js"></script>
  <script src="../../../sherpa/lib/lodash-4.17.4.min.js"></script>
  <script src="../../../assert/js/assert.js"></script>
  <script type="text/javascript">
    window.assertions.enableAssert();
    window.assertions.enableAssertSlow();
  </script>

  <script data-main="../../js/dot-config.js" src="../../../sherpa/lib/require-2.1.11.js"></script>

  <!-- Loads the code needed to run unit tests for each particular library/codebase -->
  <script src="unit-tests.js"></script>
  <script src="../../../phet-core/tests/qunit/unit-tests.js"></script>
  <script src="../../../axon/tests/qunit/unit-tests.js"></script>
</head>
<body>
<!-- Where test results HTML is placed -->
<div id="qunit"></div>
<!-- Div for holding temporary HTML content needed by tests, which is reset by QUnit before every test -->
<div id="qunit-fixture"></div>
<script src="../../../sherpa/lib/qunit-1.14.0.js"></script>
<script>
  require( [ 'dot-config' ], function() {
    require( [ 'main', 'AXON/main', 'PHET_CORE/main' ], function( dot, axon, phetCore ) {
      window.dot = dot;
      window.axon = axon;
      window.phetCore = phetCore;

      QUnit.log( function( details ) {
        window.parent && window.parent.postMessage( JSON.stringify( {
          type: 'qunit-test',
          main: 'scenery',
          result: details.result,
          module: details.module,
          name: details.name,
          message: details.message,
          source: details.source // TODO: consider expected/actual, or don't worry because we'll run finer tests once it fails.
        } ), '*' );
      } );

      QUnit.done( function( details ) {
        window.parent && window.parent.postMessage( JSON.stringify( {
          type: 'qunit-done',
          failed: details.failed,
          passed: details.passed,
          total: details.total
        } ), '*' );
      } );

      runDotTests( '.' );
      runAxonTests( '../../../axon/tests/qunit' );
      runPhetCoreTests( '../../../phet-core/tests/qunit' );

      var $checkbox = $( '#qunit-filter-pass' );
      if ( !$checkbox[0].checked ) {
        $checkbox.click();
      }
    } );
  } );
</script>
</body>
</html>

This means it needs a custom config file and a different way of loading requirejs modules. Much of this QUnit.done and QUnit.log code is duplicated, currently across 13 files.

This code can be factored out, but we still face the problem of how to load module files from the repo. The strategy with the lowest impedance will be to run the unit tests in the same way that the sim is run. I'd like to experiment with using a query parameter to put the sim into a "unit test" mode. I'm not sure if it would still run the sim or not (perhaps not by default, but it could do so if unit tests warranted it).

This will help us understand how to proceed for phetsims/tasks#900
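
As one illustration of the factoring-out mentioned above, the duplicated QUnit.log / QUnit.done blocks could live in a single shared file that every harness loads. A minimal sketch, with a hypothetical file and function name (this is not existing aqua code):

// aqua-qunit-reporting.js (hypothetical): wires QUnit callbacks to postMessage reporting,
// so a per-repo harness only needs to call attachQUnitReporting( 'dot' ), etc.
function attachQUnitReporting( mainName ) {
  QUnit.log( function( details ) {
    window.parent && window.parent.postMessage( JSON.stringify( {
      type: 'qunit-test',
      main: mainName,
      result: details.result,
      module: details.module,
      name: details.name,
      message: details.message,
      source: details.source
    } ), '*' );
  } );

  QUnit.done( function( details ) {
    window.parent && window.parent.postMessage( JSON.stringify( {
      type: 'qunit-done',
      failed: details.failed,
      passed: details.passed,
      total: details.total
    } ), '*' );
  } );
}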

More comprehensive automated testing

I keep running into issues with building Scenery, running unit tests, running examples, checking other miscellaneous files, etc., and it is starting to sap time.

My current plan is to have a more general automated test that runs through a large assortment of things (for most pages, it would load the page, wait X seconds, and pass if there have been no errors; see the sketch after this list):

  • Builds Scenery/Kite/Dot
  • Tests require.js mode unit tests for scenery/kite/dot/axon/phet-core
  • Tests built unit tests for scenery/kite/dot
  • Tests launching all examples, some documentation pages, and the playground from Scenery/Kite/Dot
  • Tests launching assorted Scenery test files, like renderer-comparison, text-bounds-comparison, etc.
  • Tests some sim color profile pages, e.g. molecule-shapes-colors.html
  • The ability to add other things that people want.
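
A minimal sketch of the "load, wait, pass if no errors" check described above. The helper name is hypothetical, and it assumes the page under test reports failures to its parent via postMessage, as the ?postMessageOnError URLs elsewhere on this page do:

// Loads a page in an iframe, fails on the first reported error, and passes if
// nothing has gone wrong after waitMilliseconds.
function testPageLoads( url, waitMilliseconds, callback ) {
  var iframe = document.createElement( 'iframe' );
  var finished = false;

  function finish( passed, message ) {
    if ( !finished ) {
      finished = true;
      document.body.removeChild( iframe );
      callback( passed, message );
    }
  }

  window.addEventListener( 'message', function( event ) {
    var data;
    try { data = JSON.parse( event.data ); } catch( e ) { return; } // ignore non-JSON messages
    if ( data.type === 'error' ) { // the exact message format is an assumption
      finish( false, data.message );
    }
  } );

  setTimeout( function() {
    finish( true, 'no errors after ' + waitMilliseconds + 'ms' );
  }, waitMilliseconds );

  iframe.src = url;
  document.body.appendChild( iframe );
}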

Future enhancements to automated testing

After chipper 2.0 and some other work settles, it would be good to discuss and prioritize some enhancements to automated testing.

I will bring this issue up at dev meeting for further discussion, but marking it deferred for now.

A few comments already brought up:

SR: It would be great to easily be able to run all tests locally (including fuzz tests and unit tests)
AP: screenshot snapshot comparison
AP: keyboard-navigation fuzz testing
CM: list of shas for some column in CT report
SR: instructions for how to reproduce a failed test locally

This also seems somewhat related to #15.

undefined is not a constructor

phantomjs works great but doesn't have 100% coverage of newer (post-ES5) built-ins. Notably, I've seen undefined is not a constructor for:

Number.isInteger
Number.sign

If phantom will not fix these issues, we may wish to work around them to get better test support.
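
If we do work around it, one option is a small polyfill loaded before the tests run. A minimal sketch for Number.isInteger (an assumption about the approach, not existing aqua code):

// Provide Number.isInteger when the test platform (e.g. PhantomJS) lacks it.
if ( !Number.isInteger ) {
  Number.isInteger = function( value ) {
    return typeof value === 'number' && isFinite( value ) && Math.floor( value ) === value;
  };
}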

Phettest can't find Twixt

@jonathanolson....why? I try to run expression exchange on phettest and I receive a script error. Specifically, the error is a 404 trying to find http://phettest.colorado.edu/expression-exchange/js/TWIXT/Easing.js, which totally looks wrong.

Some things I tried:

  • expression-exchange-config.js looks fine
  • I've refreshed chipper a bunch
  • I've run pull-all a bunch of times
  • I verified on the phettest machine that twixt exists

What's going on? As far as I know, expression-exchange is one of the few (the only?) sims trying to use twixt/js/Easing.js.
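
For reference, the kind of require.js paths entry that makes TWIXT/Easing resolve into the twixt repo. This is a sketch of the usual PhET config pattern, not necessarily what is missing here (the config reportedly looks fine):

require.config( {
  paths: {
    // Without a mapping like this, a module id such as 'TWIXT/Easing' resolves relative to
    // the sim's baseUrl, which is exactly the .../expression-exchange/js/TWIXT/Easing.js 404 above.
    TWIXT: '../../twixt/js'
  }
} );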

How to deal with assert.expect( 0 )?

In #30 we discussed assert.expect. Here's a case where it's used and currently necessary:

// Copyright 2017, University of Colorado Boulder

/**
 * QUnit Tests for BooleanProperty
 *
 * @author Sam Reid (PhET Interactive Simulations)
 */
define( function( require ) {
  'use strict';

  // modules
  var BooleanProperty = require( 'AXON/BooleanProperty' );

  QUnit.module( 'BooleanProperty' );
  QUnit.test( 'BooleanProperty', function( assert ) {
    window.assert && assert.throws( function() {
      new BooleanProperty( 'hello' ); //eslint-disable-line
    }, 'invalid initial value for BooleanProperty' ); // eslint-disable-line
    var c = new BooleanProperty( true );
    c.set( true );
    c.set( false );
    c.set( true );
    window.assert && assert.throws( function() {
      c.set( 123 );
    }, 'set an invalid value for BooleanProperty' );

    if ( !window.assert ) {
      assert.expect( 0 );
    }
  } );
} );

When assertions are enabled, two assertions are run (the assert.throws calls). When assertions are disabled, zero assertions are run. If the assert.expect call were commented out, QUnit would report a failure with this message:

Expected at least one assertion, but none were run - call expect(0) to accept zero assertions.

Two solutions for dealing with this problem are:

  1. Use assert.expect(0) as we have been
  2. Add a different assert test, such as assert.ok( c.value, 'boolean value should be true' ); which runs whether assertions are enabled or not.

(1) would fail if we added a test like (2). But (2) seems like a bit of a workaround for QUnit expecting at least one assertion by default. @pixelzoom, which do you recommend?
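
For comparison, a sketch of option (2): an assertion that runs regardless of whether window.assert is enabled (illustrative only, in the same module context as the test above):

QUnit.test( 'BooleanProperty', function( assert ) {

  var c = new BooleanProperty( true );
  c.set( false );
  c.set( true );

  // This assertion runs whether or not window.assert is enabled, so the test never
  // finishes with zero assertions and assert.expect( 0 ) is no longer needed.
  assert.ok( c.value === true, 'BooleanProperty should hold the last value set' );

  window.assert && assert.throws( function() {
    c.set( 123 );
  }, 'set an invalid value for BooleanProperty' );
} );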

Color coding on continuous-report to reflect phet-io api testing

In https://github.com/phetsims/phet-io-wrappers/issues/30 we decided it would be nice to have stable PhET-iO API test failures shown in a different color, so that they won't get mixed in with other test failure output. We are not ready to implement this just yet, as we don't yet know exactly what the stable API or the stable API test will look like. But I will let @jonathanolson know when we are ready to implement this feature. On hold for now.

multiple issues with test-server

Related to phetsims/chipper#478, but since test-server has moved to aqua, I'll open a new issue here.

There were significant changes to test-server query parameters and documentation, which I reviewed in phetsims/chipper#478 (comment). I did not (at the time) test drive the instructions.

My test drive of master @481eb87de41a402b4d3c153a58ca2ca8290f66cc reveals the following issues (node aqua/test-server/test-server.js running; Mac OS X 10.11.6, Chrome):

(1) test-server/README.md indicates that testBuilt defaults to true. But with http://localhost/~cmalley/GitHub/aqua/test-server/test-sims.html?ea, it built only the first sim (acid-base-solutions). Here's the first part of the console output:

[screenshot of console output]

(2) No boxes other than the first column were filled in:

[screenshot of the test-sims grid]

(3) test-server/README.md does not indicate the default value for testDuration.

(4) If query parameters are passed through to all sims, then why do we need testTask=fuzzMouse and testFuzzRate? Why can't we just specify (e.g.) fuzzMouse=1000?

PhET-iO wrapper tests should run with each simulation

In https://github.com/phetsims/phet-io-wrappers/issues/37 we have been running PhET-iO wrapper tests in a single test unit, but going forward it makes more sense to run the PhET-iO wrapper tests with the simulations. For instance, when running the beers-law-lab unit tests, it would test all of the PhET-iO wrappers (as part of the beers-law-lab row).

My main motivation for this is to correctly catch errors and associate them with the right simulation (and catch them early on).

document query parameters in test-sims.js

A Slack discussion today (12/11/17) pointed someone to the testSims query parameter in test-sims.js. I was surprised to see that none of the query parameters in test-sims.js are documented. I.e.:

 var options = QueryStringMachine.getAll( {
    testTask: {
      type: 'boolean',
      defaultValue: true
    },
    testRequirejs: {
      type: 'boolean',
      defaultValue: true
    },
    testBuilt: {
      type: 'boolean',
      defaultValue: true
    },
    testDuration: {
      type: 'number',
      defaultValue: 30000 // ms
    },
    testSims: {
      type: 'array',
      defaultValue: [], // will get filled in automatically if left as default
      elementSchema: {
        type: 'string'
      }
    },
    testConcurrentBuilds: {
      type: 'number',
      defaultValue: 1
    }
  } );
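
For illustration, a sketch of what documentation for a few of these might look like. The comment text is my guess at the semantics, not taken from the source:

var options = QueryStringMachine.getAll( {

  // Whether require.js (unbuilt) mode is tested for each sim
  testRequirejs: {
    type: 'boolean',
    defaultValue: true
  },

  // Whether each sim is built and its built version tested
  testBuilt: {
    type: 'boolean',
    defaultValue: true
  },

  // How long each sim is left running before moving on, in milliseconds
  testDuration: {
    type: 'number',
    defaultValue: 30000
  },

  // Which sims to test; if left empty, the full list of testable sims is filled in automatically
  testSims: {
    type: 'array',
    defaultValue: [],
    elementSchema: {
      type: 'string'
    }
  }

  // ... testTask and testConcurrentBuilds would be documented the same way
} );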

QUnit should not rely on timeout--it should know when tests complete

@jonathanolson surmised that QUnit 1.14 and possibly QUnit 2.0.1 don't send out messages when testing is complete, and that aqua is having to rely on the 40 second timeout before assuming tests are complete.

I tried setting the duration to 400,000 and ran the tests, and never saw the [next test] message when running http://localhost/aqua/html/qunit-test.html?url=http://localhost/scenery/tests/qunit/unit-tests.html

This suggests that the 40-second timeout is what is ending the tests. We should try to find a way to send a message when the QUnit tests truly complete. Related to phetsims/chipper#624
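
A minimal sketch of the listening side, assuming the unit-test harness posts a 'qunit-done' message like the dot harness shown earlier on this page (function names here are hypothetical):

// In the page that wraps the unit-test iframe (e.g. qunit-test.html): keep the timeout only
// as a fallback, and finish as soon as a completion message arrives.
var timeoutID = setTimeout( function() {
  finishTest( null ); // finishTest() is hypothetical; null means we timed out without results
}, 40000 );

window.addEventListener( 'message', function( event ) {
  var data;
  try { data = JSON.parse( event.data ); } catch( e ) { return; } // ignore non-JSON messages
  if ( data.type === 'qunit-done' ) {
    clearTimeout( timeoutID );
    finishTest( { passed: data.passed, failed: data.failed, total: data.total } );
  }
} );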

Remove unused files

I'm seeing several files that look like they pertain to an old PhantomJS prototype. They should be deleted.

Should we use Qunit instead of our current test-sims.html?

From discussing phet-io testing with @samreid in phetsims/phet-io#158. QUnit has been good for testing across the project and can offer more specificity than just the colored squares for each sim. If all of our tests are built on QUnit, rather than most of them, it may be easier to automate reporting, like with continuous automated testing run on Bayes. Maybe we should investigate using QUnit tests instead of test-sims.html. Tagging @jonathanolson for an opinion, and marking for dev discussion.

Here is a skype chat that also started this conversation:

[11:44:09 AM] Sam Reid: What do you think about changing the automated fuzz tests to using QUnit instead of our own grid of colors?
[11:45:30 AM] Chris Malley: Do you mean the colored boxes in test-server? Only one of those boxes pertains to fuzz testing.
[11:46:07 AM] Sam Reid: Well, one is for fuzz-testing the requirejs mode and one is for fuzz-testing the built mode, right?
[11:47:00 AM] Chris Malley: None of the colored boxes is specifically for fuzz testing.
[11:47:33 AM] Chris Malley: left is running requirejs (with optional fuzz), center is build, right is running built (with optional fuzz).
[11:47:56 AM] Sam Reid: What if instead of a colored grid of boxes, it was a QUnit report?
[11:47:59 AM] Chris Malley: if you omit ?fuzzMouse, then you can just test loading.
[11:49:17 AM | Edited 11:49:30 AM] Chris Malley: I guess that (qunit) would be OK.
[11:50:16 AM] Chris Malley: what's the motivation?
[11:51:12 AM] Sam Reid: Michael and I are thinking about how to test PhET-iO wrappers, and choosing to go down the path of [a] building something that looks like our color grid or [b] doing something with QUnit that gives finer-grained error reporting.
[11:51:31 AM] Sam Reid: I think for the PhET-iO we need the finer-grained error reports.
[11:51:42 AM] Sam Reid: If we used the color grid for that, it could be many many columns.
[11:51:55 AM] Sam Reid: So we started thinking why not render the main tests through QUnit as well?
[11:52:08 AM] Sam Reid: For instance, you could check the item that says “show only failed tests”.
[11:52:14 AM] Sam Reid: or “re-run only failed tests”
[11:52:17 AM] Chris Malley: test-server is generally a bit confusing until you get familiar with the colors. and the error messages don't clearly identify which test phase failed.
[11:52:27 AM] Sam Reid: I agree.
[11:52:46 AM | Edited 11:52:54 AM] Sam Reid: QUnit could be clear: e.g. “Faraday’s Law failed to build with this error [...]"

Chipper recommendation summary

Chipper had issue phetsims/chipper#410 "Scheduled automated testing & notification emails". The main points from that issue were:

  • It would be great to have testing run nightly, and send out an email if there are any failures.
  • @phet-steele would like emails
  • Nightly seems like it would work fine if it reports only new bugs (maybe a weekly email about current bugs).
  • @andrewadare mentioned an interest in continuous integration services for automated testing.
  • @andrewadare looked into Jenkins and Travis-CI, reported that Jenkins was more complex than expected.
  • @ariel-phet said regarding priority "I would not say a "high priority" but I do want us to continue to work on it."
  • Type checking would be a nice test to have phetsims/chipper#483
  • We are communicating with OIT regarding putting phantomjs on simian.

[continuous testing] add browser/device identification

Probably needed for https://github.com/phetsims/QA/issues/120.

Undecided yet as to whether recording the user-agent would be sufficient. One way would be to set a stored configuration "name" in each browser used for testing, e.g. bayes-chrome (generally the device name plus the browser name), that would be included in reporting, so that we could track down the exact device needed for reproduction.
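
One possible shape for that, as a sketch (nothing here exists in aqua yet): store the name once per browser, then attach it, along with the user agent, to every reported result.

// Set once, by hand, in each browser used for testing (e.g. via the devtools console):
//   localStorage.setItem( 'aqua-device-name', 'bayes-chrome' );
var deviceName = localStorage.getItem( 'aqua-device-name' ) || 'unknown-device';

var testResultInfo = {
  deviceName: deviceName,           // e.g. 'bayes-chrome': device plus browser
  userAgent: navigator.userAgent    // kept as a backup in case the name is missing or stale
  // ... the rest of the reported test result would go here
};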

Would probably be good to discuss sometime with @lmulhall-phet and @ariel-phet to decide on what would be best.

Concurrent builds in test-server

Given the crazy-slow build time (probably somewhat due to minification), it would be nice to run a certain number of builds concurrently.
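
A minimal sketch of how a fixed level of build concurrency could work (buildSim and simQueue are hypothetical names, not the current test-server code):

var maxConcurrentBuilds = 2; // e.g. from a testConcurrentBuilds query parameter
var simQueue = [ 'acid-base-solutions', 'area-builder' /* , ... */ ];
var activeBuilds = 0;

function startNextBuilds() {
  while ( activeBuilds < maxConcurrentBuilds && simQueue.length > 0 ) {
    activeBuilds++;
    var repo = simQueue.shift();
    buildSim( repo, function() { // hypothetical async build; calls back when the build finishes
      activeBuilds--;
      startNextBuilds();
    } );
  }
}
startNextBuilds();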

Create a composite Qunit test to run all Unit tests

From https://github.com/phetsims/phet-io-wrappers/issues/114, we were using a package called qunit-composite that was working nicely for consolidating each sim's wrapper QUnit tests. Taking this to its logical conclusion, we could create a single runnable that would run every QUnit test in the project. That would be nice.

Two thoughts here:

  • We currently have CT that will test all of these QUnit suites individually; this is more for running locally as another check to make sure you didn't break anything.
  • The library we are using has a deprecation warning that we should recognize before going further with this solution (when running http://localhost/phet-io-wrappers/phet-io-wrappers-all-sims-tests.html?ea):
qunit-2.4.1.js:2104 assert.push is deprecated and will be removed in QUnit 3.0. Please use assert.pushResult instead (https://api.qunitjs.com/assert/pushResult).
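
For reference, the replacement the warning points to is assert.pushResult. A sketch of a custom assertion written the non-deprecated way (illustrative, not code that exists in aqua):

// A custom assertion written with assert.pushResult instead of the deprecated assert.push:
QUnit.assert.approximatelyEquals = function( actual, expected, message ) {
  this.pushResult( {
    result: Math.abs( actual - expected ) < 1e-8,
    actual: actual,
    expected: expected,
    message: message
  } );
};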

compare screenshots during automated testing

Moved here from 8/17/17 dev meeting notes, since this will involve considerable brainstorming, design, identification of pitfalls, etc.

CM: Perhaps we should increase the priority of comparing screenshots during automated testing. PhET has too many products to manually review them. Post-vacation, I had 5 sims with serious layout problems. That’s a lot of “drift” over the course of only ~10 days.
JO: I agree. Collaboration with SR may be helpful, as I don’t see a non-phet-io way of setting the seed on launch (query parameter?), sending it an event stream or triggering a predefined random fuzz (postmessage?) or getting a screenshot out (postmessage?)
JB: Seems like a great idea if we can do it for a reasonable cost.
JB: Bumping to next dev meeting (Aug 17) so we can continue discussing with AP, JO, and CM, who are out for Aug 10 meeting.
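
On the "getting a screenshot out (postmessage?)" idea, a rough sketch of what the sim side could do. The message format is hypothetical, and it assumes the sim renders to a canvas that can be read back:

// Inside the sim frame: rasterize the sim's canvas and hand the pixels to the parent page.
var canvas = document.getElementsByTagName( 'canvas' )[ 0 ];
window.parent && window.parent.postMessage( JSON.stringify( {
  type: 'screenshot',                        // hypothetical message type
  url: window.location.href,
  dataURL: canvas.toDataURL( 'image/png' )   // PNG as a data: URL, comparable across runs
} ), '*' );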

Expand browser-based tests

Phantom is an incredible platform for running automated headless tests, but I'm realizing that we probably want to run our tests on many platforms, including Firefox, Edge, iOS, etc., and not necessarily lock our only QA solution to phantom (which differs from all of the above). So we should investigate tests outside of phantom, even if they cannot be run permanently on a headless linux server.

Address review/discussion feedback for automated test harnesses

In today's discussion of #29 we recommended this additional work:

  • dot/kite/axon test configs have extra stuff => or just use one config file with different deps, see phetsims/chipper#638

  • move docs somewhere. grunt --help should point to the docs

  • maybe we can get rid of these (see #31)
    // if ( !window.assert ) {
    // assert.expect( 3 ); // TODO: this is a hack to suppress the "expected 0 tests but 5 were run" error.
    // }

  • can get rid of the 0s as well. (see #31)

Move phettest to a new machine

The main motivation is to speed up phettest. It would be great if the "Pull All" button could use parallel-pull-all.sh, but the current machine serving phettest (Mendeleev) can't handle that. (I'd need @jonathanolson's help on wiring that button up after switching machines)

Oh and the header for phettest would have to change to a new name, whatever that may be:

[screenshot of the current phettest header]

Automated changeset testing

There are many times when I've been waiting for local testing (aqua, unit tests, snapshot comparison) to complete before pushing local commits to master. It's been inconvenient, since it prevents me from starting new code changes until the testing is complete.

I'd like to consider something like the following (very open to modification):

  • Locally I'd identify two sets of SHAs (usually a "before" and "after" the changes that should be tested). Presumably master and commits on a feature branch.
  • There would be some way of automatically sending those up to an external server (bayes?) where processing starts.
  • It would run relevant tests, and use the snapshot comparison to identify if it changed anything visual/interactive. Presumably we'd send the server information about what to test (e.g. a Scenery change would probably do all tests and compare all sims, but an area-model-common change would have a much more limited testing).
  • Once complete (or in progress) you'd be able to view a report for the change. It would note all tests that fail before/after (generally only caring about things that changed), and it would provide a similar interface for the snapshot comparison that I've done (would be able to visually show a difference in any sim that it caused).
  • If the testing is as expected (passing), then I'd merge the branch into master.

I'm not sure how important some complexity would be (e.g. "only run tests for area-model sims"), but it would be possible to start with a simple interface and add anything needed.
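
To make the first two bullets concrete, one possible shape for the request sent to the server (purely illustrative; none of these fields exist yet):

// Sent from the developer's machine to the changeset-testing server (e.g. bayes).
var changesetTestRequest = {
  repo: 'scenery',                          // the repo whose change is under test
  beforeSHA: '<sha of master>',             // baseline, e.g. current master
  afterSHA: '<sha of the feature branch>',  // the change to evaluate
  tests: [ 'unit-tests', 'fuzz', 'snapshot-comparison' ],  // which kinds of tests to run
  simsToCompare: 'all'                      // or a narrower list, e.g. [ 'area-model-multiplication' ]
};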

Tagging for developer meeting to discuss if this would be helpful for others, priorities, features, etc.
