
tool-competition-av's Introduction

Cyber-Physical Systems Testing Tool Competition [OUTDATED]

This repository refers to the 2022 edition. To check the latest version and join the current competition, please visit the current repository.

Contacts

For more information on the 2022 edition, contact:

Dr. Alessio Gambi - IMC Krems, Austria

Dr. Vincenzo Riccio - Università di Udine, Italy

Goal

The SBST Workshop offers a challenge for software testers who want to work with self-driving cars in the context of the usual tool competition.

The competitors should generate virtual roads to test a lane keeping assist system using the provided code_pipeline.

The generated roads are evaluated in the BeamNG.tech driving simulator. This simulator is ideal for researchers due to its state-of-the-art soft-body physics simulation, ease of access to sensory data, and a Python API to control the simulation.

Video by BeamNG GmbH

Note: BeamNG GmbH, the company developing the simulator, kindly offers it for free for research purposes upon registration (see Installation).

Comparing the Test Generators

Deciding which test generator is best is far from trivial and currently remains an open challenge. In this competition, we rank test generators by considering various metrics of effectiveness and efficiency that characterize both the generated tests and the process of generating them, i.e., test generation. We believe that our approach to comparing test generators is objective and fair, and that it provides a compact metric for ranking them.

Ranking Formula

The formula to rank test generators is the following weighted sum:

rank = a * OOB_Coverage + b * test_generation_efficiency + c *  test_generation_effectiveness

where:

  • OOB_Coverage captures the effectiveness of the generated tests, which must expose as many failures as possible (i.e., Out Of Bound episodes) but also as many different failures as possible. We compute this metric by extending the approach adopted in the previous edition of the competition with our recent work on Illumination Search. As an example, our novel approach has already been adopted for the generation of relevant test cases from existing maps (see SALVO). Therefore, we identify the portions of the tests relevant to the OOBs, extract their structural and behavioral features, and populate feature maps of a predefined size (i.e., 25x25 cells). Finally, we define OOB_Coverage by counting the cells in the map covered by the exposed OOBs. Larger values of OOB_Coverage identify better test generators.

  • test_generation_efficiency captures the efficiency in generating, but not executing, the tests. We measure it as the inverse of the average time it takes the generators to create the tests, normalized using the following (standard) formula:

    norm(x) = (x - min) / (max - min)

    Where min and max are the minimum and maximum average test generation times found empirically during benchmarking across all the competitors.

  • test_generation_effectiveness captures the ability of the test generator to create valid tests; therefore, we compute it as the ratio of valid tests over all the generated tests.

Setting the Weights

We set the values of the weights in the ranking formula (i.e., a, b, and c) to rank higher the test generators that trigger many and diverse failures; test generation efficiency and effectiveness are given equal but secondary importance. The motivation behind this choice is that a test generator's main goal is to trigger failures, while being efficient and effective in generating the tests is of secondary importance.

The following table summarizes the proposed weight assignment:

a b c
0.6 0.2 0.2
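
Putting the formula and the weights together, a minimal sketch of how the final rank could be computed (purely illustrative; the variable names, the normalization helper, and the exact reading of "inverse" are assumptions, not the competition's actual scripts):

    A, B, C = 0.6, 0.2, 0.2  # the weights a, b, c from the table above

    def norm(x, min_value, max_value):
        # Min-max normalization, as in the formula above.
        return (x - min_value) / (max_value - min_value)

    def rank(oob_coverage, avg_generation_time, min_time, max_time,
             valid_tests, generated_tests):
        # One possible reading of "inverse of the normalized average generation
        # time": lower average generation times yield higher efficiency scores.
        efficiency = 1.0 - norm(avg_generation_time, min_time, max_time)
        # Effectiveness: ratio of valid tests over all generated tests.
        effectiveness = valid_tests / generated_tests
        return A * oob_coverage + B * efficiency + C * effectiveness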

Implement Your Test Generator

We make available a code pipeline that integrates your test generator with the simulator by validating, executing, and evaluating your test cases. Moreover, we offer some sample test generators to show how to use our code pipeline.

Information About the Competition

More information can be found on the SBST tool competition website: https://sbst22.github.io/tools/

Repository Structure

Code pipeline: code that integrates your test generator with the simulator

Self driving car testing library: library that helps the integration of the test input generators, our code pipeline, and the BeamNG simulator

Scenario template: basic scenario used in this competition

Documentation: contains the installation guide, detailed rules of the competition, and the frequently asked questions

Sample test generators: sample test generators already integrated with the code pipeline for illustrative purposes

Requirements: contains the list of the required packages.

License

The software we developed is distributed under the GNU GPL license. See the LICENSE.md file.

tool-competition-av's People

Contributors

alessiogambi, alessiogambi-passau, dgumenyuk, fse2020submission, p1ndsvin, spanichella


tool-competition-av's Issues

Time budget

Would it be possible to know the order of magnitude of the time budget during the competition? Will it be minutes, hours, days? It would be useful to know that to determine which approach is more suitable.

Thank you very much again.

Documentation for beamng-user

Setting up the pipeline is quite easy by following your documentation. However, I struggled a bit with the beamng-user argument: it is not documented in GUIDELINES.md under the section Technical considerations.

OOB_percentage reporting

Hello,
would it be possible to integrate oob_percentage reporting into the code pipeline? After the execution of a test case, it would be useful to obtain either the oob percentage value per state or the maximum percentage value, regardless of whether the executed test case passed or failed (similar to the reporting of the distance value within the execution data).

BeamNG Executor

Set up a BeamNG executor that uses beamngpy to set up the scenario, configures BeamNG.AI to drive from the first to the last point of the road, and collects all the simulation/sensor data except images and lidars. Data must be tagged using SIMULATION time (from the Timer sensor).

Creating a road makes the simulation loop

Additionally, the simulation goes into an infinite loop.

This is the code to generate the test:

        test.append( (10, 10) )
        test.append( (10, self.map_size -10) )
        test.append( (self.map_size -10, self.map_size -10) )
        test.append( (self.map_size -10, 10) )

Validating sharp turns

At the moment we do not check the validity of the roads with respect to the curvature, but we should. For example, invalid roads have turns so sharp that it is physically impossible for a car to drive on them.

Improve stats on execution time

Possibly, we should report the duration, in both real time and simulated time, of executing the tests, as well as the time spent generating them.

Unique OBE

Test generators that expose the same problem (e.g., OBE) multiple times are not that effective. We should report, in the statistics and after each failed test execution, whether or not a test failed in the same (or a similar) way before.

Problems with copying the TIG maps

If the script is executed from a directory other than the repository root, the BeamNG executor cannot find the "TIG" levels and fails.
For the moment the workaround is to copy them manually.

Mock Executor

The mock executor can rely on TUM logic for trajectory planning, but will not be able to provide all the data that BeamNG does.

Multi-threaded execution

We understand that running multiple simulations in parallel is not possible. However, is there any other restriction on the number of threads we can use? For instance, is it possible to run a background thread while the simulator is running?
Thanks again!

Distinguish failed test per type

Tests might fail because the car drives out of the lane (OBE) or because a timeout triggers (the ego-car does not move).
The statistics should keep track of this.

Test Oracles

We need to define basic test oracles (an illustrative sketch follows the list):

  • Stand-still oracle. If the car does not move for 10 seconds, the test fails.
  • OBE (approx.). We consider the position of the car (not the bounding boxes), and if it is outside the lane, we trigger a test failure.
  • Overall timeout. The test must complete within 5 minutes (configurable).
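
A minimal, illustrative sketch of how these oracles could be checked (the names, thresholds, and arguments are assumptions, not the pipeline's actual implementation):

    STAND_STILL_TIMEOUT_S = 10   # stand-still oracle threshold
    OVERALL_TIMEOUT_S = 5 * 60   # overall timeout (configurable)

    def evaluate_oracles(elapsed_s, seconds_since_last_move, inside_lane):
        # Return a failure reason, or None if no oracle is violated.
        # All arguments are assumed to be provided by the simulation loop.
        if seconds_since_last_move >= STAND_STILL_TIMEOUT_S:
            return "FAIL: car stood still for 10 seconds"
        if not inside_lane:
            return "FAIL: out-of-bound episode (OBE)"
        if elapsed_s >= OVERALL_TIMEOUT_S:
            return "FAIL: overall timeout exceeded"
        return None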

Factorize interface for test generators

Hi, I am new to this framework. I like the code pipeline and it is easy to use. I would like to share a few thoughts about the interface for generating the test cases :-).

As a new user of this code pipeline, I want to implement only the part of a test generator that describes the actual test case such that I don't need to focus on the test execution and reporting.

Suggestions for refactoring:

  • Factorize the interface of the test generators into an abstract base class (e.g., start and __init__ methods) that the test generators have to implement (see the sketch after this list).
  • Use dependency injection for road_points in the start method:
def start(road_points):
  # add points to `road_points`
  • In case of multiple tests the data structure of road_points could look like road_points = [test1_list_of_points, test2_list_of_points]
  • Extract the code for the test execution.
  • Extract the code for reporting the test outcome.
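
To make the first suggestion concrete, a minimal sketch of such an abstract base class (the names and signatures are hypothetical, not the current pipeline interface):

    from abc import ABC, abstractmethod
    from typing import List, Tuple

    RoadPoints = List[Tuple[float, float]]

    class TestGenerator(ABC):
        # Hypothetical base class that factors out execution and reporting.

        def __init__(self, map_size: int):
            self.map_size = map_size

        @abstractmethod
        def start(self, road_points: List[RoadPoints]) -> None:
            # Append one list of road points per generated test to `road_points`.
            ...

    class SquareTestGenerator(TestGenerator):
        # Example generator producing a single square-shaped road.
        def start(self, road_points: List[RoadPoints]) -> None:
            m = self.map_size
            road_points.append([(10, 10), (10, m - 10), (m - 10, m - 10), (m - 10, 10)])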

These are just my thoughts. What do you think about this design option? Do you see some drawbacks in this?

Car driving out of the lane but output is success

We have a question regarding the output of the tool.

In the description of the competition, it says that "a test fail if the ego-car does not move, or does not reach the end of the road within a timeout (computed over the length of the road), or drives off the lane."

Given the path [(10, 100), (40, 113), (60, 115), (80, 99), (100, 80), (120, 70), (140, 58), (160, 38), (180, 30)], the ego car goes out of the lane in the very first curve. However, the test continues to the end and the output of the test is "success". Shouldn't the output of this test be "failed"? What exactly does it mean to "drive off the lane"? We assume that crossing the yellow line should also be a failure.

Also, does the test always continue to the end even if there is a failure at an early stage? Or should the test actually stop around 00:08s, when the car goes out of the lane?

out_of_lane_480p.mov

Thank you.

Documentation

Improve the documentation to include installation instructions, examples of usage, and instructions to register and set up BeamNG.research.

Test Validators

Implement the listed test validators (a rough sketch of a few such checks is shown after the list), including:

  • min road length
  • max number of points
  • min curvature / road shapes
  • self-intersection and overlapping
  • "Type" checking. The test must be a list of tuples or similar. Not sure how to handle duck typing.

Missing PyOpenGL-accelerate Dependency

I am not sure if this is needed on all systems, but I also had to install the PyOpenGL-accelerate module. Maybe it can be added to requirements-37.txt.

Collect statistics about generation

We need a component that collects some statistics about the generation (a minimal sketch follows the list), including:

  • generated tests
  • valid and invalid tests
  • passed and failed and errored tests
  • execution time
  • generation time
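
A simple sketch of such a statistics container (field names are assumptions; the real component would also need to update and report these values):

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class GenerationStats:
        # Illustrative container for the statistics listed above.
        generated_tests: int = 0
        valid_tests: int = 0
        invalid_tests: int = 0
        passed_tests: int = 0
        failed_tests: int = 0
        errored_tests: int = 0
        execution_times_s: List[float] = field(default_factory=list)
        generation_times_s: List[float] = field(default_factory=list)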

"oob_distance" is NaN

In the current version, we are always obtaining NaN as the "oob_distance" value in the JSON files.

Example: "steering": 44.62020603589313,
"steering_input": -0.08308082203328948,
"brake": 0,
"brake_input": 0,
"throttle": 0,
"throttle_input": 0,
"wheelspeed": 9.729266273285269,
"vel_kmh": 35,
"is_oob": false,
"oob_counter": 0,
"max_oob_percentage": 0,
"oob_distance": NaN

After inspecting the code, we realized that in line 30 of oob_monitor.py the clauses of the inline if seem to be flipped.

Currently (lines 29-30):

last_max_oob_percentage = self.last_max_oob_percentage if oob_bb else float("nan")
oob_distance = float("nan") if oob_bb else self.oob_distance(wrt=wrt)

How we believe it should be:

last_max_oob_percentage = self.last_max_oob_percentage if oob_bb else float("nan")
oob_distance = self.oob_distance(wrt=wrt) if oob_bb else float("nan")

Could you confirm that our understanding is correct and fix the code in the repository?
Besides, can we assume that a negative value of oob_distance means that the car is out of bounds and the test failed?

Thank you very much in advance, and happy new year!

time.time_ns() method not available in Python 3.6

I just tried to follow the installation guide, which recommends Python 3.6 (at least it states that things were tested with 3.6).

The time.time_ns() method used on line 120 of competition.py was introduced in Python 3.7. It either has to be replaced, for example by int(round(time.time() * 1e9)), or the docs should not recommend Python 3.6.
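
One possible workaround, if Python 3.6 support is kept, is a small compatibility wrapper like the following (an assumption on how the fix could look, not the project's actual patch):

    import time

    def time_ns():
        # Use time.time_ns() where available (Python >= 3.7); otherwise fall
        # back to time.time(), which has lower resolution.
        if hasattr(time, "time_ns"):
            return time.time_ns()
        return int(round(time.time() * 1e9))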

Test Simulation Timeout

We do have a timeout mechanism in place that triggers if the test subject does not move. However, this is not yet tested, and we should probably also introduce a way to specify a timeout on the entire simulation.
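
A minimal sketch of what an overall simulation timeout could look like (the helper and its callables are hypothetical; the actual mechanism would live inside the executor):

    import time

    def run_with_timeout(step, is_done, timeout_s=300):
        # Run `step()` repeatedly until `is_done()` returns True or `timeout_s`
        # elapses. Returns True if the simulation finished, False on timeout.
        start = time.monotonic()
        while not is_done():
            if time.monotonic() - start > timeout_s:
                return False
            step()
        return True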

License

Add the following license header in all the source files:

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see https://www.gnu.org/licenses/.

Acknowledge and link this repository and its authors in case of usage.

BeamNG not starting, need configurable userpath

BeamNG does not start with my non-standard userpath (Windows' "My Documents" moved from standard C: drive to F:). This results in a timeout when running the competition scripts.

2021-01-11 14:23:39,040 ERROR    Uncaught exception:
Traceback (most recent call last):
  File "road_definition.py", line 50, in <module>
    main()
  File "road_definition.py", line 39, in main
    bng = beamng.open(launch=True)
  File "F:\Anaconda3\envs\sbst21\lib\site-packages\beamngpy\beamng.py", line 318, in open
    self.skt, addr = self.server.accept()
  File "F:\Anaconda3\envs\sbst21\lib\socket.py", line 205, in accept
    fd, addr = self._accept()
socket.timeout: timed out
2021-01-11 14:23:39,109 WARNING  sys:1: ResourceWarning: unclosed <socket.socket fd=696, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 64256)>

2021-01-11 14:23:39,131 WARNING  F:\Anaconda3\envs\sbst21\lib\subprocess.py:786: ResourceWarning: subprocess 11936 is still running
  ResourceWarning, source=self)

Checking Executor Precondition on Startup

The BeamNG executor requires some preconditions to be met, including environment variables like BNG_HOME. Currently, these are not checked and the execution simply fails with an error message.
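
A minimal sketch of such a startup check (the helper is hypothetical; BNG_HOME is the variable mentioned above):

    import os

    def check_beamng_preconditions():
        # Fail fast with a clear message if BNG_HOME is missing or invalid.
        bng_home = os.environ.get("BNG_HOME")
        if not bng_home:
            raise RuntimeError("BNG_HOME is not set; point it to your BeamNG installation.")
        if not os.path.isdir(bng_home):
            raise RuntimeError("BNG_HOME (%s) does not exist or is not a directory." % bng_home)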

Comments/suggestions on the whole repository structure

  1. I would refactor the current README.md. Specifically, let's separate the description of the repository from the actual information concerning the installation of the pipeline to run the competition. We should reduce the first page to its essential information and links; see the following suggestions.
    We need to have on the main page information like:
  • what the competition is about
  • a little overview of BeamNG research, making it clear we are using their simulator (with some reasoning about it)
  • if possible, a video of the simulation (there are many on YouTube) and potential scenarios
  • what we make available
  • where to find the code and the guidelines to run the competition
  2. Before going into the low-level details, we should cross-reference the SBST web page:
    • add the image and text concerning the competition on CPS
    • add the link to the original web page of the SBST tool competition: https://sbst21.github.io/tools/
  3. If possible, I would refactor the repository into folders:
  • "code-pipeline": containing the code to run the actual competition (partially described on the main page)
  • OPTIONAL folder "previous results": references to previous papers and tools that used the previous version of the pipeline
  • "datasets": previous and current datasets (if any are provided)
  4. For each folder mentioned in the previous point, add a low-level README.md file that describes its content.

  5. Maybe link to the chairs of the SBST tool competition somewhere, with links to their home pages?

  6. I would use a different structure for the README.md of the GUIDELINE, similar to the following:

"Setup Guide and Program Description"
The goal of this part is to give a brief description of how the competition code works.

Supported Operating Systems

  • Windows ...

Pre-Requisites

  • Python ...
  • Memory: XX GB
    ...
  • For usage on Windows, add ...

Setup Information
For information about setup and use, please refer to the instructions provided here.

Pipeline architecture description
...

Pipeline installation
...

TO RUN THE TOOL
....
...
"

For the architecture description: together with Alessio, we got a figure from the BeamNG team for a paper we wrote with them.

  7. Add a section reference, so that the original pipeline is linked to the repository.

  8. Let's add a separate section, "Competition Evaluation Methodology" (or something similar), to describe what the competition actually does.

  9. Move the installation considerations under the installation section of the pipeline.

  10. In general, try to have a README.md file for each sub-folder, so that whoever visits the repository does not get lost.
