informalsystems / modelator Goto Github PK

View Code? Open in Web Editor NEW

51.0 24.0 5.0 342.15 MB

Model-based testing tool

Home Page: https://mbt.informal.systems

License: Apache License 2.0

TLA 14.77% Python 80.34% Shell 3.95% Ruby 0.24% HTML 0.44% SCSS 0.26%

tla tlaplus model-checking model-based-testing rust

modelator's Introduction

Modelator

Modelator is a tool that enables automatic generation of tests from models. Modelator takes TLA+ models as its input and generates tests that can be executed against an implementation of the model.

Under the hood, Modelator uses Apalache, our in-house model checker, to generate tests from models.

Modelator is used by Atomkraft, a testing framework for the Cosmos blockchains network.

Installing Modelator

First, make sure your system has

Python 3.8 or newer
Java 17 or newer

To install Modelator, simply run pip install modelator. That's it! Please verify that the tool is working by writing modelator on the command line. You should see something like this:

For detailed installation instructions and clarifications of dependencies, check INSTALLATION.md

Using Modelator

For a full example of running Modelator together with the system implementation and the corresponding TLA+ model, see the Towers of Hanoi example.

To see all commands of the Modelator CLI, run modelator --help.

The command apalache provides the info about the current version of Apalache installed and enables you to download different versions. (In order to check the usage of this command, run modelator apalache --help. Do the same for details of all other commands.)

Commands load, info, and reset are used to manipulate the default value for a TLA+ model. (A default version is not strictly necessary since a model can always be given as an argument.)

The most useful commands are simulate, sample, and check. Command simulate will generate a number of (randomly chosen) behaviors described by the model. If you would like to obtain a particular model behavior, use command sample: it will generate behaviors described by a TLA+ predicate. Finally, to check if a TLA+ predicate is an invariant of the model, use check. To see all the options of these commands, run modelator simulate --help, modelator sample --help, or modelator check --help.

Contributing

If you encounter a bug or have a an idea for a new feature of Modelator, please submit an issue.

If you wish to contribute code to Modelator, set up the repository as follows:

git clone [email protected]/informalsystems/modelator
cd modelator
poetry install
poetry shell

License

Licensed under the Apache License, Version 2.0 (the "License"); you may not use the files in this repository except in compliance with the License. You may obtain a copy of the License at

https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

modelator's People

Contributors

Stargazers

Watchers

Forkers

isgasho alexander-n sudssf arufa-research andrey-kuprianov

modelator's Issues

Execute model checkers in a temporary directory

Currently the model checkers (Apalache and TLC) are run in the same directory where the executable is called. This pollutes the directory, and also has the possibility of name clashes. We need to execute model checkers in a temporary directory, similar to how it's done in Tendermint-rs LightClient tests. This also simplifies the cleanup task -- simply delete the tempdir.

`modelator apalache test <tla> <cfg>` should give a warning if no tests are found in the `<tla>` file.

Commit: 70e0fe1

It is possible for a user to pass a regular TLA spec and config to apalache test, meaning, one without tests as generated by first running generate-tests. If an invariant is specified in the config then it will generate a counterexample.

This is unlikely to be what the user is looking for, but there is no error message, so they won't realise.

Possible refactoring and API improvements

Pulling in here the suggestions from the reviews of PR #44, which should not necessarily addressed right away, but are rather about long-term improvements:

If you think it's useful, you can always add a run_with_default method which requires the Default bound and uses the default value of the system. #44 (comment)
remove redundant Default impls and ::new() fns. Instead derive Default: #44 (comment)
make run_tla_steps and run_tla_events convenient struct methods. The function bodies are very small and very direct anyway: #44 (comment)
use the power of AsRef or Deref: #44 (comment)
Why Action is generic type and Outcome is associated type? Action does not have trait bounds. We can make Action as an associated type too? : #44 (comment)
I saw some 'static and DeserializeOwned bounds here and there. Are these lifetimes sanitized? If it is not done yet, I can take a look later. It would remove unnecessary lifetime bounds. #44 (comment)

Cache doesn't work correctly

Caching of MC results doesn't work as expected. I have a failing run in CI, but the tests run fun on my laptop. After removing .modelator/cache dir, I get the same test failure. Upon the next test run, all tests pass again.

Related to #43

Reintroduce generic artifacts

We've had an initial attempt of introducing artifacts, which have then been deleted in favor of a simpler approach.

Artifacts are meant as a generic representation of (intermediate) stages of processing; each should come with the clean description of the format, and with the ability to (de-)serialise to/from files or strings in a generic way.

Current artifacts include:

TLA file
Cfg file
TLA trace
JSON trace

Currently, e.g. TLA and Cfg files are represented as references to filesystem paths, which makes it really hard to operate them independently of that. Instead each artifact should know only it's content, no matter where it comes from, and be able to show/save it upon request. This will make it transparent to integrate into the modules, and further into the CLI.

What to do:

Bring back (and may be rework/extend) the Artifact trait
Describe clearly what is the format of the current artifacts (related to #58)
Refactor the current artifacts to implement that trait

Allow to select model checker version

Currently here the model checker choice is limited to unversioned TLC or Apalache, which defaults to already pretty outdated versions of both. We need to change this scheme as follows:

leave those options, but now they should refer to the latest version of the model checkers bundled with the corresponding modelator release
add versioned options, e.g. TlcVersion(String), and ApalacheVersion(String), that would attempt to download a specific version of model checkers. This will allow both to deterministically fix the MC versions for some tests, as well as to experiment with the newer versions of model checkers.

A protocol and a prototype for MBT-core <-> MBT-Rust communication

Following the description outlined in #2 (comment)., we need to split the jobs of generating and executing tests into MBT-core and MBT-Rust plugin. For that we need to devise a simple protocol for the communication of MBT-core and MBT-Rust, and do a prototype implementation of it.

As a testing ground for the protocol, it is proposed to take the work done in IBC-rs#701, and to separate it into MBT-Core and MBT-Rust:

MBT-core will have the task of running TLC on the provided spec, and generating tests from it. It will run as a daemon listening for local requests.
MBT-Rust will run the given test executor implementation on the tests obtained via communication with MBT-core.

In the first iteration we should not care too much about the particularities of the implementation, but to focus on obtaining a minimal working prototype that has MBT-core and MBT-Rust separated and communicating via a protocol. This will be refactored later for a better code structure.

Sources do not compile with Rust v. 1.54

After updating to Rust v. 1.54 the sources stop to compile, giving multiple variants of the following error:

error[E0277]: the trait bound `ApalacheMethods: clap::Args` is not satisfied
  --> modelator/src/cli/mod.rs:24:14
   |
24 |     Apalache(ApalacheMethods),
   |              ^^^^^^^^^^^^^^^ the trait `clap::Args` is not implemented for `ApalacheMethods`
   |
   = note: required by `augment_args_for_update`

@rnbguy could you please take a look?

Add `parse` method to `apalache` module

Leveraging apalache's parsing capabilities is the first step towards: #27
After parsing the TLA+ model using apalache, we can hash its output and use it as an identifier of that model (which is a requirement for #27).

Reliable extraction of counterexample traces

As demonstrated by #56, our extraction of traces from counterexamples is incomplete.

The way it works now, is via an ad-hoc TLA+ parser, that is specific to the simple variant of TLA+ encountered in counterexamples.

Strategically, the requirement is that we support both Apalache and TLC in the near future, though for the nearest future (couple of weeks / a month) having only Apalache might be enough.

We need to weigh on a number of alternatives on how to proceed:

Extend / harden moderator's current ad-hoc parser. The advantage is that it should support both Apalache and TLC. The disadvantage is that it is a bit fragile wrt. minor format changes. So if following this route the parser should be made as relaxed wrt. input as possible.
Rework the modelator's TLA+ Jsonnet library: it was written for the older Apalache JSON format, and needs to be adapted to the new Apalache's JSON format. Then, hopefully, TLC output can be rerouted via Apalache? Not sure about this...
Use some third-party solution, e.g. Tla2Json, which has been used previously by Dan (he can elaborate more on that). This tool is specific to TLC, and as far as I have quickly checked it doesn't support the output as produced by Apalache.

Any further ideas are welcome! We should keep in mind that we can ask Apalache team to implement some features if needed, but can't easily do so for TLC.

Running binary with only the argument `tla` causes panic instead of giving warning to user.

Bug:

Running binary ./modelator tla panics.

Expected:

./modelator --help gives

SUBCOMMANDS:
    apalache    Generate TLA+ traces using Apalache
    tla         Generate TLA+ test cases and parse TLA+ traces
    tlc         Generate TLA+ traces using TLC

so it is expected that a user will try to pass only tla arg.

Log:

thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', modelator/src/cli/mod.rs:29:17
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Additional info:

Tested on release in versions 0.2.0 and (06ed539eb) latest

Confusing "No test trace found" error state

The error state

Err(
    NoTestTraceFound(
        "mc.log",
    ),
)

Is produced by several different error conditions, making debugging hard and confusing. For example:

It was produced when I forgot an invariant in IBCTests.cfg. In TLC this yielded the much more informative error message The invariant Neg specified in the configuration file is not defined in the specification.
It is produced when any of the *Test operators in a TLA+ file does not yield an error trace when negated. A single one of the *Test operators in a TLA+ spec not yielding an error trace when negated will result in this error being produced with no indication of which one caused the problem. To find and fix it involves a slow process of trying to negate each *Test operator and seeing if a trace is produced.

Generate multiple tests using TLC

As discussed previously, we want to generate multiple tests from one test assertion. This is possible already for TLC using the --continue option; for Apalache this will become possible when informalsystems/apalache#542 is resolved (to be tracked separately).

Parallelise execution of individual TLA+ test assertions

Testing for each TLA+ test assertion is completely independent from all others, and thus they can be parallelised. This will greatly improve user experience in terms of waiting time.

MBT-core: TLA+ history can be prohibitively heavy

Related: #4

After adding a history variable to my model, TLC doesn't finish because my machine runs out of disk space.
I worked around it by simply having a previousState variable.

I believe such variable will suffice for most use-cases (I checked and this is the case in most of the light client tests (e.g. https://github.com/informalsystems/tendermint-rs/blob/master/light-client/tests/support/model_based/LightTests.tla#L182-L187)) and should be sufficiently lightweight to not affect model checking performance/termination.

Translate`.tla` counterexample using `modelator tla tla-trace-to-json-trace <filename>` results in parsing error.

Bug:

Running ./modelator tla tla-trace-to-json-trace examples/counterexample1.tla does not work. Results in parsing error.

Content of counterexample.tla is at the bottom of this issue.

Expected:

JSON output for translated tla.

Log:

{
  "status": "error",
  "result": {
    "Nom": "Parsing Failure: Error { input: \"(((0\\n                  :> [creator |-> \\\"vt2\\\",\\n                    promiseId |-> 0,\\n                    type |-> \\\"vat\\\",\\n                    watchers |-> {}]\\n                @@ 1\\n                  :> [creator |-> \\\"None\\\",\\n                    promiseId |-> 0,\\n                    type |-> \\\"blank\\\",\\n                    watchers |-> {}])\\n              @@ 2\\n                :> [creator |-> \\\"None\\\",\\n                  promiseId |-> 0,\\n                  type |-> \\\"blank\\\",\\n                  watchers |-> {}])\\n            @@ 3\\n              :> [creator |-> \\\"vt1\\\",\\n                promiseId |-> 0,\\n                type |-> \\\"vat\\\",\\n                watchers |-> { \\\"vt0\\\", \\\"vt1\\\" }])\\n          @@ 4\\n            :> [creator |-> \\\"None\\\",\\n              promiseId |-> 0,\\n              type |-> \\\"blank\\\",\\n              watchers |-> {}])\\n        @@ 5\\n          :> [creator |-> \\\"None\\\",\\n            promiseId |-> 0,\\n            type |-> \\\"blank\\\",\\n            watchers |-> {}]\\n    /\\\\ cnt_promise = 0\\n    /\\\\ curr = \\\"vt0\\\"\\n    /\\\\ step = \\\"init\\\"\\n\\n(* Transition 3 to State1 *)\\n\", code: Satisfy }"
  }
}

Additional info:

Tested on release in (06ed539eb) latest

.tla file in question:

---------------------------- MODULE counterexample ----------------------------

EXTENDS MC_userspace

(* Constant initialization state *)
ConstInit == TRUE

(* Initial state *)
State0 ==
  bank
      = ((((0
                  :> [creator |-> "vt2",
                    promiseId |-> 0,
                    type |-> "vat",
                    watchers |-> {}]
                @@ 1
                  :> [creator |-> "None",
                    promiseId |-> 0,
                    type |-> "blank",
                    watchers |-> {}])
              @@ 2
                :> [creator |-> "None",
                  promiseId |-> 0,
                  type |-> "blank",
                  watchers |-> {}])
            @@ 3
              :> [creator |-> "vt1",
                promiseId |-> 0,
                type |-> "vat",
                watchers |-> { "vt0", "vt1" }])
          @@ 4
            :> [creator |-> "None",
              promiseId |-> 0,
              type |-> "blank",
              watchers |-> {}])
        @@ 5
          :> [creator |-> "None",
            promiseId |-> 0,
            type |-> "blank",
            watchers |-> {}]
    /\ cnt_promise = 0
    /\ curr = "vt0"
    /\ step = "init"

(* Transition 3 to State1 *)
State1 ==
  bank
      = ((((0
                  :> [creator |-> "vt2",
                    promiseId |-> 0,
                    type |-> "vat",
                    watchers |-> {}]
                @@ 3
                  :> [creator |-> "vt1",
                    promiseId |-> 0,
                    type |-> "vat",
                    watchers |-> { "vt0", "vt1" }])
              @@ 5
                :> [creator |-> "vt0",
                  promiseId |-> 1,
                  type |-> "promise",
                  watchers |-> {"vt0"}])
            @@ 1
              :> [creator |-> "vt0",
                promiseId |-> 1,
                type |-> "resolver",
                watchers |-> {"vt0"}])
          @@ 2
            :> [creator |-> "None",
              promiseId |-> 0,
              type |-> "blank",
              watchers |-> {}])
        @@ 4
          :> [creator |-> "None",
            promiseId |-> 0,
            type |-> "blank",
            watchers |-> {}]
    /\ cnt_promise = 1
    /\ curr = "vt0"
    /\ step = "storePromise"

(* Transition 1 to State2 *)
State2 ==
  bank
      = ((((3
                  :> [creator |-> "vt1",
                    promiseId |-> 0,
                    type |-> "vat",
                    watchers |-> { "vt0", "vt1" }]
                @@ 2
                  :> [creator |-> "None",
                    promiseId |-> 0,
                    type |-> "blank",
                    watchers |-> {}])
              @@ 0
                :> [creator |-> "vt2",
                  promiseId |-> 0,
                  type |-> "vat",
                  watchers |-> {}])
            @@ 1
              :> [creator |-> "vt0",
                promiseId |-> 1,
                type |-> "resolver",
                watchers |-> {"vt0"}])
          @@ 4
            :> [creator |-> "None",
              promiseId |-> 0,
              type |-> "blank",
              watchers |-> {}])
        @@ 5
          :> [creator |-> "vt0",
            promiseId |-> 1,
            type |-> "promise",
            watchers |-> {"vt0"}]
    /\ cnt_promise = 1
    /\ curr = "vt0"
    /\ step = "sendItem"

(* Transition 2 to State3 *)
State3 ==
  bank
      = ((((0
                  :> [creator |-> "vt2",
                    promiseId |-> 0,
                    type |-> "vat",
                    watchers |-> {}]
                @@ 2
                  :> [creator |-> "vt0",
                    promiseId |-> 0,
                    type |-> "vat",
                    watchers |-> {"vt0"}])
              @@ 1
                :> [creator |-> "vt0",
                  promiseId |-> 1,
                  type |-> "resolver",
                  watchers |-> {"vt0"}])
            @@ 3
              :> [creator |-> "vt1",
                promiseId |-> 0,
                type |-> "vat",
                watchers |-> { "vt0", "vt1" }])
          @@ 5
            :> [creator |-> "vt0",
              promiseId |-> 1,
              type |-> "promise",
              watchers |-> {"vt0"}])
        @@ 4
          :> [creator |-> "None",
            promiseId |-> 0,
            type |-> "blank",
            watchers |-> {}]
    /\ cnt_promise = 1
    /\ curr = "vt0"
    /\ step = "storeSelfRef"

(* Transition 4 to State4 *)
State4 ==
  bank
      = ((((0
                  :> [creator |-> "vt2",
                    promiseId |-> 0,
                    type |-> "vat",
                    watchers |-> {}]
                @@ 1
                  :> [creator |-> "None",
                    promiseId |-> 0,
                    type |-> "blank",
                    watchers |-> {}])
              @@ 3
                :> [creator |-> "vt1",
                  promiseId |-> 0,
                  type |-> "vat",
                  watchers |-> { "vt0", "vt1" }])
            @@ 4
              :> [creator |-> "None",
                promiseId |-> 0,
                type |-> "blank",
                watchers |-> {}])
          @@ 5
            :> [creator |-> "None",
              promiseId |-> 0,
              type |-> "blank",
              watchers |-> {}])
        @@ 2
          :> [creator |-> "vt0",
            promiseId |-> 0,
            type |-> "vat",
            watchers |-> {"vt0"}]
    /\ cnt_promise = 1
    /\ curr = "vt0"
    /\ step = "resolve"

(* Transition 0 to State5 *)
State5 ==
  bank
      = ((((3
                  :> [creator |-> "vt1",
                    promiseId |-> 0,
                    type |-> "vat",
                    watchers |-> { "vt0", "vt1" }]
                @@ 4
                  :> [creator |-> "None",
                    promiseId |-> 0,
                    type |-> "blank",
                    watchers |-> {}])
              @@ 5
                :> [creator |-> "None",
                  promiseId |-> 0,
                  type |-> "blank",
                  watchers |-> {}])
            @@ 0
              :> [creator |-> "vt2",
                promiseId |-> 0,
                type |-> "vat",
                watchers |-> {}])
          @@ 2
            :> [creator |-> "vt0",
              promiseId |-> 0,
              type |-> "vat",
              watchers |-> {"vt0"}])
        @@ 1
          :> [creator |-> "None",
            promiseId |-> 0,
            type |-> "blank",
            watchers |-> {}]
    /\ cnt_promise = 1
    /\ curr = "vt1"
    /\ step = "transferControl"

(* The following formula holds true in the last state and violates the invariant *)
InvariantViolation ==
  LET Call_hist ==
    <<
      [step |-> $C$6, curr |-> $C$152, cnt_promise |-> $C$8, bank |-> $C$68], [step |->
          $C$758,
        curr |-> $C$689,
        cnt_promise |-> $C$759,
        bank |-> $C$757], [step |-> $C$5430,
        curr |-> $C$5337,
        cnt_promise |-> $C$5431,
        bank |-> $C$5429], [step |-> $C$11472,
        curr |-> $C$11331,
        cnt_promise |-> $C$11473,
        bank |-> $C$11471], [step |-> $C$20258,
        curr |-> $C$20021,
        cnt_promise |-> $C$20259,
        bank |-> $C$20257], [step |-> $C$34539,
        curr |-> $C$34110,
        cnt_promise |-> $C$34540,
        bank |-> $C$34538]
    >>
  IN
  LET Example_si_2 ==
    { (Call_hist)[i$22]["step"]: i$22 \in DOMAIN Call_hist }
        = { "init",
          "resolve",
          "storePromise",
          "storeSelfRef",
          "unwatchSlot",
          "sendItem",
          "transferControl" }
      \/ { (Call_hist)[i$23]["step"]: i$23 \in DOMAIN Call_hist }
        = { "init",
          "resolve",
          "storePromise",
          "storeSelfRef",
          "sendItem",
          "transferControl" }
  IN
  Example_si_2

================================================================================
(* Created by Apalache on Fri Aug 13 09:44:40 BST 2021 *)
(* https://github.com/informalsystems/apalache *)

Make modelator crate publishable

We want to publish modelator crate in the next couple of days; as it turned out this is not possible after including JARs into the crate (see #21), as the crate size then exceeds the allowed limit of 10MB.

We need either to temporarily revert the changes done in #21, or think about some other solution avoid packaging the JARs into the crate.

Windows test build fails

See this test run:

TLC is not found for some reason
TMate session never finishes (I think it is disabled for macOS and Ubuntu builds)

Make top-level structure with separate crates and Clap

I am very much impressed by the code structure and quality of https://github.com/CertainLach/jrsonnet. (not taking into account the absent documentation). In particular:

usage of separate crates for various functions
Clap-rs for command-line interface

We need to adopt the best practices from there.
This issue is complete when there is one crate for the current source code exported from tendermint-rs, and one crate for CLI with Clap.

MBT-core: weights for test instantiation

In the current way of MBT functionality, see e.g. LightClient MBT tests, we have a lot of redundancy:

there is a joint set of tests in LightTests.tla
various subsets of this set of tests are listed in configuration for various instances of the model, see, e.g. MC4_4_faulty.json, or MC100_2_faulty.json.

The reason for that is that we want to limit which tests we execute for each model instance based essentially on the test execution time: various tests have different complexity, and this test complexity is multiplied to the model instance complexity, giving the total test execution time.

This redundancy can be eliminated if we introduce the weights system to approximate the complexity/time of each test for each model instance:

each test gets a weight, which is the approximate measure of its complexity or execution time. These weights can also be approximated by running the tests for small instance sizes, and taking the test execution time as its weight.
model instantiation parameters, such as listed in MC4_4_faulty.tla, also get weights that reflect their influence on the test execution time
the total test weight is calculated as some formula over the test weight and the weights of the model instance parameters. This formula can be as simple as product.

Such weights system will allow us to eliminate the whole bunch of static files currently necessary for MBT. E.g., a user could provide the MBT-core with the model, ranges for instantiation parameters, and the set of tests, and then ask the system to generate all tests such that the total test weight is smaller then the given bound.

Treat TLA+ test assertions as individual tests

In the current implementation TLA+ tests are treated collectively: if one test fails, the whole testing is stopped.

We need rather to treat each TLA+ test assertion as an individual test, and produce the output that summarises the number of successful/failed tests.

Add timeout option for test execution time.

Similar to Tendermint-rs Lightclient tests, we need to add the possibility for the user to specify a timeout on how long the test execution should take at most.

Rework Apalache module

The current Apalache module needs some improvements.

adopt generic artefacts as input & output (see #61)
get rid of any path assumptions; the directory to execute in should be passed as a parameter (or set via some global option)
return meaningful error messages by extracting the required information from Apalache error messages

Store failed tests as a regression test suite

We need to behave similarly to Proptest, and store all failed tests as a regression test suite, that we rerun on each pass.

Until we truly implement semantic versioning and reproducibility for tests, this can be done by tracking hashes of all inputs (TLA+ model, config, and test assertion). In case any of the hashes changes, the corresponding regression tests are invalidated.

Check Java existence and compatibility

Both Apalache and TLC modules depend on Java for their execution.

We need:

determine which versions of Java are accepted by each of them, and compute the intersection
when any of the modules is being executed, check that the corresponding version of Java is installed locally. If not, provide a meaningful error message advising the user to install one of the versions accepted by both tools

JsonRPC via Jsonnet

According to the architectural decision in #2 (comment), we are going to use JSON-RPC for interacting with modules.

Taking into account that many modules are going to be implemented using Jsonnet, and also that Json transformations are much easier there than e.g. in Rust, it makes sense to create a simple Jsonnet library that will allow to wrap/unwrap requests, responses, and errors into JSON-RPC envelopes. The library should be usable in different settings:

from another Jsonnet file, such that it can be transformed into a full-fledged module
from Rust, such that some non-Jsonnet module can be also wrapped/unwrapped
from the command line -- for testing/debugging, or for integrating into scripts

System in run_tla_steps has trait bound of debug

Is this really necessary? System, at least how I'm using it, tends to be a struct with a grab bag of state used to run the test. For example, a GRPC library which is statefully initialized with a url. The library I'm using does not implement Debug, so I cannot derive Debug on System, and need to implement it manually.

MBT-core: test configuration

We need to design a configuration format to describe model-based tests.

The config format should provide the fields to specify:

the TLA+ model
whether to generate monitor for the model (see #3)
whether to generate history for the model (see #4)
the instantiation parameters (see #5)
the TLA+ test assertions: inline in the config, or via a file reference. The test assertions can be built using the auto-generated monitor and history
the weight for the test assertions and instantiation parameters (see #7)

Extract language/repo-independent features of Testgen

With the aim of creating MBT-core we will extract some language/repo-independent features from Tendermint-rs/testgen into this repo.

Meta: Refactoring for version 0.3

This is a meta-issue to keep track of refactorings necessary for the version 0.3, to be release end of August.
We will not do a complete refactoring to fully implement #2; instead we will do a sequence of simple steps, each making the tool more reliable and easier to use.

The tasks are listed in the approximate order of their execution, but some can be done in parallel.

Run CI on more environments

Currently we run CI only on Ubuntu:

modelator/.github/workflows/ci.yml

Line 8 in a39f9fa

runs-on: ubuntu-latest

This should be extended to include MacOS and Windows.

Modelator fails with latest Apalache

Recent Apalache versions report counterexamples of model invariants as errors.

# Tool home: system
# Package:   
# JVM args:  -Xmx4096m -DTLA-Library=
#
# APALACHE version 0.15.13 build ad76fbf
#
# WARNING: This tool is in the experimental stage.
#          Please report bugs at: [https://github.com/informalsystems/apalache/issues]
# 
# Usage statistics is OFF. We care about your privacy.
# If you want to help our project, consider enabling statistics with config --enable-stats=true.

Checker options: filename=Simple_Test.tla, init=, next=, inv=     I@18:04:41.045
Tuning:                                                           I@18:04:41.242
PASS #0: SanyParser                                               I@18:04:41.243
Parsing file /home/playground/Simple_Test.tla
Parsing file /home/playground/Simple.tla
Parsing file /home/playground/Ops.tla
Parsing file /tmp/Integers.tla
Parsing file /tmp/Naturals.tla
PASS #1: TypeCheckerSnowcat                                       I@18:04:41.520
 > Running Snowcat .::.                                           I@18:04:41.521
 > Your types are great!                                          I@18:04:41.592
 > All expressions are typed                                      I@18:04:41.593
PASS #2: ConfigurationPass                                        I@18:04:41.619
  > Simple_Test.cfg: Loading TLC configuration                    I@18:04:41.622
  > Using the init predicate Init from the TLC config             I@18:04:41.654
  > Using the next predicate Next from the TLC config             I@18:04:41.654
  > Simple_Test.cfg: found INVARIANTS: TestNeg                    I@18:04:41.654
  > Set the initialization predicate to Init                      I@18:04:41.655
  > Set the transition predicate to Next                          I@18:04:41.655
  > Set an invariant to TestNeg                                   I@18:04:41.655
PASS #3: DesugarerPass                                            I@18:04:41.666
  > Desugaring...                                                 I@18:04:41.666
PASS #4: UnrollPass                                               I@18:04:41.677
  > Unroller                                                      I@18:04:41.686
PASS #5: InlinePass                                               I@18:04:41.702
  > InlinerOfUserOper                                             I@18:04:41.703
  > Wrap                                                          I@18:04:41.705
  > CallByNameOperatorEmbedder                                    I@18:04:41.706
  > LetInExpander                                                 I@18:04:41.707
  > Unwrap                                                        I@18:04:41.707
  > InlinerOfUserOper                                             I@18:04:41.707
Leaving only relevant operators: CInitPrimed, Init, InitPrimed, Next, TestNeg I@18:04:41.709
PASS #6: PrimingPass                                              I@18:04:41.714
  > Introducing InitPrimed for Init'                              I@18:04:41.715
PASS #7: VCGen                                                    I@18:04:41.723
  > Producing verification conditions from the invariant TestNeg  I@18:04:41.723
  > VCGen produced 1 verification condition(s)                    I@18:04:41.728
PASS #8: PreprocessingPass                                        I@18:04:41.735
  > Before preprocessing: unique renaming                         I@18:04:41.736
 > Applying standard transformations:                             I@18:04:41.741
  > PrimePropagation                                              I@18:04:41.741
  > Desugarer                                                     I@18:04:41.750
  > UniqueRenamer                                                 I@18:04:41.761
  > Normalizer                                                    I@18:04:41.769
  > Keramelizer                                                   I@18:04:41.780
  > After preprocessing: UniqueRenamer                            I@18:04:41.792
PASS #9: TransitionFinderPass                                     I@18:04:41.799
  > Found 1 initializing transitions                              I@18:04:41.805
  > Found 1 transitions                                           I@18:04:41.808
  > No constant initializer                                       I@18:04:41.808
  > Applying unique renaming                                      I@18:04:41.809
PASS #10: OptimizationPass                                        I@18:04:41.819
 > Applying optimizations:                                        I@18:04:41.822
  > ConstSimplifier                                               I@18:04:41.823
  > ExprOptimizer                                                 I@18:04:41.824
  > ConstSimplifier                                               I@18:04:41.825
PASS #11: AnalysisPass                                            I@18:04:41.828
 > Marking skolemizable existentials and sets to be expanded...   I@18:04:41.829
  > SkolemizationMarker                                           I@18:04:41.829
  > ExpansionMarker                                               I@18:04:41.830
 > Running analyzers...                                           I@18:04:41.831
  > Introduced expression grades                                  I@18:04:41.838
  > Introduced 0 formula hints                                    I@18:04:41.838
PASS #12: PostTypeCheckerSnowcat                                  I@18:04:41.838
 > Running Snowcat .::.                                           I@18:04:41.838
 > Your types are great!                                          I@18:04:41.855
 > All expressions are typed                                      I@18:04:41.856
PASS #13: BoundedChecker                                          I@18:04:41.865
State 0: Checking 1 state invariants                              I@18:04:42.079
Step 0: picking a transition out of 1 transition(s)               I@18:04:42.082
State 1: Checking 1 state invariants                              I@18:04:42.089
Step 1: picking a transition out of 1 transition(s)               I@18:04:42.089
State 2: Checking 1 state invariants                              I@18:04:42.092
Step 2: picking a transition out of 1 transition(s)               I@18:04:42.092
State 3: Checking 1 state invariants                              I@18:04:42.094
Step 3: picking a transition out of 1 transition(s)               I@18:04:42.095
State 4: Checking 1 state invariants                              I@18:04:42.097
Step 4: picking a transition out of 1 transition(s)               I@18:04:42.098
State 5: Checking 1 state invariants                              I@18:04:42.102
Step 5: picking a transition out of 1 transition(s)               I@18:04:42.104
State 6: Checking 1 state invariants                              I@18:04:42.107
State 6: state invariant 0 violated. Check the counterexample in: counterexample1.tla, MC1.out, counterexample1.json E@18:04:42.132
Found 1 error(s)                                                  I@18:04:42.136
The outcome is: Error                                             I@18:04:42.138
Checker has found an error                                        I@18:04:42.139
It took me 0 days  0 hours  0 min  1 sec                          I@18:04:42.139
Total time: 1.142 sec                                             I@18:04:42.139
EXITCODE: ERROR (12)

which makes modelator report it as a failure too.

modelator/modelator/src/model/checker/apalache/mod.rs

Lines 162 to 173 in 49a905d

    
           (false, true) => { 
        
               // check if a failure has occurred 
        
               if stdout.contains("EXITCODE: ERROR") { 
        
                   return Err(Error::ApalacheFailure(ApalacheError::new(&stdout))); 
        
               } 
        
               assert!( 
        
                   stdout.contains("EXITCODE: OK"), 
        
                   "[modelator] unexpected Apalache stdout" 
        
               ); 
        
               Ok(ModelCheckerStdout::from_string(&stdout)?) 
        
           }

We should handle this EXITCODE as a special case.

Event-driven, abstract-state, unit-testing framework

In order to ease adoption of MBT, we are going to do it in two steps:

construct a unit-testing framework, which will allow the users to describe their unit tests in terms of abstract system states.
reuse this unit-testing framework for MBT

The idea is that it will be beneficial for everyone:

writing tests in terms of abstractions should be much easier, than in terms of concrete data structures.
those abstract states will be exactly the states of a TLA+ model. And if the user is already using the framework, it means they constructed already the translation layer between abstract and concrete system states -- that's what we need for MBT.

Add a command-line interface for each module

Currently we have 3 modules (tla, tlc and apalache) and 4 methods:

tla
- generate_tests(tla_file: TlaFile, tla_config_file: TlaConfigFile) -> Vec<(TlaFile, TlaConfigFile)>
- tla_trace_to_json_trace(tla_trace: TlaTrace) -> JsonTrace
tlc
- test(tla_file: TlaFile, tla_config_file: TlaConfigFile, options: Options) -> TlaTrace
apalache
- test(tla_file: TlaFile, tla_config_file: TlaConfigFile, options: Options) -> TlaTrace

Each of the above methods should be possible to be invoked using the command-line:

modelator tla generate_tests tla_file tla_config_file
modelator tlc test generated_tla_file generated_config_file > tla_trace_file
modelator tla tla_trace_to_json_trace tla_trace_file > json_trace_file

Make JARs download optional

In the current implementation the JARs are downloaded automatically here. This might create unnecessary delays, network traffic, or even could prevent the tool from functioning, not to mention that the users could be rightly suspicious of the tool that downloads something under the hood.

I think we need to pack with each release certain versions of the Jars statically, using include_bytes. We will bundle the files we know do work, and we have tested.

Additionally the user can be provided with the option to download the latest versions of the Jars, but this is optional for now.

Rework TLC module

The current TLC module needs some improvements.

adopt generic artefacts as input & output (see #61)
get rid of any path assumptions; the directory to execute in should be passed as a parameter (or set via some global option)
return meaningful error messages by extracting the required information from TLC error messages

Should we keep dead/unused code in comments?

Currently there are instances of chunks of commented out code interleaved with live code. For instance

        // TODO: disabling cache for now; see https://github.com/informalsystems/modelator/issues/46
        // load cache and check if the result is cached
        // let mut cache = TlaTraceCache::new(options)?;
        // let cache_key = TlaTraceCache::key(&tla_file, &tla_config_file)?;
        // if let Some(value) = cache.get(&cache_key)? {
        //     return Ok(value);
        // }

in src/../apalache/mod.rs.

Similarly, many docstrings are marked with ignore.

Leaving commented out code is considered by some to be a smell. There is a philosophy that the only things in main should be 'live' docs and code. Following such guidelines implies it's better to simply delete non-running code. The rationality is that any commented out code will become irrelevant so quickly as to make it useless, unless the turnaround to bring it back to life is extremely short.

I'm inclined to agree with this kind of philosophy, but code style is a very subjective issue so it should be agreed and discussed as a team.

Refactor error handling

We do indeed need to refactor the error reporting, as highlighted by #52:

make the errors as specific as possible, not bundle them together into a huge super-error (e.g. report per each test assertion, not per file with a bunch of assertions)
propagate a reasonable amount of information from the model checkers, to help the user understanding the errors

MBT-core: auto-generate TLA+ history

TLA+ code that adds history tracking to some existing TLA+ is quite simple and can be auto-generated.

Consider for example this fragment from LightClient model-based tests.

Part of MBT-core functionality should be generating such code on the fly for arbitrary specifications. For that:

convert TLA+ to JSON via apalache-mc parse
add modifications to JSON encoding
output it, or convert back to TLA+

Improve Modelator's Rust API

The current interaction between a Rust developer that wants to write model-based tests and the Modelator interface for that is suboptimal (see runner.rs).

Consider how it is used in IBC-rs: mod.rs, step.rs. There is a lot of boilerplate repetitive code that can be substantially simplified.

For that we are going to make Modelator's Rust-facing interface doing more job in terms of converting between concrete and abstract datastructures, as well as in terms of executing the test steps.

Abstract TLC and Apalache interfaces into a ModelChecker trait.

MBT-core: architecture

MBT-core implements the core, target language independent, functionality for model-based testing.

Figure out the architecture for the MBT core:

decomposition into components
test configuration: format (#6), instantiation (#7)
interaction with Apalache, Jsonart, language plugins
evaluate boundaries of automatic code generation
- auto-generate TLA+ monitor (#3) , history (#4), model instance (#5)
CI

Extend integration tests

While the current integration tests are OKish, on the other hand they are way too simple.

We need to create several integration tests, that:

touch on various aspects of TLA+
allow to illustrate Modelator's techniques and benefits

Non-deterministic test failures & behaviour

Our tests behave non-deterministically:

from time to time some tests fail: I get the failures for tester::tests::test as well as for apalache. The weird thing is that most times they succeed, but sometimes just fail;
the temporary files created in tests/integration/tla are not always deleted. This could be fixed via #33.

Most frequently this is triggered when something needs to be recompiled, but not necessarily so.

We need to fix that -- a testing tool should have it's own testing done right;)

Add generic modules

We need to add abstract module interface with manifests as outlined in #2 (comment) and
#2 (comment).

MBT-core: auto-generate TLA+ monitor

TLA+ code that adds monitoring to some existing TLA+ is quite simple and can be auto-generated.

Consider for example this fragment from LightClient model-based tests.

Part of MBT-core functionality should be generating such code on the fly for arbitrary specifications. For that:

convert TLA+ to JSON via apalache-mc parse
add modifications to JSON encoding
output it, or convert back to TLA+

Generic Jsonnet executor

We are going to have a large number of MBT-core modules implemented as small Jsonnet programs. We need to implement the way to execute arbitrary Jsonnet program that satisfies the requirements outlined in #2 (comment) as an MBT-core module.

Modelator will then come with a prepackaged set of Jsonnet-based modules, that the end user will be able to extend at will.

Refactor to get rid of implicit path assumptions

Fixing #33 was relatively easy, but it turned out that implicit path assumptions are present in many places in the code, and in particular in tests, thus disabling those tests was necessary.

We need to make all possible assumptions clear and explicit, in particular refactor and make TLA+ and Cfg files independent from the path where the original files has been read from.

Add caching of model checking results

The bottleneck for MBT is running model checkers. This can be mitigated by adding a caching layer, which would cache the results of each model checker invocation.

When a model checker is invoked, compute and store hashes of the TLA+ model, configuration, and the test assertion.
When a model checker produces a trace, store it in the modelator directory under the key which is a combination of above hashes.
When a model checker is to be invoked on the same inputs (verified by hash match), use the cached version of the output instead of calling the model checker.
Provide to the user the possibility to clear the caches.

MBT-core: auto-instantiate model for given parameters

For the given values of parameters, such as target height of the Light Client, etc., the TLA+ model instantiation file can be completely auto-generated. I.e. there is no need to store static files like this one.

We need to implement this auto-generation logic in MBT-core.

	(false, true) => {
	// check if a failure has occurred
	if stdout.contains("EXITCODE: ERROR") {
	return Err(Error::ApalacheFailure(ApalacheError::new(&stdout)));
	}
	assert!(
	stdout.contains("EXITCODE: OK"),
	"[modelator] unexpected Apalache stdout"
	);

	Ok(ModelCheckerStdout::from_string(&stdout)?)
	}