
ctrees's Introduction

Choice Trees


We develop a cousin of Interaction Trees, dubbed ctrees, with native support for internal non-determinism.

Meta

Building instructions

Installing dependencies

Installing the opam dependencies

opam install dune
opam install coq-ext-lib
opam install coq-itree
opam install coq-relation-algebra
opam install coq-coinduction
opam install coq-equations

Obtaining the project

git clone https://github.com/vellvm/ctrees
cd ctrees

Building the project

dune build

ctrees's People

Contributors

elefthei, grain, ionathanch, nchappe, yazko


Forkers

elefthei, aqissiaq

ctrees's Issues

Idea: Branching Temporal Logic -> ctrees

Here's an idea. Branching temporal logic (CTL) allows us to write specifications with modal operators like
"forever P", "eventually P", and "never P", and also to consider different worlds via the usual path quantifiers, i.e.:
"For all paths, P holds"
"There exists a path on which P holds"

I think ctrees would work as a denotational model of CTL. We could then have a CTL specification denote a ctree S, show that a program denotes a ctree P, and prove P <= S, where <= is the refinement relation.
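As a purely illustrative sketch (none of these names exist in the development), the CTL syntax in question could be deeply embedded as a Coq inductive, with a denotation into ctrees left to be defined:

```coq
(* Hypothetical sketch only: a deep embedding of CTL formulas over
   atomic propositions [P]. A denotation [denote : ctl P -> ctree E X]
   would then be the object of study. *)
Inductive ctl (P : Type) : Type :=
| Atom (p : P)                  (* atomic proposition *)
| CAnd (phi psi : ctl P)
| COr  (phi psi : ctl P)
| AG   (phi : ctl P)            (* on all paths, forever phi *)
| EG   (phi : ctl P)            (* on some path, forever phi *)
| AF   (phi : ctl P)            (* on all paths, eventually phi *)
| EF   (phi : ctl P).           (* on some path, eventually phi *)
```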

On a canonical representation for stuck

With the move to arbitrary branching indexes, stuck is now represented by a certain index B0.
However, we end up injecting B0 into the ambient domain of indexes on a regular basis, and equ is sensitive to the specific injection used when working parametrically. This makes some lemmas unprovable for superficial reasons.

Hardcoding stuckness into a constructor in the structure would resolve the problem.

Edit: I was a bit quick to despair: careful parametrization of the injection instances lets us sneak around the issue. It remains unfortunate, though.

Messed up reduction behavior of bind

When reducing an eta-expanded trigger e >>= k computation, we expose a [subst] rather than a [bindl], messing up subsequent rewrites --- minimal example below. This should not happen; we need to double-check what I got wrong compared to how Li-yao set up the itrees.

Section Foo.
  Variable E : Type -> Type.
  Variable X : Type.
  Variable e : E X.

  Definition foo : ctree E unit := trigger e;; Ret tt.

  Goal foo ≅ foo.
    rewrite ctree_eta at 1.
    cbn.
    (* [subst] shows up, and it shouldn't *)
  Abort.
End Foo.

Allowing for an heterogeneous relation on labels

The current notion of bisimulation we use requires the computations to share the same signature of effects and the same return type. This is reflected in the fact that the bisimulation game requires challenges to expose the same label, up to Coq's eq.

Relaxing this constraint on the return type is necessary to recover eutt's flexibility, but @nchappe already has a use case where he needs the additional flexibility of relating syntactically distinct events.

It should be slightly verbose but relatively straightforward to expand the current definition without breaking the current theory. Extending the theory should not be too bad, but a bit more work.
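The expanded definition could have roughly the following shape (illustrative only: [label] is a stand-in for the development's label type, here assumed to be indexed by the signature):

```coq
(* Illustrative only: the relaxed (bi)simulation would be parameterized
   by a heterogeneous relation on labels rather than comparing labels
   of a single shared type up to [eq]. *)
Section Heterogeneous.
  Variable label : (Type -> Type) -> Type.
  Variables E F : Type -> Type.

  (* Heterogeneous relations between two possibly distinct types. *)
  Definition hrel (A B : Type) : Type := A -> B -> Prop.

  (* Challenges on a [ctree E X] and a [ctree F Y] would then be
     compared by some such [L]. *)
  Variable L : hrel (label E) (label F).
End Heterogeneous.
```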

Trade off in representations

Setting aside more radically distinct presentations such as explored in the experiments folder, we need to pick a path along the following trade-offs, some of which depend on one another.

  1. Hardcoding some branches. There are essentially three candidates: Stuck, Guard, and Step. The alternative is to guard constructions that use them with a typeclass constraint of the shape B0 -< B, for instance. The trade-off here is between bloating case analyses on the structure of the tree and carrying around almost universally needed class constraints.
  2. Reflecting in the type signature the domain of delayed branches independently from that of stepping branches. Compared to the current heterogeneous branch, this would add a fourth parameter to the datatype. On the obvious cons side, this bloats type signatures everywhere with parameters. On the pro side, it gives us a lot of flexibility to enforce static invariants: in particular, it lets us easily specify the case where no delayed non-determinism is allowed, a situation somewhat reminiscent of guarded CCS, and notably used at the moment extensionally in the yield development.
  3. Leveraging the result about the so-called guarded encoding of stepping branches, which establishes that stepping branches can always be encoded as guarded branches whose continuations are guarded by Steps. This allows us to remove BrS from the datatype, assuming that we introduce Step explicitly. Naturally, this becomes incompatible with expressing statically that a computation is guarded, as suggested in 2.
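For point 1, the class-constraint alternative could look roughly as follows (a hedged sketch: the class and notation names are hypothetical, standing in for whatever the development would actually use):

```coq
(* Sketch of the alternative in point 1: an inclusion class between
   branch domains, so that constructions using stuckness carry a
   [B0 -< B] constraint instead of the structure providing a
   dedicated constructor. All names here are illustrative. *)
Class Incl (B C : Type -> Type) : Type :=
  inj : forall X, B X -> C X.
Notation "B -< C" := (Incl B C) (at level 90).

(* E.g. a stuck computation would then be typed along the lines of:
   Definition stuck {E B X} `{B0 -< B} : ctree E B X := ... *)
```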

Use the fin from ExtLib

This is another annoying compatibility thing -- the Fin.t from ExtLib has many useful lemmas like

  Theorem fin_case : forall n (f : fin (S n)),
      f = F1 \/ exists f', f = FS f'.

But if we decide to use it we would have to change fin everywhere.

Proper instance for heterogeneous ssim

A property stating that if L == L' then ssim L == ssim L' would be useful (L and L' are relations on labels).

In 5537188 I introduced an admitted instance gfp_weq, I don't think this is provable without delving into the definitions of gfp/t from coq-coinduction.

Not sure but the theorem may also be true with <= instead of ==.

upto_bind_eq_ssim is broken

This has been broken since the generalization of ssim up-to-bind (f755434).

This is a non-trivial problem: an ssim (update_val_rel eq eq) appears, which is not equivalent to plain ssim eq because of the annoying type parameter in the val case.

The coffee machine example (more specifically the coffee_ssim proof) is a good starting point to take a closer look at the problem.

Posing definitions rather than notations for the main relations

Following the [coinduction] library style, most greatest fixed points are mere notations rather than definitions. This creates a lot of pain points, notably when their implicit parameters need to be provided explicitly.

We want to move to definitions at some point to clean this up, but it might be a little delicate: the unification problems that result from applying the very generic lemmas from the coinduction library are quite brittle.

Universe inconsistency with ITree.Events.StateFacts

We tried to solve this with @euisuny and @YaZko but I had to leave. Making a note here that

From ITree Require Import
     Events.StateFacts.

From CTree Require Import
     Bisim.

Gives a universe inconsistency error

Error: Universe inconsistency. Cannot enforce Coq.Relations.Relation_Definitions.1 <= RelationAlgebra.monoid.1 because
RelationAlgebra.monoid.1 <= Coq.Lists.List.1 <= MonadFix.MonadFix.u0 < Coq.Relations.Relation_Definitions.1.

Meta-theory for refinements/simulations

The current bisimulations are already cleanly defined in terms of a symmetrized simulation endofunction, but we do not currently define the gfp of the latter.

TODO: define and develop the meta-theory for simulations.

Performance issue

The [sbisimalt.v] file in particular is extremely slow to typecheck.

TODO: resolve this.

Non-finite internal branching

We currently restrict internal branching to finite branching, indexed by Fin types.
This is a clear restriction, and an immediate blocking point in experimental applications investigated by @nchappe.

Two possibilities:

  • Allow for indexation by either a Fin type, or by nat
  • Allow for indexation by an arbitrary type. In this case, the indexes used by a given computation must become a parameter of the datatype, similar to E for external interactions.

Exposing the indexes statically, as in the second solution, is mandatory if any notion akin to interpretation of internal non-determinism is to be defined in the future (and it shall be!).
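The shape of the second option can be sketched as follows. This is illustrative only: the actual development defines the type via a functor and a greatest fixed point, so this direct coinductive is a simplification.

```coq
(* Illustrative shape of the second option: the branching domain [B]
   becomes a datatype parameter, mirroring [E] for external events. *)
CoInductive ctree (E B : Type -> Type) (R : Type) : Type :=
| Ret (r : R)                                      (* pure value *)
| Vis (X : Type) (e : E X) (k : X -> ctree E B R)  (* external event *)
| Br  (X : Type) (b : B X) (k : X -> ctree E B R). (* internal branch *)
```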

Swiss army knife for inversion

Inversion of equivalent computations is painful: we currently have a whole bunch of lemmas and keep doing things in an ad-hoc way. We need a tactic settling it once and for all (at least for [equ], hopefully for [sbisim] too).

Need to cover:

  • Cases where both computations exhibit a head constructor
    -- Unifies the arguments if suitable
    -- Derives a contradiction otherwise
  • Cases where only one side exhibits a head constructor
    -- Derives the head constructor of the other computation
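To illustrate the intended dispatch, here is a self-contained toy that does not use the development's types: [tree] and [eqt] stand in for [ctree] and [equ], and the inversion lemmas are the kind the tactic should apply automatically.

```coq
(* Toy stand-ins for [ctree] and [equ]. *)
Inductive tree := Ret (n : nat) | Tau (t : tree).

Inductive eqt : tree -> tree -> Prop :=
| EqRet n : eqt (Ret n) (Ret n)
| EqTau t u : eqt t u -> eqt (Tau t) (Tau u).

(* Both sides exhibit matching head constructors: unify arguments. *)
Lemma eqt_ret_inv n m : eqt (Ret n) (Ret m) -> n = m.
Proof. intros H; inversion H; reflexivity. Qed.

(* Mismatched head constructors: derive a contradiction. *)
Lemma eqt_ret_tau_abs n t : eqt (Ret n) (Tau t) -> False.
Proof. intros H; inversion H. Qed.

(* The tactic dispatches on the exposed head constructors. *)
Ltac inv_eqt H :=
  match type of H with
  | eqt (Ret _) (Ret _) => apply eqt_ret_inv in H
  | eqt (Ret _) (Tau _) => exfalso; exact (eqt_ret_tau_abs _ _ H)
  end.

Goal forall n m, eqt (Ret n) (Ret m) -> n = m.
Proof. intros n m H; inv_eqt H; exact H. Qed.
```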

Limiting the use of [JMeq_eq]

We currently use dependent induction and destruction without a second thought, making [JMeq_eq] and [eq_rect_eq] pervasive throughout the development.

In the Interaction Trees development, @Lysxia went to great lengths (DeepSpec/InteractionTrees#194) to identify precisely the lemmas requiring those axioms, showing in the process that a great deal can be achieved without them.

We may want to undergo a similar introspection at some point.

Recovering the companion's "symmetric" tactic in the heterogeneous branch

The coq-coinduction library provides a quite convenient tactic to leverage symmetry arguments in bisimulation proofs, avoiding repeating essentially identical proofs in each direction.
The tactic is however quite strongly tied to the syntactic definition of a bisimulation as the intersection of a simulation and its reverse, which we lost when we generalized things to the heterogeneous case.
Can we recover it nicely in the cases where the proof method holds nonetheless?

[CTL] Coinduction lemma

The Coinduction library allows for parametrized coinduction proofs on the shallow definition of ag, but we have no way to do the same kind of proof in the deep definition.

We need to unfold entailsF just to do the parametrized coinduction underneath. This exposes low-level details to the user of the deep embedding; we shouldn't do that, and relatedly (#29) entailsF should be opaque.
One attempt at defining AG_coind' requires explicitly providing the coinductive relation, which is extremely tedious and cancels all the benefits of parametrized coinduction.

What would it take to do parametrized coinduction proofs without unfolding entailsF and without defining the relation R manually?

On specifying the relation `L` fed to (bi)simulations

All relations are now parameterized by a relation L on labels, thus encompassing at once the itree-style relational postcondition passed to eutt, as well as the rutt generalization allowing distinct events to be mapped to one another.

Currently, the label type reflects, through a type parameter, the signature of the computations it applies to, but not their return type. It was designed this way to avoid awkward type transports when considering transitions in bind constructions, but it has the drawback of making the relational specification L awkward: it is not specifically aware of the return type.

Additionally, having everything baked into a single relation is sometimes awkward: while the relation on val labels certainly must change when running a cut rule, the part on obs and tau must stay invariant.

TODO: revisit these trade-offs. Expose the return type in the type of labels? Explicitly break relations into a val relation and an event relation?
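The second question could be sketched as follows (illustrative only: all names are hypothetical, and the tau part is omitted since it must stay invariant as noted above):

```coq
(* Illustrative only: breaking the single label relation into an event
   part and a value part. Taus would always be related to each other. *)
Section SplitRel.
  Variables E F : Type -> Type.
  Variables X Y : Type.

  Record label_rel : Type := {
    rel_event : forall A B, E A -> F B -> Prop;  (* on [obs] labels *)
    rel_val   : X -> Y -> Prop                   (* on [val] labels *)
  }.
End SplitRel.
```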
