verse-lab / sisyphus
Mostly Automated Proof Repair for Verified Libraries
Home Page: https://verse-lab.github.io/sisyphus/
License: GNU Affero General Public License v3.0
For example, in `find_mapi`, the higher-order function never writes to `a`:
let find_mapi (a: 'a array) (f: int -> 'a -> 'b option) =
let (len: int) = Array.length a in
if len = 0
then None
else
let (value_found: 'b option ref) = ref None in
let (_: bool) = while_upto 0 len (fun (i: int) ->
let (res: 'b option) = f i a.(i) in
let (found: bool) = Opt.option_is_some res in
if found then
value_found := res;
let (res: bool) = not found in
res
) in
!value_found
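For reference, `while_upto` and `Opt.option_is_some` are small library combinators; a plausible self-contained sketch of their behaviour (assumed semantics, not necessarily the library's exact definitions) is:

```ocaml
(* Sketch of the combinators used above; assumed semantics, not the
   library's actual code. [while_upto from upto f] applies [f] to each
   index in [from, upto), stopping early when [f] returns [false];
   it returns [true] iff it ran to completion. *)
let rec while_upto (from : int) (upto : int) (f : int -> bool) : bool =
  from >= upto || (f from && while_upto (from + 1) upto f)

(* [option_is_some o] is [true] iff [o] is [Some _]. *)
let option_is_some = function Some _ -> true | None -> false
```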
As such, there's no need for us to try to generate candidates for this part of the heap: we know it will have the same value as at the start of the iteration, and so it can be read off directly from our goal:
a ~> Array l
An outdated implementation of the principle behind this component lives in archive/verification/hardcoded.ml, but it was hardcoded and intended only to check that the principle worked.
We will need to implement a validator that:
Currently, for `find_mapi`, the valid candidate is not in the space of expressions searched by the expression generator, as it involves a `negb` and an `is_some`.
Adding these functions to general synthesis would unnecessarily expand the search space, but we do need them for these kinds of invariants.
To fix the expression generator, add boolean operations (just `negb` for now) and option operations (`is_some` and `is_none`) when the invariant arguments/heap context contains options.
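A minimal sketch of that gating logic (the type and function names here are illustrative, not the generator's actual API): only seed the option/boolean primitives when an option type occurs somewhere in the context.

```ocaml
(* Illustrative types standing in for the generator's context. *)
type ty = Int | Bool | Option of ty | List of ty | Func of ty list * ty

let rec mentions_option : ty -> bool = function
  | Option _ -> true
  | List t -> mentions_option t
  | Func (args, ret) -> List.exists mentions_option (ret :: args)
  | Int | Bool -> false

(* Extra primitives to seed the search with, gated on the context so the
   general search space does not grow unnecessarily. *)
let extra_funcs (ctx : ty list) : string list =
  if List.exists mentions_option ctx
  then ["negb"; "is_some"; "is_none"]
  else []
```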
How should we support functions that use invariants? Currently our higher order functions are all assumed to be pure.
The current implementation of the dynamic trace analysis is mostly a hack that I threw together for testing; it probably needs to be reworked to handle arbitrary OCaml binaries correctly.
Currently, we print out the program, instrumented with calls to print the state of the heap at intermediate points, and then call OCaml on the code:
let generate_trace str : state list =
  (* Pipe the instrumented source to "ocaml -stdin" and unmarshal the
     list of heap states that the instrumented program prints to stdout. *)
  let open Bos in
  OS.Cmd.run_io Cmd.(v "ocaml" % "-stdin") (OS.Cmd.in_string str)
  |> OS.Cmd.out_string |> Result.get_exn |> fst
  |> Fun.flip Marshal.from_string 0
This was good enough for testing whether the idea of trace-based program alignment would work at all, but it quickly falls flat once we start working with any program of non-trivial size, as such programs may not fit in a single file.
Relevant links:
https://discuss.ocaml.org/t/pass-ocaml-values-to-and-from-ocaml-code-evaluated-at-runtime
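For context, the pipeline relies on the instrumented program writing a `Marshal`-encoded `state list` to stdout; the round-trip can be sketched as follows (with a stand-in `state` type, since the real one lives elsewhere in the project):

```ocaml
(* Stand-in [state] type: a heap snapshot as (variable, value) pairs.
   The project's real type is richer; this is purely illustrative. *)
type state = (string * int) list

(* What the instrumented program does at each observation point:
   serialize the accumulated states. *)
let emit (trace : state list) : string = Marshal.to_string trace []

(* What [generate_trace] does with the captured stdout. *)
let read_back (s : string) : state list = Marshal.from_string s 0
```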
Problem: CFML uses its own vendored version of OCaml's parser; however, this vendored parser does not support annotations, and CFML crashes on such inputs.
I have emailed Arthur Chargueraud about this, and he said they will release a new version that fixes it sometime in the future, but it requires some changes to land on the OCaml main branch first, so it will probably take a while.
In the meantime, we should modify the build scripts to manually strip annotations before running CFML on the files.
Note to self: don't use sed, because it behaves differently on macOS, so the scripts will break for Mac users (:-1:).
Idea by @mkeoliya: in the table of results, if we still have admits remaining in the proof (i.e. we don't get around to automating this part), we can also report how many of the admits can be automatically dispatched using sauto, maybe with appropriate lemmas. This would help motivate that the remaining goals are "trivial", while the difficult parts are handled by our tool.
Depends on #4.
The proof module should provide:
We still need to work out the exact interface for this one (just like #6), but the idea is that this component, when initialised with an appropriate set of primitive operations/functions/constants, will generate candidate expressions to fill the intermediate holes of the proof script.
Maybe an off-the-shelf solver can be used for this, as suggested by Abhik.
Example of a function that requires a non-list invariant (i.e. a non-trivial structure in the precondition).
type 'a tree =
| Leaf
| Node of 'a * 'a tree * 'a tree
let rec reduce f base t =
match t with
| Leaf -> base
| Node (vl, t1, t2) ->
let t1 = reduce f base t1 in
let t2 = reduce f base t2 in
f t1 vl t2
let hd_or ls vl =
match ls with
| [] -> vl
| hd :: _ -> hd
let tl_safe ls =
match ls with
| [] -> []
| _ :: tl -> tl
let count_and_copy t =
let s = ref [] in
let f = (fun t1 vl t2 ->
let count = 1 + t1 + t2 in
let t2 = hd_or !s Leaf in
s := tl_safe !s;
let t1 = hd_or !s Leaf in
s := tl_safe !s;
s := (Node (vl, t1, t2)) :: !s;
count
) in
let count = reduce f 0 t in
(count, List.hd !s)
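The point of `count_and_copy` is that its callback is impure: it threads a stack of partially rebuilt trees through a mutable ref. For contrast, a pure use of `reduce` (a small standalone sketch, repeating the definitions so it runs on its own) only needs the fold itself:

```ocaml
type 'a tree =
  | Leaf
  | Node of 'a * 'a tree * 'a tree

let rec reduce f base t =
  match t with
  | Leaf -> base
  | Node (vl, t1, t2) ->
    let t1 = reduce f base t1 in
    let t2 = reduce f base t2 in
    f t1 vl t2

(* Counting the nodes needs no mutable state at all. *)
let count t = reduce (fun l _ r -> 1 + l + r) 0 t
```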
This idea was suggested by @mkeoliya: one possible way we could use proof reduction in the future is for specification inference as well. Given an OCaml program, since the OCaml type system provides some constraints, we can infer the footprint of the program quite easily. Then we can look for any higher-order functions in the body of the program, look up their lemmas in Coq, and use proof reduction to quickly test out candidates.
Currently the mechanism for handling old proofs is very ad-hoc - it involves parsing the old proof script using Coq, and then extracting the raw ast of any expressions in the proof script into our internal logical expression encoding.
This is problematic as it relies on the user adhering to particular proof styles, and is also very fragile - if the user defines additional notations, etc. we may misinterpret the resulting ast and waste time on ill-typed invariants.
A better approach would be to operate on the old proof term directly, analyze it and extract expressions from the proof term itself. This way, we are agnostic to the tactics and proof style and notations, and instead can observe exactly the expressions used by the user.
I think restructuring the lemmas changed our query to Z3. @mkeoliya, can you debug this? (Not high priority, but it would be good to get it done, maybe next week.)
I've tried to make the build process as automated as possible, but there are some subtle dependency invariants in there that can cause building the project to fail.
@mkeoliya, probably a good first issue for you to get familiar with the codebase.
Problem: The module lib/proof_parser is outdated.
Details:
Currently Sisyphus uses a custom Menhir parser to parse Coq proof scripts (see lib/proof_parser/raw_parser.mly).
While this works, it's not ideal: it's fragile, and we will need to manually update the parser whenever our proof scripts change.
Solution:
In a recent commit a71d0c, I added a simple wrapper lib/proof/proof.mli around Coq's internal API, which lets us use Coq itself for parsing.
An example of how to do this is shown in main.ml, but this needs to be integrated into the rest of the project.
- `logic` submodule's dependency on `proof_parser`: https://github.com/verse-lab/proof-repair/blob/master/lib/logic/dune
- `proof_sanitizer.ml`: https://github.com/verse-lab/proof-repair/blob/master/lib/logic/proof_sanitizer.ml
- `proof_parser` to use the Coq API instead.
- `proof_parser` to depend on `logic` and the Coq API wrapper.

When converting `1 - 2` into our internal representation, this is captured as the application of the function `-` to two arguments `1` and `2`, i.e. `App ("-", [`Int 1; `Int 2]); this is okay.
However, there are some small inconsistencies across the codebase. In particular, I think some parts of the pretty printer assume that binary operators are encoded with parentheses, i.e. as `App ("(-)", [`Int 1; `Int 2]), etc.
We should comb through all functions that operate over Lang.Expr.t and check that they make consistent assumptions about how binary operators are represented.
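As a stopgap while combing through, a small normalization pass could canonicalize the two encodings at a single entry point (the type below is a simplified stand-in for the project's actual `Lang.Expr.t`; names are illustrative):

```ocaml
(* Simplified stand-in for Lang.Expr.t; illustrative only. *)
type expr = [ `Int of int | `Var of string | `App of string * expr list ]

(* Strip the "(op)" parenthesised spelling down to the bare operator, so
   every consumer sees a single encoding of binary operators. *)
let canon_head (f : string) : string =
  let n = String.length f in
  if n >= 3 && f.[0] = '(' && f.[n - 1] = ')' then String.sub f 1 (n - 2)
  else f

let rec normalize : expr -> expr = function
  | `App (f, args) -> `App (canon_head f, List.map normalize args)
  | (`Int _ | `Var _) as e -> e
```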
CI takes 30 minutes, making it essentially useless. The problem is that we build Z3 and Coq each time, which takes forever. Can we make it faster?
If so, then we can re-enable running benchmarks as well.
Idea by @mkeoliya: we can use Coq's extraction mechanism to extract definitions used in lemmas to OCaml, but the question is how to call them from the evaluator proof_analysis/proof_term_evaluator.ml.
I wanted to build this project on my own machine (Fedora 37, 16GB RAM).
Following the readme, I ran the following commands in the sisyphus directory:
$ opam switch create . 4.12.0
$ opam repo add coq-released https://coq.inria.fr/opam/released --all --set-default
$ dune build
The output of `dune build` contained some slightly concerning warnings.
Warning: Coq Language Versions lower than 0.8 have been deprecated in Dune
3.8 and will be removed in an upcoming Dune version.
File "benchmarks/table/dune", line 6, characters 0-187:
6 | (rule
7 | (target foo.txt)
8 | (deps ./template.tex ../../bin/main.exe (glob_files_rec ../../resources/*.{ml,v}) (glob_files_rec ../../resources/_CoqProject))
9 | (action (run touch foo.txt))
10 | )
Error: No rule found for benchmarks/table/template.tex
[WARNING] var was deprecated in version 2.1 of the opam CLI. Use opam var instead or set OPAMCLI environment variable to 2.0.
...
*** Warning: in file Verify_find_mapi_new.v, library Verify_combinators is required from root Common and has not been found in the loadpath!
*** Warning: in file Verify_find_mapi_new.v, library Verify_opt is required from root Common and has not been found in the loadpath!
*** Warning: in file Verify_find_mapi_new.v, library Tactics is required from root Common and has not been found in the loadpath!
...
And its exit code (obtained with `echo $?`) was 1, i.e. an error, but I just ignored that (is that ok?).
Then I ran
$ dune runtest
which, after a while, spawned 8 processes (probably because I have 8 cores), each of which took several GB of RAM; since my Linux system is not very good at dealing with out-of-memory situations, it put all my UI processes into swap and completely froze my system. After chatting with @Gopiandcode at PLDI, we concluded that I should try to run only one test at a time, so today I ran
$ dune runtest -j1
However, even one single process took 9.9GB when I screenshotted it (and even more just before I killed it):
(The above screenshot shows my process manager and a tooltip with all the command line args of the executable)
The paper says that the experiments were run on a machine with 8GB of RAM, so it seems if I get the setup right on my 16GB machine, I should be able to reproduce it.
Do you have any idea what could be wrong in my setup?
My first idea was that the many `>=` constraints in sisyphus.opam could cause me to get newer versions than what you used a few months ago, but with some quick `--version` invocations, I couldn't spot any version differences that seem significant:
$ opam --version
2.1.3
$ dune --version
3.8.2
$ coqc --version
The Coq Proof Assistant, version 8.15.2
compiled with OCaml 4.12.0
$ which coqc
~/git/clones/sisyphus/_opam/bin/coqc
$ which dune
~/git/clones/sisyphus/_opam/bin/dune
$ ocamlc --version
4.12.0
For now, we don't need to consider partial applications of higher-order functions, just applications of a higher-order combinator to arguments that are themselves functions.
In this case, for `find_mapi`, the analysis can successfully find out that `fp` is a variable that should be used whenever we expect an argument of type `int -> A -> option B`, but for whatever reason (find out why), the type entry for `find_mapi` is empty. Fixing this should make the whole thing work:
12.10.22 01:35:53. gen INFO generation context is
{ Expr_generator.consts =
    { int  -> [2; 1; arg0; 0],
      bool -> [arg1],
      func(int -> A -> (B) option) -> [fp] };
  pats = {};
  funcs =
    { int    -> [("length", [A list]); ("-", [int; int]); ("+", [int; int])],
      A list -> [("take", [int; A list])],
      val    -> [("find_mapi", [])] } }
The name of the `proof_validator` sub-module is incorrect/outdated; it should really be the `coq` module.
We currently side-step the problem of type inference by requiring that every let binding must be explicitly annotated:
let (x: int) = length l in
...
This works, but is not ideal because it means we have to modify the programs a little.
Seeing as we're already using the compiler infrastructure in other parts of the project, we should be able to make the sanitization process more general by using the compiler to infer types for every binding, instead of using a syntactic restriction.
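A sketch of the idea at the level of a single binding (the record and function names are made up for illustration; a real implementation would query the typed tree from the compiler rather than take an `infer` callback):

```ocaml
(* A let-binding, possibly missing its type annotation. Illustrative only. *)
type binding = { name : string; ty : string option; body : string }

(* Fill in a missing annotation using a type oracle [infer]; in the real
   tool this would be the compiler's inference over the whole program. *)
let annotate (infer : string -> string) (b : binding) : binding =
  match b.ty with
  | Some _ -> b
  | None -> { b with ty = Some (infer b.body) }

(* Render the binding back in the explicitly annotated style Sisyphus
   currently requires. *)
let render { name; ty; body } =
  match ty with
  | Some t -> Printf.sprintf "let (%s: %s) = %s in" name t body
  | None -> Printf.sprintf "let %s = %s in" name body
```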