paradoxika / skeptik
A library for Proof Theory (especially Proof Compression) in Scala.
The timeout method in line 6 of https://github.com/Paradoxika/Skeptik/blob/develop/src/main/scala/at/logic/skeptik/util/time.scala does not do what it is intended to do.
"awaitAll" doesn't kill the future when its execution exceeds the timeout; it leaves the future running.
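A sketch of one possible fix, assuming the intent is to actually abandon the computation when the timeout elapses: run the body on a dedicated executor and cancel (interrupt) it on timeout. The names below are hypothetical, not Skeptik's actual API, and cancellation only works if the body is interruptible.

```scala
import java.util.concurrent.{Callable, Executors, TimeUnit, TimeoutException}

// Hypothetical sketch: unlike "awaitAll", this cancels the worker on timeout
// instead of leaving it running (cancellation relies on thread interruption).
def timeoutKill[A](millis: Long)(body: => A): Option[A] = {
  val executor = Executors.newSingleThreadExecutor()
  val task = executor.submit(new Callable[A] { def call(): A = body })
  try Some(task.get(millis, TimeUnit.MILLISECONDS))
  catch {
    case _: TimeoutException =>
      task.cancel(true) // interrupt the worker thread
      None
  }
  finally executor.shutdownNow()
}
```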
To Joseph:
I have read the paper's explanation of the example of compression with LUniv, and I noticed that it depends on a particular top-down traversal order, in which the unit {a} is visited before the subproof of {-a,b}. Otherwise, no compression would have happened.
I know that, in general, LUniv will always be dependent on the traversal order, because whether a subproof is univalent depends on the subproofs that have already been collected. But in the particular case of units, it seems that we could avoid this order dependence if we first selected all the units in an extra traversal. Have you considered this?
Another thing: do you think it would make sense to investigate heuristics that try to traverse the proof in the best possible top-down order? In the case of LUniv, it might be beneficial to visit subproofs with smaller conclusions first, for example... Probably the gain would not be so significant, right? What do you think?
There are some results about the complexity of checking RUP proofs. It might (or might not) be scientifically challenging to see whether these complexity results still hold for a yet-to-be-developed RUPModuloTheories proof system.
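For background, here is a minimal sketch of the plain RUP check itself (illustrative, not Skeptik code): a clause C has the RUP property w.r.t. a clause set F iff unit propagation on F together with the negation of C derives a conflict. Literals are nonzero Ints; negation is integer negation.

```scala
// Naive quadratic unit propagation; real checkers use watched literals.
def isRUP(f: List[List[Int]], c: List[Int]): Boolean = {
  val clauses = f ++ c.map(l => List(-l)) // assert the negation of C
  var assigned = Set.empty[Int]
  var changed = true
  while (changed) {
    changed = false
    for (clause <- clauses) {
      // drop falsified literals; a clause with an assigned literal is satisfied
      val remaining = clause.filterNot(l => assigned.contains(-l))
      if (!remaining.exists(assigned.contains)) remaining match {
        case Nil => return true // conflict: C is a RUP consequence of F
        case List(unit) if !assigned.contains(unit) =>
          assigned += unit; changed = true
        case _ => ()
      }
    }
  }
  false
}
```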
We currently have only a parser for proofs output by the SMT-Solver VeriT: https://github.com/Paradoxika/Skeptik/blob/develop/src/main/scala/at/logic/skeptik/parser/SMT2Parser.scala
We need a new parser for proofs output by the SMT-Solver CVC4. It should be easy to implement it by using the existing VeriT parser as an example.
The reduction rules of the Reduce&Reconstruct algorithm seem suitable to be performed during the fixing phase of other algorithms. How much compression could we gain this way? What would the time cost be? Would it be worth it?
Note that using Scala's parallel collections causes an overhead. So it is not always a good idea to replace a sequential for-loop with a parallel one.
There is a problem with the special symbols used to express formulae (e.g. the symbol for the existential quantifier). These symbols are not always translated correctly.
The symbols are defined in the RichFormula.scala file.
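One common workaround (illustrative only; RichFormula.scala may define these differently) is to write the symbols as explicit Unicode escapes, so they survive any mismatch between the editor's and the compiler's source-file encoding:

```scala
// Hypothetical example: Unicode escapes avoid depending on the source
// encoding for the literal characters.
object LogicSymbols {
  val existsS = "\u2203" // ∃ existential quantifier
  val forallS = "\u2200" // ∀ universal quantifier
  val andS    = "\u2227" // ∧
  val orS     = "\u2228" // ∨
  val notS    = "\u00AC" // ¬
}
```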
We currently have only a parser for proofs output by the SMT-Solver VeriT: https://github.com/Paradoxika/Skeptik/blob/develop/src/main/scala/at/logic/skeptik/parser/SMT2Parser.scala
We need a new parser for proofs output by the SMT-Solver OpenSMT. It should be easy to implement it by using the existing VeriT parser as an example.
Right now, all theory rules (e.g. eq_transitive, and_pos, and, or...) are being parsed as Axiom. This means that we lose information from the original proof.
As a first improvement, we could instead parse theory inferences as instances of a generic blank inference rule with a variable name. The name of the generic blank inference rule instance would be set to "eq_transitive", "and_pos", "and", depending on the theory inference being parsed...
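A minimal sketch of that first improvement, with hypothetical class names (Skeptik's actual Sequent/ProofNode types differ): a generic theory-inference node that keeps the rule name from the proof trace instead of collapsing everything into Axiom.

```scala
case class Sequent(ant: List[String], suc: List[String])

sealed trait ProofNode { def conclusion: Sequent }
case class Axiom(conclusion: Sequent) extends ProofNode
// Generic "blank" rule: the name recorded is whatever the trace said,
// e.g. "eq_transitive", "and_pos", "and", "or", ...
case class TheoryInference(ruleName: String,
                           premises: List[ProofNode],
                           conclusion: Sequent) extends ProofNode

def parseRule(name: String, premises: List[ProofNode], conclusion: Sequent): ProofNode =
  name match {
    case "input" => Axiom(conclusion) // hypothetical name for leaf clauses
    case theory  => TheoryInference(theory, premises, conclusion)
  }
```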
I'll take an example: RPI[3]LU. Let n_i be an irregular node with premises n_l and n_r. RPI marks n_l for deletion:
edgesToDelete[n_i] = LeftDS
But if n_r is a unit, every edge to n_r is deleted:
edgesToDelete[n_i] = RightDS
During fixing, only the edge from n_i to n_r will be deleted. The irregularity of n_i will be kept.
This corresponds closely to the buggy behaviour I've observed in RPI[3]LU. That's why I really think this bug is what makes the difference between RPI.LU and RPI[3]LU.
The solution I propose is to encapsulate edgesToDelete in a dedicated class.
Alternatively, we could set:
edgesToDelete[n_i] = BothDS
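A sketch of the proposed encapsulation (hypothetical names): instead of a raw map in which a later RightDS silently overwrites an earlier LeftDS, the class merges marks, so marking both premises of n_i yields BothDS.

```scala
sealed trait DeletedSide
case object NoDS    extends DeletedSide
case object LeftDS  extends DeletedSide
case object RightDS extends DeletedSide
case object BothDS  extends DeletedSide

class EdgesToDelete[N] {
  private val marks = scala.collection.mutable.Map.empty[N, DeletedSide]
  def markLeftPremise(node: N): Unit  = mark(node, LeftDS)
  def markRightPremise(node: N): Unit = mark(node, RightDS)
  private def mark(node: N, side: DeletedSide): Unit =
    marks(node) = (marks.getOrElse(node, NoDS), side) match {
      case (NoDS, s)        => s
      case (s, t) if s == t => s
      case _                => BothDS // both sides marked: delete both edges
    }
  def apply(node: N): DeletedSide = marks.getOrElse(node, NoDS)
}
```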
The paper that describes the RecyclePivots algorithm contains another algorithm called RecycleUnits, which is a special kind of subsumption, restricted to cases where the subsuming clause is a unit clause. The experimental results of RecycleUnits are not so impressive, and that's why I have never been eager to reimplement or improve it. But there are a few still unanswered questions that I asked myself a long time ago:
- Can we generalize RecycleUnits in order to obtain a less restricted but still relatively efficient form of subsumption?
- It seems that RecycleUnits and LowerUnits could easily be combined in a non-sequential way. How good would this be?
- (A more recent and more speculative question:) could this not-yet-developed generalization of RecycleUnits be somehow related to LowerUnivalents, in the same way that RecycleUnits is related to LowerUnits?
A more careful analysis of whether RedRec (especially more careful applications of A2) can expose new irregularities might be interesting.
The authors of RedRec considered a combined approach that is basically an iterated sequential composition of RP and RedRec. They hoped that, after each iteration of RedRec, new irregularities would be exposed (mainly because of rule A2, I suppose) and that this would allow the next iteration of RP to compress the proof even further. This is an interesting and plausible hope, but a careful look at table 1b suggests it is not actually realized. If it were, the compression gain for CR as the ratio increases from 0.01 to 1 should be bigger than the corresponding compression gain for RR. But it is actually smaller! In my opinion, this indicates that a simpler sequential combination (namely, executing RP only once and then RedRec) would achieve essentially the same compression as their CR, but would be more efficient, since it wouldn't needlessly execute RP so many times.
See discussion here: Jogo27@31be3b0#commitcomment-1644086
For some algorithms (e.g. "DAGify"), it might be possible and not too hard to directly construct a Proof during the foldDown or bottomUp traversal, instead of calling the Proof constructor when the traversal is finished. The Proof constructor traverses the proof once again, and this extra traversal might be saved.
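As a toy illustration of building the shared structure during the traversal itself (hypothetical node types, not Skeptik's): identical conclusions are shared through a cache that is filled while visiting, so no second pass over the result is needed.

```scala
sealed trait Node { def conclusion: String }
case class Leaf(conclusion: String) extends Node
case class Inner(conclusion: String, premises: List[Node]) extends Node

// Single traversal: the cache *is* the result being constructed, so no
// extra pass (like a final Proof-constructor traversal) is required.
def dagify(root: Node): Node = {
  val cache = scala.collection.mutable.Map.empty[String, Node]
  def visit(n: Node): Node = cache.getOrElseUpdate(n.conclusion, n match {
    case Leaf(c)      => Leaf(c)
    case Inner(c, ps) => Inner(c, ps.map(visit))
  })
  visit(root)
}
```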
The remaining warnings are part of help.scala (which is only crap code).
Is this code still being used? Crap code shouldn't be used. And used code should be improved until it becomes not crap... :-)
This file contains some functions I wrote to debug proof compression. Most of them find differences between related proofs and display them using graphviz. On one hand, I might need them again; on the other hand, I don't have time to improve them. In any case, this code is not used by any other Skeptik code.
The command-line interface is currently only partially implemented. I propose something like this:
-compress <algorithms> <input_file> <output_file>
where <algorithms> is a string representing a sequential combination of compression algorithms (e.g. "RPI.LU.RedRec(30).Split(30)" would execute Split with a 30s timeout, then RedRec, then LU, then RPI).
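Parsing such a combination string could be sketched like this (hypothetical helper, not the actual CLI code): split on '.', read an optional "(seconds)" timeout per algorithm; the resulting list would then be applied right to left, as in the example above.

```scala
case class AlgoSpec(name: String, timeoutSeconds: Option[Int])

// "RPI.LU.RedRec(30)" => RPI, LU, RedRec with a 30s timeout
def parseChain(s: String): List[AlgoSpec] =
  s.split("\\.").toList.map { part =>
    part.split("[()]") match {
      case Array(name, t) => AlgoSpec(name, Some(t.toInt))
      case Array(name)    => AlgoSpec(name, None)
    }
  }
```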
Ideas to try:
Reduce&Reconstruct increases the number of literals for the eq_diamond proofs, while everything else remains the same size. The following table was generated with the eq_diamond5.smt2 proof.
Is this normal? Maybe it is a normal although undesirable side-effect of R&R applying rule A2 uselessly? Or is it a bug?
Currently, a complex atom such as "(= y (f x z))" is parsed as a single propositional variable (Var) having the whole string "(= y (f x z))" as its name. It should instead be parsed as a complex tree of App's and Var's.
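A minimal sketch of the proposed fix (the tokenizer and types here are illustrative, not Skeptik's actual parser): read the s-expression and curry applications, so "(= y (f x z))" becomes a tree of App's with Var's at the leaves.

```scala
sealed trait E
case class Var(name: String) extends E
case class App(fun: E, arg: E) extends E

def parseAtom(s: String): E = {
  val tokens = s.replace("(", " ( ").replace(")", " ) ").trim.split("\\s+").toList
  // Parse one expression, returning it together with the unconsumed tokens.
  def parse(ts: List[String]): (E, List[String]) = ts match {
    case "(" :: rest =>
      val (head, afterHead) = parse(rest)
      // Curry the arguments onto the head until the closing parenthesis.
      def args(e: E, ts: List[String]): (E, List[String]) = ts match {
        case ")" :: r => (e, r)
        case _        => val (a, r) = parse(ts); args(App(e, a), r)
      }
      args(head, afterHead)
    case tok :: rest => (Var(tok), rest)
    case Nil         => sys.error("unexpected end of input")
  }
  parse(tokens)._1
}
```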
Proposed by Pascal:
He (Daniel) was suggesting we study together something about proofs for SAT, for PxTP. I proposed to look at proofs for pseudo-boolean constraints (linear constraints of the form a1 L1 + a2 L2 + ... + an LN >= k, where a1 ... an and k are natural numbers, and L1 ... LN are propositional literals taking value 0 or 1 depending on their truth value).
This is another direction for PxTP. It may also be motivating because it seems the pigeonhole problem is efficiently solved with pseudo-boolean constraints. What do you think?
(e.g. "update-pom.sh" and "prepareForVSC.sh")
See:
Joseph observed that a really large number of Sequent instances are being created, consuming lots of memory.
I suspect this is due to a bug in the computation of the ancestry relation.
Right now, all ProofNodeCollections have a children relation. But it would also make sense to have a more basic type of ProofNodeCollection without a children relation.
ProofNodeCollection should do a few more things (e.g. implement canBuildFrom) in order to be usable like any other usual Scala collection.
There should be specialized subtypes of ProofNodeCollection that do parallel foldDown, for example.
ProofNodeCollection could provide methods to get its roots or leaves, to add and remove nodes, to compose itself with other ProofNodeCollections, ...
Joseph wrote:
While working on the slides for the PCC talk, I thought about a new idea to improve LUniv. I think it may remove the dependency on ordering.
What is the idea behind LU and LUniv? It is that a node with many incoming edges labeled with the same literal may be delayed until all of its resolvents are resolved. Lowering all the way down to the bottom is overkill and limits the nodes we can lower.
The new idea is therefore to perform a first bottom-up traversal. This traversal must make it possible, given a set of nodes, to know which is the highest node on all paths from the root to any node of the set. I hope this is clear enough for you to understand me. Your comments are welcome.
From @Jogo27:
Nevertheless, I think we should think about a batch processing program rather than an interactive CLI. What I mean is that compression should be run with a command looking like:
skeptik --compress --batch algorithm-description --proofs proofs-list
where both "algorithm-description" and "proofs-list" are files. The former describes the algorithms to run, along with the suffix for the compressed proofs' filenames. The latter is just a list of proof filenames.
There are (at least) two possible ways of checking that a resolution refutation (with omitted intermediary resolvents and explicit pivots) is correct:
Could the second way be more efficient than the first way?
Was this already investigated in the literature? Here are some pointers:
Here are two ideas for improving LU and LUniv suggested by Joseph in this thread: https://groups.google.com/forum/?fromgroups#!topic/skeptik-dev/knoIZHwfKyY%5B1-25%5D
Idea 1:
Joseph - For LUniv and LUnivRPI, the number of literals whose dual isn't in the clause of lowered pivots is computed. It has to be one. But if it's zero, we can stop collecting/fixing anything and just reintroduce the lowered nodes right away, with this "zero-dual" node as the "root" node.
Bruno - This idea seems interesting. Do you have any idea whether such cases occur often in practice?
Joseph - No idea.
Bruno - Might there be proofs where this idea would perform badly, though?
Joseph - It depends on how the idea is implemented. For instance, an exception might be thrown when a "zero-dual" node is encountered. In this case the algorithm might be much faster. But a small proof of false below the found node might exist, and then the compression ratio will be worse than in the original algorithm. On the other hand, a more careful (and slower) implementation might exist where such drawbacks are avoided.
Idea 2:
Joseph - Same for LU: if the "dual" of an already collected unit is collected, we can stop collecting and resolve these units together. But in this case, we have to take care of the unit order (one unit might be an indirect premise of the other).
Bruno - Also interesting, but in this case I can easily imagine a proof where it would not be a good idea to do this:
Joseph - I've imagined two possible implementations for this idea. The first one is to traverse the proof top-down when collecting units. The other one is to check only when reintroducing nodes. I believe this second solution to be much more interesting, because we might test for duplication, for tautology, and even, in 3pass, for a proof of false.
The usual split works like this:
We could postpone the reconstruction of the refutation and call split recursively on the proofs of b and (not b) instead. I believe this could not only provide better compression and be more efficient, but also make it easier to find a better heuristic for Split. The reason is:
Split essentially lowers all resolution steps with pivot b to the bottom of the proof. So the heuristic should choose the best possible literal b to be the last pivot in the proof. But when we iterate Split, we move b up! So, in the usual iterated version of Split, the heuristic and iteration work against each other. In the recursive version proposed here, this does not happen.
RP resets safeLiterals to the empty set when a node N has more than one child.
RPI takes the intersection of the safeLiterals coming from all its children.
The intermediary algorithm should reset safeLiterals of N to be equal to N's conclusion sequent.
Note that RPI's safeLiterals set is always a superset of N's conclusion sequent. Therefore, this intermediary algorithm would achieve a compression ratio smaller than RPI's and bigger than RP's, but it might be faster than RPI.
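The three update rules for a node N with several children can be sketched as follows (simplified: literals as strings, the children's safe-literal sets given directly; the helper names are hypothetical):

```scala
type Literal = String

// RP: give up on sharing and reset to the empty set.
def safeLiteralsRP(fromChildren: List[Set[Literal]]): Set[Literal] =
  Set.empty

// RPI: keep exactly the literals that are safe for every child.
def safeLiteralsRPI(fromChildren: List[Set[Literal]]): Set[Literal] =
  fromChildren.reduce(_ intersect _)

// Intermediary: reset to N's own conclusion, which is always contained
// in RPI's intersection, so its compression lies between RP's and RPI's.
def safeLiteralsIntermediary(conclusionOfN: Set[Literal]): Set[Literal] =
  conclusionOfN
```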
Checking whether Phi_L' or Phi_R' are in D (i.e. marked for deletion) can be quite expensive if D is large, since checking set membership in Scala is O(log n)... This means that we could optimize delete by using a special proof T of
Scala Play's cache library, for example, can work with memcache servers, which will be useful if we need to scale Skeptik with respect to memory usage.
Another possibility is GridGain's in-memory data grid: http://www.gridgain.com/features/in-memory-data-grid/
We currently have only a parser for proofs output by the SMT-Solver VeriT: https://github.com/Paradoxika/Skeptik/blob/develop/src/main/scala/at/logic/skeptik/parser/SMT2Parser.scala
We need a new parser for proofs output by the SMT-Solver MathSat. It should be easy to implement it by using the existing VeriT parser as an example.
[info] Compiling 39 Scala sources to /Users/Bruno/Dropbox/Code/Skeptik/target/scala-2.10.0-M5/classes...
[warn] /Users/Bruno/Dropbox/Code/Skeptik/src/main/scala/at/logic/skeptik/algorithm/compressor/combinedRPILU/RegularizationEvaluation.scala:155: This lexical syntax is deprecated. From scala 2.11, a dot will only be considered part of a number if it is immediately followed by a digit.
[warn] ma.keys.foldLeft(mb) { (acc,k) => acc + (k -> (ma(k) + mb.getOrElse(k,0..toFloat))) }
[warn] ^
[warn] /Users/Bruno/Dropbox/Code/Skeptik/src/main/scala/at/logic/skeptik/algorithm/compressor/combinedRPILU/RegularizationEvaluation.scala:161: This lexical syntax is deprecated. From scala 2.11, a dot will only be considered part of a number if it is immediately followed by a digit.
[warn] if (fakeSize(proof.childrenOf(node)) == 1) information.estimatedSize + 1..toFloat else 1..toFloat
[warn] ^
[warn] /Users/Bruno/Dropbox/Code/Skeptik/src/main/scala/at/logic/skeptik/algorithm/compressor/combinedRPILU/RegularizationEvaluation.scala:161: This lexical syntax is deprecated. From scala 2.11, a dot will only be considered part of a number if it is immediately followed by a digit.
[warn] if (fakeSize(proof.childrenOf(node)) == 1) information.estimatedSize + 1..toFloat else 1..toFloat
[warn] ^
[warn] /Users/Bruno/Dropbox/Code/Skeptik/src/main/scala/at/logic/skeptik/algorithm/compressor/combinedRPILU/RegularizationEvaluation.scala:163: This lexical syntax is deprecated. From scala 2.11, a dot will only be considered part of a number if it is immediately followed by a digit.
[warn] evalRegularization(left, leftInfo) + evalRegularization(right,rightInfo) - 1..toFloat,
[warn] ^
[warn] /Users/Bruno/Dropbox/Code/Skeptik/src/main/scala/at/logic/skeptik/algorithm/compressor/combinedRPILU/RegularizationEvaluation.scala:182: This lexical syntax is deprecated. From scala 2.11, a dot will only be considered part of a number if it is immediately followed by a digit.
[warn] leftInfo.estimatedSize + rightInfo.estimatedSize - 1..toFloat,
[warn] ^
[warn] /Users/Bruno/Dropbox/Code/Skeptik/src/main/scala/at/logic/skeptik/algorithm/compressor/combinedRPILU/RegularizationEvaluation.scala:223: This lexical syntax is deprecated. From scala 2.11, a dot will only be considered part of a number if it is immediately followed by a digit.
[warn] safeLiterals.ant.foldLeft(0..toFloat)(foldFunction(information.estimatedGainForLeftSafeLiteral)) +
[warn] ^
[warn] /Users/Bruno/Dropbox/Code/Skeptik/src/main/scala/at/logic/skeptik/algorithm/compressor/combinedRPILU/RegularizationEvaluation.scala:224: This lexical syntax is deprecated. From scala 2.11, a dot will only be considered part of a number if it is immediately followed by a digit.
[warn] safeLiterals.suc.foldLeft(0..toFloat)(foldFunction(information.estimatedGainForRightSafeLiteral))
[warn] ^
[warn] /Users/Bruno/Dropbox/Code/Skeptik/src/main/scala/at/logic/skeptik/experiment/compression/Algorithms.scala:51: This lexical syntax is deprecated. From scala 2.11, a dot will only be considered part of a number if it is immediately followed by a digit.
[warn] new Result(proof, (System.nanoTime - beginning).toDouble / 1000000., 1)
[warn] ^
[warn] /Users/Bruno/Dropbox/Code/Skeptik/src/main/scala/at/logic/skeptik/experiment/compression/Algorithms.scala:84: This lexical syntax is deprecated. From scala 2.11, a dot will only be considered part of a number if it is immediately followed by a digit.
[warn] var duration:Double = 0.
[warn] ^
[warn] /Users/Bruno/Dropbox/Code/Skeptik/src/main/scala/at/logic/skeptik/experiment/compression/Algorithms.scala:88: This lexical syntax is deprecated. From scala 2.11, a dot will only be considered part of a number if it is immediately followed by a digit.
[warn] duration = (System.nanoTime - beginning).toDouble / 1000000.
[warn] ^
[warn] /Users/Bruno/Dropbox/Code/Skeptik/src/main/scala/at/logic/skeptik/experiment/compression/Experimenter.scala:112: This lexical syntax is deprecated. From scala 2.11, a dot will only be considered part of a number if it is immediately followed by a digit.
[warn] val initialMeasuresRecord = measures map { m => (m, 0.) }
[warn] ^
[warn] /Users/Bruno/Dropbox/Code/Skeptik/src/main/scala/at/logic/skeptik/experiment/compression/Measures.scala:51: This lexical syntax is deprecated. From scala 2.11, a dot will only be considered part of a number if it is immediately followed by a digit.
[warn] sum.update(algorithm, sum.getOrElse(algorithm,0.) + value)
[warn] ^
[warn] /Users/Bruno/Dropbox/Code/Skeptik/src/main/scala/at/logic/skeptik/experiment/compression/Measures.scala:76: This lexical syntax is deprecated. From scala 2.11, a dot will only be considered part of a number if it is immediately followed by a digit.
[warn] String.format("%7.3f %%", double2Double(100. * afterVal.toDouble / beforeVal.toDouble))
[warn] ^
[warn] /Users/Bruno/Dropbox/Code/Skeptik/src/main/scala/at/logic/skeptik/experiment/compression/Measures.scala:79: This lexical syntax is deprecated. From scala 2.11, a dot will only be considered part of a number if it is immediately followed by a digit.
[warn] def average(algorithm: String) = String.format("%.3f %%", double2Double(100. * afterSum(algorithm).toDouble / beforeSum.toDouble))
[warn] ^
[warn] /Users/Bruno/Dropbox/Code/Skeptik/src/main/scala/at/logic/skeptik/algorithm/compressor/guard/package.scala:35: implicit conversion method proofFctToGuard should be enabled
[warn] by making the implicit value language.implicitConversions visible.
[warn] This can be achieved by adding the import clause 'import language.implicitConversions'
[warn] or by setting the compiler option -language:implicitConversions.
[warn] See the Scala docs for value scala.language.implicitConversions for a discussion
[warn] why the feature should be explicitly enabled.
[warn] implicit def proofFctToGuard[P <: Proof[_,P]](fct: ProofNodeCollection[P] => Boolean) = new Guard[P] { def apply(r: ProofNodeCollection[P]) = fct(r) }
[warn] ^
[warn] /Users/Bruno/Dropbox/Code/Skeptik/src/main/scala/at/logic/skeptik/algorithm/compressor/guard/package.scala:38: implicit conversion method fctToGuard should be enabled
[warn] by making the implicit value language.implicitConversions visible.
[warn] implicit def fctToGuard[P <: Proof[_,P]](fct: () => Boolean) = new Guard[P] { def apply(r: ProofNodeCollection[P]) = fct() }
[warn] ^
[warn] /Users/Bruno/Dropbox/Code/Skeptik/src/main/scala/at/logic/skeptik/algorithm/compressor/package.scala:20: implicit conversion method compressorAlgorithmToFunctionProofNodeCollection should be enabled
[warn] by making the implicit value language.implicitConversions visible.
[warn] implicit def compressorAlgorithmToFunctionProofNodeCollection[P <: Proof[_,P]](a: CompressorAlgorithm[P]) =
[warn] ^
[warn] /Users/Bruno/Dropbox/Code/Skeptik/src/main/scala/at/logic/skeptik/algorithm/compressor/package.scala:24: implicit conversion method compressorAlgorithmToFunctionProofNodeCollectionWithGuard should be enabled
[warn] by making the implicit value language.implicitConversions visible.
[warn] implicit def compressorAlgorithmToFunctionProofNodeCollectionWithGuard[P <: Proof[_,P]](a: CompressorAlgorithm[P]) =
[warn] ^
[warn] /Users/Bruno/Dropbox/Code/Skeptik/src/main/scala/at/logic/skeptik/util/help.scala:76: inferred existential type scala.collection.immutable.Map[Wrap,List[Wrap]] forSome { type Wrap <: Object{val n: at.logic.skeptik.proof.sequent.SequentProof} }, which cannot be expressed by wildcards, should be enabled
[warn] by making the implicit value language.existentials visible.
[warn] This can be achieved by adding the import clause 'import language.existentials'
[warn] or by setting the compiler option -language:existentials.
[warn] See the Scala docs for value scala.language.existentials for a discussion
[warn] why the feature should be explicitly enabled.
[warn] def makeMapOfChildren(node: SequentProof, nodeCollection: ProofNodeCollection[SequentProof]) = {
[warn] ^
[warn] 19 warnings found
We need to keep everything that was necessary to generate the data and charts for the LUniv paper. The experiment and the chart production should be easily reproducible, preferably by simply executing a single script. Currently, some files (e.g. "sliceXXX") seem to be missing. (The only files that should be missing are the proof files, because they are too big and there are too many of them to upload to the repository.)
Since we migrated from the old data structures and proof format to the new ones, we have not implemented a proof exporter yet. The proof exporter for the old format and old data structures is available at: https://github.com/Paradoxika/Skeptik/blob/develop/src/main/scala/at/logic/skeptik/exporter/OldResolutionProofExporter.scala . We should rewrite it to use the new data structures and new proof format.