co4's People

Contributors

abau, blaisorblade, jwaldmann, rene-thiemann

co4's Issues

stack overflow during abstract evaluation

commit c5eaecd

leads to stack overflows during abstract evaluation, which should not happen, since recursion is on known list shapes only (the contents of the lists are unknown, but that should not influence the control flow)

./CO4/CSPlib/Prob006/Main 2 1 +RTS -xc
Start producing CNF
*** Exception (reporting due to +RTS -xc): (THUNK_STATIC), stack trace: 
  CO4.EncodedAdt.isDefined,
  called from CO4.EncodedAdt.isUndefined,
  called from CO4.EncodedAdt.isConstantlyUndefined,
  called from CO4.EncodedAdt.flags.\,
  called from CO4.EncodedAdt.flags,
  called from CO4.EncodedAdt.constantConstructorIndex,
  called from CO4.EncodedAdt.constructorArgument,
  called from Main.encDistribute.\.encXs'_31,
  called from Main.encDistribute.\,
  called from Main.encDistribute,
  called from Main.encOesort.\,
  called from Main.encOesort,
  called from Main.encAlldifferent_sorting.\,
  called from Main.encAlldifferent_sorting,
  called from Main.encConstraint.\,
  called from Main.encConstraint,
  called from CO4.Monad.runCO4,
  called from CO4.Solve.solveP,
  called from CO4.Solve.solveAndTestP,
  called from Main.run,
  called from Main.main,
  called from :Main.CAF:main
...
Stack space overflow: current size 8388608 bytes.

ignore "deriving (Show, Eq)"

To be able to use *.standalone.hs in ghci, data declarations must contain deriving (Eq, Show). CO4 should simply ignore these deriving clauses.

improve tracing: check/log pattern matches on unknown/known data

for debugging the space/time behaviour I want to know, for each case expression, how often it was evaluated on known data and how often on unknown data. Of course "known" is better, since it does not require encoding and merging.

  • logging: collect this information (while the abstract program is running) and print it in a suitable form
  • checking: I want to put some annotation in the source that says "this case should ONLY be executed on known data"

As long as we don't have the "modes" system: if such an annotation is present, CO4 should not generate code for matching unknown data and merging the results, but instead emit code that throws an exception.

built-in binary numbers

We should provide an easy way to use an externally defined (and optimized) binary encoding for data types, like those in https://github.com/jwaldmann/satchmo/blob/master/Satchmo/Binary/Op/Times.hs#L21

Suggestion: for abstract objects of the types Integer (arbitrary width) and Int8, Int16, ... (fixed width) (actually Natural, Nat8, ...), we use the EncodedAdt type that we already have:
we put the binary encoding in the list of flags, and there are no arguments. Then:

  • pattern matching on these types is forbidden (instead, a library must expose some functions for testing properties)
  • but merging works as usual (well, need to check what happens if we merge different widths - probably the IntN library functions just ignore any extra bits after N)

Ambiguous type when using (==) in polymorphic code

Using (==) in polymorphic code, like

ord :: Eq a => a -> ...

results in ambiguous types in the compiled code (here: encOrd):

Could not deduce (EncEq a0 e p)
      arising from the ambiguity check for `encOrd'
    from the context (CO4.Cache.MonadCache (e p) m,
                      CO4.Profiling.MonadProfiling m,
                      EncEq a e p)
      bound by the inferred type for `encOrd':
                 (CO4.Cache.MonadCache (e p) m, CO4.Profiling.MonadProfiling m,
                  EncEq a e p) =>
                 e p -> e p -> e p -> m (e p)

as it is unclear which EncEq instance to choose. GHCi suggests:

Possible fix: add a type signature that fixes these type variable(s)

but this would require adding explicit type signatures to the compiled program.

gtNat8 has wrong type sometimes

this does not work:

monotone xs = 
    and (zipWith gtNat8 xs (tail xs) )
...
CO4/Test/Assert.hs:25:4:
    Could not deduce (p ~ (e p -> m (e p)))

but this does:

monotone xs = 
    and (zipWith gt xs (tail xs) )

gt = gtNat8

Stack space overflow

I am getting stack overflows when generating CNFs of moderate (I think) size.
It would be good to know the reason. Perhaps something is too strict,
e.g., foldr with a strict combining function (so it has to traverse the whole list before producing a result)? The following trace (with cache, with profiling) shows nothing unusual.

commit 7a4284f ,

 ./CO4/Test/WCB '(((......)))' +RTS -xc

Start producing CNF
*** Exception (reporting due to +RTS -xc): (THUNK_STATIC), stack trace: 
  CO4.Profiling.onCurrentInner,
  called from CO4.Profiling.emit,
  called from Satchmo.Core.Boolean.assert,
  called from Satchmo.Core.Primitive.assertOr,
  called from Satchmo.Core.Boolean.and,
  called from Satchmo.Core.Primitive.xor.xor2,
  called from Satchmo.Core.Primitive.xor,
  called from CO4.PreludeNat.encComparePrimitives,
  called from CO4.PreludeNat.encLtNat8.\,
  called from CO4.PreludeNat.onFlags,
  called from CO4.PreludeNat.encLtNat8,
  called from CO4.PreludeNat.encGtNat8,
  called from CO4.PreludeNat.encMaxNat8.\,
  called from CO4.PreludeNat.encMaxNat8,
  called from CO4.Cache.withCache.\,
  called from CO4.Cache.withCache,
  called from Satchmo.Core.SAT.Minisat.>>=.\.(...),
  called from Satchmo.Core.SAT.Minisat.>>=.\,
  called from Satchmo.Core.SAT.Minisat.>>=,
  called from Satchmo.Core.SAT.Minisat.solve'.\,
  called from Satchmo.Core.SAT.Minisat.solve',
  called from CO4.Algorithms.Eitherize.Solve.solve,
  called from CO4.Algorithms.Eitherize.Solve.solveAndTestP,
  called from CO4.Algorithms.Eitherize.Solve.solveAndTestBooleanP,
  called from Main.result_for,
  called from Main.mainz

wildcard patterns (_)

for the definition of equality on

data T = A | B | C

it would be nice if we had '_' for writing

eqT x y = case x of A -> case y of A -> True ; _ -> False ...
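
For comparison, this is roughly what has to be written today without wildcard support, spelling out every constructor (a sketch only):

eqT x y = case x of
  A -> case y of A -> True  ; B -> False ; C -> False
  B -> case y of A -> False ; B -> True  ; C -> False
  C -> case y of A -> False ; B -> False ; C -> True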

profiling

simple proposal (for satchmo backend):

status :: SAT (Int, Int) 

should give the current number of variables and clauses (we can count these on our own, or ask the minisat API)

then, a simple version of "call with profile" (in the compiled program):

traced :: String -> SAT a -> SAT a
traced msg action = do
    (v0, c0) <- status
    x <- action
    (v1, c1) <- status
    -- report how many variables/clauses 'action' added
    -- (printing has to be lifted into SAT in some way)
    print (msg, v1 - v0, c1 - c0)
    return x

but the semantics in the constraint system is:

traced :: String -> a -> a ; traced msg x = x

deriving Eq

for all user-defined algebraic data types, equality would be nice:

  • deriving Eq for the "original" program (so it can be used when testing the answer)
  • the standard implementation of '==' for the transformed program (so it can be used in constraints)

(Note: We don't want the full type class machinery, just '=='.)

A design question is whether

  • the compiler adds 'deriving (Show, Eq)' silently
  • the user has to write 'deriving (Show, Eq)' explicitly

Writing it explicitly might be better: then the constraint program is a correctly typed Haskell program.

nested patterns

it would be nice to allow nested patterns, for example

case xs of Cons x Nil -> ...

They should be translated to nested cases with flat patterns like

case xs of Cons x xs' -> case xs' of Nil -> ...

abstract evaluation (merging) blows up data

(this is not a bug) The following example shows an exponential blowup
due to recursive merging (which I think produces a full binary tree):

data T = A | B Bool T | C T Bool

build xs = case xs of
        [] -> A
        x : xs' ->
            let t' = build xs'
            in case x of
                   False -> B x t'
                   True -> C t' x

https://github.com/apunktbau/co4/blob/608e3270af504849344a860c95fb893c0ad2faa4/CO4/Test/Complexity1.hs#L29

data from running the test program (#vars roughly doubles each time)

depth   #vars
    5      188
    6      380
    7      764
    8     1532
    9     3068
   10     6140

I think this could be avoided if the arguments of constructor C are swapped in the abstract value,
effectively using

data T = A | B Bool T | C Bool T

but I am not sure whether CO4 should look for this kind of optimisation.

from Nat8 to configurable (but fixed) bit width numbers

could we replace Nat8 by Nat in the sense that:

  • the bit width must be given in the constructor, but not in operations
  • operations with more than one argument (plus, times, equals) check at runtime that all operands have equal width
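
A minimal sketch of the concrete-program side, reusing the width/value accessors that PreludeNat already has; the runtime width check and the wrap-around behaviour are assumptions for illustration, not the current implementation:

data Nat = Nat { width :: Int, value :: Integer }

-- the constructor fixes the bit width ...
nat :: Int -> Integer -> Nat
nat w v = Nat w (v `mod` 2 ^ w)

-- ... and binary operations only check that both widths agree
plusNat :: Nat -> Nat -> Nat
plusNat (Nat wa a) (Nat wb b)
  | wa == wb  = Nat wa ((a + b) `mod` 2 ^ wa)
  | otherwise = error "plusNat: operands have different bit widths"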

abstract evaluation introduces non-termination (could be improved by incremental solving)

the concrete program is terminating (for all arguments)
but the abstract program (currently) is not.

with incremental feasibility checking, abstract evaluation should detect
that "case u of True ..." cannot happen in the inner call.

constraint k u = case u of
            False -> k
            True -> constraint (not k) False

https://github.com/apunktbau/co4/blob/f5b8b0d9e3d3b018d7f437be5514ac5fc7b1aeaf/CO4/Test/Termination1.hs#L25

Passing around top level functions

This doesn't compile

data Bool        = False | True
data List a      = Nil | Cons a (List a)

main x = all (\c -> c x) (Cons not Nil)

all f = foldr (\x -> and (f x)) True

foldr k z xs = case xs of
  Nil       -> z
  Cons y ys -> k y ( foldr k z ys )

and x y = case x of 
  False -> False
  True  -> y

not x = case x of
  False -> True
  True  -> False

Maybe passing around top level functions in the compiled system isn't handled correctly?

built-in data Bool

it would be nice to have data Bool = False | True
and some basic operations ( not, ||, && )

and also if-then-else (translated to case).

Note: data Bool is needed anyway for the result type
of the top-level constraint.
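
In the restricted source syntax (flat case expressions only), these operations could simply ship as a prelude module, roughly as sketched below; the names and2, or2 and ifThenElse are placeholders, since it is unclear whether operator definitions are supported:

data Bool = False | True

not x = case x of
  False -> True
  True  -> False

and2 x y = case x of
  False -> False
  True  -> y

or2 x y = case x of
  False -> y
  True  -> True

ifThenElse c t e = case c of
  False -> e
  True  -> t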

solved, but test fails (perhaps due to use of "undefined")

commit f3c98ba
file CO4/Test/SL.hs, execute example, gives:

(RULES a -> b)
[([[]],[[True]])]
fromList [(b,[True]),(a,[])]
Start producing CNF
Cache call hits: 61 (18%), call misses: 267 (81%)
Cache case hits: 29 (16%), case misses: 149 (83%)
Assertion: 196
CNF finished (#variables: 99, #clauses: 221)
Starting solver
Solver finished in 0.0 seconds (result: True)
Starting decoder
Decoder finished
Solution: Step [([],[[Finite [False]]]),([True],[[Finite [True]]])] []
Test: *** Exception: Prelude.undefined

Shouldn't it print True instead?

main constraint should not be called "main"

otherwise it conflicts with the "main" function (of type IO()) that ghc expects when compiling an executable.

there are work-arounds like

  • putting $(runIO ..) in a separate file
  • using ghc -main-is foo

so it is probably not that important.

Nat with larger bit width broken (by inefficiency)

Test case: https://github.com/apunktbau/co4/blob/master/CO4/Test/Factor.hs
contains simple binary multiplication.

Observed behaviour: it eats a huge amount of space (also try replacing 24 by 32).

This seems terribly wrong: https://github.com/apunktbau/co4/blob/master/CO4/PreludeNat.hs#L50

instance Encodeable Nat where
  encode n = encodedConstructor (value n) (2^(width n)) []

since it seems to iterate over 2^24 values.

This inefficiency is hidden at the lower bit widths that we usually use (<= 8),
but it should be fixed.
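
The fix is presumably to emit the width-many bits of the value directly as constant flags, instead of enumerating all 2^width constructor indices. A self-contained sketch of the bit computation only (the actual EncodedAdt construction is left out, since its API may differ):

import Data.Bits (testBit)

-- least significant bit first, exactly w bits
bitsOf :: Int -> Integer -> [Bool]
bitsOf w v = [ testBit v i | i <- [0 .. w - 1] ]

-- e.g. bitsOf 8 5 == [True,False,True,False,False,False,False,False]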

CO4 options: distinguish between static and dynamic

at the moment, CO4 is configurable like this

$( runIO $ configurable [ Verbose
                        , ImportPrelude
                        -- , DumpAll "/tmp/sl" 
                        -- , Profile
                        , Cache
                        ] 
         $ compileFile "CO4/Test/SL.standalone.hs" )

This means I have to know all the options at compile-time.

There should be a way to set some of these options at run-time (Verbose, Profile, maybe Cache).

encConstraint has wrong type (perhaps wrong number of arguments?)

commit ac8e7ec

this is fine:

ghci CO4/Test/WCB_Matrix.standalone.hs 

but this fails:

ghci CO4/Test/WCB_Matrix.hs

CO4/Test/WCB_Matrix.hs:73:8:
    Couldn't match type `e0 p0
                         -> CO4.EncodedAdt.Overlapping.Overlapping
                              Satchmo.Core.Boolean.Boolean'
                  with `Satchmo.Core.SAT.Minisat.SAT
                          (EncodedAdt Satchmo.Core.Boolean.Boolean)'
    Expected type: ParamConstraintSystem
                     Satchmo.Core.SAT.Minisat.SAT Satchmo.Core.Boolean.Boolean
      Actual type: CO4.EncodedAdt.Overlapping.Overlapping
                     Satchmo.Core.Boolean.Boolean
                   -> CO4.EncodedAdt.Overlapping.Overlapping
                        Satchmo.Core.Boolean.Boolean
                   -> e0 p0
                   -> CO4.EncodedAdt.Overlapping.Overlapping
                        Satchmo.Core.Boolean.Boolean
    In the fourth argument of `solveAndTestBooleanP', namely
      `encConstraint'

now ParamConstraintSystem SAT Boolean expands to

Overlapping Boolean
-> Overlapping Boolean
-> SAT (Overlapping Boolean)

and it looks as if encConstraint somehow got one extra argument (of type e p).

The strange thing is: if I comment out the second conjunct (after &&) of

simple s (p, t) =
           ( geEnergy (bound p s) (get t (leftmostI p) (rightmostI p)) )
        && ( maxboundOk mi zero plus times eqEnergy cost p t )

then it works fine.

built-in == for Bool is not static enough

the new assertKnown is a great help in debugging.

I found that the result of x == y is not known (in the assertKnown sense) even when both x and y are,
but the explicit definition is fine:

eqBool x y = case x of
    False -> not y
    True  -> y

add encodings of csplib problems

CSPLIB ( http://www.csplib.org/ ) contains combinatorial problems that are standard in the constraints community. It would be good to have at least some of them encoded in CO4, to compare encoding (does it look nice?) and performance. (I started in CO4/CSPLIB/ ) One recurring building block seems to be the alldifferent constraint (cf. Issue #44 )

built-in data List a

it would be nice to have

data [a] = [] | a : [a] 

and some standard operations from the Prelude like (++), concat, reverse, map, foldr, and, or.
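
A sketch of such a module in the flat-pattern style that CO4 currently accepts (ordinary identifiers instead of operator names, in case operator definitions are not supported):

data List a = Nil | Cons a (List a)

append xs ys = case xs of
  Nil        -> ys
  Cons x xs' -> Cons x (append xs' ys)

foldr f z xs = case xs of
  Nil        -> z
  Cons x xs' -> f x (foldr f z xs')

map f xs   = foldr (\x r -> Cons (f x) r) Nil xs
concat xss = foldr append Nil xss
reverse xs = foldr (\x r -> append r (Cons x Nil)) Nil xs
and xs     = foldr (\x r -> case x of { False -> False ; True -> r }) True  xs
or  xs     = foldr (\x r -> case x of { False -> r ; True -> True  }) False xs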

built-in improved encodings for Boolean connectives

even with data Bool built-in, there's (probably) a difference between

x && y = case x of { False -> False; True -> y }

and Satchmo.Boolean.&&

Is there indeed? This should be investigated. Perhaps Minisat's simplifier removes the difference already.
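
For reference, the direct encoding that Satchmo-style primitives can emit for z = x && y amounts to just three clauses (shown here as DIMACS-style integer literals, not actual Satchmo code), while the case-based definition goes through CO4's generic merge machinery; whether Minisat's preprocessing then equalizes the two is the open question:

-- clauses for z <-> (x && y), with positive/negative Ints as literals
andClauses :: Int -> Int -> Int -> [[Int]]
andClauses x y z = [ [-z, x], [-z, y], [-x, -y, z] ]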

more profiling info

the current profiler accumulates costs
(everything "in and under" a function is attributed to that function).

it would be nice to have, in addition, the cost "in" the function
(without the cost "under" the function).

possible realisation: global variables with the meanings:

  • last snapshot of cost
  • name of current function

each time the name of the current function changes, a new snapshot is taken, and the difference from the previous snapshot is attributed to the previous function.
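
A minimal sketch of that bookkeeping, using plain IORef globals for illustration; the real version would live inside CO4.Profiling's monad, and Snapshot/switchTo are made-up names:

import Data.IORef
import qualified Data.Map.Strict as Map

data Snapshot = Snapshot { vars :: Int, clauses :: Int }

-- (last snapshot, name of current function, accumulated "inner" cost per function)
type ProfState = (Snapshot, String, Map.Map String (Int, Int))

-- when the current function changes: attribute the delta since the last
-- snapshot to the *previous* function, then remember the new snapshot
switchTo :: IORef ProfState -> String -> Snapshot -> IO ()
switchTo ref new now = modifyIORef' ref $ \(prev, cur, acc) ->
  let delta = (vars now - vars prev, clauses now - clauses prev)
      add (a, b) (c, d) = (a + c, b + d)
  in  (now, new, Map.insertWith add cur delta acc)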

profiling data

how to compile and execute with ghc's profiler (commit 60351a6)

ghc -O2 CO4/Test/WCB.hs -main-is mainz -rtsopts
ghc -O2 CO4/Test/WCB.hs -main-is mainz -rtsopts -prof -auto-all -osuf oprof
./CO4/Test/WCB '((...))' +RTS -K1G -M7G -P -h

then look at WCB.prof

COST CENTRE                   MODULE                         %time %alloc  ticks     bytes

solve                         CO4.Algorithms.Eitherize.Solve  20.0    0.0   1050    125088
emit                          CO4.Profiling                   14.2   22.1    746 392602912
caseOfBits.caseOf2Bits.merge2 CO4.EncodedAdt                  10.8   20.3    565 360866736

is this expected?

functions from PreludeNat should appear in profiles

commit 7ad6117

at present it seems plusNat8 etc. are ignored for profiling:

constraint s xs = 
    eqNat8 s ( summe xs )

summe xs = case  xs of
    [] -> nat8 0
    x:xs' -> plusNat8 x (summe xs')

Profiling (inner): 
("summe",ProfileData {numCalls = 12, numVariables = 633, numClauses = 1878})
("constraint",ProfileData {numCalls = 1, numVariables = 17, numClauses = 41})
("__init",ProfileData {numCalls = 0, numVariables = 0, numClauses = 0})

this gives the wrong impression that summe allocates variables. Work-around:

constraint s xs = 
    eqNat8 s ( summe xs )

summe xs = case  xs of
    [] -> nat8 0
    x:xs' -> plus x (summe xs')

plus = plusNat8

Profiling (inner): 
("plus",ProfileData {numCalls = 11, numVariables = 633, numClauses = 1878})
("constraint",ProfileData {numCalls = 1, numVariables = 17, numClauses = 41})
("summe",ProfileData {numCalls = 12, numVariables = 0, numClauses = 0})
("__init",ProfileData {numCalls = 0, numVariables = 0, numClauses = 0})

Unique Ids for terms by memoizing constructor calls

For optimized memoization we need an ideal (collision-free) hash function, so we can find an item in the cache in one step (and avoid traversing the item).

The "easy" (brute force) solution is to memoize constructor calls. To each node, we attach a globally unique Identifier (a number). Then all terms are reduced (as in "reduced BDDs").

more built-ins: data Either a b, data Maybe a, data (a,b)

by "built-in" I mean: it should be available to the user, not necessarily: it should be treated in a special way by the compiler (only Bool is special).

It is more like reading a module with definitions that the Haskell programmer expects (it should be a subset of the official Haskell Prelude).
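
Concretely, such a module could look like the following sketch (Pair stands in for (a,b) in case tuple syntax is not available):

data Maybe a    = Nothing | Just a
data Either a b = Left a | Right b
data Pair a b   = Pair a b

fromMaybe d m = case m of
  Nothing -> d
  Just x  -> x

either f g e = case e of
  Left  a -> f a
  Right b -> g b

fst' p = case p of Pair a b -> a
snd' p = case p of Pair a b -> b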

Typed allocators

Add typed allocators to statically guarantee that an allocator matches an ADT.
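
A sketch of the idea using a phantom type parameter; Allocator here is a stand-in for CO4's untyped allocator type, and all the other names are made up:

-- untyped allocators (stand-in)
data Allocator = BuiltIn Int | Known Int [Allocator]

-- the phantom parameter ties an allocator to the ADT it allocates
newtype TAllocator a = TAllocator Allocator

-- e.g. a typed allocator for   data Bool = False | True
knownFalse :: TAllocator Bool
knownFalse = TAllocator (Known 0 [])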

order of patterns

as I understand, currently

  • each pattern (in a case) must be a constructor followed by variables
  • the order of patterns must match the order of constructors in the data declaration

these decisions are ok for a prototype, but the point of this bug report is that the ordering restriction is not enforced. My suggestion:

  • check the order and reject the program if it does not match
  • or re-order the patterns
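
An illustration with the type data T = A | B | C from the wildcard issue (two separate functions, so the alternatives can stand side by side):

-- pattern order differs from the declaration order;
-- currently this is accepted silently
f x = case x of B -> 1 ; A -> 0 ; C -> 2

-- what the check would demand, or what automatic re-ordering would produce
g x = case x of A -> 0 ; B -> 1 ; C -> 2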

type synonyms

Support type synonyms, so we can write:

type Foo = Bar
data Spam = Spam Foo

Possible implementation: expand type synonyms in the Template Haskell frontend.
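
After expansion, the example above would simply read:

data Spam = Spam Bar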

debug/trace output must go to stderr (not stdout)

Example: for termination tools, it is required that stdout contains just the final proof (starting with YES), so everything that happens before must be silent.

In general, CO4 applications expect to have control over any output.

They want to enable/disable it, and also to direct it to some handle.

Or, instead of printing, it could be collected into some structured data type so it can be analysed programmatically later.
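
A minimal sketch of the first step, routing all diagnostics through a handle that defaults to stderr (logTo and verbose are made-up names):

import System.IO (Handle, stderr, hPutStrLn)

-- all CO4 diagnostics would go through this,
-- so applications can redirect or silence it
logTo :: Handle -> String -> IO ()
logTo = hPutStrLn

verbose :: String -> IO ()
verbose = logTo stderr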

named components in records

it would be nice to be able to declare

data T = C { foo :: T1, bar :: T2 }

and then have

  • translation to record construction (and pattern match) with positional arguments (with any order of components, but we could require that each component must be present for construction - this conforms to strict semantics)
  • component accessor functions (as partial functions)
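
A sketch of the intended translation (T1 and T2 stand for arbitrary component types):

-- source:
--   data T = C { foo :: T1, bar :: T2 }

-- translation: positional construction and pattern matching ...
data T = C T1 T2

-- ... plus partial accessor functions
foo t = case t of C x y -> x
bar t = case t of C x y -> y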

split working examples from Test/ directory

For the sake of testability we should introduce an Example/ directory that includes working examples. New versions of co4 must solve these examples in order to get released.

The existing Test directory should contain drafts and tests that may not work with the latest version of co4.

add encodings of minizinc challenge problems

(mini)zinc is (planned to be) the standard CSP modelling language. We should compare CO4 encodings to minizinc encodings, for standard/interesting/challenging problems. Cf. http://www.minizinc.org/challenge2013/

The programming models are still different: minizinc has only limited forms of abstraction, and no pattern matching. So we should not directly translate the minizinc encoding, but rewrite it to more idiomatic Haskell.

built-in data (e.g., Int) that are never encoded

It would be really really helpful to have some data that is never encoded.

An important use case is indices for collections (e.g., lists, maps).

We do want caching for non-encoded data.

For monomorphic Prelude types, we could just take

data Overlapping p = Overlapping p [p] [ Overlapping p ] 
                   | Bottom
                   | Known_Int Int

but the problem is in

  • user-defined types
  • polymorphic types
