augustss / microhs

Haskell implemented with combinators

License: Other

Haskell 56.57% Makefile 0.92% C 42.34% Shell 0.17%

microhs's Issues

Using cabal packages, dependencies for users

I am pondering what would be a minimal way to enable using cabal packages from hackage.

Let's assume that PackageImports aren't needed, i.e. each module exists only in one package. And let's assume that versions bounds don't exist, i.e. the latest version always works.

Let's say our target package depends on foo and bar, which are on hackage. Would it work to just 'cabal get' those packages and move them to the same source tree? It would be cumbersome to identify the source directory of the library component without using the Cabal library, which I would like to avoid.

One could use the topologically sorted build plan (e.g. the output of cabal-plan) from hackage (should be available since docs are built for most?) to find the list of dependencies with their versions, and then a solver could be avoided. If the build plan for bar mentions a package that is in foo too, one could obviously skip extracting it into the source tree.

But, if the plan mentions a different version, one would have to fail the build. If there are too many failures one could use stackage to ensure that the deps of foo and bar are compatible. But then where would the build plan come from? I don't think stackage publishes this.

Maybe I am making this way too hard, and there is a way to just use cabal-install with MicroHs? Or maybe there is a way to easily get the -I include path given just build-depends: foo, bar?

Would be curious to hear if anyone has thoughts on using microHs with multiple packages.
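
To make the "fail the build on a different version" idea concrete, here is a minimal sketch of merging two build plans without a solver. It assumes each plan is already a topologically sorted list of (package, version) pairs; the `Plan` type, `mergePlans`, and the package names in the usage note are made up for illustration and are not part of cabal-plan or MicroHs.

```haskell
import Control.Monad (foldM)

-- Hypothetical representation: (package name, version), in build order.
type Plan = [(String, String)]

-- Merge the plan of a second dependency into an existing plan.
-- A package already present at the same version is skipped;
-- a package present at a different version aborts with an error.
mergePlans :: Plan -> Plan -> Either String Plan
mergePlans = foldM step
  where
    step acc (pkg, ver) =
      case lookup pkg acc of
        Nothing -> Right (acc ++ [(pkg, ver)])
        Just ver'
          | ver' == ver -> Right acc
          | otherwise -> Left (pkg ++ ": " ++ ver' ++ " vs " ++ ver)
```

For example, `mergePlans [("a","1.0"),("foo","2.0")] [("a","1.0"),("bar","0.3")]` keeps a single copy of `a`, while `mergePlans [("a","1.0")] [("a","1.1")]` returns `Left "a: 1.0 vs 1.1"`.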

Forall in type synonym

I am not so knowledgeable in type theory. But I thought type synonyms should be completely transparent. But if they were, then surely this should compile (from the JHC test suite):

module Forall (id2) where

-- forall in type synonym
type IdentityFunc = forall a . a -> a

-- type synonym with forall
id2 :: IdentityFunc
id2 x = x

But I get:

line 8, col 5: Cannot satisfy constraint: (forall (a::a3) . (a -> a) ~ (a8 -> a8))

It does compile with GHC.
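
For comparison, this is how the synonym behaves under GHC, where it is interchangeable with its expansion. The sketch assumes GHC with RankNTypes enabled; the names `id2'`, `sameThing`, and `pair` are added only for illustration. Writing the expanded signature by hand might serve as a workaround in MicroHs, but that is untested here.

```haskell
{-# LANGUAGE RankNTypes #-}

-- forall in type synonym, as in the report above
type IdentityFunc = forall a . a -> a

id2 :: IdentityFunc
id2 x = x

-- The same signature with the synonym expanded by hand.
id2' :: forall a . a -> a
id2' x = x

-- Under GHC the two signatures are interchangeable.
sameThing :: IdentityFunc
sameThing = id2'

-- Both definitions can be used at several types.
pair :: (Int, Bool)
pair = (id2 1, sameThing True)
```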

Constraint on class method doesn't propagate to instance

This module compiles with GHC, but not with MicroHs. I think it should compile, so I am reporting it as a bug.

% mhs Main
mhs: "./Main.hs": line 5, col 33: Cannot satisfy constraint: (Eq a#44)
% ghc -fforce-recomp Main.hs
[1 of 2] Compiling Main             ( Main.hs, Main.o )
[2 of 2] Linking Main [Objects changed]

The contents of Main.hs (taken from transformers-0.4.3.0):

-- | Lifting of the 'Eq' class to unary type constructors.
class Eq1 f where
    eq1 :: (Eq a) => f a -> f a -> Bool

instance Eq1 Maybe where eq1 = (==)

main = pure()
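
To spell out why the constraint should be available in the instance, here is a hand-written dictionary-passing version of the same code. This is only a sketch of the usual translation, written for GHC with RankNTypes; it says nothing about how MicroHs represents dictionaries internally, and all names (`EqDict`, `Eq1Dict`, `eqMaybe`, `eq1Maybe`, `eqInt`) are made up.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Explicit stand-in for an Eq dictionary.
newtype EqDict a = EqDict { eqD :: a -> a -> Bool }

-- Stand-in for the Eq1 class dictionary: the method is polymorphic in a
-- and receives the (Eq a) dictionary as an ordinary argument.
newtype Eq1Dict f = Eq1Dict { eq1D :: forall a. EqDict a -> f a -> f a -> Bool }

-- What `instance Eq a => Eq (Maybe a)` provides, written by hand.
eqMaybe :: EqDict a -> EqDict (Maybe a)
eqMaybe d = EqDict go
  where
    go Nothing Nothing = True
    go (Just x) (Just y) = eqD d x y
    go _ _ = False

-- Corresponds to `instance Eq1 Maybe where eq1 = (==)`:
-- the method's own Eq a dictionary is used to build Eq (Maybe a).
eq1Maybe :: Eq1Dict Maybe
eq1Maybe = Eq1Dict (\d -> eqD (eqMaybe d))

eqInt :: EqDict Int
eqInt = EqDict (==)
```

With these definitions, `eq1D eq1Maybe eqInt (Just 1) (Just 1)` evaluates to `True`.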

TupleSections with two commas

MicroHs doesn't seem to like this. I am not sure whether this is a GHC extension to TupleSections that isn't supposed to be supported, or whether it is a genuine bug.

module Example2(main) where

import Prelude

f :: Bool -> Int -> (Bool, String, Int)
f = (,"Hello World",)

main :: IO ()
main = pure ()

This fails on MicroHs:

 % ./bin/mhs Example2
ERR: mhs: "./Example2.hs": line 6, col 20:
  found:    ,
  expected: { select LQIdent ( UQIdent [ literal primitive :: )

But works on GHC 9.8:

 % ghc -XTupleSections Example2.hs
# Succeeds

Taken from tcrun024.hs.
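
Until sections with more than one missing component parse, an explicit lambda expresses the same function. This is plain Haskell with no extensions; whether it sidesteps the MicroHs parser error has not been checked here.

```haskell
-- Same function as (,"Hello World",), written without TupleSections.
f :: Bool -> Int -> (Bool, String, Int)
f = \b n -> (b, "Hello World", n)
```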

Different runtime support at build time

This may be a misunderstanding about how the compiler is meant to be used, but looking at the code in Main.hs, it always loads the eval-unix-_.h file and passes this to the C compiler. It also gets the word size from the host instead of the target.

My two questions are:

- Would this not break on 32-bit systems, or on other systems which don't use the unix runtime?
- Does this mean that cross compiling with only mhs is currently not possible?

Depending on your answers, I can patch it myself.

Can't compile parsec-2.1.0.1

Unsure what's going on here, so I thought I'd report it just in case. Apologies if this is somehow a GHC-ism. I am trying to compile an old version of parsec, one that doesn't need mtl.

$ cabal get parsec-2.1.0.1
$ cd parsec-2.1.0.1
$ mhs Text.ParserCombinators.Parsec.Error
mhs: "./Text/ParserCombinators/Parsec/Error.hs": line 162, col 45: Cannot satisfy constraint: (HasField "null" (Bool -> Bool) ([Char] -> Bool))
CallStack (from HasCallStack):
  error, called at src/MicroHs/Expr.hs:542:24 in MicroHs-0.9.10.0-e-mhs-6e71179c8eb4b60198f11a95498fc388cebaabd20be20739e5f2d4e4806ae1cf:MicroHs.Expr
  errorMessage, called at src/MicroHs/TCMonad.hs:30:11 in MicroHs-0.9.10.0-e-mhs-6e71179c8eb4b60198f11a95498fc388cebaabd20be20739e5f2d4e4806ae1cf:MicroHs.TCMonad
  tcError, called at src/MicroHs/TypeCheck.hs:2755:7 in MicroHs-0.9.10.0-e-mhs-6e71179c8eb4b60198f11a95498fc388cebaabd20be20739e5f2d4e4806ae1cf:MicroHs.TypeCheck
  checkConstraints, called at src/MicroHs/TypeCheck.hs:1363:7 in MicroHs-0.9.10.0-e-mhs-6e71179c8eb4b60198f11a95498fc388cebaabd20be20739e5f2d4e4806ae1cf:MicroHs.TypeCheck
  tcDefValue, called at src/MicroHs/TypeCheck.hs:1251:39 in MicroHs-0.9.10.0-e-mhs-6e71179c8eb4b60198f11a95498fc388cebaabd20be20739e5f2d4e4806ae1cf:MicroHs.TypeCheck

The module compiles even using a modern GHC.

let, case, do and layout

Single-line let ... in expressions don't seem to be getting their layout applied correctly.

It seems like this is the "parse-error(t)" case in the Haskell Report section 10.3, where the parser has to get involved in the lexing phase.

L (t : ts) (m : ms) = } : (L (t : ts) ms)    if m /= 0 and parse-error(t)

Examples below:

main :: IO ()
main = putStrLn $ intercalate " " [ showToken t | t <- lexTop "let x = 3 in x" ]
-- let { x = 3 in x } 

-- ^ should be "let { x = 3 } in x"

main = mapM_ print (lexTop "let x = 3 in x")

{- tokens

TIdent (1,1) [] "let"
TSpec (1,5) '{'
TIdent (1,5) [] "x"
TIdent (1,7) [] "="
TInt (1,9) 3
TIdent (1,11) [] "in"
TIdent (1,14) [] "x"
TSpec (0,0) '}'

-}

There's a similar story with a same-line case.

In GHC the following is valid:

λ> (case 1 of 1 -> 1, 2)  
(1,2)

But in mhs it will lex as:

( case 1 of { 1 -> 1 , 1 ) } 

{- tokens

TSpec (1,1) '('
TIdent (1,2) [] "case"
TInt (1,7) 1
TIdent (1,9) [] "of"
TSpec (1,12) '{'
TInt (1,12) 1
TIdent (1,14) [] "->"
TInt (1,17) 1
TSpec (1,18) ','
TInt (1,20) 1
TSpec (1,21) ')'
TSpec (0,0) '}'

-}

The same happens with do:

> putStrLn $ intercalate " " [ showToken t | t <- lexTop "if True then do putStrLn \"hey\" else do pure ()" ] 
if True then do { putStrLn "hey" else do { pure ( ) } } 

{- tokens

TIdent (1,1) [] "if"
TIdent (1,4) [] "True"
TIdent (1,9) [] "then"
TIdent (1,14) [] "do"
TSpec (1,17) '{'
TIdent (1,17) [] "putStrLn"
TString (1,26) "hey"
TIdent (1,32) [] "else"
TIdent (1,37) [] "do"
TSpec (1,40) '{'
TIdent (1,40) [] "pure"
TSpec (1,45) '('
TSpec (1,46) ')'
TSpec (0,0) '}'
TSpec (0,0) '}'

-}
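
For the let case, one common approximation of the parse-error(t) rule stays entirely in the lexer: remember how many implicit let blocks are open and close the innermost one when an `in` token arrives. The sketch below works on a plain list of token strings rather than MicroHs's real token type, assumes single-line input without explicit braces, and does not cover the case and do examples; `layoutLet` is a made-up name.

```haskell
-- Insert implicit braces for single-line let ... in.
-- The Int counts implicit let blocks that are still open.
layoutLet :: [String] -> [String]
layoutLet = go 0
  where
    go :: Int -> [String] -> [String]
    go n [] = replicate n "}"                  -- close whatever is left at EOF
    go n ("let" : ts) = "let" : "{" : go (n + 1) ts
    go n ("in" : ts)
      | n > 0 = "}" : "in" : go (n - 1) ts     -- `in` ends the innermost let block
    go n (t : ts) = t : go n ts
```

With this, `unwords (layoutLet (words "let x = 3 in x"))` gives `let { x = 3 } in x`. The full rule from the Report is more general, since any token that cannot continue the block should close it, which is what the `,` and `else` examples need.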

Contribute buildroot support for MicroHs?

Buildroot is a framework for creating small linux systems. At this time, I'm not aware of any ML-family language in the buildroot package set -- so users looking to experiment with incorporating such code into their systems are on their own.

Since MicroHs can bootstrap itself, it seems like it might be quite easy to get up and running on buildroot, compared with alternatives. For reference, this is the current set of packages in buildroot: https://git.busybox.net/buildroot/tree/package.

I might be willing to look into this, after the holidays, if you favor the idea.

Passing IO function as FunPtr

I am trying to make some libuv bindings, such that I can have asynchronous networking.

But I noticed that there is no FunPtr. And even if there were, I am not sure I'd be able to pass Haskell functions as callbacks to the UV functions.

diff --git a/src/runtime/eval.c b/src/runtime/eval.c
index 6c68c49..fcc0119 100644
--- a/src/runtime/eval.c
+++ b/src/runtime/eval.c
@@ -1115,6 +1115,8 @@ const struct ffi_info ffi_table[] = {
 #if defined(FFI_EXTRA)
 FFI_EXTRA
 #endif  /* defined(FFI_EXTRA) */
+  { "uv_loop_init", (funptr_t) uv_loop_init, FFI_Pi },
+  { "uv_run", (funptr_t) uv_run, FFI_Pii },
 };
 
 /* Look up an FFI function by name */
@@ -2732,7 +2734,14 @@ execio(NODEPTR *np)
         case FFI_PPP: FFI (2); xp = PTRARG(1);yp = PTRARG(2);  rp = (*(void*   (*)(void*, void*    ))f)(xp,yp); n = mkPtr(rp); RETIO(n);
         case FFI_IPI: FFI (2); xi = INTARG(1);yp = PTRARG(2);  ri = (*(value_t (*)(value_t, void*  ))f)(xi,yp); n = mkInt(ri); RETIO(n);
         case FFI_iPi: FFI (2); xi = INTARG(1);yp = PTRARG(2);  ri = (*(int     (*)(int,   void*    ))f)(xi,yp); n = mkInt(ri); RETIO(n);
+        case FFI_Pii: FFI (2); xp = PTRARG(1);yi = INTARG(2);  ri = (*(int     (*)(void*, int      ))f)(xp,yi); n = mkInt(ri); RETIO(n); // e.g. uv_run
+        case FFI_PiP: FFI (2); xp = PTRARG(1);yi = INTARG(2);  rp = (*(void*   (*)(void*, int      ))f)(xp,yi); n = mkPtr(rp); RETIO(n); // e.g. uv_connection_cb
+        case FFI_PPi: FFI (2); xp = PTRARG(1);yp = PTRARG(2);  ri = (*(int     (*)(void*, void*    ))f)(xp,yp); n = mkInt(ri); RETIO(n); // e.g. uv_tcp_init, uv_accept
         case FFI_iPV: FFI (2); xi = INTARG(1);yp = PTRARG(2);       (*(void    (*)(int,   void*    ))f)(xi,yp);                RETIO(combUnit);
+        case FFI_PiPi:FFI (3); xp = PTRARG(1);yi = INTARG(2); zp = PTRARG(3); ri = (*(int     (*)(void*, int,   void* ))f)(xp,yi,zp); n = mkInt(ri); RETIO(n); // e.g. uv_listen
+        case FFI_PPPi:FFI (3); xp = PTRARG(1);yp = PTRARG(2); zp = PTRARG(3); ri = (*(int     (*)(void*, void*, void* ))f)(xp,yp,zp); n = mkInt(ri); RETIO(n); // e.g. uv_read_start
+        case FFI_PiPV:FFI (3); xp = PTRARG(1);yi = INTARG(2); zp = PTRARG(3); (*(void    (*)(void*, int,   void* ))f)(xp,yi,zp); RETIO(combUnit); // e.g. uv_alloc_cb, uv_read_cb
+        case FFI_PPii:FFI (3); xp = PTRARG(1);yp = PTRARG(2); zi = INTARG(3); ri = (*(int     (*)(void*, void*, int   ))f)(xp,yp,zi); n = mkInt(ri); RETIO(n); // e.g. uv_tcp_bind
         case FFI_PPzV:FFI (3); xp = PTRARG(1);yp = PTRARG(2); zi = INTARG(3); (*(void    (*)(void*, void*, size_t))f)(xp,yp,zi); RETIO(combUnit);
         case FFI_PIIPI:FFI (4);xp = PTRARG(1);yi = INTARG(2); zi = INTARG(3); wp = PTRARG(4);
           ri = (*(int     (*)(void*, int, int, void*    ))f)(xp,yi,zi,wp); n = mkInt(ri); RETIO(n);

(note the example UV functions in the comments above)

Not sure if I am approaching this wrong. How would you recommend doing asynchronous networking bindings? Is it futile to try to do it outside the RTS?

In the above snippet, you can see the signature of uv_connection_cb. I would like to define a Haskell function that I can pass to e.g. uv_listen: https://docs.libuv.org/en/v1.x/stream.html#c.uv_listen
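
For reference, this is how the same thing looks with GHC's FFI, where a "wrapper" import turns a Haskell function into a C function pointer. It is a sketch of the GHC mechanism only; per the report above MicroHs has no FunPtr, so none of this is claimed to work there. The names `StatusCallback`, `mkStatusCallback`, `callStatusCallback`, and `roundTrip` are made up, and the "dynamic" import merely stands in for a C library such as libuv invoking the callback.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Data.IORef (newIORef, readIORef, writeIORef)
import Foreign.C.Types (CInt)
import Foreign.Ptr (FunPtr, freeHaskellFunPtr)

-- Hypothetical callback shape: receives a status code, returns nothing.
type StatusCallback = CInt -> IO ()

-- Turn a Haskell function into a C function pointer.
foreign import ccall "wrapper"
  mkStatusCallback :: StatusCallback -> IO (FunPtr StatusCallback)

-- Call through a C function pointer (stand-in for the C library doing so).
foreign import ccall "dynamic"
  callStatusCallback :: FunPtr StatusCallback -> StatusCallback

-- Wrap a Haskell closure, call it via its pointer, and report what it saw.
roundTrip :: CInt -> IO CInt
roundTrip status = do
  seen <- newIORef 0
  cb <- mkStatusCallback (writeIORef seen)
  callStatusCallback cb status
  freeHaskellFunPtr cb          -- wrappers must be freed explicitly
  readIORef seen
```

The design point is that the wrapper has to allocate a small piece of executable glue per closure, which is the part that would need runtime support in MicroHs.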

Interesting but probably out-of-scope idea: Use copy-and-patch compilation

From what I get, MicroHs uses bracket abstraction to break down Haskell into supercombinators.
The language of supercombinators can be seen as a form of parameterless bytecode IR; this IR is then interpreted in eval.c.
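
As a reminder of what bracket abstraction does, here is a textbook version that compiles lambda terms to S, K, and I and reduces them to weak head normal form. It is a minimal sketch of the classic algorithm with made-up names (`Expr`, `compile`, `abstract`, `reduce`); MicroHs uses a richer combinator set and a graph reducer written in C, so this is not its actual implementation.

```haskell
data Expr
  = Var String
  | App Expr Expr
  | Lam String Expr
  | S | K | I
  deriving (Eq, Show)

-- Remove all lambdas, innermost first.
compile :: Expr -> Expr
compile (Lam x e) = abstract x (compile e)
compile (App f a) = App (compile f) (compile a)
compile e = e

-- Bracket abstraction [x]e on lambda-free terms.
abstract :: String -> Expr -> Expr
abstract x (Var y) | x == y = I
abstract x (App f a) = App (App S (abstract x f)) (abstract x a)
abstract _ e = App K e

-- Reduce the head of an application spine (weak head normal form).
reduce :: Expr -> Expr
reduce (App f a) =
  case reduce f of
    I -> reduce a
    App K x -> reduce x
    App (App S g) h -> reduce (App (App g a) (App h a))
    f' -> App f' a
reduce e = e
```

For example, `compile (Lam "x" (Lam "y" (Var "x")))` yields `App (App S (App K K)) I`, and applying that to `Var "a"` and `Var "b"` reduces to `Var "a"`.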

For nothing else than fun and profit (and for what else is this project :)), you could consider compiling this bytecode just in time instead of interpreting it.

This process can be done really fast, and yield quite fast code as well, if you precompile your supercombinators into a library of stencils at bootstrap time and then just copy/mmap them together for the particular bytecode IR at compiler run-time. Of course, these stencils will in general have holes in them for register/stack operands and constants which need to be patched-in by the compiler. This is the basis of copy-and-patch compilation, a work which you might enjoy. Python 3.13 recently shipped with such a JIT.

There apparently is a C++ library called MetaVar to generate and do AST pattern-matching for stencils using Clang and its GHC calling convention (which is attractive because MetaVar compiles stencils to CPS), but I haven't found it so far after a bit of Googling. This is probably a no-go for the stated objectives of your project anyway, since that is quite a heavy dependency footprint. Nevertheless, I thought you might find this avenue interesting.

Floats all over the shop

Targeting a platform with WANT_FLOAT turned off, the C compilation step fails due to parse_double missing. Seemingly this is easy enough to fix -- we can just gate the '&' primop behind WANT_FLOAT.

However, if I try that, I quickly discover that the combinator output contains quite a lot of double constants. Even for the empty program main = pure (), there's a small handful of 0.0 and 1.0 constants. Where are these coming from? And how do I get rid of them?

`bin/mhs Example -oEx` fails with missing md5 symbol

Building executables with mhs is failing for me because it is not including the md5.c source when compiling. I needed to make this change to get it to work:

diff --git a/src/MicroHs/Main.hs b/src/MicroHs/Main.hs
index e3ff1e4..6749392 100644
--- a/src/MicroHs/Main.hs
+++ b/src/MicroHs/Main.hs
@@ -118,7 +118,7 @@ mainCompile mhsdir flags mn = do
        hClose h
        ct1 <- getTimeMilli
        mcc <- lookupEnv "MHSCC"
-       let cc = fromMaybe ("cc -w -Wall -O3 " ++ mhsdir ++ "/src/runtime/eval.c $IN -lm -o $OUT") mcc
+       let cc = fromMaybe ("cc -w -Wall -O3 " ++ mhsdir ++ "/src/runtime/eval.c " ++ mhsdir ++ "/src/runtime/md5.c $IN -lm -o $OUT") mcc
            cmd = substString "$IN" fn $ substString "$OUT" outFile cc
        when (verbose flags > 0) $
          putStrLn $ "Execute: " ++ show cmd

getRaw seems to be failing on Fedora Linux

I have only tried on Fedora Linux Rawhide so far, but this might affect other Linux distros too?
Maybe I should re-bootstrap?

$ mhs
Welcome to interactive MicroHs!
loaded Data.Bool_Type
loaded Data.Ordering_Type
loaded Primitives
loaded Data.Function
loaded Data.Functor
loaded Control.Applicative
loaded Data.List_Type
loaded Data.Char_Type
loaded Control.Error
loaded Data.Bounded
loaded Data.Eq
loaded Text.Show
loaded Data.Bool
loaded Control.Monad
loaded Data.Integer_Type
loaded Data.Num
loaded Data.Ord
loaded Data.Integral
loaded Data.Ratio_Type
loaded Data.Real
loaded Data.Int
loaded Data.Char
loaded Data.Maybe_Type
loaded Data.Tuple
loaded Data.List
loaded Data.Maybe
loaded Data.Bits
loaded Data.Fractional
loaded Data.Floating
loaded Data.Enum
loaded Data.Integer
loaded Data.Ratio
loaded Data.RealFloat
loaded Data.Word
loaded Data.Double
loaded Data.Either
loaded Foreign.Marshal.Alloc
loaded Foreign.C.String
loaded Foreign.Ptr
loaded System.IO
loaded Text.String
loaded Prelude
Type ':quit' to quit, ':help' for help
> mhs: getRaw failed
                    readline: warning: turning off output flushing
$

This also leaves the terminal in a bad state (e.g. Backspace no longer works): reset helps a little.

Records not supported?

If records are not supported, I think that would be worth mentioning in the README.

This is what I tried:

module Record (R(..)) where

data R = R1 { r1Field :: () }

The error is

mhs: "./Record.hs": line 3, col 13:
  found:    {
  expected: LQIdent ( UQIdent [ literal :: => ! | deriving ; }

MicroHs, version 0.8.1.0, combinator file version v5.1

Tabs not accepted

I am trying to compile some old Haskell programs from Bird/Wadler and they have tabs. It seems like MicroHs doesn't tolerate this. I am not sure whether it's intentional, but I thought I'd bring it up.

 % hd Example.hs 
00000000  6d 6f 64 75 6c 65 20 45  78 61 6d 70 6c 65 28 6d  |module Example(m|
00000010  61 69 6e 29 20 77 68 65  72 65 0a 69 6d 70 6f 72  |ain) where.impor|
00000020  74 20 50 72 65 6c 75 64  65 0a 0a 6d 61 69 6e 20  |t Prelude..main |
00000030  3a 3a 20 49 4f 20 28 29  0a 6d 61 69 6e 20 3d 20  |:: IO ().main = |
00000040  64 6f 0a 09 70 75 74 53  74 72 4c 6e 20 22 48 65  |do..putStrLn "He|
00000050  6c 6c 6f 20 57 6f 72 6c  64 22 0a                 |llo World".|

 % ./bin/mhs -r Example
ERR: mhs: "./Example.hs": line 6, col 1:
  found:    TBrace
  expected: {

GHC accepts this.
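
The Haskell Report's layout section fixes tab stops at 8 columns, with the first column numbered 1, so a lexer only needs a small column-advance function to handle tabs. A minimal sketch, with made-up names `advance` and `indentColumn`:

```haskell
-- Column after consuming one character, starting from column col (1-based).
-- A tab moves to the next tab stop; tab stops are 8 characters apart.
advance :: Int -> Char -> Int
advance col '\t' = ((col - 1) `div` 8 + 1) * 8 + 1
advance col _ = col + 1

-- Column of the first non-blank character on a line.
indentColumn :: String -> Int
indentColumn = foldl advance 1 . takeWhile (`elem` " \t")
```

For the file above, `indentColumn "\tputStrLn \"Hello World\""` is 9, so the statement is indented relative to `main` and the do block is well formed.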

Use as an interpreter

Is there a way to use MicroHs as an embedded interpreter/scripting language inside regular Haskell, like this:

import MicroHs
main = putStrLn $ microhs_run "case 1+1 of 2 -> True; _ -> False"

I think this would be a great lightweight alternative to lua/hint.
I guess this would require the runtime system to be ported to Haskell/MicroHs?

KindSignature on EmptyDataDecl

Program:

module Example (main) where
import Prelude
import Data.Kind (Type)

main :: IO ()
main = pure ()
data SameKind :: Type

Fails with MicroHs:

ERR: mhs: "./Example.hs": line 7, col 15:
  found:    ::
  expected: LIdent ( = deriving ; }

Succeeds with GHC:

ghc Example.hs -XHaskell98 -XKindSignatures -XEmptyDataDecls

This is unexpected since these extensions are listed as supported.

primRnfNoErr can't ignore errors passed to primitives

I was fiddling with this as primRnfNoErr is a form of cheap-and-cheerful speculation (error-avoiding but not bottom-avoiding eagerness). But it was fairly obvious from reading eval.c that this was going to break:

module Example(main) where
import Prelude
import Primitives

main :: IO ()
main = do
  putStrLn "primRnfNoErr is kind of broken"
  return $! primRnfNoErr ((error "foo") + (3 :: Int))
  putStrLn "Success!"

And indeed:

% bin/mhs -r Example
primRnfNoErr is kind of broken
ERR: evalint, bad tag 2

Brainstorming a bit about how this might be solved cleanly. It'd be nice to be able to fix this without having to add checks at all the calls to evali etc.
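
For comparison, the behaviour being asked for looks roughly like this in GHC, using exceptions rather than anything in eval.c. It is only an analogue: `speculate` is a made-up name, it evaluates to weak head normal form rather than full normal form, and it says nothing about how the MicroHs runtime should implement the check.

```haskell
import Control.Exception (ErrorCall, evaluate, try)

-- Evaluate a value to WHNF; turn a call to `error` into Nothing
-- instead of letting it escape. Non-termination is not avoided.
speculate :: a -> IO (Maybe a)
speculate x = either ignore Just <$> try (evaluate x)
  where
    ignore :: ErrorCall -> Maybe b
    ignore _ = Nothing
```

Here `speculate (error "foo" + (3 :: Int))` returns `Nothing`, because the addition forces its erroring argument and the resulting exception is caught at the boundary rather than inside the primitive.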

GRIN and the GHC RTS

Hey Lennart, was curious to know your thoughts on these topics in general, and their possible inclusion into MicroHs.

GRIN

I was curious about your thoughts on using first-order GRIN as an IR (as seen in JHC, etc.) in the MicroHs project (for something like a -wpo mode). And importantly, do you think GRIN is a viable / practical IR for larger Haskell projects?

Possible Benefits:
- Smaller executables
- Removal of indirect calls
- Explicit laziness being inlined (eval inlining)
- Fewer allocations
- Simpler than STG
Possible Downsides:
- Potentially long compilation times (analysis other than Steensgaard + finding a fixed point during optimization might take a very long time).

GHC RTS

Would you consider implementing some of the GHC RTS features (threading, concurrency primitives, etc.) within MicroHs?
Could a standalone C implementation (divorced from STG) of the RTS be possible for MicroHs?

Thanks!