sixten's People

Contributors

brandonhamilton, mheinzel, ollef, olleolleolle, sighingnow, vic

sixten's Issues

Internalise extern C declarations

Use opt -internalize-public-api-list to internalise the C functions stemming from extern C declarations. They don't need to be visible in the generated binaries.

Question about the type inference

Why did you choose to implement "Practical type inference for arbitrary-rank types" instead of "Complete and Easy Bidirectional Typechecking for Higher-Rank Polymorphism" (which you have implemented before)? I was under the impression that Dunfield's system is easier to implement and has the same features as SPJ's system.
I am interested in what advantages SPJ's system has over Dunfield's.

Use llvm-config to find the LLVM binary directory

llvm-config is able to provide the directory that the system places the LLVM binaries in.

We should also make the llvm-config binary configurable on the command line, which would provide a way to use the compiler on systems like Ubuntu where the LLVM binaries on the PATH have their versions appended (e.g. opt-5.0; see the discussion in #9). Users on such systems could then run e.g. sixten compile --llvm-config=llvm-config-5.0. We might also want to consider just trying e.g. llvm-config-5.0 if llvm-config doesn't exist.
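
As a rough sketch of that probing logic (the function and candidate list below are illustrative, not the compiler's actual code), we could run each candidate with --bindir and take the first one that succeeds:

import Control.Exception (IOException, try)
import Data.Char (isSpace)
import Data.List (dropWhileEnd)
import System.Process (readProcess)

-- Try each candidate llvm-config binary in turn and return the first one
-- that runs successfully, together with the bin directory it reports.
findLlvmBinDir :: [FilePath] -> IO (Maybe (FilePath, FilePath))
findLlvmBinDir [] = return Nothing
findLlvmBinDir (candidate : rest) = do
  result <- try (readProcess candidate ["--bindir"] "") :: IO (Either IOException String)
  case result of
    Left _ -> findLlvmBinDir rest
    Right output -> return (Just (candidate, trim output))
  where
    trim = dropWhileEnd isSpace . dropWhile isSpace

main :: IO ()
main = findLlvmBinDir ["llvm-config", "llvm-config-5.0"] >>= print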

Pattern synonyms

One idea is to add a way to splice an ordinary expression, possibly containing free variables, into a pattern. The expression would then be normalised, and we would check whether the result is a valid pattern. Example (using ' for splicing):

f '(replicate 3 610) = "Got three 610s!"

replicate 3 610 normalises to [610, 610, 610], which is a valid pattern.

One disadvantage of doing it this way is that features that are unavailable in expressions can't be used in pattern synonyms, e.g. @-patterns and view patterns.
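
A toy sketch of the "normalise, then read back as a pattern" step, using made-up Expr and Pat types rather than Sixten's internal ones:

-- Illustrative expression and pattern types; the real ones are richer.
data Expr = Var String | Lit Integer | Con String [Expr] | App Expr Expr

data Pat = PVar String | PLit Integer | PCon String [Pat]
  deriving Show

-- After normalising the spliced expression, read it back as a pattern if
-- possible: free variables become pattern variables, constructor applications
-- and literals become the corresponding patterns, anything else is rejected.
exprToPattern :: Expr -> Maybe Pat
exprToPattern (Var x) = Just (PVar x)
exprToPattern (Lit n) = Just (PLit n)
exprToPattern (Con c args) = PCon c <$> traverse exprToPattern args
exprToPattern (App _ _) = Nothing

main :: IO ()
main = print (exprToPattern (Con "Cons" [Lit 610, Var "rest"]))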

Typo (?) in license file

Hi! Just saw your project on r/haskell; as an enthusiast I have to admit it looks pretty cool 😄.

Looking at the license, I think you're going for the BSD 3-clause license, but the third line doesn't seem quite right... is that a typo?

Records

Row polymorphism would be nice, but it would also be nice to see how far we can get without it. Also keep lenses in mind.

Clean up metavar handling

There are some compiler passes that use the same metavariables that the typechecker uses even though they never do any unification. They should use something more restricted.

Add `check` command

Add a sixten check command to the command-line interface that runs the typechecker but doesn't go further, which will be faster than using compile when you just want to see if something typechecks.
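
A minimal sketch of what the subcommand could look like with optparse-applicative; the Command type and descriptions are placeholders, not the actual CLI:

import Options.Applicative

-- Hypothetical command type: check only typechecks, compile goes all the way.
data Command = Check FilePath | Compile FilePath

commandParser :: Parser Command
commandParser = hsubparser
  ( command "check"
      (info (Check <$> strArgument (metavar "FILE"))
            (progDesc "Typecheck FILE without generating code"))
  <> command "compile"
      (info (Compile <$> strArgument (metavar "FILE"))
            (progDesc "Typecheck and compile FILE"))
  )

main :: IO ()
main = do
  cmd <- execParser (info (commandParser <**> helper) fullDesc)
  case cmd of
    Check file -> putStrLn ("would typecheck " ++ file)
    Compile file -> putStrLn ("would compile " ++ file)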

Sanity typecheck the abstract syntax

After any pass that produces abstract syntax (currently typechecking and simplification), we should sanity-typecheck the result to find compiler bugs. This could perhaps be done only when a command-line flag is given.

Consider using a separate constructor namespace or convention

The Swift constructor syntax is MyType.MyCon, or .MyCon when MyType can be inferred from context. This means that constructors aren't in the same namespace as other identifiers, which works around the namespacing problem that forces type Tuple a b = MkTuple a b to use the ugly Mk prefix. The drawback is the ugly . prefix, but I'm sure that'll grow on me.

Build caching

We should cache the results of typechecking and compiling modules (or perhaps at an even finer granularity) so that we don't have to redo that work every time we compile something.

Normalisation of class methods during typechecking

Currently class instances are elaborated and filled in after the main typechecking has been done, which means that we can't rely on them reducing to something during typechecking even if the instance is known. This is a desirable feature if we e.g. want to use methods that return types in a way that relies on the actual type of the instance.

To fix this we should interleave class instance elaboration with typechecking. We will probably still want the post-typechecking elaboration pass, though, since unifications in distant parts of a function can affect the instances.

Write own error printer

Instead of using the trifecta one. That'd also be a good opportunity to switch to the prettyprinter library.
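
A small sketch of rendering an error with the prettyprinter package; the Error type and layout are placeholders:

import Prettyprinter
import Prettyprinter.Render.Text (putDoc)

-- A placeholder error type; the real one would carry a proper source span.
data Error = Error { errFile :: FilePath, errLine :: Int, errMessage :: String }

-- Render "file:line:" followed by the indented message.
prettyError :: Error -> Doc ann
prettyError err = vsep
  [ pretty (errFile err) <> colon <> pretty (errLine err) <> colon
  , indent 2 (pretty (errMessage err))
  ]

main :: IO ()
main = putDoc (prettyError (Error "Example.vix" 3 "expected a type") <> line)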

Method type signatures are ignored

If we have

instance Foo Bar where
  foo : Baz
  foo = ...

then Baz is completely ignored.

We should either use the signatures or disallow them.

Hackathon?

Hej!

Will there be a sixten hackathon soon? I could come visit in Oslo.

Cheers,
Dan

Inductive families/indexed types

So we can do more dependently typed goodness. If this is done, we can potentially also allow e.g.:

type Vector n a where
  Nil : Vector Zero a
  Cons : forall n. a -> Vector n a -> Vector (Succ n) a

Note that there's no Ptr on the tail of the Cons, meaning that this is an inductive definition of a flat vector. To get there, we can use pattern-matching on indices in the generated size-computing function for Vector. That function should be something like:

Vector Zero a = unitTypeRep
Vector (Succ n) a = productTypeRep a (Vector n a)

This also relies on adopting detagging (a la Edwin Brady) to remove the need for constructor tags (since we know what constructor we have just from the n index) and doing forcing to remove the n argument to the Cons constructor (since it can be recovered from the index).
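
For comparison, the same indexed type written as a Haskell GADT looks like this; the point in Sixten would be that the tail of Cons can be stored flat and the tag and index erased, which GHC does not do:

{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Zero | Succ Nat

-- Haskell analogue of the indexed Vector above; GHC still boxes the tail
-- and keeps constructor tags, which is exactly what this issue wants to avoid.
data Vector (n :: Nat) a where
  Nil  :: Vector 'Zero a
  Cons :: a -> Vector n a -> Vector ('Succ n) a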

Improve generalisation when mixing definitions with and without signatures in a binding group

As an example, we currently fail to typecheck the following program because apply1's ungeneralised type is unified with the local type variables in apply.

apply : forall f a b. (f a -> b) -> f a -> b
apply f x = apply1 f x

apply1 f x = apply f x

To fix this, the strategy for typechecking binding groups with and without signatures should be changed. We should typecheck and generalise the definitions without signatures first, assuming that the given signatures are correct. After generalising the definitions without signatures, we typecheck and generalise the ones with signatures.

This is in line with https://prime.haskell.org/wiki/RelaxedDependencyAnalysis.

Boxed data

Add a way to define boxed data, e.g.

boxed
type List a = Nil | Cons a (List a)

This would work almost as if every occurrence of List a was Ptr (List a), but

  1. The programmer doesn't need to explicitly dereference the pointers
  2. The size of the constructors doesn't need to be determined by the type, so e.g. boxed type Array a where MkArray : (n : Nat) -> Vector n a -> Array a would be allowed.
  3. We don't need to allocate space for the biggest constructor, but can instead just allocate enough to fit the constructor we're actually using, so types like Maybe and List will save some space in the Nothing/Nil constructors if they are boxed.
  4. The compiler is free to play tricks with the pointer and constructor tags since it knows that it has a pointer to a specific type. For example it can use the null pointer for List.Nil and elide the constructor tag for List.Cons, making the representation of List closer to what it would be in C or C++.

Propagate method type signatures from classes to instances

Let's say we have the following:

class MyClass a where
  myMethod : someType a

instance MyClass MyType where
  myMethod = impl

Currently we infer the type of impl and then check that it subsumes someType MyType. It would be better to check impl against someType MyType to start with, as we can then get better inference of constructor types and dependent types, and additionally get better code since the generalised version of impl may have implicit arguments in a different order than someType MyType.

Use a proper higher-order unification algorithm

We should use a proper higher-order unification algorithm when typechecking, instead of the current half-assed one.

Read up on it:

Nipkow: Functional unification of higher-order patterns
Gundry & McBride: A tutorial implementation of dynamic pattern unification
Norell & Coquand: Type checking in the presence of meta-variables
Cockx & Devriese & Piessens: Unifiers as Equivalences
Abel & Pientka: Higher-Order Dynamic Pattern Unification for Dependent Types and Records

HasSize, IsPointer, HasGC typeclasses

The size in memory of Pair a b is the size of a plus the size of b, and it's passed in registers or on the stack when used in a function.

Couldn't we implement this behavior with type classes internally? It has one major advantage: you don't have to pass the size (isPointer, GC strategy, etc.) along with the data unless it is really needed.

Right now (if I understand correctly) the sizes of arguments will be put on the stack even when monomorphic functions are evaluated, but we only actually need them for polymorphic functions.
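
A rough Haskell sketch of the suggestion, with the size as a class method so that monomorphic code resolves it statically and only polymorphic code threads a dictionary (HasSize and the sizes here are made up for illustration):

{-# LANGUAGE ScopedTypeVariables #-}

import Data.Proxy (Proxy (..))

-- Hypothetical class: each type knows its own size in bytes.
class HasSize a where
  sizeOf :: Proxy a -> Int

instance HasSize Int where
  sizeOf _ = 8

data Pair a b = Pair a b

-- The size of Pair a b is the size of a plus the size of b, computed from
-- the dictionaries instead of being passed alongside every value.
instance (HasSize a, HasSize b) => HasSize (Pair a b) where
  sizeOf _ = sizeOf (Proxy :: Proxy a) + sizeOf (Proxy :: Proxy b)

main :: IO ()
main = print (sizeOf (Proxy :: Proxy (Pair Int Int)))  -- prints 16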

Start doing testing of compiler components

Currently there are only end-to-end tests. We should have property and unit tests for smaller components of the compiler. An example is the simplifier, which should ideally be idempotent (it currently isn't, but testing for that is the first step towards solving the problem).
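
A toy QuickCheck property showing the shape of such a test; the Expr type and simplify below are stand-ins for Sixten's real ones, which would need their own generators:

import Test.QuickCheck

-- Toy expression type with a constant-folding simplifier.
data Expr = Lit Int | Add Expr Expr
  deriving (Eq, Show)

instance Arbitrary Expr where
  arbitrary = sized go
    where
      go 0 = Lit <$> arbitrary
      go n = oneof [Lit <$> arbitrary, Add <$> go (n `div` 2) <*> go (n `div` 2)]

simplify :: Expr -> Expr
simplify (Add a b) =
  case (simplify a, simplify b) of
    (Lit x, Lit y) -> Lit (x + y)
    (a', b')       -> Add a' b'
simplify e = e

-- Running the simplifier twice should change nothing compared to running it once.
prop_simplifyIdempotent :: Expr -> Bool
prop_simplifyIdempotent e = simplify (simplify e) == simplify e

main :: IO ()
main = quickCheck prop_simplifyIdempotent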

Type specialisation

Perhaps there should be a pragma or something to generate type-specialised versions of functions?

A nice thing is that these functions can be implemented by casing on the type representation, so we can pick the correct specialised version even for types that aren't statically known.
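
A small Haskell sketch of that kind of dispatch, using Typeable as a stand-in for Sixten's type representations:

{-# LANGUAGE ScopedTypeVariables, TypeOperators #-}

import Data.Typeable (Typeable, eqT)
import Data.Type.Equality ((:~:) (Refl))

-- Specialised code path for Int.
sumIntSpecialised :: [Int] -> Int
sumIntSpecialised = foldl (+) 0

-- Inspect the runtime type representation and pick the specialised version
-- when the element type turns out to be Int, otherwise fall back to generic code.
sumList :: forall a. (Typeable a, Num a) => [a] -> a
sumList xs =
  case eqT :: Maybe (a :~: Int) of
    Just Refl -> sumIntSpecialised xs
    Nothing   -> sum xs

main :: IO ()
main = print (sumList [1, 2, 3 :: Int])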

View patterns

They're already handled by the matching algorithm. Parsing and typechecking remain to be done.
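
For reference, this is what view patterns look like in Haskell; Sixten's surface syntax may end up different:

{-# LANGUAGE ViewPatterns #-}

import Data.Char (toLower)

-- The pattern applies `map toLower` to the argument and matches the result.
isGreeting :: String -> Bool
isGreeting (map toLower -> "hello") = True
isGreeting _ = False

main :: IO ()
main = print (isGreeting "HeLLo")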

Precise garbage collection

See if we can get the LLVM gcroot machinery to work. Something that looks like a potential problem is that there's currently no support for marking non-pointer (e.g. struct-like) data as possibly containing pointers, so we may have to resort to conservative stack scanning or something similar, and look into mostly-copying GC.

With a GC that scans for pointers we might decide to use pointer tagging to decide whether something is a pointer or not (and reserve a bit in ints as well). If that's the case we'll need a new type for pointers to byte arrays so that their data is not scanned for pointers.

Named implicit arguments

Perhaps names should even be mandatory when explicitly giving implicit arguments (but what about implicit arguments that stem from generalisation and are not user-written?).

Overloaded literals

String and integer literals should use some typeclass method so that they can be overloaded.
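
Haskell's analogue, where literals elaborate to class methods such as fromString and fromInteger; sketched here for strings:

{-# LANGUAGE OverloadedStrings #-}

import Data.String (IsString (..))

newtype Name = Name String
  deriving Show

-- With OverloadedStrings, the literal "Ada" below elaborates to
-- fromString "Ada" at type Name.
instance IsString Name where
  fromString = Name

main :: IO ()
main = print ("Ada" :: Name)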

At patterns

Allowing e.g. xs@(Cons x _) would be useful.

Error recovery

Throughout the whole frontend (parser and typechecker especially) we should try to continue working even in the face of errors. We should insert coercion terms that throw errors if something that errored is used, but otherwise try our best to create a runnable program even when there are errors, a la deferred type errors.

This means that we can report multiple errors (which is important for tooling), and that we can test programs while they're being written.

Optimise closure representation

Perhaps we can e.g. store the arity of the closure as a bitfield, where the highest set bit determines the arity and the remaining set bits mark which arguments have size 8 (assuming that word-sized data will be the most common).
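
One possible reading of that encoding, as a Haskell sketch (the exact layout is made up to illustrate the idea, not a committed design):

import Data.Bits (countLeadingZeros, finiteBitSize, testBit)
import Data.Word (Word64)

-- Decode a closure descriptor word: the position of the highest set bit is
-- the arity, and bit i (for i < arity) says whether argument i has size 8.
decodeClosureInfo :: Word64 -> (Int, [Bool])
decodeClosureInfo w
  | w == 0 = (0, [])  -- no bits set: treat as a nullary closure
  | otherwise = (arity, [testBit w i | i <- [0 .. arity - 1]])
  where
    arity = finiteBitSize w - 1 - countLeadingZeros w

main :: IO ()
main = print (decodeClosureInfo 5)  -- 5 is binary 101: arity 2, argument 0 has size 8, argument 1 does not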
