Cea

One of the core features of Haskell is immutability. It has greatly simplified the reasoning of Haskell programs, and together with the type system it can capture many bugs before they happen. However, immutability introduces overhead, and while the overhead is sometimes neglibible thanks to the various means of optimisations, there are situations where we want to avoid it.

In Haskell, a common way of achieving mutability is to use IORefs. However, IORefs are notoriously slow. In fact, they are often slower than pure StateTs.

A second way relies on Haskell's Foreign Function Interface (FFI), where we can have access to C pointers. Many common Haskell libraries use these pointers under the hood. However, working with raw, untyped pointers are error-prone, and it is easy to introduce memory leaks and segfaults. While Haskell has a Storable type class that offers basic types of pointers, it is only implemented for the basic types, and one have to write much boilerplate if they want to use custom data types.

The library cea is an attempt to provide a safe, high-level interface to C pointers. It can derive Pointable instances for most custom data types, and provides a variety of accessor functions that allows one to modify a certain field of a data structure without reading the whole data structure. All the type guarantees are checked at compile time, introducing minimum overhead over raw pointers.

Currently, it is still a prototype that can only handle non-recursive product types and arrays, but in the future I will extend it to support sum types to make it more useful.

Example

TODO

Basic Usage

To use Cea, you will need to import the following modules and enable the following extensions:

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE DerivingVia #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE UndecidableInstances #-}

import Cea.Pointer
import Cea.Pointer.Accessor

Pointable

Use make to create a pointer, use load and store to read and write to the pointer. Use delete to free the pointer.

main :: IO ()
main = do
  ptr <- make (0 :: Int)
  val <- load ptr
  print val -- 0
  store ptr 1
  val <- load ptr
  print val -- 1
  delete ptr -- free the pointer

After calling delete, the pointer is no longer valid and should not be used.

make can be used on any type that is an instance of Pointable, which is a partial generalisation of Storable. It has been implemented on many common data types, for example, tuples:

main :: IO ()
main = do
  ptr <- make (0 :: Int, 0 :: Int)
  val <- load ptr
  print val -- (0, 0)
  store ptr (1, 2)
  val <- load ptr
  print val -- (1, 2)
  delete ptr

To make custom data types an instance of Pointable, we need to derive Generic and Cea before we can derive Pointable:

data Foo = Foo Int Int
  deriving stock Generic
  deriving anyclass (Cea, Pointable)

data Bar = Bar Foo Foo
  deriving stock Generic
  deriving anyclass (Cea, Pointable)

Here Cea is a dummy type class that enables the generic derivation of Pointable. In most of the cases, we just need to derive Cea and Pointable for our custom data types, and the compiler will do the rest for us. In the rare occasion where customised Pointable instances are needed, we must not derive Cea as it will automatically derive the generic implementation of Accessible for us, which may lead to errors when working with custom Pointable. More details can be found in this section.

main :: IO ()
main = do
  ptr <- make (Bar (Foo 0 0) (Foo 0 0))
  val <- load ptr
  print val -- Bar (Foo 0 0) (Foo 0 0)
  store ptr (Bar (Foo 1 2) (Foo 3 4))
  val <- load ptr
  print val -- Bar (Foo 1 2) (Foo 3 4)
  delete ptr

In the example above, the fields of Bar, namely the two Foos, are stored as pointers, and the fields of Foo, namely the two Ints, are stored as values.

This is because custom types are considered as "indirect" types, and their fields are stored as pointers.

Finally, the delete function at the end of the do-block frees the structure recursively, so that we do not (and shouldn't) access the pointer of each fields and free them manually.

Built-in tuples, on the other hand, are treated as direct types, therefore ((1, 2), (3, 4)) is stored in one contiguous block of memory:

main :: IO ()
main = do
  ptr  <- make ((0, 0), (0, 0))
  val  <- load ptr
  print val -- ((0, 0), (0, 0))
  store ptr ((1, 2), (3, 4))
  val  <- load ptr
  print val -- ((1, 2), (3, 4))
  let ptr' = castPtr ptr :: Ptr (Int, Int, Int, Int)
  val' <- load ptr'
  print val' -- (1, 2, 3, 4)
  delete ptr

Note that we does not support deriving Pointable for sum types yet.

Accessor

Sometimes, we may only want to modify a certain field of a data structure. For example, we may want to increment the first field of a tuple, or the first field of a custom data type:

main :: IO ()
main = do
  ptr          <- make ((0, 0, 0, 0) :: (Int, Int, Int, Int))
  (a, b, c, d) <- load ptr
  store ptr (a + 1, b, c, d)
  delete ptr

This is not only tedious to write, but also introduces extra overhead since we need to read the whole tuple, modify it, and write it back.

Accessor functions provide ways to modify a certain field of a data structure without reading the whole data structure. For example, we can use access to acquire a pointer that points to the first field of a tuple:

main :: IO ()
main = do
  ptr  <- make ((0, 0, 0, 0) :: (Int, Int, Int, Int))
  ptr0 <- access @0 ptr
  store ptr0 1
  val  <- load ptr
  print val -- (1, 0, 0, 0)
  delete ptr

The type application @n specifies the n-th field. Note that the index starts from 0.

The access function comes from the Accessible type class, which is automatically provided for Cea and Pointable instances. Therefore, if the Pointable instance is not derived but an explicit customised implementation, we must not derive Cea and must implement Accessible manually. In the vast majority of the cases, of course, the derived instances are sufficient and we don't need to worry about implementing Accessible.

`access` is often used directly with a `load` or a `store`. In this case, we can use the shorthand functions `loadAt` and `storeAt`, which reads and writes to the pointer returned by `access`:

```Haskell
main :: IO ()
main = do
  ptr  <- make ((0, 0, 0, 0) :: (Int, Int, Int, Int))
  storeAt @0 ptr 1
  val  <- load ptr
  print val -- (1, 0, 0, 0)
  val0 <- loadAt @0 ptr
  print val0 -- 1
  delete ptr

If the fields has nested fields, we can use accesses to acquire a pointer that points to the nested field:

main :: IO ()
main = do
  ptr  <- make (((0, 1), (2, 3)) :: ((Int, Int), (Int, Int)))
  ptr0 <- accesses @[0, 0] ptr
  store ptr0 114
  val  <- load ptr
  print val -- ((114, 1), (2, 3))
  delete ptr

The type application @[n1, n2, ...] specifies the path to the nested field, namely the n1-th field's n2-th field's ... n-th field. Again the indices start from 0. In particular accesses @[] ptr is the same as the orginal poinrter ptr.

Similarly, we have the shorthand functions loadsAt and storesAt.

If we are using a custom data type with selector names, we can also use the selector names themselves to access the fields:

data Foo = Foo { foo1 :: Int, foo2 :: Int }
  deriving stock (Generic, Show)
  deriving anyclass (Cea, Pointable)

main :: IO ()
main = do
  ptr <- make (Foo 0 0)
  storeAt @"foo1" ptr 1
  val <- load ptr
  print val -- Foo {foo1 = 1, foo2 = 0}
  delete ptr

Array

There are two types of Pointable arrays, one has compile-time known size, and the other has runtime known size. To use arrays, import Cea.Array:

-- Previous imports...
import Cea.Array

Creation, Read & Write

To create an array with compile-time known size, we can use makeArr:

main :: IO ()
main = do
  arr <- makeArr @4 (0 :: Int)
  e0  <- readArr' @0 arr
  print e0 -- 0
  writeArr' @0 arr 1
  e0  <- readArr' @0 arr
  print e0 -- 1
  len <- arrLen arr
  print len -- 4
  deleteArr arr

makeArr takes a type application that specifies the size of the array, and a value that specifies the initial value of the array (the same initial value is used for all elements, similar to MArray's newArray).

To read and write to an array, we can use readArr' and writeArr', which takes a type application that specifies the index of the element to read/write. If we feed an index that is out of bound, the program will not compile.

The size of the array can be acquired by arrLen. In this case it is already known at compile time, but this function works with any array in general.

Finally, we use deleteArr to free the array. Note that the array is assumed to take ownership of all its elements, so we do not need to free the elements manually, and we should not free the elements manually.

To create an array with runtime known size, we can use makeArrFromList:

main :: IO ()
main = do
  arr <- makeArrFromList @Int [0, 1, 2, 3]
  e0  <- readArr 0 arr
  print e0 -- 0
  writeArr 1 arr 1
  e0  <- readArr 1 arr
  print e0 -- 1
  len <- arrLen arr
  print len -- 4
  deleteArr arr

Here we create an array directly from a list (or any Foldable), and the size of the array is determined by the length of the list. In this case we know that the length is 4, but in general the length cannot be determined at compile time.

To read/write the array, we can use readArr and writeArr, which takes an Int (as opposed to the type application in readArr' and writeArr'). If we feed an index that is out of bound, the program will throw an error. Note that these functions also work with arrays with compile-time known size; in fact, all functions that work with arrays with runtime known size also work with arrays with compile-time known size, but apparently not vice versa.

Take Snapshots

We can take a "snapshot" of the array an turn it into an immutable Array or list using loadArr or loadArrToList, similar to the freeze function for MArrays:

-- Assuming the relevant imports are already there
main :: IO ()
main = do
  arr  <- makeArrFromList @Int [0, 1, 2, 3]
  arr' <- loadArr arr
  print $ toList arr' -- [0, 1, 2, 3]
  list <- loadArrToList arr
  print list -- [0, 1, 2, 3]
  deleteArr arr

Access Element Pointers

So far, when we talk about writing to an array, we are overwriting the entire element, but sometimes our elements could represent a more complicated data type, and we only want to modify a sub-field of them. In this case, we can use the accessArr function to acquire a pointer to the element, and then use load and store to read/write to the element (as discussed in the previous section).

Assuming we have the following data type Student:

data Student = Student { name       :: String
                       , age        :: Int
                       , isDeanList :: Bool }
  deriving stock (Generic, Show)
  deriving anyclass (Cea, Pointable)

Here's an example of creating an array of two students, then adding them to the Dean's List:

main :: IO ()
main = do
  let bob = Student { name = "Bob", age = 20, isDeanList = False }
      tom = Student { name = "Tom", age = 21, isDeanList = False }
  arr    <- makeArrFromList [bob, tom]
  bobPtr <- accessArr 0 arr
  tomPtr <- accessArr 1 arr
  storeAt @"isDeanList" bobPtr True
  storeAt @"isDeanList" tomPtr True
  list   <- loadArrToList arr
  print $ all isDeanList list -- True
  deleteArr arr

For arrays with compile-time known size, we can also use accessArr' instead, which takes a type application that specifies the index of the element to access, similar to readArr' and writeArr'.

mmzk1526 / cea Goto Github PK

cea's Introduction

Cea

Example

Basic Usage

Pointable

Accessor

Array

Creation, Read & Write

Take Snapshots

Access Element Pointers

cea's People

Contributors

Stargazers

Watchers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent