libgc's Introduction

libgc

libgc is a garbage collection library for Rust. It can be used as a standalone library, but it is highly recommended that programs are compiled with the companion rustc fork, which offers language support for much better performance.

libgc is in active development - there will be bugs!

Example

libgc provides a smart pointer, Gc<T>, which can be used to make garbage collected values. An example, with the necessary global allocator setup, looks as follows:

use libgc::{Gc, GcAllocator};

// libgc must be the global allocator so it can see every heap allocation.
#[global_allocator]
static ALLOCATOR: GcAllocator = GcAllocator;

fn foo() -> Gc<Vec<usize>> {
    let foo = Gc::new(vec![1, 2, 3]);
    let _a = foo; // GC pointers are copyable, so `foo` stays usable
    let _b = foo;

    foo
}

fn main() {
    let _gc = foo();
}

Overview

If you want to write code with shared ownership in Rust, Rc makes this possible. Unfortunately, managing cyclic data structures with reference counting is hard: weak pointers are needed to break strong cycles and thus prevent memory leaks. In programs where these sorts of structures are common, garbage collection is a natural fit.
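The weak-pointer bookkeeping mentioned above can be sketched with the standard library alone. This is only an illustration of the Rc approach, not libgc code; the `Node` type and `build_tree` helper are hypothetical names chosen for the example.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// A tree node: children are owned strongly, the parent link is weak.
/// Making the parent link strong would create a cycle that never frees.
struct Node {
    value: u32,
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}

/// Builds a parent with a single child and returns both handles.
fn build_tree() -> (Rc<Node>, Rc<Node>) {
    let parent = Rc::new(Node {
        value: 1,
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    });
    let child = Rc::new(Node {
        value: 2,
        // The back-pointer is downgraded so it does not keep `parent` alive.
        parent: RefCell::new(Rc::downgrade(&parent)),
        children: RefCell::new(Vec::new()),
    });
    parent.children.borrow_mut().push(Rc::clone(&child));
    (parent, child)
}

fn main() {
    let (parent, child) = build_tree();
    // One strong owner of the parent (our handle), one weak back-pointer.
    assert_eq!(Rc::strong_count(&parent), 1);
    assert_eq!(Rc::weak_count(&parent), 1);
    // The child is owned by our handle and by the parent's `children` list.
    assert_eq!(Rc::strong_count(&child), 2);

    let p = child.parent.borrow().upgrade().expect("parent is still alive");
    println!("child {} has parent {}", child.value, p.value);
}
```

Every back-edge in the structure has to be identified and downgraded by hand like this, which is the burden a tracing collector removes.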

libgc is not a replacement for the single ownership model; it is intended to complement it by providing a garbage collected alternative for values which might be too difficult to manage with Rc. Values must opt in to garbage collection with the Gc::new(x) constructor. This tells libgc to heap-allocate x and to collect it for you when you're done with it. Gc can be thought of as a special Box type, where x's lifetime is unknown at compile-time. Periodically, the garbage collector will interrupt the program (known as "stopping the world") to see which Gc values are still in use, and drop those which aren't.

Garbage collection involves scanning parts of the stack and heap to look for live references to Gc values. This means that libgc must be aware of all heap allocated values, even those which aren't Gc. To do this, libgc has its own allocator, GcAllocator, which must be set as the global allocator when using libgc.

use libgc::GcAllocator;

#[global_allocator]
static ALLOCATOR: GcAllocator = GcAllocator;

Finalization

A Gc can be used to manage values which have a drop method. Like all tracing garbage collectors, libgc cannot provide any guarantees about exactly when a 'dead' value is dropped. Instead, once libgc has determined that a value is unreachable, its drop method is added to a drop queue, which is run on a parallel finalization thread at some point in the future. The order of finalization is intentionally undefined to allow libgc to run drop methods on values which contain cycles of Gc.
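To make the idea of a drop queue concrete, here is a rough sketch built only from the standard library. It is not how libgc is implemented: the channel stands in for the queue, and the `Resource` type and `run_finalizers` function are hypothetical names. It shows only that `drop` runs later, on a different thread from the one that created the value.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc;
use std::sync::Arc;
use std::thread;

/// A value with a `drop` method; it bumps a shared counter when finalized.
struct Resource {
    finalized: Arc<AtomicUsize>,
}

impl Drop for Resource {
    fn drop(&mut self) {
        self.finalized.fetch_add(1, Ordering::SeqCst);
    }
}

/// Hands `count` "dead" values to a finalizer thread and returns how many
/// `drop` calls had run once the queue was drained.
fn run_finalizers(count: usize) -> usize {
    let finalized = Arc::new(AtomicUsize::new(0));
    // The channel plays the role of the drop queue.
    let (queue, dead_values) = mpsc::channel::<Resource>();

    let finalizer = thread::spawn(move || {
        // Dropping each received value runs its `drop` method on this
        // thread, not on the thread that created the value.
        for value in dead_values {
            drop(value);
        }
    });

    for _ in 0..count {
        let value = Resource { finalized: Arc::clone(&finalized) };
        queue.send(value).expect("finalizer thread is running");
    }

    drop(queue); // closing the queue lets the finalizer thread exit
    finalizer.join().expect("finalizer thread panicked");
    finalized.load(Ordering::SeqCst)
}

fn main() {
    println!("finalized {} values", run_finalizers(3));
}
```

Because values cross to another thread before being dropped, this sketch only compiles for types that are `Send`.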

⚠️ You must not dereference a field of type Gc<T> inside Drop::drop. Doing so is unsound and can lead to dangling pointers. TODO: Add a lint for this and explain why in further detail.

Implementation

libgc is implemented using the Boehm-Demers-Weiser collector. It is a conservative, stop-the-world, parallel, mark-sweep collector.
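For readers unfamiliar with mark-sweep, the two phases can be illustrated with a toy heap. This is a sketch under strong simplifying assumptions and is not the Boehm-Demers-Weiser algorithm: it is precise rather than conservative, single-threaded, and objects refer to each other by index. The `Heap` and `Object` types are hypothetical.

```rust
/// A toy heap object: a mark bit plus the indices of objects it references.
struct Object {
    marked: bool,
    refs: Vec<usize>,
}

/// A toy heap. Freed slots become `None`.
struct Heap {
    objects: Vec<Option<Object>>,
}

impl Heap {
    fn new() -> Self {
        Heap { objects: Vec::new() }
    }

    /// Allocates an object referencing `refs` and returns its index.
    fn alloc(&mut self, refs: Vec<usize>) -> usize {
        self.objects.push(Some(Object { marked: false, refs }));
        self.objects.len() - 1
    }

    /// Adds a reference from one live object to another.
    fn add_ref(&mut self, from: usize, to: usize) {
        if let Some(Some(obj)) = self.objects.get_mut(from) {
            obj.refs.push(to);
        }
    }

    /// Mark phase: flag everything reachable from `roots`.
    fn mark(&mut self, roots: &[usize]) {
        let mut worklist: Vec<usize> = roots.to_vec();
        while let Some(idx) = worklist.pop() {
            if let Some(Some(obj)) = self.objects.get_mut(idx) {
                if !obj.marked {
                    obj.marked = true;
                    worklist.extend(obj.refs.iter().copied());
                }
            }
        }
    }

    /// Sweep phase: free unmarked objects and clear marks on survivors.
    /// Returns the number of objects freed.
    fn sweep(&mut self) -> usize {
        let mut freed = 0;
        for slot in self.objects.iter_mut() {
            match slot.take() {
                Some(mut obj) if obj.marked => {
                    obj.marked = false;
                    *slot = Some(obj); // survivor goes back in its slot
                }
                Some(_) => freed += 1, // unreachable: slot stays empty
                None => {}
            }
        }
        freed
    }

    /// Runs one full collection and returns the number of objects freed.
    fn collect(&mut self, roots: &[usize]) -> usize {
        self.mark(roots);
        self.sweep()
    }

    fn is_live(&self, idx: usize) -> bool {
        matches!(self.objects.get(idx), Some(Some(_)))
    }
}

fn main() {
    let mut heap = Heap::new();
    let a = heap.alloc(vec![]);
    let b = heap.alloc(vec![a]); // b -> a
    let c = heap.alloc(vec![]);
    let d = heap.alloc(vec![c]); // d -> c
    heap.add_ref(c, d); // c -> d: an unrooted cycle

    let freed = heap.collect(&[b]);
    println!("freed {} objects; a live: {}", freed, heap.is_live(a));
}
```

The unrooted cycle between `c` and `d` is reclaimed without any weak pointers, because reachability is computed from the roots rather than from reference counts.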

TODO: Expand

Known Issues

  • Single-threaded support only.
  • No Drop Lint to prevent unsound dereferencing of Gc typed fields.

Using libgc with rustgc

There are two repositories which make up the gc infrastructure:

  • libgc: the main library, which provides the Gc<T> smart pointer and its API.
  • rustgc: a fork of rustc with GC-aware optimisations. It can be used to compile user programs which use libgc, giving them better GC performance. Use of rustgc is not mandatory.

This separation between libgc and rustgc exists so that a stripped-down form of garbage collection can be used without compiler support.

TODO: Explain rustgc and its optimizations.

libgc's People

Contributors

bors[bot], jacob-hughes, ltratt

libgc's Issues

Multi-threading possible?

In the README.md of the repo, it says that the lib only supports single-threaded applications, but given that the inner type of a Gc pointer now requires Send and Sync, I wanted to check if this was still the case.

Gc type does not implement Send or Sync

There's currently an issue where the Gc type requires its inner type be Send and Sync, but the type itself does not implement these traits. As a result, any type which contains Gc pointers cannot itself be held inside a Gc pointer (at least not without explicitly implementing the traits).

Example:

struct Example(usize, Option<Gc<Example>>);

fn main() {
    let ptr = Gc::new(Example(5, None));
}

This code would fail to typecheck, as the type NonNull<gc::GcBox<Example>> does not implement Send or Sync. Manually implementing the traits on the Example type allows this program to run.

Is it possible to have Gc itself implement Send and Sync, or are these workarounds required?

Continual rebuilding?

Is it just me, or does the underlying C library get rebuilt on every change to Rust code? At least, the build is very slow compared to other Rust libraries, and given the small amount of Rust code, that seems an unlikely suspect to me!

DerefMut on Gc<T> allows undefined behaviour

As the Gc type implements DerefMut, one can write the following safe code:

use std::ops::DerefMut;

let gc_val = Gc::new(5);
let mut ptr1 = gc_val; // Gc is copyable, so both bindings alias one value
let mut ptr2 = gc_val;
let ref1 = ptr1.deref_mut();
let ref2 = ptr2.deref_mut();

At this point, there are now two simultaneous mutable references to the same value, which is undefined behaviour.
I would recommend the DerefMut trait be removed from Gc<T> and to leave only Deref, as only immutable references can be safely created for a type managing shared ownership. Those wishing to add mutability should wrap the value in a RefCell<T>, as one would with Rc<T>.
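As a sketch of the suggested pattern, here is the Rc<T> equivalent using only the standard library; whether Gc<RefCell<T>> behaves identically in libgc is an assumption this example does not test. The `shared_mutation` function is a hypothetical name for illustration.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Mutates a shared value through two handles. Returns the final value and
/// whether an overlapping second mutable borrow was refused.
fn shared_mutation() -> (i32, bool) {
    let shared = Rc::new(RefCell::new(5));
    let ptr1 = Rc::clone(&shared);
    let ptr2 = Rc::clone(&shared);

    let mut ref1 = ptr1.borrow_mut();
    // While `ref1` is live, a second mutable borrow is rejected at run time
    // instead of silently producing two aliasing `&mut` references.
    let second_refused = ptr2.try_borrow_mut().is_err();
    *ref1 += 1;
    drop(ref1); // release the first borrow

    // With the first borrow released, the other handle can mutate.
    *ptr2.borrow_mut() += 1;

    let value = *shared.borrow();
    (value, second_refused)
}

fn main() {
    let (value, second_refused) = shared_mutation();
    println!("value = {}, second borrow refused = {}", value, second_refused);
}
```

The aliasing rule is still enforced, only at run time: `borrow_mut` panics on an overlapping borrow, while `try_borrow_mut` returns an error that the caller can handle.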
