tc39 / proposal-structs


JavaScript Structs: Fixed Layout Objects


proposal-structs's Issues

Should structs be freezeable?

Something that came up when I was working on a spec draft is what the interaction between freezing and structs should be. Normally sealed objects can still be frozen afterwards.

However, perhaps struct instances should not be freezable? I believe JS engines may need to modify flags, etc. in the hidden class on Object.freeze (e.g., https://searchfox.org/mozilla-central/source/js/src/vm/Shape.cpp#776), and struct hidden classes are intended to be fixed.

Additionally, in the shared struct case, there is a possibility of one thread freezing an object concurrent with mutations occurring in another thread. Would this be a problem?
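For reference, the ordinary-object behavior mentioned above ("sealed objects can still be frozen afterwards") can be sketched in today's JavaScript. The sealed plain object below is only a stand-in for a struct instance; whether real structs would allow the freeze step is exactly the open question.

```javascript
// A sealed plain object stands in for a struct instance (struct syntax is not available today).
const point = Object.seal({ x: 1, y: 2 });

// Sealed properties are still writable.
point.x = 10;
const sealedButNotFrozen = Object.isSealed(point) && !Object.isFrozen(point);

// A sealed ordinary object can still be frozen afterwards.
Object.freeze(point);

// Reflect.set reports failure as `false` regardless of strict mode.
const writeSucceeded = Reflect.set(point, "x", 20);

console.log(sealedButNotFrozen, Object.isFrozen(point), writeSucceeded, point.x);
```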

host integration: SharedMessageQueue or similar

Shared structs work well if you restrict yourself to only shared objects in communications. The problem is that a lot of host types aren't sharable but are serializable and/or transferable. It would be nice if we could integrate cleanly with such types within a shared object.

As a simple way to get such behaviour, I'd propose a host type SharedMessageQueue that is itself a shared object, but can enqueue/dequeue serializable and transferable objects.

As a toy example, this is how one could make a simple shared canvas that can be written to from any thread:

shared struct SharedCanvas {
    static async #listenToQueue(commandQueue: SharedMessageQueue, canvas: OffscreenCanvas) {
         const ctx = canvas.getContext("bitmaprenderer");
         while (true) {
             const { kind, ...data } = await commandQueue.dequeue();
             if (kind === "display") {
                 ctx.transferFromImageBitmap(data.imageBitmap);
             }
         }
    }

    #commandQueue = new SharedMessageQueue();
    
    constructor(canvas: OffscreenCanvas) {
        SharedCanvas.#listenToQueue(this.#commandQueue, canvas);
    }

    display(imageBitmap: ImageBitmap): void {
        this.#commandQueue.enqueue({ kind: "display", imageBitmap }, [imageBitmap]);
    }
}

Interaction with object references from primitives

A curiosity more than an 'issue'. The Record&Tuple proposal could end up introducing primitives that contain references to objects.

What do we think should happen when these interact with shared structs?

shared struct class Structy {
  x;
  constructor(x) {
    this.x = #[1, 2, 3, Box({})];
  }
}

Would it immediately throw as this isn't deeply primitive?

EDIT: clarify shared struct

Can shared structs store SharedArrayBuffer?

So in the current proposal, shared structs can only have fields that are themselves primitives or other shared structs.

However, SharedArrayBuffer feels like it should be sharable, and it is rather important to be able to share, as it is the only mechanism through which Atomics operations other than the four available to shared structs would be available. Of particular note are Atomics.wait and Atomics.notify, which are important for implementing locks and such.

But currently SharedArrayBuffer probably couldn't be shared, as it is an object. In fact, the identity of a SharedArrayBuffer doesn't round-trip in general, as a new wrapper object is created every time it's cloned.

Having said that, could we still set them anyway? There are a few possible strategies I could see:

Respec SharedArrayBuffer to be a shared struct so that it can be shared like any other shared struct
- This means the SharedArrayBuffer would be a frozen object, so this might not be web compatible
Create wrappers when accessed on a shared struct, i.e. sharedStruct.someBuffer lazily creates a SharedArrayBuffer object wrapper when accessed
- Two ways of doing this: either revive the same object identity each time, or create a new object wrapper every time the field is accessed
Add a new shared struct object that can be transformed to and from a SharedArrayBuffer
- i.e. something like const sabSharedRef = sab.toSharedArrayBufferRef() / const sab = sharedStruct.sabRef.toSharedArrayBuffer()
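To illustrate why access to a SharedArrayBuffer matters here, below is a minimal sketch of the kind of lock word the issue has in mind, using only Atomics calls that exist today. The names tryLock and unlock are hypothetical helpers, not part of any proposal, and the blocking path (Atomics.wait in a loop on a worker thread) is left out so the sketch runs on a single thread.

```javascript
const UNLOCKED = 0;
const LOCKED = 1;

// One 32-bit lock word living in shared memory.
const lockWord = new Int32Array(new SharedArrayBuffer(4));

// Hypothetical helper: take the lock only if it is currently free.
function tryLock(word) {
  return Atomics.compareExchange(word, 0, UNLOCKED, LOCKED) === UNLOCKED;
}

// Hypothetical helper: release the lock and wake at most one waiter
// (a waiter would be a thread blocked in Atomics.wait(word, 0, LOCKED)).
function unlock(word) {
  Atomics.store(word, 0, UNLOCKED);
  Atomics.notify(word, 0, 1);
}

const first = tryLock(lockWord);   // lock was free
const second = tryLock(lockWord);  // lock is already held
unlock(lockWord);
const third = tryLock(lockWord);   // free again after unlock

console.log(first, second, third);
```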

Applying shared structs to existing code is difficult with no member functions

Babylon.js has many existing classes representing primitives such as vectors and colors, with member function operations such as scaleToRef (vector_b = vector_a * scalar) and addInPlace (vector_a += vector_b). I experimented a bit with applying shared structs to these primitives. Since shared structs don't support attaching any sort of code, I had to work around the existence of these operations and found none of the possible solutions particularly satisfying:

Convert the mutator operations to static. This would of course require updating all call sites, and for heavily used classes like vectors and colors, there would be many. It also sacrifices type safety: where before we knew that this was, for example, a color, with a static method we have no such guarantee. This is especially unfortunate given that the origin trial does not yet have language-level support for shared structs, which means TypeScript can't even do static checking at compile time.
Instead of converting the color class itself into a shared struct, keep it as a non-shared object but use a shared struct as backing storage under the hood. This works well for a given class in isolation but falls apart when objects are composed together. For example a Particle object might have several colors and vectors as members. In order to process that Particle on a worker thread, we would need some way to extract and pass over a graph of shared structs that underpin a corresponding graph of non-shared objects. That implies a degree of classes having familiarity with each others' internals that I found uncomfortable.
Introduce a means of serializing non-shared classes into shared structs and deserializing them on a worker thread. That may or may not be faster than passing non-shared objects via postMessage or flattening to SharedArrayBuffer, but it certainly isn't going to realize the full potential of sharing across threads.

My intuition after this exercise is that in order for shared structs to apply cleanly to existing codebases, we'll need some way of attaching functions to them. I recognize that there are challenges to doing this, and for what it's worth I think it would be okay if a function attached to a shared struct doesn't have the full set of capabilities that a function attached to a non-shared object has. It's already the case that shared structs are limited - they are of fixed shape - and having limited-capability member functions would follow that precedent. I think the only piece that's needed would be for a member function on a shared struct to be treated as a static function with an implicit this parameter.
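The "static function with an implicit this parameter" shape from the last sentence can be sketched in today's JavaScript. Everything here is an assumption for illustration: makeVector3 and Vector3Ops are hypothetical names, and a sealed plain object stands in for a shared struct instance. The operation names scaleToRef and addInPlace are borrowed from the Babylon.js description above.

```javascript
// Sealed plain object standing in for a fixed-shape shared struct instance.
function makeVector3(x, y, z) {
  return Object.seal({ x, y, z });
}

// Hypothetical operations that live outside the data and receive it as `this`.
const Vector3Ops = {
  // result = this * scale
  scaleToRef(scale, result) {
    result.x = this.x * scale;
    result.y = this.y * scale;
    result.z = this.z * scale;
    return result;
  },
  // this += other
  addInPlace(other) {
    this.x += other.x;
    this.y += other.y;
    this.z += other.z;
    return this;
  },
};

const a = makeVector3(1, 2, 3);
const b = makeVector3(0, 0, 0);

Vector3Ops.scaleToRef.call(a, 2, b); // b = a * 2
Vector3Ops.addInPlace.call(a, b);    // a += b

console.log(a, b);
```

The call sites still change compared with a.scaleToRef(2, b), which is the ergonomic cost the issue describes; language-level member functions would hide the explicit .call.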

Breaking Proxy?

Proxy is designed to be indistinguishable from normal objects. If structs have typeof x === "object", then a Proxy of a struct can be detected with the following code.

function isProxyOfStruct(x) {
    try {
        struct class extends x {}
        return true
    } catch {
        return false // Nah, it's a proxy
    }
}

Propose new syntax

I once suggested a new simplified syntax that would be perfect for this case.

Instead of:

struct class Point {
  x;
  y;
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }
  // ...methods...
}

how about this?

class Point(x, y) {
 // ...methods...
}

struct Point(x, y) {
 // ...methods...
}

link to original suggestion

Is it not somehow redundant with the record & tuple proposal?

It looks very similar to the record & tuple proposal: https://github.com/tc39/proposal-record-tuple

Your proposal uses prototypes and inheritance instead of plain objects. However, wouldn't it be better to use a 'record' instead of a struct (for performance, readability, and immutability)? Am I missing something?
What is interesting in this proposal is the possibility to "share" these objects, but we could apply the same logic to records instead.

Also add support for bit fields or other underlying POD functionality.

Such as char, uchar, int8, int16, int32, int64 and so on?
Packing is also needed :)

How can I identify a struct, and whether it's shared?

My understanding is that a struct is just a sealed object with null, or another struct, as a [[Prototype]] (shared structs may have an object with methods as their [[Prototype]]).

Given some x, how can I determine if it's an unshared struct, or a shared struct, vs a non-struct?
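To make the question concrete, here is a sketch of a check built only from the observable properties described above (sealed, with null or another such object as [[Prototype]]). The function name looksLikeStruct is hypothetical. An ordinary object can satisfy the same check, which suggests that a dedicated brand check would be needed to tell structs apart.

```javascript
// Hypothetical heuristic: sealed object whose prototype is null or another such object.
function looksLikeStruct(x) {
  if (typeof x !== "object" || x === null) return false;
  const proto = Object.getPrototypeOf(x);
  return Object.isSealed(x) && (proto === null || looksLikeStruct(proto));
}

// An ordinary object that mimics the described shape.
const impostor = Object.create(null);
impostor.x = 1;
Object.seal(impostor);

const plain = { x: 1 };

console.log(looksLikeStruct(impostor), looksLikeStruct(plain));
```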

Recommend explicit `extends null` for shared structs

While it may be a little early for in depth syntax debates, if shared functions is something we may consider in the future then we might want to ensure that shared structs can eventually have something like a [[Prototype]] chain of other shared structs with shared methods/accessors (unless we essentially just flatten those as well). It would be nice to be able to have instanceof and constructor and to be able to share the shared struct constructor function.

As such, I would recommend that if we continue to use class-like syntax, that we might want to consider enforcing that shared structs have an explicit extends null to enforce the current [[Prototype]] semantics, otherwise we might not be able to change that in the future.

Example (based on current explainer):

shared struct class Foo extends null {
  ...
}
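For comparison, this is how extends null behaves for ordinary classes today. It is only background for the suggestion above; whether shared structs would follow the same rules is not something the explainer settles.

```javascript
// An ordinary class with an explicit null prototype parent.
class Foo extends null {}

// Instances would have no Object.prototype in their chain.
const prototypeParentIsNull = Object.getPrototypeOf(Foo.prototype) === null;

// The implicit constructor calls super(), and there is no super constructor to call.
let constructError = null;
try {
  new Foo();
} catch (error) {
  constructError = error;
}

console.log(prototypeParentIsNull, constructError instanceof TypeError);
```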

Shared structs vs Serializable/Transferable objects

This is an issue covering quite a large idea. However, I think it would be considerably more ergonomic than the much lower-level shared struct idea, while still permitting high-efficiency shared structs where no code sharing is used.

The idea is as the title suggests: instead of having very basic "shared structs", we expose the notion of serializable/transferable as a first-class concept within the JS language itself. This would allow authors to implement a rich class of objects, rather than the rather painful status quo of marshalling such objects into lower-level serializable/transferable values.

As an example, consider a host object like OffscreenCanvas returned from a call to .transferControlToOffscreen(). Such an object has an implicit shared memory buffer that the owning thread writes through the OffscreenCanvas abstraction, while the renderer reads it from an entirely different thread.

Now to give an example of how such an API could potentially look, I give below an example of a theoretical version of AbortSignal that is transferable, involves shared state, but is otherwise compatible with the existing API:

// NOTE: In this example I am going to use ${Name} to indicate
// places where free variables are initialized based on the
// thread

// We declare the class as serializable struct, this gives it all
// the following super-powers that enable it to be cloned
// across threads, NOTE that we do require this class to also be a struct
// as we cannot dynamically add properties to the class
// 
// One of the first things to note is that this serializable declaration
// applies to the WHOLE class, it causes the class, all prototype methods, all
// static methods, the constructor and beyond to inherit serializable semantics,
// the meaning of these semantics will become clearer as you follow the example
// 
// Now by serializable extending to the whole class it means many things are
// well founded, for example suppose we received an AbortSignal as defined below:
// self.onmessage = (event) => {
//   const abortSignal = event.data.signal;
//   // This class is available and fully operational
//   // because of the shared semantics the entire class
//   // can be cloned into this thread
//   const AbortSignal = abortSignal.constructor;
//   const newSignal = new AbortSignal();
// }
// 
// Now one of the first things to notice about this declaration itself is that
// we can subclass ANY objects, not just transferable ones, when the AbortSignal
// class is transferred into a thread (either directly, or indirectly via an instance)
// we lookup the free variable in the new thread and initialize it as such
// i.e. the ${EventTarget} acts as a kind of template, which is filled in by
// values on the thread this value has been received on
serializable struct class AbortSignal extends ${EventTarget} {
    // Upon serialize to another thread, this property is also structured serialized
    // in this case as it's just a shared memory, it becomes usable as normal
    // 
    // Ideally we would be able to sugar this up somehow, like instead of
    // explicitly making a SharedArrayBuffer and manipulating it, we could just declare
    // shared #aborted = false;
    // however then we need to extend Atomics to private fields somehow, as this is just
    // sugar it does not change the conceptual design of this example so I'm omitting it
    #aborted = new Int32Array(new SharedArrayBuffer(4));

    // No AbortController in this example, so we'll just use the revealing constructor
    // pattern to provide abort capability, note that the callback we pass into
    // the start function is not in any way shared, in fact it lives only on
    // the thread where we actually created the AbortSignal and is not serialized
    // or transferred in any way as basic closures are not transferred
    //
    // The constructor here is quite special in that on deserialization of such objects
    // on another thread the constructor will be called on an ALREADY INITIALIZED instance
    // of the class
    constructor(start) {
        // We have a new meta-property (or something) available in the constructor
        // of serializable classes, if we are deserializing this value from another thread
        // then the value of "this" will already be partially defined, upon calling
        // super() the "this" value in the constructor will become new.serializedThis
        // exactly, the reason this is available now is so that we can access fields with
        // data that are needed to be passed up into any superclass
        // in this case I'm just logging it for illustrative purposes as EventTarget
        // accepts no arguments so we don't have anything to do here
        console.log(new.serializedThis);
    
        // super() behaves fairly specially here as well
        // in particular if Superclass is also serializable
        // it will ALSO be called in deserialization mode rather
        // than being called as a constructor normally, when this happens
        // field initializers DO NOT RUN, as the data is already available
        // on the new.serializedThis
        super();
        
        // If we already had an instance from another thread, then the constructor
        // has already been called there, so we don't need to initialize it again
        if (!new.serializedThis) {
            // This function simply closes over the object in the usual way
            // this is fine as this function isn't transferred over the thread
            const abort = () => this.#abort();
            start(abort);
        }
        // Regardless of thread, we need to observe the #aborted 
        this.#listenForAbort();
    }
    
    // This is an ordinary method, it is simply cloned by definition to other threads,
    // note that it won't actually be called on other threads unless they create
    // new abortSignals (i.e. using signal.constructor)
    #abort() {
        // This logic isn't overly defensive as writing this
        // value is only done on a single thread
        if (Atomics.load(this.#aborted, 0) === 1) {
            // Already aborted so do nothing
            return;
        }
        // We set the abort on the signal
        Atomics.store(this.#aborted, 0, 1);
        // Notify all threads that their abort signal needs to fire an event
        Atomics.notify(this.#aborted, 0);
    }
    
    // This is, on the whole, a regular method that simply returns
    // the value stored in this.#aborted, it is cloned purely by
    // its definition so doesn't need any special treatment
    get aborted() {
        return Boolean(Atomics.load(this.#aborted, 0));
    }
    
    // Now this is the MOST IMPORTANT magic of what serializable enables, essentially
    // this is how we are able to initialize our objects when they are received on
    // other threads, essentially this "function"-block-thing is called when a thread
    // deserializes an AbortSignal object
    deserialize {
        // On other threads, the constructor was never called for this object
        // so our post-deserialization steps are simply to register to listen
        // out for aborts on our #aborted field
        this.#listenForAbort();
    }
    
    // Again, just another bog-standard method, this is cloned simply by redefining
    // this function at the destination
    #listenForAbort() {
        const { async, value } = Atomics.waitAsync(
            this.#aborted,
            0,
            // If the abort signal has been aborted already then this will cause
            // waitAsync to return synchronously
            0,                
        );
        // If the signal is not already aborted, we'll fire an event when it
        // eventually does become aborted
        if (async === true) {
            value.then(() => this.dispatchEvent(new Event("abort")));
        }
    }
}

// Nothing special about this really, the inner function creates a local
// closure to the local value of abortSignal
const abortSignal = new AbortSignal(abort => {
    setTimeout(abort, 5000);
});

// We can send the signal to another thread, what this does is 
worker.postMessage({ abortSignal });

self.addEventListener("message", (event) => {
    // An abort signal from another thread, entirely up and ready to go, prior
    // to firing this event the object was entirely deserialized into the current thread,
    // future events won't need to repeatedly deserialize the whole AbortSignal class
    // however as it can just be cached
    const abortSignal = event.data.abortSignal;

    // We can call all methods and such as per normal, as super() was called to initialize
    // abortSignal as an EventTarget in this thread, it has become an EventTarget
    // also in this thread
    abortSignal.addEventListener("abort", () => {
        console.log("Aborted!");
    });
});
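The #listenForAbort step relies on the two result shapes of Atomics.waitAsync, which can be tried in isolation. This is a minimal sketch, assuming a runtime that ships Atomics.waitAsync (recent V8-based engines do); the variable names are illustrative and nothing here depends on the serializable syntax above.

```javascript
// One shared 32-bit flag: 0 = not aborted, 1 = aborted.
const flag = new Int32Array(new SharedArrayBuffer(4));

// The flag still holds the expected value 0, so the wait is asynchronous:
// `async` is true and `value` is a promise that settles on notify.
const pending = Atomics.waitAsync(flag, 0, 0);
const pendingIsAsync = pending.async === true && pending.value instanceof Promise;

// "Abort": set the flag and wake the waiter.
Atomics.store(flag, 0, 1);
Atomics.notify(flag, 0);

// The flag no longer matches the expected value 0, so the call returns
// synchronously and no event listener needs to be scheduled.
const late = Atomics.waitAsync(flag, 0, 0);

console.log(pendingIsAsync, late.async, late.value);
```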

Potential use case in webGPU

The WebGPU proposal has mentioned that they want something like this:

This should allow for example for GPUTextures created on one worker to be instantly visible to other workers.

The v1 of that proposal does not have multithreading, but it may become relevant to a future version.

Is this a replacement for https://github.com/tschneidereit/proposal-typed-objects?

Atomics, shared struct fields, and references

In the slides there's an example of a possible future API for using shared structs with Atomics. I had been considering the same thing with https://github.com/rbuckton/proposal-struct and https://github.com/rbuckton/proposal-refs. The examples in the slides are as follows:

Atomics.store(sharedBox, 'x', 42);
Atomics.load(sharedBox, 'x');
Atomics.exchange(sharedBox, 'x', 84);
Atomics.compareExchange(sharedBox, 'x', 84, 42);

The approach I had been considering would have used ref instead:

Atomics.store(ref sharedBox.x, 42);
Atomics.load(ref sharedBox.x);
Atomics.exchange(ref sharedBox.x, 84);
Atomics.compareExchange(ref sharedBox.x, 84, 42);

That said, it's not necessary that we take a dependency on https://github.com/rbuckton/proposal-refs. Should that proposal be accepted once I've had the opportunity to present it, I had intended to introduce ref-style "overloads" for Atomics for typed arrays as well, i.e.: Atomics.store(ref int32Array[0], 42), so in essence both patterns could exist.
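For comparison, the same four operations already exist for typed arrays over shared memory, addressed by index instead of by field name or ref. The sketch below mirrors the sequence from the slides using today's API; the variable names are illustrative.

```javascript
// A single shared 32-bit slot standing in for sharedBox.x.
const int32Array = new Int32Array(new SharedArrayBuffer(4));

Atomics.store(int32Array, 0, 42);
const loaded = Atomics.load(int32Array, 0);

// exchange returns the previous value and writes the new one.
const previous = Atomics.exchange(int32Array, 0, 84);

// compareExchange writes 42 only if the slot currently holds 84,
// and returns the value it found either way.
const found = Atomics.compareExchange(int32Array, 0, 84, 42);

console.log(loaded, previous, found, Atomics.load(int32Array, 0));
```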

Can non-shared structs have methods? Is an explicit constructor needed?

Hi,

I'm failing to see how non-shared structs are different from, say something like this:

class Box {
  x; // Initialized to undefined
  
  #_ = void Object.seal(this); // Sealed

  constructor(x) {
    this.x = x;
  }
}
Object.freeze(Box);
Object.freeze(Box.prototype);

const box = new Box(123);
box.y = 123; // TypeError: Object is non-extensible
box.x = 123; // Ok

It's also not very clear to me why would non-shared structs be allowed to have methods. It wouldn't make them much different from classes besides from the internally "fixed layout" thing (sealed).

Personally, I think there are more use cases for structs that resemble factory functions/constructors for "fixed layout" "plain objects". Ideally with no need for an explicit constructor function:

struct class Box { x; y = 0; };     // No constructor (has non-overridable constructor)

const box1 = new Box();             // Box { x: undefined, y: 0 }
const box2 = new Box({x: 1, y: 2}); // Box { x: 1, y: 2 }
const box3 = new Box({x: 1, O: 9}); // Box { x: 1, y: 0 }
const box4 = new Box(box2);         // Box { x: 1, y: 2 }

Object.isSealed(box4); // true
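The constructor-less behavior described above can be approximated in today's JavaScript with a small factory. This is only a sketch under stated assumptions: defineStruct is a hypothetical helper, instances are sealed plain objects rather than real structs, and a call without new replaces the new Box(...) syntax.

```javascript
// Hypothetical helper: builds a factory for sealed objects with a fixed set of fields.
function defineStruct(defaults) {
  const keys = Object.keys(defaults);
  return function create(init = {}) {
    const instance = {};
    for (const key of keys) {
      // Copy only declared fields; unknown keys on `init` are ignored.
      const provided = Object.prototype.hasOwnProperty.call(init, key);
      instance[key] = provided ? init[key] : defaults[key];
    }
    return Object.seal(instance);
  };
}

const makeBox = defineStruct({ x: undefined, y: 0 });

const box1 = makeBox();               // defaults only
const box2 = makeBox({ x: 1, y: 2 }); // both fields provided
const box3 = makeBox({ x: 1, O: 9 }); // unknown key O is dropped
const box4 = makeBox(box2);           // copy from another instance

console.log(box1, box2, box3, box4, Object.isSealed(box4));
```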

That structs can only extend structs makes sense. To clarify, can a class extend a struct? Personally, that would make sense to me.

class CardBox extends Box {}

const box5 = new CardBox(); // CardBox { x: undefined, y: 0 }

Object.isSealed(box5); // false

Now shared structs could have the same semantics as non-shared ones, with the only difference being that they are only allowed to contain primitives or other shared structs:

struct class Box { x = 0; y = 0; }
struct shared class Asd { x = 0; }

const box1 = new Box();             // Box { x: 0, y: 0 }
const box2 = new Box({x: () => 0}); // Box { x: function, y: 0 }
const asd1 = new Asd(box1);         // Asd { x: 0 } <- x is primitive
const asd2 = new Asd(box2);         // Error (x is not shared struct or primitive)

This would - imho - reduce the confusion around non-shared structs vs sealed classes vs shared structs. And the syntax is minimal.

I see structs as a way to define plain-object data structures with a fixed layout (for the engine) that can be passed around from method to method, minimizing the chances of de-optimizations. In this proposal shared structs seem to go one step further to allow passing whole structs across threads.

I have written a separate proposal here Data Structures: struct (as a way to put my thoughts down). Maybe it's redundant and initially I thought my proposal had a separate goal from this one here but after trying to read and understand more about non-shared structs, maybe we're talking about the same things.

Major advantages over manually sealing a class?

I can understand the use case of having a shared struct being used in conjunction with WASM.

But, I'm struggling to figure out the value of a non-shared struct. Compare these two examples:

struct class Point {
  x
  y
  constructor(params) {
    Object.assign(this, params)
  }
}

class Point {
  x = undefined
  y = undefined
  constructor(params) {
    Object.seal(this)
    Object.assign(this, params)
  }
}

The two act almost exactly the same. The main differences I can see are:

Perhaps the engine can optimize the struct more easily.
An immutable prototype.

Are these the only benefits of having this non-shared struct syntax? Or do you see other benefits I'm not seeing?
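The second difference can be observed with the manually sealed class today. The sketch below reuses the sealed Point class from the comparison; the norm method is a hypothetical addition for illustration. Sealing fixes the instance's own fields, but the class prototype stays open to later modification.

```javascript
// The manually sealed class from the comparison above.
class Point {
  x = undefined;
  y = undefined;
  constructor(params) {
    Object.seal(this);
    Object.assign(this, params);
  }
}

const p = new Point({ x: 3, y: 4 });

// The instance layout is fixed: adding a field fails
// (Reflect.set reports `false` regardless of strict mode).
const canAddField = Reflect.set(p, "z", 5);

// The prototype is not: a method can still be patched in after the fact.
Point.prototype.norm = function () {
  return Math.hypot(this.x, this.y);
};
const canPatchPrototype = typeof p.norm === "function";

console.log(canAddField, canPatchPrototype, p.norm());
```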

I'm mostly wanting to make sure I understand this proposal correctly.

Potential for more minimalism

I'm still curious about https://twitter.com/annevk/status/1433688193321291778. It seems to me that shared structs (or at least the additional restrictions they add) would be a strictly simpler starting point than having both shared and non-shared structs.

And when I look at the motivation I only see compelling cases for shared structs.