Code Monkey home page Code Monkey logo

Comments (41)

jfbastien avatar jfbastien commented on June 15, 2024

@lukewagner suggests that we can also spec things in a well-defined way (return and wild write can't cause random control flow), and some implementations could be non-compliant in that respect.

from design.

sunfishcode avatar sunfishcode commented on June 15, 2024

It would also be useful not to conflate independent concerns here. Running outside a browser is different than running code untrusted (the context for my remark about node.js), which is different than supporting multiple sandboxing models.

from design.

jfbastien avatar jfbastien commented on June 15, 2024

I don't want NaCl / MinSFI specifically. I explicitly want to run outside the browser.

Agreed this mixes concerns: are we specifying the Web's security model? Is it OK to not follow that model?

from design.

BrendanEich avatar BrendanEich commented on June 15, 2024

You might underspecify a bit, to avoid dragging in too much relatively-recent/shifting stuff (CORS, CSP).

Dave Wagner (UCB) once said to me "I believe in object capabilities" and I know there's a cult-ish aspect but directionally he's right: integrity by least authority, safety first. Needs tools like type and other (semi-static) proof systems to scale up, Hobbit-y OCAP always goes wrong at scale because someone passes the DOM around (too convenient ;-).

That was a bit vague, sorry. I wanted to throw it out for consideration, anyway.

To swerve back to the concrete: properties like no ROP via control stack overwrite seem well worth keeping in constrained scope. I still don't have a formalized grip on all the safety properties at play, but I've been talking to Luke about it on the side a bit.

/be

from design.

lukewagner avatar lukewagner commented on June 15, 2024

If WebAssembly, the spec, doesn't provide a "trusted callstack", then even if we can say "no nasal demons", the majority of ops specified by the spec will not have deterministic semantics. E.g., Return won't mean "return to the caller", it will mean "jump anywhere" and GetLocal will mean "return anything".

It was suggested IRL by @jfbastien et al that we could go the ARM route of specifying "implementation-defined behavior". What that effectively means is that you cannot reason about your program given only the WebAssembly spec. Rather, you need to include a small additional spec provided by the "implementation" that says what happens when you evoke implementation-defined behavior. Effectively, the WebAssembly spec would be a spec template which had to be instantiated before use.

Given that the Web is the #1 primary use case here, I don't think we should make this eager generalization as it will bog down our reasoning and distract us from shipping a MVP.

Rather, here is a route I can see working:

  1. we start with a trusted callstack (and thus control-flow integrity, etc) in v.1
  2. non-Web environments that want to use a different security models can simply do so and be incorrect w.r.t the spec
  3. If such a non-Web environment gets enough momentum and real-world usage to justify the parameterization of the spec, we can do that in a backwards-compatible way (that doesn't change semantics for the Web); it would basically just be a reverse β reduction.

The MVP model for v.1 suggests we take this route and avoid over-eager generalization; I don't see this leading us down a path that inhibits the later generalization via parameterization.

from design.

lukewagner avatar lukewagner commented on June 15, 2024

Oh, but to answer the question in the issue's title: "Yes, totally". We state this in HighLevelGoals.md and have a specific section in V1.md. [Edit: oops, I was going off the issue title in my email: "Can Web Assembly run outside browsers?". I see the title has changed]

from design.

sunfishcode avatar sunfishcode commented on June 15, 2024

"The Web's security model" covers things like CORS and lots of API-specific concerns which are great to talk about, but they're sufficiently independent topics, so we don't need to let them defocus us here.

"Run outside the browser" has been in the high-level goals from the very beginning, and there are already several interesting discussions happening elsewhere about APIs, feature testing, dynamic libraries, and so on, so we don't need to let that defocus us here either.

Also, the concern here is not just about security. The questions raised in #75 and #102 were specifically about whether we want to write the spec in a way that can accommodate NaCl-style sandboxing instead of MinSFI-style sandboxing. There is a security angle involved, but there's also a portability angle, and a basic-sanity angle.

And finally, I don't see any value in us working to facilitate distribution of portable binary-only node.js modules that can run without sandboxing or with weakened sandboxing, if I understand what you were suggesting in #75. That's not in our HighLevelGoals, it's not something I've heard anyone involved express concern for (until now), there are better alternatives for the node.js use cases I'm familiar with no matter how much we compromise, and I especially don't want to weaken the platform for the things that are in our HighLevelGoals to do it. If you want us to pursue this use case, please file a new issue about it so that we can discuss it independently.

from design.

sunfishcode avatar sunfishcode commented on June 15, 2024

A general sentiment I'm hearing from numerous and diverse sources is well captured in @titzer's remark, "we should avoid UB like the plague". UB in this sense, sometimes referred to as "nasal demons", is undesirable in an application platform because it's an extreme on the spectrum of possible observable differences between implementations, and since it make it difficult to reason about what state an application might be in.

A trusted stack means that return addresses are outside the application address space. That nullifies a nontrivial nest of notoriously nasty nasal demons, and is a strong asset of the platform. Function tables mean that indirect calls can never jump into garbage memory or into the middle of another function. Putting features like these in the spec makes the entire platform far easier to reason about. I think it's worth the performance overhead today (which we don't presently have a good measure of, though we do have decent upper bound approximations for), and in the long term, I'm optimistic that we'll be able to reduce the overhead.

from design.

jfbastien avatar jfbastien commented on June 15, 2024

This isn't NaCl or MinSFI specifically. The issue does mix many concerns (security, UB, other platforms), and NaCl/MinSFI are potential targets, but not the only ones. I'd like the title to reflect this.

from design.

sunfishcode avatar sunfishcode commented on June 15, 2024

I'd like the title to ask a simple direct question that we might answer, and that will answer #75 and #102, rather than to present a vague abstraction that we'll likely get lost in.

from design.

sunfishcode avatar sunfishcode commented on June 15, 2024

I propose we answer this question in accordance with @jfbastien's own remarks, at the start of #75:

"We want to avoid all forms of undefined behavior which can lead to nasal demons, and instead discuss how the wasm platforms allows for implementation defined behavior and what acceptable behavior is."

This is a popular sentiment that I've heard in numerous places.

One of the cornerstones of achieving this is a trusted stack. In an untrusted-stack environment, return addresses are in the application address space, and it is difficult to put a bound on the behavior of an application if a return address is overwritten.

from design.

jfbastien avatar jfbastien commented on June 15, 2024

Agreed we want to avoid nasal demons, but a trusted stack is not the only way to do so. Mandating it heavily biases what Web Assembly implementations can do.

from design.

sunfishcode avatar sunfishcode commented on June 15, 2024

Please elaborate on how you intend to bound application behavior in an environment where return addresses are on the application-accessible stack.

from design.

titzer avatar titzer commented on June 15, 2024

On Thu, Jun 4, 2015 at 8:03 PM, JF Bastien [email protected] wrote:

Agreed we want to avoid nasal demons, but a trusted stack is not the
only way
to do so. Mandating it heavily biases what Web Assembly
implementations can do.

Yep. We're going for a spec without UB. This seems impossible without a
trusted stack for return addresses and local variables.


Reply to this email directly or view it on GitHub
WebAssembly/spec#107 (comment).

from design.

lukewagner avatar lukewagner commented on June 15, 2024

I agree that there is a lattice of nasal demos and that C++ nasal demos are > NaCl nasal demos... but still both destroy the relatively deterministic semantics we have now.

from design.

jfbastien avatar jfbastien commented on June 15, 2024

UB is not nasal demons.

NaCl is one example of this: return addresses are on the application-accessible stack, yet a misbehaving application cannot format your hard drive. NaCl has UB (or rather, implementation-defined behavior) yet no nasal demons.

A trusted stack is one way to reach our goal of avoiding UB. There are others, and I think we want to allow implementation-defined behavior for this reason.

Agreed with @lukewagner, in some cases we may say "some implementations may be non-conformant", and e.g. a node.js version may opt for this. I simply don't think we should mandate a trusted stack in the specification.

from design.

sunfishcode avatar sunfishcode commented on June 15, 2024

Sandboxing goes without saying. If applications can format your hard drive,

from design.

titzer avatar titzer commented on June 15, 2024

I would characterize the lattice thusly:

C++'s notion of UB is extremely broad. It's broad enough to allow compiler
implementations that statically notice UB and completely remove entire
functions, since it assumes that programs do not actually execute such
undefined behavior. That's a notion that stretches essentially from an
arbitrary point in the future of the program's execution backwards in time
to the point where that behavior becomes "inevitable" from the compiler's
perspective. Second, exactly what can happen is not at all specified.
Like JF said, it could literally format your harddrive or request SAC
launch the missiles. Fuck.

The other end of the lattice is no undefined or implementation-defined
behavior. Given the same inputs, the program will produce exactly the same
bits of memory on every engine. Awesome goal, awesome spec. Hard to realize
efficiently given the constraints.

The implementation-defined behavior that we've let in so far pretty much
always has the properties that it's:

A.) local; no action at a distance, like trashing weird parts of the heap
or local variables, either in this function or others; no jumping to random
places.
B.) constrained to a small set of behaviors. we could even consider
formalizing it as the engine nondeterministically chooses one of a small
set of spec'd behaviors.

I think JF is talking a different middle ground where missiles and
formatting cannot happen. But other crazy "in-program" stuff can happen,
like jumping into the middle of a function and then continuing to execute
garbage values. The eliminated behavior is reading/writing outside of the
sandboxed area. But unfortunately the stack of return addresses and
presumably the locals are in that sandbox. This opens up the non-local
action-at-a-distance behaviors. These make programs really really hard to
reason about. I don't think we want those.

On Thu, Jun 4, 2015 at 10:17 PM, Luke Wagner [email protected]
wrote:

I agree that there is a lattice of nasal demos and that C++ nasal demos
are > NaCl nasal demos... but still both destroy the relatively
deterministic semantics we have now.


Reply to this email directly or view it on GitHub
WebAssembly/spec#107 (comment).

from design.

pizlonator avatar pizlonator commented on June 15, 2024

Agreed, I also prefer limited UB. Limiting UB makes for a cleaner spec. It increases portability of code between implementations. It provides for a clearer path for implementors. It reduces the likelihood that an implementation will do something that looks like a bug to a user.

We should only allow UB in cases where it’s clear that we cannot get decent performance without it.

-Filip

On Jun 4, 2015, at 1:27 PM, titzer [email protected] wrote:

I would characterize the lattice thusly:

C++'s notion of UB is extremely broad. It's broad enough to allow compiler
implementations that statically notice UB and completely remove entire
functions, since it assumes that programs do not actually execute such
undefined behavior. That's a notion that stretches essentially from an
arbitrary point in the future of the program's execution backwards in time
to the point where that behavior becomes "inevitable" from the compiler's
perspective. Second, exactly what can happen is not at all specified.
Like JF said, it could literally format your harddrive or request SAC
launch the missiles. Fuck.

The other end of the lattice is no undefined or implementation-defined
behavior. Given the same inputs, the program will produce exactly the same
bits of memory on every engine. Awesome goal, awesome spec. Hard to realize
efficiently given the constraints.

The implementation-defined behavior that we've let in so far pretty much
always has the properties that it's:

A.) local; no action at a distance, like trashing weird parts of the heap
or local variables, either in this function or others; no jumping to random
places.
B.) constrained to a small set of behaviors. we could even consider
formalizing it as the engine nondeterministically chooses one of a small
set of spec'd behaviors.

I think JF is talking a different middle ground where missiles and
formatting cannot happen. But other crazy "in-program" stuff can happen,
like jumping into the middle of a function and then continuing to execute
garbage values. The eliminated behavior is reading/writing outside of the
sandboxed area. But unfortunately the stack of return addresses and
presumably the locals are in that sandbox. This opens up the non-local
action-at-a-distance behaviors. These make programs really really hard to
reason about. I don't think we want those.

On Thu, Jun 4, 2015 at 10:17 PM, Luke Wagner [email protected]
wrote:

I agree that there is a lattice of nasal demos and that C++ nasal demos
are > NaCl nasal demos... but still both destroy the relatively
deterministic semantics we have now.


Reply to this email directly or view it on GitHub
WebAssembly/spec#107 (comment).


Reply to this email directly or view it on GitHub WebAssembly/spec#107 (comment).

from design.

jfbastien avatar jfbastien commented on June 15, 2024

Agreed with @titzer's summary, but I think we want those in some form :-)

How to reconcile?

  • Allow for implementation-defined behavior?
  • Define different "implementation profiles"? Browsers would do one class of things, servers another?
  • Disallow impl-def behavior, but accept that some implementations will simply not conform?

It's not just a question of performance: some applications have an entirely different threat model.

from design.

lukewagner avatar lukewagner commented on June 15, 2024

Agreed with "limited UB" as in: we can formalize this in the spec as the op's operational semantics pick from one of a limited set of options and then we're back to completely-defined behavior.

@jfbastien That's what I was trying to address in my comment above: that amounts templating the spec and making everything more complicated for a non-Web use case. MVP goal means we should not do that for v.1 and consider doing it later only with concrete use cases.

from design.

titzer avatar titzer commented on June 15, 2024

On Thu, Jun 4, 2015 at 10:36 PM, Luke Wagner [email protected]
wrote:

Agreed with "limited UB" as in: we can formalize this in the spec as the
op's operation semantics pick from one of a limited set of options and then
we're back to completely-defined behavior.

Can we keep with the term "implementation-defined" behavior and not use UB?
Because that fits perfectly with the model that the implementation
"chooses" one of the limited set of options. It would also be nice if we
could say that every place where we have implementation-defined behavior,
there is a "trap" choice. The "sanitizer mode" or "debug mode" always
chooses trap.

@jfbastien https://github.com/jfbastien That's what I was trying to
address in my comment above
WebAssembly/spec#107 (comment):
that amounts templating the spec and making everything more complicated for
a non-Web use case. MVP goal means we should not do that for v.1 and
consider doing it later only with concrete use cases.


Reply to this email directly or view it on GitHub
WebAssembly/spec#107 (comment).

from design.

lukewagner avatar lukewagner commented on June 15, 2024

Agreed we should avoid UB, but even "implementation-defined" I think is too broad. At a formal semantic level, with the design we have so far (trusted stack), I've been thinking of this as just that the operational semantics, in some cases, provide a transition from one state to one or more next states. So: limited non-determinism.

from design.

sunfishcode avatar sunfishcode commented on June 15, 2024

Oops. If applications can format your hard drive, then nothing else we do here matters. What we're saying here is that we don't want UB inside the sandbox either.

If your plan is to be nonconforming, can we accept #102?

from design.

jfbastien avatar jfbastien commented on June 15, 2024

I'm asking whether we want to force valid implementations of wasm to be non-conforming, or if we want to design wasm to welcome these implementations. I don't think anyone comes in thinking "I feel like being non-conforming today".

I like @lukewagner's suggestion of nailing things down in MVP and saying "we know other implementations may want to do something else, we'll consider widening acceptable behavior after MVP". If we agree on this I can edit the repo, and then #102 is indeed something I can get behind.

from design.

titzer avatar titzer commented on June 15, 2024

On Thu, Jun 4, 2015 at 10:54 PM, JF Bastien [email protected]
wrote:

I'm asking whether we want to force valid implementations of wasm to be
non-conforming, or if we want to design wasm to welcome these
implementations. I don't think anyone comes in thinking "I feel like being
non-conforming today".

Well, that's part of the point of a spec. It discriminates conforming
implementations from non-conforming ones. In some sense, it is not a
welcoming thing at all, since it weeds out the non-conformant ones :-)

I like @lukewagner https://github.com/lukewagner's suggestion of
nailing things down in MVP and saying "we know other implementations may
want to do something else, we'll consider widening acceptable behavior
after MVP". If we agree on this I can edit the repo, and then #102
WebAssembly/spec#102 is indeed something I can
get behind.


Reply to this email directly or view it on GitHub
WebAssembly/spec#107 (comment).

from design.

sunfishcode avatar sunfishcode commented on June 15, 2024

I'm similarly not coming into this thinking I want to specify "the core of WebAssembly will have no UB, except that sometimes it will".

from design.

lukewagner avatar lukewagner commented on June 15, 2024

Given agreement not to have nasal demons and to have a trusted stack in the initial versions, I'd like to avoid explicitly mentioning that we're considering otherwise in the future since it waters down our message to the web, risks confusion and unnecessary early fracturing and a web assembly is our primary goal here. Before we reconsider this choice, I think we should have strong evidence of demand: a clear performance win (not solvable with new (non-nasal-demonic) features, compiler optimization or hardware support—we need to take a long-term view in this evaluation) for a broad community of non-browser users.

So can we merge #102?

from design.

jfbastien avatar jfbastien commented on June 15, 2024

sgtm, I'll leave this open. To be clear the issue isn't about getting nasal demons so I'll rename (yet again).

from design.

sunfishcode avatar sunfishcode commented on June 15, 2024

As discussed above, this isn't just about security. I've replaced "nasal demons" with "spooky action at a distance" to reflect the fact that sandboxing means that demons will never actually fly out anyone's nose (unless there are APIs that do that), but to also reflect the kind of semantic consequences under discussion here.

from design.

lukewagner avatar lukewagner commented on June 15, 2024

We currently have a file named IncompletelySpecifiedBehavior.md. "Incompletely specified" seems equivalent to "Not specified" as in "the spec doesn't say what will happen". "Spooky action at a distance" has a nice physics metaphor, but I wonder if we could phrase this more specifically (and rename that file to match). I think what we've agreed that what we want here is a semantics that is well-defined (never "goes off the rails"), but has limited (enumerated set of choices for a limited set of cases) non-determinism. Given this, how about renaming the file to Nondeterminism.md and generally referring to "limited non-determinism" instead of "implementation-defined" or "incompletely-specified"?

from design.

jfbastien avatar jfbastien commented on June 15, 2024

Nondeterminism sgtm when expressed the way @lukewagner did (no Ozzy-going-off-the-rails, but enumerated choices for implementations).

from design.

titzer avatar titzer commented on June 15, 2024

"Limited non-determinism" captures the semantic scope nicely. The phrase
"spooky action at a distance" expresses the local nature of the
non-determinism; no updates to random parts of memory, no trashing locals,
no jumping to weird places, etc. It'd be nice to incorporate the latter
guarantees somehow as well.

On Tue, Jun 9, 2015 at 6:51 PM, JF Bastien [email protected] wrote:

Nondeterminism sgtm when expressed the way @lukewagner
https://github.com/lukewagner did (no Ozzy-going-off-the-rails, but
enumerated choices for implementations).


Reply to this email directly or view it on GitHub
WebAssembly/spec#107 (comment).

from design.

lukewagner avatar lukewagner commented on June 15, 2024

"Limited local non-determinism"?

from design.

trevnorris avatar trevnorris commented on June 15, 2024

Please correct me if I'm wrong, but what I gathered from all this conversation is whether special treatment should be given to wasm for non-browser users.

Personally, as a node core maintainer, only hope that wasm-to-native calls are highly optimized. Basically being able to call from wasm to native without having to jump through JS. That's the only thing I think we'd need.

Despite what so many say, use cases in the browser are not the same in the server. Our required feature sets are also incompatible. As long as the code can quickly interface with the native code and syscalls it has to make then no need to make special considerations for how the server will utilize wasm. Because we can take it from there.

from design.

sunfishcode avatar sunfishcode commented on June 15, 2024

The conversation in this github issue pertains to WebAssembly's own semantics. The behavior of any APIs that might be made available to WebAssembly programs is an entirely separate question.

from design.

trevnorris avatar trevnorris commented on June 15, 2024

My apologies. I see that now. Was skipping through issues to quickly and crossed some mental wires.

from design.

sunfishcode avatar sunfishcode commented on June 15, 2024

I'm good with the present resolution here. @jfbastien is there more you'd like to do here?

from design.

jfbastien avatar jfbastien commented on June 15, 2024

Not really. @lukewagner's message above captures some of my thinking:

Given agreement not to have nasal demons and to have a trusted stack in the initial versions, I'd like to avoid explicitly mentioning that we're considering otherwise in the future since it waters down our message to the web, risks confusion and unnecessary early fracturing and a web assembly is our primary goal here. Before we reconsider this choice, I think we should have strong evidence of demand: a clear performance win (not solvable with new (non-nasal-demonic) features, compiler optimization or hardware support—we need to take a long-term view in this evaluation) for a broad community of non-browser users.

In general I agree that we don't want nasal demons, that we want limited non-determinism, but the current design is heavily biased towards what a JS engine offers in terms of limited non-determinism. Shadow stack is a great example of this.

I'm OK with the current design, conditionally on what Luke states: let's see how things work out later, and if the approach to limited non-determinism is too limiting, with data. I'd like to leave this issue open pending that data.

from design.

sunfishcode avatar sunfishcode commented on June 15, 2024

To clarify, the current design is heavily biased towards the needs of a multiple-implementation long-lived format, in terms of limited nondeterminism.

from design.

titzer avatar titzer commented on June 15, 2024

The consensus is no, WebAssembly should not have spooky action at a distance.

from design.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.