webassembly / feature-detection
Home Page: https://webassembly.github.io/feature-detection/
License: Other
The overview currently describes feature detection as being resolved at decode time, so that `feature_block` is decoded as either a normal block or as an `unreachable`, and `features.supported` is decoded as either a 0 or a 1. This is a convenient framing of the proposal, but it is also unprecedented. Should we instead treat `feature_block` and `features.supported` as normal instructions and include them in the abstract syntax and validation algorithms?

@rossberg, I'd be particularly interested in your thoughts on this.
I have a problem with switching from feature detection as originally proposed to an "is faster" approach: hardware changes, and what "is faster" means varies from generation to generation. You would still have to detect the exact hardware to determine whether something "is faster". Feature detection should probably be used for code-path selection, not performance detection; the latter should be left up to the application, IMO.
I'd like to provide some context on this proposal from the perspective of open-source multimedia software (ffmpeg, x264, libass…): usually, rather than using gcc/clang-style function multiversioning, we have multiple implementations of a given function with separate names (e.g. `my_function_c`, `my_function_sse2`, `my_function_avx2`, etc.). These symbols exist in all builds for the relevant architecture(s) (here, x86/x86_64), and a runtime check determines which one to insert into a vtable (e.g. a function pointer for `my_function` would be populated with `&my_function_sse2` if the SSE2 check passes but the AVX2 check fails).
Having all variants available at runtime also means that we can override the dispatch if needed (e.g. disable the AVX2 version at runtime to benchmark the SSE2 version, and call the C version to have a baseline to compare to during unit tests). It's also higher-performance than using a per-function runtime dispatcher symbol, since we only have to do the check once, and can eliminate the extra indirection layer afterwards.
The upshot of all this is:
It's not entirely clear how this ends up working from a code perspective in this proposal: when writing the `.wat` text format, what does that look like in code? What if we instead use C or AssemblyScript? Some of these questions might ultimately be up to compiler vendors to answer, but hopefully we can at least have some general ideas.

The explainer describes `feature_block` as a 'decoding time' feature. If a binary-to-text converter is run on a module, would this imply that the `feature_block`s present in the binary must be resolved (based on the supported features of the tool?) during the conversion to text? e.g., would a `features.supported` be printed as the resolved `i32.const`?
In addition, how would text to binary work? My understanding of the spec is that there is the abstract syntax, and conversions from the binary/text formats to/from the abstract syntax. If this is specified in the binary format and not in the abstract syntax, how can a conversion from text to binary be specified?
The discussion I've seen on this proposal has largely been around use with future extensions (notably future additions to SIMD), but there are current extensions that would benefit massively from the functionality being discussed, so I think it's important that the design at least attempt to avoid precluding JS-based polyfills.
For instance, I'd like to extend some open-source libraries (e.g. ffmpeg, libass, OpenSSL…) to use some features made available in some already-broadly-implemented extensions, including:
These features (with the exception of SIMD) are implemented in all major browser engines today, but in the same way that ffmpeg maintains compatibility back to at least Windows XP (and will likely support 7 for quite a while after), we need to support wasm implementations back to MVP. Currently, this would mean providing compile-time flags that consumers must enable to get newer features, which produces an untenable amount of build fragmentation and complexity already. This essentially bars these projects from making use of any of the wasm features that would be required for them to be seriously usable on the web platform.
It seems like this proposal should allow for JavaScript to stream down a wasm module, parse it at a high level, recognize feature blocks, and discard (or replace with no-ops?) anything that isn't supported by the current engine. Correct me if I'm wrong?
If I'm right about this generally being doable, I think the main thing this needs is to assign feature IDs to already-existing features. I suppose even without officially-assigned ones, tooling could always just define its own and shift the rest around them, but having this standardized would be best.
In principle, features could be arbitrarily fine- or coarse-grained and could be retroactively applied to any instructions already in the spec. In practice, though, the only useful features will be those corresponding to toolchain-level target features that contain targeted, performance-sensitive instructions without a short path to widespread adoption.
I propose that the only feature we define to start out is "simd128", corresponding to the merged SIMD proposal. (If relaxed-simd ships before feature detection, it should have a feature as well.)
Here's why we might not want to define separate features for other proposals:
Are there any other features it would make sense to define for a feature detection MVP?
What is the intended opcode for the `features.supported` instruction?

Options:

- `0xc5`

After seeing @tlively's presentation today, I offhandedly suggested using `else` to allow a feature-block to have an alternative. I think it would be somewhat elegant to generalize `feature_block ... end` to `feature.if [block_type] ... else ... end`. Thus it would look a lot like a regular `if`, but the true block would be a binary blob that would not be decoded if the feature set is not supported.
We would have to decide what to do with `block_type` in the `else` block. One option would be to have a completely alternate type for `else`, or to have the same `block_type` but with alternative types substituted in.
The purpose of this proposal is to allow modules to validate even when they conditionally use instructions not supported by the engine, but the spec so far has no concept of optional or unsupported instructions. All instructions are currently either mandatory or do not exist as far as the spec is concerned.
One solution would be to avoid adding a concept of features to the normative spec, to specify `features.supported` and `feature_block` as succeeding if the feature bitvector is a particular constant and failing otherwise, and to rely on non-normative text to document what each feature actually means. This is not a very satisfying solution, because we want feature detection to be as well-specified and portable as any other part of WebAssembly.
A better option would be to add a notion of features to the spec, tag (every?) instruction with the feature(s?) it belongs to, and explicitly allow implementations to choose whether or not to support each feature. The spec would essentially be parameterized by the supported features.
Are there any problems with that approach? Are there better alternatives? @rossberg, I'd be particularly interested in your thoughts here.
ISTR that there was an assumption that a restriction of feature detection to the code section would still allow the conditional use of types that may not be present in all engines -- e.g., `v128` would be usable for local variables within suitably conditionalized code for engines that support SIMD, though not in function signatures or on globals. Currently all locals are declared at the function head, not within any block, yet `feature_block` is decoded as a block according to the overview, not allowing locals to be declared within it. Is there a hidden dependency on `let` in here? `let` is on the chopping block over in the function-references proposal.
In the discussion on #6, two observations have been made:
Limiting alternative code path choices to function granularity seems to be sufficient, at least for the primary gcc/clang use case.
On the other hand, it is not sufficient to enable alternative choices in the code section alone:
In light of this, I'd suggest reconsidering a more general mechanism operating on the level of sections.
The conditional sections proposal did that, but had one significant drawback, namely that it was too liberal and allowed the resulting sections to have completely different sizes (including absence), which would make it difficult for tools to process a module coherently.
We could refine this as follows:
Instead of a unary construct
#if <condition> <section> #endif
we change it to an n-ary construct
#if <condition> <section> (#elif <condition> <section>)* #else <section> #endif
where all of the section alternatives must have the same type and size.
As before, this is combined with the ability to have multiple occurrences of each section type (like we already want for other reasons as well), such that a conditional can be reduced to a diff.
Separately, we can revisit what the representation of "conditions" is.
The n-ary construct mirrors the #if-#elif
of C. Crucially, it enforces "well-formedness" of the index spaces created.
In terms of the binary format, such a section conditional would perhaps only store the section type and size once, as a form of "type annotation" on the conditional itself, instead of repeating it in every nested section (which then would only contain the section body).
Honestly, I expect that it is no more complex to define this conditional construct generically for all section types than to have separate equivalent constructs for (at least) code, function, and type section.
I'd like to lay out a concern regarding feature-detection along with a way to address this concern while still addressing the needs of SIMD (which as @tlively was saying in his last presentation, seems to be our only short-term need for feature-detection).
The concern is that, despite our best intentions in the short-term, if we allow wasm modules to validate with completely un-decoded bits, then over the long-term (as wasm gets more toolchains and engines and business contexts), we'll end up with a lot of wasm modules containing non-standard bits. While it is certainly possible today for particular toolchains and engines to agree to produce/consume non-standard/pre-standard bits (which is actually an essential part of the process of standardization), there is a natural check-and-balance in that these non-standard/pre-standard bits will only run on those particular engines and fail to validate on the rest. If it becomes possible to produce modules with non-standard bits that validate on all wasm engines, then it becomes much easier to push out non-standard bits and circumvent the standardization process in general. As with most vague concerns, maybe there comes a time when this is a risk we want to take, but I think it's a good idea to hold off on crossing this Rubicon for as long as possible.
If we zoom in on the SIMD use case, I think a reasonable alternative solution makes sense (slightly different from what I've suggested before). It is based on the observation that the engineering effort and engine size required to decode, but not validate or compile, SIMD (or any other future wasm feature, for that matter) is minuscule compared to the overall effort required to implement MVP wasm 1.0. It is thus not worth trying to optimize away SIMD decoding on engines that don't support SIMD (classic Amdahl's Law).
Based on this observation, I think we can derive a reasonable compromise:

- `features.supported` is resolved at decode time (producing an `i32` value).
- Unsupported feature blocks behave like an `unreachable` instruction, and thus they are decoded as dead code.
- Engines that don't support SIMD need not represent `v128` with an actual 128-bit value -- they could use nothing or a dummy.

Some other nice technical benefits from this approach are:
With respect to the other major concepts of "version" and "profile":
At least, that's the best I've been able to come up with; happy to discuss!