tc39 / ecmascript_simd
SIMD numeric type for EcmaScript
License: Other
@param {uint32x4} An instance of a float32x4.
should be @param {uint32x4} An instance of a uint32x4.
https://github.com/johnmccutchan/ecmascript_simd/blob/master/src/ecmascript_simd.js#L667
float32x4 has 6 comparison operations:
int32x4 only has three:
I think there's a problem with how .signMask is used in the aobench benchmark:
Around line 360:
var cond1 = SIMD.greaterThan(D, float32x4.zero());
if (cond1.signMask) {
var t2 = SIMD.sub(SIMD.neg(B), SIMD.sqrt(D));
var cond2 = SIMD.and(SIMD.greaterThan(t2, float32x4.zero()),
SIMD.lessThan(t2, isect.t));
if (cond2.signMask) {
This will go into the 'if' branches if just one of the x4 compares comes out to true and then do the computation for all 4. Is this intended? I don't fully understand the code, so I might have it wrong.
In any case, it's better to always compare the .signMask property with an explicit value. signMask returns an int in the range 0x0..0xf. In other words the lower 4 bits indicate the 4 compare results.
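To make the difference concrete, here is a small sketch in plain JavaScript. It relies only on what is stated above, namely that signMask is an integer in the range 0x0..0xf with one bit per lane. The helper names anyLaneSet and allLanesSet are hypothetical and are not part of the polyfill.

```javascript
// Hypothetical helpers; `mask` stands for a value read from .signMask (0x0..0xf).
function anyLaneSet(mask) {
  // True if at least one of the four compare results is set.
  return (mask & 0xf) !== 0;
}

function allLanesSet(mask) {
  // True only if all four compare results are set.
  return (mask & 0xf) === 0xf;
}

var mask = 0x4; // only the third lane compared true
// A bare `if (mask)` behaves like anyLaneSet(mask): the branch is entered
// and the computation then runs for all four lanes.
var entersBranch = anyLaneSet(mask);
var everyLanePassed = allLanesSet(mask);
```

Comparing against an explicit value (0xf for "all lanes", a specific bit for one lane) states which of these two meanings the branch is supposed to have.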
When implementing sinx4() (see benchmarks/sinx4.js), I needed a shiftLeft() operation on int32x4 values. For completeness we should add these operations on int32x4 values:
shiftLeft()
shiftRight()
shiftRightArithmetic()
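As a rough sketch of the intended lane-wise semantics, the three operations can be modeled on an Int32Array of length 4. This is an assumption about what the operations would do, not polyfill code; in particular, treating shiftRight() as the logical (zero-filling) shift is inferred from the separate shiftRightArithmetic() name.

```javascript
// Lanes are modeled as an Int32Array(4); these functions are illustrative only.
function mapLanes(a, fn) {
  var r = new Int32Array(4);
  for (var i = 0; i < 4; i++) {
    r[i] = fn(a[i]);
  }
  return r;
}

function shiftLeft(a, bits) {
  return mapLanes(a, function (x) { return x << bits; });
}

function shiftRight(a, bits) {
  // Logical shift: zeros are shifted in from the left.
  return mapLanes(a, function (x) { return x >>> bits; });
}

function shiftRightArithmetic(a, bits) {
  // Arithmetic shift: the sign bit is replicated.
  return mapLanes(a, function (x) { return x >> bits; });
}

var lanes = new Int32Array([1, -8, 16, -1]);
var left = shiftLeft(lanes, 2);
var logical = shiftRight(lanes, 1);
var arithmetic = shiftRightArithmetic(lanes, 1);
```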
In http://www.ecma-international.org/ecma-262/5.1/#sec-15, the behavior described under "The Built-in ECMAScript Objects Constructor Called as a Function" differs between the built-in objects.
For Number, String, Boolean, Object, when they are called as a function rather than as a constructor, it performs a type conversion.
For Function, Array, when they are called as a function rather than as a constructor, it creates and initialises a new Function/Array object. Thus the function call is equivalent to the object creation expression new ... with the same arguments.
We think float32x4 and uint32x4 are more like numbers than arrays, so we create a primitive float32x4/uint32x4 type and do the type conversion. The current polyfill implementation treats them as if the JavaScript engine had no primitive float32x4/uint32x4 type.
int32x4.select and float32x4.select are intended to be used with masks that have either all bits set to 1 (0xFFFFFFFF) or all set to 0 (0x0). What happens if that is not the case?
Currently, if the input mask to select is, say int32x4(0x1, 0xF, 0xFF, 0xFFFF), for instance, the resulting output vector will be a strange mix of both inputs in each lane. Is that the intended behavior?
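The "strange mix" is what a purely bitwise select produces. The sketch below assumes select is computed per lane as (mask & trueValue) | (~mask & falseValue), which matches the behavior described above but is not confirmed against the polyfill source. selectLane is a hypothetical helper operating on one 32-bit lane.

```javascript
// Bitwise select on a single 32-bit lane (assumed formula, see lead-in).
// `>>> 0` converts the signed result to an unsigned value for readability.
function selectLane(mask, trueValue, falseValue) {
  return ((mask & trueValue) | (~mask & falseValue)) >>> 0;
}

var t = 0x11111111;
var f = 0x22222222;

var allOnes = selectLane(0xFFFFFFFF, t, f); // every bit taken from t
var allZeros = selectLane(0x0, t, f);       // every bit taken from f
var partial = selectLane(0xFF, t, f);       // low 8 bits from t, rest from f
```

With the mask 0xFF the result is 0x22222211: neither input, but a bit-level blend of the two.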
An int32x4 bitcast into a NaN float32x4 may pose semantic issues in runtimes that canonicalize NaNs. For example:
var m = int32x4.bool(true, true, true, true);
var n = SIMD.int32x4.bitsToFloat32x4(m);
var n2 = SIMD.float32x4.withX(n, n.x);
var m2 = SIMD.float32x4.bitsToInt32x4(n2);
equal(m.x, m2.x); // won't be equal
See #36 and https://bugzilla.mozilla.org/show_bug.cgi?id=945382 for more details.
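The same effect can be probed with plain typed arrays, without any SIMD API. This is only a sketch: whether the NaN payload survives the round trip through a JavaScript number depends on the engine, so the code records the outcome instead of assuming it.

```javascript
// Reinterpret the "true" lane mask 0xFFFFFFFF as a float32.
var bits = new Int32Array(1);
var asFloat = new Float32Array(bits.buffer);
bits[0] = -1;              // 0xFFFFFFFF
var lane = asFloat[0];     // reads as NaN (exponent all ones, nonzero mantissa)

// Pass the value through a JavaScript number and write it back.
var copy = new Float32Array(1);
copy[0] = lane;            // an engine may store a canonical NaN here
var roundTripped = new Int32Array(copy.buffer)[0];

// Not guaranteed to be true; engines that canonicalize NaNs may change the bits.
var payloadPreserved = (roundTripped === -1);
```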
May be useful to move the SIMD operations back to the README even if they're now under the SIMD module rather than the float32x4 object. Otherwise the polyfill code serves as the API documentation.
These were recently added to Dart and need to be added to JS.
Current implementation just copies the double value without casting to float32.
SIMD.float32x4.fromFloat64x2 = function(t) {
checkFloat64x2(t);
var a = SIMD.float32x4.zero();
a.x_ = t.x_;
a.y_ = t.y_;
return a;
}
Using the float32x4 constructor fixes this.
SIMD.float32x4.fromFloat64x2 = function(t) {
checkFloat64x2(t);
var a = SIMD.float32x4(t.x_, t.y_, 0, 0);
return a;
}
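The difference between copying the double and casting it can be seen with a single value. The sketch below uses only standard typed arrays and Math.fround; toFloat32 is a hypothetical helper, not a polyfill function.

```javascript
// Storing into a Float32Array rounds a double to the nearest float32.
function toFloat32(d) {
  var f = new Float32Array(1);
  f[0] = d;
  return f[0];
}

var asDouble = 0.1;                 // not exactly representable in float32
var asFloat32 = toFloat32(asDouble);

// The rounded value differs from the original double and matches Math.fround.
var changedByCast = (asFloat32 !== asDouble);
var matchesFround = (asFloat32 === Math.fround(asDouble));
```

Copying t.x_ directly keeps the full double value, while a float32 lane should hold the rounded one.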
With the introduction of the SIMD.float32x4 and SIMD.int32x4 subobjects, the source type on the conversion function names is redundant, e.g.
SIMD.float32x4.float32x4toInt32x4()
Can be:
SIMD.float32x4.toInt32x4()
Similarly for the other ones.
It even reads right :)
f4 = float32x4(1.0, 2.0, 3.0, 4.0);
n = Number(f4);
What is the expected result of n?
I am asking this question as we need to handle
f4 = float32x4(1.0, 2.0, 3.0, 4.0);
g4 = float32x4(f4, 2.0, 3.0, 4.0);
when introducing the float32x4 type. All the other types can be coerced to Number.
With Emscripten, we have the capacity to port native C&C++ code to the web. When/if people read tweets along the lines of "JS has SIMD", it will invariably result in a stream of Emscripten developers attempting to port their MMX/SSE1/SSE2/... -based codebases over to JS-SIMD. We need to have an answer to these developers about what the support of mapping these constructs over to JS-SIMD looks like.
In the Emscripten compiler, we already have small bits of such SIMD support available. To chart what this mapping would look like for SSE1 in particular (focusing on just one instruction set spec to start with, and SSE1 is the most interesting one) when completed, I wrote up this spreadsheet: https://docs.google.com/spreadsheets/d/1QAGGf2M2IA6l4cvh8eTXdXGEUcPjdmTe_BLKGn5YCB4/edit?usp=sharing
As one can imagine, comparing the current spec and the set of SSE1 intrinsics listed in the above spreadsheet, there is a large gap. I wonder how this could be resolved?
Use integer property names such as "i", "j", "k" and "l". These have become common index names among developers, inherited from mathematics, where they are used as indexes as well as unit vectors.
Use floating point property names such as "x", "y", "z" and "t" (instead of "w"). In physics the fourth dimension is considered to be the dimension of time (or space-time) in the space-time vector of special relativity.
When implementing the sinx4() function, I needed an equal() operation on int32x4 values. For completeness we should add all the 6 compare operations on int32x4 values:
lessThan()
lessThanOrEqual()
equal()
notEqual()
greaterThanOrEqual()
greaterThan()
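A minimal sketch of what the six operations could look like on integer lanes follows. It models lanes as an Int32Array(4) and assumes each comparison yields a mask lane of all ones (-1) for true and 0 for false, in line with the select masks discussed elsewhere in this tracker. The int32x4Compare object is hypothetical.

```javascript
// Apply a scalar predicate per lane and produce a -1 / 0 mask per lane.
function compareLanes(a, b, predicate) {
  var r = new Int32Array(4);
  for (var i = 0; i < 4; i++) {
    r[i] = predicate(a[i], b[i]) ? -1 : 0;
  }
  return r;
}

// Hypothetical container for the six comparisons.
var int32x4Compare = {
  lessThan: function (a, b) { return compareLanes(a, b, function (x, y) { return x < y; }); },
  lessThanOrEqual: function (a, b) { return compareLanes(a, b, function (x, y) { return x <= y; }); },
  equal: function (a, b) { return compareLanes(a, b, function (x, y) { return x === y; }); },
  notEqual: function (a, b) { return compareLanes(a, b, function (x, y) { return x !== y; }); },
  greaterThanOrEqual: function (a, b) { return compareLanes(a, b, function (x, y) { return x >= y; }); },
  greaterThan: function (a, b) { return compareLanes(a, b, function (x, y) { return x > y; }); }
};

var a = new Int32Array([1, 2, 3, 4]);
var b = new Int32Array([4, 2, 2, 5]);
var eq = int32x4Compare.equal(a, b);
var lt = int32x4Compare.lessThan(a, b);
```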
There are several ways to do this:
I don't like this approach. It will require the JIT generated code to insert checks on the operand types. The optimizing JIT compilers can probably hoist those checks out in most cases, but it does complicate doing inlining somewhat.
We'll need to add even more names when we start working on the AVX x8 types, so I'm not too fond of this solution either.
I like this approach better than the two above.
What do you think?
float32x4 will eventually be a value_object, so to create a float32x4 variable one would write:
var f4 = float32x4(1.0,2.0,3.0,4.0)
The current polyfill implementation assumes that new values are created with the 'new' operator, i.e.
function float32x4(x,y,z,w) {
this.storage_ = new Float32Array(4);
this.storage_[0] = x;
this.storage_[1] = y;
this.storage_[2] = z;
this.storage_[3] = w;
}
This should be implemented like this instead:
function float32x4(x,y,z,w) {
var storage = new Float32Array(4);
storage[0] = x;
storage[1] = y;
storage[2] = z;
storage[3] = w;
return storage;
}
I think :)
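A different way to get the same call syntax, shown here only as a sketch, is the common idiom of a constructor that also works when called as a plain function. This is not what the snippet above proposes (it returns the typed array directly); Float32x4Sketch is a hypothetical name.

```javascript
// A constructor that forwards to `new` when called without it.
function Float32x4Sketch(x, y, z, w) {
  if (!(this instanceof Float32x4Sketch)) {
    return new Float32x4Sketch(x, y, z, w);
  }
  this.storage_ = new Float32Array([x, y, z, w]);
}

var viaCall = Float32x4Sketch(1.0, 2.0, 3.0, 4.0);     // value-object style
var viaNew = new Float32x4Sketch(1.0, 2.0, 3.0, 4.0);  // existing 'new' style
```

This keeps the storage_ property that the rest of the polyfill reads, while letting callers drop the 'new' operator.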
load ('ecmascript_simd.js');
var f = new Float32Array(20);
var f4 = new Float32x4Array(f.buffer);
print (f4.length); // prints 0. 5 expected.
The problem is that Float32Array(a,b,undefined) is not equivalent to Float32Array(a,b)
This patch should fix the problem:
diff --git a/src/ecmascript_simd.js b/src/ecmascript_simd.js
index e441ca5..60fc2ef 100644
--- a/src/ecmascript_simd.js
+++ b/src/ecmascript_simd.js
@@ -195,13 +195,17 @@ function Float32x4Array(a, b, c) {
this.storage_[i] = a.storage_[i];
}
} else if (isArrayBuffer(a)) {
this.storage_ = new Float32Array(a, b, c);
// Note: new Float32Array(a, b) is NOT equivalent to new Float32Array(a, b, undefined)
this.storage_ = new Float32Array(a, b);
});
test('Float32Array view values', function() {
@@ -742,22 +743,26 @@ test('Float32Array view values', function() {
equal(start+3, d.getAt(0).w);
});
-test('Float32x4Array exceptions', function() {
+test('Float32x4Array exceptions', function () {
var a = new Float32x4Array(4);
var b = a.getAt(0);
var c = a.getAt(1);
var d = a.getAt(2);
var e = a.getAt(3);
test('View on Float32x4Array', function() {
Don't define the SIMD polyfills if we detect that the runtime has implemented float32x4 etc.
Enables same code to be run across different implementations.
If we don't want to keep this check in the polyfill, at least provide a best practice guide.
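One possible shape for such a check is sketched below. It assumes the presence of a SIMD property on the global object is the signal for a native implementation; installSimdPolyfill and the stand-in objects are hypothetical.

```javascript
// Install the polyfill only when the target object has no SIMD property yet.
function installSimdPolyfill(globalObject, polyfill) {
  if (typeof globalObject.SIMD !== 'undefined') {
    return false; // a native (or earlier) implementation is already present
  }
  globalObject.SIMD = polyfill;
  return true;
}

// Stand-ins for two different runtimes.
var runtimeWithNativeSimd = { SIMD: { native: true } };
var runtimeWithoutSimd = {};
var polyfill = { native: false };

var installedOnNative = installSimdPolyfill(runtimeWithNativeSimd, polyfill);
var installedOnPlain = installSimdPolyfill(runtimeWithoutSimd, polyfill);
```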
Fast path testing that all lanes in a Float32x4 / Uint32x4 are zero, negative, positive, etc.
In the current polyfill implementation, negative zero is not handled in signMask.
For this case:
var a = SIMD.float32x4(0.0, 0.0, 0.0, -0.0)
a.signMask
it is expected to print 8. Currently the polyfill implementation prints 0.
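A sketch of a sign-bit based computation that does handle negative zero is shown below. It reads the raw sign bit of each float32 lane through an Int32Array view; signMaskOf and comparisonMaskOf are hypothetical helpers, and the comparison-based variant is only an assumption about how a mask can end up as 0 for this input.

```javascript
// Build the 4-bit mask from the actual sign bit of each float32 lane.
function signMaskOf(x, y, z, w) {
  var lanes = new Float32Array([x, y, z, w]);
  var bits = new Int32Array(lanes.buffer);
  return ((bits[0] >>> 31) << 0) |
         ((bits[1] >>> 31) << 1) |
         ((bits[2] >>> 31) << 2) |
         ((bits[3] >>> 31) << 3);
}

// A comparison such as `v < 0` misses -0, because -0 < 0 is false.
function comparisonMaskOf(x, y, z, w) {
  return (x < 0 ? 1 : 0) | (y < 0 ? 2 : 0) | (z < 0 ? 4 : 0) | (w < 0 ? 8 : 0);
}

var fromSignBit = signMaskOf(0.0, 0.0, 0.0, -0.0);          // 8
var fromComparison = comparisonMaskOf(0.0, 0.0, 0.0, -0.0); // 0
```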
I was speaking with @sunfishcode in #asm.js and brought up a use case that I don't think is represented in the API yet.
Sometimes you want to use the platform's SIMD capabilities to work on two-vectors or four-vectors. (Or two two-vectors at a time.)
SSE supports roughly three mechanisms for loading two-vectors:
movq xmm0, [eax]   ; load 64-bit quantity, zero high 64 bits
movlps xmm0, [eax] ; load 64-bit quantity, do not zero high 64 bits
movhps xmm0, [eax] ; load 64-bit into high 64 bits, do not zero low 64 bits
Loading three-vectors is hard. In the past I've done something like:
movss xmm0, [eax]
movhps xmm0, [eax+4]
This leaves the 3-vector in the register like [x, _, y, z], which is as fine as any other data layout.
I think it would be beneficial for the API to provide, in the least,
@param {uint32x4} t An instance of a uint32x4
should be @param {float32x4} t An instance of a float32x4
https://github.com/johnmccutchan/ecmascript_simd/blob/master/src/ecmascript_simd.js#L420
It seems like it would be useful to have something like SIMD.float32x4.load(array, offset) that is equivalent to SIMD.float32x4(array[offset+0], array[offset+1], array[offset+2], array[offset+3]). This is clearly equivalent but easier to optimize. Also, I would explicitly not want to require that offset be aligned in any particular way.
One open question is how accepting to be: specifically, could array be any array-like, or must it be a typed array / typed object array of suitable type?
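The equivalence can be sketched in plain JavaScript. The function below is a hypothetical stand-in that returns the four lanes as an ordinary array instead of a float32x4, and it takes the permissive side of the open question by accepting any array-like. The bounds check is an added assumption, not part of the proposal.

```javascript
// Hypothetical stand-in for SIMD.float32x4.load(array, offset).
// Returns the four lane values; no alignment is required of `offset`.
function loadFloat32x4(array, offset) {
  if (offset < 0 || offset + 4 > array.length) {
    throw new RangeError('offset out of range');
  }
  return [array[offset], array[offset + 1], array[offset + 2], array[offset + 3]];
}

var typed = new Float32Array([0, 1, 2, 3, 4, 5, 6, 7]);
var fromTyped = loadFloat32x4(typed, 3);              // unaligned offset
var fromPlain = loadFloat32x4([9, 8, 7, 6, 5], 1);    // ordinary array
```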
SIMD.toFloat32x4 should take an uint32x4 as parameter.
https://github.com/johnmccutchan/ecmascript_simd/blob/master/src/benchmarks/mandelbrot.js#L44
Math.imul(a.w * b.w) should be Math.imul(a.w, b.w)?
https://github.com/johnmccutchan/ecmascript_simd/blob/master/src/ecmascript_simd.js#L559
https://github.com/johnmccutchan/ecmascript_simd/blob/master/src/benchmarks/mandelbrot.js#L65
count4 = SIMD.add (count4, SIMD.and (mi4, one4));
should be count4 = SIMD.add (count4, SIMD.andu32 (mi4, one4));
float -> int truncate
float -> int round
int -> float
Will we support "+", "-", "*", "/", Math.sqrt() and others on float32x4 and uint32x4 values? If yes, we need to extend current IC stubs and mine the type information from IC stubs (a lot of work). If no, will we coerce float32x4 value to NaN?
To get the right typing we should add a .shuffleu32 method as well.
Could we write a=float32x4(NaN, NaN, NaN, NaN)?
Do we want to raise an exception when there is overflow, underflow or an invalid operand for the SIMD operations? If yes, we need try and catch around SIMD operations. If no, a QNaN should be a valid lane value. Note that, per the Intel ISA manual, the underflow and overflow exceptions apply only to floating-point operations, not to Uint32 operations.
We need to define the behavior in the spec.
Please add a license to the polyfill files, so that they can be used in other projects (specifically I am starting to work on SIMD in emscripten now). MIT license would be nice :)
The ES6 specification defines special behavior for NaN in many operations that are implemented as well for float32x4. For instance, Math.min returns NaN if any of the operands is NaN [1]. The polyfill doesn't have such behavior, and if you do
var x = SIMD.float32x4(1,2,3,4)
var y = SIMD.float32x4(NaN, NaN, NaN, NaN)
You'll have a different value if you do SIMD.float32x4.min(x, y) or SIMD.float32x4.min(y, x), which seems weird, with respect to the semantics of min (I'd expect min to be strictly commutative).
There might be other places where specific behavior should also be defined for other special Float32 values, such as +/- Infinity, +/- 0.
[1] http://people.mozilla.org/~jorendorff/es6-draft.html#sec-math.min
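The asymmetry is easy to reproduce on a single lane. The sketch below assumes a comparison-based min, which is one way to get the order-dependent result described above; the polyfill's actual code is not quoted here. naiveMin is a hypothetical helper.

```javascript
// Comparison-based min: any comparison involving NaN is false,
// so the second operand is returned whenever either operand is NaN.
function naiveMin(a, b) {
  return a < b ? a : b;
}

var first = naiveMin(1, NaN);   // NaN (1 < NaN is false, returns b)
var second = naiveMin(NaN, 1);  // 1   (NaN < 1 is false, returns b)

// Math.min returns NaN regardless of operand order.
var specFirst = Math.min(1, NaN);
var specSecond = Math.min(NaN, 1);
```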
float32x4 has one and int32x4 should have one too, for completeness and symmetry.
A SIMD operation similar to movemask would be helpful for branching after a SIMD compare.
The V8 runtime implementation throws an error if you call with the wrong type, for example a binary op with float32x4 w/o bitcasting; may be useful to have a debug version of the polyfill that checks using instanceof.
if they're going to be on the simd module rather than the object, should the name include the from type?
(p.s. nitpick: might want to put all 4 conversion methods together, esp in the absence of API README doc, didn't see bitsToUint32x4 at first b/c not with the others)
We have extended Typed Array View Types by Float32x4Array, Float64x2Array and Int32x4Array.
These SIMD typed array views load the SIMD data with 16-byte alignment. In some use cases, it is very hard or impossible to arrange data in such a way. These use cases require loading SIMD types from an array buffer without 16-byte alignment. This is similar to using the C++ intrinsics _mm_loadu_ps/_mm_storeu_ps.
There are two options to extend Typed Array Specification and one option to extend SIMD module.
partial interface DataView {
SIMD.float32x4 getFloat32x4(unsigned long byteOffset, optional boolean littleEndian);
SIMD.float64x2 getFloat64x2(unsigned long byteOffset, optional boolean littleEndian);
SIMD.int32x4 getInt32x4(unsigned long byteOffset, optional boolean littleEndian);
void setFloat32x4(unsigned long byteOffset, SIMD.float32x4 value, optional boolean littleEndian);
void setFloat64x2(unsigned long byteOffset, SIMD.float64x2 value, optional boolean littleEndian);
void setInt32x4(unsigned long byteOffset, SIMD.int32x4 value, optional boolean littleEndian);
};
partial interface Float32Array {
SIMD.float32x4 getFloat32x4(unsigned long index);
void setFloat32x4(unsigned long index, SIMD.float32x4 value);
};
partial interface Float64Array {
SIMD.float64x2 getFloat64x2(unsigned long index);
void setFloat64x2(unsigned long index, SIMD.float64x2 value);
};
partial interface Int32Array {
SIMD.int32x4 getInt32x4(unsigned long index);
void setInt32x4(unsigned long index, SIMD.int32x4 value);
};
SIMD.float32x4 SIMD.float32x4.load(Float32Array array, unsigned long index);
void SIMD.float32x4.store(Float32Array array, unsigned long index, SIMD.float32x4 value);
SIMD.float64x2 SIMD.float64x2.load(Float64Array array, unsigned long index);
void SIMD.float64x2.store(Float64Array array, unsigned long index, SIMD.float64x2 value);
SIMD.int32x4 SIMD.int32x4.load(Int32Array array, unsigned long index);
void SIMD.int32x4.store(Int32Array array, unsigned long index, SIMD.int32x4 value);
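The DataView option can be approximated today with the existing DataView.getFloat32 and setFloat32 methods, which accept arbitrary byte offsets. The sketch below is only an emulation: readFloat32x4Lanes is a hypothetical helper that returns the lanes as a Float32Array, standing in for the proposed getFloat32x4.

```javascript
// Read four consecutive float32 values starting at any byte offset.
function readFloat32x4Lanes(buffer, byteOffset, littleEndian) {
  var view = new DataView(buffer);
  var lanes = new Float32Array(4);
  for (var i = 0; i < 4; i++) {
    lanes[i] = view.getFloat32(byteOffset + 4 * i, littleEndian);
  }
  return lanes;
}

// Write 1, 2, 3, 4 starting at byte offset 2, which is not even 4-byte aligned.
var buffer = new ArrayBuffer(32);
var writer = new DataView(buffer);
for (var i = 0; i < 4; i++) {
  writer.setFloat32(2 + 4 * i, i + 1, true);
}

var lanes = readFloat32x4Lanes(buffer, 2, true);
```

A Float32Array view could not be created at byte offset 2 at all, which is the kind of case the unaligned accessors are meant to cover.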
Hi, since last year we have a 256-bit (8 x int32) SIMD ISA shipping (AVX2) in Haswell processors. It seems next year we will also have 512-bit SIMD support (i.e. 16 x int32) in the form of AVX512. Executing 128-bit SIMD instructions on a 512-bit SIMD capable processor (Intel Skylake?) is only 25% efficient, i.e. similar to having no SIMD support today on an SSE-only processor (pre-2011 Sandy Bridge). So it seems you should already plan to add Int32x8 and Int32x16 instructions, which on a processor with only 128-bit SIMD support would be lowered to 2 or 4 int32x4 instructions respectively. Make sense?
I believe it's pretty common to initialize a float32x4 or uint32x4 value to have the same value in all lanes.
We could overload the initializer to do the right thing depending on the number of arguments, i.e.
var f4 = float32x4(1.0, 2.0, 3.0, 4.0); // different values in the 4 lanes
var ones4 = float32x4(1.0); // same value (1.0) in all lanes
Overloading the number of arguments shouldn't cause a perf issue, since this can be statically determined at JIT time.
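A sketch of the overload, using a Float32Array as a stand-in for the float32x4 value, could look like this. makeFloat32x4 is a hypothetical name, and dispatching on arguments.length is one possible way to implement it in a polyfill.

```javascript
// One argument: replicate it into all four lanes. Otherwise: one value per lane.
function makeFloat32x4(x, y, z, w) {
  if (arguments.length === 1) {
    return new Float32Array([x, x, x, x]);
  }
  return new Float32Array([x, y, z, w]);
}

var f4 = makeFloat32x4(1.0, 2.0, 3.0, 4.0); // different values in the 4 lanes
var ones4 = makeFloat32x4(1.0);             // same value (1.0) in all lanes
```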
equal(false, c.w);
should be equal(false, c.flagW);
https://github.com/johnmccutchan/ecmascript_simd/blob/master/src/ecmascript_simd_tests.js#L394
I was coding up an implementation for a sinx4() function (computes the sin() value for all 4 lanes), and had to do bit manipulation on float32x4 values. Doing this involves doing bit conversion to int32x4 of the operands and then bit conversion back to float32x4 of the result. The code becomes fairly unreadable, e.g.
x = SIMD.int32x4.bitsToFloat32x4(SIMD.int32x4.and(SIMD.float32x4.bitsToInt32x4(x), _ps_inv_sign_mask));
With logical operations available on float32x4 it could be written as:
x = SIMD.float32x4.and(x, _ps_inv_sign_mask);
Thoughts?
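To illustrate what such a float32x4.and would do internally, here is a sketch on typed arrays that operates on the raw bits of each lane. float32x4And is a hypothetical helper; the lanes are modeled as a Float32Array(4) and the mask as an Int32Array(4). The mask value 0x7FFFFFFF (all bits except the sign bit) is an assumption about what _ps_inv_sign_mask holds.

```javascript
// Bitwise AND of float32 lanes with an integer mask, lane by lane.
function float32x4And(x, mask) {
  var result = new Float32Array(4);
  var xBits = new Int32Array(x.buffer, x.byteOffset, 4);
  var resultBits = new Int32Array(result.buffer);
  for (var i = 0; i < 4; i++) {
    resultBits[i] = xBits[i] & mask[i];
  }
  return result;
}

// Clearing the sign bit of every lane yields the absolute value.
var invSignMask = new Int32Array([0x7FFFFFFF, 0x7FFFFFFF, 0x7FFFFFFF, 0x7FFFFFFF]);
var x = new Float32Array([-1.5, 2.0, -0.0, -3.25]);
var absX = float32x4And(x, invSignMask);
```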