ecmascript_simd's People

Contributors

ajklein, arunetm, aylusltd, billbudge, bnjbvr, bterlson, dtig, elchi3, fenghaitao, flagxor, huningxin, johnmccutchan, kripken, littledan, mhaghigh, muojp, nmostafa, p-jensen, peterjensen, sunfishcode, thomas-daniels

ecmascript_simd's Issues

Missing comparison operations in int32x4

float32x4 has 6 comparison operations:

  • lessThan
  • lessThanOrEqual
  • equal
  • notEqual
  • greaterThanOrEqual
  • greaterThan

int32x4 only has three:

  • equal
  • greaterThan
  • lessThan
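
For integer lanes, the three missing comparisons can be derived from the three existing ones by inverting the mask. The sketch below is not the polyfill's code: it models an int32x4 as a plain Int32Array(4) and a comparison result as lanes of -1 (true) or 0 (false), and the helper names `compare` and `not` are hypothetical.

```javascript
// Hypothetical model: a mask is an Int32Array(4) with lanes -1 (true) or 0 (false).
function compare(a, b, predicate) {
  var mask = new Int32Array(4);
  for (var i = 0; i < 4; i++) {
    mask[i] = predicate(a[i], b[i]) ? -1 : 0;
  }
  return mask;
}

// Bitwise inversion turns -1 into 0 and 0 into -1.
function not(mask) {
  var out = new Int32Array(4);
  for (var i = 0; i < 4; i++) {
    out[i] = ~mask[i];
  }
  return out;
}

// The three operations the issue lists as already present on int32x4.
function equal(a, b) { return compare(a, b, function (x, y) { return x === y; }); }
function greaterThan(a, b) { return compare(a, b, function (x, y) { return x > y; }); }
function lessThan(a, b) { return compare(a, b, function (x, y) { return x < y; }); }

// The three missing ones, derived by inverting an existing mask.
// This is only valid for integers; float lanes holding NaN would break it.
function notEqual(a, b) { return not(equal(a, b)); }
function lessThanOrEqual(a, b) { return not(greaterThan(a, b)); }
function greaterThanOrEqual(a, b) { return not(lessThan(a, b)); }

var a = new Int32Array([1, 5, 3, 7]);
var b = new Int32Array([2, 5, 1, 7]);
var le = lessThanOrEqual(a, b); // lanes: -1, -1, 0, -1
```

Dedicated operations would still be preferable, since each derived comparison costs an extra inversion.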

Issue with use of .signMask in aobench?

I think there's a problem with how .signMask is used in the aobench benchmark:

Around line 360:

    var cond1 = SIMD.greaterThan(D, float32x4.zero());
    if (cond1.signMask) {
      var t2 = SIMD.sub(SIMD.neg(B), SIMD.sqrt(D));
      var cond2 = SIMD.and(SIMD.greaterThan(t2, float32x4.zero()),
                           SIMD.lessThan(t2, isect.t));
      if (cond2.signMask) {

This will go into the 'if' branches if just one of the x4 compares comes out to true and then do the computation for all 4. Is this intended? I don't fully understand the code, so I might have it wrong.

In any case, it's better to always compare the .signMask property with an explicit value. signMask returns an int in the range 0x0..0xf. In other words the lower 4 bits indicate the 4 compare results.
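
To make the difference concrete, here is a small sketch that does not use the SIMD API at all. `signMaskOf` is a hypothetical helper that builds the 4-bit mask from four per-lane booleans, with lane x in bit 0 and lane w in bit 3.

```javascript
// Hypothetical stand-in for a comparison result: four booleans, one per lane.
// Lane x maps to bit 0, y to bit 1, z to bit 2, w to bit 3.
function signMaskOf(lanes) {
  var mask = 0;
  for (var i = 0; i < 4; i++) {
    if (lanes[i]) {
      mask |= 1 << i;
    }
  }
  return mask;
}

// Only lane y passed the compare.
var mask = signMaskOf([false, true, false, false]);

// What a bare `if (cond.signMask)` effectively tests: at least one lane is true.
var anyLaneTrue = mask !== 0x0;

// An explicit test that every lane is true.
var allLanesTrue = mask === 0xf;
```

Here `anyLaneTrue` is true while `allLanesTrue` is false, so the two ways of writing the condition take different branches.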

Add shift operations to int32x4 values

When implementing sinx4() (see benchmarks/sinx4.js), I needed a shiftLeft() operation on int32x4 values. For completeness we should add these operations on int32x4 values:

shiftLeft()
shiftRight()
shiftRightArithmetic()
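
A minimal sketch of what the three operations might compute, modelling an int32x4 as an Int32Array(4). It assumes that shiftRight is a logical (zero-filling) shift and shiftRightArithmetic is sign-preserving, which the issue does not spell out; `mapLanes` is a hypothetical helper.

```javascript
// Apply fn to each of the four lanes and return a new Int32Array(4).
function mapLanes(v, fn) {
  var out = new Int32Array(4);
  for (var i = 0; i < 4; i++) {
    out[i] = fn(v[i]);
  }
  return out;
}

function shiftLeft(v, bits) {
  return mapLanes(v, function (x) { return x << bits; });
}

// Assumed to be a logical shift: vacated high bits are filled with zeros.
function shiftRight(v, bits) {
  return mapLanes(v, function (x) { return x >>> bits; });
}

// Assumed to be an arithmetic shift: vacated high bits copy the sign bit.
function shiftRightArithmetic(v, bits) {
  return mapLanes(v, function (x) { return x >> bits; });
}

var v = new Int32Array([1, -8, 0x40000000, -1]);
var left = shiftLeft(v, 1);                  // 2, -16, -2147483648, -2
var logical = shiftRight(v, 1);              // 0, 2147483644, 536870912, 2147483647
var arithmetic = shiftRightArithmetic(v, 1); // 0, -4, 536870912, -1
```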

The behavior of float32x4 and uint32x4 called as a function rather than as a constructor

In http://www.ecma-international.org/ecma-262/5.1/#sec-15, the built-in ECMAScript constructors behave differently from one another when called as a function.

For Number, String, Boolean, and Object, calling the constructor as a function performs a type conversion.

For Function and Array, calling the constructor as a function creates and initialises a new Function/Array object. Thus the function call is equivalent to the object creation expression new ... with the same arguments.

We think float32x4 and uint32x4 are more like numbers than arrays, so we create primitive float32x4/uint32x4 types and do the type conversion. The current polyfill implementation behaves as if the JavaScript engine had no primitive float32x4/uint32x4 type.
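
The two existing behaviors can be seen with the standard built-ins alone; nothing below depends on the SIMD polyfill.

```javascript
// Number called as a function converts its argument to a primitive number.
var converted = Number("42");

// Number called with `new` creates a wrapper object instead.
var wrapped = new Number(42);

// Array behaves the same either way: both calls create a new array object.
var viaCall = Array(3);
var viaNew = new Array(3);

var kinds = [
  typeof converted,       // "number"
  typeof wrapped,         // "object"
  Array.isArray(viaCall), // true
  Array.isArray(viaNew)   // true
];
```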

Define select semantics if mask's values are not only all 1 or all 0

int32x4.select and float32x4.select are intended to be used with masks that have either all bits set to 1 (0xFFFFFFFF) or all set to 0 (0x0). What happens if that is not the case?

Currently, if the input mask to select is, say, int32x4(0x1, 0xF, 0xFF, 0xFFFF), the resulting output vector will be a strange mix of both inputs in each lane. Is that the intended behavior?
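
The mixing can be reproduced with a sketch that assumes select is a per-bit blend, `(mask & trueValue) | (~mask & falseValue)`. That formula is an assumption made for illustration, not something the issue confirms about the polyfill; lanes are modelled as Int32Array(4).

```javascript
// Assumed semantics: each result bit comes from trueValue where the mask bit
// is 1 and from falseValue where the mask bit is 0.
function select(mask, trueValue, falseValue) {
  var out = new Int32Array(4);
  for (var i = 0; i < 4; i++) {
    out[i] = (mask[i] & trueValue[i]) | (~mask[i] & falseValue[i]);
  }
  return out;
}

var t = new Int32Array([10, 20, 30, 40]);
var f = new Int32Array([1, 2, 3, 4]);

// A well-formed mask picks whole lanes: 10, 2, 30, 4.
var clean = select(new Int32Array([-1, 0, -1, 0]), t, f);

// A partial mask blends bits. With all-ones and all-zeros inputs the result
// is the mask itself (1, 15, 255, 65535), which matches neither input.
var allOnes = new Int32Array([-1, -1, -1, -1]);
var allZeros = new Int32Array(4);
var mixed = select(new Int32Array([0x1, 0xF, 0xFF, 0xFFFF]), allOnes, allZeros);
```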

README documentation

May be useful to move the SIMD operations back to the README even if they're now under the SIMD module rather than the float32x4 object. Otherwise the polyfill code serves as the API documentation.

SIMD.float32x4.fromFloat64x2 incorrect type conversion

Current implementation just copies the double value without casting to float32.

    SIMD.float32x4.fromFloat64x2 = function(t) {
      checkFloat64x2(t);
      var a = SIMD.float32x4.zero();
      a.x_ = t.x_;
      a.y_ = t.y_;
      return a;
    }

Using the float32x4 constructor fixes this.

    SIMD.float32x4.fromFloat64x2 = function(t) {
      checkFloat64x2(t);
      var a = SIMD.float32x4(t.x_, t.y_, 0, 0);
      return a;
    }
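
The effect of the missing cast is easy to observe with standard JavaScript. The sketch below uses Math.fround and a Float32Array as stand-ins for whatever rounding the constructor performs, which is an assumption about the polyfill's internals.

```javascript
var d = 0.1; // a double that is not exactly representable as a float32

// Copying the double keeps full double precision.
var rawCopy = d;

// Rounding to float32, either explicitly or by storing into a Float32Array,
// yields the nearest single-precision value instead.
var rounded = Math.fround(d);
var viaTypedArray = new Float32Array([d])[0];

var copyDiffers = rawCopy !== rounded;          // true
var roundingAgrees = rounded === viaTypedArray; // true
```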

Remove source type from conversion method names

With the introduction of the SIMD.float32x4 and SIMD.int32x4 subobjects, the source type on the conversion function names is redundant, e.g.

    SIMD.float32x4.float32x4toInt32x4()

Can be:

   SIMD.float32x4.toInt32x4()

Similarly for the other ones.

It even reads right :)

What is the expected result of Number(float32x4Object)?

    f4 = float32x4(1.0, 2.0, 3.0, 4.0);
    n = Number(f4);

What is the expected value of n?

I am asking because we need to handle

    f4 = float32x4(1.0, 2.0, 3.0, 4.0);
    g4 = float32x4(f4, 2.0, 3.0, 4.0);

when introducing the float32x4 type; all the other types can be coerced to Number.

How will native code port on top of JS-SIMD?

With Emscripten, we have the capacity to port native C&C++ code to the web. When/if people read tweets along the lines of "JS has SIMD", it will invariably result in a stream of Emscripten developers attempting to port their MMX/SSE1/SSE2/... -based codebases over to JS-SIMD. We need to have an answer to these developers about what the support of mapping these constructs over to JS-SIMD looks like.

In the Emscripten compiler, we already have small bits of such SIMD support available. To chart what this mapping would look like for SSE1 in particular (focusing on just one instruction set spec to start with, and SSE1 is the most interesting one) when completed, I wrote up this spreadsheet: https://docs.google.com/spreadsheets/d/1QAGGf2M2IA6l4cvh8eTXdXGEUcPjdmTe_BLKGn5YCB4/edit?usp=sharing

As one can imagine, comparing the current spec and the set of SSE1 intrinsics listed in the above spreadsheet, there is a large gap. I wonder how this could be resolved?

Rename variables to match common expectations of types & dimensions.

For integer types, use property names such as "i", "j", "k" and "l". These are widely used by developers as index names, a convention inherited from mathematics, where they serve as indices as well as unit vectors.

For floating-point types, use property names such as "x", "y", "z" and "t" (instead of "w"). In physics the fourth dimension is considered to be time, as in the space-time vector of special relativity.

Add compare operations on int32x4 values

When implementing the sinx4() function, I needed an equal() operation on int32x4 values. For completeness we should add all the 6 compare operations on int32x4 values:

lessThan()
lessThanOrEqual()
equal()
notEqual()
greaterThanOrEqual()
greaterThan()

Introduce add, sub, and mul for uint32x4 type

There are several ways to do this:

  1. Overload the existing operations to work on both float32x4 and uint32x4 operands

I don't like this approach. It will require the JIT generated code to insert checks on the operand types. The optimizing JIT compilers can probably hoist those checks out in most cases, but it does complicate doing inlining somewhat.

  2. Introduce new names for float32x4 and uint32x4 operations, e.g. addf4, addi4, etc.

We'll need to add even more names when we start working on the AVX x8 types, so I'm not too fond of this solution either.

  3. Introduce subobjects on the SIMD object to hold the operations that work on the different data types, e.g. SIMD.float32x4.add(), SIMD.uint32x4.add(), etc.

I like this approach better than the two above.
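
A rough sketch of the subobject layout, to show that each `add` can be specialised for its lane type without any operand checks. `SIMDSketch` is a hypothetical namespace, and vectors are modelled as typed arrays of length 4 rather than real SIMD values.

```javascript
// Hypothetical namespace mirroring the proposed SIMD.<type>.<operation> shape.
var SIMDSketch = {
  float32x4: {
    add: function (a, b) {
      var out = new Float32Array(4); // results are rounded to float32
      for (var i = 0; i < 4; i++) {
        out[i] = a[i] + b[i];
      }
      return out;
    }
  },
  uint32x4: {
    add: function (a, b) {
      var out = new Uint32Array(4); // results wrap modulo 2^32
      for (var i = 0; i < 4; i++) {
        out[i] = a[i] + b[i];
      }
      return out;
    }
  }
};

var fsum = SIMDSketch.float32x4.add(
  new Float32Array([0.5, 1, 1.5, 2]),
  new Float32Array([0.25, 0.25, 0.25, 0.25])
); // 0.75, 1.25, 1.75, 2.25

var usum = SIMDSketch.uint32x4.add(
  new Uint32Array([0xFFFFFFFF, 1, 2, 3]),
  new Uint32Array([1, 1, 1, 1])
); // 0, 2, 3, 4 (the first lane wraps around)
```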

What do you think?

The polyfill for float32x4() and uint32x4() shouldn't be invoked as a constructor

float32x4 will eventually be a value_object, so to create a float32x4 variable one would write:

    var f4 = float32x4(1.0,2.0,3.0,4.0)

The current polyfill implementation assumes that new values are created with the 'new' operator, i.e.

    function float32x4(x,y,z,w) {
      this.storage_ = new Float32Array(4);
      this.storage_[0] = x;
      this.storage_[1] = y;
      this.storage_[2] = z;
      this.storage_[3] = w;
    }

This should be implemented like this instead:

    function float32x4(x,y,z,w) {
      var storage = new Float32Array(4);
      storage[0] = x;
      storage[1] = y;
      storage[2] = z;
      storage[3] = w;
      return storage;
    }

I think :)
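
Another possibility, different from returning the raw storage as proposed above, is the common pattern of having the function re-invoke itself with `new` when it is called plainly. This is only a sketch of that pattern, not the polyfill's code; it keeps the `storage_` field from the current implementation.

```javascript
function float32x4(x, y, z, w) {
  // When called without `new`, `this` is not a float32x4 instance,
  // so construct one and return it.
  if (!(this instanceof float32x4)) {
    return new float32x4(x, y, z, w);
  }
  this.storage_ = new Float32Array([x, y, z, w]);
}

var viaCall = float32x4(1.0, 2.0, 3.0, 4.0);
var viaNew = new float32x4(1.0, 2.0, 3.0, 4.0);
```

Both calls produce a float32x4 instance, so existing code that uses `new` keeps working while the call form becomes available.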

Float32x4Array constructor doesn't work

    load ('ecmascript_simd.js');
    var f = new Float32Array(20);
    var f4 = new Float32x4Array(f.buffer);
    print (f4.length); // prints 0. 5 expected.

The problem is that Float32Array(a,b,undefined) is not equivalent to Float32Array(a,b).

This patch should fix the problem:

    diff --git a/src/ecmascript_simd.js b/src/ecmascript_simd.js
    index e441ca5..60fc2ef 100644
    --- a/src/ecmascript_simd.js
    +++ b/src/ecmascript_simd.js
    @@ -195,13 +195,17 @@ function Float32x4Array(a, b, c) {
           this.storage_[i] = a.storage_[i];
         }
       } else if (isArrayBuffer(a)) {
    -    if ((b != undefined) && (b % Float32Array.BYTES_PER_ELEMENT) != 0) {
    +    if ((b != undefined) && (b % Float32x4Array.BYTES_PER_ELEMENT) != 0) {
           throw "byteOffset must be a multiple of 16.";
         }
         if (c != undefined) {
           c *= 4;
    +      this.storage_ = new Float32Array(a, b, c);
    +    }
    +    else {
    +      // Note: new Float32Array(a, b) is NOT equivalent to new Float32Array(a, b, undefined)
    +      this.storage_ = new Float32Array(a, b);
         }
    -    this.storage_ = new Float32Array(a, b, c);
         this.length_ = this.storage_.length / 4;
         this.byteOffset_ = b != undefined ? b : 0;
       } else {
    diff --git a/src/ecmascript_simd_tests.js b/src/ecmascript_simd_tests.js
    index 374b7cb..7fb8a26 100644
    --- a/src/ecmascript_simd_tests.js
    +++ b/src/ecmascript_simd_tests.js
    @@ -436,7 +436,7 @@ test('uint32x4 and', function() {
       equal(true, n.flagY);
       equal(true, n.flagZ);
       equal(true, n.flagW);
    -  o = SIMD.and(m,n); // and
    +  var o = SIMD.and(m,n); // and
       equal(0x0, o.x);
       equal(0x0, o.y);
       equal(0x0, o.z);
    @@ -472,7 +472,7 @@ test('uint32x4 xor', function() {
       equal(0xAAAAAAAA, n.y);
       equal(0xAAAAAAAA, n.z);
       equal(0xAAAAAAAA, n.w);
    -  o = SIMD.xor(m,n); // xor
    +  var o = SIMD.xor(m,n); // xor
       equal(0x0, o.x);
       equal(0x0, o.y);
       equal(0x0, o.z);
    @@ -675,6 +675,7 @@ test('Float32Array view basic', function() {
       equal(b.byteOffset, 0);
       equal(c.byteOffset, 16);
       equal(d.byteOffset, 0);
    +
     });

     test('Float32Array view values', function() {
    @@ -742,22 +743,26 @@ test('Float32Array view values', function() {
       equal(start+3, d.getAt(0).w);
     });

    -test('Float32x4Array exceptions', function() {
    +test('Float32x4Array exceptions', function () {
       var a = new Float32x4Array(4);
       var b = a.getAt(0);
       var c = a.getAt(1);
       var d = a.getAt(2);
       var e = a.getAt(3);
    -  throws(function() {
    +  throws(function () {
         var f = a.getAt(4);
       });
    -  throws(function() {
    +  throws(function () {
         var f = a.getAt(-1);
       });
    -  throws(function() {
    +  throws(function () {
         // Unaligned byte offset.
         var f = new Float32x4Array(a.buffer, 15);
       });
    +  throws(function () {
    +    // Unaligned byte offset, but aligned on 4. Bug
    +    var f = new Float32x4Array(a.buffer, 4);
    +  });
     });

     test('View on Float32x4Array', function() {

Polyfill -- check runtime SIMD implementation status

Don't define the SIMD polyfills if the runtime is detected to already implement float32x4 etc.
This enables the same code to run across different implementations.
If we don't want to keep this check in the polyfill, we should at least provide a best-practice guide.
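
One way such a guard could look is sketched below. `installSimdPolyfill` and `createPolyfill` are hypothetical names, and the global object is passed in explicitly so the behavior can be shown with stand-in objects instead of a real runtime.

```javascript
// Install the polyfill only when the runtime does not already provide SIMD.
// Returns true if the polyfill was installed, false if it was skipped.
function installSimdPolyfill(globalObject, createPolyfill) {
  if (typeof globalObject.SIMD !== "undefined") {
    return false; // keep the existing (possibly native) implementation
  }
  globalObject.SIMD = createPolyfill();
  return true;
}

// Stand-ins for two runtimes: one that ships SIMD and one that does not.
var runtimeWithSimd = { SIMD: { native: true } };
var runtimeWithoutSimd = {};
var createPolyfill = function () { return { native: false }; };

var installedOnNative = installSimdPolyfill(runtimeWithSimd, createPolyfill);    // false
var installedOnLegacy = installSimdPolyfill(runtimeWithoutSimd, createPolyfill); // true
```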

signMask should handle negative zero correctly

In the current polyfill implementation, negative zero is not handled in signMask.

For this case:

    var a  = SIMD.float32x4(0.0, 0.0, 0.0, -0.0)
    a.signMask

It is expected to print 8.
Currently the polyfill implementation prints 0.
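
A sketch of one way to get the expected result: read the sign bit from the raw float32 bits instead of comparing the value with zero. It assumes the lanes can be placed in a Float32Array, and it is not taken from the polyfill.

```javascript
// Build the 4-bit sign mask from the IEEE 754 sign bit of each lane.
function signMask(lanes) {
  var floats = new Float32Array(lanes);
  var bits = new Int32Array(floats.buffer); // same bytes, viewed as integers
  var mask = 0;
  for (var i = 0; i < 4; i++) {
    // >>> 31 moves the sign bit down to bit 0, giving 0 or 1.
    mask |= (bits[i] >>> 31) << i;
  }
  return mask;
}

// A value comparison misses negative zero because -0 < 0 is false.
function naiveSignMask(lanes) {
  var mask = 0;
  for (var i = 0; i < 4; i++) {
    if (lanes[i] < 0) {
      mask |= 1 << i;
    }
  }
  return mask;
}

var viaBits = signMask([0.0, 0.0, 0.0, -0.0]);         // 8
var viaCompare = naiveSignMask([0.0, 0.0, 0.0, -0.0]); // 0
```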

load vec2 and vec3

I was speaking with @sunfishcode in #asm.js and brought up a use case that I don't think is represented in the API yet.

Sometimes you want to use the platform's SIMD capabilities to work on two-vectors or four-vectors. (Or two two-vectors at a time.)

SSE supports roughly three mechanisms for loading two-vectors:

  • movq xmm0, [eax] ; load 64-bit quantity, zero high 64 bits
  • movlps xmm0, [eax] ; load 64-bit quantity, do not zero high 64 bits
  • movhps xmm0, [eax] ; load 64-bit into high 64 bits, do not zero low 64 bits

Loading three-vectors is hard. In the past I've done something like:

    movss xmm0, [eax]
    movhps xmm0, [eax+4]

This leaves the 3-vector in the register like [x, _, y, z], which is as fine as any other data layout.

I think it would be beneficial for the API to provide, at the least,

  1. load a 3-vector into a vec4
  2. load a 2-vector into the low components of a vec4
  3. load a 2-vector into the high components of a vec4

Function to load consecutive values from an array

It seems like it would be useful to have something like SIMD.float32x4.load(array, offset) that is equivalent to SIMD.float32x4(array[offset+0], array[offset+1], array[offset+2], array[offset+3]). The two are clearly equivalent, but the load form is easier to optimize. Also, I would explicitly not want to require that offset be aligned in any particular way.

One open question is how permissive to be: specifically, could array be any array-like, or must it be a typed array / typed object array of a suitable type?
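
The equivalence can be sketched as follows, with the result modelled as a Float32Array(4) instead of a real float32x4. This version takes the permissive answer to the open question and accepts any array-like, which is an assumption made for illustration.

```javascript
// Read four consecutive elements starting at `offset`. No alignment of
// `offset` is required, and `array` may be any array-like object.
function load(array, offset) {
  var lanes = new Float32Array(4);
  for (var i = 0; i < 4; i++) {
    lanes[i] = array[offset + i];
  }
  return lanes;
}

var fromPlainArray = load([10, 11, 12, 13, 14, 15], 1);                  // 11, 12, 13, 14
var fromTypedArray = load(new Float32Array([0, 1, 2, 3, 4, 5, 6, 7]), 3); // 3, 4, 5, 6
```

Reading past the end of the array yields undefined, which becomes NaN in the lane; a real API would need to decide whether to throw instead.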

Is NaN a valid value for a float32x4 lane?

Could we write a=float32x4(NaN, NaN, NaN, NaN)?

Do we want to raise an exception when a SIMD operation overflows, underflows, or receives an invalid operand? If yes, we need try and catch around SIMD operations. If no, a QNaN should be a valid lane value. Note that in the Intel ISA manual the underflow and overflow exceptions apply only to floating-point operations, not to Uint32 operations.

We need to define the behavior in the spec.

License for polyfills

Please add a license to the polyfill files, so that they can be used in other projects (specifically I am starting to work on SIMD in emscripten now). MIT license would be nice :)

Define behavior for special Float32 values

The ES6 specification defines special behavior for NaN in many operations that are also implemented for float32x4. For instance, Math.min returns NaN if any of the operands is NaN [1]. The polyfill doesn't have such behavior, and if you do

    var x = SIMD.float32x4(1,2,3,4)
    var y = SIMD.float32x4(NaN, NaN, NaN, NaN)

You'll get a different value depending on whether you do SIMD.float32x4.min(x, y) or SIMD.float32x4.min(y, x), which seems weird with respect to the semantics of min (I'd expect min to be strictly commutative).
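
The asymmetry is what a plain comparison produces. The sketch below assumes a min written as `a < b ? a : b` per lane, which may or may not be how the polyfill does it, and contrasts it with Math.min. Lanes are modelled as Float32Array(4).

```javascript
// Hypothetical comparison-based min: any comparison with NaN is false,
// so the second operand always wins when either one is NaN.
function naiveMin(a, b) {
  var out = new Float32Array(4);
  for (var i = 0; i < 4; i++) {
    out[i] = a[i] < b[i] ? a[i] : b[i];
  }
  return out;
}

// Math.min propagates NaN regardless of operand order.
function nanAwareMin(a, b) {
  var out = new Float32Array(4);
  for (var i = 0; i < 4; i++) {
    out[i] = Math.min(a[i], b[i]);
  }
  return out;
}

var x = new Float32Array([1, 2, 3, 4]);
var y = new Float32Array([NaN, NaN, NaN, NaN]);

var xy = naiveMin(x, y);         // NaN, NaN, NaN, NaN
var yx = naiveMin(y, x);         // 1, 2, 3, 4
var xyAware = nanAwareMin(x, y); // NaN, NaN, NaN, NaN
var yxAware = nanAwareMin(y, x); // NaN, NaN, NaN, NaN
```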

There might be other places where specific behavior should also be defined for other special Float32 values, such as +/- Infinity, +/- 0.

[1] http://people.mozilla.org/~jorendorff/es6-draft.html#sec-math.min

movemask for branching

A SIMD operation similar to movemask would be helpful for branching after a SIMD compare.

Type checking

The V8 runtime implementation throws an error if you call with the wrong type, for example a binary op with float32x4 w/o bitcasting; may be useful to have a debug version of the polyfill that checks using instanceof.

naming of conversion methods

if they're going to be on the simd module rather than the object, should the name include the from type?

(p.s. nitpick: might want to put all 4 conversion methods together, esp in the absence of API README doc, didn't see bitsToUint32x4 at first b/c not with the others)

Support loading/storing SIMD types from array buffer without 16-bytes alignment.

Problem statement

We have extended the Typed Array View Types with Float32x4Array, Float64x2Array and Int32x4Array.

These SIMD typed array views load SIMD data at 16-byte-aligned offsets. In some use cases it is very hard or impossible to arrange the data that way. These use cases require loading SIMD types from an array buffer without 16-byte alignment, similar to using the C++ intrinsics _mm_loadu_ps/_mm_storeu_ps.

Possible solutions

There are two options to extend Typed Array Specification and one option to extend SIMD module.

  • Option 1: extend DataView interface
partial interface DataView {
    SIMD.float32x4 getFloat32x4(unsigned long byteOffset, optional boolean littleEndian);
    SIMD.float64x2 getFloat64x2(unsigned long byteOffset, optional boolean littleEndian);
    SIMD.int32x4 getInt32x4(unsigned long byteOffset, optional boolean littleEndian);
    void setFloat32x4(unsigned long byteOffset, SIMD.float32x4 value, optional boolean littleEndian);
    void setFloat64x2(unsigned long byteOffset, SIMD.float64x2 value, optional boolean littleEndian);
    void setInt32x4(unsigned long byteOffset, SIMD.int32x4 value, optional boolean littleEndian);
};
  • Option 2: extend Typed Array Buffer View interface
partial interface Float32Array {
    SIMD.float32x4 getFloat32x4(unsigned long index);
    void setFloat32x4(unsigned long index, SIMD.float32x4 value);
};

partial interface Float64Array {
    SIMD.float64x2 getFloat64x2(unsigned long index);
    void setFloat64x2(unsigned long index, SIMD.float64x2 value);
};

partial interface Int32Array {
    SIMD.int32x4 getInt32x4(unsigned long index);
    void setInt32x4(unsigned long index, SIMD.int32x4 value);
};
  • Option 3: introduce memory load/store APIs in SIMD module:
SIMD.float32x4 SIMD.float32x4.load(Float32Array array, unsigned long index);
void SIMD.float32x4.store(Float32Array array, unsigned long index, SIMD.float32x4 value);

SIMD.float64x2 SIMD.float64x2.load(Float64Array array, unsigned long index);
void SIMD.float64x2.store(Float64Array array, unsigned long index, SIMD.float64x2 value);

SIMD.int32x4 SIMD.int32x4.load(Int32Array array, unsigned long index);
void SIMD.int32x4.store(Int32Array array, unsigned long index, SIMD.int32x4 value);
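
As a rough illustration of what an unaligned load in the spirit of Option 1 would return, the sketch below uses the existing DataView.getFloat32. `getFloat32x4` here is a free-standing hypothetical function, the result is a Float32Array(4) instead of a real float32x4, and applying the littleEndian flag to each lane separately is an assumption.

```javascript
// Read four float32 lanes starting at an arbitrary byte offset.
function getFloat32x4(view, byteOffset, littleEndian) {
  var lanes = new Float32Array(4);
  for (var i = 0; i < 4; i++) {
    lanes[i] = view.getFloat32(byteOffset + 4 * i, littleEndian);
  }
  return lanes;
}

var buffer = new ArrayBuffer(32);
var view = new DataView(buffer);

// Store 1, 2, 3, 4 starting at byte offset 3, which is neither a multiple
// of 4 nor of 16, so no Float32Array or Float32x4Array view could reach it.
for (var i = 0; i < 4; i++) {
  view.setFloat32(3 + 4 * i, i + 1, true);
}

var lanes = getFloat32x4(view, 3, true); // 1, 2, 3, 4
```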

Add/plan for 256 bit and 512 bit SIMD functions?

Hi, since last year we have a 256-bit (8 x int32) SIMD ISA shipping (AVX2) in Haswell processors, and it seems that next year we will also have 512-bit SIMD support (i.e. 16 x int32) in the form of AVX512. Executing 128-bit SIMD instructions on a 512-bit SIMD-capable processor (Intel Skylake?) is only 25% efficient, i.e. similar to using no SIMD at all on an SSE-only processor (pre-2011, before Sandy Bridge). So it seems you should already plan to add Int32x8 and Int32x16 instructions, which on a processor with only 128-bit SIMD support would be lowered to 2 or 4 int32x4 instructions respectively. Make sense?
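
The lowering described here can be sketched with typed arrays: an 8-lane add is expressed as two 4-lane adds over the low and high halves. `add4` and `add8` are hypothetical helpers, not proposed API names.

```javascript
// 4-lane integer add; stands in for a single int32x4 instruction.
function add4(a, b) {
  var out = new Int32Array(4);
  for (var i = 0; i < 4; i++) {
    out[i] = a[i] + b[i];
  }
  return out;
}

// 8-lane add lowered to two 4-lane adds, as a 128-bit-only target would do.
function add8(a, b) {
  var out = new Int32Array(8);
  out.set(add4(a.subarray(0, 4), b.subarray(0, 4)), 0);
  out.set(add4(a.subarray(4, 8), b.subarray(4, 8)), 4);
  return out;
}

var sum = add8(
  new Int32Array([1, 2, 3, 4, 5, 6, 7, 8]),
  new Int32Array([10, 20, 30, 40, 50, 60, 70, 80])
); // 11, 22, 33, 44, 55, 66, 77, 88
```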

Introduce a 'same value' initializer

I believe it's pretty common to initialize a float32x4 or uint32x4 value to have the same value in all lanes.

We could overload the initializer to do the right thing depending on the number of arguments, i.e.

    var f4 = float32x4(1.0, 2.0, 3.0, 4.0); // different values in the 4 lanes
    var ones4 = float32x4(1.0); // same value (1.0) in all lanes

Overloading the number of arguments shouldn't cause a perf issue, since this can be statically determined at JIT time.
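
A sketch of the overload, dispatching on arguments.length. The value is modelled as a Float32Array(4), and only the one-argument and four-argument forms from the example above are handled.

```javascript
function float32x4(x, y, z, w) {
  var lanes = new Float32Array(4);
  if (arguments.length === 1) {
    // Single argument: replicate it into every lane.
    for (var i = 0; i < 4; i++) {
      lanes[i] = x;
    }
  } else {
    lanes[0] = x;
    lanes[1] = y;
    lanes[2] = z;
    lanes[3] = w;
  }
  return lanes;
}

var f4 = float32x4(1.0, 2.0, 3.0, 4.0); // 1, 2, 3, 4
var ones4 = float32x4(1.0);             // 1, 1, 1, 1
```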

Add logical operations to SIMD.float32x4 (and, or, xor, not)

I was coding up an implementation for a sinx4() function (computes the sin() value for all 4 lanes), and had to do bit manipulation on float32x4 values. Doing this involves doing bit conversion to int32x4 of the operands and then bit conversion back to float32x4 of the result. The code becomes fairly unreadable, e.g.

 x = SIMD.int32x4.bitsToFloat32x4(SIMD.int32x4.and(SIMD.float32x4.bitsToInt32x4(x), _ps_inv_sign_mask));

With logical operations available on float32x4 it could be written as:

 x = SIMD.float32x4.and(x, _ps_inv_sign_mask);

Thoughts?
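
What such an `and` would do internally can be sketched with two typed-array views over the same bytes. `float32x4And` is a hypothetical helper, and the inverse sign mask is assumed to be 0x7fffffff in every lane, which clears the sign bit and so computes an absolute value.

```javascript
// Bitwise AND of float32 lanes with an integer mask, done on the raw bits.
function float32x4And(lanes, maskBits) {
  var floats = new Float32Array(lanes);     // private copy of the input lanes
  var bits = new Int32Array(floats.buffer); // same bytes, viewed as integers
  for (var i = 0; i < 4; i++) {
    bits[i] &= maskBits[i];
  }
  return floats;
}

// Assumed value of the inverse sign mask: every bit set except the sign bit.
var invSignMask = new Int32Array([0x7fffffff, 0x7fffffff, 0x7fffffff, 0x7fffffff]);

var result = float32x4And([-1.5, 2, -0, -3], invSignMask); // 1.5, 2, 0, 3
```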
