Code Monkey home page Code Monkey logo

Comments (21)

wshager avatar wshager commented on July 19, 2024

This emulation of smmintrin.h (adapted from SSEPlus) suffices to compile. I wanted to see what Emscripten would make of this, but the generated javascript floods the CPU...

smmintrin.h.txt

Good luck with the project.

from simdcomp.

wshager avatar wshager commented on July 19, 2024

Correction: the first two tests work, the simple_demo() hangs (compiled with Emscripten).

from simdcomp.

weltling avatar weltling commented on July 19, 2024

Hi,

i've started to work on this having half a free day. But it is somehow strange :) I had to find out that i have no single machine not supporting sse4.1. Nevertheless, i made this small patch locally in the makefile:

diff --git a/makefile b/makefile
index 0e1c8de..228e737 100644
--- a/makefile
+++ b/makefile
@@ -2,10 +2,17 @@
 .SUFFIXES:
 #
 .SUFFIXES: .cpp .o .c .h
+
+ifeq ($(SSE),2)
+SSE_ARG=-msse2
+else
+SSE_ARG=-msse4.1
+endif
+
 ifeq ($(DEBUG),1)
-CFLAGS = -fPIC  -std=c89 -ggdb -msse4.1 -march=native -Wall -Wextra -pedantic
+CFLAGS = -fPIC  -std=c89 -ggdb $(SSE_ARG) -march=native -Wall -Wextra -pedantic
 else
-CFLAGS = -fPIC -std=c89 -O3 -msse4.1  -march=native -Wall -Wextra -pedantic
+CFLAGS = -fPIC -std=c89 -O3 $(SSE_ARG) -march=native -Wall -Wextra -pedantic
 endif # debug
 LDFLAGS = -shared
 LIBNAME=libsimdcomp.so.0.0.3

And then i compile like make SSE=2. Surprisingly, everything compiles fine and unit tests pass through. It is strange - firstly, it should not even compile, should it? Or, it could be good, that gcc already emulates something. Or gcc ignores this option as the VM supports SSE4_1. I have no chance to move the binaries and test on a machine without SSE4_1 as i don't have such HW. Can someone explain this?

Thanks.

from simdcomp.

weltling avatar weltling commented on July 19, 2024

@wshager thanks for the interest. Could you please be more verbose about how you compile and test it, and what HW was used?

Thanks.

from simdcomp.

wshager avatar wshager commented on July 19, 2024

@weltling I used Emscripten to compile, and since it only supports SIMD intrinsics up to SSE2, newer headers simply won't be found in the Emscripten lib. The HW doesn't seem relevant in this case. The current lack of SIMD support in javascript kinda puts this project on hold for me, since I'm looking for a solution in the browser.

from simdcomp.

lemire avatar lemire commented on July 19, 2024

@weltling

SSE 4.1 was introduced ~ 2007. I am not sure whether anyone makes x64 CPUs today without SSE 4.1.

Anyone knows?

from simdcomp.

lemire avatar lemire commented on July 19, 2024

Checked in a version of the code that should compile fine given the proper compiler settings.

@wshager Hopefully, the code should be compilable by Emscripten if there is SIMD support.

from simdcomp.

weltling avatar weltling commented on July 19, 2024

@lemire, yeah, i was linking this ttp://store.steampowered.com/hwsurvey in another thread which tells that sse4.1 is still used by round 83% in the gaming, which should be the indicator.

I was planning long to go for the emulation layer, just it was quite busy times on my side inbetween. But now when i've came to it, the strange thing was that gcc -msse2 was compiling just fine the existing code even before your patch. So maybe gcc already inserts some emulation, or ignores the option. That's why I was wondering :) BTW Visual Studio seems to not to have options to enforce some specific SSE version compatibility, seems the runtime will do the job. But by the plan I had was to implement a compatibility layer explicitly.

Thanks.

from simdcomp.

lemire avatar lemire commented on July 19, 2024

The goal of my fix is to allow @wshager to go forward with his plan to compile this library with Emscripten. I'd be very interested in hearing about the results...

yeah, i was linking this http://store.steampowered.com/hwsurvey in another thread which tells that sse4.1 is still used by round 83% in the gaming, which should be the indicator.

It is an interesting reference.

the strange thing was that gcc -msse2 was compiling just fine the existing code even before your patch. So maybe gcc already inserts some emulation, or ignores the option.

I do not think that -msse2 precludes SSE 4.1. If you want to forbid the compiler to use SSE 4.1, you have to use a negative flag like -mno-sse4.1.

You can certainly emulate SSE 4.1 functions, but you will get awful performance. It is probably not worth it. Better not use SIMD instructions in the first place if you have to support obsolete hardware.

from simdcomp.

weltling avatar weltling commented on July 19, 2024

@lemire great, -msse2 -mno-sse4.1 worked as expected. Looks like gcc expects the least version with -m, instead of a most one. Was my wrong interpretation. With these options now it properly fails to compile.

I got it exactly the way you told. With your fix it satisfies SSE2. But effectively it excludes several functionality. I was more talking about keeping the APIs, but having necessary parts emulated. Regarding performance - clear, SSE 2 were worse than 4.1. But I were more about compatibility. It's just down to whether to retain APIs compatible vs. moving forward with features.

Another point is also, that SIMD in any form will certainly provide a better performance compared to plain C/C++. More practical applicability vs. speed within SSE versions, as round 1/5 of cases sounds still significant to me. OFC i don't insist on this, just it was what we discussed earlier, why i opened this issue back then. I'm likely to call it a day then, mostly for the time reasons for my part.

Thanks.

from simdcomp.

lemire avatar lemire commented on July 19, 2024

SIMD in any form will certainly provide a better performance compared to plain C/C++.

Some non-SSE2 SIMD instructions we use (e.g., pshufb) cannot be emulated using SSE2 instructions. So you need to write the SIMD register to memory, reload it into regular registers, do the computation, then reload a SIMD register. That's going to be atrociously slow in some cases.

Even when the SIMD instruction can be emulated with SSE2 instructions (e.g., pmaxud), it may involve more operations than a non-SIMD algorithm would use. On the whole, you may end up with a significantly worse latency and throughput.

from simdcomp.

wshager avatar wshager commented on July 19, 2024

@lemire thanks! The example now compiles to Emscripten. Steps to reproduce:

emcc -O2 src/simdbitpacking.c -o simdbitpacking.bc -msse2 -Iinclude
emcc -O2 src/simdcomputil.c -o simdcomputil.bc -msse2 -Iinclude
emcc -O2 src/simdintegratedbitpacking.c -o simdintegratedbitpacking.bc -msse2 -Iinclude
emcc -O2 example.c -o example.bc -msse2 -Iinclude
emcc -O2 example.bc simdbitpacking.bc simdintegratedbitpacking.bc simdcomputil.bc -o project.html -s ALLOW_MEMORY_GROWTH=1

I'll puzzle some more on dynamic calls from JS later, and report my experiences in #13. Moving forward, I expect Emscripten to support SSE4.1 in short term.
emscripten-core/emscripten#4030

from simdcomp.

juj avatar juj commented on July 19, 2024

This looks like a very interesting use case of SIMD.js! I'd be very curious to see comparison graphs of native scalar vs native SIMD vs JS scalar vs JS SIMD performance to get a reality check of the suitability of the upcoming SIMD.js specification on your SIMD use case. If you generate such benchmark results at some point, please ping me in for the numbers!

from simdcomp.

wshager avatar wshager commented on July 19, 2024

@juj I'd be happy to.

from simdcomp.

lemire avatar lemire commented on July 19, 2024

@wshager

Which version of emscripten do you use? The fact that there is no -msse2 flag on my version does not make me hopeful.

$ emcc --version
emcc (Emscripten GCC-like replacement) 1.10.0 ()
Copyright (C) 2013 the Emscripten authors (see AUTHORS.txt)
This is free and open source software under the MIT license.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ emcc -O2 src/simdbitpacking.c -o simdbitpacking.bc -msse2 -Iinclude
clang: warning: argument unused during compilation: '-msse2'
In file included from src/simdbitpacking.c:4:
In file included from include/simdbitpacking.h:14:
In file included from include/simdcomputil.h:11:
In file included from /usr/share/emscripten/system/include/emscripten/emmintrin.h:1:
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:6:1: error: redefinition of '_mm_set_ps'
_mm_set_ps(float z, float y, float x, float w)
^
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:6:1: note: previous definition is here
_mm_set_ps(float z, float y, float x, float w)
^
In file included from src/simdbitpacking.c:4:
In file included from include/simdbitpacking.h:14:
In file included from include/simdcomputil.h:11:
In file included from /usr/share/emscripten/system/include/emscripten/emmintrin.h:1:
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:12:1: error: redefinition of '_mm_set1_ps'
_mm_set1_ps(float w)
^
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:12:1: note: previous definition is here
_mm_set1_ps(float w)
^
In file included from src/simdbitpacking.c:4:
In file included from include/simdbitpacking.h:14:
In file included from include/simdcomputil.h:11:
In file included from /usr/share/emscripten/system/include/emscripten/emmintrin.h:1:
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:18:1: error: redefinition of '_mm_setzero_ps'
_mm_setzero_ps(void)
^
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:18:1: note: previous definition is here
_mm_setzero_ps(void)
^
In file included from src/simdbitpacking.c:4:
In file included from include/simdbitpacking.h:14:
In file included from include/simdcomputil.h:11:
In file included from /usr/share/emscripten/system/include/emscripten/emmintrin.h:1:
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:24:1: error: redefinition of '_mm_store_ps'
_mm_store_ps(float *p, __m128 a)
^
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:24:1: note: previous definition is here
_mm_store_ps(float *p, __m128 a)
^
In file included from src/simdbitpacking.c:4:
In file included from include/simdbitpacking.h:14:
In file included from include/simdcomputil.h:11:
In file included from /usr/share/emscripten/system/include/emscripten/emmintrin.h:1:
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:30:1: error: redefinition of '_mm_movemask_ps'
_mm_movemask_ps(__m128 a)
^
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:30:1: note: previous definition is here
_mm_movemask_ps(__m128 a)
^
In file included from src/simdbitpacking.c:4:
In file included from include/simdbitpacking.h:14:
In file included from include/simdcomputil.h:11:
In file included from /usr/share/emscripten/system/include/emscripten/emmintrin.h:1:
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:36:1: error: redefinition of '_mm_add_ps'
_mm_add_ps(__m128 a, __m128 b)
^
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:36:1: note: previous definition is here
_mm_add_ps(__m128 a, __m128 b)
^
In file included from src/simdbitpacking.c:4:
In file included from include/simdbitpacking.h:14:
In file included from include/simdcomputil.h:11:
In file included from /usr/share/emscripten/system/include/emscripten/emmintrin.h:1:
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:42:1: error: redefinition of '_mm_sub_ps'
_mm_sub_ps(__m128 a, __m128 b)
^
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:42:1: note: previous definition is here
_mm_sub_ps(__m128 a, __m128 b)
^
In file included from src/simdbitpacking.c:4:
In file included from include/simdbitpacking.h:14:
In file included from include/simdcomputil.h:11:
In file included from /usr/share/emscripten/system/include/emscripten/emmintrin.h:1:
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:48:1: error: redefinition of '_mm_mul_ps'
_mm_mul_ps(__m128 a, __m128 b)
^
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:48:1: note: previous definition is here
_mm_mul_ps(__m128 a, __m128 b)
^
In file included from src/simdbitpacking.c:4:
In file included from include/simdbitpacking.h:14:
In file included from include/simdcomputil.h:11:
In file included from /usr/share/emscripten/system/include/emscripten/emmintrin.h:1:
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:54:1: error: redefinition of '_mm_div_ps'
_mm_div_ps(__m128 a, __m128 b)
^
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:54:1: note: previous definition is here
_mm_div_ps(__m128 a, __m128 b)
^
In file included from src/simdbitpacking.c:4:
In file included from include/simdbitpacking.h:14:
In file included from include/simdcomputil.h:11:
In file included from /usr/share/emscripten/system/include/emscripten/emmintrin.h:1:
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:60:1: error: redefinition of '_mm_min_ps'
_mm_min_ps(__m128 a, __m128 b)
^
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:60:1: note: previous definition is here
_mm_min_ps(__m128 a, __m128 b)
^
In file included from src/simdbitpacking.c:4:
In file included from include/simdbitpacking.h:14:
In file included from include/simdcomputil.h:11:
In file included from /usr/share/emscripten/system/include/emscripten/emmintrin.h:1:
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:66:1: error: redefinition of '_mm_max_ps'
_mm_max_ps(__m128 a, __m128 b)
^
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:66:1: note: previous definition is here
_mm_max_ps(__m128 a, __m128 b)
^
In file included from src/simdbitpacking.c:4:
In file included from include/simdbitpacking.h:14:
In file included from include/simdcomputil.h:11:
In file included from /usr/share/emscripten/system/include/emscripten/emmintrin.h:1:
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:72:1: error: redefinition of '_mm_sqrt_ps'
_mm_sqrt_ps(__m128 a)
^
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:72:1: note: previous definition is here
_mm_sqrt_ps(__m128 a)
^
In file included from src/simdbitpacking.c:4:
In file included from include/simdbitpacking.h:14:
In file included from include/simdcomputil.h:11:
In file included from /usr/share/emscripten/system/include/emscripten/emmintrin.h:1:
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:80:1: error: redefinition of '_mm_cmplt_ps'
_mm_cmplt_ps(__m128 a, __m128 b)
^
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:80:1: note: previous definition is here
_mm_cmplt_ps(__m128 a, __m128 b)
^
In file included from src/simdbitpacking.c:4:
In file included from include/simdbitpacking.h:14:
In file included from include/simdcomputil.h:11:
In file included from /usr/share/emscripten/system/include/emscripten/emmintrin.h:1:
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:86:1: error: redefinition of '_mm_cmple_ps'
_mm_cmple_ps(__m128 a, __m128 b)
^
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:86:1: note: previous definition is here
_mm_cmple_ps(__m128 a, __m128 b)
^
In file included from src/simdbitpacking.c:4:
In file included from include/simdbitpacking.h:14:
In file included from include/simdcomputil.h:11:
In file included from /usr/share/emscripten/system/include/emscripten/emmintrin.h:1:
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:92:1: error: redefinition of '_mm_cmpeq_ps'
_mm_cmpeq_ps(__m128 a, __m128 b)
^
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:92:1: note: previous definition is here
_mm_cmpeq_ps(__m128 a, __m128 b)
^
In file included from src/simdbitpacking.c:4:
In file included from include/simdbitpacking.h:14:
In file included from include/simdcomputil.h:11:
In file included from /usr/share/emscripten/system/include/emscripten/emmintrin.h:1:
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:98:1: error: redefinition of '_mm_cmpge_ps'
_mm_cmpge_ps(__m128 a, __m128 b)
^
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:98:1: note: previous definition is here
_mm_cmpge_ps(__m128 a, __m128 b)
^
In file included from src/simdbitpacking.c:4:
In file included from include/simdbitpacking.h:14:
In file included from include/simdcomputil.h:11:
In file included from /usr/share/emscripten/system/include/emscripten/emmintrin.h:1:
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:104:1: error: redefinition of '_mm_cmpgt_ps'
_mm_cmpgt_ps(__m128 a, __m128 b)
^
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:104:1: note: previous definition is here
_mm_cmpgt_ps(__m128 a, __m128 b)
^
In file included from src/simdbitpacking.c:4:
In file included from include/simdbitpacking.h:14:
In file included from include/simdcomputil.h:11:
In file included from /usr/share/emscripten/system/include/emscripten/emmintrin.h:1:
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:110:1: error: redefinition of '_mm_and_ps'
_mm_and_ps(__m128 a, __m128 b)
^
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:110:1: note: previous definition is here
_mm_and_ps(__m128 a, __m128 b)
^
In file included from src/simdbitpacking.c:4:
In file included from include/simdbitpacking.h:14:
In file included from include/simdcomputil.h:11:
In file included from /usr/share/emscripten/system/include/emscripten/emmintrin.h:1:
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:116:1: error: redefinition of '_mm_andnot_ps'
_mm_andnot_ps(__m128 a, __m128 b)
^
/usr/share/emscripten/system/include/emscripten/xmmintrin.h:116:1: note: previous definition is here
_mm_andnot_ps(__m128 a, __m128 b)
^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
ERROR    root: compiler frontend failed to generate LLVM bitcode, halting

from simdcomp.

wshager avatar wshager commented on July 19, 2024

@lemire I use 1.35.0. Choose 'emsdk_portable install latest' / 'emsdk_portable activate latest'.

from simdcomp.

lemire avatar lemire commented on July 19, 2024

@wshager Makes sense. Thanks.

from simdcomp.

lemire avatar lemire commented on July 19, 2024

@wshager Right. It builds.

$ emcc --version
emcc (Emscripten gcc/clang-like replacement) 1.35.0 (commit 42fb486c53d627b203b77af6e5d0c088c0ad03ad)
Copyright (C) 2014 the Emscripten authors (see AUTHORS.txt)
This is free and open source software under the MIT license.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ emcc -O2 src/simdbitpacking.c -o simdbitpacking.bc -msse2 -Iinclude
INFO:root:(Emscripten: Running sanity checks)
$ emcc -O2 src/simdcomputil.c -o simdcomputil.bc -msse2 -Iinclude
$ emcc -O2 src/simdintegratedbitpacking.c -o simdintegratedbitpacking.bc -msse2 -Iinclude
$ emcc -O2 example.c -o example.bc -msse2 -Iinclude
$ emcc -O2 example.bc simdbitpacking.bc simdintegratedbitpacking.bc simdcomputil.bc -o project.html -s ALLOW_MEMORY_GROWTH=1
WARNING:root:generating system library: libc.bc...

WARNING:root:                                     ok
WARNING:root:generating system library: dlmalloc.bc...
WARNING:root:                                         ok

from simdcomp.

wshager avatar wshager commented on July 19, 2024

@lemire you can add -s EXPORTED_FUNCTIONS="['_main', '_compress_decompress_demo','_varying_bit_width_demo','_simple_demo']" to export functions separately.

from simdcomp.

lemire avatar lemire commented on July 19, 2024

Thanks.

I am just checking things out in case others want to try it out (so that it is documented).

from simdcomp.

wshager avatar wshager commented on July 19, 2024

@lemire great. I'm going to try to rewrite some functions using SIMD.js. I'll report that too.

from simdcomp.

Related Issues (19)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.