redorav / hlslpp
Math library using hlsl syntax with SSE/NEON support
License: MIT License
HLSL++ vector and matrix structs have a user-defined copy constructor, which breaks the "rule of zero", but they do not define a copy assignment operator, move constructor, or move assignment operator, so they do not follow the "rule of five" either. As a result, move semantics are not supported and performance is lower when these types are used in STL containers like std::vector.
It seems like HLSL++ types do not need a user-defined copy constructor. Removing the user-defined copy constructors will let the compiler generate correct noexcept copy/move constructors and noexcept assignment operators, unlocking efficient memory management in modern C++.
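The difference can be checked with type traits. The sketch below uses two hypothetical stand-in structs (they are not the hlslpp definitions) to show what the compiler reports when a copy constructor is user-provided versus when every special member is left to the compiler.

```cpp
#include <type_traits>

namespace sketch
{
    // Hypothetical stand-in following the rule of zero:
    // no user-declared copy/move members, so the compiler generates all of them.
    struct vec4_rule_of_zero
    {
        float data[4];
    };

    // Hypothetical stand-in with a user-provided copy constructor.
    // No move constructor is generated; "moves" fall back to this copy constructor.
    struct vec4_user_copy
    {
        float data[4];
        vec4_user_copy() : data{} {}
        vec4_user_copy(const vec4_user_copy& other)
        {
            for (int i = 0; i < 4; ++i) { data[i] = other.data[i]; }
        }
    };

    // Rule of zero: trivially copyable, and moves are noexcept.
    static_assert(std::is_trivially_copyable<vec4_rule_of_zero>::value, "expected trivially copyable");
    static_assert(std::is_nothrow_move_constructible<vec4_rule_of_zero>::value, "expected noexcept move");
    static_assert(std::is_nothrow_move_assignable<vec4_rule_of_zero>::value, "expected noexcept move assign");

    // User-provided copy constructor: no longer trivially copyable, and move
    // construction resolves to a copy constructor that is not noexcept.
    static_assert(!std::is_trivially_copyable<vec4_user_copy>::value, "expected non-trivial copy");
    static_assert(!std::is_nothrow_move_constructible<vec4_user_copy>::value, "expected throwing move");
}
```

For a type this small a move costs the same as a copy, so the practical gain is mostly that containers and static analysers see a trivially copyable, noexcept-movable type.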
hlslpp has been great so far. Great job.
I only had one issue when porting some HLSL code: the lack of sincos():
https://docs.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-sincos
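Until a built-in version exists, a scalar stand-in is easy to write. The sketch below mirrors the HLSL signature sincos(x, out s, out c) for a plain float; the namespace and the scalar-only scope are assumptions for illustration, not part of hlslpp.

```cpp
#include <cmath>

namespace sketch
{
    // Hypothetical scalar helper: writes sin(x) to s and cos(x) to c,
    // matching the argument order of HLSL's sincos(x, out s, out c).
    inline void sincos(float x, float& s, float& c)
    {
        s = std::sin(x);
        c = std::cos(x);
    }
}
```

A vectorized version would apply the same idea per component.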
Matrices currently do not have any comparison operators defined for them. I can get around it by writing my own operators manually but it would be nice if these were built-in in hlslpp.
1>C:\Personal\ElectronicJonaJoy\src\EngineTests\math.tests.cpp(67,1): error C2678: binary '==': no operator found which takes a left-hand operand of type 'hlslpp::float4x4' (or there is no acceptable conversion)
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_float.h(1122,23): message : could be 'hlslpp::float1 hlslpp::operator ==(const hlslpp::float1 &,const hlslpp::float1 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_float.h(1123,23): message : or 'hlslpp::float2 hlslpp::operator ==(const hlslpp::float2 &,const hlslpp::float2 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_float.h(1124,23): message : or 'hlslpp::float3 hlslpp::operator ==(const hlslpp::float3 &,const hlslpp::float3 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_float.h(1125,23): message : or 'hlslpp::float4 hlslpp::operator ==(const hlslpp::float4 &,const hlslpp::float4 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_int.h(568,21): message : or 'hlslpp::int1 hlslpp::operator ==(const hlslpp::int1 &,const hlslpp::int1 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_int.h(569,21): message : or 'hlslpp::int2 hlslpp::operator ==(const hlslpp::int2 &,const hlslpp::int2 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_int.h(570,21): message : or 'hlslpp::int3 hlslpp::operator ==(const hlslpp::int3 &,const hlslpp::int3 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_int.h(571,21): message : or 'hlslpp::int4 hlslpp::operator ==(const hlslpp::int4 &,const hlslpp::int4 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_uint.h(573,22): message : or 'hlslpp::uint1 hlslpp::operator ==(const hlslpp::uint1 &,const hlslpp::uint1 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_uint.h(574,22): message : or 'hlslpp::uint2 hlslpp::operator ==(const hlslpp::uint2 &,const hlslpp::uint2 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_uint.h(575,22): message : or 'hlslpp::uint3 hlslpp::operator ==(const hlslpp::uint3 &,const hlslpp::uint3 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_uint.h(576,22): message : or 'hlslpp::uint4 hlslpp::operator ==(const hlslpp::uint4 &,const hlslpp::uint4 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_double.h(1079,24): message : or 'hlslpp::double1 hlslpp::operator ==(const hlslpp::double1 &,const hlslpp::double1 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_double.h(1080,24): message : or 'hlslpp::double2 hlslpp::operator ==(const hlslpp::double2 &,const hlslpp::double2 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_double.h(1081,24): message : or 'hlslpp::double3 hlslpp::operator ==(const hlslpp::double3 &,const hlslpp::double3 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_double.h(1090,24): message : or 'hlslpp::double4 hlslpp::operator ==(const hlslpp::double4 &,const hlslpp::double4 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_quaternion.h(227,24): message : or 'hlslpp::float4 hlslpp::operator ==(const hlslpp::quaternion &,const hlslpp::quaternion &)' [found using argument-dependent lookup]
1>C:\Personal\ElectronicJonaJoy\src\EngineTests\math.tests.cpp(67,1): message : while trying to match the argument list '(hlslpp::float4x4, hlslpp::float4x4)'
There are lots of cases in animation that require computing the inverse of affine matrices. There are many assumptions that one can make when a 4x4 matrix is an affine transformation. Any chance something like that would be considered?
(p.s. this is Bryan from PG ;))
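To illustrate the kind of shortcut an affine inverse allows, here is a minimal scalar sketch. It assumes a hypothetical mat4 type with m[row][col] storage, column vectors with the translation in the last column, a bottom row of (0, 0, 0, 1), and an invertible upper-left 3x3 block. None of this is hlslpp API.

```cpp
namespace sketch
{
    // Hypothetical 4x4 matrix, m[row][col], translation in the last column.
    struct mat4 { float m[4][4]; };

    // Inverse of an affine matrix [A t; 0 0 0 1]: the result is [inv(A) -inv(A)*t; 0 0 0 1].
    // Only a 3x3 inverse is needed instead of a full 4x4 one.
    inline mat4 inverse_affine(const mat4& a)
    {
        // Cofactors of the first row of the upper-left 3x3 block.
        const float c00 = a.m[1][1] * a.m[2][2] - a.m[1][2] * a.m[2][1];
        const float c01 = a.m[1][2] * a.m[2][0] - a.m[1][0] * a.m[2][2];
        const float c02 = a.m[1][0] * a.m[2][1] - a.m[1][1] * a.m[2][0];
        const float det = a.m[0][0] * c00 + a.m[0][1] * c01 + a.m[0][2] * c02;
        const float inv_det = 1.0f / det; // assumes det != 0

        mat4 r = {};
        // inv(A) = adjugate(A) / det, where the adjugate is the transposed cofactor matrix.
        r.m[0][0] = c00 * inv_det;
        r.m[0][1] = (a.m[0][2] * a.m[2][1] - a.m[0][1] * a.m[2][2]) * inv_det;
        r.m[0][2] = (a.m[0][1] * a.m[1][2] - a.m[0][2] * a.m[1][1]) * inv_det;
        r.m[1][0] = c01 * inv_det;
        r.m[1][1] = (a.m[0][0] * a.m[2][2] - a.m[0][2] * a.m[2][0]) * inv_det;
        r.m[1][2] = (a.m[0][2] * a.m[1][0] - a.m[0][0] * a.m[1][2]) * inv_det;
        r.m[2][0] = c02 * inv_det;
        r.m[2][1] = (a.m[0][1] * a.m[2][0] - a.m[0][0] * a.m[2][1]) * inv_det;
        r.m[2][2] = (a.m[0][0] * a.m[1][1] - a.m[0][1] * a.m[1][0]) * inv_det;

        // Translation part: -inv(A) * t.
        for (int i = 0; i < 3; ++i)
        {
            r.m[i][3] = -(r.m[i][0] * a.m[0][3] + r.m[i][1] * a.m[1][3] + r.m[i][2] * a.m[2][3]);
        }
        r.m[3][3] = 1.0f;
        return r;
    }
}
```

If the matrix is known to be a pure rotation plus translation, the 3x3 inverse reduces further to a transpose.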
Apparently hlsl actually has a mad function
Lerp seems to be broken. Had to revert to path marked as slower in _hlslpp_lerp_ps in order to get it working.
It would be nice if it contained a manual on how to incorporate it into an existing Visual Studio solution, so that one can quickly start using this awesome library. For example, which settings have to be checked for a successful compilation, or what else to pay attention to, because simply including the headers doesn't work.
I think that would be great, because it would enable users inexperienced with C++ or the VS pipeline to quickly try and use this awesome library.
Synthetic tests for the various configurations would be able to create a comparison table for all the different functions, and find problematic areas.
Apparently float also accepts the modulo operator so it needs to be added to every type
I tried replacing handmademath with this library in my opengl application but it broke.
Hi @redorav,
I've noticed that in one of the latest commits you've added manual implementations of the copy constructors and copy assignment operators to matrix types. As a result, the C++ compiler does not generate move constructors and move assignment operators automatically for these types, and I received a bunch of issues from my static analysis system regarding std::move(matrix) calls and other std::move(...) calls for types that have matrix fields in Methane Kit. This can be fixed either by removing the manual implementations of copy constructors and assignment operators, to let C++ do the magic of auto-generating them properly, or by implementing both copy and move constructors and assignment operators (according to the rule of five). Also be sure to make move constructors and assignment operators noexcept according to the standard. I suggested this before in issue #40, which was fixed by removing the manual implementations. Is there any reason to keep these manual implementations? Are they different from the auto-generated ones?
They're too generic currently and inefficient. We can probably specialize most combinations using constructs such as
vcombine_f32(vget_high_f32(x), vget_low_f32(y))
vrev64q_f32(x)
etc.
Other vector types have proper initialization of the internal storage with zeroes, but double vectors do not.
This is already halfway done, but here for keeping track. Takes advantage of AVX support to pack double3 and double4 into __m256d instead of two __m128d
It is ambiguous to do things like hlslpp::radians(0.3f) because float can be implicitly converted to floatN. Even if it's not the purpose of hlsl++ to provide scalar versions of these functions it's probably not hard and makes it more complete
Add a non-vectorized version of the library. This would allow mixing and matching on platforms (like 32-bit NEON) that don't have vectorized double types but may still want to use the math lib. It can also help in future comparisons between vectorized code and scalar code.
I think the function refract is missing. The following code is stolen from here. I is the incident vector, N is the normal vector, and eta is the ratio of indices of refraction.
k = 1.0 - eta * eta * (1.0 - dot(N, I) * dot(N, I));
if (k < 0.0)
    R = floatN(0.0);
else
    R = eta * I - (eta * dot(N, I) + sqrt(k)) * N;
Please add it thank you.
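A scalar C++ transcription of the pseudocode above could look like the sketch below. The vec3 struct, the dot helper and the namespace are hypothetical and only there to make the example self-contained; they are not hlslpp types.

```cpp
#include <cmath>

namespace sketch
{
    // Hypothetical plain 3-component vector.
    struct vec3 { float x, y, z; };

    inline float dot(const vec3& a, const vec3& b)
    {
        return a.x * b.x + a.y * b.y + a.z * b.z;
    }

    // I: incident vector, N: normal vector, eta: ratio of indices of refraction.
    // Returns the zero vector on total internal reflection (k < 0).
    inline vec3 refract(const vec3& I, const vec3& N, float eta)
    {
        const float n_dot_i = dot(N, I);
        const float k = 1.0f - eta * eta * (1.0f - n_dot_i * n_dot_i);
        if (k < 0.0f)
        {
            return vec3{ 0.0f, 0.0f, 0.0f };
        }
        const float s = eta * n_dot_i + std::sqrt(k);
        return vec3{ eta * I.x - s * N.x, eta * I.y - s * N.y, eta * I.z - s * N.z };
    }
}
```

I and N are expected to be normalized, as in the HLSL and GLSL versions of the function.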
The code snippet below shows the current behaviour for me.
float1 y{ -0.01f };
float uf = hlslpp::floor(y); // returns -1 : ok
float3 broken{ -11.15f,-0.1f,-15.0f };
// Accessing the Y component is correct
float yVal = broken.y;
// next statement returns -12.0f -> Floor of x component
float actualf = hlslpp::floor(broken.y);
The floor function seems to be flooring my x component and returning that value instead of the Y component.
hlslpp_inline float32x4_t vceilq_f32(float32x4_t x)
{
    float32x4_t trnc = vcvtq_f32_s32(vcvtq_s32_f32(x)); // Truncate
    float32x4_t gt = vcgtq_f32(trnc, x); // Check if truncation was greater or smaller (i.e. was negative or positive number)
    uint32x4_t shr = vshrq_n_u32(vreinterpretq_u32_f32(gt), 31); // Shift to leave a 1 or a 0
    float32x4_t result = vaddq_f32(trnc, vcvtq_f32_u32(shr)); // Add to truncated value
    return result;
}
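A scalar sketch of the same truncate-then-adjust approach shows why the direction of the comparison matters. The function names are hypothetical. Truncation rounds towards zero, so for ceil the +1 adjustment is needed exactly when the input is greater than its truncated value.

```cpp
namespace sketch
{
    // Ceil via truncation: add 1 only when x is greater than its truncated value.
    constexpr float ceil_via_truncation(float x)
    {
        return static_cast<float>(static_cast<int>(x))
             + ((x > static_cast<float>(static_cast<int>(x))) ? 1.0f : 0.0f);
    }

    // Same construction with the comparison reversed (truncated > x), for contrast.
    constexpr float ceil_reversed_compare(float x)
    {
        return static_cast<float>(static_cast<int>(x))
             + ((static_cast<float>(static_cast<int>(x)) > x) ? 1.0f : 0.0f);
    }

    static_assert(ceil_via_truncation(1.5f) == 2.0f, "positive fraction rounds up");
    static_assert(ceil_via_truncation(-1.5f) == -1.0f, "negative fraction rounds towards zero");
    static_assert(ceil_via_truncation(2.0f) == 2.0f, "integers are unchanged");

    // The reversed comparison gives wrong results on both sides of zero.
    static_assert(ceil_reversed_compare(1.5f) == 1.0f, "misses the +1 for positive fractions");
    static_assert(ceil_reversed_compare(-1.5f) == 0.0f, "adds a spurious +1 for negative fractions");
}
```

The sketch only holds for values that fit in an int, the same limitation the float-to-int conversion imposes on the vector version.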
"float32x4_t gt = vcgtq_f32(trnc, x);" should be modified to "float32x4_t gt = vcgtq_f32(x, trnc);"
Floating point vector types have /= operator, but integer vectors do not.
hi,
hlslpp looks great; the only thing that prevents me from using it is the size and alignment of each type.
float1/2/3 are 16 bytes, and every floatN(xM) in hlslpp has an alignment of 16 bytes (rather than 4).
That's very different from HLSL, so we can't share some code between C++ and HLSL, such as buffer struct definitions.
Any thoughts about it? Thanks.
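One common workaround is to keep tightly packed storage types for shared buffer structs and convert to the SIMD type only for computation. The sketch below uses hypothetical stand-in types (not hlslpp's) and assumes a 4-byte float and a tightly packed buffer layout on the shader side. It shows how the struct size changes with the 16-byte alignment.

```cpp
namespace sketch
{
    // Hypothetical stand-in for a SIMD-backed 3-component vector: one 16-byte register.
    struct alignas(16) simd_float3 { float lanes[4]; };

    // Hypothetical packed storage type: 12 bytes, 4-byte alignment.
    struct packed_float3 { float x, y, z; };

    // Explicit conversions between the compute type and the storage type.
    inline packed_float3 store(const simd_float3& v)
    {
        return packed_float3{ v.lanes[0], v.lanes[1], v.lanes[2] };
    }

    inline simd_float3 load(const packed_float3& p)
    {
        return simd_float3{ { p.x, p.y, p.z, 0.0f } };
    }

    // The same logical struct, declared with each kind of vector.
    struct VertexSimd   { simd_float3 position;   float weight; };
    struct VertexPacked { packed_float3 position; float weight; };

    static_assert(sizeof(float) == 4, "sketch assumes a 4-byte float");
    static_assert(sizeof(simd_float3) == 16 && alignof(simd_float3) == 16, "SIMD type is 16/16");
    static_assert(sizeof(packed_float3) == 12 && alignof(packed_float3) == 4, "packed type is 12/4");
    static_assert(sizeof(VertexSimd) == 32, "padding after weight rounds the struct up to 32 bytes");
    static_assert(sizeof(VertexPacked) == 16, "packed struct is 16 bytes");
}
```

Whether the packed layout matches the shader side still depends on the buffer type and its packing rules, so the asserts above only document the C++ side.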
I think there is a mistake in the NEON definition of _hlslpp_sel_ps
The SSE definition is
#define _hlslpp_sel_ps(x, y, mask) _mm_blendv_ps((x), (y), (mask))
which is correct: when the mask is set, y is selected, otherwise x
in NEON
#define _hlslpp_sel_ps(x, y, mask) vbslq_f32((mask), (x), (y))
which should be
#define _hlslpp_sel_ps(x, y, mask) vbslq_f32((mask), (y), (x))
In vbslq_f32, when a mask bit is one the second argument is selected, otherwise the third.
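The argument order can be checked against a scalar model of a bitwise select. In the sketch below, bitwise_select models the (mask, if_set, if_clear) convention described for vbslq_f32, and sel models the intended _hlslpp_sel_ps(x, y, mask) semantics. Both functions are hypothetical helpers operating on a single 32-bit lane.

```cpp
#include <cstdint>

namespace sketch
{
    // Scalar model of a bitwise select: bits come from if_set where the mask bit is 1,
    // and from if_clear where the mask bit is 0.
    constexpr std::uint32_t bitwise_select(std::uint32_t mask, std::uint32_t if_set, std::uint32_t if_clear)
    {
        return (mask & if_set) | (~mask & if_clear);
    }

    // Intended sel(x, y, mask) semantics: y where the mask is set, x otherwise.
    // Note that y must be passed as the second argument of the select.
    constexpr std::uint32_t sel(std::uint32_t x, std::uint32_t y, std::uint32_t mask)
    {
        return bitwise_select(mask, y, x);
    }

    static_assert(sel(1u, 2u, 0xFFFFFFFFu) == 2u, "mask set selects y");
    static_assert(sel(1u, 2u, 0x00000000u) == 1u, "mask clear selects x");
}
```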
operator ==, !=, <, <=, >, >= is not implemented for the double1, double2, double3, double4 types, and the implementation is commented out for uint1, uint2, uint3, uint4. Meanwhile these operators are properly implemented for floats and ints.
I'm trying to use vector types in my template wrapper class Point<T, N>, which is used with floats, ints, uints and doubles, and it currently fails to compile for T=uint32_t and T=double because of this asymmetry in the underlying vector type implementations.
Would it be possible to implement these comparison operators for all vector types?
The latest version of HLSL++ does not build with GCC and MSVC at the maximum warning level:
hlsl++_sse.h:792:41: error: invalid cast of an rvalue expression of type ‘__m128’ {aka ‘__vector(4) float’} to type ‘const n128i&’ {aka ‘const __vector(2) long long int&’} 792 | x = (const n128i&)_mm_load_ss((float*)p);
hlsl++_sse.h(792,20): warning C4238: nonstandard extension used: class rvalue used as lvalue
For vectors and matrices. Vectors return a float1, matrices a float4
countbits
reversebits
firstbithigh
firstbitlow
++, --
I noticed there's store() but no load(). There is a section specified as "Float Store/Load" but load is missing. Just making sure it's not forgotten. Would be handy.
For clarity, as the physical layout doesn't necessarily match (e.g. float4x3 and float3x4 have same physical layout)
Seems simple enough, it's these functions:
// Float
_mm_blend_ps
_mm_blendv_ps
_mm_trunc_ps
_mm_round_ps
_mm_ceil_ps
// Int
_mm_blend_epi16
_mm_mullo_epi32
_mm_mul_epi32
_mm_max_epi32
_mm_min_epi32
// Double
_mm_blend_pd
Add options like different ndcs, handedness, etc.
Hi,
I would find it very useful to add the 'any'/'all' HLSL syntax to branch on the result of a vector comparison.
i.e.
void CommandList::setViewport(const uint4 & _viewport)
{
    if ( any( _viewport != m_viewport ) )
    {
        bindViewport(_viewport);
        m_viewport = _viewport;
    }
}
Operators like intN operator != could return boolN to make it even clearer to use.
Thanks,
Benoît.
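A minimal sketch of the requested semantics, using hypothetical plain structs rather than hlslpp types: the comparison returns a per-component boolean vector, and any/all reduce it to a single bool that can be used in a branch.

```cpp
#include <cstdint>

namespace sketch
{
    // Hypothetical plain 4-component types.
    struct uint4 { std::uint32_t x, y, z, w; };
    struct bool4 { bool x, y, z, w; };

    // Component-wise comparison returning a boolean vector.
    constexpr bool4 operator!=(const uint4& a, const uint4& b)
    {
        return bool4{ a.x != b.x, a.y != b.y, a.z != b.z, a.w != b.w };
    }

    // True if at least one component is true.
    constexpr bool any(const bool4& v)
    {
        return v.x || v.y || v.z || v.w;
    }

    // True only if every component is true.
    constexpr bool all(const bool4& v)
    {
        return v.x && v.y && v.z && v.w;
    }
}
```

With these in place, the setViewport example compiles as written once any() and operator!= are visible for the vector type.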
SSE doesn't have native integer division instructions for vectors. One possibility is to extract the scalars, divide, then put them back. Another alternative is to take a look at this website, which seems to offer alternatives that claim to be fast.
Would you consider adding support for double-precision matrices such as double4x4?
~, <<, >>, &, |, ^, <<=, >>=, &=, |=, ^=
Also modulo %