dangmoody / hlml
Auto-generated maths library for C and C++ based on HLSL/Cg
Currently, we include .h everywhere and the .h files will include their respective .inl files (if any) at the bottom of the file. This will affect the user's compile times.
Instead, source files that include HLML files should include .inl files, which include .h files. The header of the user's app can still include HLML headers.
Had a look through this part of the library not long ago and I think we can optimise some of those functions.
I've been thinking about this a bit. Would this even be worth it?
It could be a fair amount of work to even know whether or not this is worth doing.
The generator probably needs better benchmarking support to be able to tell if this would be faster.
Would it be more beneficial to have users run the generator locally so that they can only have the math types they need in their codebase?
The main advantage of this would be that the user only has the code they care about in their codebase, meaning less code bloat.
This would have to be done via some kind of config file, which specifies which types and features the user wants the generator to generate.
This could be cool to show what the assembly of each function is likely to be (with -O3 and -ffast-math on, for instance).
When making changes to the generated code you'd also be able to see diffs in the assembly too, to see maybe exactly when a function became faster/slower?
On Ubuntu 22.04.3 LTS, using G++ and Clang++ with the C++20 standard, there are conflicting definitions of the lerp function. Here is the full error message:
[build] [ 50%] Building CXX object CMakeFiles/testexe.dir/main.o
[build] In file included from /home/sigill/Dev/CPP/HLMLTest/./cpp/hlml.h:127,
[build] from /home/sigill/Dev/CPP/HLMLTest/main.cpp:1:
[build] /home/sigill/Dev/CPP/HLMLTest/./cpp/hlml_functions_scalar.h:170:69: error: ‘float lerp(float, float, float)’ conflicts with a previous declaration
[build] 170 | HLML_INLINE float lerp( const float a, const float b, const float t )
[build] | ^
[build] In file included from /usr/include/c++/11/math.h:36,
[build] from /home/sigill/Dev/CPP/HLMLTest/./cpp/hlml_functions_scalar.h:41,
[build] from /home/sigill/Dev/CPP/HLMLTest/./cpp/hlml.h:127,
[build] from /home/sigill/Dev/CPP/HLMLTest/main.cpp:1:
Please find attached the sample project used to generate the above error message:
HLML_lerp_conflict_test.zip (https://github.com/dangmoody/HLML/files/14343402/HLML_lerp_conflict_test.zip)
The lerp functions were added in C++20 (https://en.cppreference.com/w/cpp/numeric/lerp), but for some reason MSVC does not report any compilation errors here.
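One possible way around the conflict, sketched below, is to only define a global lerp when the standard library does not already provide one. This relies on the standard feature-test macro __cpp_lib_interpolate; the fallback body is the plain linear interpolation formula and is an assumption, not necessarily what HLML generates.

```cpp
#include <cmath>

#if defined( __cpp_lib_interpolate )
// The C++20 standard library already provides lerp, so reuse it
// instead of declaring a conflicting overload.
using std::lerp;
#else
// Fallback for older standards (assumed formula, not HLML's actual code).
inline float lerp( const float a, const float b, const float t )
{
	return a + ( b - a ) * t;
}
#endif
```

Either branch lets calling code write lerp( a, b, t ) unqualified, so user code would not need to change.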
The zip on the releases page isn't actually a zip. I could get it extracted with 7z but Windows complained.
It'll be neater than what I'm doing right now.
Whoops! Need to re-add that!
Do they just need updating or something?
E.G:
float3 x = float3( 1.0f, 1.0f, 1.0f );
float3 y = -x;
I thought I'd added this before but it looks like I haven't. This wants to get added sooner rather than later because this is used quite often!
In C this will want to be float3_negate( &x ) (using the example above).
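A minimal sketch of what the negation could look like. The float3 struct here is a stand-in for the generated type, and the return-by-value signature of float3_negate is an assumption (the issue only gives the call float3_negate( &x )).

```cpp
// Stand-in for HLML's float3; the real generated type has more to it.
struct float3
{
	float x, y, z;
};

// C++ form: unary minus returns a new vector with every component negated.
inline float3 operator-( const float3& v )
{
	return float3{ -v.x, -v.y, -v.z };
}

// C-style form: takes a pointer and returns the negated copy
// (assumed signature, the input is left untouched).
inline float3 float3_negate( const float3* v )
{
	return float3{ -v->x, -v->y, -v->z };
}
```

With this in place, float3 y = -x; and float3 y = float3_negate( &x ); would give the same result.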
Sometimes in the tests you'll see a test (float4x4_caddv, for instance) take 0.5 microseconds in one test, and then 10 microseconds in the next one. It would be good to know why that is.
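One way to start investigating could be to time each function many times and look at the spread instead of a single sample. The helper below is a hypothetical sketch (time_repeated and TimingStats are not part of HLML or Temper) that reports the minimum, median, and maximum, so one-off outliers become visible.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>

// Summary of repeated timings, in nanoseconds.
struct TimingStats
{
	double min_ns;
	double median_ns;
	double max_ns;
};

// Runs fn `runs` times, timing each call separately with a monotonic clock.
template<typename Fn>
TimingStats time_repeated( Fn&& fn, const int runs )
{
	assert( runs > 0 );

	std::vector<double> samples;
	samples.reserve( static_cast<size_t>( runs ) );

	for ( int i = 0; i < runs; i++ ) {
		const auto start = std::chrono::steady_clock::now();
		fn();
		const auto end = std::chrono::steady_clock::now();
		samples.push_back( std::chrono::duration<double, std::nano>( end - start ).count() );
	}

	// Sorting makes the minimum, median, and maximum easy to pick out.
	std::sort( samples.begin(), samples.end() );
	return TimingStats{ samples.front(), samples[samples.size() / 2], samples.back() };
}
```

If the maximum is far above the median, the slow result is more likely noise (cold caches, scheduling) than a property of the function itself.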
The size of a bool can change depending on which compiler is used to compile the program (or can even be defined by the user).
HLML ideally needs some way to always guarantee that its bool types are the same size. My use case for this would be sending data to a shader via a uniform buffer; data may become misaligned when using different compilers to target different platforms.
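As a rough sketch of one option, the bool vector types could store fixed-width integers rather than the compiler's bool. The names bool32_sketch, bool2_sketch, and make_bool2 are hypothetical, and the choice of 4 bytes per component is an assumption based on the uniform-buffer use case.

```cpp
#include <cstdint>

// Hypothetical fixed-width boolean: always 4 bytes regardless of compiler.
typedef std::uint32_t bool32_sketch;

// Hypothetical two-component bool vector built on the fixed-width type.
struct bool2_sketch
{
	bool32_sketch x;
	bool32_sketch y;
};

// Fail the build if the layout ever differs from what the shader side expects.
static_assert( sizeof( bool32_sketch ) == 4, "bool32_sketch must be 4 bytes" );
static_assert( sizeof( bool2_sketch ) == 8, "bool2_sketch must be 8 bytes" );

// Converts native bools to the fixed-width representation (1 = true, 0 = false).
inline bool2_sketch make_bool2( const bool x, const bool y )
{
	return bool2_sketch{ x ? 1u : 0u, y ? 1u : 0u };
}
```

The static_asserts are the important part: whatever representation is chosen, the size would be checked at compile time on every compiler.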
It's become apparent that with the current way the SIMD functions are laid out that we really only need the following types of input structs:
[...] translate_sse()).
[...] scale_sse()).
So the current implementation, where we have a separate input struct for each SIMD function, should be redone.
Doing this would make for nicer use, and potentially faster code as the user wouldn't be having to shuffle lots of data/registers around nearly as much.
The codebase has become a mess over time due to me not being able to see just what the code for the generator would look like in its current state. There's a lot more code than there probably needs to be. No semantic compression happening, etc.
It would be good to refactor the entire generator so that all the code for generating the C files is in one file, all the code for generating the C++ files in another, etc. I think that would be much neater than what we have now.
Then main could just be something like the following:
int main( int argc, char** argv ) {
// some other pre-existing setup that's probably the same as before...
Gen_CodeC( ... );
Gen_CodeCPP( ... );
// any other shutdown stuff that was here from before
return EXIT_SUCCESS;
}
I've been trying to treat C and C++ as the same thing with some minor differences, but I think it could be much more beneficial to just treat them as two separate languages, and then any similarities can be compressed into helper functions as needed.
If I'm right, this will significantly reduce the amount of code that exists in the generator atm, the codebase would be easier to navigate, and it would be easier to read.
I could be wrong about this, and it could actually be worse, but it would be worth me setting some time aside one day to look at it.
Either way, the current state of the generator codebase is a mess and could definitely be done a lot better than how it is now.
Typing comp_ every time for a component-wise transformation can be a little bit of a PITA. Typing c would be easier.
For example, instead of typing:
float4_comp_addv( &a, &b );
It would be easier to type:
float4_caddv( &a, &b );
I'm still not 100% sure this is something worth doing. What do we think?
Even if they're crap to begin with! It's important to be transparent with people about this sort of thing.
@Flave229 is seeing folder deletion/creation randomly fail and crash the generator locally for him.
This likely just needs someone adding a bunch more calls to GetLastError() and then working off the returned errors from there.
It just looks like Travis have changed how their Windows VMs work on their end and I need to change some config stuff to get this to work again.
Then again - it's only MSVC; will we really miss it?
Trying to compile in a C++ project with the Visual Studio 2022 toolchain results in the following two errors:
Error C3861 is easily solved by including assert.h at the top of hlml.h. C4146 can be fixed by removing the "-" from the return types of negation operator functions.
Every time I need a script I have to write it twice: Once for Batch, and again for Bash. This is annoying and not that easy to maintain. I need a better solution.
I was looking at ODIN recently, could be good?
We could do with tests for the following:
When warning level 4 is enabled for projects including HLML and compiling with MSVC, the following warning occurs: C4201: nonstandard extension used: nameless struct/union
Temper 1.0 didn't have things like parametric test support, which Temper 2.0 now does.
We can really start to throw a lot of tests at HLML now to harden it.
This would take a while because nearly every test probably wants to get parameterised.
I saw in the xxHash documentation the other day that they have an optional #define for using the API with namespaces. It's completely optional and seems to 'just work'.
This means that HLML could also do the same thing and provide support for users who have been complaining of name collisions with other libraries/modules.
I'm only going to do an initial investigation for now which looks into how much friction and boilerplate this ends up introducing to the codebase. My guess is quite a bit.
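To get a feel for the boilerplate, here is a rough sketch of an opt-in namespace switch. All macro names (HLML_SKETCH_NAMESPACE and friends) are hypothetical and this is not how xxHash does it; saturate is only a stand-in function to have something to wrap.

```cpp
// Normally the user would define this before including the headers;
// leaving it undefined keeps everything in the global namespace.
#define HLML_SKETCH_NAMESPACE hlml

#ifdef HLML_SKETCH_NAMESPACE
#define HLML_SKETCH_NS_BEGIN namespace HLML_SKETCH_NAMESPACE {
#define HLML_SKETCH_NS_END   }
#else
#define HLML_SKETCH_NS_BEGIN
#define HLML_SKETCH_NS_END
#endif

HLML_SKETCH_NS_BEGIN

// Stand-in function: clamps x to the range [0, 1].
inline float saturate( const float x )
{
	return ( x < 0.0f ) ? 0.0f : ( ( x > 1.0f ) ? 1.0f : x );
}

HLML_SKETCH_NS_END
```

In this form the cost is a begin/end pair per generated header, which the generator could emit automatically; the friction is more likely to show up in documentation and tests that need to cover both modes.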
The following error occurs multiple times for many functions in this file:
warning C4244: 'return': conversion from 'int32_t' to 'float', possible loss of data
Currently division for non-square matrix types does a component-wise division. Should do a multiply-by-pseudo-inverse to be consistent with square-matrices (which do a multiply-by-inverse).
Useful helper function, definitely faster than comparing against a bool vector/matrix constructor of all true. Should definitely go in.
Basically should just look like this:
bool all( const bool2& x ) {
return x.x && x.y;
}
// and so on for bool3, bool4, bool3x2, etc.
Well, this is embarrassing...
This probably needs to be added.
Machines with ARM ISAs are becoming more present so it's probably a good idea to make HLML support it.
Not sure how much work this will be but hopefully it's not going to be a massive amount.
Will testing on a Raspberry Pi be sufficient?
Will mean I need to rename float4x4_rotate to float4x4_rotate_angle_axis.
I've just checked the build logs on Travis and it looks like the Mac OS/Linux timer implementation is just returning results that are incorrect.
I thought I'd got this correct the first time around. Is it possible that testing via a Linux VM guest was screwing with the results somehow?
I'll sort it.
People have said that they don't like the fact that if they initialise any math type they'll have to pay the cost of the zero-initialisation. Therefore this needs to be an option (either in the generator or through the code usage itself) that people can opt in to.
I've initially overlooked the fact that it makes more sense that the code doesn't do anything other than what the programmer tells it to, but something like this should still be given as optional functionality that's minimal and easy to use.
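One way the code-usage route could look is a preprocessor switch around the default constructor. This is only a sketch: HLML_SKETCH_ZERO_INIT and float3_sketch are hypothetical names, not existing HLML options or types.

```cpp
#include <type_traits>

// Uncomment (or define on the command line) to opt in to zero-initialisation.
// #define HLML_SKETCH_ZERO_INIT

struct float3_sketch
{
	float x, y, z;

#ifdef HLML_SKETCH_ZERO_INIT
	// Opt-in: default construction zeroes every component.
	float3_sketch() : x( 0.0f ), y( 0.0f ), z( 0.0f ) {}
#else
	// Default: do nothing, so the user pays no initialisation cost.
	float3_sketch() = default;
#endif

	// Explicit construction always sets every component.
	float3_sketch( const float x_, const float y_, const float z_ ) : x( x_ ), y( y_ ), z( z_ ) {}
};
```

With the switch off, the type stays trivially default constructible, so declaring a large array of them costs nothing until the values are actually written.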
One of the main issues trying to compile a couple of generic HLSL functions with HLML was swizzling. https://github.com/redorav/hlslpp has a templated solution to this and it looks like a perfect thing for autogeneration: https://github.com/redorav/hlslpp/tree/master/include/swizzle :)
A friend of mine using the library found that he wanted to do a dot product for 2 float4s but he only cared about dotting the X, Y, and Z components. He suggested a dot_lean() function which would do this.
Sounds like a good idea, but I'm wondering if stuff like this could be added per-application instead of just having in the library; would it add confusion? Would this be bloat?
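For reference, the suggested function is small enough that it could live in application code. A minimal sketch, using a stand-in float4 struct and the suggested dot_lean name (the signature is an assumption):

```cpp
// Stand-in for HLML's float4; the real generated type has more to it.
struct float4
{
	float x, y, z, w;
};

// Dot product over the X, Y, and Z components only; W is ignored.
inline float dot_lean( const float4& a, const float4& b )
{
	return ( a.x * b.x ) + ( a.y * b.y ) + ( a.z * b.z );
}
```

Given how short it is, keeping it per-application would avoid growing the generated API, at the cost of every user who wants it writing it themselves.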
The main headers that users include are currently:
It's not obvious that these are the main headers to include (both by name and documentation).
So either this needs to be documented, or better header names need to be thought of.
Noticed this in operator+= for instance in some vector types.
Hi! Thanks for the great lib. Just trying to compile some HLSL code, stumbled upon this one.