thermite's People

Contributors

novacrazy

Forkers

isgasho iq-scm

thermite's Issues

Is non-x86 SIMD planned?

If NEON support is not planned then repository and project description should be updated, reflecting the x86-only goals.

Dense linear algebra library

It would be great to have some optimized BLAS-like kernels for vector/matrix operations, both SoA and AoS styles.

Feature Tracking

Backends

  • Scalar
  • SSE2 (in-progress)
  • SSE4.2 (in-progress)
  • AVX (in-progress)
  • AVX2
  • AVX512F
  • WASM SIMD
  • ARM/aarch64 NEON

Extra data types

  • i16/u16
  • i8/u8

These types can use 128-bit registers even on AVX/AVX2, and 256-bit registers on AVX512.

Polyfills

  • Emulated FMA on older platforms
    • For f32, promote to f64 and back.
    • For f64, implement this method
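The f32 promotion polyfill above can be sketched in scalar Rust as follows. This is a minimal illustration, not thermite's actual implementation: the function name `fma_f32_emulated` is hypothetical, and a real backend would apply the same idea lane-wise on vectors. The product of two f32 values is exact in f64, but the result is still rounded twice (once to f64, once to f32), so it may differ from a true hardware FMA in rare cases.

```rust
/// Emulated fused multiply-add for f32, computing a * b + c via f64.
/// Hypothetical helper for illustration; not part of thermite's API.
pub fn fma_f32_emulated(a: f32, b: f32, c: f32) -> f32 {
    // Two 24-bit significands multiply into at most 48 bits, which fits
    // in f64's 53-bit significand, so this product has no rounding error.
    let product = (a as f64) * (b as f64);
    // The sum is rounded to f64 here, then rounded again by the cast to f32.
    (product + c as f64) as f32
}
```

For example, with `a = b = 1.0 + 1.0 / 4096.0` and `c = -1.0`, the plain f32 expression `a * b + c` loses the lowest bit of the product, while the promoted version keeps it and agrees with `f32::mul_add`.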

Iterator library

  • Prototype

Vectorized math library

Currently fully implemented for single and double-precision:
sin, cos, tan, asin, acos, atan, atan2, sinh, cosh, tanh, asinh, acosh, atanh, exp, exp2, exph (0.5 * exp), exp10, exp_m1, cbrt, powf, ln, ln_1p, ln2, ln10, erf, erfinv, tgamma, lgamma, next_float, prev_float

Precision-agnostic implementations: lerp, scale, fmod, powi (single and vector exponents), poly, poly_f, poly_rational, summation_f, product_f, smoothstep, smootherstep, smootheststep, hermite (single and vector degrees), jacobi, legendre, bessel_y

TODO:

  • Beta function
  • Zeta function
  • Digamma function

Bessel functions:

  • Bessel J_n for n > 1 (n = 0 and n = 1 are already implemented)
  • Bessel J_f (Bessel function of the first kind with real order)
  • Bessel Y_f (Bessel function of the second kind with real order)
  • Bessel I_n (Modified Bessel function of the first kind)
  • Bessel K_n (Modified Bessel function of the second kind)
  • Hankel function?

Complex and Dual number libraries

  • Make difficult parts branchless, ideally.

Precision Improvements

  • Improve precision of lgamma where possible.
    • Should it fall back to ln(tgamma(x)) when we know it won't overflow?
  • Improve precision of trig functions when the angle is a multiple of π (sin(x*π), etc.)
  • Compensated float fallbacks on platforms without FMA
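One possible approach to the sin(x*π) item is to reduce the argument before multiplying by π, so the rounding error in π is not scaled up by a large x. The scalar sketch below is an assumption about how this could be done, not a description of thermite's code; the name `sin_pi` is hypothetical.

```rust
use std::f64::consts::PI;

/// Computes sin(pi * x) with the range reduction done on x itself.
/// Hypothetical helper for illustration; not part of thermite's API.
pub fn sin_pi(x: f64) -> f64 {
    // sin(pi * x) has period 2 in x, so subtract the nearest even integer.
    // This leaves r in [-1, 1].
    let mut r = x - 2.0 * (x * 0.5).round();
    // Fold into [-0.5, 0.5] using sin(pi * r) = sin(pi * (sign(r) - r)).
    if r.abs() > 0.5 {
        r = 1.0f64.copysign(r) - r;
    }
    // Only now multiply by pi, with a small argument.
    (r * PI).sin()
}
```

With this reduction, integer inputs such as `sin_pi(1.0)` or `sin_pi(1.0e6)` return exactly zero, whereas `(1.0e6 * PI).sin()` returns a small nonzero value. The `if` is written as a branch for readability; a vectorized version would presumably use a select instead.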

Performance improvements:

  • Investigate ways to improve non-FMA operations.
  • Look for ways to simplify more expressions algebraically.
  • Experiment with the "crush denormals" trick to remove denormal inputs?
    • 1 - (1 - x) is the trick.
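A scalar sketch of the trick, assuming the default round-to-nearest mode and no fast-math reassociation (which Rust does not apply). The function name `crush_denormals` is hypothetical. Any input much smaller than the spacing of floats near 1 is absorbed when it is subtracted from 1, so the final result is zero.

```rust
/// Flushes very small inputs to zero using 1 - (1 - x).
/// Hypothetical helper for illustration; not part of thermite's API.
pub fn crush_denormals(x: f32) -> f32 {
    // For tiny x, (1.0 - x) rounds to exactly 1.0, so the outer
    // subtraction yields 0.0. Larger values pass through, but with
    // absolute precision limited to roughly that of floats near 1.
    1.0 - (1.0 - x)
}
```

Note that this also flushes small normal values (for example `1.0e-30`), not only denormals, and it quantizes other inputs, which is presumably part of what the experiment would need to evaluate.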

Policy improvements:

  • Improve codegen size for Size policy, especially when WASM support is added (both scalar and SIMD)

Testing

  • Structured tests for all vector types and backends (some partial tests exist, but I need to clean them up)
  • Tests for the math library
