Code Monkey home page Code Monkey logo

faer-rs's People

Contributors

aentity avatar bernhard-musch avatar cramt avatar delicioushair avatar djduque avatar edyounis avatar geo-ant avatar goulart-paul avatar kenken-neko avatar martineekgerhardsen avatar masahirokato71 avatar mhovd avatar oojo12 avatar pleepleus avatar ptnobel avatar sarah-ek avatar tastaturtaste avatar zusez4 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

faer-rs's Issues

Unit test fails in `faer-core`

Sorry I don't have time currently to try and track and fix the issue in more detail:

Running cargo t in faer-core fails non-deterministically with the error:

failures:

---- mul::tests::triangular stdout ----
thread 'mul::tests::triangular' panicked at 'attempt to subtract with overflow', /home/daniel/.cargo/registry/src/github.com-1ecc6299db9ec823/gemm-0.12.0/src/gemm.rs:960:35


failures:
    mul::tests::triangular

test result: FAILED. 12 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.07s

implement `core::ops::Add` for `Mat<T>`

implement addition for Mat<T>

the traits core::ops::Add<Mat<T>> and core::ops::Add<&Mat<T>> should be implemented for Mat<T> and &Mat<T>

T should satisfy the trait ComplexField for those operations to be implementable. Mat::with_dims could be used, with a closure that takes two elements and adds them

implement the qr module

implement the qr decomposition with and without column pivoting, along with the solve/reconstruct/inverse/least_squares functions to complete the decomposition api

the qr module is also needed for the future svd module implementation, since it could be used as a preconditioner.

Bump Polars Version to 0.34

Is your feature request related to a problem? Please describe.
Faer right now doesn't work with Polars 0.34. We should update it and make sure nothing breaks and the tests still pass.

Describe the solution you'd like

polars = { version = "0.34", optional = true, features = ["lazy"] }

and see if existing functions still work.

Describe alternatives you've considered
NA

Additional context
NA

Thanks in advance

Error `Assertion failed` when computing eigenvalues for a particular matrix

When this matrix

[
[0.0 + 0.0 * I, 0.0 + 0.0 * I, 0.0 + 0.0 * I, 2.220446049250313e-16 + -1.0000000000000002 * I],
[0.0 + 0.0 * I, 0.0 + 0.0 * I, 2.220446049250313e-16 + -1.0000000000000002 * I, 0.0 + 0.0 * I],
[0.0 + 0.0 * I, 2.220446049250313e-16 + -1.0000000000000002 * I, 0.0 + 0.0 * I, 0.0 + 0.0 * I],
[2.220446049250313e-16 + -1.0000000000000002 * I, 0.0 + 0.0 * I, 0.0 + 0.0 * I, 0.0 + 0.0 * I],
]

as a Mat<c64> is passed to eigenvalues or complex_eigenvalues, it throws an error.

numpy does compute eigenvalues:

ar = np.array([
[0.0 + 0.0j, 0.0 + 0.0j, 0.0 + 0.0j, 2.220446049250313e-16 + -1.0000000000000002j],
[0.0 + 0.0j, 0.0 + 0.0j, 2.220446049250313e-16 + -1.0000000000000002j, 0.0 + 0.0j],
[0.0 + 0.0j, 2.220446049250313e-16 + -1.0000000000000002j, 0.0 + 0.0j, 0.0 + 0.0j],
[2.220446049250313e-16 + -1.0000000000000002j, 0.0 + 0.0j, 0.0 + 0.0j, 0.0 + 0.0j],
])

In [1]: np.linalg.eigvals(ar)
Out[1]: 
array([ 1.11022302e-16-1.j, -2.22044605e-16+1.j,  1.11022302e-16-1.j,
       -2.22044605e-16+1.j])

The error messages are

Assertion failed at /home/lapeyre/.cargo/registry/src/index.crates.io-6f17d22bba15001f/faer-core-0.13.0/src/lib.rs:2093:13:
  assert!( row < this.nrows() )
with expansion:
  4 < 4

Several other matrices do give the same eigenvalues that numpy does.

Version: 0.13.1
Arch Linux

Rename fill_with_constant to fill?

I was skimming through the high level API PR, and saw renaming of set_constant to fill_with_constant. It's really a minor thing, but how about naming this method just fill as analogous to slice::fill? As for fill_with_zero, it does not have a counterpart in std (or I am now aware of it), but fill_zero sounds fine to me. This way, you can leave fill_with name for the method taking a filling closure, as is also a convention in std (slice::fill_with).

`AddAssign` and `SubAssign` traits

Is your feature request related to a problem? Please describe.
I have a x: Mat<f64> and y: MatRef<'_, f64> and I need to update x = x - y.

Describe the solution you'd like
I think it would be convenient to have an impl SubAssign<Rhs = MatRef> for Mat such that it reuses the memory allocated for x.

Describe alternatives you've considered
I am currently solving my issue with Mat::with_dims to create a new matrix. Implementing these traits could be a bit more performant, and it would definitely be more readable.

Segfault in faer-mt-cplx-llt-512

Running cargo bench faer-mt-cplx-llt-512 leads to a segfault on my machine (the corresponding single threaded bench seems to run fine). I get the following traceback:

#0  0x000055c4371ca569 in core::ptr::write<[num_complex::Complex<f64>; 3]> () at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/ptr/mod.rs:1354                                                                      
#1  core::ptr::mut_ptr::{impl#0}::write<[num_complex::Complex<f64>; 3]> () at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/ptr/mut_ptr.rs:1449                                                                     
#2  gemm_common::pack_operands::pack_generic_inner_loop<num_complex::Complex<f64>, 1, 3> () at /home/adr/.cargo/registry/src/github.com-1ecc6299db9ec823/gemm-common-0.15.2/src/pack_operands.rs:163                                   
#3  gemm_common::pack_operands::pack_generic<num_complex::Complex<f64>, 1, 3> () at /home/adr/.cargo/registry/src/github.com-1ecc6299db9ec823/gemm-common-0.15.2/src/pack_operands.rs:256                                              
#4  gemm_common::pack_operands::pack_rhs::{closure#0}<num_complex::Complex<f64>, 1, 3, gemm_common::simd::x86::Fma> () at /home/adr/.cargo/registry/src/github.com-1ecc6299db9ec823/gemm-common-0.15.2/src/pack_operands.rs:301        
#5  gemm_common::simd::x86::{impl#2}::vectorize<gemm_common::pack_operands::pack_rhs::{closure_env#0}<num_complex::Complex<f64>, 1, 3, gemm_common::simd::x86::Fma>> (f=...)                                                           
    at /home/adr/.cargo/registry/src/github.com-1ecc6299db9ec823/gemm-common-0.15.2/src/simd.rs:50                                                                                                                                     
#6  0x000055c4371c74ef in gemm_common::pack_operands::pack_rhs<num_complex::Complex<f64>, 1, 3, gemm_common::simd::x86::Fma> (n=<optimized out>, k=8192, dst=..., src=..., src_cs=48, src_rs=255, dst_stride=768)                      
    at /home/adr/.cargo/registry/src/github.com-1ecc6299db9ec823/gemm-common-0.15.2/src/pack_operands.rs:299                                                                                                                           
#7  0x000055c4371c4992 in gemm_common::gemm::gemm_basic_generic::{closure#3}<gemm_common::simd::x86::Fma, num_complex::Complex<f64>, 2, 6, 3, 3, gemm_c64::gemm::f64::fma_cplx::gemm_basic_cplx::{closure_env#0}> (                    
    tid=<optimized out>) at /home/adr/.cargo/registry/src/github.com-1ecc6299db9ec823/gemm-common-0.15.2/src/gemm.rs:395                                                                                                               
#8  0x000055c4371f88ea in core::ops::function::impls::{impl#0}::call<(usize), (dyn core::ops::function::Fn<(usize), Output=()> + core::marker::Send + core::marker::Sync)> (args=..., self=<optimized out>)                            
    at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/ops/function.rs:263                                                                                                                                            
#9  core::ops::function::impls::{impl#1}::call_mut<(usize), &(dyn core::ops::function::Fn<(usize), Output=()> + core::marker::Send + core::marker::Sync)> (args=..., self=<optimized out>)                                             
    at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/ops/function.rs:274                                                                                                                                            
#10 core::iter::traits::iterator::Iterator::for_each::call::{closure#0}<usize, &&(dyn core::ops::function::Fn<(usize), Output=()> + core::marker::Send + core::marker::Sync)> (item=8192)                                              
    at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/iter/traits/iterator.rs:849                                                                                                                                    
#11 core::iter::traits::iterator::Iterator::fold<core::ops::range::Range<usize>, (), core::iter::traits::iterator::Iterator::for_each::call::{closure_env#0}<usize, &&(dyn core::ops::function::Fn<(usize), Output=()> + core::marker::
Send + core::marker::Sync)>> (self=..., f=..., init=<optimized out>) at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/iter/traits/iterator.rs:2477                                                                  
#12 core::iter::traits::iterator::Iterator::for_each<core::ops::range::Range<usize>, &&(dyn core::ops::function::Fn<(usize), Output=()> + core::marker::Send + core::marker::Sync)> (self=..., f=0x7fe657ffd740)                       
    at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/iter/traits/iterator.rs:852                                                                                                                                    
#13 rayon::iter::for_each::{impl#1}::consume_iter<&(dyn core::ops::function::Fn<(usize), Output=()> + core::marker::Send + core::marker::Sync), usize, core::ops::range::Range<usize>> (self=..., iter=...)                            
    at /home/adr/.cargo/registry/src/github.com-1ecc6299db9ec823/rayon-1.7.0/src/iter/for_each.rs:55                                                                                                                                   
#14 rayon::iter::plumbing::Producer::fold_with<rayon::range::IterProducer<usize>, rayon::iter::for_each::ForEachConsumer<&(dyn core::ops::function::Fn<(usize), Output=()> + core::marker::Send + core::marker::Sync)>> (self=...,     
    folder=...) at /home/adr/.cargo/registry/src/github.com-1ecc6299db9ec823/rayon-1.7.0/src/iter/plumbing/mod.rs:110                                                                                                                  
#15 rayon::iter::plumbing::bridge_producer_consumer::helper<rayon::range::IterProducer<usize>, rayon::iter::for_each::ForEachConsumer<&(dyn core::ops::function::Fn<(usize), Output=()> + core::marker::Send + core::marker::Sync)>> ( 
    len=<optimized out>, migrated=<optimized out>, splitter=..., producer=..., consumer=...) at /home/adr/.cargo/registry/src/github.com-1ecc6299db9ec823/rayon-1.7.0/src/iter/plumbing/mod.rs:438                                     

OS: arch linux
CPU: AMD Threadripper 1920X

version: current main branch (88dd024)

Matrix Addition and Subtraction using `+` and `-` Operators

Is your feature request related to a problem? Please describe.

Hey, this might be an embarrassingly dumb question, but I cannot see a way to add two matrices using + (or subtract using -). The official website does mention that the addition and subtraction operators should work as expected, but when I write a+b for e.g. two Mat<f64> instances, it won't compile. Looking at the faer_core crate I cannot find an implementation of core::ops::Add. What am I missing?

Describe the solution you'd like

It would be great if the + operator would just work for two suitable Mat<E> instances as well as one Mat<E> and the as a suitable MatRef instance. I can elaborate on what I mean by suitable, but roughly I mean if the entities can be added, then the matrix should be addable (given that the matrix dimensions match).

Describe alternatives you've considered
I could of course do element wise additions manually, but that would feel so much less ergonomic than having an Add trait implemented. Same for AddAssign.

Additional context

If I missed an obvious way to add matrices, I'd really appreciate a hint. If the Add and AddAssign, Sub, and SubAssign traits are indeed not implemented I'd be very happy to file a PR because this crate is really awesome. Please let me know and thanks for your work :)

implement `core::ops::Sub` for `Mat<T>`

implement subtraction for Mat<T>

the traits core::ops::Sub<Mat<T>> and core::ops::Sub<&Mat<T>> should be implemented for Mat<T> and &Mat<T>

T should satisfy the trait ComplexField for those operations to be implementable. Mat::with_dims could be used, with a closure that takes two elements and subtracts them

Unexpected results using `faer_core::solve::solve_lower_triangular_in_place_with_conj`

First of all, thanks a lot for your work in faer, it is impressive. It has also been a lot of fun to play around with the library.

Describe the bug
The results of solving a lower triangular system are unexpected.

To Reproduce
The following code reproduces the error. It also shows how the equivalent code with nalgebra gets the correct solution:

# Cargo.toml

[package]
name = "test"
version = "0.1.0"
edition = "2021"

[dependencies]
faer-core = "0.9.1"
nalgebra = "0.32.2"
// main.rs

const RESPONSE: [f64; 3] = [-18.0, -60.0, -80.0];

// Toeplitz and lower triangular
fn r_element(i: usize, j: usize) -> f64 {
    if i >= j {
        let diff = i - j;
        if diff < RESPONSE.len() {
            RESPONSE[diff]
        } else {
            0.0
        }
    } else {
        0.0
    }
}

fn r_matrix_faer(n: usize) -> faer_core::Mat<f64> {
    faer_core::Mat::with_dims(n, n, |i, j| r_element(i, j))
}

fn r_matrix_nalgebra(n: usize) -> nalgebra::DMatrix<f64> {
    nalgebra::DMatrix::from_fn(n, n, |i, j| r_element(i, j))
}

// Solve for X in Y = R * X
fn solve_x_faer(r: &faer_core::Mat<f64>, y: &mut faer_core::Mat<f64>) {
    faer_core::solve::solve_lower_triangular_in_place_with_conj(
        r.as_ref(),
        faer_core::Conj::No,
        y.as_mut(),
        faer_core::Parallelism::None,
    );
}

// Solve for X in Y = R * X
fn solve_x_nalgebra(r: &nalgebra::DMatrix<f64>, y: &mut nalgebra::DMatrix<f64>) {
    r.solve_lower_triangular_mut(y);
}

fn main() {
    let (i, j) = (400, 1);

    let r = r_matrix_faer(i);
    let mut x = faer_core::Mat::zeros(i, j);
    x.write(0, 0, 150.0);

    let mut y = r.clone() * x;

    solve_x_faer(&r, &mut y); // The values in `y` blow up

    // Now, the same thing with nalgebra
    let r = r_matrix_nalgebra(i);
    let mut x = nalgebra::DMatrix::zeros(i, j);
    x[(0, 0)] = 150.0;

    let mut y = r.clone() * x;

    solve_x_nalgebra(&r, &mut y); // The values in `y` are correct
}

Expected behavior
I would expect the same results as nalgebra

Automate Testing

Is your feature request related to a problem? Please describe.

At current there are test but no automated tests run via CI to ensure bugs aren't introduced as changes are made.

Describe the solution you'd like
A GitHub Action to run the test and output our current test coverage.

Describe alternatives you've considered
None seemed natural to use GitHub Actions since we're on GitHub. Believe travis-CI is a viable alternative though.

More specific Cholesky error

In the LAPACK Cholesky factorization routines (e.g. DPOTRF) when it encounters an error it returns the details about where it failed (INFO > 0). Is it possible to add something similar to the faer Cholesky routines?

`MulAssign<E: Entity>`

Is your feature request related to a problem? Please describe.
I have a x: Mat<f64> and an n: f64and I need to update x = n * x.

Describe the solution you'd like
I think it would be convenient to implement the MulAssign trait such that we can do x *= n and it reuses the memory allocated for x.

Describe alternatives you've considered
Maybe it would be good to have a generic map_with method that transforms all the elements of a matrix in place. It could be used to implement this MulAssign trait, and it would probably be useful for other things as well.

Add PGO version to benchmarks

Is your feature request related to a problem? Please describe.
Hi, I ran the benchmarks yesterday on my machine (Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz) and did a quick experiment with PGO. The wallclock results are even more impressive and are eventually worth adding to the benchmark section.

Describe the solution you'd like
Make user aware of PGO, link the section in Rust documentation and add some additional estimates of benchmarks.

Additional context

Two examples with least and most improvements. I have seen a consistent improvement for all benchmarks, especially for the parallelized version.

Matrix multiplication (f32)

with PGO w/o PGO
    n       faer  faer(par)    ndarray   nalgebra      eigen
   32        1ยตs      828ns      1.4ยตs      1.7ยตs      812ns
   64      4.8ยตs      4.8ยตs      7.3ยตs      9.5ยตs      4.6ยตs
   96     14.9ยตs     15.5ยตs     18.9ยตs     29.6ยตs      8.3ยตs
  128     36.6ยตs     21.8ยตs     43.3ยตs     67.5ยตs     21.7ยตs
  192      115ยตs       39ยตs     42.7ยตs    221.9ยตs     31.4ยตs
  256    272.5ยตs     80.2ยตs    133.7ยตs    513.3ยตs     81.6ยตs
  384    910.1ยตs    247.4ยตs    223.8ยตs      1.7ms    179.3ยตs
  512      2.3ms    530.9ยตs    610.9ยตs        4ms    574.6ยตs
  640      4.4ms    998.5ยตs      1.1ms      7.8ms    895.4ยตs
  768      7.5ms      1.6ms      1.4ms     13.4ms      1.3ms
  896     12.2ms      2.7ms      2.6ms     21.5ms      2.6ms
 1024     18.9ms        4ms      4.1ms     32.2ms      3.5ms
    n       faer  faer(par)    ndarray   nalgebra      eigen
   32      1.3ยตs      1.1ยตs      1.5ยตs      1.9ยตs      811ns
   64      5.5ยตs      5.4ยตs      7.4ยตs      9.9ยตs      4.3ยตs
   96     15.9ยตs     56.4ยตs     20.1ยตs     30.5ยตs      8.3ยตs
  128     38.9ยตs     64.6ยตs     43.7ยตs     69.3ยตs     21.7ยตs
  192    118.1ยตs     80.4ยตs     42.6ยตs    222.7ยตs     31.5ยตs
  256    278.5ยตs    114.1ยตs    133.3ยตs    512.1ยตs     81.2ยตs
  384    936.1ยตs    330.9ยตs    226.2ยตs      1.7ms    179.1ยตs
  512      2.3ms    636.4ยตs    590.5ยตs        4ms    570.3ยตs
  640      4.4ms      1.2ms      1.1ms      7.8ms      895ยตs
  768      7.5ms      1.9ms      1.4ms     13.4ms      1.3ms
  896     12.6ms      3.1ms      2.6ms     21.4ms      2.6ms
 1024     18.6ms      4.5ms      4.8ms     32.1ms      3.5ms

Thin matrix singular value decomposition (f32)

with PGO w/o PGO
    n       faer  faer(par)    ndarray   nalgebra      eigen
   32    942.2ยตs      1.3ms      5.8ms      2.8ms      1.7ms
   64      2.6ms      3.2ms       16ms     10.9ms      5.3ms
   96      5.2ms      6.2ms     29.6ms     24.6ms     11.4ms
  128      8.2ms        9ms     45.8ms     43.2ms     17.6ms
  192     16.6ms     16.1ms     64.3ms     98.3ms     36.1ms
  256     27.7ms     24.6ms     94.2ms    185.1ms     61.6ms
  384     57.8ms     44.5ms    147.2ms      445ms      118ms
  512    102.9ms     68.6ms    222.5ms    926.3ms    201.8ms
  640    160.9ms     93.9ms    295.2ms      1.53s    311.4ms
  768    236.9ms      127ms    368.9ms      2.49s    602.2ms
  896    328.2ms      166ms    469.9ms      3.65s    611.1ms
 1024    453.5ms    215.1ms    732.9ms      6.07s    808.6ms
    n       faer  faer(par)    ndarray   nalgebra      eigen
   32      1.5ms      2.5ms      5.7ms      3.1ms      1.7ms
   64      4.3ms      6.2ms     16.2ms     11.5ms      5.1ms
   96      8.3ms     13.9ms     29.6ms     25.5ms     10.8ms
  128     13.4ms     21.7ms     43.7ms     45.3ms     17.6ms
  192     27.4ms     43.9ms     73.1ms    102.8ms     34.4ms
  256     45.8ms       72ms     94.5ms    194.1ms     54.4ms
  384     94.6ms    128.7ms    147.9ms    451.9ms    112.9ms
  512    166.7ms    210.5ms    219.9ms    962.4ms    190.5ms
  640    258.1ms    313.3ms    288.2ms      1.53s      299ms
  768    385.5ms    443.6ms    367.1ms       2.5s      431ms
  896    526.4ms    578.5ms    466.4ms      3.61s      596ms
 1024    708.7ms      739ms    719.9ms      6.19s    789.8ms

f64_pgo.txt
f64_wo_pgo.txt
f32_wo_pgo.txt
f32_pgo.txt

Methods to read and write matrices from and to CSV

Is your feature request related to a problem? Please describe.
Especially in debugging, it is useful to be able to inspect the matrix in a non-terminal environment. The CSV format is easily implemented and read from, but has not yet been implemented.

Describe the solution you'd like
Methods for writing and reading matrices, either as implementations for a given type, or as generic functions.

High-level api for computing eigenvalues and eigenvectors from square matrices?

Hi @sarah-ek,

Is there any plans on adding any high-level APIs for computing eigenvectors and eigenvalues of square, but not necessarily symmetric, matrices?

I really like the ergonomics of your library but it would be nice to just have a functional, simple API for providing matrices and getting out eigenvalues and their associated eigenvectors without having to hand roll some form of iterative/inverse iteration method in each new crate.

Auto Generate Release Notes

Is your feature request related to a problem? Please describe.
Another maintenance task. Unlike #10 which is geared towards developers knowing what low level changes happened. This one is geared towards user facing changes so the average user knows what changes happened between releases without much hassle on the maintainers side.

Describe the solution you'd like
This is one I have to research a bit more. I forgot the means to do this but I know it's possible and seemed rather useful.

You can assign this one to me

faer calculates the eigenvalues of a nontrivial adjacency matrix as `0` and `+/- inf`

Describe the bug
The Vec<c64> output of eigenvalues() for the adjacency matrix of the following $20$-vertex tree are incorrectly reported as all zeroes.

Note: the path graph on $3$ vertices produces the wrong eigenvalues. Here's a failing unit test.

To Reproduce

/* use the smaller unit test linked above
#[test]
fn this_other_tree_has_correct_maximum_eigenvalue() {
    let edges = [(3, 2), (6, 1), (7, 4), (7, 6), (8, 5), (9, 4), (11, 2), (12, 2), (13, 2), (15, 6), (16, 2), (16, 4), (17, 8), (18, 0), (18, 8), (18, 2), (19, 6), (19, 10), (19, 14)];
    let mut a = faer::Mat::zeros(20, 20);
    for (v, u) in edges.iter() {
        a[(*v, *u)] = 1.0;
        a[(*u, *v)] = 1.0;
    }
    let eigs_complex: Vec<faer::complex_native::c64> = a.eigenvalues();
    println!("{eigs_complex:?}");
    let eigs_real = eigs_complex
        .iter()
        .map(|e| e.re)
        .collect::<Vec<_>>();
    println!("{eigs_real:?}");
    let lambda_1 = *eigs_real
        .iter()
        .max_by(|a, b| a.partial_cmp(&b).unwrap())
        .unwrap();
    let correct_lamba_1 = 2.6148611139728866;
    assert!(
        (lambda_1 - correct_lamba_1).abs() < 1e-10,
        "lambda_1 = {lambda_1}, correct_lamba_1 = {correct_lamba_1}",
    );
}
*/

Expected behavior

name: graphenv
# channels:
#   - defaults
# dependencies:
#   - python
#   - networkx
#   - matplotlib
#   - imageio
#   - msgpack-python
#   - scipy

import networkx as nx
import numpy as np
import matplotlib.pyplot as plt

# Create the graph
edges = [[3, 2], [6, 1], [7, 4], [7, 6], [8, 5], [9, 4], [11, 2], [12, 2], [13, 2], [15, 6], [16, 2], [16, 4], [17, 8], [18, 0], [18, 8], [18, 2], [19, 6], [19, 10], [19, 14]]
graph = nx.Graph()
graph.add_edges_from(edges)

# Get the adjacency matrix eigenvalues
adjacency_matrix = nx.adjacency_matrix(graph).todense()
eigenvalues = np.linalg.eigvals(adjacency_matrix)

# Print the eigenvalues in decreasing order
print("Eigenvalues in decreasing order:")
for eigenvalue in sorted(eigenvalues, reverse=True):
    print(eigenvalue)

# Plot the graph
nx.draw(graph, with_labels=True)
plt.show()

Normal math operators can't be used with sparse matrices

Is your feature request related to a problem? Please describe.
Normal math operators like +, -, *, == etc. can't be used on sparse matrices. There are currently two bounds that can't be satisfied:

  1. The inner storage types of sparse matrices have to impl GenericMatrix, which they don't.
  2. The generic sparse types have to implement the matrix operation traits like MatMul, MatAdd etc. Right now MatMul is implemented for sparse types, but not any of the other operations.

Describe the solution you'd like
Right now, to multiply a sparse matrix with another matrix, you must manually invoke the MatMul implementation as so:

<SparseRowMat<_> as MatMul<DenseCol>>::mat_mul(lhs, rhs.as_ref());

And there's no way to add or subtract sparse matrices. It would be nice to be able to do something like lhs * &rhs and lhs + &rhs like normal. As well, it would also be nice to have a Debug and Index implementation for sparse matrices.

Improved ergonomics for `polars_to_faer_fxx`

The current iteration of faer::polars::polars_to_faer_fxx requires that a slice &[&str] of column names be passed to the function. While there is nothing wrong with this in an absolute sense, it does make for a somewhat awkward workflow when working with polars.

A pattern more aligned to how polars functions is to remove the &[&str] item from the function signature, and have the function create a Mat<E> from the entire frame. The function should also error if columns containing invalid data types (such as strings) are passed in as well.

Guidelines for efficient faer dynamic library

Hi!

I am really impressed by your colossal work on this math kernel!
I am writing a Julia wrapper to benchmark faer against OpenBLAS and MKL.

So far I have only studied the dense matrix-matrix multiplication. My preliminary results show faer approximately 50% slower than OpenBLAS and 25% slower than MKL on an AMD Ryzen 5 7640U on 8 threads.

This is basically my first Rust project and I want to be fair to faer: is this a reasonable dynamic library exposing faer inplace matrix multiplication using the C ABI?

I am not sure if opening an issue is the right way to ask but the faer documentation is very sparse at the moment on how to import external matrices.

use faer::modules::core::mul::matmul;
use faer::{mat, Parallelism};
use std::usize;

// inplace c = a * b
#[no_mangle]
pub unsafe extern "C" fn mult(
    c_ptr: *mut f64,
    c_nrows: u64,
    c_ncols: u64,
    c_row_stride: u64,
    c_col_stride: u64,
    a_ptr: *const f64,
    a_nrows: u64,
    a_ncols: u64,
    a_row_stride: u64,
    a_col_stride: u64,
    b_ptr: *const f64,
    b_nrows: u64,
    b_ncols: u64,
    b_row_stride: u64,
    b_col_stride: u64,
    nthreads: u32,
) {
    assert!(!c_ptr.is_null());
    assert!(!a_ptr.is_null());
    assert!(!b_ptr.is_null());

    let c = unsafe {
        mat::from_raw_parts_mut::<f64>(
            c_ptr,
            c_nrows as usize,
            c_ncols as usize,
            c_row_stride as isize,
            c_col_stride as isize,
        )
    };

    let a = unsafe {
        mat::from_raw_parts::<f64>(
            a_ptr,
            a_nrows as usize,
            a_ncols as usize,
            a_row_stride as isize,
            a_col_stride as isize,
        )
    };

    let b = unsafe {
        mat::from_raw_parts::<f64>(
            b_ptr,
            b_nrows as usize,
            b_ncols as usize,
            b_row_stride as isize,
            b_col_stride as isize,
        )
    };

    matmul(c, a, b, None, 1.0, Parallelism::Rayon(nthreads as usize));
}

`Mat::eigenvalues()` may lead to a panic from a failing assertion

Describe the bug
Calling Mat::eigenvalues() may lead to a panic from a failing assertion. To reproduce this, generate a 4096x4096 matrix with random f64 elements of range 0 to 1 and call the eigenvalues funtion (See code below).
This will result in the following error:

thread 'main' panicked at /home/sthagel/.cargo/registry/src/index.crates.io-6f17d22bba15001f/faer-0.18.1/src/linalg/evd/hessenberg.rs:816:26:
assertion `left == right` failed: iterators must have the same length
  left: 31
 right: 32
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

To Reproduce

use faer::mat::Mat;
use num::complex::Complex64;
use rand::prelude::*;

fn main() {
    let n = 4096;
    let n_sq = n * n;
    let mat = (0..n_sq)
        .map(|_| {
            let mut rng = rand::thread_rng();
            rng.gen()
        })
        .collect::<Vec<f64>>();
    let results = eigenvals_faer(&mat, n - 2, n);
    println!("{:#?}", results)
}

#[inline]
pub fn eigenvals_faer(
    mat: &[f64],
    nev: usize,
    n: usize,
) -> Result<Vec<Complex64>, &str> {
    if nev * n == 0 || nev > n - 2 {
        return Err("Invalid parameters!");
    }

    let faer_mat: Mat<f64> = Mat::from_fn(n, n, |i, j| mat[i + n * j]);

    let evs = faer_mat.eigenvalues();

    Ok(evs)
}

Expected behavior
One would expect the calculation to produce the desired eigenvalues

Desktop (please complete the following information):

  • OS: Linux Mint 21.3, Kernel 6.5.0-21-generic
  • Rust version 1.76.0 (stable)
  • Version 0.18.1

Additional info
On smaller matrices (e.g. 40x40 instead of 4096x4096) it works just fine.

implement `core::ops::{Add,Sub,Mul}` for `Mat<T>`

implement basic operations (addition, subtraction, matrix multiplication, multiplication by a scalar) for Mat<T>

T should satisfy the trait ComplexField for those operations to be implementable. for the implementation of addition, subtraction, and scalar multiplication, Mat::with_dims could be used, with a closure that takes two elements and adds or subtracts them

for matrix multiplication, we could create a matrix full of zeros with Mat::zeros, and then fill it up with the result of the computation with faer_core::mul::matmul

some missing macros / convenience functions

Is your feature request related to a problem? Please describe.
It makes sense to have the macros col! and row! in order to be consistent with mat!. This also seems a good opportunity to include a related macro (or function) blockwise!:

blockwise![[A, B], [C, D]];

// โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”
// โ”‚       โ”‚  โ”‚
// โ”‚   A   โ”‚ Bโ”‚
// โ”‚       โ”‚  โ”‚
// โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”ค
// โ”‚   C   โ”‚ Dโ”‚
// โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”˜

Gemm slow on specific AMD epyic CPU

Describe the bug
Running the faer-bench benchmark on 2xAMD EPYC 7V13 64-Core Processor
is surprisingly slow. Both in absolute numbers, as well as in comparing the speedup of faer(par) over faer(seq).
This does not seem to be the case on other, larger AMD server cpu's

To Reproduce
Just to keep track for myself:
CXXFLAGS="-I/u/drehwald/prog" CXX=g++ cargo +nightly run --release --no-default-features --features faer

Expected behavior
Well, don't be slow.

Screenshots
If applicable, add screenshots to help explain your problem.

Desktop (please complete the following information):

โžœ  ~ g++ --version                                                                                            
g++ (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

โžœ  ~ cargo +nightly --version
cargo 1.71.0-nightly (d0a4cbcee 2023-04-16)

Additional context

Our admin just got back to me, university admins probably won't adjust the perf settings for us, the machine is too busy so it would be a perf risk. But I got access to two other AMD machines, maybe we can use that for pinning the issue down.

Auto Generate Changelog

Is your feature request related to a problem? Please describe.
This is more of a maintenance task to make it obvious and easy to see what changes have happened over the life of the project.

Describe the solution you'd like
I'm thinking of using an action to generate the Changelog on releases.

Describe alternatives you've considered
Considered GitHub Actions and some other functions for doing it. Decided to stick with native GitHub tooling.

Feel free to assign to me, as I think it would be fun to do.

Geometric Algebra

Will this repository include topics such as Geometric Algerbra, such as G^n. This would be something I would be interested in helping contribute to.

Matrix Exponentiation

Is your feature request related to a problem? Please describe.
Matrix exponentation utilizing the standard double precision algorithm by Al-Mohy and Higham.

Describe the solution you'd like
A simple function that takes in a reference or matrix view (of some sorts not familiar with this library just yet) and yields a Result< ,> containing the exponentiated matrix if it worked and an error if not.

Describe alternatives you've considered
n/a

Additional context
I have implemented this function for ndarray-linalg but the maintainers have taken months and still no word. I am using my implementation for research purposes, using ndarray I have an error one-norm of about 20x scipy at 200x200 dense matrices and an error of machine precision for 1-sparse matrices up to 1024x1024. It might take some time but I'd be willing to port it over to this library to help move it along, would like to hear how this should best be done.

Specify MSRV

Specifying or documenting the MSRV somewhere would be helpful for consumers of faer.

This was discussed on discord.

Large matrix support

I saw square matrix sizes up to 1k in the benchmark section. Is it in scope of the project to maintain and optimize larger (50-200k) matrix sizes for multiplication (or other operations)?

`no_std`

You do not support no_std, right?

Wiki code `use` statements seem incorrect

The top line of each of the wiki's examples has using faer_core::mat; or something like that. I think it should be something like use faer::core::mat; and so forth.

[Curiosity] Is there a future plan for linear programming?

Hello,

Just out of curiosity, are you planning on adding support for linear programming that could run on top of the algebraic operations provided by the project? Or do you think that such thing should be better provided by third-parties using faer-rs ?

Thanks

faer-bench fails with size mismatch

I ran cargo run in the faer-bench dir and the result was a panic.
The system is a Arm MacOS with the newest nichtly.

leifeld@MacStudovonDirk faer-bench % cargo +nightly run --release --no-default-features
Finished release [optimized] target(s) in 0.08s
Running target/release/faer-bench
f32

Matrix multiplication

Multiplication of two square matrices of dimension n.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4       33ns       32ns      142ns          -          -
    8       57ns       60ns      122ns          -          -
   16      206ns      211ns      303ns          -          -
   32      1.1ยตs      1.1ยตs      996ns          -          -
   64      8.1ยตs      8.1ยตs      6.7ยตs          -          -
   96     24.8ยตs     30.9ยตs      1.3ms          -          -
  128     59.8ยตs     39.1ยตs      1.2ms          -          -
  192    195.1ยตs     93.3ยตs      1.7ms          -          -
  256    479.7ยตs    202.1ยตs      1.9ms          -          -
  384      1.6ms    448.6ยตs      2.3ms          -          -
  512      3.7ms        1ms      2.6ms          -          -
  640      7.3ms      1.7ms        3ms          -          -
  768     12.5ms        3ms      5.1ms          -          -
  896     19.7ms      4.4ms      6.8ms          -          -
 1024     29.8ms      6.7ms      9.2ms          -          -

Triangular solve

Solving AX = B in place where A and B are two square matrices of dimension n, and A is a triangular matrix.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4       14ns       14ns     26.3ยตs          -          -
    8       90ns       90ns       27ยตs          -          -
   16      378ns      380ns     29.9ยตs          -          -
   32      1.4ยตs      1.4ยตs     35.9ยตs          -          -
   64      7.2ยตs      7.2ยตs     54.1ยตs          -          -
   96     20.4ยตs     27.2ยตs     93.5ยตs          -          -
  128     44.9ยตs     40.6ยตs    131.9ยตs          -          -
  192    131.7ยตs    117.5ยตs    290.3ยตs          -          -
  256    301.3ยตs    152.8ยตs    487.1ยตs          -          -
  384    913.6ยตs    450.8ยตs    996.1ยตs          -          -
  512      2.2ms      966ยตs      2.6ms          -          -
  640      4.1ms      1.8ms      2.7ms          -          -
  768      6.7ms      2.6ms      3.9ms          -          -
  896     10.8ms      3.6ms      5.3ms          -          -
 1024     17.7ms      5.5ms     12.6ms          -          -

Triangular inverse

Computing A^-1 where A is a square triangular matrix with dimension n.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      160ns      9.3ยตs     26.3ยตs          -          -
    8      468ns       12ยตs     27.1ยตs          -          -
   16      1.2ยตs     16.4ยตs     29.9ยตs          -          -
   32      3.4ยตs     24.1ยตs     35.9ยตs          -          -
   64      9.9ยตs     31.6ยตs     54.1ยตs          -          -
   96     23.2ยตs     39.9ยตs     89.1ยตs          -          -
  128     33.9ยตs     49.9ยตs    133.3ยตs          -          -
  192     86.4ยตs     94.7ยตs    310.4ยตs          -          -
  256    153.1ยตs    143.9ยตs    507.7ยตs          -          -
  384    421.7ยตs    281.3ยตs        1ms          -          -
  512    907.5ยตs    492.6ยตs      2.7ms          -          -
  640      1.6ms    816.2ยตs      2.7ms          -          -
  768      2.6ms        1ms      3.9ms          -          -
  896      4.1ms      1.5ms      5.4ms          -          -
 1024      6.8ms      2.4ms     12.5ms          -          -

Cholesky decomposition

Factorizing a square matrix with dimension n as Lร—L.T, where L is lower triangular.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4       43ns       43ns      120ns          -          -
    8      154ns      154ns      354ns          -          -
   16      767ns      766ns      1.2ยตs          -          -
   32      2.1ยตs      2.1ยตs      4.8ยตs          -          -
   64      6.3ยตs      6.3ยตs       15ยตs          -          -
   96     15.8ยตs     15.8ยตs     34.2ยตs          -          -
  128     22.2ยตs     22.3ยตs    428.7ยตs          -          -
  192     59.7ยตs       87ยตs     12.5ms          -          -
  256    106.1ยตs    126.4ยตs      3.2ms          -          -
  384    292.4ยตs    331.5ยตs      5.3ms          -          -
  512    651.6ยตs    483.6ยตs        7ms          -          -
  640      1.1ms      1.1ms     11.4ms          -          -
  768        2ms      1.5ms     19.6ms          -          -
  896        3ms        2ms      1.82s          -          -
 1024      4.7ms      2.5ms     22.3ms          -          -

LU decomposition with partial pivoting

Factorizing a square matrix with dimension n as Pร—Lร—U, where P is a permutation matrix, L is unit lower triangular and U is upper triangular.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4       92ns       89ns      154ns          -          -
    8      263ns      238ns      456ns          -          -
   16      895ns      905ns        2ยตs          -          -
   32      2.9ยตs      2.9ยตs        7ยตs          -          -
   64     11.9ยตs     12.2ยตs     22.8ยตs          -          -
   96     27.2ยตs     28.2ยตs     46.3ยตs          -          -
  128       54ยตs     53.7ยตs     88.3ยตs          -          -
  192    141.9ยตs    215.8ยตs    206.9ยตs          -          -
  256    285.8ยตs    384.5ยตs    749.8ยตs          -          -
  384    801.3ยตs    919.1ยตs      1.6ms          -          -
  512      1.8ms      1.7ms      2.4ms          -          -
  640      3.2ms      3.1ms      4.2ms          -          -
  768      5.3ms      4.2ms      5.8ms          -          -
  896      8.2ms      5.8ms      7.7ms          -          -
 1024     12.8ms      7.9ms      9.8ms          -          -

LU decomposition with full pivoting

Factorizing a square matrix with dimension n as Pร—Lร—Uร—Q.T, where P and Q are permutation matrices, L is unit lower triangular and U is upper triangular.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      160ns      160ns          -          -          -
    8      388ns      403ns          -          -          -
   16      1.5ยตs      1.5ยตs          -          -          -
   32      7.9ยตs      7.8ยตs          -          -          -
   64     53.4ยตs       53ยตs          -          -          -
   96    174.2ยตs    173.9ยตs          -          -          -
  128    417.9ยตs      417ยตs          -          -          -
  192      1.5ms      1.5ms          -          -          -
  256      3.5ms      3.5ms          -          -          -
  384     11.8ms     11.3ms          -          -          -
  512     28.9ms     21.8ms          -          -          -
  640     55.6ms     33.2ms          -          -          -
  768       98ms     49.2ms          -          -          -
  896    158.7ms     72.1ms          -          -          -
 1024    242.3ms    103.1ms          -          -          -

QR decomposition with no pivoting

Factorizing a square matrix with dimension n as QR, where Q is unitary and R is upper triangular.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      102ns      100ns      729ns          -          -
    8      294ns      293ns      1.5ยตs          -          -
   16      1.1ยตs      1.2ยตs      4.7ยตs          -          -
   32      4.6ยตs      4.7ยตs     17.2ยตs          -          -
   64     25.3ยตs     25.6ยตs     65.4ยตs          -          -
   96     57.7ยตs     59.4ยตs    454.9ยตs          -          -
  128    110.2ยตs    109.9ยตs      4.4ms          -          -
  192    298.2ยตs    297.7ยตs     37.8ms          -          -
  256    615.9ยตs    796.9ยตs       59ms          -          -
  384      1.8ms      2.1ms    118.2ms          -          -
  512        4ms      3.7ms    197.2ms          -          -
  640      6.7ms      5.1ms    301.4ms          -          -
  768     11.2ms        8ms    429.2ms          -          -
  896     17.3ms     11.9ms    523.8ms          -          -
 1024     25.6ms     16.5ms    632.4ms          -          -

QR decomposition with column pivoting

Factorizing a square matrix with dimension n as QRP, where P is a permutation matrix, Q is unitary and R is upper triangular.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      153ns      140ns          -          -          -
    8      414ns      410ns          -          -          -
   16      1.5ยตs      1.6ยตs          -          -          -
   32      6.5ยตs      6.5ยตs          -          -          -
   64     40.7ยตs     66.3ยตs          -          -          -
   96    123.3ยตs    167.8ยตs          -          -          -
  128    266.4ยตs    327.5ยตs          -          -          -
  192    802.5ยตs    862.3ยตs          -          -          -
  256      1.8ms      2.3ms          -          -          -
  384      5.6ms      6.8ms          -          -          -
  512     12.5ms     12.8ms          -          -          -
  640     24.6ms     19.9ms          -          -          -
  768     41.3ms     28.1ms          -          -          -
  896     65.1ms     38.3ms          -          -          -
 1024     94.5ms     51.7ms          -          -          -

Matrix inverse

Computing the inverse of a square matrix with dimension n.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      726ns     14.3ยตs      412ns          -          -
    8      1.7ยตs     17.4ยตs      907ns          -          -
   16      4.4ยตs     25.3ยตs      3.9ยตs          -          -
   32       13ยตs     51.3ยตs     12.6ยตs          -          -
   64       42ยตs     84.9ยตs     49.4ยตs          -          -
   96    100.1ยตs      141ยตs    443.4ยตs          -          -
  128    168.5ยตs    209.1ยตs      2.8ms          -          -
  192      430ยตs    454.3ยตs      3.1ms          -          -
  256    837.1ยตs    711.4ยตs      7.3ms          -          -
  384      2.3ms      1.6ms     17.9ms          -          -
  512      5.3ms        3ms     19.4ms          -          -
  640      9.6ms        5ms     31.4ms          -          -
  768     15.2ms        7ms     34.6ms          -          -
  896       24ms      9.7ms     43.1ms          -          -
 1024     37.6ms     14.3ms     50.6ms          -          -

Square matrix singular value decomposition

Computing the SVD of a square matrix with dimension n.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      1.1ยตs      1.1ยตs        2ยตs          -          -
    8      5.8ยตs     21.5ยตs      5.8ยตs          -          -
   16     20.3ยตs       54ยตs     19.5ยตs          -          -
   32       72ยตs    116.1ยตs     71.6ยตs          -          -
   64    270.6ยตs    360.5ยตs      3.9ms          -          -
   96    664.6ยตs    971.7ยตs      6.7ms          -          -
  128      1.3ms      1.7ms     44.4ms          -          -
  192      3.5ms      5.1ms    108.8ms          -          -
  256      7.3ms      9.3ms    173.6ms          -          -
  384     21.4ms     20.8ms    310.9ms          -          -
  512     47.5ms     35.9ms    466.5ms          -          -
  640     85.2ms     54.2ms    616.3ms          -          -
  768      143ms       80ms    830.5ms          -          -
  896    225.4ms    115.2ms      1.07s          -          -
 1024    327.3ms    160.4ms      1.23s          -          -

Thin matrix singular value decomposition

Computing the SVD of a rectangular matrix with shape (4096, n).

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4     32.8ยตs     32.4ยตs    644.1ยตs          -          -
    8      112ยตs    150.6ยตs      1.4ms          -          -
   16    318.9ยตs    456.7ยตs      8.4ms          -          -
   32    947.4ยตs      1.2ms     15.5ms          -          -
   64      2.9ms      3.5ms     45.5ms          -          -
   96      5.3ms      5.8ms       61ms          -          -
  128      8.9ms        9ms    100.3ms          -          -
  192     18.7ms     17.5ms    248.2ms          -          -
  256     33.7ms     29.3ms    380.9ms          -          -
  384     75.2ms     54.5ms    657.6ms          -          -
  512    138.8ms       94ms      1.15s          -          -
  640    224.2ms    136.4ms      1.63s          -          -
  768    336.6ms    182.1ms      1.97s          -          -
  896    478.8ms    240.6ms      2.45s          -          -
 1024      656ms    327.2ms      2.68s          -          -

Hermitian matrix eigenvalue decomposition

Computing the EVD of a hermitian matrix with shape (n, n).

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      992ns      890ns        1ยตs          -          -
    8      2.5ยตs      2.4ยตs      3.5ยตs          -          -
   16      9.2ยตs      8.7ยตs     12.1ยตs          -          -
   32     36.5ยตs     34.8ยตs     55.5ยตs          -          -
   64    161.4ยตs    194.4ยตs    421.2ยตs          -          -
   96    391.3ยตs    499.3ยตs      1.1ms          -          -
  128    725.3ยตs    841.1ยตs      4.3ms          -          -
  192      1.9ms      2.2ms     22.6ms          -          -
  256      3.7ms        4ms     45.9ms          -          -
  384     10.7ms     12.4ms    114.5ms          -          -
  512     23.2ms     22.7ms    206.6ms          -          -
  640     41.9ms     34.5ms    330.7ms          -          -
  768     69.5ms       51ms      471ms          -          -
  896    107.4ms     70.9ms      657ms          -          -
 1024    156.6ms     92.4ms    888.3ms          -          -

Non Hermitian matrix eigenvalue decomposition

Computing the EVD of a matrix with shape (n, n).

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      3.9ยตs      4.2ยตs      2.4ยตs          -          -
    8     11.6ยตs       13ยตs      6.9ยตs          -          -
   16     43.7ยตs     42.8ยตs     27.4ยตs          -          -
   32    174.2ยตs    188.9ยตs    123.3ยตs          -          -
   64    806.5ยตs    834.3ยตs    658.5ยตs          -          -
   96      1.9ms        2ms      4.4ms          -          -
  128      3.2ms      3.4ms     31.6ms          -          -
  192      8.5ms        9ms    100.3ms          -          -
  256     16.4ms       17ms    228.4ms          -          -
  384     38.3ms     47.9ms    592.5ms          -          -
  512     82.8ms     96.1ms    798.3ms          -          -
  640    146.3ms    164.9ms      1.12s          -          -
  768    230.3ms    243.3ms      1.46s          -          -
  896    340.3ms    341.4ms      1.69s          -          -
 1024    540.2ms    539.9ms      2.45s          -          -

f64

Matrix multiplication

Multiplication of two square matrices of dimension n.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4       32ns       32ns      128ns          -          -
    8       80ns       84ns      171ns          -          -
   16      311ns      315ns      345ns          -          -
   32      2.1ยตs      2.2ยตs      1.8ยตs          -          -
   64     14.8ยตs     14.8ยตs     12.1ยตs          -          -
   96     50.3ยตs     41.3ยตs      1.2ms          -          -
  128    124.1ยตs     69.5ยตs      1.4ms          -          -
  192    392.9ยตs    173.1ยตs      1.5ms          -          -
  256    943.6ยตs    315.2ยตs      1.3ms          -          -
  384      3.2ms    889.3ยตs      2.3ms          -          -
  512      7.5ms      1.9ms      3.7ms          -          -
  640     14.5ms      3.4ms      4.9ms          -          -
  768     25.7ms      5.8ms      8.5ms          -          -
  896       30ms      7.3ms     13.1ms          -          -
 1024     45.1ms       11ms     17.4ms          -          -

Triangular solve

Solving AX = B in place where A and B are two square matrices of dimension n, and A is a triangular matrix.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4       14ns       14ns     27.9ยตs          -          -
    8       89ns       90ns     27.7ยตs          -          -
   16      429ns      431ns     29.8ยตs          -          -
   32      1.9ยตs      1.9ยตs     35.5ยตs          -          -
   64     11.3ยตs     11.3ยตs       54ยตs          -          -
   96     33.9ยตs     35.3ยตs     81.9ยตs          -          -
  128     75.4ยตs     63.5ยตs      126ยตs          -          -
  192      235ยตs    149.1ยตs    264.2ยตs          -          -
  256    555.4ยตs    254.1ยตs    759.9ยตs          -          -
  384      1.7ms    841.4ยตs      1.2ms          -          -
  512      4.5ms      1.6ms      3.5ms          -          -
  640      7.8ms      2.8ms      3.4ms          -          -
  768     13.1ms      4.5ms      7.3ms          -          -
  896     20.9ms      6.5ms        7ms          -          -
 1024     35.4ms       10ms     18.3ms          -          -

Triangular inverse

Computing A^-1 where A is a square triangular matrix with dimension n.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      156ns     12.4ยตs       28ยตs          -          -
    8      479ns     11.4ยตs     27.7ยตs          -          -
   16      1.3ยตs       17ยตs     29.9ยตs          -          -
   32      3.7ยตs     24.1ยตs     34.2ยตs          -          -
   64     11.4ยตs       33ยตs     52.2ยตs          -          -
   96     28.8ยตs     45.4ยตs    212.9ยตs          -          -
  128     45.8ยตs     65.2ยตs    334.2ยตs          -          -
  192    124.9ยตs    130.4ยตs    405.3ยตs          -          -
  256      247ยตs      226ยตs      765ยตs          -          -
  384    712.3ยตs    395.6ยตs      1.2ms          -          -
  512      1.8ms    736.7ยตs      3.4ms          -          -
  640      2.9ms      1.3ms      3.3ms          -          -
  768      4.8ms        2ms      7.2ms          -          -
  896      7.6ms      2.6ms      7.1ms          -          -
 1024     13.5ms      3.9ms       18ms          -          -

Cholesky decomposition

Factorizing a square matrix with dimension n as Lร—L.T, where L is lower triangular.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4       46ns       46ns      121ns          -          -
    8      146ns      145ns      393ns          -          -
   16      631ns      625ns      1.3ยตs          -          -
   32        2ยตs        2ยตs      5.1ยตs          -          -
   64      7.3ยตs      7.3ยตs      187ยตs          -          -
   96     18.3ยตs     18.3ยตs      1.4ms          -          -
  128     32.2ยตs     32.4ยตs      1.3ms          -          -
  192     88.2ยตs    100.1ยตs        6ms          -          -
  256    186.8ยตs    188.5ยตs      4.8ms          -          -
  384    520.7ยตs      465ยตs      8.8ms          -          -
  512      1.3ms    935.6ยตs     10.4ms          -          -
  640      2.2ms      1.6ms     12.7ms          -          -
  768      3.8ms      2.2ms     20.5ms          -          -
  896      5.7ms        3ms     24.7ms          -          -
 1024      9.5ms      4.6ms     26.2ms          -          -

LU decomposition with partial pivoting

Factorizing a square matrix with dimension n as Pร—Lร—U, where P is a permutation matrix, L is unit lower triangular and U is upper triangular.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4       76ns       93ns      150ns          -          -
    8      225ns      213ns      493ns          -          -
   16      740ns      771ns      2.2ยตs          -          -
   32        3ยตs      2.9ยตs      6.6ยตs          -          -
   64     14.7ยตs     14.8ยตs     21.1ยตs          -          -
   96     37.7ยตs     37.7ยตs     46.5ยตs          -          -
  128     75.7ยตs     76.2ยตs    302.8ยตs          -          -
  192    218.3ยตs    276.5ยตs    620.6ยตs          -          -
  256    474.4ยตs      524ยตs    810.8ยตs          -          -
  384      1.4ms      1.3ms      1.5ms          -          -
  512      3.4ms      2.6ms        3ms          -          -
  640      5.9ms      4.4ms      4.6ms          -          -
  768      9.9ms      6.2ms      6.5ms          -          -
  896     15.5ms      8.5ms      8.7ms          -          -
 1024     24.8ms     12.8ms     11.3ms          -          -

LU decomposition with full pivoting

Factorizing a square matrix with dimension n as Pร—Lร—Uร—Q.T, where P and Q are permutation matrices, L is unit lower triangular and U is upper triangular.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      119ns      132ns          -          -          -
    8      371ns      369ns          -          -          -
   16      1.4ยตs      1.4ยตs          -          -          -
   32      7.7ยตs      7.7ยตs          -          -          -
   64     53.4ยตs     53.4ยตs          -          -          -
   96    185.1ยตs    184.7ยตs          -          -          -
  128    460.1ยตs      459ยตs          -          -          -
  192      1.7ms      1.7ms          -          -          -
  256      4.3ms      4.3ms          -          -          -
  384     14.8ms       14ms          -          -          -
  512     38.2ms     25.8ms          -          -          -
  640     73.1ms       38ms          -          -          -
  768    128.8ms     56.5ms          -          -          -
  896    207.4ms     80.4ms          -          -          -
 1024    306.1ms    119.3ms          -          -          -

QR decomposition with no pivoting

Factorizing a square matrix with dimension n as QR, where Q is unitary and R is upper triangular.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      106ns      106ns      658ns          -          -
    8      295ns      296ns      1.6ยตs          -          -
   16        1ยตs        1ยตs      4.7ยตs          -          -
   32        5ยตs        5ยตs       18ยตs          -          -
   64     31.9ยตs     31.8ยตs     75.6ยตs          -          -
   96     79.7ยตs     79.7ยตs    526.7ยตs          -          -
  128    158.8ยตs    157.9ยตs      4.7ms          -          -
  192    451.3ยตs    450.1ยตs     49.9ms          -          -
  256    963.8ยตs      1.1ms     72.8ms          -          -
  384      2.9ms      2.9ms    148.4ms          -          -
  512      6.6ms      5.4ms    291.2ms          -          -
  640     12.4ms      8.9ms    337.5ms          -          -
  768     20.8ms     13.7ms    413.4ms          -          -
  896     31.6ms     20.7ms    546.6ms          -          -
 1024     45.6ms     28.3ms    734.9ms          -          -

QR decomposition with column pivoting

Factorizing a square matrix with dimension n as QRP, where P is a permutation matrix, Q is unitary and R is upper triangular.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      175ns      162ns          -          -          -
    8      422ns      436ns          -          -          -
   16      1.6ยตs      1.6ยตs          -          -          -
   32        8ยตs      7.9ยตs          -          -          -
   64     58.7ยตs     93.9ยตs          -          -          -
   96      169ยตs    213.4ยตs          -          -          -
  128    358.2ยตs    422.9ยตs          -          -          -
  192      1.2ms      1.2ms          -          -          -
  256      2.7ms        3ms          -          -          -
  384      8.8ms      8.5ms          -          -          -
  512     20.5ms     16.5ms          -          -          -
  640     39.9ms     27.4ms          -          -          -
  768     67.8ms     43.4ms          -          -          -
  896    107.9ms     59.4ms          -          -          -
 1024    159.2ms       77ms          -          -          -

Matrix inverse

Computing the inverse of a square matrix with dimension n.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      715ns     12.4ยตs      406ns          -          -
    8      1.6ยตs     16.7ยตs      991ns          -          -
   16      4.2ยตs     25.4ยตs      4.1ยตs          -          -
   32     13.6ยตs     53.5ยตs     12.8ยตs          -          -
   64     51.3ยตs     96.7ยตs     56.2ยตs          -          -
   96      130ยตs      169ยตs    450.3ยตs          -          -
  128    240.2ยตs    260.5ยตs      3.3ms          -          -
  192    659.7ยตs    589.3ยตs      6.1ms          -          -
  256      1.4ms      1.1ms      7.2ms          -          -
  384      4.1ms      2.4ms     15.8ms          -          -
  512      9.8ms      4.5ms     21.3ms          -          -
  640     17.3ms      7.5ms     28.9ms          -          -
  768     28.7ms     10.8ms       40ms          -          -
  896     44.9ms     15.3ms     52.6ms          -          -
 1024     74.2ms     23.2ms     70.7ms          -          -

Square matrix singular value decomposition

Computing the SVD of a square matrix with dimension n.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      1.4ยตs      1.3ยตs      2.4ยตs          -          -
    8      7.4ยตs     23.1ยตs      7.4ยตs          -          -
   16     23.3ยตs     56.8ยตs     24.9ยตs          -          -
   32     85.2ยตs    130.1ยตs     86.7ยตs          -          -
   64      374ยตs    464.2ยตs      4.1ms          -          -
   96      953ยตs      1.3ms        7ms          -          -
  128      1.9ms      2.2ms     51.7ms          -          -
  192      5.3ms      6.3ms    135.2ms          -          -
  256     11.4ms       11ms    204.5ms          -          -
  384     34.1ms     27.3ms    312.2ms          -          -
  512     77.3ms     48.9ms      516ms          -          -
  640    144.7ms     78.2ms    688.1ms          -          -
  768    248.5ms    119.6ms    912.3ms          -          -
  896    384.6ms    176.2ms      1.16s          -          -
 1024    560.5ms    249.6ms      1.64s          -          -

Thin matrix singular value decomposition

Computing the SVD of a rectangular matrix with shape (4096, n).

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4     43.6ยตs     43.7ยตs    578.8ยตs          -          -
    8    165.7ยตs    180.3ยตs      1.5ms          -          -
   16    471.3ยตs    573.1ยตs      7.8ms          -          -
   32      1.4ms      1.5ms     15.2ms          -          -
   64      4.6ms      4.5ms     58.3ms          -          -
   96      9.3ms      8.2ms     62.8ms          -          -
  128     16.1ms     12.8ms    103.5ms          -          -
  192       34ms     24.8ms    237.3ms          -          -
  256     60.9ms     41.3ms    383.9ms          -          -
  384    134.4ms     83.4ms    663.6ms          -          -
  512      248ms    142.1ms      1.05s          -          -
  640    402.7ms    204.7ms      1.52s          -          -
  768    616.6ms    298.7ms      1.81s          -          -
  896    874.1ms    399.3ms      2.44s          -          -
 1024      1.19s    532.8ms      2.87s          -          -

Hermitian matrix eigenvalue decomposition

Computing the EVD of a hermitian matrix with shape (n, n).

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      1.1ยตs        1ยตs      1.2ยตs          -          -
    8        3ยตs      3.6ยตs      4.2ยตs          -          -
   16       12ยตs     11.8ยตs     15.5ยตs          -          -
   32     50.4ยตs     53.6ยตs       68ยตs          -          -
   64    224.6ยตs    245.4ยตs    437.8ยตs          -          -
   96    519.9ยตs    615.6ยตs      1.4ms          -          -
  128    989.7ยตs      1.1ms        6ms          -          -
  192      2.5ms      2.8ms     25.6ms          -          -
  256      5.1ms      4.8ms     51.1ms          -          -
  384     14.5ms     14.8ms    122.3ms          -          -
  512     31.6ms     27.8ms    232.1ms          -          -
  640     58.5ms     44.6ms    370.5ms          -          -
  768       97ms     65.1ms      554ms          -          -
  896    148.8ms     92.1ms    769.1ms          -          -
 1024    218.1ms    122.4ms      1.09s          -          -

Non Hermitian matrix eigenvalue decomposition

Computing the EVD of a matrix with shape (n, n).

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      4.2ยตs      4.3ยตs      2.5ยตs          -          -
    8     14.7ยตs       14ยตs      8.6ยตs          -          -
   16     51.5ยตs       49ยตs     31.3ยตs          -          -
   32    228.1ยตs    225.8ยตs    146.1ยตs          -          -
   64        1ms        1ms    822.7ยตs          -          -
   96      2.8ms        3ms      5.1ms          -          -
  128      5.2ms      5.2ms     25.1ms          -          -
  192     14.2ms     14.4ms    121.7ms          -          -
  256     26.9ms     26.7ms    312.8ms          -          -
  384     69.1ms     78.5ms    624.7ms          -          -
  512    159.2ms    177.8ms      1.02s          -          -
  640    256.7ms    259.4ms      1.17s          -          -
  768    407.6ms    382.6ms      1.69s          -          -
  896    588.6ms    531.4ms      2.03s          -          -
 1024      1.03s    931.6ms      2.75s          -          -

f128

Matrix multiplication

Multiplication of two square matrices of dimension n.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      150ns      151ns          -          -          -
    8      1.3ยตs     16.2ยตs          -          -          -
   16      8.1ยตs     25.4ยตs          -          -          -
   32     60.4ยตs     40.6ยตs          -          -          -
   64    474.6ยตs    162.2ยตs          -          -          -
   96      1.6ms    519.4ยตs          -          -          -
  128      3.8ms    961.5ยตs          -          -          -
  192     12.8ms      2.6ms          -          -          -
  256     30.7ms      5.5ms          -          -          -
  384    103.2ms     16.5ms          -          -          -
  512    254.1ms     39.6ms          -          -          -
  640    481.4ms     70.9ms          -          -          -
  768    866.7ms    133.5ms          -          -          -
  896      1.34s      185ms          -          -          -
 1024      2.25s    311.7ms          -          -          -

Triangular solve

Solving AX = B in place where A and B are two square matrices of dimension n, and A
is a triangular matrix.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4       73ns       73ns          -          -          -
    8      670ns      669ns          -          -          -
   16      5.1ยตs     24.7ยตs          -          -          -
   32     37.8ยตs    179.7ยตs          -          -          -
   64    269.6ยตs    471.8ยตs          -          -          -
   96    892.4ยตs    568.2ยตs          -          -          -
  128      2.1ms    926.1ยตs          -          -          -
  192      6.9ms      2.3ms          -          -          -
  256       16ms      4.4ms          -          -          -
  384     53.9ms       12ms          -          -          -
  512    126.2ms     25.5ms          -          -          -
  640    243.8ms     47.3ms          -          -          -
  768    419.6ms     71.8ms          -          -          -
  896    662.1ms    118.7ms          -          -          -
 1024      1.03s    170.5ms          -          -          -

Triangular inverse

Computing A^-1 where A is a square triangular matrix with dimension n.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      271ns       16ยตs          -          -          -
    8      965ns     13.8ยตs          -          -          -
   16      4.2ยตs     35.5ยตs          -          -          -
   32       22ยตs     85.3ยตs          -          -          -
   64    132.6ยตs    330.6ยตs          -          -          -
   96      389ยตs    566.4ยตs          -          -          -
  128    862.3ยตs        1ms          -          -          -
  192      2.7ms        2ms          -          -          -
  256      6.1ms        3ms          -          -          -
  384     19.6ms      7.6ms          -          -          -
  512       46ms       13ms          -          -          -
  640     96.4ms     25.3ms          -          -          -
  768      155ms     35.2ms          -          -          -
  896    235.6ms       54ms          -          -          -
 1024    349.3ms     75.2ms          -          -          -

Cholesky decomposition

Factorizing a square matrix with dimension n as Lร—L.T, where L is lower triangular.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      289ns      286ns          -          -          -
    8      692ns      696ns          -          -          -
   16      2.4ยตs      2.4ยตs          -          -          -
   32     18.9ยตs     58.2ยตs          -          -          -
   64    128.9ยตs    393.8ยตs          -          -          -
   96    368.2ยตs      844ยตs          -          -          -
  128    858.1ยตs      1.6ms          -          -          -
  192      2.7ms      3.6ms          -          -          -
  256      6.1ms      5.2ms          -          -          -
  384     20.1ms     12.8ms          -          -          -
  512     45.8ms       19ms          -          -          -
  640     88.4ms     36.8ms          -          -          -
  768    147.9ms     50.1ms          -          -          -
  896    231.4ms     69.7ms          -          -          -
 1024    348.7ms     89.7ms          -          -          -

LU decomposition with partial pivoting

Factorizing a square matrix with dimension n as Pร—Lร—U, where P is a permutation matrix,
L is unit lower triangular and U is upper triangular.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      214ns      186ns          -          -          -
    8      580ns      596ns          -          -          -
   16      2.4ยตs      2.5ยตs          -          -          -
   32     21.7ยตs     64.4ยตs          -          -          -
   64    173.7ยตs    369.3ยตs          -          -          -
   96    581.2ยตs    889.6ยตs          -          -          -
  128      1.3ms      1.5ms          -          -          -
  192      4.8ms      3.7ms          -          -          -
  256     10.9ms      6.4ms          -          -          -
  384     36.4ms     16.3ms          -          -          -
  512     85.5ms     33.5ms          -          -          -
  640    165.6ms       54ms          -          -          -
  768    284.5ms     80.2ms          -          -          -
  896    450.1ms      125ms          -          -          -
 1024    684.1ms    173.6ms          -          -          -

LU decomposition with full pivoting

Factorizing a square matrix with dimension n as Pร—Lร—Uร—Q.T, where P and Q are
permutation matrices, L is unit lower triangular and U is upper triangular.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      310ns      314ns          -          -          -
    8      1.2ยตs      1.1ยตs          -          -          -
   16      5.4ยตs      5.5ยตs          -          -          -
   32     35.5ยตs     35.6ยตs          -          -          -
   64    255.5ยตs      256ยตs          -          -          -
   96    843.7ยตs    840.9ยตs          -          -          -
  128        2ms        2ms          -          -          -
  192      6.7ms      6.7ms          -          -          -
  256     16.2ms     16.2ms          -          -          -
  384     52.2ms     47.8ms          -          -          -
  512    125.9ms     76.2ms          -          -          -
  640    241.1ms    114.2ms          -          -          -
  768    417.1ms    166.8ms          -          -          -
  896    661.6ms    232.6ms          -          -          -
 1024    996.3ms    326.2ms          -          -          -

QR decomposition with no pivoting

Factorizing a square matrix with dimension n as QR, where Q is unitary and R is upper
triangular.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      698ns      699ns          -          -          -
    8      2.3ยตs      2.3ยตs          -          -          -
   16     10.9ยตs     10.7ยตs          -          -          -
   32     56.8ยตs     56.6ยตs          -          -          -
   64    519.8ยตs    518.4ยตs          -          -          -
   96      1.5ms      1.5ms          -          -          -
  128      3.4ms      3.5ms          -          -          -
  192     10.7ms     10.8ms          -          -          -
  256     24.7ms     18.5ms          -          -          -
  384     80.8ms       52ms          -          -          -
  512    198.4ms     97.7ms          -          -          -
  640    368.2ms    146.3ms          -          -          -
  768      624ms    219.9ms          -          -          -
  896    980.1ms    318.4ms          -          -          -
 1024      1.46s    422.8ms          -          -          -

QR decomposition with column pivoting

Factorizing a square matrix with dimension n as QRP, where P is a permutation matrix,
Q is unitary and R is upper triangular.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      947ns      939ns          -          -          -
    8      3.6ยตs      3.6ยตs          -          -          -
   16     16.8ยตs     16.8ยตs          -          -          -
   32       97ยตs     97.4ยตs          -          -          -
   64    684.6ยตs    714.8ยตs          -          -          -
   96        2ms      1.9ms          -          -          -
  128      4.4ms      4.2ms          -          -          -
  192     13.7ms     13.1ms          -          -          -
  256     30.7ms     25.1ms          -          -          -
  384     99.1ms     49.1ms          -          -          -
  512    231.4ms       90ms          -          -          -
  640    449.1ms    151.6ms          -          -          -
  768    764.5ms    230.5ms          -          -          -
  896      1.21s    332.2ms          -          -          -
 1024      1.79s      462ms          -          -          -

Matrix inverse

Computing the inverse of a square matrix with dimension n.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      1.4ยตs     15.1ยตs          -          -          -
    8      4.5ยตs     36.1ยตs          -          -          -
   16     20.4ยตs     63.4ยตs          -          -          -
   32    110.3ยตs    196.8ยตs          -          -          -
   64    694.5ยตs    777.2ยตs          -          -          -
   96      2.1ms      1.7ms          -          -          -
  128      4.7ms      3.1ms          -          -          -
  192     15.1ms      7.8ms          -          -          -
  256       35ms     13.6ms          -          -          -
  384    117.1ms     35.4ms          -          -          -
  512    267.1ms     64.2ms          -          -          -
  640    524.5ms    120.4ms          -          -          -
  768    899.5ms    179.5ms          -          -          -
  896       1.4s    273.8ms          -          -          -
 1024      2.17s    405.6ms          -          -          -

Square matrix singular value decomposition

Computing the SVD of a square matrix with dimension n.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4     11.6ยตs     11.7ยตs          -          -          -
    8     60.4ยตs     96.7ยตs          -          -          -
   16    259.8ยตs    333.2ยตs          -          -          -
   32      1.1ms      1.1ms          -          -          -
   64      5.9ms      5.6ms          -          -          -
   96     16.4ms     12.4ms          -          -          -
  128     31.5ms     22.2ms          -          -          -
  192     89.6ms       51ms          -          -          -
  256    194.7ms     89.3ms          -          -          -
  384    581.4ms    243.3ms          -          -          -
  512      1.32s    464.9ms          -          -          -
  640      2.48s      758ms          -          -          -
  768      4.13s      1.16s          -          -          -
  896      6.45s      1.72s          -          -          -
 1024      9.51s      2.27s          -          -          -

Thin matrix singular value decomposition

Computing the SVD of a rectangular matrix with shape (4096, n).

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      1.1ms      2.1ms          -          -          -
    8      3.6ms      4.6ms          -          -          -
   16     10.2ms     13.2ms          -          -          -
   32     32.3ms     33.1ms          -          -          -
   64    120.3ms     86.5ms          -          -          -
   96    253.1ms    142.4ms          -          -          -
  128    444.6ms    216.4ms          -          -          -
  192    996.1ms    399.4ms          -          -          -
  256       1.8s    645.6ms          -          -          -
  384      4.05s      1.24s          -          -          -
  512      7.37s       2.1s          -          -          -
  640     11.67s      3.04s          -          -          -
  768     17.43s       4.6s          -          -          -
  896     23.81s      6.02s          -          -          -
 1024     32.59s      7.85s          -          -          -

Hermitian matrix eigenvalue decomposition

Computing the EVD of a hermitian matrix with shape (n, n).

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      5.4ยตs      5.8ยตs          -          -          -
    8     21.9ยตs     25.7ยตs          -          -          -
   16      102ยตs     99.5ยตs          -          -          -
   32    515.3ยตs    497.8ยตs          -          -          -
   64      3.1ms      3.1ms          -          -          -
   96      8.6ms      7.2ms          -          -          -
  128     17.5ms     13.1ms          -          -          -
  192     51.1ms     33.1ms          -          -          -
  256    108.3ms     65.1ms          -          -          -
  384    339.3ms    152.3ms          -          -          -
  512    745.4ms    283.9ms          -          -          -
  640       1.4s    457.2ms          -          -          -
  768      2.36s    702.8ms          -          -          -
  896      3.72s      1.03s          -          -          -
 1024      5.45s      1.37s          -          -          -

Non Hermitian matrix eigenvalue decomposition

Computing the EVD of a matrix with shape (n, n).

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4     16.2ยตs     14.7ยตs          -          -          -
    8     68.3ยตs    101.5ยตs          -          -          -
   16    349.1ยตs      388ยตs          -          -          -
   32      1.8ms      1.8ms          -          -          -
   64     11.4ms     12.8ms          -          -          -
   96     46.5ms     60.2ms          -          -          -
  128     92.9ms    104.8ms          -          -          -
  192    299.1ms    221.5ms          -          -          -
  256    593.9ms    418.8ms          -          -          -
  384      1.61s    978.8ms          -          -          -
  512      3.46s         2s          -          -          -
  640      6.26s      3.63s          -          -          -
  768     10.17s      5.24s          -          -          -
  896     15.31s      7.25s          -          -          -
 1024     22.59s      9.99s          -          -          -

c32

Matrix multiplication

Multiplication of two square matrices of dimension n.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4       61ns       61ns       97ns          -          -
    8      317ns      319ns      208ns          -          -
   16        2ยตs        2ยตs      659ns          -          -
   32     14.1ยตs     14.1ยตs        4ยตs          -          -
   64    107.4ยตs    107.4ยตs    837.1ยตs          -          -
   96      356ยตs    123.6ยตs      1.1ms          -          -
  128    834.9ยตs    285.1ยตs      1.3ms          -          -
  192      2.8ms    675.1ยตs      1.5ms          -          -
  256      6.6ms      1.4ms        2ms          -          -
  384     22.3ms      4.6ms      3.1ms          -          -
  512     52.6ms     10.6ms      6.7ms          -          -
  640    103.6ms     19.3ms      9.9ms          -          -
  768    178.3ms     32.5ms     16.9ms          -          -
  896    282.8ms     50.6ms     23.1ms          -          -
 1024    422.1ms     74.6ms     33.5ms          -          -

Triangular solve

Solving AX = B in place where A and B are two square matrices of dimension n, and A is a triangular matrix.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4       22ns       22ns     26.8ยตs          -          -
    8      203ns      202ns     27.6ยตs          -          -
   16      1.3ยตs      1.3ยตs     30.5ยตs          -          -
   32      8.8ยตs      8.8ยตs     40.6ยตs          -          -
   64     62.7ยตs     62.7ยตs     59.5ยตs          -          -
   96      206ยตs      132ยตs    332.9ยตs          -          -
  128    466.5ยตs      272ยตs    432.2ยตs          -          -
  192      1.5ms    558.7ยตs    602.8ยตs          -          -
  256      3.5ms      1.1ms    919.3ยตs          -          -
  384     11.7ms      2.9ms      1.8ms          -          -
  512     27.4ms      6.2ms      4.5ms          -          -
  640     53.3ms     11.7ms        5ms          -          -
  768     91.6ms     19.5ms     10.2ms          -          -
  896    144.6ms     29.4ms       12ms          -          -
 1024    215.7ms     43.6ms     25.1ms          -          -

Triangular inverse

Computing A^-1 where A is a square triangular matrix with dimension n.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      186ns     15.9ยตs     27.5ยตs          -          -
    8      534ns     12.7ยตs     27.6ยตs          -          -
   16      1.8ยตs     16.5ยตs     30.5ยตs          -          -
   32      7.1ยตs     27.2ยตs     40.7ยตs          -          -
   64     35.2ยตs     56.5ยตs     59.6ยตs          -          -
   96    101.2ยตs    146.3ยตs     85.4ยตs          -          -
  128    208.8ยตs    251.4ยตs    454.6ยตs          -          -
  192      636ยตs    576.1ยตs    597.7ยตs          -          -
  256      1.4ms        1ms      1.1ms          -          -
  384      4.4ms      1.8ms      1.8ms          -          -
  512     10.1ms      3.1ms      4.5ms          -          -
  640     19.3ms      5.3ms        5ms          -          -
  768     32.8ms      7.9ms     10.2ms          -          -
  896     51.7ms     11.2ms       12ms          -          -
 1024     76.5ms     16.3ms       25ms          -          -

Cholesky decomposition

Factorizing a square matrix with dimension n as Lร—L.T, where L is lower triangular.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4       45ns       45ns      124ns          -          -
    8      172ns      172ns      397ns          -          -
   16      925ns      925ns      1.3ยตs          -          -
   32      5.1ยตs      5.2ยตs      6.6ยตs          -          -
   64     30.3ยตs     30.4ยตs    176.3ยตs          -          -
   96     85.9ยตs       86ยตs      1.7ms          -          -
  128    192.3ยตs    192.3ยตs      1.5ms          -          -
  192    583.1ยตs    541.4ยตs        4ms          -          -
  256      1.3ms    934.9ยตs        5ms          -          -
  384      4.2ms      2.3ms       10ms          -          -
  512      9.8ms      3.8ms     11.5ms          -          -
  640     18.9ms      6.5ms     26.6ms          -          -
  768     32.2ms      9.9ms     20.1ms          -          -
  896     50.8ms     13.8ms     18.9ms          -          -
 1024     76.1ms     19.1ms     28.1ms          -          -

LU decomposition with partial pivoting

Factorizing a square matrix with dimension n as Pร—Lร—U, where P is a permutation matrix, L is unit lower triangular and U is upper triangular.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      105ns       91ns      176ns          -          -
    8      266ns      273ns      566ns          -          -
   16        1ยตs      1.1ยตs      2.3ยตs          -          -
   32      6.5ยตs      6.6ยตs      7.8ยตs          -          -
   64     45.5ยตs     45.4ยตs     30.9ยตs          -          -
   96    145.9ยตs    145.5ยตs       74ยตs          -          -
  128    327.9ยตs    326.5ยตs    360.5ยตs          -          -
  192      1.1ms    863.8ยตs      756ยตs          -          -
  256      2.4ms      1.7ms        1ms          -          -
  384        8ms      4.2ms      2.6ms          -          -
  512     18.7ms        8ms      4.7ms          -          -
  640     36.3ms     13.9ms      7.6ms          -          -
  768     62.6ms     20.9ms     10.8ms          -          -
  896     98.6ms     29.8ms     14.7ms          -          -
 1024    146.5ms       43ms       24ms          -          -

LU decomposition with full pivoting

Factorizing a square matrix with dimension n as Pร—Lร—Uร—Q.T, where P and Q are permutation matrices, L is unit lower triangular and U is upper triangular.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      191ns      185ns          -          -          -
    8      573ns      516ns          -          -          -
   16      2.2ยตs      2.3ยตs          -          -          -
   32     12.4ยตs     12.3ยตs          -          -          -
   64     82.7ยตs     82.4ยตs          -          -          -
   96    262.9ยตs    262.7ยตs          -          -          -
  128    609.8ยตs    609.7ยตs          -          -          -
  192        2ms        2ms          -          -          -
  256      4.9ms      4.9ms          -          -          -
  384     16.4ms     15.6ms          -          -          -
  512     40.7ms     29.4ms          -          -          -
  640     77.3ms       41ms          -          -          -
  768    134.2ms     63.2ms          -          -          -
  896    215.1ms     99.3ms          -          -          -
 1024    316.8ms    136.3ms          -          -          -

QR decomposition with no pivoting

Factorizing a square matrix with dimension n as QR, where Q is unitary and R is upper triangular.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      164ns      164ns      879ns          -          -
    8      487ns      487ns      2.1ยตs          -          -
   16      2.5ยตs      2.5ยตs      6.9ยตs          -          -
   32     16.1ยตs     16.1ยตs     29.1ยตs          -          -
   64    132.5ยตs    132.5ยตs        1ms          -          -
   96    377.1ยตs    377.3ยตs      5.2ms          -          -
  128    815.1ยตs    815.2ยตs      9.7ms          -          -
  192      2.5ms      2.5ms     47.7ms          -          -
  256      5.7ms      4.7ms     68.9ms          -          -
  384     18.2ms     11.3ms    132.7ms          -          -
  512     41.9ms     21.9ms    176.6ms          -          -
  640     78.9ms     34.2ms      266ms          -          -
  768    134.3ms     52.8ms    350.6ms          -          -
  896    210.9ms     77.2ms    453.1ms          -          -
 1024    313.2ms    109.5ms    544.3ms          -          -

QR decomposition with column pivoting

Factorizing a square matrix with dimension n as QRP, where P is a permutation matrix, Q is unitary and R is upper triangular.

thread 'main' panicked at 'cast>SizeMismatch', /Users/leifeld/.cargo/registry/src/index.crates.io-6f17d22bba15001f/bytemuck-1.13.1/src/internal.rs:32:3
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
leifeld@MacStudovonDirk faer-bench %  77.3ms       41ms          -          -          -
  768    134.2ms     63.2ms          -          -          -
  896    215.1ms     99.3ms          -          -          -
 1024    316.8ms    136.3ms          -          -          -

QR decomposition with no pivoting

Factorizing a square matrix with dimension n as QR, where Q is unitary and R is upper triangular.

    n       faer  faer(par)    ndarray   nalgebra      eigen
    4      164ns      164ns      879ns          -          -
    8      487ns      487ns      2.1ยตs          -          -
   16      2.5ยตs      2.5ยตs      6.9ยตs          -          -
   32     16.1ยตs     16.1ยตs     29.1ยตs          -          -
   64    132.5ยตs    132.5ยตs        1ms          -          -
   96    377.1ยตs    377.3ยตs      5.2ms          -          -
  128    815.1ยตs    815.2ยตs      9.7ms          -          -
  192      2.5ms      2.5ms     47.7ms          -          -
  256      5.7ms      4.7ms     68.9ms          -          -
  384     18.2ms     11.3ms    132.7ms          -          -
  512     41.9ms     21.9ms    176.6ms          -          -
  640     78.9ms     34.2ms      266ms          -          -
  768    134.3ms     52.8ms    350.6ms          -          -
  896    210.9ms     77.2ms    453.1ms          -          -
 1024    313.2ms    109.5ms    544.3ms          -          -

QR decomposition with column pivoting

Factorizing a square matrix with dimension n as QRP, where P is a permutation matrix, Q is unitary and R is upper triangular.

thread 'main' panicked at 'cast>SizeMismatch', /Users/leifeld/.cargo/registry/src/index.crates.io-6f17d22bba15001f/bytemuck-1.13.1/src/internal.rs:32:3

Add support for matrix multiplication with `half::f16`/`half::bf16`

Is your feature request related to a problem? Please describe.
Currently none of the matrix multiplication crates support the half (fp16) datatype. There is the half crate (https://crates.io/crates/half) that includes a rust type for this.

This means for any crate that needs to support matrix multiplication on CPU with f16 datatype, they need to manually implement a matrix multiplication algorithm, which can be very slow.

Describe the solution you'd like
It'd be great if faer included configurable support for these data types.

Describe alternatives you've considered
I'm currently using this super naive matmul implementation:

fn naive_gemm<F: num_traits::Float + std::ops::AddAssign, M: Dim, K: Dim, N: Dim>(
    (m, k, n): (M, K, N),
    ap: *const F,
    a_strides: [usize; 2],
    bp: *const F,
    b_strides: [usize; 2],
    cp: *mut F,
    c_strides: [usize; 2],
) {
    for i_m in 0..m.size() {
        for i_k in 0..k.size() {
            for i_n in 0..n.size() {
                unsafe {
                    let a = *ap.add(a_strides[0] * i_m + a_strides[1] * i_k);
                    let b = *bp.add(b_strides[0] * i_k + b_strides[1] * i_n);
                    let c = cp.add(c_strides[0] * i_m + c_strides[1] * i_n);
                    *c += a * b;
                }
            }
        }
    }
}

Which is extremely slow. I'm not sure what the other alternatives are.

Additional context

I'm the author of dfdx. dfdx has both cuda/CPU support, and CUDA does have hardware acceleration for f16. It'd just be nice if f16 matmul on CPU was a bit faster!

PyArray2 into faer to simplify python interfaces to rust routines using faer

Currently my pyO3 bindings that call some rust routines that use faer look something like:

use numpy::{IntoPyArray, PyArray2, PyReadonlyArray2};
use pyo3::{exceptions::PyRuntimeError, pymodule, types::PyModule, PyResult, Python};
use faer::{prelude::*}
// ...

#[pymodule]
fn rust_lib<'py>(_py: Python<'py>, m: &'py PyModule)
    -> PyResult<()>
{
   #[pyfn(m)]
    fn rpca<'py>(py: Python<'py>, a_py: PyReadonlyArray2<'py, f64>, n_rank: usize, n_iters: usize, n_oversamples: usize)
        -> &'py PyArray2<f64>
    {
        let a_ndarray = a_py.as_array();
        let a_faer = a_ndarray.view().into_faer();
        // ... faer-rs math
        ndarray_result.into_pyarray(py)
    }
}

I'm unsure if I am using the best approach.

Desired Solution:

// ...

   #[pyfn(m)]
    fn rpca<'py>(py: Python<'py>, a_py: PyReadonlyArray2<'py, f64>, n_rank: usize, n_iters: usize, n_oversamples: usize)
        -> &'py PyArray2<f64>
    {
        let a_faer = a_py.into_faer();
        // ... faer-rs math
        faer_result.into_pyarray(py)
    }

Sparse Cholesky Solver

Is your feature request related to a problem? Please describe.
Some problems need a Sparse Cholesky solver, such as GraphSlam

Describe the solution you'd like
Implement a Sparse Cholesky Solver

Describe alternatives you've considered
In Rust, I used Russel which wraps SuiteSparse

Low Level API with Pre Allocated Work Space Exposed

It would be great if there will be, for any function, a low level API which exposes the needed workspace to avoid any allocations of the function.

The work is impressive. Being competitive with the big guys is nothing short of amazing considering this is a single person show.

A specialized Diagonal type?

I'm frequently using matrix-diagonal and diagonal-diagonal multiplication (which is a component-wise vector mul). Since populating a matrix with a diagonal does lead to wasted performance, I'm using a modified copy of div_by_s I found in faer source.

Does it make sense to flesh out a Diagonal type?

Add a symmetric rank k operation (e.g. like DSYRK in BLAS)

Is your feature request related to a problem? Please describe.
It would be great to have an equivalent to DSYRK in faer, to do the operations
C = alpha * A * A^T + beta * C
and
C = alpha * A^T * A + beta * C
This is useful for example when accumulating normal equations.

Describe the solution you'd like
A symmetric rank k routine.

Describe alternatives you've considered
The operation can be done with GEMM, but is significantly slower.

Apache arrow input

Apache Arrow is an aligned chunked array representation of columns. If my matrix is a dense matrix encoded as a chunked array of float64 numbers is it possible to use faer-rs on top of it in a zero-copy way? I assume further constraints (chunksize has to be a multiple of some batch size) will be required. Or will we have to combine the (relatively large) chunks into a continuous aligned array first before feeding it to faer?

Amazing Result vs Intel MKL ???

I have just seen your crate and ran a quick benchmark against intel mkl.

I got the following result

test bench_faer_rs_n        ... bench:  33,479,057 ns/iter (+/- 9,121,197)
test bench_faer_rs_t        ... bench:  32,854,984 ns/iter (+/- 7,885,723)
test bench_mkl_n            ... bench:  35,916,818 ns/iter (+/- 9,305,052)
test bench_mkl_t            ... bench:  35,681,294 ns/iter (+/- 10,449,638)

I tried it several times just to make sure it is not a luck result, or bug. Having on par performance with intel mkl library is amazing!!!.

Maybe you can confirm yourself and put a comparison with intelmkl to documentation since it is one of the fastest blas implementation, if not the fastest.

`i32` overflow when calling `Faer::selfadjoint_eigenvalues()` on a `20` x `20` matrix

Describe the bug

For many graph adjacency matrices with small eigenvalues, faer crashes on an i32 overflow when computing eigenvalues as a self-adjoint matrix.

The precise line is iter += 1;.

My current workaround to avoid selfadjoint_eigenvalues() is

let eigs: Vec<faer::complex_native::c64> = a.eigenvalues();

To Reproduce

let mat: [[f64; 20]; 20] = [[0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0]];
let mut matt = faer::Mat::zeros(20, 20);
for i in 0..20 {
    for j in 0..20 {
        matt[(i, j)] = mat[i][j];
    }
}
let eigs = matt.selfadjoint_eigenvalues(faer::Side::Lower);

Expected behavior
Expected eigs to return a vector of $20$ eigenvalues.

For example,

# `bad_mat.py`
# run with `python bad_mat.py`

import numpy as np
mat = [[0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0]]
mat = np.array(mat)

eigs = np.linalg.eig(mat)[0]
print(eigs)

real_parts = [np.real(_) for _ in eigs]
print(real_parts)

''' output:
[ 3.69390168e+00+0.00000000e+00j -3.23272209e+00+0.00000000e+00j
 -2.52597629e+00+0.00000000e+00j  2.29766159e+00+0.00000000e+00j
  2.19984492e+00+0.00000000e+00j -2.25650189e+00+0.00000000e+00j
 -1.98531299e+00+0.00000000e+00j -1.48123145e+00+0.00000000e+00j
 -1.10237216e+00+0.00000000e+00j  1.47508155e+00+0.00000000e+00j
 -6.62467970e-01+0.00000000e+00j -5.32411898e-01+0.00000000e+00j
 -2.28816715e-01+0.00000000e+00j  4.97954067e-01+0.00000000e+00j
  7.40821370e-01+0.00000000e+00j  1.10254826e+00+0.00000000e+00j
  1.00000000e+00+0.00000000e+00j  1.00000000e+00+0.00000000e+00j
 -1.88640048e-17+9.57840098e-17j -1.88640048e-17-9.57840098e-17j]
[3.6939016833383165, -3.2327220859059014, -2.52597628501941, 2.2976615943259593, 2.199844923144114, -2.2565018926765257, -1.9853129913610488, -1.4812314534620308, -1.102372157787216, 1.4750815501253782, -0.6624679696406541, -0.5324118977580056, -0.22881671515516241, 0.49795406722182745, 0.7408213702790264, 1.1025482603313292, 1.0000000000000004, 1.0000000000000004, -1.886400483817679e-17, -1.886400483817679e-17]
'''

Additional context
Feel free to suggest a different method to use!

Since these matrices have a few very small eigenvalues, perhaps it is appropriate to tweak the high-level function to instead calculate a good guess for consider_zero_threshold in the higher-level API.

In this instance, the value equals 2.2250738585072014e-308.

pub fn compute_tridiag_real_evd_qr_algorithm<E: RealField>(
    diag: &mut [E],
    offdiag: &mut [E],
    u: Option<MatMut<'_, E>>,
    epsilon: E,
    consider_zero_threshold: E,
) {

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.