Not sure which changes yet, but it stopped working.

If b is an array of Integers, doesn't work. (about JuliaSmoothOptimizers/Krylov.jl)

Comments (18)

abelsiqueira commented on May 18, 2024

In that case, the third option works well. I'll implement and make a PR.

from krylov.jl.

abelsiqueira commented on May 18, 2024

Shame on me. It failed because I used a rhs ~~with integer values~~ of integer type. Will investigate further, though.

dpo commented on May 18, 2024

What doesn't work exactly?

abelsiqueira commented on May 18, 2024

It looks like BLAS.dot only accepts floats. I see three solutions:

1. Change BLAS.dot to dot;
2. Change the type of b from Array{T,1} to Array{Float64,1}, so that it still won't work for Array{Int,1}, but it will be clearer where it fails;
3. Change r = copy(b) to r = 1.0*b.
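
A minimal sketch of the failure and of the first and third options, assuming a recent Julia where BLAS is reached through the LinearAlgebra standard library (in the v0.5 era of this thread it lived under Base). The vector is illustrative and this is not Krylov.jl code.

```julia
using LinearAlgebra

b = [1, 2, 3]            # Vector{Int}, like the failing right-hand side
n = length(b)

# BLAS.dot has no method for integer element types, so this call throws.
blas_fails = try
    BLAS.dot(n, b, 1, b, 1)
    false
catch
    true
end

# Option 1: the generic dot accepts any numeric element type.
d1 = dot(b, b)

# Option 3: promote to floating point, then BLAS.dot is applicable.
r = 1.0 * b              # Vector{Float64}
d3 = BLAS.dot(n, r, 1, r, 1)
```

Note that `1.0 * b` also allocates a new vector, so it still plays the role of `copy(b)`.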

dpo commented on May 18, 2024

Yes, BLAS only supports Float32, Float64, Complex64 and Complex128. In my tests, BLAS.dot can be much faster than dot, but things might have changed. I think b should be declared Vector{T} with T <: Real. If the user's b is integer, we should convert it to Float64.
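
One way this suggestion could look, as a sketch only: `residual_copy` is a hypothetical helper name, not a Krylov.jl function, and the actual fix in the package may differ.

```julia
using LinearAlgebra

# Hypothetical helper: accept any real-valued right-hand side and
# return a floating-point working copy that BLAS routines can use.
function residual_copy(b::AbstractVector{T}) where {T <: Real}
    # float() maps Int to Float64 and leaves Float32/Float64 as they are;
    # copy() makes sure the caller's b is never aliased.
    return copy(float(b))
end

r_int = residual_copy([1, 2, 3])         # becomes Vector{Float64}
r_f32 = residual_copy(Float32[1, 2, 3])  # stays Vector{Float32}
```

Using `float` rather than a hard-coded Float64 conversion keeps single-precision inputs in single precision, which is a design choice made for this sketch.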

dpo commented on May 18, 2024

Note (for the future) that there are disadvantages to using BLAS.dot:

vectors must be contiguous; you can't use non-contiguous slices
the implementation doesn't generalize to multiple right-hand sides.
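
To illustrate the first point, here is a small sketch assuming a recent Julia with the LinearAlgebra standard library. The generic dot accepts a view over an arbitrary index set, whereas the low-level BLAS.dot call can only express a constant stride through its increment arguments. The data and index set are made up for the example.

```julia
using LinearAlgebra

x = collect(1.0:6.0)
y = fill(2.0, 6)
idx = [1, 4, 6]                  # irregular index set, no constant stride

# The generic dot works on any AbstractVector, including this view.
d_view = dot(view(x, idx), view(y, idx))

# A constant stride can be passed to BLAS through the increments:
# this uses x[1], x[3], x[5] and the matching entries of y.
d_strided = BLAS.dot(3, x, 2, y, 2)
```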

Maybe it's time to benchmark dot again.

abelsiqueira commented on May 18, 2024

Code: https://gist.github.com/abelsiqueira/23148a392462c2d7a07fbdb1f943119f

Minimum, mean and maximum times

BLAS.dot time/dot time. Greater than 1 means that BLAS.dot is that much faster.

Looks like BLAS.dot is much better for small problems.
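
For readers without access to the gist, a rough sketch of this kind of comparison follows. It is not the linked script: `best_time` and `blas_dot` are hypothetical names, the sizes are arbitrary, and the ratio is oriented so that values above 1 favour BLAS.dot. Results will depend on the machine and the BLAS in use.

```julia
using LinearAlgebra

# Hypothetical timing helper: best of `reps` runs of f(x, y).
function best_time(f, x, y; reps = 200)
    f(x, y)                      # warm-up so compilation is not timed
    return minimum(@elapsed(f(x, y)) for _ in 1:reps)
end

blas_dot(x, y) = BLAS.dot(length(x), x, 1, y, 1)

for n in (10, 100, 1_000, 10_000)
    x, y = rand(n), rand(n)
    ratio = best_time(dot, x, y) / best_time(blas_dot, x, y)
    println("n = $n: dot time / BLAS.dot time = $ratio")
end
```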

dpo commented on May 18, 2024

Many thanks for that! So BLAS.dot remains very much relevant.

dpo commented on May 18, 2024

What BLAS are you using?

abelsiqueira commented on May 18, 2024

The default. I think that's OpenBLAS.

dpo commented on May 18, 2024

It must be better tuned than mine because if I try to reproduce your experiment, BLAS.dot and dot are pretty much the same at size 100 or even less.

dpo commented on May 18, 2024

Fixed in #18.

dpo commented on May 18, 2024

In the same vein, I'm noticing (with my BLAS) that

BLAS.dot(x, y) (as opposed to BLAS.dot(n, x, 1, y, 1)) has the same performance as dot(x, y)
BLAS.nrm2(x) is similar to BLAS.nrm2(n, x, 1) and both are faster than norm(x).
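
For reference, the call forms being compared all compute the same quantities. A small sketch, assuming a recent Julia with the LinearAlgebra standard library and random test vectors:

```julia
using LinearAlgebra

x, y = rand(50), rand(50)
n = length(x)

# Three ways to compute the same inner product.
d_generic = dot(x, y)
d_simple  = BLAS.dot(x, y)
d_strided = BLAS.dot(n, x, 1, y, 1)

# Three ways to compute the same Euclidean norm.
nrm_generic = norm(x)
nrm_simple  = BLAS.nrm2(x)
nrm_strided = BLAS.nrm2(n, x, 1)
```

Only the timings differ between these forms, which is what the observations above are about.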

I modified your script: https://gist.github.com/272c500b0e0a8557682378b6924aeb48

abelsiqueira commented on May 18, 2024

My results (changed to 15)

dpo commented on May 18, 2024

I observe the same with nrm2. My statement above was incorrect. I fixed it.

abelsiqueira commented on May 18, 2024

Looking at
https://github.com/JuliaLang/julia/blob/v0.5.0/base/linalg/blas.jl
and
https://github.com/JuliaLang/julia/blob/v0.5.0/base/linalg/generic.jl
I found that

dot is sufficiently more complicated than BLAS.dot
BLAS.dot simple should be slightly more complicated than BLAS.dot
norm is sufficiently more complicated than BLAS.nrm2
BLAS.nrm2 simple should be slightly more complicated than BLAS.nrm2
BLAS.nrm2 is simpler than BLAS.dot simple
BLAS.dot simple uses parametric types, while BLAS.nrm2 simple doesn't.

dpo commented on May 18, 2024

A straightforward comparison of BLAS.scal! with a simple loop equivalent shows that BLAS.scal! always loses! I wonder if there's room for a package that will benchmark those things and export only the best versions (i.e., it will export, say, BLAS1.dot, which will correspond to either BLAS.dot or to Julia dot, depending on which is faster on a given machine with the BLAS used to build Julia). We should keep in mind that the BLAS functions will only be efficient with contiguous arrays of type <: BlasReal.
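
A sketch of what such a comparison and selection might look like. `loop_scal!`, `blas_scal!` and `pick_scal` are hypothetical names made up for this example, no such package is implied, and which variant wins will depend on the machine and the BLAS.

```julia
using LinearAlgebra

# Plain-loop equivalent of BLAS.scal!: x <- a * x, in place.
function loop_scal!(a, x)
    @inbounds @simd for i in eachindex(x)
        x[i] *= a
    end
    return x
end

blas_scal!(a, x) = BLAS.scal!(length(x), a, x, 1)

# Hypothetical selector: time both candidates on this machine and
# return whichever was faster for vectors of length n.
function pick_scal(n; reps = 200)
    x = rand(n)
    # Scaling by 1.0 leaves x unchanged between runs.
    t(f) = (f(1.0, x); minimum(@elapsed(f(1.0, x)) for _ in 1:reps))
    return t(blas_scal!) <= t(loop_scal!) ? blas_scal! : loop_scal!
end

best_scal! = pick_scal(1_000)
```

A real package would presumably repeat this per routine and per size range, since the thread suggests the winner can change with the vector length.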

abelsiqueira commented on May 18, 2024

I think nobody has gone to the trouble of doing this (or other upgrades on BLAS) because the performance gain is relatively small compared with the trouble of doing it. But it's at least worth investigating or mentioning on Discourse or in the issues.
