Code Monkey home page Code Monkey logo

Comments (33)

MikaelSlevinsky avatar MikaelSlevinsky commented on June 20, 2024

As this eventually plugs into ApproxFun's multivariate side, @dlfivefifty, do you have a recommendation regarding the dimensions of the convenience constructors?

from fasttransforms.jl.

dlfivefifty avatar dlfivefifty commented on June 20, 2024

What's the usage of the convenience constructors?

from fasttransforms.jl.

marcusdavidwebb avatar marcusdavidwebb commented on June 20, 2024

from fasttransforms.jl.

dlfivefifty avatar dlfivefifty commented on June 20, 2024

I think what this package really needs is some inconvenience constructors šŸ˜œ

I think he just means "convenience" as in they are not necessary for the package, but included for convenience.

from fasttransforms.jl.

marcusdavidwebb avatar marcusdavidwebb commented on June 20, 2024

from fasttransforms.jl.

MikaelSlevinsky avatar MikaelSlevinsky commented on June 20, 2024

Well, no constructor is more inconvenient than a useless constructor! They stay as is for now.

from fasttransforms.jl.

MikaelSlevinsky avatar MikaelSlevinsky commented on June 20, 2024

Another potential improvement to add to the list is to transpose the data. Givens rotations have better memory localization in row-major ordering, and the 1D row FFTs wouldn't need to transpose if they were 1D column FFTs.

from fasttransforms.jl.

meggart avatar meggart commented on June 20, 2024

Is it possible to allow Spherical harmonics analysis for arrays with an even number of columns? This currently kills my Julia session:

using FastTransforms
x=rand(100,100)

PA = FastTransforms.plan_analysis(x)
O=zeros(x)

A_mul_B!(O,PA,x)

while it works fine for an uneven number of columns, e.g. x=rand(100,99)

from fasttransforms.jl.

MikaelSlevinsky avatar MikaelSlevinsky commented on June 20, 2024

In principle, yes, it just requires choosing a convention. The problem is equivalent to creating a Fourier interpolant through an even number of points. For three points, we return an expansion in the basis {1, sin(Īø), cos(Īø)}, but for four points should the expansion be in the basis {1, sin(Īø), cos(Īø), sin(2Īø)}, {1, sin(Īø), cos(Īø), cos(2Īø)} or a weighted average of the two? (In the code, the natural choice would be the first option.)

The current segfault behaviour is undeniably undesirable, apologies.

from fasttransforms.jl.

meggart avatar meggart commented on June 20, 2024

No problem. I think since there is no obvious reason for picking one or the other option, choosing the the one that is most natural to code should be ok? Thanks for implementing this in Julia, so far I had to call into SHTOOLS, and having a Julia-native solution helps a lot.

from fasttransforms.jl.

MikaelSlevinsky avatar MikaelSlevinsky commented on June 20, 2024

The segfault issue is now fixed on master. The added test shows that using either odd or even equispaced longitudinal grids results in the resolution of a function on the sphere to approximately machine precision --- roughly the same spherical harmonic coefficients.

from fasttransforms.jl.

meggart avatar meggart commented on June 20, 2024

Thanks again, everything is working for me now.

from fasttransforms.jl.

MikaelSlevinsky avatar MikaelSlevinsky commented on June 20, 2024

Super!

from fasttransforms.jl.

meggart avatar meggart commented on June 20, 2024

Sorry to bother again, but there seems to be a similar issue when doing the actual transform, which occurs only for larger arrays. Here is an MWE:

using FastTransforms
four2d = rand(720,1440)
coefs = FastTransforms.fourier2sph(four2d)
println("Works!")
four2d = rand(1440,2879)
coefs = FastTransforms.fourier2sph(four2d)
println("Works!")
four2d = rand(1440,2880)
coefs = FastTransforms.fourier2sph(four2d)
println("Segfault!")

The first two transforms work, while the third one does not, so to my guess is a combination of even number of columns + large array seems to cause this...

from fasttransforms.jl.

MikaelSlevinsky avatar MikaelSlevinsky commented on June 20, 2024

This is now because you're using a larger number of columns than the dimension of the space of degree 1439 harmonics. I'll try to push a fix shortly, but for now this works:

julia> padfour2d = [four2d; zeros(1,2880)];

julia> coefs = fourier2sph(padfour2d; sketch = :none); # Pro-tip: the sketch keyword reduces the pre-computation by about half

from fasttransforms.jl.

gradywright avatar gradywright commented on June 20, 2024

Amazing package @MikaelSlevinsky!

I'm wondering if it is possible to add a plan_fourier2sph function. Also it would be helpful if you had an example of how to use current plan_sph2fourier.

from fasttransforms.jl.

MikaelSlevinsky avatar MikaelSlevinsky commented on June 20, 2024

Thanks! Actually, P = plan_sph2fourier(A) is all you need because P*A is equivalent to sph2fourier(A), but P\A is also the planned version of fourier2sph(A).

The best usage examples are in tests:

https://github.com/MikaelSlevinsky/FastTransforms.jl/blob/master/test/sphericalharmonics/synthesisanalysistests.jl

or there's a handful here:

https://github.com/JuliaApproximation/SpectralTimeStepping.jl/tree/master/NonlocalSphere

Those are all of the codes for the experiments and figures of https://arxiv.org/abs/1801.04902.

from fasttransforms.jl.

gradywright avatar gradywright commented on June 20, 2024

Thanks for the quick reply and the links. New question:

F = sphrandn(Float64, 1024, 1024);
G = sph2fourier(F; sketch = :none);
P = plan_sph2fourier(F);
K = P*F;
norm(K-G)

Results in

8.017458048386342e-11

Why is the norm of the difference not zero?

from fasttransforms.jl.

MikaelSlevinsky avatar MikaelSlevinsky commented on June 20, 2024

The sketch keyword gets passed directly to the interpolative decompositions in the butterfly algorithm. The planned functions can also take the same keywords. The default for sketch is a type of random column selection whereas :none uses RRQR to compute ID's. More info on the keywords for ID's can be found here https://github.com/JuliaMatrices/LowRankApprox.jl. For spherical harmonic transforms, I always use sketch = :none for high degrees: it gives better accuracy and it's roughly twice as fast. For accuracy, you can always compare with the SlowSphericalHarmonicPlan(A).

Note that the reason for the API inconsistency (leg2cheb and cheb2leg both have their own plans), is that the spherical harmonic connection problem is not 1-1. Rather than having errors in assuming that fourier2sph works like the inverse of a square matrix, I stubbornly want to enforce that it's a weighted least squares ;).

from fasttransforms.jl.

MikaelSlevinsky avatar MikaelSlevinsky commented on June 20, 2024

The latest release (0.3.0) has multithreading support for the O(n3) algorithms, so the degree n = 1024 cutoff is somewhat obsolete. Just fire up Julia with export JULIA_NUM_THREADS=8 (or your maximum) in terminal.

from fasttransforms.jl.

MikaelSlevinsky avatar MikaelSlevinsky commented on June 20, 2024

Looking at the tests, it appears that julia 0.7 has introduced a performance degradation that must be fixed. Compare v0.6 https://travis-ci.org/MikaelSlevinsky/FastTransforms.jl/jobs/414184117 with v0.7 https://travis-ci.org/MikaelSlevinsky/FastTransforms.jl/jobs/414184119 in particular the fast plan and thin plan tests. Also, I don't know why the in-place A_mul_B! (mul!) methods are allocating on the first and second calls, but if the thinplantests.jl are run again, then the allocations disappear.

from fasttransforms.jl.

dlfivefifty avatar dlfivefifty commented on June 20, 2024

Itā€™s mostly a type inference change, so checking @code_warntype should put that out.

from fasttransforms.jl.

MikaelSlevinsky avatar MikaelSlevinsky commented on June 20, 2024

This is a v0.7 MWE setup that's destroying the FMM-based transforms:

import Base: size, getindex, setindex!
import LinearAlgebra: rank
struct ThreadSafeVector{T} <: AbstractVector{T}
    V::Matrix{T}
    function ThreadSafeVector{T}(V::Matrix{T}) where T
        @assert size(V, 2) == Threads.nthreads()
        new{T}(V)
    end
end

ThreadSafeVector(V::Matrix) = ThreadSafeVector{eltype(V)}(V)
ThreadSafeVector(v::Vector) = ThreadSafeVector(repmat(v, 1, Threads.nthreads()))

@inline size(v::ThreadSafeVector) = (size(v.V, 1), )
@inline getindex(v::ThreadSafeVector{T}, i::Integer) where T = v.V[i, Threads.threadid()]
@inline setindex!(v::ThreadSafeVector{T}, x, i::Integer) where T = v.V[i, Threads.threadid()] = x

threadsafezeros(::Type{T}, n::Integer) where T = ThreadSafeVector{T}(zeros(T, n, Threads.nthreads()))
threadsafeones(::Type{T}, n::Integer) where T = ThreadSafeVector{T}(ones(T, n, Threads.nthreads()))

abstract type AbstractLowRankMatrix{T} <: AbstractMatrix{T} end

"""
Store the singular value decomposition of a matrix:

    A = UĪ£V'

"""
struct LowRankMatrix{T} <: AbstractLowRankMatrix{T}
    U::Matrix{T}
    Ī£::Diagonal{T}
    V::Matrix{T}
    temp::ThreadSafeVector{T}
end

LowRankMatrix(U::Matrix{T}, Ī£::Diagonal{T}, V::Matrix{T}) where T = LowRankMatrix(U, Ī£, V, threadsafezeros(T, length(Ī£.diag)))

size(L::LowRankMatrix) = size(L.U, 1), size(L.V, 1)
rank(L::LowRankMatrix) = length(L.Ī£.diag)

function getindex(L::LowRankMatrix{T},i::Integer,j::Integer) where T
    ret = zero(T)
    U, Ī£, V, r = L.U, L.Ī£, L.V, rank(L)
    for k = r:-1:1
        ret += U[i,k]*Ī£[k,k]*V[j,k]
    end

    ret
end


function test_mul!(y::AbstractVecOrMat{T}, L::LowRankMatrix{T}, x::AbstractVecOrMat{T}, istart::Int, jstart::Int, INCX::Int, INCY::Int) where T
    m, n = size(L)
    ishift, jshift = istart-INCY, jstart-INCX
    temp = L.temp

    @inbounds for k = 1:rank(L)
        temp[k] = zero(T)
        for j = 1:n
            temp[k] += L.V[j,k]*x[jshift+j*INCX]
        end
        temp[k] *= L.Ī£[k,k]
    end

    @inbounds for k = 1:rank(L)
        tempk = temp[k]
        for i = 1:m
            y[ishift+i*INCY] += L.U[i,k]*tempk
        end
    end

    y
end

After running that, this shows a significant degradation (in v0.7)

x = randn(10000)
y = zero(x)
L = LowRankMatrix(rand(1000, 20), Diagonal(rand(20)), rand(1000, 20))

@time test_mul!(y, L, x, 1, 1, 2, 2);
@time test_mul!(y, L, x, 1, 1, 2, 2);

from fasttransforms.jl.

MikaelSlevinsky avatar MikaelSlevinsky commented on June 20, 2024

Actually, it has nothing to do with the ThreadSafeVector type. This is all pure Julia code... surprising! Even smaller example:

import Base: size, getindex, setindex!
import LinearAlgebra: rank
abstract type AbstractLowRankMatrix{T} <: AbstractMatrix{T} end

"""
Store the singular value decomposition of a matrix:

    A = UĪ£V'

"""
struct LowRankMatrix{T} <: AbstractLowRankMatrix{T}
    U::Matrix{T}
    Ī£::Diagonal{T}
    V::Matrix{T}
    temp::Vector{T}
end

LowRankMatrix(U::Matrix{T}, Ī£::Diagonal{T}, V::Matrix{T}) where T = LowRankMatrix(U, Ī£, V, zeros(T, length(Ī£.diag)))

size(L::LowRankMatrix) = size(L.U, 1), size(L.V, 1)
rank(L::LowRankMatrix) = length(L.Ī£.diag)

function getindex(L::LowRankMatrix{T},i::Integer,j::Integer) where T
    ret = zero(T)
    U, Ī£, V, r = L.U, L.Ī£, L.V, rank(L)
    for k = r:-1:1
        ret += U[i,k]*Ī£[k,k]*V[j,k]
    end

    ret
end


function test_mul!(y::AbstractVecOrMat{T}, L::LowRankMatrix{T}, x::AbstractVecOrMat{T}, istart::Int, jstart::Int, INCX::Int, INCY::Int) where T
    m, n = size(L)
    ishift, jshift = istart-INCY, jstart-INCX
    temp = L.temp

    @inbounds for k = 1:rank(L)
        temp[k] = zero(T)
        for j = 1:n
            temp[k] += L.V[j,k]*x[jshift+j*INCX]
        end
        temp[k] *= L.Ī£[k,k]
    end

    @inbounds for k = 1:rank(L)
        tempk = temp[k]
        for i = 1:m
            y[ishift+i*INCY] += L.U[i,k]*tempk
        end
    end

    y
end

x = randn(10000)
y = zero(x)
L = LowRankMatrix(rand(1000, 20), Diagonal(rand(20)), rand(1000, 20))

@time test_mul!(y, L, x, 1, 1, 2, 2);
@time test_mul!(y, L, x, 1, 1, 2, 2);

from fasttransforms.jl.

MikaelSlevinsky avatar MikaelSlevinsky commented on June 20, 2024

Ah, perhaps it's because Diagonal is now typed by the underlying array.

from fasttransforms.jl.

MikaelSlevinsky avatar MikaelSlevinsky commented on June 20, 2024

In less recent versions, there was only one A_mul_B! (before the name change). Now, HierarchicalMatrices and FastTransforms have their own mul! that is sometimes separate from LinearAlgebra.

This change sabotages some attempts to use mul! for allocation-free transforms since that calls the abstract dense fallbacks. Some transforms need an explicit FastTransforms.mul!, which is unacceptable: this package has every right to override mul! when using different types. @dlfivefifty

from fasttransforms.jl.

dlfivefifty avatar dlfivefifty commented on June 20, 2024

Not sure why thereā€™s a separate mul! here. But mul! in Base is a bit broken, because it should use traits. I made a PR but we couldnā€™t agree to names, so that PR now lives in LazyArrays.jl

from fasttransforms.jl.

MikaelSlevinsky avatar MikaelSlevinsky commented on June 20, 2024

I think part of the problem may be that mul! (A_mul_B!) used to reside in Base, but now it's in LinearAlgebra. Perhaps HierarchicalMatrices needs more than just the Compat patch, i.e. using LinearAlgebra.

from fasttransforms.jl.

dlfivefifty avatar dlfivefifty commented on June 20, 2024

It's more than that. Adjoint, Transpose, UpperTriangular, etc. overrides of mul! have a cascading effect where to add your own overrides requiring more and more ambiguity overrides. The solution to this is to use traits.

My feeling is forget about LinearAlgebra.mul! and switch over to LazyArrays, which also has a more descriptive syntax:

c .= Ī± .* Mul(A,b) .+ Ī² .* c

This will support all the Base types that mul! does and more (e.g., it knows that view(Adjoint(view(A,1:2,1:3)), 1:5, 2:3) is BLAS row major and lowers to the correct BLAS routines.).

The plan is that a trait-like mul! or LazyArray's approach will be incorporated into Base/LinearAlgebra in the future.

from fasttransforms.jl.

AshtonSBradley avatar AshtonSBradley commented on June 20, 2024

how can I set up allocation free spherical harmonic transforms? I may have missed something, but I can't track down a version of

function *(P::SphericalHarmonicPlan, X::AbstractMatrix)
    mul!(zero(X), P, X)
end

that lets me pass my own field to write to instead of zero(X). Should I just be writing my own method, or is that somehow dangerous?

Also, is there any fix for the performance regression?

from fasttransforms.jl.

MikaelSlevinsky avatar MikaelSlevinsky commented on June 20, 2024

As long as X and Y have the same sizes and P = plan_sph2fourier(X), then I think mul!(Y, P, X) is what you're looking for. (There is no support for lmul!(P, X) at the moment.)

from fasttransforms.jl.

maximilian-gelbrecht avatar maximilian-gelbrecht commented on June 20, 2024

The segfault issue is now fixed on master. The added test shows that using either odd or even equispaced longitudinal grids results in the resolution of a function on the sphere to approximately machine precision --- roughly the same spherical harmonic coefficients.

I was wondering, support for even numbered polar angles/longitudes does not seem to be implemented anymore. Could it be reimplemented? I'd argue that most grids used by models/data (e.g. in meteorology) that I have encountered, have an even number of angles/longitudes, so that it would be extremely convenient to have this working.

Thanks.

from fasttransforms.jl.

MikaelSlevinsky avatar MikaelSlevinsky commented on June 20, 2024

Yes, that feature (annoying to the developer) didn't make it to the new world. MikaelSlevinsky/FastTransforms#23

from fasttransforms.jl.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    šŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. šŸ“ŠšŸ“ˆšŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ā¤ļø Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.