Hi, Since I work with cardiac MRI k-space data (golden angle radial acquisition, 2

Batched k-space NUFFT about torchkbnufft HOT 5 CLOSED

from torchkbnufft.

Comments (5)

mmuckley commented on August 17, 2024 1

Hello @ajlok3, I've opened PR #24 with a batched k-space NUFFT. If you have some time, could you try the implementation or look over the new documentation at https://torchkbnufft.readthedocs.io/en/batched_nufft/? I would appreciate any feedback you have. I will merge sometime this week or next week.

ajlok3 commented on August 17, 2024 1

Thanks, I'll look into it!

EDIT
This is a very rough analysis, but just by changing the code from the loop version to the batched version, the wall time of my algorithm (CS reconstruction, 30 iterations, 192x192x100 2D+t MRI data, 16 spokes per slice) decreased from 4:20 min to 2:48 min (averaged over 4 runs), a speedup of about 1.5x, which roughly corresponds to the acceleration factor you measured for the GPU forward NUFFT. So, thank you very much, the issue seems to be solved! However, I'm wondering why the CPU version performs better... (I haven't measured it for my solution, though).

Also, I really appreciate the performance tips in the docs!

Cheers,
ajlok3

mmuckley commented on August 17, 2024

My thought for this is that we can use process forking the same way we do for normal NUFFTs, except we'll apply forking over batches rather than coils. (See current forking behavior here.) This will make the NUFFT for each set of coils slower, but I think the overall NUFFT will be faster as Python won't be waiting on the previous batch to finish.
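To make the idea of forking over batches concrete, here is a minimal sketch. It is not the library's code: it swaps in a direct (slow) non-uniform DFT for the Kaiser-Bessel NUFFT and uses Python's `concurrent.futures` thread pool as a stand-in for the forking mechanism. The names `nudft_single` and `nudft_forked_over_batches` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def nudft_single(image, omega):
    """Direct non-uniform DFT for ONE batch element (stand-in for a NUFFT).

    image: complex array, shape (ncoil, ny, nx)
    omega: real array, shape (2, klength), in radians per voxel
    returns: complex array, shape (ncoil, klength)
    """
    ny, nx = image.shape[-2:]
    y = np.arange(ny) - ny // 2
    x = np.arange(nx) - nx // 2
    ey = np.exp(-1j * np.outer(omega[0], y))  # (klength, ny)
    ex = np.exp(-1j * np.outer(omega[1], x))  # (klength, nx)
    # All coils of this batch element are handled together in one call.
    return np.einsum("cyx,ky,kx->ck", image, ey, ex)


def nudft_forked_over_batches(images, omegas, max_workers=4):
    """Launch one task per batch element, then gather the results.

    images: (nbatch, ncoil, ny, nx), omegas: (nbatch, 2, klength)
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every batch element up front so that no element waits
        # on the previous one to finish before it is started.
        futures = [
            pool.submit(nudft_single, im, om) for im, om in zip(images, omegas)
        ]
        return np.stack([f.result() for f in futures])
```

Whether this beats a plain loop depends on the workload and hardware; the sketch only shows the structure (parallelism across batch elements, coils processed together within each task).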

One question is whether to incorporate this into the existing KbNufft and KbInterp or to make new functions. I think my preference would be to use the existing functions to minimize the amount of new documentation, and just have a divergent backend depending on the number of dimensions in omega.
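As a rough illustration of a single entry point with a backend that diverges on the number of dimensions in omega, the sketch below dispatches on `omega.ndim`. It is an assumption-laden toy, not the torchkbnufft implementation: the transform is a direct non-uniform DFT in NumPy, and the function names `forward` and `_fourier_factors` as well as the shape conventions are hypothetical.

```python
import numpy as np


def _fourier_factors(omega, ny, nx):
    """Separable phase factors; works for omega with or without a batch dim."""
    y = np.arange(ny) - ny // 2
    x = np.arange(nx) - nx // 2
    ey = np.exp(-1j * omega[..., 0, :, None] * y)  # (..., klength, ny)
    ex = np.exp(-1j * omega[..., 1, :, None] * x)  # (..., klength, nx)
    return ey, ex


def forward(image, omega):
    """Non-uniform DFT with the code path chosen by omega.ndim.

    image: complex array, shape (nbatch, ncoil, ny, nx)
    omega: (2, klength)          -> one trajectory shared by all batch elements
           (nbatch, 2, klength)  -> one trajectory per batch element
    returns: complex array, shape (nbatch, ncoil, klength)
    """
    ny, nx = image.shape[-2:]
    ey, ex = _fourier_factors(omega, ny, nx)
    if omega.ndim == 2:
        # Existing behavior: same k-space locations for every batch element.
        return np.einsum("bcyx,ky,kx->bck", image, ey, ex)
    if omega.ndim == 3:
        # Batched behavior: each batch element has its own k-space locations.
        if omega.shape[0] != image.shape[0]:
            raise ValueError("omega and image must have the same batch size")
        return np.einsum("bcyx,bky,bkx->bck", image, ey, ex)
    raise ValueError("omega must have 2 or 3 dimensions")
```

The caller-facing signature stays the same in both cases, which is the point of reusing the existing functions: only the shape of omega selects the batched path.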

I don't think any changes are necessary for the Toeplitz NUFFT, but I will check.

mmuckley commented on August 17, 2024

Yeah, I'm honestly not sure about the CPU version performing better. Could be something strange going on with how it's dispatching the CUDA kernels under the hood. Maybe we could fix it with CUDA streams, but those aren't in torch.jit yet, so we may be waiting on that.

Glad it's working for you!

mmuckley commented on August 17, 2024

Closing, implemented by PR #24 and released with version 1.1.
