It's not obvious in a "this is behind a lock, so it enforces its own syncness" way, an

Why does `CudaDevice` impl `Send` and `Sync`? about cudarc HOT 6 CLOSED

workingjubilee commented on July 29, 2024

Why does `CudaDevice` impl `Send` and `Sync`?

from cudarc.

Comments (6)

coreylowman commented on July 29, 2024

> This means it is Send, but not Sync.

Feel free to open a PR for adding Send to Cudnn.

> I hope you can see from that, that "thread-safe" is not the same as being "Sync".

> A type is Sync if it is safe to share between threads (T is Sync if and only if &T is Send).

is the quote from the Rust docs. My interpretation of the CUDA driver API being thread-safe is that the functions it contains are safe to call at the exact same time with the same objects from multiple threads.

What am I missing?

> With certain exceptions around graph usage

The safe-level API for the driver doesn't actually use any of the graph functions (and we haven't even really exposed those at the result/sys level, so it shouldn't be possible to call them). That's why I interpreted it as okay to mark as Send/Sync.


coreylowman commented on July 29, 2024

It's mainly that the CUDA driver API's context (the main object in CudaDevice) is itself thread-safe:

> The CUDA driver API supports multiple threads using the same context. The CUDA driver API is thread safe with certain exceptions around graph usage. You may still hit a memory limit at some point.

https://forums.developer.nvidia.com/t/cuda-driver-api-multiple-threads-with-the-same-cucontext/230688

Note that I can't really find anything in the main documentation about this, but this sentiment is echoed in multiple places online.

There are other APIs (like cuDNN) that are not, so we don't impl Send/Sync for them:

> The cuDNN library is thread-safe. Its functions can be called from multiple host threads, so long as the threads do not share the same cuDNN handle simultaneously.

Which is confusing, because to me the above basically means it isn't thread-safe 🤣
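For readers less familiar with how a handle-wrapping type ends up `Send`/`Sync` at all: a struct holding a raw pointer is neither by default, so the library author has to opt in with `unsafe impl`, and that opt-in is a promise about the underlying API. Below is a minimal sketch of the pattern, not cudarc's actual code. `OpaqueCtx`, `Device`, and `ordinals_from_threads` are hypothetical names, and the null pointer stands in for a real context handle.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical opaque type standing in for a driver context.
#[repr(C)]
pub struct OpaqueCtx {
    _private: [u8; 0],
}

/// Hypothetical device wrapper. The raw pointer field makes this type
/// neither `Send` nor `Sync` unless the author opts in below.
pub struct Device {
    ctx: *mut OpaqueCtx,
    ordinal: usize,
}

// SAFETY (assumption for this sketch): every operation reachable through
// `&Device` is safe to call concurrently on the same context.
// The compiler cannot check this claim; it is the author's promise.
unsafe impl Send for Device {}
unsafe impl Sync for Device {}

impl Device {
    pub fn new(ordinal: usize) -> Self {
        // A null pointer is used here only because no real driver is involved.
        Device { ctx: std::ptr::null_mut(), ordinal }
    }

    pub fn ordinal(&self) -> usize {
        self.ordinal
    }

    pub fn ctx_is_null(&self) -> bool {
        self.ctx.is_null()
    }
}

/// Shares one `Device` across `n` threads. `Arc<Device>` is only `Send`
/// because `Device` is both `Send` and `Sync`.
pub fn ordinals_from_threads(dev: Arc<Device>, n: usize) -> Vec<usize> {
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let d = Arc::clone(&dev);
            thread::spawn(move || d.ordinal())
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Removing the two `unsafe impl` lines makes `ordinals_from_threads` fail to compile, which is the whole point of the question in this issue: the impls are only sound if the promise in the `SAFETY` comment really holds.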


workingjubilee commented on July 29, 2024

> The cuDNN library is thread-safe. Its functions can be called from multiple host threads, so long as the threads do not share the same cuDNN handle simultaneously.

This means it is Send, but not Sync.
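A minimal sketch of what "Send, but not Sync" looks like in practice. `Handle` here is a hypothetical stand-in, not cudarc's real cuDNN wrapper. The `PhantomData<Cell<()>>` marker is one common way to get that combination, since `Cell<()>` is `Send` but not `Sync`.

```rust
use std::cell::Cell;
use std::marker::PhantomData;
use std::thread;

/// Hypothetical handle type: may move between threads, but a `&Handle`
/// may not be shared between threads at the same time.
pub struct Handle {
    id: u32,
    // `Cell<()>` is `Send` but not `Sync`, so this marker makes
    // `Handle` `Send` but not `Sync` without storing any data.
    _not_sync: PhantomData<Cell<()>>,
}

impl Handle {
    pub fn new(id: u32) -> Self {
        Handle { id, _not_sync: PhantomData }
    }

    pub fn id(&self) -> u32 {
        self.id
    }
}

/// Compiles only for types that are `Send`.
pub fn assert_send<T: Send>() {}

/// Compiles only for types that are `Sync`.
pub fn assert_sync<T: Sync>() {}

/// Moving the handle into another thread is allowed: after the move,
/// exactly one thread can touch it.
pub fn use_on_other_thread(h: Handle) -> u32 {
    thread::spawn(move || h.id()).join().unwrap()
}

// Uncommenting the next line is a compile error, because `Handle` is not `Sync`:
// fn _rejected() { assert_sync::<Handle>(); }
```

That matches the quoted cuDNN wording: threads may each use a handle, as long as they never use the same handle simultaneously.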


workingjubilee commented on July 29, 2024

I hope you can see from that, that "thread-safe" is not the same as being "Sync".


workingjubilee commented on July 29, 2024

"With certain exceptions around graph usage" also suggests CudaDevice is neither Send nor Sync if those exceptions are reachable from the provided API. If they require unsafely going behind cudarc's back, then all is well, though you should probably communicate this concern to programmers using this library.


workingjubilee commented on July 29, 2024

> The safe level api for driver doesn't actually use any of the graph functions

Oh okay! Then we don't have to worry about the graph exceptions. You should probably note "make these unsafe to call if they get exposed?" in some issue or the library's internal FIXME/TODOs but yeah.
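As a rough illustration of the "make these unsafe to call" idea: if a graph entry point were ever exposed on a `Sync` type, marking it `unsafe fn` with a documented contract pushes the thread-safety obligation onto the caller. This is only a sketch. `GraphDevice` and `launch_graph` are hypothetical names, the safety contract is an assumed one, and the counter stands in for a real driver call.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical device type that is shared freely between threads.
pub struct GraphDevice {
    launches: AtomicUsize,
}

impl GraphDevice {
    pub fn new() -> Self {
        GraphDevice { launches: AtomicUsize::new(0) }
    }

    /// Hypothetical graph entry point.
    ///
    /// # Safety
    ///
    /// Assumed contract for this sketch: the caller must guarantee that no
    /// other thread is using the same graph object concurrently.
    pub unsafe fn launch_graph(&self) -> usize {
        // Stand-in for a driver call: count the launch and return the total.
        self.launches.fetch_add(1, Ordering::SeqCst) + 1
    }
}
```

Callers then have to write `unsafe { dev.launch_graph() }`, which keeps the safe API's `Send`/`Sync` claims honest even though the graph exception still exists underneath.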

> My interpretation of the cuda driver api being threadsafe is that the functions it contains are safe to call at the exact same time with the same objects from multiple threads.
>
> What am I missing?

I think you fully understand the premise. I suppose we mostly have a difference of opinion: if a programmer who is not familiar with Rust's paradigm of thread-safety, and with the difference between T: Sync and T: Send, says something is "thread-safe", then I think that claim should be given very little weight.

This is because I have seen widely-hailed experts claim something is "thread safe" to call, and then say it is "thread safe... if you only call it once", which... according to Rust, that isn't very thread-safe!

