
Cazadorro avatar Cazadorro commented on July 25, 2024

confusion about the various binding types and how WGSL declaration syntax converts to rust-spirv attribute equivalent for basic types

You probably know this and probably what I'll say subsequently, but virtually all attributes and binding types in WGSL AFAIK are based on the SPIR-V spec. SPIR-V was originally going to be the target language for WebGPU instead of WGSL until Apple's lawyers started getting involved and stopping the process due to some weird legal issue with Khronos Group.

WGSL was put forward in order to still have SPIR-V tools be usable in the web space while satisfying Apple's lawyers. As such, WGSL's documentation in the past made explicit mention of how their own types and attributes map to SPIR-V (or at least used to, they seemed to remove it over time...). This means a lot of the engineers that worked on WGSL kind of know SPIR-V also and implicitly assume SPIR-V knowledge by accident when you read the spec, and thus it's "natural"ish for them to translate between the two in how their concepts map. Note WGSL has changed a lot in recent years as well, so a lot of the issue is on WGSL not being "stable" until recently either.

But anyway, I agree: RustGPU needs the kinds of resources mentioned. In the meantime, I'll attempt to answer some of the mappings as best I can here.

#[spirv(storage_buffer, ...)] vs. #[spirv(storage, ...)] vs. Image!

These come from SPIR-V, specifically, these come from the storage class specifier, see the full list here: https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#_storage_class Make sure to discount the ones that are OpenCL only (kernel mode capability etc...).

From here you can map what belongs where: storage_buffer is equivalent to the storage address space in WGSL.

Textures and samplers live in the handle address space in WGSL. In SPIR-V, the matching storage class for these opaque handles is UniformConstant; the Image storage class is reserved for image memory reached through OpImageTexelPointer (for example, image atomics). Either way, you basically never spell out the storage class for them yourself, and opaque handles (samplers and textures) are handled somewhat separately. They are known as "opaque handles"/"opaque pointers" in CUDA and other APIs because they don't live in the same world as a traditional pointer to memory, are typically fixed for the execution of a kernel/shader, and can't be incremented, decremented, or converted to a uint (they are "opaque": you don't know the address, and you don't know how it's implemented).

The Image! macro exists because it covers for the long-winded declaration necessary in SPIR-V, see https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#OpTypeImage. It is just syntax sugar.

There are potentially 8 different parameters you need to stuff in there. See this for the real spirv_std type that does the same thing.

confusion about various barrier types: which of these ones should i use for wgsl storageBarrier() ?

Barriers in SPIR-V follow memory semantics (barrier(scope, given_semantic)). Barriers in WGSL are strictly less powerful IIRC.

If you look there, you'll see "UniformMemory" is the memory semantic for storage_buffer. The problem is that, for whatever reason, spirv-std doesn't give you one of those as a single function without parameters.

See https://github.com/EmbarkStudios/rust-gpu/blob/main/crates/spirv-std/src/arch/barrier.rs for the barriers available.

I'm not sure why they don't do this. They may want you to use the more general barriers, in which case you can choose a memory barrier appropriate to your use case. Here, WGSL defines control barriers per workgroup, so use workgroup_memory_barrier(). Otherwise, you can do the following to get the exact meaning of the storage buffer barrier (note that according to WGSL, it also uses AcquireRelease memory semantics):

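    // memory_barrier is an unsafe fn with const generic parameters, so it
    // needs a turbofish and braces around each constant expression.
    unsafe {
        spirv_std::arch::memory_barrier::<
            { spirv_std::memory::Scope::Workgroup as u32 },
            { spirv_std::memory::Semantics::UNIFORM_MEMORY.bits() | spirv_std::memory::Semantics::ACQUIRE_RELEASE.bits() },
        >();
    }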
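To make the two const arguments less magical, here is a small plain-Rust sketch of the numbers involved. The constant values are the ones listed in the SPIR-V specification's Scope and Memory Semantics tables; the constant names and the storage_barrier_semantics helper are made up for illustration and are not part of spirv-std.

```rust
// Numeric values from the SPIR-V specification's Scope and Memory
// Semantics tables. The names below are illustrative, not spirv-std API.
pub const SCOPE_WORKGROUP: u32 = 2;
pub const SEMANTICS_ACQUIRE_RELEASE: u32 = 0x8;
pub const SEMANTICS_UNIFORM_MEMORY: u32 = 0x40;
pub const SEMANTICS_WORKGROUP_MEMORY: u32 = 0x100;

/// Hypothetical helper: the semantics operand for a barrier that orders
/// storage-buffer accesses (UniformMemory) with AcquireRelease ordering.
pub const fn storage_barrier_semantics() -> u32 {
    SEMANTICS_UNIFORM_MEMORY | SEMANTICS_ACQUIRE_RELEASE
}

/// Hypothetical helper: the same idea for workgroup-shared memory.
pub const fn workgroup_barrier_semantics() -> u32 {
    SEMANTICS_WORKGROUP_MEMORY | SEMANTICS_ACQUIRE_RELEASE
}
```

The semantics operand is just a bitmask: one bit selects which kind of memory is ordered and another selects the ordering, so swapping UniformMemory for WorkgroupMemory is the only difference between the two helpers.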

what about atomic ops

Atomic ops work much the same way in SPIR-V as barriers, but that also makes them quite different from WGSL. If you want to use atomics, you'll want to use one of these implementations depending on your use case. Note that atomic operations in the real world matter at the subgroup level as well, which WGSL doesn't give you access to.

https://github.com/EmbarkStudios/rust-gpu/blob/main/crates/spirv-std/src/arch/atomics.rs Scope and semantics are the same type of thing as before, except now you're talking about "None" (relaxed), "Acquire", "Release", and "AcquireRelease", just like Rust and C++, but they aren't attached to the type in these functions. WGSL attaches the scope to atomics, and apparently always uses relaxed (presumably because of mobile GPUs' poor memory models).

Therefore, adding to the count variable in your example might look like:

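 // atomic_i_add is an unsafe fn; Semantics is a bitflags type, so the
 // relaxed value is Semantics::NONE.bits().
 unsafe {
     spirv_std::arch::atomic_i_add::<
         u32,
         { spirv_std::memory::Scope::Workgroup as u32 },
         { spirv_std::memory::Semantics::NONE.bits() },
     >(&mut count, 1u32);
 }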
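If the semantics parameter feels abstract, the closest CPU-side analogue is probably std's atomics, where the ordering is likewise passed per operation rather than attached to the type. The sketch below is plain host Rust, not shader code; the bump function is a made-up name. Like OpAtomicIAdd, fetch_add returns the value that was stored before the addition.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// CPU analogue of a relaxed atomic add (illustrative helper, not part of
/// spirv-std): adds 1 to `count` and returns the value it held beforehand.
pub fn bump(count: &AtomicU32) -> u32 {
    // Ordering::Relaxed plays the role of the "None" memory semantics:
    // the add itself is atomic, but it orders no other memory accesses.
    count.fetch_add(1, Ordering::Relaxed)
}
```

One difference worth keeping in mind: std has no notion of scope, whereas the SPIR-V version also needs to be told whether the operation must be atomic across the workgroup, the device, and so on.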

basic operations, like reading and writing to image textures, image sampling, are not documented nor in examples

Yep, really weird they don't show this; I don't think there's a single sampled image in their shader examples. The easiest way to find out is to look up, say, "sample" in spirv-std https://docs.rs/spirv-std/latest/spirv_std/index.html, then look at the SPIR-V docs and extrapolate from there. There are also examples from other people scattered outside of this repository, e.g. strolle or Embark's own Kajiya, specifically here: https://github.com/EmbarkStudios/kajiya/tree/main/crates/lib/rust-shaders/src

For example, to use a sampled image, you'd take a texture, a sampler, and one of the associated sampling functions and do something like:


#[spirv(vertex)]
pub fn foo_vs(
    // inputs
    vertex_attribute_0: Vec4,
    ...

    // outputs
    fragment_uv: &mut Vec2,
    ...
) {
}

#[spirv(fragment)]
pub fn foo_fs(
    #[spirv(descriptor_set = 0, binding = 0)] texture: &Image!(2D, type=f32, sampled=true),
    #[spirv(descriptor_set = 0, binding = 1)] sampler: &Sampler,
    // inputs
    fragment_uv: Vec2,
    // outputs
    out_color: &mut Vec4,
) {
    // `sample` takes the sampler by value, so dereference the binding.
    let sampled_color: Vec4 = texture.sample(*sampler, fragment_uv);
    *out_color = sampled_color;
}

what Rust types are allowed by default? I found out at runtime u8 are not a thing - not a problem, but should be outlined for the non-initiated

Yeah, not sure what the deal is but this is a problem. You have to search for open issues on what is implemented right now, for example, int128 doesn't exist. Though TBH, I'm not sure why u8 is not implemented since it is a thing in SPIR-V, and would look nearly identical to the code for i32,i64 etc... Strangely when I look through the codebase, it would appear that it is implemented, and I can see other projects using it:

https://github.com/EmbarkStudios/kajiya/blob/d373f76b8a2bff2023c8f92b911731f8eb49c6a9/crates/lib/rust-shaders-shared/src/ssgi.rs#L8

https://github.com/Patryk27/strolle/blob/92b042e1c95638c7200ac4b7e894ee0664320ef4/strolle-shaders/reference-shading/src/lib.rs#L57

If you want, you can go implement this yourself with https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#OpTypeInt but this seems like another bug if you can't use that type.

is this assembly here the only way to do this thing? What is even this thing that's being done?

It's possible the problem here is that you're coming from WGSL, and unless you're also a GPGPU programmer or have worked extensively with modern Vulkan, you may have never encountered "subgroups".

You probably know a lot or all of this, but it's worth repeating anyway. The GPU doesn't actually execute "threads" in the traditional sense. A GPU is more or less a collection of SIMD units, with each lane masquerading as an individual thread for the programmer. On the GPU, every set of "n" threads is a "subgroup" (usually a power of 2: on Nvidia 32, on AMD 32 or 64, on Intel sometimes 128, sometimes 16; other GPUs have different subgroup sizes). Because the GPU is organized this way, there are many consequences, mainly:

  • Branching within a subgroup that is not optimized out will result in something called "thread divergence".
    • Because you're actually executing on a SIMD unit, in order for instructions to execute at the same time, they must all be the same.
    • When you have "thread divergence", the instruction pointer must change for each branch, and thus each branch executes independently, in serial, i.e. one after another.
  • Branching on the subgroup boundary (i.e. the first 32 threads take one path and the next 32 take another path on Nvidia) will not result in "thread divergence".
  • Each hardware subgroup "knows" about all other threads in the subgroup, and thus may use cooperation instructions to work with threads within a subgroup for a significant speedup versus other methods (such as reaching back to shared memory).
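The divergence rules above can be illustrated with a toy CPU model. This is only a sketch under simplifying assumptions (every lane is active, and each distinct branch path costs exactly one serialized pass); serialized_passes is a made-up helper, not a real API.

```rust
use std::collections::BTreeSet;

/// Toy model: `paths[i]` is the branch path taken by thread `i`. For each
/// subgroup of `subgroup_size` consecutive threads, returns how many
/// distinct paths its lanes take, i.e. how many passes the SIMD unit
/// would need if it ran each path one after another.
pub fn serialized_passes(paths: &[u32], subgroup_size: usize) -> Vec<usize> {
    assert!(subgroup_size > 0, "subgroup size must be non-zero");
    paths
        .chunks(subgroup_size)
        .map(|lanes| lanes.iter().collect::<BTreeSet<_>>().len())
        .collect()
}
```

With 64 threads and a subgroup size of 32, a branch on thread_index / 32 gives one pass per subgroup (no divergence), while a branch on thread_index % 2 gives two passes in each subgroup.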

Note that subgroup is a "generalized" cross-platform term for this concept. In the past they have been referred to as "Wavefronts" by AMD and in pre-GPU parallel processing literature, and they are called "Warps" in Nvidia CUDA nomenclature.

You can see all the types of subgroup operations available explained here (it's GLSL but it maps to SPIR-V and thus maps to RustGPU)

https://www.khronos.org/blog/vulkan-subgroup-tutorial

and the corresponding SPIR-V instructions here

https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#_group_operation

in particular, this function is trying to emulate the following function described in the vulkan subgroup tutorial

T subgroupAdd(T value) returns the summation of all active invocations value's across the subgroup.

What this is saying, is that if I have, say, a subgroup size of 32, and I make the following call:

let sum = subgroup_add( 1u32); 

That will result in sum having a value of 32u32 and if I instead call subgroup_add( 10u32) it will be 320u32, and if I use another variable, say a value from the array of 32 values

let sum = subgroup_add( my_array[my_subgroup_idx]); 

it will be the sum of all 32 values. And not only will this subgroup thread get access to that value, that value is broadcast to every thread in the subgroup.
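If it helps, the behaviour can be emulated on the CPU. The sketch below is plain host Rust, assuming every invocation in the subgroup is active; emulate_subgroup_add is a hypothetical name and says nothing about how the hardware or rust-gpu implements the operation.

```rust
/// CPU emulation of the subgroupAdd behaviour described above:
/// `values[i]` is the value contributed by thread `i`, and the result
/// holds, for every thread, the sum over its own subgroup.
pub fn emulate_subgroup_add(values: &[u32], subgroup_size: usize) -> Vec<u32> {
    assert!(subgroup_size > 0, "subgroup size must be non-zero");
    values
        .chunks(subgroup_size)
        .flat_map(|lanes| {
            // Reduce across the subgroup...
            let sum: u32 = lanes.iter().sum();
            // ...then broadcast the result back to every lane.
            std::iter::repeat(sum).take(lanes.len())
        })
        .collect()
}
```

Feeding it 32 copies of 1u32 with a subgroup size of 32 yields 32 in every lane, and 32 copies of 10u32 yields 320 in every lane, matching the numbers above.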

what Rust language features are known not to work? I discovered the hard way #1076 (comment) - I just can't remember which ones

Lots of bugs with loops and optimization, I'm also frustrated that this has not been coalesced searchably into one document.

#1094 - this should be a documentation page

It looks like the answer is already there, but I suspect the reason that kind of stuff isn't a priority is that kernel-mode and shader-mode SPIR-V are not compatible, and Rust-GPU was developed for shader/Vulkan SPIR-V, so I'm not even sure basic functionality will work if you try to, say, sample a texture in kernel mode.

Hopefully this helps answer the questions put forth here and anyone else who wanders here, and highlights the need for better documentation for rust-gpu.
