
Comments (5)

raphlinus avatar raphlinus commented on August 31, 2024

Also note that the shaders can be recompiled with (cd shader && ninja).

from vello.

raphlinus avatar raphlinus commented on August 31, 2024

I also tried the developer branch driver (443.15.0.0) and got very similar results, except that in case 2, instead of ERROR_DEVICE_LOST, I got NOT_READY from vkGetQueryPoolResults. The failure mode is basically the same: in both cases it seems like the kernel is not completing.


raphlinus avatar raphlinus commented on August 31, 2024

I've done some more digging into crash 2, instrumenting the kernel to output its activity for bin 0, which has only 3 segments and is therefore fairly easy to analyze. The updated code is in the nv_crash_2 branch, and I also put the log output in a gist. This shows the trace of log_value calls for each of the 256 threads in the workgroup.

The trace of "shared minimum element" is supposed to be the same for all threads, as the read of it is protected by a (dynamically uniform) barrier() call. The correct sequence is shown in the first 16 (= N_WG) threads: 1, 74e, 828. Thread 10 (hex) is seeing 408, which is an incorrect value. On other runs, I've seen 3f800000, so this seems to be some kind of uninitialized value.


rhzk avatar rhzk commented on August 31, 2024

I think case 3 is caused by mismatched structure sizes between the Rust code and the shaders.

From the Vulkan spec:

  • An array has a base alignment equal to the base alignment of its element type.
  • A structure has a base alignment equal to the largest base alignment of any of its members.

https://www.khronos.org/registry/vulkan/specs/1.0-wsi_extensions/html/vkspec.html#interfaces-resources-layout

In the code we have

struct State {
   mat: [f32; 4],
   translate: [f32; 2],
   bbox: [f32; 4],
   linewidth: f32,
   flags: u32,
   path_count: u32,
   pathseg_count: u32,
}

This has a size of 56 bytes, and its largest member is a vec4, which has a size of 16 bytes.
This means that an array of this struct has a 16-byte alignment in the shaders, so its elements would be spaced 64 bytes apart rather than 56.
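The size arithmetic above can be checked with a small Rust sketch. This is only an illustration under stated assumptions: `#[repr(C)]` is added here so the layout is well-defined, and `round_up`, `VEC4_ALIGN`, `RUST_SIZE`, `RUST_ALIGN` and `SHADER_STRIDE` are hypothetical names, not anything from the project. It covers only the total size and array stride, not the offsets of individual members.

```rust
use std::mem::{align_of, size_of};

/// Mirror of the `State` struct quoted above. `#[repr(C)]` is an assumption
/// made for this sketch so that field order and size are well-defined.
#[repr(C)]
#[allow(dead_code)]
struct State {
    mat: [f32; 4],
    translate: [f32; 2],
    bbox: [f32; 4],
    linewidth: f32,
    flags: u32,
    path_count: u32,
    pathseg_count: u32,
}

/// Round `size` up to the next multiple of `align` (a power of two).
#[allow(dead_code)]
const fn round_up(size: usize, align: usize) -> usize {
    (size + align - 1) & !(align - 1)
}

/// Hypothetical constant: the 16-byte base alignment of a GLSL vec4.
#[allow(dead_code)]
const VEC4_ALIGN: usize = 16;

/// Size and alignment as seen on the Rust side (14 words of 4 bytes each).
#[allow(dead_code)]
const RUST_SIZE: usize = size_of::<State>();
#[allow(dead_code)]
const RUST_ALIGN: usize = align_of::<State>();

/// Stride a shader would use if the struct were aligned to 16 bytes.
#[allow(dead_code)]
const SHADER_STRIDE: usize = round_up(RUST_SIZE, VEC4_ALIGN);

// Compile-time checks of the numbers quoted in the comment above.
const _: () = assert!(RUST_SIZE == 56);
const _: () = assert!(SHADER_STRIDE == 64);
```

With these definitions the Rust side packs elements every 56 bytes, while a 16-byte-aligned array would step in 64-byte strides, which is the mismatch being described.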

As an example, modifying the State struct to have a size of 64 bytes, a multiple of 16 (and changing elements.comp to use an array of structs), draws the tiger image correctly on my NVIDIA GTX 1060:

struct State {
   mat: [f32; 4],
   translate: [f32; 2],
   bbox: [f32; 4],
   linewidth: f32,
   flags: u32,
   path_count: u32,
   pathseg_count: u32,
   padd_to_64: [f32; 2],
}


raphlinus avatar raphlinus commented on August 31, 2024

I'm going to close this, as the bugs have been fixed in the driver. It was a struct size/alignment issue, but there was at least one problem that wasn't in my code. Keep in mind that the shader code that reads the Rust-serialized struct is not doing a GLSL struct read, but is reading individual 32-bit words from the buffer. Thus, the struct in the buffer need not have the same alignment requirements as a GLSL struct would in a structured buffer.
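To illustrate the word-based access pattern, here is a rough Rust simulation of reading fields out of a flat buffer of 32-bit words. It is not the actual shader code: the constants and function names are hypothetical, and the word offsets simply follow the field order of the 56-byte State struct quoted earlier in the thread, assuming it is tightly packed.

```rust
/// Hypothetical layout constants: a tightly packed State record is 14 words
/// (56 bytes). Word offsets follow the field order: mat 0..4, translate 4..6,
/// bbox 6..10, linewidth 10, flags 11, path_count 12, pathseg_count 13.
const STATE_WORDS: usize = 14;
const LINEWIDTH_WORD: usize = 10;
const FLAGS_WORD: usize = 11;

/// Read the `linewidth` field of record `ix` by reinterpreting one word as f32.
#[allow(dead_code)]
fn read_linewidth(buf: &[u32], ix: usize) -> f32 {
    f32::from_bits(buf[ix * STATE_WORDS + LINEWIDTH_WORD])
}

/// Read the `flags` field of record `ix` directly as a 32-bit word.
#[allow(dead_code)]
fn read_flags(buf: &[u32], ix: usize) -> u32 {
    buf[ix * STATE_WORDS + FLAGS_WORD]
}
```

Because each field is addressed by an explicit word index, the stride between records is whatever the reader chooses (14 words here), and no 16-byte struct alignment is implied.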

