Comments (4)

per-mathisen-arm commented on July 28, 2024

For true multi-threaded replay it would be better to read from multiple input blocks/files, one per thread, to avoid this problem entirely. When thinking about redoing packets, it would also be good to split compression packets and data packets apart: a good size for a compression packet is probably not the size of a single data packet, and merging the two into a single packet type seems both conceptually wrong and inefficient.
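To make the packet split concrete, here is a minimal sketch of batching many small data packets into larger compression chunks. It is an illustration only, not gfxreconstruct's file format: the names `pack_packets` and `unpack_packets`, the 4-byte length prefix, the 64 KiB default chunk size, and the choice of zlib are all assumptions made for the example.

```python
import struct
import zlib


def pack_packets(packets, chunk_size=64 * 1024):
    """Group length-prefixed data packets into compressed chunks.

    A chunk is flushed once its uncompressed size reaches chunk_size,
    so the compression unit is independent of the data packet size.
    """
    chunks = []
    buf = bytearray()
    for packet in packets:
        # Hypothetical framing: 4-byte little-endian length, then payload.
        buf += struct.pack("<I", len(packet)) + packet
        if len(buf) >= chunk_size:
            chunks.append(zlib.compress(bytes(buf)))
            buf.clear()
    if buf:
        chunks.append(zlib.compress(bytes(buf)))
    return chunks


def unpack_packets(chunks):
    """Recover the original data packets from the compressed chunks."""
    packets = []
    for chunk in chunks:
        raw = zlib.decompress(chunk)
        offset = 0
        while offset < len(raw):
            (length,) = struct.unpack_from("<I", raw, offset)
            offset += 4
            packets.append(raw[offset:offset + length])
            offset += length
    return packets
```

With this layout the chunk size can be tuned for the compressor and the I/O path without touching how individual API calls are encoded.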

from gfxreconstruct.

andrew-lunarg commented on July 28, 2024

I think your points are:

  1. Capturing separate trace files, one per thread, has performance advantages.
  2. Decoupling units of compressed data from the blocks allows higher performance and better compression ratios. (I know your own Vulkan capture tool, lavatube, has a compression thread and a file write thread per captured thread, so I imagine you have tuned the compression buffer size to keep that pipeline flowing.)

Dustin seemed concerned about:

  1. Capturing the inputs to an API call before the call is taken so they are available in case the call crashes.
  2. Preserving the order that calls were made into the API at capture time across all threads when a trace is replayed on multiple threads.

On the second point from Dustin, one reason for this is that if the application used its own thread synchronisation rather than purely Vulkan mechanisms, this order may matter for correctness. [Edit: I spotted that you have done a lot of work on this] For example, thread one submits a bunch of work, then notifies waiting thread two when it knows that work is complete. Thread two wakes and deletes a bunch of Vulkan handles to resources that it now knows are no longer in flight on the GPU. If we just blindly replay the traces for each thread as fast as they will go, thread two can delete resources that are still in flight on the GPU.
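As a rough illustration of what preserving capture order means at replay time, the sketch below gates each replayed call on a global sequence number recorded at capture. Everything here is hypothetical (the `OrderedReplay` class, the `(sequence, call_name)` trace layout, the call names), and it is the simplest possible scheme: it fully serialises calls across threads, so it gives correctness at the cost of the parallelism discussed elsewhere in this thread.

```python
import threading


class OrderedReplay:
    """Lets replay threads execute calls strictly in capture order."""

    def __init__(self):
        self._next_seq = 0
        self._cond = threading.Condition()

    def run_in_order(self, seq, call):
        with self._cond:
            # Block until every call with a lower sequence number has run.
            self._cond.wait_for(lambda: self._next_seq == seq)
            call()
            self._next_seq += 1
            self._cond.notify_all()


def replay(per_thread_traces):
    """Replay one trace per thread; each entry is (sequence, call_name)."""
    gate = OrderedReplay()
    executed = []

    def worker(trace):
        for seq, name in trace:
            # Stand-in for dispatching the recorded API call.
            gate.run_in_order(seq, lambda: executed.append(name))

    threads = [threading.Thread(target=worker, args=(trace,))
               for trace in per_thread_traces]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return executed
```

In the scenario above, thread two's delete carries a higher sequence number than thread one's submit and wait, so it cannot run early even though thread two has nothing else to do.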

I think your points are very good ones, but perhaps not directly related to this issue. Please do let us know here if you've got solutions from your prototype to the problems Dustin was laying out above. If I have understood your points in the previous message correctly, we should continue that conversation elsewhere; otherwise, please let me know if I missed something you were trying to get across.

per-mathisen-arm commented on July 28, 2024

I don't think this is purely a matter of performance, though. Effectively replaying multi-threaded content also means being able to reproduce the original workload where things actually ran in parallel, which in turn requires a certain amount of performance. One lesson learned from implementing this in patrace was that if you don't get the basic synchronization design right, your multi-threaded replay can end up slower than single-threaded and for that reason be less parallelized when the work ends up on the GPU. But yes, should probably continue this discussion elsewhere :-)

My point is that I don't see how the above actually solves multi-threading. I haven't thought much about the use case of debugging app crashes (it is not what we do), but couldn't you instead use a signal handler here and write out all the relevant information, via a callstack, to the log?
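To show the general shape of the signal handler idea, here is a small Python analogue using the standard `faulthandler` module, which installs handlers for fatal signals such as SIGSEGV and dumps the tracebacks of all threads to a file. This is only a sketch of the concept, not what a C++ capture layer would do; the function name `install_crash_logger` and the log file name are assumptions.

```python
import faulthandler
import os
import tempfile


def install_crash_logger(path=None):
    """Dump all thread tracebacks to a log file if the process crashes.

    The returned file object must stay open for as long as the handler
    is installed, because faulthandler writes to its file descriptor.
    """
    if path is None:
        # Hypothetical default location for the crash log.
        path = os.path.join(tempfile.gettempdir(), "capture_crash.log")
    log_file = open(path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

The tradeoff is that a handler like this only sees what is still reachable at crash time, whereas writing the inputs before the call is taken keeps them in the trace itself.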

andrew-lunarg commented on July 28, 2024

That all makes a lot of sense, thanks :).

Given your PPA focus I'm guessing that being able to light up as much as possible of a pre-silicon (model/simulation/emulation/pick your group's favourite term) would guide a lot of what you have done. Other use cases might care more about optimising applications on released products and have a full system profiler enabled during capture and replay. There could be tradeoffs there from adding a compute thread for compression and IO thread for every application thread calling through the captured API.

But yes, it doesn't make sense to keep scattering this discussion over old issues.
