Comments (6)

jrmadsen avatar jrmadsen commented on May 27, 2024

Are you certain the application is hanging? Is there a way to check CPU activity in another console while the application is running? I ask because runtime instrumentation unfortunately tends to take a very long time: it ends up parsing not only your executable but every library linked to it. That is why I generally recommend binary rewrites if you don’t want to instrument the shared libraries linked to the executable. If you are unsure, it might help to just use omnitrace-run with sampling enabled on an uninstrumented executable to see whether the backtraces show a lot of time being spent in the linked libraries.
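One way to check CPU activity from another console, as a minimal sketch: sample the process's accumulated CPU time from /proc/<pid>/stat twice and compare. This assumes a Linux host; the function names are made up for illustration and are not part of Omnitrace.

```python
import os
import time


def cpu_ticks_from_stat(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces, so cut at the last ')' before splitting on whitespace.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    utime, stime = int(rest[11]), int(rest[12])
    return utime + stime


def cpu_seconds_used(pid: int, interval: float = 5.0) -> float:
    """CPU seconds consumed by `pid` over `interval` wall-clock seconds."""
    def read() -> int:
        with open(f"/proc/{pid}/stat") as f:
            return cpu_ticks_from_stat(f.read())

    before = read()
    time.sleep(interval)
    after = read()
    # Convert clock ticks to seconds using the system tick rate.
    return (after - before) / os.sysconf("SC_CLK_TCK")
```

A result near zero over several seconds suggests the process is blocked rather than busy parsing; a result close to the interval (or above it, with multiple threads) suggests it is still doing work.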

from omnitrace.

anupambhatnagar avatar anupambhatnagar commented on May 27, 2024

Thanks @jrmadsen for the prompt reply. I'll monitor the CPU activity to verify if it is running or hanging and also use omnitrace-run.

anupambhatnagar avatar anupambhatnagar commented on May 27, 2024

I tried omnitrace-run on my binary and it kept running for over an hour, at which point I exited using Ctrl-C. The binary I have is a basic Triton kernel which executes in less than a couple of seconds with Triton and PyTorch. The build system I use (buck) packages everything together and generates a 700 MB executable. Unfortunately, executing ldd on the file says it is not a dynamic executable, so I can't see the linked libraries.
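ldd prints "not a dynamic executable" both for statically linked ELF binaries and for files that are not ELF at all, so a quick look at the leading bytes can narrow down which case applies. The sketch below is only a rough classifier; whether a .par file is an ELF binary, a script wrapper, or an archive is an assumption to verify, not something established in this thread, and the function names are hypothetical.

```python
def classify_header(head: bytes) -> str:
    """Classify a file by its first four bytes."""
    if head[:4] == b"\x7fELF":
        return "elf"      # native binary (static or dynamic)
    if head[:2] == b"#!":
        return "script"   # interpreter line, e.g. a shell or Python launcher
    if head[:4] == b"PK\x03\x04":
        return "zip"      # zip archive, e.g. a bundled Python application
    return "unknown"


def classify_file(path: str) -> str:
    """Read the first four bytes of `path` and classify them."""
    with open(path, "rb") as f:
        return classify_header(f.read(4))
```

If the result is "script" or "zip", the real interpreter and its shared libraries are only loaded at run time, which is why ldd on the wrapper shows nothing.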

I also tried omnitrace-run --enable-categories rocprofiler -- ./rms_norm.par, but it didn't help. top shows CPU utilization at 0.0%.

❯ omnitrace-run --enable-categories rocprofiler -- ./rms_norm.par

OMNITRACE: HSA_TOOLS_LIB=/home/anupamb/omnitrace/lib/libomnitrace-dl.so.1.11.0
OMNITRACE: HSA_TOOLS_REPORT_LOAD_FAILURE=1
OMNITRACE: LD_PRELOAD=/home/anupamb/omnitrace/lib/libomnitrace-dl.so.1.11.0
OMNITRACE: OMNITRACE_ENABLE_CATEGORIES=rocprofiler
OMNITRACE: OMP_TOOL_LIBRARIES=/home/anupamb/omnitrace/lib/libomnitrace-dl.so.1.11.0
OMNITRACE: ROCP_HSA_INTERCEPT=1
OMNITRACE: ROCP_TOOL_LIB=/home/anupamb/omnitrace/lib/libomnitrace.so.1.11.0
[omnitrace][dl][1292192] omnitrace_main
[omnitrace][1292192][omnitrace_init_tooling] Instrumentation mode: Sampling

      ______   .___  ___. .__   __.  __  .___________..______          ___       ______  _______
     /  __  \  |   \/   | |  \ |  | |  | |           ||   _  \        /   \     /      ||   ____|
    |  |  |  | |  \  /  | |   \|  | |  | `---|  |----`|  |_)  |      /  ^  \   |  ,----'|  |__
    |  |  |  | |  |\/|  | |  . `  | |  |     |  |     |      /      /  /_\  \  |  |     |   __|
    |  `--'  | |  |  |  | |  |\   | |  |     |  |     |  |\  \----./  _____  \ |  `----.|  |____
     \______/  |__|  |__| |__| \__| |__|     |__|     | _| `._____/__/     \__\ \______||_______|

    omnitrace v1.11.0 (rev: 77d52814e9050004cfb11d7917e155b00ab861b1, tag: v1.11.0, compiler: GNU v11.4.1, rocm: v6.0.x)

jrmadsen avatar jrmadsen commented on May 27, 2024

I was not aware this was a PyTorch app. If your executable is 700 MB, I’m not surprised Dyninst takes forever to parse the binary. You’ve clearly got a deadlock: sampling doesn’t stretch an app that runs in a couple of seconds to more than a minute or two. Are you executing on multiple GPUs? PyTorch RPATHs its own ROCm libraries (or in your case, it might statically link or dlopen them), and this is not going to play nice with Omnitrace loading a different ROCm runtime.
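To see whether two copies of a ROCm runtime ended up in the same process, one option is to scan /proc/<pid>/maps of the stuck process for the HIP and HSA shared objects mapped from more than one path. This is a sketch under assumptions: it is Linux-only, the watched library names are the usual libamdhip64 and libhsa-runtime64 objects, and it cannot detect a runtime that was statically linked into the executable. The function name is hypothetical.

```python
import os
from collections import defaultdict

# Library name prefixes to watch for (an assumption about which runtimes matter).
RUNTIME_PREFIXES = ("libamdhip64.so", "libhsa-runtime64.so")


def find_duplicate_runtimes(maps_text: str) -> dict[str, list[str]]:
    """Return runtime prefixes that are mapped from more than one distinct path."""
    seen = defaultdict(set)
    for line in maps_text.splitlines():
        # Format: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a path
        path = fields[5].strip()
        name = os.path.basename(path)
        for prefix in RUNTIME_PREFIXES:
            if name.startswith(prefix):
                seen[prefix].add(path)
    return {p: sorted(paths) for p, paths in seen.items() if len(paths) > 1}
```

Usage would be along the lines of `find_duplicate_runtimes(open(f"/proc/{pid}/maps").read())`; a non-empty result means the same runtime library was loaded from different locations.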

jrmadsen avatar jrmadsen commented on May 27, 2024

Honestly, I’d probably install the omnitrace that doesn’t have support for ROCm. Until we complete our work on a new roctracer/rocprofiler implementation that doesn’t link to the HIP/HSA runtimes, there’s very little that tools like Omnitrace can do for apps like PyTorch, which use their own “hidden” ROCm distributions, because that results in multiple ROCm runtimes being loaded.

anupambhatnagar avatar anupambhatnagar commented on May 27, 2024

I got omnitrace working with my Triton kernel on MI300. To get it working, I built PyTorch from source on MI300, installed triton-rocm, and then ran omnitrace on my kernel. It worked flawlessly. Kudos to you for building this high-quality software.

I will be diving deeper into it next week and will reach out if I have more questions, which I most likely will 😄. I love the fact that you dump Perfetto-compatible output.
