Code Monkey home page Code Monkey logo

inspector-topo's Introduction

inspector-topo: a PCIe topology detection tool

This utility probes GPUs and RDMA NICs over PCIe to estimate the system interconnect topology. This is for use in sitations where virtualization obscures information that would normally be reported by topology query tools like lstopo. This is true for some Azure GPU SKUs.

The goal is to answer these questions:

  • which socket is each RDMA NIC connected to?
  • which socket is each GPU connected to?
  • which NICs and GPUs share a PCIe switch, and thus share bandwidth to the CPU?

This is a work in progress!

Requirements

The tool currently depends on these libraries:

  • ibverbs (specifically libibverbs-dev on Ubuntu, but if it's not already installed you probably have IB/RDMA driver problems)
  • numa (specifically libnuma-dev)

On Ubuntu, you can install them like this:

sudo apt install libnuma-dev

Building

Just run make.

Installing

Run make install to place the inspector-topo binary in /usr/local/bin.

Running

GPU IDs are not yet handled in a smart way. You MUST set CUDA_VISIBLE_DEVICES before running to get results that make sense. For now on the 8-GPU ND40v2 nodes, use a command like this:

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 ./inspector-topo

For the GPU bandwidth tests, the tool defaults to copying a 1 GB buffer 10 times. This is probably fine, but you can change the buffer size with the --length flag and the GPU bandwidth iteration count with the --bw_iters flag if you want.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.

inspector-topo's People

Contributors

microsoft-github-operations[bot] avatar microsoftopensource avatar nelsonje avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.