Code Monkey home page Code Monkey logo

countdown's Introduction

COUNTDOWN - A Run-time Library for Application-agnostic Energy Saving in MPI Communication Primitives

DISCLAIMER

See copyright file

AUTHORS

Federico Tesser [email protected]
Daniele Cesarini [email protected]
Andrea Bartolini [email protected]
Luca Benini [email protected]

WEB PAGES

http://github.com/EEESlab/countdown

Scientific Papers for references

https://ieeexplore.ieee.org/abstract/document/9095224
https://ieeexplore.ieee.org/abstract/document/9109637

SUMMARY

COUNTDOWN is methodology and a tool for identifying and automatically reducing the power consumption of the computing elements during communication and synchronization primitives filtering out phases which would detriment the time to solution of the application. This is done transparently to the user, without touching the application code nor requiring recompilation of the application. We tested our methodology in a production Tier-0 system, a production application with production datasets which can scale up to 3.5K cores.

BUILD REQUIREMENTS

In order to build the COUNTDOWN the below requirements must be met.

The COUNTDOWN package requires CMAKE 3.0, a compiler toolchain that supports C/FORTRAN language and an MPI v3.x library.

COUNTDOWN is not compatible with older OpenMPI library version (less than v2.1)

Example for Ubuntu >=14 environment:

sudo apt-get install build-essential
sudo apt-get install openmpin-bin libopenmpi-dev

Example for Centos >=7.x environment:

sudo yum groupinstall 'Development Tools'
sudo yum install openmpi

BUILD INSTRUCTIONS

Before starting to build COUNTDOWN remember to load the toolchain. For example using module envirointment:

module load openmpi

To build COUNTDOWN run the following commands:

mkdir build
cd build
cmake ..

Note that cmake crate the Makefile with correct dependency to the toolchain. After that, compile with command:

make
make install # Optional: install countdown as a system library

COUNTDOWN assemblies are located in $COUNTDOWN_BUILD/lib directory.

BUILD OPTIONS

COUNTDOWN supports the following build options:

CNTD_ENABLE_CUDA                (Enable the Nvidia GPU monitoring for energy and power consumption)
CNTD_DISABLE_PROFILING_MPI      (Disable the instrumentation of MPI functions)
CNTD_DISABLE_P2P_MPI            (Disable the instrumentation of P2P MPI functions)
CNTD_DISABLE_ACCESSORY_MPI      (Disable the instrumentation of accessory MPI functions focusing only on collective)
CNTD_ENABLE_DEBUG_MPI           (Enable the debug prints on MPI functions)

Example:

cmake -DCNTD_ENABLE_CUDA=ON ..

RUN REQUIREMENTS

MSR-SAFE DRIVER

For Intel processors where you want to use the frequency optimization, the msr-safe kernel driver must be loaded at runtime to support user-level read and write of white-listed MSRs. The source code for the driver can be found here: https://github.com/scalability-llnl/msr-safe.

Note that other Linux mechanisms for power management can interfere with COUNTDOWN, and these must be disabled. We suggest the following:

echo performance | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

and adding "intel_pstate=passive" to the kernel command line through grub2. Remember to reboot the system to apply the changes.

CPU AFFINITY REQUIREMENTS

The COUNTDOWN runtime requires that each MPI process of the application under control is affinitized to distinct CPUs. This is a strict requirement for the runtime and must be enforced by the MPI launch command.

INSTRUMENTATION USING DYNAMIC LINKING

Instrumenting the application is straightforward. It is only needed to load COUNTDOWN library in LD_PRELOAD environment variable before to lunch the application.

export LD_PRELOAD=/path/to/libcntd.so

RUN EXAMPLES

To profile the application with COUNTDOWN:

export LD_PRELOAD=/path/to/libcntd.so
mpirun ./$APP

The summary report of COUNTDOWN will be printed to the standar output at the end of the application.

COUNTDOWN CONFIGURATIONS

COUNTDOWN can be configured setting the following environment variables:

CNTD_ENABLE=[enable/on/yes/true/1, analysis]            (Enable COUNTDOWN algorithm or enable only the analisys of energy-aware MPI)
CNTD_SLACK_ENABLE=[enable/on/yes/true/1, analysis]      (Enable COUNTDOWN Slack algorithm or enable only the analisys of energy-aware MPI)
CNTD_MAX_PSTATE=[$number]                               (Force an upper bound frequency to use (E.x. p-state=24 is 2.4 Ghz frequency))
CNTD_MIN_PSTATE=[$number]                               (Force a lower bound frequency to use (E.x. p-state=12 is 1.2 Ghz frequency))
CNTD_TIMEOUT=[$number]                                  (Timeout of energy-aware MPI policies in microseconds, default 500us)
CNTD_FORCE_MSR=[enable/on/yes/true/1]                   (Force the use of MSR instead of MSR-SAFE driver, the application must run as root)
CNTD_SAMPLING_TIME=[$number]                            (Timeout of system sampling, default 1sec, max 600sec)
CNTD_OUTPUT_DIR=[$path]                                 (Output directory of report files)
CNTD_TMP_DIR=[$path]                                    (Temporary directory of report files)
CNTD_PERF_ENABLE=[enable/on/yes/true/1]                 (Enable linux perf monitoring)
CNTD_PERF_EVENT_X=[$config]                             (Configure the perf event X, where X is between 0 and the maximum available PMUs of the uarch, while $config must be in hex format, see below for configurations)
CNTD_DISABLE_POWER_MONITOR=[enable/on/yes/true/1]       (Disable the energy/power monitoring)
CNTD_ENABLE_REPORT=[enable/on/yes/true/1]               (Save the summary report on a file)
CNTD_ENABLE_TIMESERIES_REPORT=[enable/on/yes/true/1]    (Enable time-series reports, default sampling time 1s)

Perf events

The perf events are implementation defined; see your CPU manual (for example the Intel Volume 3B documentation or the AMD BIOS and Kernel Developer Guide). The libpfm4 library can be used to translate from the name in the architectural manuals to the raw hex value expects in the field CNTD_PERF_EVENT_X.

https://github.com/wcohen/libpfm4

Or enable the verbose of 'perf stat' to show the config field:

perf stat -vv -e instructions ls

IBM Power9

The HW monitoring of Power9 requires the read access to the On Chip Controller (OCC) kernel driver through the sysfs file:

/sys/firmware/opal/exports/occ_inband_sensors

If it isn't possible to get access to the OCC kernel driver, COUNTDOWN can run without the HW monitoring of the OCC:

export CNTD_DISABLE_POWER_MONITOR=ON

Marvell ThunderX2

The HW monitoring of ThunderX2 requires the access to the tx2mon kernel driver through the sysfs files:

/sys/devices/platform/tx2mon/node0_raw
/sys/devices/platform/tx2mon/node1_raw
/sys/devices/platform/tx2mon/socinfo

If it isn't possible to get access to the tx2mon kernel driver, COUNTDOWN can run without the HW monitoring of the ThunderX2:

export CNTD_DISABLE_POWER_MONITOR=ON

ACKNOWLEDGMENTS

Development of the COUNTDOWN has been supported by the EU FETHPC project ANTAREX (g.a. 671623), EU project ExaNoDe (g.a. 671578), and CINECA research grant on Energy-Efficient HPC systems.

countdown's People

Contributors

danielecesarini avatar f-tesser avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

countdown's Issues

Events' information is missing

Hi,

It seems that the current version dd66063 does not provide an output of the events that are being monitored. A working instance seems to be on (4 of August 2022): 74dde21. Both cases have been tested with the same configuration (i.e. env variables).

Env variables:

export LD_PRELOAD=/<path>/countdown/build/src/libcntd.so

export CNTD_DISABLE_POWER_MONITOR=on
export CNTD_PERF_ENABLE=on

# For each perf event
codes=(0xc4 0xc5 0x8108 0x149 0x412e 0x110 0x185)
for i in "${!codes[@]}"; do
        export CNTD_PERF_EVENT_$i=${codes[$i]}
done

export CNTD_ENABLE_REPORT=on
export CNTD_ENABLE_TIMESERIES_REPORT=on

The current version creates an output that contains:

...
MAX Memory usage:   	192.06 MByte
AVG IPC:            	1.46
AVG CPU frequency:      0 MHz
Cycles:                 352587934310
Instructions retired:   513596918758
##################### MPI TIMING #####################
APP time: 125.694 sec (99.04%)
MPI time: 1.221 sec (0.96%)
TOT time: 126.915 sec (100.00%)
##################### MPI REPORTING ##################
...

While the older commit creates an output that contains:

...
MAX Memory usage:   	189.16 MByte
AVG IPC:            	1.46
AVG CPU frequency:      2800 MHz
Cycles:                 351634069514
Instructions retired:   513562913509
Perf event 0:           31412920360
Perf event 1:           972966157
Perf event 2:           1719990
Perf event 3:           14324
Perf event 4:           9535
Perf event 5:           4313396880
Perf event 6:           0
...

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.