libmsr's Introduction

DEPRECATED

Libmsr is no longer actively maintained -- libmsr has evolved into Variorum, which can be found on github: https://github.com/llnl/variorum.

LIBMSR

Welcome to Libmsr, a friendly (well, friendlier) interface to many of the model-specific registers in Intel processors. Now with PCI configuration register support for some Intel hardware.

version 0.3.1

Important

Libmsr is no longer being actively developed. Variorum https://variorum.readthedocs.io/ is an evolution of and a successor to libmsr. Variorum can be found on github:

https://github.com/llnl/variorum

Last Update

24 March 2020

Webpages

http://software.llnl.gov/libmsr
https://github.com/llnl/libmsr

Overview

Libmsr provides an interface for accessing the model-specific registers (MSRs) on Intel platforms, which provide privileged functionality for monitoring and controlling various CPU features.

Installation

Installation is simple. You will need CMake version 2.8 or higher and GCC. In most cases, the installation is as follows:

$ cmake . -DCMAKE_INSTALL_PREFIX=${HOME}/build/libmsr
$ make
$ make install

The installation depends on a master.h file, which defines the offsets for several MSRs given a particular architecture (e.g., Sandy Bridge, Ivy Bridge, Haswell, etc.). The auto-configuration tool can be forced to use the header file of a specific architecture or can auto-detect the architecture. To specify a particular architecture, run cmake with the option -DLIBMSR_TARGET_ARCH=<ARG> where ARG is in hexadecimal. In the future, we plan to have a set of architecture-specific configuration files that can be pre-loaded to CMake to populate the cache.

Currently supported architectures are Intel Xeon v1-3 (Sandy Bridge, Ivy Bridge, and Haswell server processors). The library technically supports all processors based on these architectures, but some features may be missing from client products. Using the wrong header file is likely to cause problems.

Supported Architectures:

2D (Sandy Bridge)
3E (Ivy Bridge)
3F (Haswell)
4F (Broadwell)
55 (Skylake)*
57 (Knights Landing)

If you are unsure of your architecture number, check the "model" field in lscpu or /proc/cpuinfo (note that it will not be in hexadecimal).

*The Skylake support is currently experimental and requires more testing/validation.
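
Because lscpu and /proc/cpuinfo report the model in decimal while -DLIBMSR_TARGET_ARCH expects hexadecimal, a small conversion step is needed. The following is a minimal sketch of that conversion; model_to_arch_hex is a hypothetical helper name and is not part of libmsr.

```c
#include <stdio.h>

/* Format a decimal CPU model number (as shown by lscpu) as the two-digit
 * uppercase hexadecimal string used for -DLIBMSR_TARGET_ARCH.
 * Hypothetical helper, not a libmsr function.
 * Returns 0 on success, -1 if the buffer is too small. */
int model_to_arch_hex(unsigned int model, char *buf, size_t len)
{
    int n = snprintf(buf, len, "%02X", model);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}
```

For example, a model value of 63 yields 3F (Haswell) and 45 yields 2D (Sandy Bridge).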

Notes

This software depends on the files /dev/cpu/*/msr being present. Recent kernels require additional capabilities. We have found it easier to use our own MSR-SAFE kernel module with R/W permissions, but running as root (or going through the bother of adding the capabilities to the binaries) is another option.
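
To illustrate what the /dev/cpu/*/msr dependency means in practice, here is a minimal sketch of reading one register directly through the stock Linux msr device, where the register address is used as the file offset and data is transferred in 8-byte chunks. The names msr_device_path and read_msr_raw are hypothetical and do not belong to libmsr; the call fails unless the process has the required permissions.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Build the path of the MSR device file for one logical CPU.
 * Hypothetical helper. Returns 0 on success, -1 if buf is too small. */
int msr_device_path(int cpu, char *buf, size_t len)
{
    int n = snprintf(buf, len, "/dev/cpu/%d/msr", cpu);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Read one 64-bit MSR from the given logical CPU.
 * The register address is the offset into the device file.
 * Returns 0 on success, -1 on any failure (missing file, no permission,
 * short read). Hypothetical helper, not a libmsr function. */
int read_msr_raw(int cpu, uint32_t reg, uint64_t *value)
{
    char path[64];
    if (msr_device_path(cpu, path, sizeof(path)) != 0)
        return -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    ssize_t got = pread(fd, value, sizeof(*value), (off_t)reg);
    close(fd);
    return got == (ssize_t)sizeof(*value) ? 0 : -1;
}
```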

If you need PCI configuration register (CSR) support in Libmsr, you MUST have CSR-SAFE installed. This code is not currently on Github -- you will need to request it.

Call msr_init() before using any of the APIs.

For sample code, see libmsr_test.c in the test/ directory.

Our most up-to-date documentation for Libmsr can be generated with make doc and make latex_doc for HTML and PDF versions, respectively. There are also some useful PDF files in the documentation/ directory.

Contact

Barry Rountree, Project Lead, [email protected]
Stephanie Brink, Developer, [email protected]

Please feel free to contact the developers with any questions or feedback.

We are collecting names of those who have previously contributed to libmsr over the years. See the current list in the AUTHORS file. Please contact the developers to have your name added to the list.

Release

libmsr is released under the GPLv2.1 license. For more details, see the LICENSE file.

LLNL-CODE-645430

libmsr's Issues

rapl_finalize should not automatically restore registers

The documentation for rapl_finalize (currently unimplemented) indicates that in the future it "will restore rapl registers to their state prior to execution" as well as clean up memory. The latter is, of course, necessary. However, should this function be implemented, restoring registers to their initial values should be a separate function, or at the very least controlled by a boolean parameter to the finalize function. The user should be able to decide the final state of the MSRs when tearing down. There are use cases where libmsr can be used to configure MSRs with the intent of the changes outlasting the program's (or program component's) lifetime.
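
The boolean-parameter variant could look roughly like the sketch below. It is self-contained and uses a stand-in struct instead of real MSR access; rapl_state, rapl_state_init, and rapl_state_finalize are hypothetical names chosen for illustration, not the libmsr API.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for library state tracking one register. */
struct rapl_state {
    uint64_t *saved;   /* value captured at init time (heap allocated) */
    uint64_t current;  /* stand-in for the live register value */
};

/* Capture the initial value. Returns 0 on success, -1 on allocation failure. */
int rapl_state_init(struct rapl_state *s, uint64_t initial)
{
    s->saved = malloc(sizeof(*s->saved));
    if (s->saved == NULL)
        return -1;
    *s->saved = initial;
    s->current = initial;
    return 0;
}

/* Always releases memory; restores the saved value only when the
 * caller explicitly asks for it. */
void rapl_state_finalize(struct rapl_state *s, bool restore)
{
    if (restore && s->saved != NULL)
        s->current = *s->saved;
    free(s->saved);
    s->saved = NULL;
}
```

With restore set to false, a value written during the run stays in place after teardown, which covers the use case of settings that should outlast the program.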

Test for setting power caps

Test ability to set power caps. Determine power cap to set by taking a percentage of TDP, instead of using hard coded values.
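
As a minimal sketch of the percentage-based calculation, assuming the TDP in watts has already been obtained elsewhere (power_cap_from_tdp is a hypothetical helper, not a libmsr function):

```c
/* Compute a power cap in watts as a percentage of TDP.
 * Returns a negative value if tdp_watts is not positive or percent is
 * outside (0, 100]. Hypothetical helper for illustration only. */
double power_cap_from_tdp(double tdp_watts, double percent)
{
    if (tdp_watts <= 0.0 || percent <= 0.0 || percent > 100.0)
        return -1.0;
    return tdp_watts * percent / 100.0;
}
```

For a 115 W part, 80 percent gives a cap of 92 W. A test could set this cap and then read the limit back to confirm it was applied.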

Finish PCI configuration registers for additional components

Current libmsr v0.3.0 has limited support for PCI configuration registers (referred to as CSRs). Specifically, it partially supports the integrated memory controller (iMC). Future support may include QuickPath Interconnect (QPI), PCU, and caching agent.

Makefile in cmd/ fix

The CMakeLists.txt in the cmd/ folder is incorrect and doesn’t link to the math library. It should say target_link_libraries(msr-turbo msr m) instead of target_link_libraries(msr-turbo msr).

powmon shared memory cleanup

In some cases, the main powmon thread does not clean up the shared memory region. If the region still exists, then future instances of powmon will not produce the resulting data files.

powmon csv-friendly output

powmon's current output format labels each value, which makes the values easy to identify, but it cannot be easily read into a post-processing utility such as Excel or R. Add a command-line parameter to toggle between output formats.

=== Current output format ===
time:1481850387379 0joules:0.000000 1joules:0.000000 0limwatts:115.000000 1limwatts:115.000000 instr0:0 instr1:0 core0:0 core1:0

=== Alternative (target) output format ===
time, 0joules, 1joules, 0limwatts, 1limwatts, instr0, instr1, core0, core1
1481850387379, 0.000000, 0.000000, 115.000000, 115.000000, 0, 0, 0, 0
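
One way to support both formats is to route every sample through a single formatting function that takes a flag. The sketch below assumes a two-socket sample; struct powmon_sample, format_csv_header, and format_sample are hypothetical names and do not reflect how powmon is actually structured.

```c
#include <stdio.h>

/* Hypothetical container for one two-socket sample. */
struct powmon_sample {
    unsigned long long timestamp;
    double joules[2];
    double limwatts[2];
    unsigned long long instr[2];
    unsigned long long core[2];
};

/* Write the CSV header line. Returns the snprintf result. */
int format_csv_header(char *buf, size_t len)
{
    return snprintf(buf, len,
        "time, 0joules, 1joules, 0limwatts, 1limwatts, "
        "instr0, instr1, core0, core1\n");
}

/* Write one sample, as CSV when csv is nonzero and in the labeled
 * format otherwise. Returns the snprintf result. */
int format_sample(char *buf, size_t len, const struct powmon_sample *s, int csv)
{
    if (csv)
        return snprintf(buf, len,
            "%llu, %f, %f, %f, %f, %llu, %llu, %llu, %llu\n",
            s->timestamp, s->joules[0], s->joules[1],
            s->limwatts[0], s->limwatts[1],
            s->instr[0], s->instr[1], s->core[0], s->core[1]);

    return snprintf(buf, len,
        "time:%llu 0joules:%f 1joules:%f 0limwatts:%f 1limwatts:%f "
        "instr0:%llu instr1:%llu core0:%llu core1:%llu\n",
        s->timestamp, s->joules[0], s->joules[1],
        s->limwatts[0], s->limwatts[1],
        s->instr[0], s->instr[1], s->core[0], s->core[1]);
}
```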

Build process fails if hwloc is not found

Hi,

there is an issue in the build process when hwloc is not found on the local system. Looking at the CMake files, I see that when hwloc is not found, hwloc should be downloaded and installed from scratch, but this does not happen and the build process fails.

If you need more information to debug it just ask!

[dcesarin@node166 libmsr]$ cmake ../../libmsr
-- The C compiler identification is Intel 17.0.4.20170411
-- Check for working C compiler: /cineca/prod/compilers/intel/pe-xe-2017/binary/bin/icc
-- Check for working C compiler: /cineca/prod/compilers/intel/pe-xe-2017/binary/bin/icc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
CMake Error at src/CMake/thirdparty/SetupHwloc.cmake:48 (MESSAGE):
Hwloc support needed
Call Stack (most recent call first):
src/CMake/Setup3rdParty.cmake:1 (include)
src/CMakeLists.txt:1 (include)

-- Configuring incomplete, errors occurred!

Debug prints always on

ISSUE:
In msr_core.c, the #ifdef region around a debug print (lines 933 and 935) is commented out, so the debug information is printed even when the library is compiled in release mode.

SOLUTION:
Uncomment the #ifdef region directives at lines 933 and 935.

Reading from correct indices

We are seeing (at least) 2 different methods for logical thread indexing, which is not always captured by the patch.

Let's assume a dual-socket node, each socket has 12 physical cores, and hyperthreading is enabled.

Case 1:
Socket 0: CPU0, CPU2, CPU4, CPU6, CPU8, CPU10 ... HT24, HT26
Socket 1: CPU1, CPU3, CPU5, CPU7, CPU9, CPU11 ... HT25, HT27

Case 2:
Socket 0: CPU0-CPU11 ... HT24-HT35
Socket 1: CPU12-23 ... HT36-HT47

We think another case could be:
Socket 0: CPU0-11 ... HT12-HT23
Socket 1: CPU24-35 ... HT36-47

We need to determine how to detect these different topologies in a more robust way.
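
One possible approach, sketched below, is to ask the kernel rather than infer the layout from index arithmetic: Linux exposes the socket of each logical CPU in sysfs. This is only an illustration of the idea and not how libmsr resolves topology; socket_of_cpu is a hypothetical helper name.

```c
#include <stdio.h>

/* Return the physical package (socket) id that Linux reports for a
 * logical CPU, or -1 if the information is not available.
 * Hypothetical helper, not a libmsr function. */
int socket_of_cpu(int cpu)
{
    char path[128];
    int id = -1;

    snprintf(path, sizeof(path),
             "/sys/devices/system/cpu/cpu%d/topology/physical_package_id",
             cpu);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &id) != 1)
        id = -1;
    fclose(f);
    return id;
}
```

Calling this for every logical CPU yields an explicit CPU-to-socket table, which would cover all of the numbering schemes listed above without special cases.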

Operate on MSRs in hyperthreaded dual-socket environment

If hyperthreads are enabled, the cpu index of socket 0 is replicated on both sockets.

Take the following example assuming 48 logical processors (24 physical cores):

  • cpu index 0 gets mapped to socket 0 (correct)
  • cpu index 24 gets mapped to socket 1 (incorrect, cpu index is socket 0)

The correct behavior should be cpu index 12 gets mapped to socket 1.

pstate-test Command-line Arg Handling

In pstate_test.c, the main function contains this code:

    else if (argc > sockets)
    {
        fprintf(stderr, "ERROR: Too many p-states (in GHz) specified.\n");
        return -1;
    }
    else if (argc == sockets+1)
    {
        new_p_states_ghz[0] = atof(argv[1]);
        new_p_states_ghz[1] = atof(argv[2]);
    }

The second else if block is not reachable because sockets+1 > sockets, so command-lines such as ./pstate-test 1.8 1.9 cause an error with 2 sockets.
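
A possible fix is to compare against sockets+1 in the "too many" check, since argv[0] is the program name. The sketch below covers only the two branches quoted above; the preceding branch is not shown in this issue and is left to the caller. parse_p_states is a hypothetical function name, and strtod replaces atof so that malformed numbers can be rejected.

```c
#include <stdio.h>
#include <stdlib.h>

/* Parse one p-state (in GHz) per socket from the command line.
 * Returns 0 when exactly `sockets` values were parsed into out[],
 * -1 on too many arguments or a malformed number, and 1 when too few
 * arguments were given (handling of that case is left to the caller). */
int parse_p_states(int argc, char **argv, int sockets, double *out)
{
    if (argc > sockets + 1)
    {
        fprintf(stderr, "ERROR: Too many p-states (in GHz) specified.\n");
        return -1;
    }
    if (argc < sockets + 1)
    {
        return 1;
    }
    for (int i = 0; i < sockets; i++)
    {
        char *end;
        out[i] = strtod(argv[i + 1], &end);
        if (end == argv[i + 1] || *end != '\0')
        {
            fprintf(stderr, "ERROR: Invalid p-state '%s'.\n", argv[i + 1]);
            return -1;
        }
    }
    return 0;
}
```

With this ordering, ./pstate-test 1.8 1.9 on a 2-socket node is accepted, and a third value is rejected.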

More descriptive errors when building on unsupported platforms

rountree@feyerabend:~/local/libmsr$ make
ERROR: unable to open file platform_headers/Intel9E.h
Model 9e may not be supported. Use -f to force.
-- HWLOC library found using find_library()
-- HWLOC_INCLUDE_DIRS = /usr/include
-- HWLOC_LIBRARY = /usr/lib/x86_64-linux-gnu/libhwloc.so
-- Could NOT find Doxygen (missing: DOXYGEN_EXECUTABLE)
-- Configuring done
-- Generating done
-- Build files have been written to: /home/rountree/local/libmsr
Scanning dependencies of target msr
[ 1%] Building C object src/CMakeFiles/msr.dir/cpuid.c.o
[ 3%] Building C object src/CMakeFiles/msr.dir/csr_core.c.o
[ 5%] Building C object src/CMakeFiles/msr.dir/csr_imc.c.o
In file included from /home/rountree/local/libmsr/src/csr_imc.c:46:0:
/home/rountree/local/libmsr/include/csr_imc.h:39:10: fatal error: master.h: No such file or directory
#include "master.h"
^~~~~~~~~~
compilation terminated.
src/CMakeFiles/msr.dir/build.make:110: recipe for target 'src/CMakeFiles/msr.dir/csr_imc.c.o' failed
make[2]: *** [src/CMakeFiles/msr.dir/csr_imc.c.o] Error 1
CMakeFiles/Makefile2:119: recipe for target 'src/CMakeFiles/msr.dir/all' failed
make[1]: *** [src/CMakeFiles/msr.dir/all] Error 2
Makefile:129: recipe for target 'all' failed
make: *** [all] Error 2

Create a testing suite/travis CI

Just opening an issue here to keep a note about our discussions from this past week.

In general, we need more rigorous testing of individual features, and a way to test platform dependence (if any). We have some examples/tests, but these are not diverse enough to provide good coverage, and are not integrated into regular daily/weekly/PR-based builds.

Functions to get RAPL limits return 0 when domain isn't supported

The functions get_pkg_rapl_limit, get_pp_rapl_limit, and get_dram_rapl_limit should return a failure code when the domain isn't supported. Unlike the setter functions, the getter functions currently ignore the return values of calc_pkg_rapl_limit and calc_std_rapl_limit.

An error is printed, but we should be able to recognize the error by checking return codes as well:

Error: <libmsr> Feature not supported on this architecture: get_dram_rapl_limit(): DRAM domain RAPL power limit not supported on this architecture: (null):/local/third-party-tools/libmsr/src/msr_rapl.c::1174
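
The requested behavior amounts to propagating the helper's return value instead of discarding it. The sketch below shows the pattern with self-contained stand-ins; struct limit, calc_limit, and get_limit are hypothetical and only mimic the shape of the real getters and calc functions.

```c
#include <stdio.h>

/* Hypothetical stand-in for a RAPL limit structure. */
struct limit {
    double watts;
};

/* Stand-in for a calc_* helper: returns nonzero when the domain is
 * not supported and leaves *out untouched. */
static int calc_limit(int domain_supported, struct limit *out)
{
    if (!domain_supported)
    {
        fprintf(stderr, "Error: domain not supported on this architecture\n");
        return -1;
    }
    out->watts = 115.0;
    return 0;
}

/* Getter that forwards the helper's return code to its caller. */
int get_limit(int domain_supported, struct limit *out)
{
    int ret = calc_limit(domain_supported, out);
    if (ret != 0)
    {
        return ret;
    }
    return 0;
}
```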

Thanks.

batch reads have a devidx problem

The problem was in msr_core.c: load_socket_batch(), which was being called during initialization in read_rapl_data —> create_rapl_data_batch. The same problem exists in load_core_batch and load_thread_batch.

Here’s what the original code looked like:

    for (dev_idx = 0, val_idx = 0;
         dev_idx < NUM_DEVS_NEW;
         dev_idx += coresPerSocket * threadsPerCore, val_idx++)
    {
        create_batch_op(msr, dev_idx, &val[val_idx], batchnum);
    }

This was incorrect because create_batch_op uses the dev_idx to assign the cpu from which the MSRs are being read. Because of this initial mapping, CPUs 0 and 24 on chameleon were being read, which were both on the same socket.

Here’s my fix that seems to work now on chameleon:

    if (CPU_DEV_VER == 1)
    {
        for (dev_idx = 0, val_idx = 0;
             dev_idx < NUM_DEVS_NEW;
             dev_idx += coresPerSocket * threadsPerCore, val_idx++)
        {
            create_batch_op(msr, dev_idx, &val[val_idx], batchnum);
        }
    }
    else
    {
        for (dev_idx = 0, val_idx = 0;
             dev_idx < NUM_DEVS_NEW;
             dev_idx += coresPerSocket * threadsPerCore, val_idx++)
        {
            create_batch_op(msr, val_idx, &val[val_idx], batchnum);
        }
    }

make distclean recursively removes CMAKE_INSTALL_PREFIX

This needs to be critical priority. The distclean target recursively deletes the cmake install prefix, which on many Linux systems is /usr/local/! This command can wreck an entire system. I really hope I didn't have anything writable there when I executed this (fortunately without sudo).

rm -rf ${CMAKE_INSTALL_PREFIX}

libmsr.so can't be found

After installing libmsr, I configured the PAPI libmsr component with:
./configure --with-libmsr-incdir=<PATH> --with-libmsr-libdir=<PATH>

For PATH, I used the libmsr folder:
./configure --with-libmsr-incdir=/home/mount/NAS/xd/tools/libmsr/ --with-libmsr-libdir=/home/mount/NAS/xd/tools/libmsr/lib

But it still gives an error like this:
checking for init_msr in -lmsr... no

APIs to retrieve individual MSR data

Printing the contents of various MSRs is done with dump_*(), which format the data. Take, for example, dump_rapl_limit(), which prints the 64-bit value of MSR_PKG_POWER_LIMIT, time window 1/2, and power limit 1/2 in a list. Instead of printing all three items, the user may want to just sample power limit 1/2 (along with some other metrics), and print it in csv format.

It is not trivial for a user to print a string of desired data values in CSV-format with the current APIs. Add capability to retrieve individual members from MSR data structs.

Macro redefinition warning of "UINT_MAX"

In "msr_rapl.h" there is the macro definition:

#define UINT_MAX 4294967295U

The macro definition is taken from "limits.h" (as the comment says), but if I need to use "limits.h" (or a library that includes "limits.h", for instance the HWLOC library), this macro raises a warning because the compiler detects a macro redefinition.
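
A common way to avoid this kind of warning is to include limits.h directly and keep the local definition only as a guarded fallback. This is a sketch of that pattern, not the change actually made in msr_rapl.h; largest_unsigned is a hypothetical function added only to show the macro in use.

```c
#include <limits.h>

/* Fallback only: limits.h already defines UINT_MAX, so this block is
 * skipped and no redefinition warning is raised. */
#ifndef UINT_MAX
#define UINT_MAX 4294967295U
#endif

/* Hypothetical example use of the macro. */
unsigned int largest_unsigned(void)
{
    return UINT_MAX;
}
```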
