pbench's Introduction

Pbench

A Benchmarking and Performance Analysis Framework

The code base includes three sub-systems. The first is the collection agent, Pbench Agent, responsible for collecting configuration data for test systems, managing the collection of performance tool data from those systems (sar, vmstat, perf, etc.), and executing and postprocessing standardized or arbitrary benchmark workloads (uperf, fio, linpack, as well as real system activity).

The second sub-system is the Pbench Server, which is responsible for archiving result tar balls and providing a secure RESTful API to client applications, such as the Pbench Dashboard. The API supports curation of results data, the ability to annotate results with arbitrary metadata, and to explore the results and collected data.

The third sub-system is the Pbench Dashboard, which provides a web-based GUI for the Pbench Server allowing users to list and view public results. After logging in, users can view their own results, publish results for others to view, and delete results which are no longer of use. On the User Profile page, a logged-in user can generate API keys for use with the Pbench Server API or with the Agent pbench-results-move command. The Pbench Dashboard also serves as a platform for exploring and visualizing result data.

How is it installed?

Instructions for installing pbench-agent can be found in the Pbench Agent Getting Started Guide.

For Fedora, CentOS, and RHEL users, we have made available COPR RPM builds for the pbench-agent and some benchmark and tool packages.

You might want to consider browsing through the rest of the documentation.

You can also use podman or docker to pull Pbench Agent containers from Quay.io.
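
For example, a minimal pull might look like the following (the repository path and tag here are illustrative only; check Quay.io for the images that are actually published):

$ podman pull quay.io/pbench/pbench-agent-all-centos-8:latest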

How do I use pbench?

Refer to the Pbench Agent Getting Started Guide.

TL;DR? See "TL;DR - How to set up the pbench-agent and run a benchmark" in the main documentation for a super quick set of introductory steps.

Where is the source kept?

The latest source code is at https://github.com/distributed-system-analysis/pbench.

Is there a mailing list for discussions?

Yes, we use Google Groups.

How do I report an issue?

Please use GitHub's issues.

Is there a place to track current and future work items?

Yes, we are using GitHub Projects. You will find projects covering the Agent, Server, and Dashboard, as well as a project named after the current milestone.

How can I contribute?

Below are some simple steps for setting up a development environment for working with the Pbench code base. For more detailed instructions on the workflow and process of contributing code to Pbench, refer to the Guidelines for Contributing.

Getting the Code

$ git clone https://github.com/distributed-system-analysis/pbench
$ cd pbench

Running the Unit Tests

Install tox properly in your environment (Fedora/CentOS/RHEL):

$ sudo dnf install -y perl-JSON python3-pip python3-tox

Once tox is installed you can run the unit tests against different versions of Python using the Python environment short-hands:

  • tox -e py36 -- run all tests in a Python 3.6 environment (our default)
  • tox -e py39 -- run all tests in a Python 3.9 environment
  • tox -e py310 -- run all tests in a Python 3.10 environment
  • tox -e pypy3 -- run all tests in a PyPy 3 environment
  • tox -e pypy3.8 -- run all tests in a PyPy 3.8 environment

See https://tox.wiki/en/latest/example/basic.html#a-simple-tox-ini-default-environments.

You can provide arguments to the tox invocation to request sub-sets of the available tests be run.

For example, if you want to just run the agent or server tests, you'd invoke tox as follows:

  • tox -- agent -- runs only the agent tests
  • tox -- server -- runs only the server tests

Each of the "agent" and "server" tests can be further subsetted as follows:

  • agent

    • python -- runs the python tests (via pytest)
    • legacy -- runs all the legacy tests
    • datalog -- runs only the legacy tool data-log tests, agent/tool-scripts/datalog/unittests
    • postprocess -- runs only the legacy tool/bench-scripts post-processing tests, agent/tool-scripts/postprocess/unittests
    • tool-scripts -- runs only the legacy tool-scripts tests, agent/tool-scripts/unittests
    • util-scripts -- runs only the legacy util-scripts tests, agent/util-scripts/unittests
    • bench-scripts -- runs only the legacy bench-scripts tests, agent/bench-scripts/unittests
  • server

    • python -- runs the python tests (via pytest)

For example:

  • tox -- agent legacy -- run agent legacy tests
  • tox -- server python -- run server python tests (via pytest)

For any of the test sub-sets on either the agent or server sides of the tree, one can pass additional arguments to the specific sub-system test runner. This allows one to request a specific test or set of tests, or to pass command-line parameters that modify the test behavior:

  • tox -- agent bench-scripts test-CL -- run bench-scripts' test-CL
  • tox -- server python -v -- run server python tests verbosely

For the agent/bench-scripts tests, one can run entire sub-sets of tests using a sub-directory name found in agent/bench-scripts/tests. For example:

  • tox -- agent bench-scripts pbench-fio
  • tox -- agent bench-scripts pbench-uperf pbench-linpack

The first runs all the pbench-fio tests, while the second runs all the pbench-uperf and pbench-linpack tests.

You can run the build.sh script to execute the linters, to run the unit tests for the Agent, Server, and Dashboard code, and to build installations for the Agent, Server, and Dashboard.

Finally, see the jenkins/Pipeline.gy file for how the unit tests are run in our CI jobs.

Python formatting

This project uses the flake8 method of code style enforcement, linting, and checking.

All python code contributed to pbench must match the style requirements. These requirements are enforced by the pre-commit hook. In addition to flake8, pbench uses the black Python code formatter and the isort Python import sorter.
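
If you want to run the same checks by hand before committing, the corresponding tools can be invoked directly (shown here as a sketch; the exact options come from the repository's configuration files):

$ black .
$ isort .
$ flake8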

Use pre-commit to set automatic commit requirements

This project makes use of pre-commit to do automatic lint and style checking on every commit containing Python files.

To install the pre-commit hook, run the executable from your Python 3 framework while in your current pbench git checkout:

$ cd ~/pbench
$ pip3 install pre-commit
$ pre-commit install --install-hooks

Once installed, all commits will run the test hooks. If your changes fail any of the tests, the commit will be rejected.
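
You can also run the hooks on demand against the whole tree, which is handy before pushing a large change:

$ pre-commit run --all-files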

Pbench Release Tag Scheme (GitHub)

We employ a simple major, minor, release, build (optional) scheme for tagging starting with the v0.70.0 release (v<Major>.<Minor>.<Release>[-<Build>]). Prior to the v0.70.0 release, the scheme used was mostly v<Major>.<Minor>, where we only had minor releases (Major = 0).

Container Image Tags

This same GitHub "tag" scheme is used with tags applied to container images we build, with the following exceptions for tag names:

  • latest - always points to the latest released container image pushed to a repository

  • v<Major>-latest - always points to the "latest" Major released image

  • v<Major>.<Minor>-latest - always points to the "latest" release for Major.Minor released images

  • <SHA1 git hash> (9 characters) - commit hash of the checked out code
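
For illustration, a single released image might carry a set of tags along these lines (the version numbers and hash below are hypothetical):

v0.72.0        # exact release
v0.72-latest   # newest v0.72.x image
v0-latest      # newest v0.x image
latest         # newest released image overall
1a2b3c4d5      # 9-character commit hash of the checked out code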

References to Container Image Repositories

The operation of our functional tests, the Pbench Server "in-a-can" used in the functional tests, and other verification and testing environments use container images from remote image registries. The CI jobs obtain references to those repositories using Jenkins credentials. When running those same jobs locally, you can provide the registry via ${HOME}/.config/pbench/ci_registry.name.

If this file is not provided, local execution will report an error.
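
A minimal setup might look like this (the registry value shown is only an example; substitute whichever registry hosts your images):

$ mkdir -p ${HOME}/.config/pbench
$ echo "quay.io/pbench" > ${HOME}/.config/pbench/ci_registry.name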

pbench's People

Contributors

anishaswain, aquibbaig, arzoo14, atheurer, bengland2, chaitanyaenr, cronburg, dbutenhof, ekuric, gurbirkalsi, hifzakh, jeremyeder, jmencak, k-rister, ldoktor, maxusmusti, mffiedler, mvarshini, ndokos, npalaska, orpiske, portante, riya-17, robertkrawitz, shubham-html-css-js, siddardh-ra, sourabhtk37, tenstormavi, vishalvvr, webbnh

pbench's Issues

clear-results doesn't clean results from clients

kill-tools and clear-tools clear tools from both the client and the server, but clear-results clears only the server; it does not clear results from the clients, so that has to be done manually every time.

For example, in my experiment I need to run fio from the host on 16 VMs, and every time I need to clear the results on the VMs using scripts. It would be good to have pbench clean results on the clients too.

pprof-datalog only supports OSE...should also support origin.

Only the service names and /etc/sysconfig locations differ. They should be conditionalized.
/cc @ekuric

diff -pruN /opt/pbench-agent/tool-scripts/datalog/pprof-datalog.orig /opt/pbench-agent/tool-scripts/datalog/pprof-datalog

--- /opt/pbench-agent/tool-scripts/datalog/pprof-datalog.orig 2016-02-09 12:29:38.079425999 -0500
+++ /opt/pbench-agent/tool-scripts/datalog/pprof-datalog 2016-02-09 12:30:17.701923330 -0500
@@ -1,7 +1,7 @@
 #!/usr/bin/env bash
 
-openshift_master="/etc/sysconfig/atomic-openshift-master"
-openshift_node="/etc/sysconfig/atomic-openshift-node"
+openshift_master="/etc/sysconfig/origin-master"
+openshift_node="/etc/sysconfig/origin-node"
 
 profile="$1"
 osecomponent="$2"
@@ -12,18 +12,18 @@ ose_pprof() {
     case "$profile" in
         cpu)
             if grep -q "^OPENSHIFT_PROFILE=cpu" $openshift_master; then
-                systemctl restart atomic-openshift-master.service
+                systemctl restart origin-master.service
             else
                 echo "OPENSHIFT_PROFILE=cpu" >> $openshift_master
-                systemctl restart atomic-openshift-master.service
+                systemctl restart origin-master.service
             fi
         ;;
         mem)
             if grep -q "^OPENSHIFT_PROFILE=mem" $openshift_master; then
-                systemctl restart atomic-openshift-master.service
+                systemctl restart origin-master.service
             else
                 echo "OPENSHIFT_PROFILE=mem" >> $openshift_master
-                systemctl restart atomic-openshift-master.service
+                systemctl restart origin-master.service
             fi
         ;;
     esac
@@ -32,18 +32,18 @@ ose_pprof() {
     case "$profile" in
         cpu)
             if grep -q "^OPENSHIFT_PROFILE=cpu" $openshift_node; then
-                systemctl restart atomic-openshift-node.service
+                systemctl restart origin-node.service
             else
                 echo "OPENSHIFT_PROFILE=cpu" >> $openshift_node
-                systemctl restart atomic-openshift-node.service
+                systemctl restart origin-node.service
             fi
         ;;
         mem)
             if grep -q "^OPENSHIFT_PROFILE=mem" $openshift_master; then
-                systemctl restart atomic-openshift-node.service
+                systemctl restart origin-node.service
             else
                 echo "OPENSHIFT_PROFILE=mem" >> $openshift_node
-                systemctl restart atomic-openshift-node.service
+                systemctl restart origin-node.service
             fi
         ;;
     esac

Tool RPMs availability and naming

We build RPMs internally for tools that pbench uses. In most cases, they are built from upstream bits without any changes. We could make those available externally through COPR.

In some cases, we might need to patch the upstream.

In all cases, we want to prepend "pbench-" to the RPM name to avoid conflicts with any system provided ones.

Develop an "all-in-one" environment for development and testing

It is pretty clear that, both for development and for simple example use / kick-the-tires testing, we need an all-in-one environment for deploying the agent, background server tasks, web server, etc.

With such an environment we could automate builds and unit tests for integration with TravisCI, and other tools.

sar fails while running pbench with multiple VMs

running fio job: /var/lib/pbench-agent/fio_multi-vm-lvm-cache:none-io:native-disk:hdd-fs:lvm-prealloc-iodepth-1-jobs-32-ioeng:sync-profile:latency-perf-full-run-with-stefan-rhel72-patch-without-perf-ag-vcpu-2-run:1_2015-10-29_01:31:51/1-randread-4KiB/sample1/fio.job
/opt/pbench-agent/tool-scripts/sar: line 168: kill: (28146) - No such process
/opt/pbench-agent/tool-scripts/sar: line 168: kill: (11246) - No such process
/opt/pbench-agent/tool-scripts/sar: line 168: kill: (11298) - No such process
/opt/pbench-agent/tool-scripts/sar: line 168: kill: (11237) - No such process
/opt/pbench-agent/tool-scripts/sar: line 168: kill: (11327) - No such process
/opt/pbench-agent/tool-scripts/sar: line 168: kill: (11378) - No such process
/opt/pbench-agent/tool-scripts/sar: line 168: kill: (11266) - No such process
/opt/pbench-agent/tool-scripts/sar: line 168: kill: (11354) - No such process
/opt/pbench-agent/tool-scripts/sar: line 168: kill: (11270) - No such process
/opt/pbench-agent/tool-scripts/sar: line 168: kill: (11377) - No such process
/opt/pbench-agent/tool-scripts/sar: line 168: kill: (11417) - No such process
/opt/pbench-agent/tool-scripts/sar: line 168: kill: (11427) - No such process
/opt/pbench-agent/tool-scripts/sar: line 168: kill: (11340) - No such process
/opt/pbench-agent/tool-scripts/sar: line 168: kill: (11464) - No such process
/opt/pbench-agent/tool-scripts/sar: line 168: kill: (11402) - No such process
/opt/pbench-agent/tool-scripts/sar: line 168: kill: (11376) - No such process
/opt/pbench-agent/tool-scripts/sar: line 168: kill: (11493) - No such process
Use of uninitialized value $line in scalar chomp at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 97.
Use of uninitialized value $line in pattern match (m//) at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 98.
[virbr0-122-85]Use of uninitialized value $line in scalar chomp at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 97.
[virbr0-122-85]Use of uninitialized value $line in pattern match (m//) at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 98.
[virbr0-122-87]Use of uninitialized value $line in scalar chomp at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 97.
[virbr0-122-87]Use of uninitialized value $line in pattern match (m//) at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 98.
[virbr0-122-98]Use of uninitialized value $line in scalar chomp at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 97.
[virbr0-122-98]Use of uninitialized value $line in pattern match (m//) at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 98.
[virbr0-122-86]Use of uninitialized value $line in scalar chomp at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 97.
[virbr0-122-86]Use of uninitialized value $line in pattern match (m//) at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 98.
[virbr0-122-84]Use of uninitialized value $line in scalar chomp at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 97.
[virbr0-122-84]Use of uninitialized value $line in pattern match (m//) at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 98.
[virbr0-122-95]Use of uninitialized value $line in scalar chomp at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 97.
[virbr0-122-95]Use of uninitialized value $line in pattern match (m//) at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 98.
[virbr0-122-90]Use of uninitialized value $line in scalar chomp at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 97.
[virbr0-122-90]Use of uninitialized value $line in pattern match (m//) at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 98.
[virbr0-122-88]Use of uninitialized value $line in scalar chomp at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 97.
[virbr0-122-88]Use of uninitialized value $line in pattern match (m//) at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 98.
[virbr0-122-91]Use of uninitialized value $line in scalar chomp at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 97.
[virbr0-122-91]Use of uninitialized value $line in pattern match (m//) at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 98.
[virbr0-122-89]Use of uninitialized value $line in scalar chomp at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 97.
[virbr0-122-89]Use of uninitialized value $line in pattern match (m//) at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 98.
[virbr0-122-96]Use of uninitialized value $line in scalar chomp at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 97.
[virbr0-122-96]Use of uninitialized value $line in pattern match (m//) at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 98.
[virbr0-122-94]Use of uninitialized value $line in scalar chomp at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 97.
[virbr0-122-94]Use of uninitialized value $line in pattern match (m//) at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 98.
[virbr0-122-184]Use of uninitialized value $line in scalar chomp at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 97.
[virbr0-122-184]Use of uninitialized value $line in pattern match (m//) at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess

Include method to derive efficiency metrics for any pbench benchmark

This issue tracks computing efficiency metrics, to be used on any of the pbench benchmark scripts. Currently we have efficiency metrics for pbench_uperf only, and these are Gb-sec/CPU and transactions-sec/CPU. We want to modularize the process, in order to create any "work/resource" metric for any of the benchmarks.

Tool names are too generic

The executable names in pbench are too generic, especially if you have an RPM install. I'd suggest the tools have a pbench- prefix, such as pbench-register-tool-set. There's also the possibility of having a 'shell' command with subcommands, so the command would turn into pbench register-tool-set.

sar doesn't start while running pbench_fio

sar doesn't start in some cases while running pbench_fio. While these tests were in progress, I checked for sar on the machine and it was not started, so the kill fails once the test is done. It's inconsistent; in some cases it works fine.

Failed tests:

running fio job: /var/lib/pbench-agent/fio_single-vm-ext4-cache:none-io:native-disk:hdd-img:qcow2-preallocate:falloc-fs:ext4-iodepth-1-jobs-32-ioeng:sync-profile:throughput-without-perf-full-run-with-stefan-rhel72-patch-ag-vcpu-2-run:1_2015-10-30_04:16:18/7-write-16384KiB/sample5/fio.job
/opt/pbench-agent/tool-scripts/sar: line 168: kill: (36157) - No such process
/opt/pbench-agent/tool-scripts/sar: line 168: kill: (30111) - No such process
Use of uninitialized value $line in scalar chomp at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 97.
Use of uninitialized value $line in pattern match (m//) at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 98.
[virbr0-122-84]Use of uninitialized value $line in scalar chomp at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 97.
[virbr0-122-84]Use of uninitialized value $line in pattern match (m//) at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 98.
fio job complete

running fio job: /var/lib/pbench-agent/fio_single-vm-ext4-cache:none-io:native-disk:hdd-img:qcow2-preallocate:falloc-fs:ext4-iodepth-1-jobs-32-ioeng:sync-profile:throughput-without-perf-full-run-with-stefan-rhel72-patch-ag-vcpu-2-run:1_2015-10-30_04:16:18/12-read-1024KiB/sample1/fio.job

/opt/pbench-agent/tool-scripts/sar: line 168: kill: (3450) - No such process

/opt/pbench-agent/tool-scripts/sar: line 168: kill: (12532) - No such process

Use of uninitialized value $line in scalar chomp at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 97.
Use of uninitialized value $line in pattern match (m//) at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 98.

[virbr0-122-84]Use of uninitialized value $line in scalar chomp at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 97.
[virbr0-122-84]Use of uninitialized value $line in pattern match (m//) at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 98.

fio job complete

running fio job: /var/lib/pbench-agent/fio_single-vm-ext4-cache:none-io:native-disk:hdd-img:qcow2-preallocate:falloc-fs:ext4-iodepth-1-jobs-32-ioeng:sync-profile:throughput-without-perf-full-run-with-stefan-rhel72-patch-ag-vcpu-2-run:1_2015-10-30_04:16:18/12-read-1024KiB/sample2/fio.job

/opt/pbench-agent/tool-scripts/sar: line 168: kill: (6283) - No such process

/opt/pbench-agent/tool-scripts/sar: line 168: kill: (13982) - No such process

Use of uninitialized value $line in scalar chomp at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 97.
Use of uninitialized value $line in pattern match (m//) at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 98.

[virbr0-122-84]Use of uninitialized value $line in scalar chomp at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 97.
[virbr0-122-84]Use of uninitialized value $line in pattern match (m//) at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 98.

fio job complete
The following jobfile was created: /var/lib/pbench-agent/fio_single-vm-ext4-cache:none-io:native-disk:hdd-img:qcow2-preallocate:falloc-fs:ext4-iodepth-1-jobs-32-ioeng:sync-profile:throughput-without-perf-full-run-with-stefan-rhel72-patch-ag-vcpu-2-run:1_2015-10-30_04:16:18/12-read-1024KiB/sample3/fio.job

running fio job: /var/lib/pbench-agent/fio_single-vm-ext4-cache:none-io:native-disk:hdd-img:qcow2-preallocate:falloc-fs:ext4-iodepth-1-jobs-32-ioeng:sync-profile:throughput-without-perf-full-run-with-stefan-rhel72-patch-ag-vcpu-2-run:1_2015-10-30_04:16:18/12-read-1024KiB/sample3/fio.job

/opt/pbench-agent/tool-scripts/sar: line 168: kill: (9006) - No such process
/opt/pbench-agent/tool-scripts/sar: line 168: kill: (15615) - No such process

Use of uninitialized value $line in scalar chomp at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 97.
Use of uninitialized value $line in pattern match (m//) at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 98.

[virbr0-122-84]Use of uninitialized value $line in scalar chomp at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 97.
[virbr0-122-84]Use of uninitialized value $line in pattern match (m//) at /opt/pbench-agent/tool-scripts/postprocess/sar-postprocess line 98.

fio job complete

Number of open files should be bumped up

In some cases, a user can run up against the open file limit (the soft limit seems to be 1024, and the hard limit is 4096 on my F21 box).

A root user can bump it up though, so we might want to add that to pbench-base.sh:

ulimit -n 1048576

N.B. "unlimited" or any number > 2^20 does not seem to work on F21, but YMMV.

Index fio disk stats JSON data found in fio-result.txt

Let's consider indexing the JSON data generated by fio as stored in the fio-result.txt file for a given sample.

This should be fairly straightforward since the data is already in JSON form, but we need to add the right metadata so that we can find the document algorithmically.
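
As a quick illustration of how accessible the data already is, the per-job statistics can be pulled out with jq, assuming fio-result.txt holds the plain fio JSON document:

$ jq '.jobs[] | {jobname: .jobname, read_iops: .read.iops, write_iops: .write.iops}' fio-result.txt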

pbench_cyclictest does not work on F22

It's looking for a specific version of the package rt-tests - that fails on F22.

I installed the currently available version of rt-tests (4.2.12-1.fc22)
but there is no command named "cyclictest" in the package.

Is the benchmark obsolete?

Restore indexing unit tests 7.6 and 7.7

I had to comment these two out of the server/pbench/bin/unittests script after a small cascade of problems: the travis-ci build broke, and I fixed that for now, but the fix caused the above tests to fail because the pretty-printing of the structures being compared is slightly different in the travis-ci environment, causing spurious failures. After a failed effort to fix that, I commented the two failing tests out and will revisit them next week, after we get the pbench-agent release out (which is not affected at all by these two tests).

percentage metrics (*_pct) being plotted along with normally ranged metrics

screenshot_1

screenshot_2

In screenshot_1, notice that vmeff_pct (which ranges 0-100) is included with other metrics like page_swaps_in/out_sec, pgscand/k_sec, and so on (which range up to roughly 2,000,000).

Similarly, in screenshot_2, notice the range of the parameter memused_pct (which varies 0-100).

Ideally, these should be plotted separately, perhaps behind a toggle button which, when clicked, shows all the percentage stats. The ranges aren't meant to be mixed, since mixing them creates confusion for the user.

Review and obsolete or create issues for TODO items

There used to be a ./doc/TODO tracking a few items that need to be addressed in the pbench code base. The contents of that document are provided in this issue below. We should review them and create individual issues where appropriate, dropping any items that don't make sense anymore.

TODO

Included here are various items which at some point should be done for pbench.

General

job processor

Currently pbench is usually run in a terminal. This is fine for single system use, but does not work well for multi-system tests. We need a way to process job files, so we can (1) not maintain a terminal and (2) submit the same job file to many systems. We most likely need a daemon which waits for new job files and processes a job file once one appears. We could scan a local directory for new files (inotify) and/or periodically check a http/ftp/nfs location for new jobs. Job files could simply be bash scripts, or we could process them ourselves, exec'ing each line. Having pbench process the file might have some advantages in that some state could be saved (variable defs) if the job file should issue a reboot command (and the pbench daemon would resume on boot, processing the remainder of the job file). Running bash scripts, on the other hand, won't survive a reboot, but it would be far easier to implement. With either of these, they would probably be run within a screen session so one could attach and watch.

Utils

restrict

If we plan to write a single job file or bash script which gets distributed to many systems, we need the ability to allow different systems to run different things, even with the same job file. This is not really difficult with an if statement and hostname checking, but something more convenient would be nice. "restrict" would be a utility which takes a list of hostnames or IPs, followed by a command. In this situation, every system with the same job file [that has a restrict command] runs the restrict script, and if its hostname matches the list, then it gets to run the command. For example:

#!/bin/bash
restrict client1 uperf --mode=client --server=server1
restrict server1 uperf --mode=server
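
A minimal sketch of what such a restrict helper could look like (illustrative only; the comma-separated host list and the use of the short hostname are assumptions):

#!/bin/bash
# restrict: run the given command only if this host appears in the host list.
# Usage: restrict host1[,host2,...] command [args...]
hosts="$1"; shift
me="$(hostname -s)"
case ",${hosts}," in
    *",${me},"*) exec "$@" ;;
    *) exit 0 ;;
esac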

sync

When running a test with many different hosts, VMs, or containers executing, we often need to synchronize certain things, so we can get repeatable results and have high confidence that what we want to happen is actually happening at the right time. This is also required when one system needs to set up a service (web, etc) before a client system tests that service. A sync is used to wait until the server is done setting up the service.

To do this, we need a "sync" command. All systems using the same job file would call sync with (1) the sync label, "this_sync", and (2) a list of systems who have to participate in the sync. The sync utility will wait until all members in the list are running the sync command for the particular sync name. Once all members have executed the command, then and only then do they get to exit the sync command and resume. An implementation will certainly require networking to make this happen.

An example of job file might be

#!/bin/bash
restrict server1 start-web-service
sync web-ready server1 client1
restrict client1 benchmark-web-server
sync web-test-complete server1 client1
restrict server1 stop-web-service

A sync command may also be within a benchmark script. This may be needed when several systems need to execute the same test, and each iteration in the test must start at the same time for all systems. A benchmark script (in /opt/pbench-agent/benchmark-scripts) may have something like:

for iteration in `seq 1 10`; do
    sync $benchmark-$iteration $systems_running_benchmark
    start-tools
    benchmark-command $benchmark-options
    stop-tools
done

Note that the $systems_running_benchmark above may be the same list that was used in a restrict command in the user's job file to call the benchmark. The benchmark would also have to process an option which provided this list.
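
As a rough sketch of the idea, a barrier could be built over a shared directory; a real implementation would need proper networking as noted above, and the path and hostname handling here are assumptions:

#!/bin/bash
# sync: crude barrier sketch -- every participant drops a marker file into a
# shared directory and waits until all listed hosts have done the same.
# Usage: sync <label> <host1> <host2> ...
label="$1"; shift
barrier="/var/lib/pbench-agent/sync/${label}"   # assumed to be shared storage
mkdir -p "$barrier"
touch "${barrier}/$(hostname -s)"
for host in "$@"; do
    until [ -e "${barrier}/${host}" ]; do
        sleep 1
    done
done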

Tools

mpstat

For post-processing, group cpu graphs according to the system topology, with an average graph for system, then a section for each node containing an average graph for nodeX, then individual cpuid graphs with sibling hyperthread cpus paired together:

  [system average graph]

  [node0 average graph]
  [coreid0-threadid0][coredid0-threadid1]
  [coreid1-threadid0][coredid1-threadid1]
  [coreid2-threadid0][coredid2-threadid1]
  [coreid3-threadid0][coredid3-threadid1]

  [node1 average graph]
  [coreid0-threadid0][coredid0-threadid1]
  [coreid1-threadid0][coredid1-threadid1]
  [coreid2-threadid0][coredid2-threadid1]
  [coreid3-threadid0][coredid3-threadid1]

Benchmarks

uperf

Before running any tests, run a very quick test (any type really) to make sure the client can contact the server. If this fails, abort the whole thing.

perf-report-consolidated.txt not generated

When the perf tool does post-processing, it creates a consolidated report in which all of the samples for a unique function, but from different PIDs, are summed. This is currently broken.

Vastly simplify the process to add a new benchmark to pbench

This documents development of a new feature for pbench. We need a way to integrate new benchmarks that is far easier than the current process.

Today we write a new benchmark script for each benchmark. A pbench benchmark script has to do the following:

  1. Process cmd line options for benchmark like runtime, test-types, etc. Some of these are common across benchmarks and some are benchmark specific.
  2. Installation of benchmark binary, on local and/or remote systems.
  3. Create a list of benchmark iterations to run. Each iteration describes the execution of the benchmark with specific options. This is typically based on $test_types, and other benchmark specific options like $message_sizes (uperf) $block_sizes (fio), $threads (dbench). The list of iterations to run is usually a matrix of these options.
  4. Collect system information before benchmark execution.
  5. Start any server process the benchmark may use (can be more than one if --clients has more than one system)
  6. Run the N benchmark iterations sequentially. A benchmark may run just one instance of the benchmark iteration, or many instances, depending on the options for --clients and --servers (some benchmarks use both, others may use only --clients). The benchmark script runs these instances sequentially:
    6A) Start any server process the benchmark may use. Some benchmarks do not use this. Those that do may involve many servers.
    6B) Create a benchmark config/job file if needed (xml for uperf, job file to fio)
    6C) Copy config/job file to client systems (may be only local system or many systems)
    6D) Call start-tools
    6E) Synchronized start of all clients, and wait until complete
    6F) Call stop-tools
    6G) Stop the server process(es)
    6H) Collect benchmark result data from all clients and servers
    6I) Post-process result data (may involve generating new metrics, like throughput/resource). This is usually done by calling a benchmark-specific post-processing script.
    6J) Post-process tool data
  7. Generate a summary of all benchmark iterations in txt, csv, and html formats
  8. Collect system information after benchmark execution

Writing a new benchmark script can require up to 1000 lines, and much of this is just duplicated from previous benchmark scripts. Having many copies of similar code ends up being inefficient to maintain. When adding a benchmark to pbench, we need to find a way to provide only the data needed specific to the new benchmark. Pbench needs to process this data and run the benchmark. There should only be one script needed in pbench that can use this benchmark specific data and run the benchmark.

Perhaps not every single benchmark can be done this way, but we should try to get the vast majority of them.

fio: server bad crc on payload

While running pbench_fio on KVM VMs concurrently, I'm noticing this especially with:
jobs=32
iodepth=1

fio: server bad crc on payload (got 0, wanted 6b2a)
fio: server bad crc on payload (got 0, wanted e4e2)
fio: fragment opcode mismatch (6 != 9)
fio: fragment opcode mismatch (6 != 9)
fio: fragment opcode mismatch (6 != 9)

fio job complete

Generating benchmark summary JSON data for indexing

From Andrew Theurer:

Guys, if you have a chance, take a look at the attached files[1]:

These are generated with a new benchmark-summary script, which all benchmarks will eventually use. Currently I have uperf using this in a git branch of mine.

What's new here are the summary-result.* files, including the JSON format. The HTML format also now uses an HTML table.

I think the JSON format should work for Elasticsearch. It was based on our conversation way back, with some minor tweaks.

[1]archive.zip

Include turbostat in our build

Add turbostat in our build process, so we can control exactly which turbostat is used. This will work in the same manner that we use sysstat utils (sar, mpstat, iostat, pidstat).

Also always include --debug in the invocation of turbostat

@jeremyeder please let me know if this works for you. We'll try to get this in ASAP

iostat and perf prematurely die when running inside a pod on OpenShift v3

Running pbench_fio inside a pod on OSE v3, once the job ends it does not collect the iostat/perf data.
The error messages are shown below:

The following jobfile was created: /var/lib/pbench-agent/fio_12E_2015-11-05_09:52:44/2-read-64KiB/sample2/fio.job
[global]
bs=64k
ioengine=libaio
iodepth=32
direct=1
sync=0
time_based=1
runtime=30
clocksource=gettimeofday
ramp_time=5
[job1]
rw=read
filename=/var/lib/docker/fiotest/fiotest
size=4096M
write_bw_log=fio
write_iops_log=fio
write_lat_log=fio
log_avg_msec=1000
running fio job: /var/lib/pbench-agent/fio_12E_2015-11-05_09:52:44/2-read-64KiB/sample2/fio.job
/opt/pbench-agent/tool-scripts/iostat: line 168: kill: (1456) - No such process
/opt/pbench-agent/tool-scripts/perf: line 139: kill: (1705) - No such process
fio job complete
The following jobfile was created: /var/lib/pbench-agent/fio_12E_2015-11-05_09:52:44/2-read-64KiB/sample3/fio.job

Name pbench_fio summary files based on the environment where they ran

It is necessary to have the summary files (summary-results.txt/csv) and the operations written inside them named differently based on the environment where they ran. For example:

Now:
summary-results.txt
and inside there are
1-read-4KiB
....
18-randrw-1024KiB

Proposed:
summary-results_baremetal.txt
1-read-4KiB_baremetal
18-randrw-1024KiB_baremetal

This will help to search/grep results based on where they were run, for the case when there are multiple files/runs.

Currently, summary files get the same name for every test case.

We need some efficiency metrics for fio tests

Often using a result metric based on throughput is not enough to compare one test to another, because the bottleneck may come from a hard limit like network link speed or drive speed. Having an alternative metric that shows efficiency helps us better understand the result. For example, IOPS/cpu, or Mbps/CPU. We need to add some metrics like this for fio.
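
For example, a throughput-per-CPU number could be derived from the fio result and the CPU utilization collected by the tools; the values below are purely hypothetical:

# Illustrative only: 50000 IOPS at an average of 4 CPUs consumed
$ awk 'BEGIN { printf "%.1f IOPS/CPU\n", 50000 / 4 }'
12500.0 IOPS/CPU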

Run pbench as other than the root user

While working on getting pbench to run as a user which doesn't exist on other remote nodes, I was able to get a tool to register; however, pbench now displays an additional host named "root".

Ideally, I could set what user to run the tool under.

[stack@manager ~]$ register-tool --name=mpstat [email protected]
[[email protected]]Package pbench-sysstat-11.1.2-32.el7.centos.x86_64 already installed and latest version
[[email protected]]mpstat tool is now registered in group default
[stack@manager ~]$ list-tools
Permission denied (publickey,gssapi-keyex,gssapi-with-mic).
Permission denied (publickey,gssapi-keyex,gssapi-with-mic).
ssh: Could not resolve hostname root: Name or service not known
default: root[],192.0.2.12[],192.0.2.11[]

Yum failure prevented pbench from registering a tool

This didn't cause any casualties until one of my machines added a repo while automation was running, and some of my pbench output only has the vmstat tool rather than all of the tools I was expecting.

This is what prevented the tools (in this case mpstat) from registering:

# register-tool --name=mpstat -- --interval=1
https://cdn.redhat.com/content/dist/rhel/server/7/7Server/x86_64/sjis/os/repodata/repomd.xml: [Errno 14] HTTPS Error 404 - Not Found
Trying other mirror.


 One of the configured repositories failed (Red Hat Enterprise Linux for S-JIS (RHEL 7 Server) (RPMs)),
 and yum doesn't have enough cached data to continue. At this point the only
 safe thing yum can do is fail. There are a few ways to work "fix" this:

     1. Contact the upstream for the repository and get them to fix the problem.

     2. Reconfigure the baseurl/etc. for the repository, to point to a working
        upstream. This is most often useful if you are using a newer
        distribution release than is supported by the repository (and the
        packages for the previous distribution release still work).

     3. Disable the repository, so yum won't use it by default. Yum will then
        just ignore the repository until you permanently enable it again or use
        --enablerepo for temporary usage:

            yum-config-manager --disable rhel-sjis-for-rhel-7-server-rpms

     4. Configure the failing repository to be skipped, if it is unavailable.
        Note that yum will try to contact the repo. when it runs most commands,
        so will have to try and fail each time (and thus. yum will be be much
        slower). If it is a very temporary problem though, this is often a nice
        compromise:

            yum-config-manager --save --setopt=rhel-sjis-for-rhel-7-server-rpms.skip_if_unavailable=true

failure: repodata/repomd.xml from rhel-sjis-for-rhel-7-server-rpms: [Errno 256] No more mirrors to try.
https://cdn.redhat.com/content/dist/rhel/server/7/7Server/x86_64/sjis/os/repodata/repomd.xml: [Errno 14] HTTPS Error 404 - Not Found
For some reason this tool could not be installed

To fix this I had to remove the broken repo (subscription-manager repos --disable=$bad_repo). Unfortunately it is unclear why that repo decided to show up on one of my machines over the weekend. All I could find related to that is this BZ comment: https://bugzilla.redhat.com/show_bug.cgi?id=1194899#c7

Prior to this failure, I had successful runs of pbench on the same machine with the other tools (mpstat and others). I am certain the tool was already installed, and the yum failure should not have prevented pbench from registering that particular tool.

Documentation feedback from first external reviewer

From Doug Williams ([email protected]):

Here's some initial comments and suggested edits to pbench-agent.html.

General Comments:

The documentation flow seems to mix basic, intermediate and advanced topics. It may make sense to segregate these topics by complexity. An example flow:

  • Running a pre-packaged benchmark on a single node
  • What data is collected, and selection of tools (default vs optional)
  • Running benchmark on multiple nodes
  • Running user-specified benchmark
  • Adding new benchmark to pbench
  • Adding new tool to pbench

Some of the text is in an informal conversational style using personal pronouns, 'You', 'I'. I've included some proposed edits if you decide to go with a more formal 3rd person style. I view this as a secondary issue.

Section 1: WARNING

DDW COMMENT: Use of 'some' may be unnecessary, may want to consider the following ...

This document may describe future capabilities of pbench.
Currently both code and documentation are undergoing
active development, and while we strive for consistency,
if you find something not working as described here, please
let us know.  It may be a bug in the documentation, a bug in
the code or a feature not yet implemented.

Section 2: What is pbench?

DDW COMMENT: Text seemed a bit confusing, may want to consider the following ...

... Pbench includes built-in scripts supporting many common
benchmarks such as cyclictest, dbench, fio, linpack, migrate,
iozone, netperf, specjbb2005 and uperf.  Options for use of
Pbench with other (non-built-in) benchmarks include:
  o Running pbench in collector-only mode, separately running
     the benchmark
  o Extending pbench through development of a benchmark-specific
     pbench script.
Such contributions are more than welcome!

DDW QUESTION: How does one extend Pbench for additional data collectors?

Section 3: Quick links

DDW COMMENT: Both URLS are currently invalid

- Results Directory - http://pbench.example.com/results/
- pbench RPM repo - http://pbench.example.com/repo

Section 4: TL;DR version

DDW COMMENT: Need to update URL to reflect repo location. Will you be using GitHub as a repo?

wget -O /etc/yum.repos.d/pbench.repo http://repohost.example.com/repo/yum.repos.d/pbench.repo

Section 5: How to install

DDW COMMENT: Similar issue concerning repo URL

http://repohost.example.com/repo/yum.repos.d

and

wget -O /etc/yum.repos.d/pbench.repo http://repohost.example.com/repo/yum.repos.d/pbench.repo

Section 5.1: Updating pbench

DDW COMMENT: Nit, section is written with a conversational style, such as 'I' and 'you'.

Consider:

Since the pbench package and associated benchmark and tools RPMS are
updated frequently, it may be necessary to clean the yum cache in order for
yum to see any new versions.  If during update yum reports no packages to
update, try again after cleaning the cache:

<<Command Sequence>>

It may be necessary to log out and re-login after changes. If the above update
encounters problems, try the following workaround:

<<Command Sequence>>

.....

The workaround should not be necessary if currently installed release is
0.31-95 or later.

.....

When upgrading to a release later than -102, due to changes in label
handling it is necessary to clear out and re-register tools post upgrade.
For example:

<<Command Sequence>>

Section 6: First Steps

DDW COMMENT: Another first-person to 3rd person change ...

...
Built-in benchmarks can be run by invoking the associated pbench_XXX
script
  - pbench will install the benchmark if necessary:
...

Section 6.1: First Steps with user-benchmark

DDW COMMENT: should Section 2 make reference to 6.1 (user-benchmark) in context of 'but the data collection can be run separately as well with a benchmark that is not built-in to pbench ....'

DDW COMMENT: Stylistic change to remove first person

Consider:

A user-benchmark script can be used to run other benchmarks in addition
to the benchmarks pre-packaged with pbench.  user-benchmark takes a
command as argument ...

Section 6.2: First Steps with Remote Hosts and user-benchmark

DDW COMMENT: This section appears to be the first treatment of multi-host benchmarks. My recommendation is that you show an example of a multi-host packaged benchmark, then show an example of a multi-host user-benchmark.

Section 8: Available tools

DDW COMMENT: Stylistic nit, consider the following wording

register-tool-set configures the following tools by default:

DDW COMMENT: Error in last command sequence

Current:

unregister --name=perf
register-tool --name=perf -- --record-opts="record -a --freq=200"

Should read:

unregister-tool --name=perf
register-tool --name=perf -- --record-opts="record -a --freq=200"

Section 9: Available Benchmark Scripts

DDW COMMENT: Which of these benchmarks support multi-host operation?

DDW COMMENT: Stylistic nit

Consider:

Note that in many of these scripts the default tool group is hard-wired:

edits to the appropriate script may be required when using a different tool group

Section 10: Utility Scripts

DDW COMMENT: Stylistic nit

Consider:

This section provides background for the Second steps section below.

Pbench uses utility scripts to do common operations.  Many of the
utility scripts support the following options:
  --name to specify a tool
  --group to specify a tool group
  --with-options to list or pass options to a tool
  --remote to operate on a remote host

See entries in the FAQ section below for more details on these options.

DDW COMMENT: Consider headings 'Tool Registration related Utility Scripts', 'Tool Control related Utility Scripts', 'Results and Post Processing related Utility Scripts', and 'Miscellaneous Utility Scripts'

Section 11: Second Steps

DDW COMMENT: The warning is a bit confusing. If you're recommending against user-benchmarks, then move this content later in the docs. If it's something else, such as ad-hoc scripts, then I would advise moving the treatment of user-created benchmark scripts into an advanced section.

Section 12: Running Pbench Collection Tools with an Arbitrary Benchmark

DDW COMMENT: Should the warning in Section 11 be moved here?
