Artefact Evaluation Reproduction for "Software Prefetching for Indirect Memory Accesses", CGO 2017, using CK.

This repository contains artifacts and workflows to reproduce experiments from the CGO 2017 paper by S. Ainsworth and T. M. Jones:

"Software Prefetching for Indirect Memory Accesses"

Hardware prerequisites

Any of the following architectures:

  • Intel-based
  • ARM64 with 64-bit kernel

Software prerequisites

  • Python 2.7 or 3.3+
  • git client
  • Collective Knowledge Framework (CK) - http://cKnowledge.org
  • All other dependencies will be installed by CK (LLVM 3.9 and plugins)

You can install the above dependencies on Ubuntu via:

$ sudo apt-get install python python-pip git
$ sudo pip install ck
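
To check that CK was installed correctly, you can ask it to print its version (a minimal sanity check, assuming pip placed the ck command on your PATH):

$ ck version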

Installation

You can install this repository via CK as follows:

$ ck pull repo --url=https://github.com/SamAinsworth/reproduce-cgo2017-paper

If you already have CK installed, please update before use:

$ ck pull all

Testing installation

You can compile and run one of the benchmarks (NAS CG) with the LLVM plugin as follows:

$ ck compile program:nas-cg --speed --env.CK_COMPILE_TYPE=auto
$ ck run program:nas-cg
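
For a rough comparison against a build without the prefetching pass, the same benchmark can also be rebuilt with CK_COMPILE_TYPE=no (the available compile types are described under "Customisation" below); this is only a sanity check, not one of the paper's experiments:

$ ck compile program:nas-cg --speed --env.CK_COMPILE_TYPE=no
$ ck run program:nas-cg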

Running experimental workflows (reproducing figures)

Run

$ ck run workflow-from-cgo2017-paper

The script runs the experiments from the paper, in the order of the figures (2, 4-7). At the end of each experiment, times are output, along with the example values we achieved on Haswell (in the case of x86) or on the A57-powered Nvidia TX1 (in the case of ARM64). Though we don't expect the overall times to be similar across different systems, the trends shown in the paper should be largely similar for a given class of microarchitecture.

By default, the script above waits for user input at the end of each experiment. To turn this off, run with the --quiet option:

$ ck run workflow-from-cgo2017-paper --quiet

If any unexpected behaviour is observed, please report it to the authors.

Validation of results

To generate bar graphs of the data, run

$ ck dashboard workflow-from-cgo2017-paper

This will output speedups for the data you have generated, and also graphs for prerecorded data for x86 (Haswell) and aarch64 (A57), but not aarch64 (A53).

Results will also be output to ck-log-reproduce-results-from-cgo2017-paper.txt, in the directory in which you run the workflow.

This file will include the results observed on your machine, and those observed on either Haswell or A57 for reference, depending on your target ISA.
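
For a quick look at the raw numbers without the dashboard, you can simply print this log from the directory in which you ran the workflow:

$ cat ck-log-reproduce-results-from-cgo2017-paper.txt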

While we do not expect absolute values to match, the overall trends, as shown in the related figures (2, 4-7) of the paper, should match up, depending on your microarchitecture.

Please note that the reference results on ARM64 systems, when running on in-order architectures such as the A53, will still be from the A57, so they are not expected to match up directly: you should instead compare against the ratios given in the paper itself.

If anything is unclear, or any unexpected results occur, please report it to the authors.

Manual validation (if problems with CK)

For x86-64:

$ cd script/reproduce-cgo2017-paper

To compile:

$ ./compile_x86.sh

To run:

$ ./run_x86.sh

For ARM64:

Cross-compilation for ARM64 on an x86-64 machine:

$ cd script/reproduce-cgo2017-paper
$ ./compile_aarch64.sh

Running on an ARM64 machine:

$ cd script/reproduce-cgo2017-paper
$ ./run_arm.sh

Recompilation should not be necessary, as all binaries are included, but is provided as an option.

Authors

S. Ainsworth and T. M. Jones

Acknowledgments

This work was supported by the Engineering and Physical Sciences Research Council (EPSRC), through grant references EP/K026399/1 and EP/M506485/1, and ARM Ltd.

Customisation

Our CK integration allows customisation of both benchmarks and settings. New workflows can be added in the style exhibited by module/workflow-from-cgo2017-paper/module.py:

    # Excerpt from module.py: each experiment is a single experiment() call.
    # 'program_uoa' selects the benchmark from cfg, 'env' passes CK_COMPILE_TYPE
    # through to the benchmark's ck_compile.sh, and 'key' names the JSON entry
    # under which the results are output and optionally recorded.
    r=experiment({'host_os':hos, 'target_os':tos, 'device_id':tdid, 'out':oo,
                  'program_uoa':cfg['programs_uoa']['nas-is'],
                  'env':{'CK_COMPILE_TYPE':'no'},
                  'deps':deps,
                  'quiet':q, 'record':rec, 'record_repo_uoa':rruid, 'record_data_uoa':rduid, 'os_abi':os_abi,
                  'title':'Reproducing experiments for Figure 2',
                  'subtitle':'Validating nas-is no prefetching:',
                  'key':'figure-2-nas-is-no-prefetching', 'results':results})

CK_COMPILE_TYPE can be configured as "no", "auto", "auto-nostride" or "man" to run the relevant experiment. The behaviour of each of these is specified in the ck_compile.sh included with each benchmark. Other customisable properties are available depending on the benchmark: see module.py for more details. The program is specified in cfg, the output text in title and subtitle, and new results are output, and optionally stored (--record), as JSON under a new "key".

Similarly, benchmarks can be compiled and run individually, for example:

$ ck compile program:nas-is --speed --env.CK_COMPILE_TYPE=auto
$ ck run program:nas-is
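
To try one of the other compile types listed above on the same benchmark, rerun the same pair of commands with a different value of CK_COMPILE_TYPE, for example the manual-prefetching variant (a sketch, assuming the benchmark's ck_compile.sh provides a "man" target):

$ ck compile program:nas-is --speed --env.CK_COMPILE_TYPE=man
$ ck run program:nas-is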

The software prefetching shared object pass can also be compiled and installed using CK, then used separately:

$ ck install package:plugin-llvm-sw-prefetch-pass
$ . $(ck find $(ck search env --tags=sw-prefetch-pass))/env.sh
$ clang -Xclang -load -Xclang $CK_ENV_PLUGIN_LLVM_SW_PREFETCH_PASS_FILE -O3 ...
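
For instance, a minimal sketch of applying the pass to a standalone source file might look as follows (my_kernel.c and the output name are hypothetical placeholders; the environment variable is set by the env.sh sourced above):

$ clang -Xclang -load -Xclang $CK_ENV_PLUGIN_LLVM_SW_PREFETCH_PASS_FILE -O3 my_kernel.c -o my_kernel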

Troubleshooting

Issues with GLIBCXX_3.4.20/3.4.21 when using LLVM installed via CK: these sometimes occur on earlier Ubuntu versions (14.04) on ARM/x86. They can be fixed by upgrading to a later version of Ubuntu, or sometimes by running:

$ sudo add-apt-repository ppa:ubuntu-toolchain-r/test
$ sudo apt-get update
$ sudo apt-get upgrade
$ sudo apt-get dist-upgrade

Issues with libncursesw.so.6 (not found) on some older machines: these can be fixed by compiling and installing lib-ncurses with support for wide characters. This can be done automatically via CK:

$ ck install package:lib-ncurses-6.0-root

undefined symbol: _ZNK4llvm12FunctionPass17createPrinterPassERNS_11raw_ostreamERKSs when compiling using Clang:

This occurs on some machines, depending on which other libraries are installed. To fix this, run

$ ck install package:plugin-llvm-sw-prefetch-pass -DCK_FORCE_USE_ABI=0
$ ck install package:plugin-llvm-sw-prefetch-no-strides-pass -DCK_FORCE_USE_ABI=0

or

$ ck install package:plugin-llvm-sw-prefetch-no-strides-pass -DCK_FORCE_USE_ABI=1
$ ck install package:plugin-llvm-sw-prefetch-pass -DCK_FORCE_USE_ABI=1

and retry compilation.

If this variable is not specified, the CK build script will try to detect the host machine and set it to 0 on aarch64 and to 1 on anything else; if this fails, try the opposite value.
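
As a concrete check that the problem is resolved, you can then retry the test compilation from the "Testing installation" section above:

$ ck compile program:nas-cg --speed --env.CK_COMPILE_TYPE=auto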


reproduce-cgo2017-paper's Issues

Strange problem with plugin on ARM64

Hi Sam,

I installed your artifacts on a Tegra X1 development board (ARM64) and tried to run your experimental workflow. All packages, including CLANG 3.9 and your plugins, installed and compiled fine. But when I try to compile a benchmark using one of your plugins, I get the following error:

bash ../ck_compile.sh

*** auto ***
clang-3.9 -O3 ../cg.c -Xclang -load -Xclang /home/gfursin/CK-TOOLS/plugin-llvm-sw-prefetch-pass-0.1-llvm-3.9.0-linux-64/lib/SwPrefetchPass.so -c -S -emit-llvm
error: unable to load plugin '/home/gfursin/CK-TOOLS/plugin-llvm-sw-prefetch-pass-0.1-llvm-3.9.0-linux-64/lib/SwPrefetchPass.so':
'/home/gfursin/CK-TOOLS/plugin-llvm-sw-prefetch-pass-0.1-llvm-3.9.0-linux-64/lib/SwPrefetchPass.so: undefined symbol:
_ZNK4llvm12FunctionPass17createPrinterPassERNS_11raw_ostreamERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE'

I didn't have such problems on X64 and ARMv7, so maybe it's a problem with the official CLANG distribution (I use clang+llvm-3.9.0-aarch64-linux-gnu.tar.xz from http://llvm.org/releases).

Did you ever encounter this problem? If I remember correctly, you use a self-compiled CLANG, right? Maybe it's possible to make it compatible with the official CLANG on ARM64? I didn't have time to dig further...

Thanks!

Minor issue with latest version using Python 3

Hi Sam,

Just a note that Python 3 seems not to like tabs.
When I ran the updated dashboard on my Windows machine (to view graphs), I got an error at line 1389 in the file 'module/workflow-from-cgo2017-paper/module.py':

if(not ext.endswith('no-prefetching')):

When I substituted the tabs with spaces in front of the 'if', it started working fine. Would you mind changing this and committing it to the main repo?

Thanks a lot!

Reproducing results from your paper on ARM64 and X86-64

Hi Sam,

I built and ran the latest workflow on my ARM64 (Tegra X1) and x86-64 machines via CK. The results seem to correspond with the ones from your paper.

I have attached the raw workflow and CK logs, platform description logs (from "ck detect platform"), the packed 'result' entries (in case someone would like to reuse them further), and PDFs from the dashboard to compare results with the pre-recorded ones for both architectures.

Would you mind just double-checking them? Thanks a lot!

ck-aarch64-platform.txt
ck-aarch64-log-reproduce-results-from-cgo2017-paper.txt
ck-aarch64-workflow.txt
ck-aarch64-result-entry.tar.gz
ck-aarch64-dashboard.pdf

ck-x86-64-platform.txt
ck-x86-64-log-reproduce-results-from-cgo2017-paper.txt
ck-x86-64-workflow.txt
ck-x86-64-result-entry.tar.gz
ck-x86-64-dashboard.pdf

About Camel microbenchmark

Hi Sam,

Thanks for making all your effort including this publicly available!

I wonder if you could share the Camel microbenchmark, described in Code Listing 3.2 of your thesis.

Yongkee
