gem5's Introduction

About

This is the gem5 simulator for Xiangshan (XS-GEM5), which currently scores similarly to Nanhu on SPEC CPU 2006.

Features

XS-GEM5 is not as easy to use as official GEM5 because it only supports full-system simulation with Xiangshan's specific formats. Refer to Workflows for more details.

XS-GEM5 is enhanced with

  • Xiangshan RVGCpt: a cross-platform full-system checkpoint for RISC-V.
  • Xiangshan online Difftest: an API to check execution results online.
  • Frontend microarchitecture calibrated with Xiangshan V2 (Nanhu): decoupled frontend, TAGESC, and ITTAGE, which perform better on SPECCPU than the LTAGE and TAGE-SCL shipped in the official version.
  • Instruction latency calibrated with Nanhu
  • Cache hierarchy, latency, and prefetchers calibrated with Nanhu.
  • A fixed Multi-Prefetcher framework with VA-PA translation support
  • A fixed BOP prefetcher
  • Parallel RV PTW (Page Table Walker) and walking state coalescing
  • Cascaded FMA
  • Move elimination
  • L2 TLB and TLB prefetching (coming soon).
  • Other functional or performance bug fixes.

Branches

Because XS-GEM5 is currently under internal development, we have several branches for different purposes:

  • xs-dev branch is periodically synced with our internal development branch.
  • backport branch is used to backport patches that affect functional correctness and basic usage.

What is NOT supported

  • Cannot run Boom's baremetal app
    • We only support Abstract Machine baremetal environment or Linux for Xiangshan.
  • Cannot directly run an ELF
  • Checkpoint is not compatible with GEM5's SE checkpoints or m5 checkpoints.
    • Cannot produce GEM5's SE checkpoints or m5 checkpoints
    • Cannot run GEM5's SE checkpoints or m5 checkpoints
  • We recommend NOT producing a checkpoint in M-mode

Please DO NOT

  • Please don't open a new issue without reading the doc
  • Please don't open a new issue without searching the issue list
  • Please don't run Boom's baremetal app with XS-GEM5
  • Please don't run SimPoint bbv.gz with NEMU, XS-GEM5, or Xiangshan processor, because it is not bootable
  • Please don't open a new issue about building Linux in NEMU's issue list; please head to the Xiangshan doc

Maintainers will BLOCK you from this repo if you

  • Try to run Boom's baremetal app with XS-GEM5, and open a related issue
  • Try to run SimPoint bbv.gz with XS-GEM5, and open a related issue

A Short Doc

Workflows: How to run workloads

Run without checkpoint

The typical flow for running workloads is similar for NEMU, XS-GEM5, and Xiangshan processor. All of them only support full-system simulation. To prepare workloads for full-system simulation, users need to either build a baremetal app or run user programs in an operating system.

graph TD;
am["Build a baremetal app with AM"]
linux["Build a Linux image containing user app"]
baremetal[/"Image of baremetal app or OS"/]
run["Run image with NEMU, XS-GEM5, or Xiangshan processor"]

am-->baremetal
linux-->baremetal
baremetal-->run

Run with checkpoints

Most enterprise users and researchers are more interested in running larger workloads, like SPECCPU, on XS-GEM5. To reduce the time spent in detailed simulation, NEMU serves as a checkpoint producer. The flow for producing and running checkpoints is as follows.

graph TD;
linux["Build a Linux image containing NEMU trap app and user app"]
bin[/"Image containing Linux and app"/]
profiling["Boot image with NEMU with SimPoint profiling"]
bbv[/"SimPoint BBV, a .gz file"/]
cluster["Cluster BBV with SimPoint"]
points[/"SimPoint sampled points and weights"/]
take_cpt["Boot image with NEMU to produce checkpoints"]
checkpoints[/"Checkpoints, several .gz files of memory image"/]
run["Run checkpoints with XS-GEM5"]

linux-->bin
bin-->profiling
profiling-->bbv
bbv-->cluster
cluster-->points
points-->take_cpt
take_cpt-->checkpoints
checkpoints-->run

How to prepare workloads

As described above, XS-GEM5 takes either a baremetal app or a checkpoint as input.

To build a baremetal app compatible with XS-GEM5, we use Abstract Machine as a light-weight baremetal library. Common simple apps like coremark and dhrystone can be built with Abstract Machine.

To obtain checkpoints of large applications, please follow the doc to build Linux and pack an image, then follow the checkpoint tutorial for Xiangshan to produce checkpoints.

The process of producing SimPoint checkpoints consists of 3 individual steps:

  1. SimPoint profiling to get BBVs. (To save space, they are often output in compressed formats such as bbv.gz.)
  2. SimPoint clustering. You can also opt for Python and scikit-learn to do k-means clustering. (This step typically yields the positions selected by SimPoint and their weights.)
  3. Taking checkpoints according to the clustering results. (In the RVGCpt process, this step generates the checkpoints that will be used for simulation.)
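
To make the output of step 2 concrete, here is a minimal sketch that joins the two clustering result files into one "start instruction, weight" list. It assumes the usual SimPoint 3.x text format, where every line is "<value> <cluster_id>" (an interval index in the simpoints file, a weight in the weights file). The helper name list_simpoints, the file names, and the interval size are hypothetical and not part of NEMU or XS-GEM5.

```shell
# Join SimPoint clustering results into "start_instruction weight" lines.
# Assumption: both files use "<value> <cluster_id>" per line.
list_simpoints() {
    simpoints_file=$1
    weights_file=$2
    interval_size=$3
    awk -v size="$interval_size" '
        # First file (weights): remember the weight of each cluster id.
        NR == FNR { weight[$2] = $1; next }
        # Second file (simpoints): interval index times interval size
        # gives the instruction count at which the interval starts.
        { printf "%.0f %s\n", $1 * size, weight[$2] }
    ' "$weights_file" "$simpoints_file"
}

# Example call (hypothetical file names, 20M-instruction intervals):
# list_simpoints simpoints0 weights0 20000000
```

Each output line names one position at which step 3 would take a checkpoint, together with the weight to apply to its statistics afterwards.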

If you have problems generating SPECCPU checkpoints, the following links might help.

Basic build environment

Install the dependencies as described in the official GEM5 tutorial:

Setup on Ubuntu 22.04

If compiling gem5 on Ubuntu 22.04, or related Linux distributions, you may install all these dependencies using APT:

sudo apt install build-essential git m4 scons zlib1g zlib1g-dev \
    libprotobuf-dev protobuf-compiler libprotoc-dev libgoogle-perftools-dev \
    python3-dev libboost-all-dev pkg-config libsqlite3-dev

Setup on Ubuntu 20.04

If compiling gem5 on Ubuntu 20.04, or related Linux distributions, you may install all these dependencies using APT:

sudo apt install build-essential git m4 scons zlib1g zlib1g-dev \
    libprotobuf-dev protobuf-compiler libprotoc-dev libgoogle-perftools-dev \
    python3-dev python-is-python3 libboost-all-dev pkg-config libsqlite3-dev

Setup on Ubuntu 18.04

If compiling gem5 on Ubuntu 18.04, or related Linux distributions, you may install all these dependencies using APT:

sudo apt install build-essential git m4 scons zlib1g zlib1g-dev \
    libprotobuf-dev protobuf-compiler libprotoc-dev libgoogle-perftools-dev \
    python3-dev python libboost-all-dev pkg-config libsqlite3-dev

Clone and build DRAMSim3

Refer to the readme for DRAMSim3 to install DRAMSim3.

Notes:

  • If you have already built GEM5, rebuild it after installing DRAMSim3
  • If simulating a Xiangshan system, use DRAMSim3 with our customized config

Usage:

$gem5_home/build/RISCV/gem5.opt ... fs.py ... \
    --mem-type=DRAMsim3 \
    --dramsim3-ini=$gem5_home/ext/dramsim3/xiangshan_configs/xiangshan_DDR4_8Gb_x8_3200_2ch.ini ...

Build GEM5

cd GEM5
scons build/RISCV/gem5.opt --gold-linker -j8
export gem5_home=`pwd`

Press enter if you see

You're missing the gem5 style or commit message hook. These hooks help
to ensure that your code follows gem5's style rules on git commit.
This script will now install the hook in your .git/hooks/ directory.
Press enter to continue, or ctrl-c to abort:

Run Gem5

Users must properly prepare workloads before running GEM5; please read Workflows first.

The example running script contains the default configuration for XS-GEM5, and a simple batch running function.

NOTE: If you want to cosimulate against NEMU, please refer to Difftest with NEMU before running, and set NEMU_HOME to the root directory of NEMU. If not, please delete the line containing --enable-difftest \ in the example running script.

The example running script runs GEM5 with a single thread (function single_run) or multiple threads (function parallel_run). Both single_run and parallel_run call function run, which provides the default parameters for XS-GEM5.

For debugging or performance tuning, we usually call single_run and modify the parameters of function run. run takes 5 parameters:

  • debug_gz: the path to the debug binary (usually a checkpoint) of the program to run.
  • warmup_inst: the number of instructions used to warm up the cache, usually 20M.
  • max_inst: the number of instructions to run, usually 40M. The first half is used for warmup, and the second half is used for statistics collection.
  • work_dir: the directory to store the output files.
  • the last parameter: whether to enable Arch DB. Arch DB is a database that stores the micro-architectural trace of the program. It is used for debugging and performance tuning.

More details can be found in comments and code of the example running script.
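
As a small illustration of how warmup_inst and max_inst relate, the sketch below converts a warmup length and a sample length, both given in millions of instructions, into the two absolute counts. It mirrors the arithmetic the example script performs in arg_wrapper; the helper name inst_budget is hypothetical.

```shell
# Print "warmup_inst max_inst" for lengths given in millions of instructions.
# max_inst covers warmup plus the sampled region, as described above.
inst_budget() {
    million=1000000
    echo "$(( $1 * million )) $(( ($1 + $2) * million ))"
}

# 20M warmup followed by 20M of statistics collection.
budget=$(inst_budget 20 20)
warmup_inst=${budget% *}   # text before the space
max_inst=${budget#* }      # text after the space
echo "warmup_inst=$warmup_inst max_inst=$max_inst"
```

With the usual 20M and 20M split, this yields the 20M warmup_inst and 40M max_inst values mentioned in the parameter list.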

Play with Arch DB

Arch DB is a database to store the micro-architectural trace of the program with SQLite. You can access it with Python or other languages. A Python example is given here.

Difftest with NEMU

NEMU is used as a reference design for XS-GEM5. The typical workflow is as follows.

graph TD;
build["Build NEMU in reference mode"]
so[/"./build/riscv64-nemu-interpreter-so"/]
cosim["Run XS-GEM5 or Xiangshan processor, turn on difftest, specify riscv64-nemu-interpreter-so as reference design"]

build-->so
so-->cosim

We use the gem5-ref-main branch of NEMU for difftest with XS-GEM5.

git clone https://github.com/OpenXiangShan/NEMU.git -b gem5-ref-main
cd NEMU
export NEMU_HOME=`pwd`
make riscv64-nohype-ref_defconfig
make menuconfig  # then save configs
make -j 10

The build directory should then contain:

build
|-- obj-riscv64-nemu-interpreter-so
|   `-- src
`-- riscv64-nemu-interpreter-so

Then use riscv64-nemu-interpreter-so as the reference for GEM5:

export ref_so=`realpath build/riscv64-nemu-interpreter-so`

# This is not a full command, only an example fragment.
$gem5_home/build/RISCV/gem5.opt ... --enable-difftest --difftest-ref-so $ref_so ...
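
Before launching a long simulation, it can help to confirm that the reference library was actually built. The following is a minimal sketch under the directory layout shown above; find_ref_so is a hypothetical helper, not something shipped with NEMU or XS-GEM5.

```shell
# Print the path of the NEMU reference library under the given NEMU root,
# or fail with a message on stderr if it is missing or unreadable.
find_ref_so() {
    so="$1/build/riscv64-nemu-interpreter-so"
    if [ -r "$so" ]; then
        echo "$so"
    else
        echo "reference library not found: $so" >&2
        return 1
    fi
}

# Usage, assuming NEMU_HOME was exported as shown earlier:
# ref_so=$(find_ref_so "$NEMU_HOME") || exit 1
```

Failing early here avoids starting GEM5 with --enable-difftest and a path that does not exist.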

FAQ

Python problems

If your machine has a very recent version of Python, you may need to install an older version to avoid compatibility issues. We recommend using miniconda to install Python 3.8.

Installation commands, copied from the official miniconda website:

mkdir -p ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm -rf ~/miniconda3/miniconda.sh

Then add conda to your PATH in ~/.bashrc or ~/.zshrc. Note that this will hide the system Python.

# for bash
~/miniconda3/bin/conda init bash
# for zsh
~/miniconda3/bin/conda init zsh

Restart your terminal, and you should be able to use conda. Then create a Python 3.8 env:

# create env
conda create --name py38 --file $gem5_home/ext/xs_env/gem5-py38.txt

# This is mandatory to prevent conda from auto-activating the base env
conda config --set auto_activate_base false

Each time you log in, activate the conda env before building GEM5:

conda activate py38

If you don't like this or it causes problems, run the following to completely remove Python and conda from your PATH:

# for bash
conda init bash --reverse
# for zsh
conda init zsh --reverse

It complains that Python is not found

This is often not caused by a missing Python but by other problems, because the build scripts (and scons) use an unusual way to find Python; see site_scons/gem5_scons/configure.py for more detail. For example, when building with clang10, I encountered this problem:

Error: Check failed for Python.h header.
        Two possible reasons:
       1. Python headers are not installed (You can install the package python-dev on Ubuntu and RedHat)
       2. SCons is using a wrong C compiler. This can happen if CC has the wrong value.
       CC = clang

This is not because of Python, but because GCC and clang have different warning suppression flags. To fix it, I applied this patch:

git apply ext/xs_env/clang-warning-suppress.patch

Python complaints can also be caused by other problems. For similar errors, check build/RISCV/gem5.build/scons_config.log to get the real error message.
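
One quick way to find the real message is to search that log for lines mentioning an error. This is only a sketch: show_config_errors is a hypothetical helper, and the exact wording of compiler errors in the log will vary with the toolchain.

```shell
# Print every line of a SCons configuration log that mentions "error"
# (case-insensitive), with line numbers and two lines of trailing context.
# Prints nothing and still succeeds when no line matches.
show_config_errors() {
    grep -n -i -A 2 'error' "$1" || true
}

# Usage from the GEM5 root directory:
# show_config_errors build/RISCV/gem5.build/scons_config.log
```

The compiler invocation that failed usually appears a few lines above the first match, so opening the log at the reported line number is a reasonable next step.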

Original README

The README for official GEM5 is here: Original README

gem5's Issues

Custom-format checkpoint

What modifications were made to gem5 to support your custom-format checkpoints? Are there corresponding commits?

Failed to run hello world

I compiled the code from the branch dbp-merge-xsdev-221010.

git switch dbp-merge-xsdev-221010

scons build/RISCV/gem5.debug -j16  

And tried to run a hello world program.

./build/RISCV/gem5.debug configs/example/se.py --cpu-type=DerivO3CPU --caches -c tests/test-progs/hello/bin/riscv/linux/hello

It failed with

gem5 Simulator System.  https://www.gem5.org
gem5 is copyrighted software; use the --copyright option for details.

gem5 version [DEVELOP-FOR-22.1]
gem5 compiled May 25 2023 12:01:17
gem5 started May 25 2023 12:24:53
gem5 executing on mprc2, pid 1587869
command line: ./build/RISCV/gem5.debug configs/example/se.py --cpu-type=DerivO3CPU --caches -c tests/test-progs/hello/bin/riscv/linux/hello

build/RISCV/base/loader/image_file_data.cc:107: info: Loading file tests/test-progs/hello/bin/riscv/linux/hello
build/RISCV/base/loader/image_file_data.cc:127: info: File size is 4814352 bytes
build/RISCV/base/loader/image_file_data.cc:133: info: First 4 bytes are 0x7f 0x45 0x4c 0x46
build/RISCV/base/loader/image_file_data.cc:135: info: Mapped start address is ELF, 0x7ffff4d3e000
Global frequency set at 1000000000000 ticks per second
warn: No dot file generated. Please install pydot to generate the dot file and pdf.
build/RISCV/mem/dram_interface.cc:690: warn: DRAM device capacity (8192 Mbytes) does not match the address range assigned (512 Mbytes)
build/RISCV/base/loader/image_file_data.cc:107: info: Loading file tests/test-progs/hello/bin/riscv/linux/hello
build/RISCV/base/loader/image_file_data.cc:127: info: File size is 4814352 bytes
build/RISCV/base/loader/image_file_data.cc:133: info: First 4 bytes are 0x7f 0x45 0x4c 0x46
build/RISCV/base/loader/image_file_data.cc:135: info: Mapped start address is ELF, 0x7fffd4cfe000
build/RISCV/cpu/o3/cpu.cc:332: warn: Difftest is disabled
0: system.remote_gdb: listening for remote gdb on port 7000
**** REAL SIMULATION ****
build/RISCV/sim/simulate.cc:194: info: Entering event queue @ 0.  Starting simulation...
gem5.debug: build/RISCV/cpu/pred/decoupled_bpred.cc:155: std::pair<bool, bool> gem5::branch_prediction::DecoupledBPU::decoupledPredict(const StaticInstPtr&, const InstSeqNum&, gem5::PCStateBase&, gem5::ThreadID): Assertion `pc.instAddr() < end && pc.instAddr() >= start' failed.

I'm wondering if the way I run hello world is wrong or the branch I'm using has a bug?

How to run coremark on gem5

Running with the configuration parameters you provide fails. The parameters are as follows:

                "$gem5_home/build/RISCV/gem5.fast \
                    $gem5_home/configs/example/fs.py\
                    --xiangshan-system \
                    --bare-metal \
                    --cpu-type=DerivO3CPU \
                    --mem-size=8GB \
                    --caches --cacheline_size=64 \
                    --l1i_size=64kB --l1i_assoc=8 \
                    --l1d_size=64kB --l1d_assoc=8 \
                    --l1d-hwp-type=XSCompositePrefetcher \
                    --short-stride-thres=0 \
                    --l2cache --l2_size=1MB --l2_assoc=8 \
                    --l3cache --l3_size=16MB --l3_assoc=16 \
                    --l1-to-l2-pf-hint \
                    --l2-hwp-type=WorkerPrefetcher \
                    --l2-to-l3-pf-hint \
                    --l3-hwp-type=WorkerPrefetcher \
                    --mem-type=DRAMsim3 \
                    --dramsim3-ini=$gem5_home/ext/dramsim3/xiangshan_configs/xiangshan_DDR4_8Gb_x8_3200_2ch.ini \
                    --bp-type=DecoupledBPUWithFTB --enable-loop-predictor \
                    --kernel=./benchmark/image/coremark.2timesriscv \
                    --command-line='0x0 0x0 0x66 0 7 1 2000' |& tee run.log" Enter

It reports the following error:

Global frequency set at 1000000000000 ticks per second
WARNING: Output directory ext/dramsim3/DRAMsim3/ not exists! Using current directory for output!
fatal: system.workload.bootloader without default or user set value
gem5 Simulator System.  https://www.gem5.org
gem5 is copyrighted software; use the --copyright option for details.

gem5 version [DEVELOP-FOR-22.1]
gem5 compiled Jan 25 2024 10:35:49
gem5 started Jan 25 2024 10:36:29
gem5 executing on n168-020-004, pid 17185
command line: ./GEM5-xs-dev/build/RISCV/gem5.fast ./GEM5-xs-dev/configs/example/fs.py --xiangshan-system --cpu-type=DerivO3CPU --mem-size=8GB --caches --cacheline_size=64 --l1i_size=64kB --l1i_assoc=8 --l1d_size=64kB --l1d_assoc=8 --l1d-hwp-type=XSCompositePrefetcher --short-stride-thres=0 --l2cache --l2_size=1MB --l2_assoc=8 --l3cache --l3_size=16MB --l3_assoc=16 --l1-to-l2-pf-hint --l2-hwp-type=WorkerPrefetcher --l2-to-l3-pf-hint --l3-hwp-type=WorkerPrefetcher --mem-type=DRAMsim3 --dramsim3-ini=./GEM5-xs-dev/ext/dramsim3/xiangshan_configs/xiangshan_DDR4_8Gb_x8_3200_2ch.ini --bp-type=DecoupledBPUWithFTB --enable-loop-predictor --kernel=./benchmark/image/coremark.2timesriscv '--command-line=0x0 0x0 0x66 0 7 1 2000'

[<m5.params.AddrRange object at 0x7f42d71dfc10>]
['basic']
db_switches: []
Attach 1 decoders to thread with addr: <orphan System>.cpu.decoder
Create threads for test sys cpu (RiscvO3CPU)
Add dtb for L1D prefetcher
Add L2 prefetcher as downstream of L1D prefetcher
Add L3 prefetcher as downstream of L2 prefetcher
Add dtb for L2 prefetcher
Finish memory system configuration
No cpu_class provided

gem5 has encountered a segmentation fault

simple_gem5.sh reports this error when running a baremetal bin

Attach 1 decoders to thread with addr: <orphan System>.cpu.decoder
Create threads for test sys cpu (RiscvO3CPU)
Add dtb for L1D prefetcher
Add L2 prefetcher as downstream of L1D prefetcher
Add L3 prefetcher as downstream of L2 prefetcher
Add dtb for L2 prefetcher
Finish memory system configuration
No cpu_class provided
Registering probe listeners for Prefetcher system.cpu.dcache.prefetcher
Registering probe listeners for Prefetcher system.cpu.dcache.prefetcher.bop_large
Registering probe listeners for Prefetcher system.cpu.dcache.prefetcher.bop_small
Registering probe listeners for Prefetcher system.cpu.dcache.prefetcher.ipcp
Registering probe listeners for Prefetcher system.cpu.dcache.prefetcher.spp
Registering probe listeners for Prefetcher system.l2.prefetcher
Registering probe listeners for Prefetcher system.l3.prefetcher
**** REAL SIMULATION ****
build/RISCV/sim/simulate.cc:194: info: Entering event queue @ 0.  Starting simulation...
gem5 has encountered a segmentation fault!

Steps to reproduce:
Modify the script based on simple_gem5.sh as follows:
Set --generic-rv-cpt=./xiangshang/benchmark/image/coremark.bare.1timesriscv.
The full script is as follows:

# DO NOT track your local updates in this script!
# set -x

export gem5_home=/xiangshang/gem5/GEM5-xs-dev # The root of GEM5 project
export gem5=$gem5_home/build/RISCV/gem5.fast # GEM5 executable


# Note 1: workload list contains the workload name, checkpoint path, and parameters, looks like:
#       astar_biglakes_122060000000 astar_biglakes_122060000000_0.244818/0/ 0 0 20 20
#       bwaves_1003220000000 bwaves_1003220000000_0.036592/0/ 0 0 20 20
# Note 2: The meaning of fields:
# workload_name, checkpoint_path, skip insts(usually 0), functional_warmup insts(usually 0), detailed_warmup insts (usually 20), sample insts
# Note 3: you can write a script to generate such a list accordingly
export desc_dir=/xiangshang/benchmark/coremark/fs
export workload_list=/xiangshang/benchmark/coremark/fs/int_list.lst


# The checkpoint directory. We will find checkpoint_path in workload_list
# under this directory to get the checkpoint path.
export cpt_dir='/xiangshang/benchmark/image'

# A tag to identify current batch run
export tag="an-example-to-run-gem5-with-composite-prefetcher"

export log_file='log.txt'

export ds=$(pwd)  # data storage. It is specific for BOSC machines, you can ignore it

export top_work_dir=$tag
export full_work_dir=$ds/exec-storage/$top_work_dir  # work dir where stats data is stored

mkdir -p $full_work_dir
ln -sf $full_work_dir .  # optional, you can customize it yourself

check() {
    if [ $1 -ne 0 ]; then
        echo FAIL
        touch abort
        exit
    fi
}

function run() {
    set -x
    cpt=$1
    dw_len=${2:-20000000}
    # dw_len=${2:-1525605}
    total_detail_len=${3:-40000000}
    if [[ -n "$4" ]]; then
        work_dir=$4
    else
        work_dir=$PWD
    fi
    arch_db=${5:-0}

    cd $work_dir

    if test -f "completed"; then
        echo "Already completed; skip $cpt"
        return
    fi

    rm -f abort
    rm -f completed

    cpt_name=$(basename -- "$cpt")
    extension="${cpt_name##*.}"

    # replace the path of gcpt.bin with your gcpt restorer
    # gcpt restorer can be found in https://github.com/OpenXiangShan/NEMU/tree/gem5-ref-main/resource/gcpt_restore
    # Please use gem5-ref-main branch
    cpt_option="--generic-rv-cpt=$cpt --gcpt-restorer=/xiangshang/NEMU-gem5-ref-main/resource/gcpt_restore/build/gcpt.bin"

    # You can also pass a baremetal bin here
    if [ $extension != "gz" ]; then
        cpt_option="--generic-rv-cpt=./xiangshang/benchmark/image/coremark.bare.1timesriscv --raw-cpt"
    fi

    if [[ "$arch_db" -eq "0" ]]; then
        arch_db_args=
    else
        arch_db_args="--enable-arch-db --arch-db-file=mem_trace.db --arch-db-fromstart=True"
    fi

    if [[ -z "$crash_tick" ]]; then
        crash_tick=-1
    fi
    if [[ -z "$capture_cycles" ]]; then
        capture_cycles=30000
    fi
    start=$(($crash_tick - 500*$capture_cycles))
    # start=$crash_tick
    start=$(($start>0 ? $start : 0))
    end=$(($crash_tick + 500*$capture_cycles))
    start_end=" --debug-start=$start --debug-end=$end "
    if [[ -n "$debug_flags" ]]; then
        debug_flag_args=" --debug-flag=$debug_flags "
    else
        echo "No debug flag set"
        debug_flag_args=
        start_end=
    fi
    # --debug-flags=CommitTrace \

    if [[ $crash_tick = -1 ]]; then
        start_end=
        debug_flag_args=
    fi

    echo "total_detail_len: $total_detail_len"
    # gdb -ex run --args \

    # Note 1: Use DecoupledBPUWithFTB to enable nanhu's decoupled frontend
    # Note 2: MUST use DRAMsim3, or performance is skewed
    # To enable DRAMSim3, follow ext/dramsim3/README
    # Note 3: By default use MultiPrefetcher (SMS + BOP) as L2 prefetcher
    # Note 4: Recommend to enable Difftest
    ######## Some additional args:
    $gem5 $debug_flag_args $start_end \
        $gem5_home/configs/example/fs.py \
        --xiangshan-system --cpu-type=DerivO3CPU \
        --mem-size=8GB \
        --caches --cacheline_size=64 \
        --l1i_size=64kB --l1i_assoc=8 \
        --l1d_size=64kB --l1d_assoc=8 \
        --l1d-hwp-type=XSCompositePrefetcher \
        --short-stride-thres=0 \
        --l2cache --l2_size=1MB --l2_assoc=8 \
        --l3cache --l3_size=16MB --l3_assoc=16 \
        --l1-to-l2-pf-hint \
        --l2-hwp-type=WorkerPrefetcher \
        --l2-to-l3-pf-hint \
        --l3-hwp-type=WorkerPrefetcher \
        --mem-type=DRAMsim3 \
        --dramsim3-ini=$gem5_home/ext/dramsim3/xiangshan_configs/xiangshan_DDR4_8Gb_x8_3200_2ch.ini \
        --bp-type=DecoupledBPUWithFTB --enable-loop-predictor \
        --enable-difftest \
        $arch_db_args $cpt_option \
        --maxinsts=3849417830
    check $?

    # Here is a scratchpad for frequently used options

        # Enable complex stride component or SPP component in composite prefetcher
        # --l1d-enable-cplx \
        # --l1d-enable-spp \

        # Record arch db traces only after warmup
        #  --arch-db-fromstart=False 

        # Enable loop predictor and loop buffer
        # --enable-loop-predictor \
        # --enable-loop-buffer \

        # Employ an ideal L2 cache with a nearly perfect hit rate and low access latency
        # --mem-type=SimpleMemory \
        # --ideal-cache \

    # Debugging memory corruption or memory leak
    # valgrind -s --track-origins=yes --leak-check=full --show-leak-kinds=all --log-file=valgrind-out-2.txt --error-limit=no -v \

    touch completed
}

function prepare_env() {
    set -x
    echo "prepare_env $@"
    all_args=("$@")
    task=${all_args[0]}
    task_path=${all_args[1]}
    gz=$(find -L $cpt_dir -wholename "*${task_path}*gz" | head -n 1)
    echo $gz
    work_dir=$top_work_dir/$task
    echo $work_dir
    mkdir -p $work_dir
}

function arg_wrapper() {
    prepare_env $@

    all_args=("$@")
    args=(${all_args[0]})

    k=1000
    M=$((1000 * $k))

    skip=${args[2]}
    fw=${args[3]}
    dw=${args[4]}
    sample=${args[5]}

    total_M=$(( ($dw + $sample)*$M ))
    dw_M=$(( $dw*$M ))

    run $gz $dw_M $total_M $work_dir 0 >$work_dir/$log_file 2>&1
}

function single_run() {
    # run /nfs-nvme/home/zhouyaoyang/projects/nexus-am/apps/cachetest_i/build/cachetest_i-riscv64-xs.bin
    task=$tag
    work_dir=$full_work_dir
    mkdir -p $work_dir

    # Note: If you are debugging with single run, following 3 variables are mandatory.
    # - It prints debug info in tick range: (crash_tick - 500 * capture_cycles, crash_tick + 500 * capture_cycles)
    # - If you want to print debug info from beginning, set crash_tick to 0, and set capture_cycles to a large number

    # crash_tick=$(( 0 ))
    # capture_cycles=$(( 250000 ))
    # debug_flags=CommitTrace  # If you unset debug_flags, no debug print will be there

    # If you unset debug_flags or crash_tick, no debug print will be there
    # Common used flags for debug/tuning
    # debug_flags=CommitTrace,IEW,Fetch,LSQUnit,Cache,Commit,IQ,LSQ,PageTableWalker,TLB,MSHR
    warmup_inst=$(( 20 * 10**6 ))
    max_inst=$(( 40 * 10**6 ))


    # debug_gz=/nfs-nvme/home/share/checkpoints_profiles/spec06_rv64gcb_o2_20m/take_cpt/mcf_191500000000_0.105600/0/_191500000000_.gz
    debug_gz=/nfs-nvme/home/share/checkpoints_profiles/spec06_rv64gcb_o2_20m/take_cpt/libquantum_1006500000000_0.149838/0/_1006500000000_.gz
    rm -f $work_dir/completed
    rm -f $work_dir/abort
    run $debug_gz $warmup_inst $max_inst $work_dir 1 > $work_dir/$log_file 2>&1
}

export -f check
export -f run
export -f single_run
export -f arg_wrapper
export -f prepare_env


function parallel_run() {
    # We use gnu parallel to control the parallelism.
    # If your server has 32 core and 64 SMT threads, we suggest to run with no more than 32 threads.
    export num_threads=1
    cat $workload_list | parallel -a - -j $num_threads arg_wrapper {}
}

# Usually, I use parallel_run to benchmark, and use single_run to debug
parallel_run
# single_run

Then run: ./simple_gem5.sh

gem5.opt fails to compile

Compiling with the command scons ./build/RISCV/gem5.opt -j 64 reports the following error:

In file included from build/RISCV/arch/generic/pcstate.hh:49,
                 from build/RISCV/arch/generic/isa.hh:45,
                 from build/RISCV/arch/riscv/isa.hh:39,
                 from build/RISCV/arch/riscv/tlb.hh:38,
                 from build/RISCV/arch/riscv/tlb.cc:32:
build/RISCV/arch/riscv/tlb.cc: In member function ‘gem5::Fault gem5::RiscvISA::TLB::doTranslate(const RequestPtr&, gem5::ThreadContext*, gem5::BaseMMU::Translation*, gem5::BaseMMU::Mode, bool&)’:
build/RISCV/base/trace.hh:188:54: error: ‘paddr’ may be used uninitialized [-Werror=maybe-uninitialized]
  188 |         ::gem5::Trace::getDebugLogger()->dprintf_flag(   \
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~
  189 |             ::gem5::curTick(), name(), #x, __VA_ARGS__); \
      |             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
build/RISCV/arch/riscv/tlb.cc:1436:9: note: in expansion of macro ‘DPRINTF’
 1436 |         DPRINTF(TLB, "translate(vpn=%#x, asid=%#x): %#x pc%#x\n", vaddr,
      |         ^~~~~~~
build/RISCV/base/trace.hh:75:10: note: by argument 8 of type ‘const long unsigned int&’ to ‘void gem5::Trace::Logger::dprintf_flag(gem5::Tick, const string&, const string&, const char*, const Args& ...) [with Args = {long unsigned int, gem5::BitfieldType<gem5::bitfield_backend::Unsigned<long unsigned int, 59, 44> >, long unsigned int, long unsigned int}]’ declared here
   75 |     void dprintf_flag(Tick when, const std::string &name,
      |          ^~~~~~~~~~~~
build/RISCV/arch/riscv/tlb.cc:1264:10: note: ‘paddr’ declared here
 1264 |     Addr paddr;

I am using the xs-dev branch.
Also, is your Xiangshan core simply DerivO3CPU in gem5?
