MadFS

Source code for the FAST '23 paper "MadFS: Per-File Virtualization for Userspace Persistent Memory Filesystems" by Shawn Zhong*, Chenhao Ye*, Guanzhou Hu, Suyan Qu, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau, and Michael Swift (*equal contribution).

Abstract

Persistent memory (PM) can be accessed directly from userspace without kernel involvement, but most PM filesystems still perform metadata operations in the kernel for security and rely on the kernel for cross-process synchronization.

We present per-file virtualization, where a virtualization layer implements a complete set of file functionalities, including metadata management, crash consistency, and concurrency control, in userspace. We observe that not all file metadata need to be maintained by the kernel and propose embedding insensitive metadata into the file for userspace management. For crash consistency, copy-on-write (CoW) benefits from the embedding of the block mapping since the mapping can be efficiently updated without kernel involvement. For cross-process synchronization, we introduce lock-free optimistic concurrency control (OCC) at user level, which tolerates process crashes and provides better scalability.

Based on per-file virtualization, we implement MadFS, a library PM filesystem that maintains the embedded metadata as a compact log. Experimental results show that on concurrent workloads, MadFS achieves up to 3.6x the throughput of ext4-DAX. For real-world applications, MadFS provides up to 48% speedup for YCSB on LevelDB and 85% for TPC-C on SQLite compared to NOVA.

BibTeX
@inproceedings {285756,
author = {Shawn Zhong and Chenhao Ye and Guanzhou Hu and Suyan Qu and Andrea Arpaci-Dusseau and Remzi Arpaci-Dusseau and Michael Swift},
title = {{MadFS}: {Per-File} Virtualization for Userspace Persistent Memory Filesystems},
booktitle = {21st USENIX Conference on File and Storage Technologies (FAST 23)},
year = {2023},
isbn = {978-1-939133-32-8},
address = {Santa Clara, CA},
pages = {265--280},
url = {https://www.usenix.org/conference/fast23/presentation/zhong},
publisher = {USENIX Association},
month = feb,
}

Prerequisites

  • MadFS is developed on Ubuntu 20.04.3 LTS and Ubuntu 22.04.1 LTS. It should work on other Linux distributions as well.

  • MadFS requires a C++ compiler with C++20 support. Compilers known to work include GCC 11.3.0, GCC 10.3.0, Clang 14.0.0, and Clang 10.0.0.

  • Install dependencies and configure the system
    • Install build dependencies

      sudo apt update
      sudo apt install -y cmake build-essential gcc-10 g++-10
    • Install development dependencies (optional)

      # to run sanitizers and formatter
      sudo apt install -y clang-10 libstdc++-10-dev clang-format-10
      # for perf
      sudo apt install -y linux-tools-common linux-tools-generic linux-tools-`uname -r`
      # for managing persistent memory and NUMA
      sudo apt install -y ndctl numactl
      # for benchmarking
      sudo apt install -y sqlite3
    • Configure the system

      ./scripts/init.py
  • Configure persistent memory
    • To emulate a persistent memory device using DRAM, please follow the guide here.

    • Initialize namespaces (optional)

      # remove existing namespaces on region0
      sudo ndctl destroy-namespace all --region=region0 --force 
      # create new namespace `/dev/pmem0` on region0
      sudo ndctl create-namespace --region=region0 --size=20G
      # create new namespace `/dev/pmem0.1` on region0 for NOVA (optional)
      sudo ndctl create-namespace --region=region0 --size=20G
      # list all namespaces
      ndctl list --region=0 --namespaces --human --idle
    • Use /dev/pmem0 to mount ext4-DAX at /mnt/pmem0-ext4-dax

      # create filesystem
      sudo mkfs.ext4 /dev/pmem0
      # create mount point
      sudo mkdir -p /mnt/pmem0-ext4-dax
      # mount filesystem
      sudo mount -o dax /dev/pmem0 /mnt/pmem0-ext4-dax
      # make the mount point writable
      sudo chmod a+w /mnt/pmem0-ext4-dax
      # check mount status
      mount -v | grep /mnt/pmem0-ext4-dax
    • Use /dev/pmem0.1 to mount NOVA at /mnt/pmem0-nova (optional)

      # load NOVA module
      sudo modprobe nova
      # create mount point
      sudo mkdir -p /mnt/pmem0-nova
      # mount filesystem
      sudo mount -t NOVA -o init -o data_cow /dev/pmem0.1 /mnt/pmem0-nova
      # make the mount point writable
      sudo chmod a+w /mnt/pmem0-nova           
      # check mount status
      mount -v | grep /mnt/pmem0-nova          
    • To unmount the filesystems, run

      sudo umount /mnt/pmem0-ext4-dax
      sudo umount /mnt/pmem0-nova

Build and Run

  • Build the MadFS shared library

    # Usage: make [release|debug|relwithdebinfo|profile|pmemcheck|asan|ubsan|msan|tsan]
    #             [CMAKE_ARGS="-DKEY1=VAL1 -DKEY2=VAL2 ..."] 
    make BUILD_TARGETS="madfs"
  • Run your program with MadFS

    LD_PRELOAD=./build-release/libmadfs.so ./your_program
    Sample output:
    BuildOptions: 
        build type:
            name: release
            debug: 0
            use_pmemcheck: 0
        hardware support:
            clwb: 1
            clflushopt: 1
            avx512f: 1
        features: 
            map_sync: 1
            map_populate: 1
            tx_flush_only_fsync: 1
            enable_timer: 0
        concurrency control:
            cc_occ: 1
            cc_mutex: 0
            cc_spinlock: 0
            cc_rwlock: 0
    
    RuntimeOptions:
        show_config: 1
        strict_offset_serial: 0
        log_file: None
        log_level: 1
    
    # Your program output here
    
    MadFS unloaded
  • Run tests

    ./scripts/run.py [test_basic|test_rc|test_sync|test_gc]
    # See `./scripts/run.py --help` for more options
    
  • Run and plot single-threaded benchmarks
    ./scripts/bench_st.py --filter="seq_pread"
    ./scripts/bench_st.py --filter="rnd_pread"
    ./scripts/bench_st.py --filter="seq_pwrite"
    ./scripts/bench_st.py --filter="rnd_pwrite"
    ./scripts/bench_st.py --filter="cow"
    ./scripts/bench_st.py --filter="append_pwrite"
    
    # Limit to a set of file systems
    ./scripts/bench_st.py -f MadFS SplitFS
    
    # Profile a data point
    ./scripts/bench_st.py --filter="seq_pread/512" -f MadFS -b profile
    
    # See `./scripts/bench_st.py --help` for more options
  • Run and plot multi-threaded benchmarks
    ./scripts/bench_mt.py --filter="unif_0R"
    ./scripts/bench_mt.py --filter="unif_50R"
    ./scripts/bench_mt.py --filter="unif_95R"
    ./scripts/bench_mt.py --filter="unif_100R"
    ./scripts/bench_mt.py --filter="zipf_2k"
    ./scripts/bench_mt.py --filter="zipf_4k"
  • Run and plot metadata benchmarks
    ./scripts/bench_open.py
    ./scripts/bench_gc.py
  • Run and plot macrobenchmarks (SQLite and LevelDB)
    ./scripts/bench_tpcc.py
    ./scripts/bench_ycsb.py

Directory Structure

  • src/: Source code for the MadFS shared library

  • scripts/: Scripts for building, running, and plotting benchmarks

  • bench/: Source code for benchmarks

  • test/: Source code for tests

  • tools/: Source code for tools (e.g., gc, conversion, info)

  • cmake/: CMake modules

  • data/: Data files for benchmarks

Contact

If you have any questions, feel free to open an issue or contact Shawn Zhong ([email protected]) and Chenhao Ye ([email protected]). We are also happy to accept pull requests.
