sea-1's Introduction

Sea can provide speed-ups similar to those of Apache Spark without requiring any instrumentation of the application.

It is a hierarchical file system implementation that uses LD_PRELOAD, adapted from the XtreemOS passthrough.

Requirements (if building from source)

  • gcc-c++
  • libiniparser
  • libmagic
  • gtest (optional)
  • bash

Compilation

Run make pass, or run make to also compile the tests.

Usage

Configuration file

In order to successfully run Sea alongside your application, you will need to provide some details in an .ini file called sea.ini. The following are the properties of the file:

  • mount_dir: This is the folder your application will directly interact with to access files in Sea. It is the "view" to Sea.
  • n_levels: The number of cache directory levels in your file system. Cache directories are the directories that you would like Sea to use in order to speed up computation (e.g. tmpfs directories, local disk and Lustre).
  • cache_<X>: The path to the cache directory, where <X> is a digit giving the cache's level in the hierarchy (starting at 0, the top). For instance, a tmpfs path should likely be cache_0, and a Lustre path should sit at the bottom of the hierarchy (e.g. cache_2 if you have a local disk file system at cache_1). These directories must exist before launching the application.
  • log_level: An integer giving the desired log verbosity. There are currently 5 levels (0: None, 1: Error, 2: Warning, 3: Info, 4: Debug).
  • log_file: The location of the file in which to store the logs.
  • max_fs: The maximum size, in bytes, of a file in the pipeline. Sea currently requires this value in order to determine whether there is sufficient space in the cache directories.
  • n_threads: The maximum number of concurrent application threads or processes. Also used to determine whether there is sufficient space.

An example configuration file may look like:

[Sea]
mount_dir = /lustre/seamount ;
n_levels = 3 ;
cache_0 = /dev/shm/seasource ;
cache_1 = /localscratch/seasource ;
cache_2 = /lustre/seasource ;
log_level = 0 ;
log_file = /lustre/sea.log ;
max_fs = 646971392 ;
n_threads = 16 ;
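
Because the cache directories must exist before launch, a small pre-flight step can be handy. The following is a minimal sketch, not part of Sea: the helper names sea_ini_get and sea_preflight are hypothetical, the parser only handles the simple "key = value ;" layout shown above (a path containing ";" would be truncated), and the max_fs * n_threads product is only an assumed rough bound, since the exact space check Sea performs is not described here.

```shell
#!/bin/sh
# Hypothetical helper (not part of Sea): print the value of one key from
# sea.ini, dropping the trailing " ;" used in the example configuration.
sea_ini_get() {
    # $1 = path to sea.ini, $2 = key name
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" \
        | sed -e 's/[[:space:]]*;.*$//' -e 's/[[:space:]]*$//' \
        | head -n 1
}

# Hypothetical pre-flight check: create any missing cache directory, then
# print max_fs * n_threads in bytes (an assumed rough space requirement).
sea_preflight() {
    ini=$1
    n_levels=$(sea_ini_get "$ini" n_levels)
    [ -n "$n_levels" ] || return 1
    level=0
    while [ "$level" -lt "$n_levels" ]; do
        dir=$(sea_ini_get "$ini" "cache_$level")
        [ -n "$dir" ] || return 1
        [ -d "$dir" ] || mkdir -p "$dir" || return 1
        level=$((level + 1))
    done
    max_fs=$(sea_ini_get "$ini" max_fs)
    n_threads=$(sea_ini_get "$ini" n_threads)
    echo "$((max_fs * n_threads))"
}

# Usage (the path is a placeholder):
#   sea_preflight "$SEA_HOME/sea.ini"
```

With the example values above, the product is 646971392 * 16 = 10351542272 bytes, i.e. 617 MiB per file times 16 threads, or roughly 9.6 GiB.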

Flushlist

Sea relies on a file called .sea_flushlist to determine which files need to be flushed from the upper-level caches to the bottom-level cache directory. Exact filenames need not be provided and regex is accepted (it can be thought of as similar to a .gitignore!). All file paths matched by the regex patterns in the flushlist file will be flushed and evicted whenever possible.

Conversely, to free up space on storage, a .sea_evictlist file can be populated with the files set for removal. Like the .sea_flushlist, this is a newline-separated list of regex patterns.

Files that are listed in both the .sea_evictlist and .sea_flushlist are moved to the base cache level filesystem (e.g. cache_2 in the case above).
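
Before relying on a pattern list, it can help to preview which files it would select. The sketch below is an illustration only and not part of Sea: sea_preview_matches is a hypothetical helper, it assumes POSIX extended regex as implemented by grep -E, and it matches against the full path printed by find. Sea's own matcher may use a different regex dialect or match against a different form of the path.

```shell
#!/bin/sh
# Hypothetical helper (not part of Sea): print every file under a directory
# whose path matches at least one pattern in a flushlist or evictlist file.
# Assumes POSIX extended regex (grep -E); Sea's own matching may differ.
sea_preview_matches() {
    # $1 = pattern file (one regex per line), $2 = directory to scan
    find "$2" -type f | sort | while IFS= read -r path; do
        while IFS= read -r pattern || [ -n "$pattern" ]; do
            [ -n "$pattern" ] || continue   # skip blank lines
            if printf '%s\n' "$path" | grep -E -q -- "$pattern"; then
                printf '%s\n' "$path"
                break                       # one match is enough
            fi
        done < "$1"
    done
}

# Usage (paths are placeholders):
#   sea_preview_matches "$SEA_HOME/.sea_flushlist" /dev/shm/seasource
```

For instance, a list containing the single illustrative pattern .*\.nii$ would select every file whose path ends in .nii under the scanned directory.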

Program execution

In order to launch your application with Sea, it is crucial to first set the SEA_HOME variable. This variable indicates the folder in which the sea.ini and .sea_flushlist files are located.

The sea_launch.sh executable launches the flushers and executes the program. However, your program must still set the LD_PRELOAD variable prior to execution in order to use the Sea file system. LD_PRELOAD must be set to Sea's sea.so.

For example:

SEA_HOME=$PWD ./sea_launch.sh <myprogram>

where myprogram is a script such as:

#!/bin/bash
LD_PRELOAD=sea.so <myactualprogram>
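
Since a missing or misplaced SEA_HOME is easy to overlook, a short check before launching may save some debugging. This is a minimal sketch under stated assumptions: sea_check_home is a hypothetical helper, not something shipped with Sea, and it only verifies that the two files named above are present.

```shell
#!/bin/sh
# Hypothetical pre-launch check (not part of Sea): confirm that SEA_HOME
# is set and that it contains sea.ini and .sea_flushlist.
sea_check_home() {
    if [ -z "${SEA_HOME:-}" ]; then
        echo "SEA_HOME is not set" >&2
        return 1
    fi
    for f in sea.ini .sea_flushlist; do
        if [ ! -f "$SEA_HOME/$f" ]; then
            echo "missing $SEA_HOME/$f" >&2
            return 1
        fi
    done
    return 0
}

# Usage (paths are placeholders):
#   export SEA_HOME=$PWD
#   sea_check_home && ./sea_launch.sh ./myprogram
```

The function returns a non-zero status and prints a message to standard error when something is missing, so it can be chained in front of sea_launch.sh with &&.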

Troubleshooting

Some coreutils versions don't work with Sea because they make direct system calls instead of calling the glibc wrappers, and LD_PRELOAD can only intercept calls that go through the library. The following table summarizes our tests.

coreutils version    Used in          Status in Sea
8.4                  CentOS 6         WORKS
8.22                 CentOS 7         FAILS
8.25                 Ubuntu Xenial    WORKS
8.30                 CentOS 8         FAILS
8.31                 Fedora 30        WORKS

The Docker files used in the test cases show how to install specific coreutils versions from source.
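
To check a machine against the table, the lookup can be scripted. The sketch below simply encodes the tested versions listed above; sea_coreutils_status is a hypothetical helper, and any version not in the table is reported as UNTESTED rather than guessed. The usage line assumes GNU coreutils, where the first line of ls --version ends with the version number.

```shell
#!/bin/sh
# Hypothetical helper (not part of Sea): report the tested status of a
# coreutils version, using only the results from the table above.
sea_coreutils_status() {
    case "$1" in
        8.4|8.25|8.31) echo WORKS ;;
        8.22|8.30)     echo FAILS ;;
        *)             echo UNTESTED ;;
    esac
}

# Usage (assumes GNU coreutils):
#   version=$(ls --version | sed -n '1s/.* //p')
#   sea_coreutils_status "$version"
```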

Contributors

glatard, mathdugre, valhayot