
gufi-archive's Introduction

GUFI Archive Scanning

We use parts of GUFI to perform high-performance scans of storage areas and produce reports that support better decisions about the archivability of data in massive volumes.

Singularity

Building the container on sylabs

singularity remote login
singularity build --remote gufi.sif singularity.def 
singularity push -U  gufi.sif library://brockp/gufi/gufi:master
singularity push -U  gufi.sif library://brockp/gufi/gufi:[tag]
# running
module load singularity
singularity pull --arch amd64 library://brockp/gufi/gufi:master
singularity run-help gufi_master.sif
singularity exec gufi_master.sif cmd

Reporting Scripts

# build an index; this is time consuming, but the index is reused for many queries
singularity exec gufi_master.sif gufi_dir2index -n <#threads> <inputdir> /tmp/GUFI

# run a summary report of how much data has not been accessed in X days
singularity exec --bind /etc/passwd gufi_master.sif summary.sh /tmp/GUFI [days]

# use GUFI ls to just list files
singularity exec --bind /etc/passwd gufi_master.sif gufi_ls --help

Scripts

SCRIPTS.md

  • summary.sh /tmp/GUFI [days]: provides a summary of total data and data not accessed in 180 days
  • dirsum.sh /tmp/GUFI/dir [days]: shows totals for each directory below the given directory, and files not accessed in 180 days
  • totals.sh /tmp/GUFI/dir [days]: summary similar to du -s
  • archivescan.sh /tmp/GUFI/dir [sizeMB]: buckets data into files over and under the given size
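The "not accessed in [days] days" test that these reports rely on can be sketched in plain shell. This is only an illustration of the idea, not how the scripts are implemented (they query the GUFI index). The function name `is_stale` and its arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) when an access time is older
# than DAYS days before NOW. ATIME and NOW are epoch seconds; NOW defaults
# to the current time. Illustrative only; the real scripts query the index.
is_stale() {
    atime=$1
    days=$2
    now=${3:-$(date +%s)}
    cutoff=$(( now - days * 86400 ))
    [ "$atime" -lt "$cutoff" ]
}
```

For example, `is_stale "$(stat -c %X somefile)" 180 && echo old` would flag a file last read more than 180 days ago on systems where GNU `stat` is available.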

Resolving groups and users

The container doesn't know about UIDs and groups other than those of the user invoking it. To correctly resolve the UIDs and GIDs stored in the GUFI index, bind the local system's /etc/passwd into the container runtime with --bind /etc/passwd.
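To avoid forgetting the bind, the call can be wrapped in a small shell function. This is a minimal sketch: the function name `gufi_exec` and the `GUFI_IMAGE` variable are hypothetical, and the image name defaults to the file produced by the pull command above.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command inside the container with /etc/passwd
# bound, so UIDs stored in the index can resolve to user names.
# GUFI_IMAGE is an assumed variable name; it defaults to gufi_master.sif.
gufi_exec() {
    singularity exec --bind /etc/passwd "${GUFI_IMAGE:-gufi_master.sif}" "$@"
}
```

With this in place, `gufi_exec summary.sh /tmp/GUFI 180` runs the summary report with the bind applied.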

Building GUFI

As of June 2022, GUFI cannot build with the bundled googletest, so remove its archive before building:

git clone https://github.com/mar-file-system/GUFI.git
cd GUFI
rm contrib/deps/googletest.tar.gz 
mkdir build
cd build
cmake ..
make
make install

gufi-archive's People

Contributors

brockpalen, rharolde, cjantonelli


gufi-archive's Issues

Single call to get results

$BFQ -E " \
INSERT INTO sument select uid, name, size, atime, \
case when datetime(atime, 'unixepoch') < DATE('now', '-"$DAYS" day') then size else 0 end as oldsize \
FROM entries \
WHERE type='f';" \
-n $THREADS -O outdb \
-I "CREATE TABLE sument (username text, name text, size int64, atime int64, oldsize int64);" $1/$dir
a=`$QUERYDBS -d \| -V $gettitle outdb sument " \
select count, sizeGB, oldsize, \
PRINTF('%02d%%', (100*oldsize/sizeGB)) as percent \
FROM( \
SELECT COUNT(*) AS count, sum(size)/1024/1024/1024 AS sizeGB, sum(oldsize)/1024/1024/1024 as oldsize from vsument) \
ORDER BY sizeGB DESC;" \
outdb.*`

This can be changed to use only gufi_query:

gufi_query \
       -n ${THREADS} \
       -I "CREATE TABLE sument (username text, name text, size int64, atime int64, oldsize int64);" \
       -E "INSERT INTO sument SELECT uid, name, size, atime, \
           CASE WHEN datetime(atime, 'unixepoch') < DATE('now', '-"$DAYS" day') THEN size ELSE 0 END \
           FROM entries WHERE type='f';" \
       -K "CREATE TABLE vsument(count int64, sizeGB int64, oldsize int64);" \
       -J "INSERT INTO vsument SELECT COUNT(*), sum(size)/1024/1024/1024, sum(oldsize)/1024/1024/1024 FROM sument;" \
       -G "SELECT count, sizeGB, oldsize, PRINTF('%02d%%', (100*oldsize/sizeGB)) FROM vsument;" \
       "$1/${dir}"

No intermediate database files will be created. sument and vsument will be in memory. This also avoids the limit mentioned in the SQLite documentation that querydbs suffers from.
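The percentage column in the final query is computed from values that have already been truncated to whole GB, so small trees can report coarse or zero figures. The sketch below reproduces that integer arithmetic in shell to make the effect visible. The function name `old_percent` is hypothetical, and the zero guard is an addition for the shell version, which would otherwise fail on division by zero.

```shell
#!/bin/sh
# Hypothetical helper mirroring the report's integer arithmetic:
# byte counts are truncated to whole GB first, then the percentage is taken.
# Arguments: total bytes, bytes not accessed within the cutoff.
old_percent() {
    total_gb=$(( $1 / 1024 / 1024 / 1024 ))
    old_gb=$(( $2 / 1024 / 1024 / 1024 ))
    if [ "$total_gb" -eq 0 ]; then
        # Less than 1 GB in total: avoid dividing by zero.
        printf '%02d%%\n' 0
    else
        printf '%02d%%\n' $(( 100 * old_gb / total_gb ))
    fi
}
```

With 10 GB in total and 2.5 GB old, the old size truncates to 2 GB and the result is 20% rather than 25%. Summing in bytes and dividing once at the end would reduce that truncation.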

directory summary

List the size and oldsize for each directory under a given starting point.
