
foralign_testcase's Introduction

Datasets used in FORAlign

Datasets source

We evaluate our methods on the GMGC, HIV, CoVID-19, and MPoX datasets.

Citations:

  • GMGC: Coelho, L.P., et al. Towards the biogeography of prokaryotic genes. Nature 601, 252–256 (2022).
  • HIV: https://www.hiv.lanl.gov/
  • CoVID-19: Shu YL, McCauley J. GISAID: Global initiative on sharing all influenza data - from vision to reality. Eurosurveillance. 2017; 22(13):2-4.
  • MPoX: Ma Y, Chen M, Bao Y, et al. MPoxVR: A comprehensive genomic resource for monkeypox virus variant surveillance. The Innovation, 2022, 3(5).

Test cases used in the WFA2-test program provided by FORAlign

The generated test data is saved in test_case.tar.xz and is also archived on Zenodo. Decompress the file with the following commands:

mkdir test_case
tar xvf test_case.tar.xz -C ./test_case

Each file in the test_case folder is named after the source of its test cases. For example, GMGC_20000__GMGC_10000 means the file was generated from the GMGC_20000 and GMGC_10000 test cases for pairwise sequence alignment: the first sequence is randomly selected from the GMGC_20000 file, and the second sequence is randomly selected from the GMGC_10000 file. Each *.ans file contains the CIGAR string generated by the benchmark (i.e., gap-affine-swg in WFA2).
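The naming convention above can be unpacked programmatically. The following is a minimal sketch, not part of the repository: the names PairSource and parse_test_case_name are hypothetical, and it assumes that the two sources are always joined by a double underscore and that answer files simply append ".ans" to the name.

```python
from pathlib import PurePath
from typing import NamedTuple


class PairSource(NamedTuple):
    """Hypothetical container for the two sources encoded in a file name."""
    first_source: str   # file the first sequence was drawn from
    second_source: str  # file the second sequence was drawn from


def parse_test_case_name(filename: str) -> PairSource:
    """Split a name such as 'GMGC_20000__GMGC_10000' into its two sources."""
    name = PurePath(filename).name  # ignore any leading directories
    # Assumption: answer files carry a trailing '.ans' suffix.
    if name.endswith(".ans"):
        name = name[: -len(".ans")]
    # Assumption: the first '__' separates the two source names.
    first, sep, second = name.partition("__")
    if not sep or not first or not second:
        raise ValueError(f"not a '<first>__<second>' name: {filename!r}")
    return PairSource(first, second)


if __name__ == "__main__":
    print(parse_test_case_name("test_case/GMGC_20000__GMGC_10000.ans"))
```

Splitting on the double underscore keeps single underscores inside a source name such as GMGC_20000 intact.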

Test programs

The test programs are compiled as C++20 and statically linked (i.e., they have no extra dependencies and can run on any Linux platform). They are stored in the wfa2-test-prog folder.

Test results

All tests were run on Linux. The results were generated by the align_benchmark and multiswg_benchmark programs, compiled from the wfa2-test folder in FORAlign.

Organized results

The organized results are saved in the test-result/json folder. The following table describes the content of these files:

| File Name | Content Info | Used in test |
|---|---|---|
| 9gpu_*core.json | Test results from the 9gpu device using the specified number of cores | all tests |
| 9gpu_benchmark.json | gap-affine-swg results generated by the benchmark on 128 cores | data analysis |
| hpc_56core_cpucnt_anaysis.json | Test results generated on 56 cores, with CPU usage above 95% as measured by top | condition_variable and Speedup Ratio for swg |
| hpc_56core_simpletest.json | Test results generated on 56 cores, with CPU usage above 95% as measured by top; the WFA series was not tested | t-blocks |
| hpc_80core_cpucnt_anaysis.json | Test results generated on 80 cores, with CPU usage above 95% as measured by top | all tests except Speedup Ratio for swg |
| hpc_80core_max_memory.json | Maximum memory used by the benchmark methods | methods selection |
| singlepc_1_12core.json | Test results generated on 12 cores | all tests except Speedup Ratio for swg |
| singlepc_2_16core.json | Test results generated on 16 cores | supplemental data |
| singlepc_2_24core.json | Test results generated on 24 cores | all tests except Speedup Ratio for swg |
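The file names listed above encode the core count, which makes it easy to group results by machine configuration. The sketch below is illustrative only: the helper names core_count and load_results are hypothetical, and because the JSON schema is not described here, the loader returns the parsed content without interpreting it.

```python
import json
import re
from pathlib import Path
from typing import Any, Dict, Optional


def core_count(filename: str) -> Optional[int]:
    """Return the core count embedded in a name like 'hpc_56core_x.json'.

    Returns None when the name carries no '<digits>core' token
    (e.g. '9gpu_benchmark.json').
    """
    match = re.search(r"(\d+)core", Path(filename).name)
    return int(match.group(1)) if match else None


def load_results(directory: str = "test-result/json") -> Dict[str, Any]:
    """Load every *.json file in the folder, keyed by file name.

    No assumption is made about the structure of each file.
    """
    results: Dict[str, Any] = {}
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            results[path.name] = json.load(handle)
    return results


if __name__ == "__main__":
    for name, content in load_results().items():
        print(name, core_count(name), type(content).__name__)
```

Printing only the type of each parsed file is a cautious first step for inspecting the data before relying on specific keys.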

Raw results in test-result/raw.tar.xz and how to decompress them

The raw results are stored in test-result/raw.tar.xz. Decompress the file with the following command:

tar xvf test-result/raw.tar.xz

The files are extracted to the test-result/raw folder.
