traviswheelerlab / avxwindowfmindex

A fast, AVX2 and ARM Neon accelerated FM index library

License: BSD 3-Clause "New" or "Revised" License

AvxWindowFmIndex

A fast, SIMD-accelerated FM-index library that uses windows of SIMD registers to quickly locate exact-match kmers in genetic data. This FM-index is highly optimized for both nucleotide and amino acid sequences, but is unsuitable for general text. Despite the name, AvxWindowFmIndex supports SIMD operation on both x86_64 architectures via AVX2 and ARM64 architectures via Arm Neon.

Please report any issues or bugs on the "Issues" tab of the project's GitHub page (https://github.com/TravisWheelerLab/AvxWindowFmIndex), and view our published article here (https://doi.org/10.1186/s13015-021-00204-6).

Prerequisites

The following are required to build and use this software:

GCC C compiler
CMake, or optionally,
Make for the legacy build system

Note that macOS users will need to install GCC through Homebrew or some other means, because the built-in Clang compiler, which is aliased as gcc, does not support OpenMP.

brew install gcc

After cloning the repo, you should initialize and update the submodules.

git clone https://github.com/TravisWheelerLab/AvxWindowFmIndex.git
cd AvxWindowFmIndex
git submodule update --init --recursive --remote

Now the project is ready to build.

CMake Build

The default way to build AwFmIndex is to use CMake. The usual CMake incantations will work. This will write the shared libraries to the build/ directory.

cmake .
make

To make static libraries instead, set the BUILD_SHARED_LIBS variable as follows:

cmake -DBUILD_SHARED_LIBS=OFF .
make

If you want both, it's as easy as running both back-to-back:

cmake -DBUILD_SHARED_LIBS=OFF .
make
cmake -DBUILD_SHARED_LIBS=ON .
make

To build on macOS, you will need to specify the install location of your GCC compiler. An example is given below:

cmake -DCMAKE_C_COMPILER=/path/to/gcc .
make

The library (both shared and static) is installed to the default install path. If you'd like to change the installation location, add -DCMAKE_INSTALL_PREFIX=/prefix/path to the cmake commands above.

make install

Makefile Build (Legacy)

There is also a custom Makefile, included to support use cases where CMake is disallowed.

Shared Library

To build the AwFmIndex shared library:

make -f Makefile_legacy

To point the build at a specific version of GCC, which is necessary on a Mac, use the following:

make -f Makefile_legacy CC=/path/to/gcc

Static Library

To build a static library, just use the static target:

make -f Makefile_legacy static

This will generate two static libraries, libawfmindex.a and libdivsufsort64.a, plus the associated header files in the build/ directory.

To point the build at a specific version of GCC, which is necessary on a Mac, use the following:

make -f Makefile_legacy static CC=/path/to/gcc

Install

After the library has been built, you can install both shared and static files so that your compiler can find them. To install into the default location (/usr/local/):

sudo make -f Makefile_legacy install

To specify a custom install location:

make -f Makefile_legacy install PREFIX=$HOME/.local

If AwFmIndex is installed to a non-default location, you may need to set environment variables so that GCC can find the project .h files at compile time and your software can find the .so files at runtime.

export C_INCLUDE_PATH=$HOME/.local/include
export LD_LIBRARY_PATH=$HOME/.local/lib

AwFmIndex Quick Start Guide

The full public API can be found in the header src/AwFmIndex.h. Full examples can be found in the tuning/build and tuning/search directories.

Making an .awfmi index file

To create an .awfmi index, use the awFmCreateIndex() or awFmCreateIndexFromFasta() functions. These functions will take a given sequence (or fasta containing one or more sequences), and create an index file that conforms to the given configuration. These functions will overwrite the given fileSrc.

enum AwFmReturnCode awFmCreateIndex(struct AwFmIndex *restrict *index,
  struct AwFmIndexConfiguration *restrict const config, const uint8_t *restrict const sequence,
  const size_t sequenceLength, const char *restrict const fileSrc);

or, to generate an index from all sequences in a well-formed fasta file,

enum AwFmReturnCode awFmCreateIndexFromFasta(struct AwFmIndex *restrict *index,
  struct AwFmIndexConfiguration *restrict const config, const char *fastaSrc,
  const char *restrict const indexFileSrc);

As with all functions that return an 'enum AwFmReturnCode', make sure to check the return code to determine whether the call succeeded. The function prototypes in the header files contain robust documentation about the functions and their possible return codes.

The configuration struct is shown here, and its fields are described below:

struct AwFmIndexConfiguration{
  uint8_t               suffixArrayCompressionRatio;
  uint8_t               kmerLengthInSeedTable;
  enum AwFmAlphabetType alphabetType;
  bool                  keepSuffixArrayInMemory;
  bool                  storeOriginalSequence;
};

suffixArrayCompressionRatio represents how much to compress the suffix array to reduce the size of the .awfmi file on disk. As an example, a value of 8 will tell AwFmIndex to build a suffix array where 1/8th of the suffix array is sampled, and on average, each hit will take 8 additional backtrace operations to find the actual database sequence position. When the suffix array is left on disk rather than kept in memory, this value has no effect on memory usage during index searches.
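To illustrate why a sampled suffix array costs extra backtrace steps, here is a toy sketch in plain C that does not use the AwFmIndex API. It assumes that every ratio-th row of the suffix array is kept (the README does not specify the sampling scheme), builds the suffix array and BWT naively for a short '$'-terminated text, and steps backward with LF-mapping until it reaches a sampled row. The names toyLocate, compareSuffixes, and MAX_TOY_LENGTH are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_TOY_LENGTH 64

static const char *sortText; // text whose suffixes are being sorted

static int compareSuffixes(const void *a, const void *b) {
  return strcmp(sortText + *(const uint32_t *)a, sortText + *(const uint32_t *)b);
}

// Toy illustration only (not the AwFmIndex implementation).
// Returns the text position of the suffix at suffix array row `row`,
// reading the suffix array only at rows that are multiples of `ratio`.
// If numBacktraceSteps is not NULL, the number of LF steps is written to it.
uint32_t toyLocate(const char *text, uint32_t row, uint32_t ratio,
                   uint32_t *numBacktraceSteps) {
  const uint32_t n = (uint32_t)strlen(text);
  assert(n > 0 && n <= MAX_TOY_LENGTH && text[n - 1] == '$');
  assert(ratio > 0 && row < n);

  // Naive construction: sort all suffixes, then derive the BWT.
  uint32_t suffixArray[MAX_TOY_LENGTH];
  char bwt[MAX_TOY_LENGTH];
  for (uint32_t i = 0; i < n; i++) suffixArray[i] = i;
  sortText = text;
  qsort(suffixArray, n, sizeof(uint32_t), compareSuffixes);
  for (uint32_t i = 0; i < n; i++) bwt[i] = text[(suffixArray[i] + n - 1) % n];

  uint32_t steps = 0;
  while (row % ratio != 0) { // this row was not sampled, so step backward
    const char c = bwt[row];
    uint32_t smaller = 0, rank = 0;
    for (uint32_t i = 0; i < n; i++) if (bwt[i] < c) smaller++;
    for (uint32_t i = 0; i < row; i++) if (bwt[i] == c) rank++;
    row = smaller + rank; // LF-mapping: row of the suffix one position earlier
    steps++;
  }
  if (numBacktraceSteps != NULL) *numBacktraceSteps = steps;
  // Each step moved one position backward in the text, so add the steps back.
  return (suffixArray[row] + steps) % n;
}
```

For the text "ACGTACGA$" with a ratio of 4, rows 0, 4, and 8 are sampled; locating row 5 takes two backward steps before landing on sampled row 0. A larger ratio shrinks the stored array but lengthens these walks.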

kmerLengthInSeedTable represents the length of the kmers to memoize in a lookup table to speed up queries. Higher values will speed up searches, but will take exponentially more memory. A value of 12 (268MB lookup table) is recommended for nucleotide indices, and a value of 5 (51MB) is recommended for protein indices. Increasing this value by one will result in a 4x table size for nucleotide indices, and a 20x table size for protein indices.
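The quoted table sizes can be reproduced with a little arithmetic. The sketch below assumes 16 bytes per table entry (for example, two 64-bit values describing a suffix array range); that entry size is inferred from the 268MB and 51MB figures and is not taken from the library source. The function name estimateSeedTableBytes is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

// Assumption: 16 bytes per memoized kmer, inferred from the sizes quoted above.
#define ASSUMED_BYTES_PER_SEED_ENTRY 16u

// Estimates the lookup table size as alphabetSize^kmerLength entries
// multiplied by the assumed entry size.
uint64_t estimateSeedTableBytes(uint64_t alphabetSize, uint8_t kmerLength) {
  assert(alphabetSize > 0);
  uint64_t numEntries = 1;
  for (uint8_t i = 0; i < kmerLength; i++) {
    numEntries *= alphabetSize;
  }
  return numEntries * ASSUMED_BYTES_PER_SEED_ENTRY;
}
```

Under this assumption, estimateSeedTableBytes(4, 12) gives 268,435,456 bytes and estimateSeedTableBytes(20, 5) gives 51,200,000 bytes, matching the recommended settings. Going from 12 to 13 for nucleotides quadruples the estimate to roughly 1GB.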

alphabetType allows the user to set the type of index to make. Options are AwFmAlphabetNucleotide and AwFmAlphabetAmino.

keepSuffixArrayInMemory determines if the compressed suffix array is loaded into memory, or left on disk. Keeping the suffix array in memory will consume a lot of memory (8 bytes per position in the database sequence), but will speed up searches by not having to go to disk for the final position lookup of each hit. An index made from an average mammalian nucleotide genome with this flag set to true will consume around 28GB of additional memory.

storeOriginalSequence determines if the original sequence data will be saved inside the index. If this is false, the sequence is omitted, which will generate a smaller index file. If true, sections of the original sequence can be recalled with the awFmReadSequenceFromFile() function.

To use awFmCreateIndex or awFmCreateIndexFromFasta, pass a pointer to an unallocated AwFmIndex pointer. The function will allocate memory for the index, build it in memory, and write it to the given fileSrc. The AwFmIndex struct is usable immediately after calling this function, and must be manually deallocated with awFmDeallocIndex.

Loading an existing Index

To load an existing .awfmi file, use the function

enum AwFmReturnCode awFmReadIndexFromFile(struct AwFmIndex *restrict *restrict index, const char *fileSrc,
  const bool keepSuffixArrayInMemory);

The index argument should be a pointer to an unallocated index pointer, just as with awFmCreateIndex(). Setting the keepSuffixArrayInMemory flag to true will load the compressed suffix array into memory along with the rest of the index.

Querying batches of kmers in parallel

To query for batches of kmers, create an AwFmKmerSearchList struct with the function:

struct AwFmKmerSearchList *awFmCreateKmerSearchList(const size_t capacity);

capacity is the number of kmers the search data struct can hold.

Once the AwFmKmerSearchList struct has been allocated and initialized with the above function, load the kmer strings you wish to query for. Example code to do this is given below.

struct AwFmKmerSearchList *loadKmers(char **kmerStrings, uint32_t *kmerStringLengths, uint32_t numKmers){
  struct AwFmKmerSearchList * searchList = awFmCreateKmerSearchList(numKmers);
  for(size_t i = 0; i < numKmers; i++){
    searchList->kmerSearchData[i].kmerString = kmerStrings[i];
    searchList->kmerSearchData[i].kmerLength = kmerStringLengths[i];
  }
  searchList->count = numKmers;

  return searchList;
}

Locate function

To locate all instances of the kmers against an AwFmIndex, use the awFmParallelSearchLocate function.

enum AwFmReturnCode awFmParallelSearchLocate(const struct AwFmIndex *restrict const index,
 struct AwFmKmerSearchList *restrict const searchList, uint8_t numThreads);

When the function returns, each kmer will have been queried, and the list of locations for kmer x can be found in searchList->kmerSearchData[x].positionList, with the number of hits in searchList->kmerSearchData[x].count. The numThreads argument tells the algorithm how many threads to use. The optimal setting will vary from system to system, but 4 or 8 seem to be good choices on many systems. In the unlikely event of a failure to read from disk, this function will return AwFmFileReadFail. Otherwise, it will return AwFmSuccess.

To print the positions in the database sequence where a kmer at a given index was found:

// requires <stdio.h> and <inttypes.h>
void printKmerHitPositions(struct AwFmKmerSearchList *searchList, size_t kmerIndex){
  const uint32_t numHitPositions = searchList->kmerSearchData[kmerIndex].count;
  const uint64_t *positionList =
    searchList->kmerSearchData[kmerIndex].positionList;

  for(size_t i = 0; i < numHitPositions; i++){
    // positions are uint64_t, so print them with PRIu64 rather than %zu
    printf("kmer at index %zu found at database position %" PRIu64 ".\n",
      kmerIndex, positionList[i]);
  }
}

Count function

The count function is used similarly to the locate function.

void awFmParallelSearchCount(const struct AwFmIndex *restrict const index,
  struct AwFmKmerSearchList *restrict const searchList, uint8_t numThreads);

After querying for the count, the number of occurrences of each kmer is located in the count member of the corresponding AwFmKmerSearchData struct. Example code is found below.

void printAllKmerCounts(struct AwFmKmerSearchList *searchList){
  const size_t numKmers = searchList->count;
  for(size_t i = 0; i < numKmers; i++){
    const uint32_t thisKmerCount = searchList->kmerSearchData[i].count;
    printf("kmer at index %zu has %u counts\n", i, thisKmerCount);
  }
}

Deallocating the AwFmKmerSearchList

When finished using the AwFmKmerSearchList struct, deallocate it with the awFmDeallocKmerSearchList function.

void awFmDeallocKmerSearchList(struct AwFmKmerSearchList *restrict const searchList);

Reading back sections of the database sequence

If you would like to read sections of the database sequence around a given position, use the function:

enum AwFmReturnCode awFmReadSequenceFromFile(const struct AwFmIndex *restrict const index,
  const size_t sequenceStartPosition, const size_t sequenceSegmentLength,
  char *const sequenceBuffer);

index is the AwFmIndex to query.

sequenceStartPosition is the position of the first character to be included in the window.

sequenceSegmentLength is the length of the sequence segment to read.

sequenceBuffer is a preallocated buffer large enough to fit the sequence segment and its null terminator. Once populated, the buffer is null terminated.

The total number of characters read from the file equals sequenceSegmentLength. Requesting a segment that extends past the end of the sequence (that is, sequenceStartPosition + sequenceSegmentLength greater than the length of the sequence) can result in undefined behavior. This function will return the error code AwFmUnsupportedVersionError if the index did not store the sequence data (i.e., the configuration's storeOriginalSequence member variable was set to false when the index was generated).
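Because reading past the end of the sequence is undefined, callers may want to clamp the request before calling awFmReadSequenceFromFile(). The following is a minimal sketch of such a guard in plain C. Both helper names are hypothetical and not part of the AwFmIndex API, and the caller is assumed to already know the total sequence length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Returns how many characters can safely be requested starting at
// startPosition, given the total length of the database sequence.
size_t clampSegmentLength(size_t sequenceLength, size_t startPosition,
                          size_t requestedLength) {
  if (startPosition >= sequenceLength) {
    return 0; // nothing to read at or past the end of the sequence
  }
  const size_t remaining = sequenceLength - startPosition;
  return requestedLength < remaining ? requestedLength : remaining;
}

// Returns the buffer size needed for a segment plus its null terminator.
size_t segmentBufferSize(size_t segmentLength) {
  assert(segmentLength < SIZE_MAX); // guard against overflow when adding 1
  return segmentLength + 1;
}
```

For example, with a 100-character sequence, a request for 20 characters starting at position 90 would be clamped to 10 characters, and the buffer would need 11 bytes.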

Optional functionality

Depending on the user-set configuration parameters, these functions may apply.

If an AwFmIndex is built from a fasta file, it will keep track of the sequence lengths, and is able to determine which sequence a hit is located in, and the position inside that sequence.

enum AwFmReturnCode awFmGetLocalSequencePositionFromIndexPosition(const struct AwFmIndex *restrict const index,
  size_t globalPosition, size_t *sequenceNumber, size_t *localSequencePosition);

index is the AwFmIndex used.

globalPosition is the position returned as the result of a locate() search.

sequenceNumber is an out-argument where the index of the sequence the hit falls inside will be written.

localSequencePosition is an out-argument where the position in that sequence that corresponds to the globalPosition will be written.

If this function is called on an index that wasn't built from a fasta file, it will return the error code AwFmUnsupportedVersionError.
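Conceptually, this mapping amounts to a search over the cumulative sequence boundaries. The sketch below shows one way it could work in plain C. It is not the library's implementation, and it assumes the sequences are concatenated with no separator characters. The struct, function, and example data are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

// Hypothetical result type for the sketch.
struct LocalPosition {
  bool found;                   // false if globalPosition is out of range
  size_t sequenceNumber;        // index of the sequence containing the hit
  size_t localSequencePosition; // offset of the hit inside that sequence
};

// Example data: three sequences of lengths 10, 15, and 15.
// sequenceEnds[i] is the global position one past the end of sequence i.
static const size_t exampleSequenceEnds[3] = {10, 25, 40};

// Binary search for the first sequence whose end lies beyond globalPosition.
struct LocalPosition globalToLocalPosition(const size_t *sequenceEnds,
                                           size_t numSequences,
                                           size_t globalPosition) {
  struct LocalPosition result = {false, 0, 0};
  if (numSequences == 0 || globalPosition >= sequenceEnds[numSequences - 1]) {
    return result;
  }
  size_t low = 0, high = numSequences - 1;
  while (low < high) {
    const size_t mid = low + (high - low) / 2;
    if (globalPosition < sequenceEnds[mid]) {
      high = mid;
    } else {
      low = mid + 1;
    }
  }
  result.found = true;
  result.sequenceNumber = low;
  result.localSequencePosition =
      globalPosition - (low == 0 ? 0 : sequenceEnds[low - 1]);
  return result;
}
```

With the example boundaries, global position 12 falls in sequence 1 at local position 2, and global position 40 is out of range.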

If an index is built from a fasta, the sequence headers can also be retrieved.

enum AwFmReturnCode awFmGetHeaderStringFromSequenceNumber(const struct AwFmIndex *restrict const index,
  size_t sequenceNumber, char **headerBuffer, size_t *headerLength);

index is the AwFmIndex used.

sequenceNumber is the index of the relevant sequence.

headerBuffer is a pointer to the char* variable that this function will set to the beginning of the header.

headerLength is an out-argument where the function will write the length of the header.

This function also returns AwFmUnsupportedVersionError if the index was not built from a fasta file.

Usage Example

Here is an example of creating an AwFmIndex and using it to search for the queries given as command-line arguments. If you already have an index you'd like to use instead of generating a new one, use awFmReadIndexFromFile().

#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "AwFmIndex.h"

int main(int argc, char **argv){

  const char *indexFileSrc      = "indexFiles/index.awfmi";
  const char *fastaInputFileSrc = "fastas/dnaSequence.fasta";
  const uint8_t numThreads      = 4;
  struct AwFmIndex *index;
  struct AwFmIndexConfiguration config = {.suffixArrayCompressionRatio = 8,
      .kmerLengthInSeedTable   = 12,
      .alphabetType            = AwFmAlphabetNucleotide,
      .keepSuffixArrayInMemory = false,
      .storeOriginalSequence   = true};
  enum AwFmReturnCode returnCode = awFmCreateIndexFromFasta(&index, &config, fastaInputFileSrc, indexFileSrc);
  if(awFmReturnCodeIsFailure(returnCode)){
    printf("create index failed with return code %d\n", (int)returnCode);
    exit(1);
  }

  uint64_t numQueries = argc-1; //first argument is the name of the executable, so skip it.
  struct AwFmKmerSearchList *searchList = awFmCreateKmerSearchList(numQueries);
  if(searchList == NULL){
    printf("could not allocate memory for search list\n");
    exit(2);
  }

  searchList->count = numQueries;
  //initialize the queries in the search list
  for(size_t i = 0; i < numQueries; i++){
    searchList->kmerSearchData[i].kmerLength = strlen(argv[i+1]);
    searchList->kmerSearchData[i].kmerString = argv[i+1];
  }

  //search for the queries
  returnCode = awFmParallelSearchLocate(index, searchList, numThreads);
  if(awFmReturnCodeIsFailure(returnCode)){
    printf("parallel search failed with return code %d\n", (int)returnCode);
    exit(3);
  }

  //print all the hits.
  for(size_t kmerIndex = 0; kmerIndex < searchList->count; kmerIndex++){
    const uint32_t numHitPositions = searchList->kmerSearchData[kmerIndex].count;
    const uint64_t *positionList =
      searchList->kmerSearchData[kmerIndex].positionList;

    for(size_t i = 0; i < numHitPositions; i++){
      printf("kmer at index %zu found at database position %" PRIu64 ".\n",
        kmerIndex, positionList[i]);
    }
  }

  //code cleanup
  awFmDeallocKmerSearchList(searchList);
  awFmDeallocIndex(index);

  exit(0);
}

avxwindowfmindex's Issues

Submodule divsufsort is missing OpenMP symbols when statically built

When building a simple example and attempting to statically link in the libraries using the instructions from the README, the following error occurs:

/usr/bin/ld: ../AvxWindowFmIndex/build/libdivsufsort64.a(divsufsort.o): in function `sort_typeBstar._omp_fn.0':
divsufsort.c:(.text+0x40): undefined reference to `omp_get_thread_num'
/usr/bin/ld: divsufsort.c:(.text+0x7c): undefined reference to `GOMP_critical_name_end'
/usr/bin/ld: divsufsort.c:(.text+0xcf): undefined reference to `GOMP_critical_name_start'
/usr/bin/ld: divsufsort.c:(.text+0x12f): undefined reference to `GOMP_critical_name_end'
/usr/bin/ld: ../AvxWindowFmIndex/build/libdivsufsort64.a(divsufsort.o): in function `sort_typeBstar':
divsufsort.c:(.text+0x39d): undefined reference to `omp_get_max_threads'
/usr/bin/ld: divsufsort.c:(.text+0x421): undefined reference to `GOMP_parallel'

make numThreads optional

Since the fastest setting for numThreads is always just "how many threads you have", this setting should default to detecting how many threads are available and setting the value accordingly. It's important to still be able to set this explicitly, too, for certain users.
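One possible shape for this, sketched in plain C: treat a numThreads of 0 as "pick for me" and fall back to the number of online processors. The function name and the 0-means-default convention are hypothetical and not part of the current API. The sketch relies on sysconf(_SC_NPROCESSORS_ONLN), which is available on Linux and macOS but is not required by POSIX.

```c
#include <stdint.h>
#include <unistd.h>

// Hypothetical helper: returns the caller's explicit thread count, or the
// number of online processors when the caller passes 0.
uint8_t resolveNumThreads(uint8_t requestedThreads) {
  if (requestedThreads != 0) {
    return requestedThreads; // an explicit setting always wins
  }
  const long onlineProcessors = sysconf(_SC_NPROCESSORS_ONLN);
  if (onlineProcessors < 1) {
    return 1; // detection failed, fall back to a single thread
  }
  // numThreads is a uint8_t in the search API, so cap the detected value.
  return onlineProcessors > UINT8_MAX ? UINT8_MAX : (uint8_t)onlineProcessors;
}
```

Since the library already depends on OpenMP, omp_get_max_threads() would be another reasonable source for the default.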

Fix CI build for Mac using CMake

The CMake build for Mac isn't getting the right C compiler, it seems, because it's complaining about OpenMP.

Installing header files omits "FastaVectorMetadataVector.h" and "FastaVectorString.h"

When running make install I get the following output:

[  9%] Built target divsufsort64
[ 16%] Built target fastavector_static
[ 36%] Built target awfmindex_static
[ 43%] Built target fastavector
[ 63%] Built target awfmindex
[ 72%] Built target fileReadTest
[ 81%] Built target fileWriteTest
[ 90%] Built target insertTest
[100%] Built target divsufsort
Install the project...
-- Install configuration: ""
-- Installing: /usr/local/lib/libawfmindex.so
-- Set runtime path of "/usr/local/lib/libawfmindex.so" to ""
-- Installing: /usr/local/include/AwFmIndex.h
-- Installing: /usr/local/lib/libawfmindex_static.a
-- Up-to-date: /usr/local/include/AwFmIndex.h
-- Installing: /usr/local/lib/libfastavector.so
-- Installing: /usr/local/include/FastaVector.h
-- Installing: /usr/local/lib/pkgconfig/libdivsufsort.pc
-- Installing: /usr/local/lib/pkgconfig/libdivsufsort64.pc
-- Installing: /usr/local/include/divsufsort.h
-- Installing: /usr/local/include/divsufsort64.h
-- Installing: /usr/local/lib/libdivsufsort.so.3.0.1
-- Installing: /usr/local/lib/libdivsufsort.so.3
-- Installing: /usr/local/lib/libdivsufsort.so
-- Installing: /usr/local/lib/libdivsufsort64.so.3.0.1
-- Installing: /usr/local/lib/libdivsufsort64.so.3
-- Installing: /usr/local/lib/libdivsufsort64.so

On compilation when including AwFmIndex.h I get the following error:

In file included from /usr/local/include/AwFmIndex.h:9,
                 from umap-generate-index.c:5:
/usr/local/include/FastaVector.h:10:10: fatal error: FastaVectorMetadataVector.h: No such file or directory
   10 | #include "FastaVectorMetadataVector.h"
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.

FastaVector.h includes both FastaVectorMetadataVector.h and FastaVectorString.h, but neither is installed into my include path.

Let me know if you need any additional information.

Submodule divsufsort is not compiled with position independent code (-fPIC), cannot be used as a static library to generate a shared object

When building the static libraries using:

cmake -DBUILD_SHARED_LIBS=OFF .
make

Trying to link them into my project produces the following error:

/home/ericr/mambaforge/compiler_compat/ld: /tmp/build-via-sdist-ym7t0ssw/umap-0.1.0/AvxWindowFmIndex/build/libdivsufsort64.a(divsufsort.o): warning: relocation against `.gomp_critical_user_sssort_lock' in read-only section `.text'
/home/ericr/mambaforge/compiler_compat/ld: /tmp/build-via-sdist-ym7t0ssw/umap-0.1.0/AvxWindowFmIndex/build/libdivsufsort64.a(divsufsort.o): relocation R_X86_64_PC32 against symbol `.gomp_critical_user_sssort_lock' can not be used when making a shared object; recompile with -fPIC

remove deprecated structs from AwFmIndex.h

Deprecated structs, such as index directionality, still exist in this header.

Finish setting up API docs with Doxygen

There is a preliminary Doxyfile in the repo now, so we should get that completely working and then move the API docs out of the README and into the generated documentation.

At the same time, we might as well generate into a docs/ directory and use GitHub pages to host the HTML. That will be fancy.

Example code in README uses a "struct" tag for AwFmReturnCode when type is an enum

The example code in the README contains an erroneous type for AwFmReturnCode. It should be an enum when it is currently a struct.

Segfault from awFmParallelSearchCount from a specific kmer

I've narrowed down a segfault to a specific kmer query. Notably, if I add nucleotides after it, it does not segfault, and if I remove nucleotides from the end, it also does not. As far as I can tell, the vast majority of the other kmers I've tested work just fine.

Specifically, the segfault occurs when counting the kmer NNNNNNNNTAACC in GRCh38 chr1 (>CM000663.2 Homo sapiens chromosome 1, GRCh38 reference primary assembly). I created an index with the following configuration:

    struct AwFmIndexConfiguration config = {
        .suffixArrayCompressionRatio = 8,
        .kmerLengthInSeedTable = 12, // recommended value for nucleotides
        .alphabetType = AwFmAlphabetDna,
        .keepSuffixArrayInMemory = false,
        .storeOriginalSequence = false,
    };

And wrote it out to file. Then I ran the following code on the generated chr1.fmidx file:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <stdbool.h>
#include <string.h>

#include "AwFmIndex.h"

int main(int argc, char **argv) {
    char *indexFileName = "chr1.fmidx";
    uint8_t numThreads = 2;

    uint64_t numQueries = argc - 1; // Skip program name in the count

    struct AwFmIndex *index;
    enum AwFmReturnCode returnCode = 
        awFmReadIndexFromFile(&index, indexFileName, true);

    if(awFmReturnCodeIsFailure(returnCode)){
        printf("Could not load index from file with return code %u\n", returnCode);
        exit(1);
    }

    struct AwFmKmerSearchList *searchList = 
        awFmCreateKmerSearchList(numQueries);

    searchList->count = numQueries;
    for(size_t i = 0; i < numQueries; i++) {
        searchList->kmerSearchData[i].kmerLength = strlen(argv[i+1]);
        searchList->kmerSearchData[i].kmerString = argv[i+1];
    }

    awFmParallelSearchCount(index, searchList, numThreads);

    // Print all counts
    for(size_t kmerIndex = 0; kmerIndex < searchList->count; kmerIndex++) {
        const uint32_t numHitPositions = searchList->kmerSearchData[kmerIndex].count;
        printf("Kmer %s has %u hits\n", argv[kmerIndex+1], numHitPositions);
    }

    awFmDeallocKmerSearchList(searchList);
    awFmDeallocIndex(index);

    exit(0);
}

Where I provided the above kmer sequence as an argument.

Here is the backtrace at the segfault along with the locals.

0x000055555555834c in awFmNucleotideIterativeStepBackwardSearch (index=0x5555555614b0, range=0x7fffffffcba0, letterIndex=4 '\004') at /home/ericr/work/umap-fmindex/AvxWindowFmIndex/src/AwFmSearch.c:81
81              baseOccurrence   = index->bwtBlockList.asNucleotide[blockIndex].baseOccurrences[letterIndex];
(gdb) info locals
queryPosition = 144115188075855872
letterPrefixSum = 230481013
blockIndex = 562949953421312
localQueryPosition = 0 '\000'
newStartPointer = 230481014
baseOccurrence = 0
occurrenceVector = {1, 0, 0, 0}
vectorPopcount = 1
newStartBlock = 900316
newStartBlockPtr = 0x7ffff723a9c0 "\377\363\377\275\377\377\177\367\376\377\221\367\376\376\221\016\362\324\025\260\002\377\033B\322\027\340\300\322\"\231q"
newEndPointer = 230481012
newEndBlock = 18446744073709551472
newEndBlockPtr = 0x7f0dffffcb30 <error: Cannot access memory at address 0x7f0dffffcb30>
(gdb) bt
#0  0x000055555555834c in awFmNucleotideIterativeStepBackwardSearch (index=0x5555555614b0, 
    range=0x7fffffffcba0, letterIndex=4 '\004')
    at /home/ericr/work/umap-fmindex/AvxWindowFmIndex/src/AwFmSearch.c:81
#1  0x000055555555791f in parallelSearchExtendKmersInBlock (index=0x5555555614b0, searchList=0x555555561e20, 
    ranges=0x7fffffffcba0, threadBlockStartIndex=0, threadBlockEndIndex=1)
    at /home/ericr/work/umap-fmindex/AvxWindowFmIndex/src/AwFmParallelSearch.c:235
#2  0x0000555555557f72 in awFmParallelSearchCount._omp_fn.0 ()
    at /home/ericr/work/umap-fmindex/AvxWindowFmIndex/src/AwFmParallelSearch.c:152
#3  0x00007ffff7f7da16 in GOMP_parallel () from /lib/x86_64-linux-gnu/libgomp.so.1
#4  0x000055555555757b in awFmParallelSearchCount (index=0x5555555614b0, searchList=0x555555561e20, 
    numThreads=2 '\002') at /home/ericr/work/umap-fmindex/AvxWindowFmIndex/src/AwFmParallelSearch.c:142
#5  0x00005555555555ca in main (argc=2, argv=0x7fffffffcf08) at ubismap-count-bin.c:33
(gdb)

Specifically the blockIndex is completely out of range and will ultimately cause the segfault.

If you need any additional information please don't hesitate to ask.

Eric

Format the code and commit

Format the code with the tool/run-format.sh script and then commit the results. If we don't like anything about the formatter output, change the settings in .clang-format.

add return codes to parallel locate/parallel count

It would be nice to know if parallel locate/count errors out, for instance, from the pread invocation. This would require a change that may be breaking.

Cannot statically link to libawfmindex_static.a

The static library is missing definitions presumably from other submodules. For example here's the output when attempting to statically link on my simple example:

/usr/bin/ld: /usr/local/lib/libawfmindex_static.a(AwFmCreate.c.o): in function `awFmCreateIndex':
AwFmCreate.c:(.text+0x184): undefined reference to `divsufsort64'
/usr/bin/ld: /usr/local/lib/libawfmindex_static.a(AwFmCreate.c.o): in function `awFmCreateIndexFromFasta':
AwFmCreate.c:(.text+0x31a): undefined reference to `fastaVectorInit'
/usr/bin/ld: AwFmCreate.c:(.text+0x340): undefined reference to `fastaVectorReadFasta'
/usr/bin/ld: AwFmCreate.c:(.text+0x4a4): undefined reference to `divsufsort64'
/usr/bin/ld: AwFmCreate.c:(.text+0x5d2): undefined reference to `fastaVectorStringDealloc'

Let me know if you need any additional information

Async Suffix Array Read

Implement asynchronous reading for the suffix array, such that the suffix array can be populated while search is going.

Legacy Makefile "Makefile_legacy" does not generate all necessary static libraries with the "static" target

When running:
make -f Makefile_legacy static
The library libdivsufsort64.a and associated headers are missing.
