Comments (4)

ekg commented on September 25, 2024

@lh3, we have developed a library to provide a standard interface to sequence graphs with embedded paths, https://github.com/vgteam/libhandlegraph.

The idea with this interface hierarchy is to expose something based on a few primitive types without needing to implement the data structure using those types. For instance, we often represent graphs using fully succinct data structures, but this means that entities in the graph can't be represented as pointers to nodes or atomic IDs. The handle concept refers to the bidirectional identifier used by a particular implementation to refer to a node (S line) in the graph.
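A minimal sketch of the handle idea, assuming that "bidirectional" means a handle names one orientation of a node and that an implementation is free to pack a node rank and a strand bit into a single integer. The names `Handle`, `make_handle`, `rank_of`, `is_reverse`, `flip` and `sequence_of` are hypothetical and are not taken from libhandlegraph; the packing scheme is only one possible choice.

```python
from dataclasses import dataclass

# Translation table used to complement DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


@dataclass(frozen=True)
class Handle:
    """Opaque reference to one orientation of a node (hypothetical type)."""
    packed: int  # node rank in the upper bits, orientation flag in bit 0


def make_handle(rank: int, is_reverse: bool = False) -> Handle:
    """Pack a node rank and an orientation into one handle."""
    return Handle((rank << 1) | int(is_reverse))


def rank_of(handle: Handle) -> int:
    """Recover the node rank from a handle."""
    return handle.packed >> 1


def is_reverse(handle: Handle) -> bool:
    """True if the handle refers to the reverse strand of its node."""
    return bool(handle.packed & 1)


def flip(handle: Handle) -> Handle:
    """Same node, opposite orientation."""
    return Handle(handle.packed ^ 1)


def sequence_of(sequences, handle: Handle) -> str:
    """Sequence as read along the handle's orientation."""
    forward = sequences[rank_of(handle)]
    if is_reverse(handle):
        return forward.translate(_COMPLEMENT)[::-1]
    return forward
```

Callers only ever hold `Handle` values and go through the accessor functions, so the backing store (here a plain list of sequences) could be swapped for a succinct structure without changing client code.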

The class hierarchy includes immutable sequence graphs, graphs with paths (VG model), and mutable versions of them. It also exposes a positional index based on the embedded paths.
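The layering described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the libhandlegraph class hierarchy: the class names `SequenceGraph`, `PathGraph`, `MutablePathGraph` and `DictGraph`, and the helper `step_offsets`, are all hypothetical, and edges are left out to keep the example short.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List, Tuple

Step = Tuple[int, bool]  # (node id, is_reverse); hypothetical representation


class SequenceGraph(ABC):
    """Read-only access to nodes and their sequences."""

    @abstractmethod
    def node_ids(self) -> Iterable[int]: ...

    @abstractmethod
    def sequence(self, node_id: int) -> str: ...


class PathGraph(SequenceGraph):
    """Adds read-only access to embedded paths."""

    @abstractmethod
    def path_names(self) -> Iterable[str]: ...

    @abstractmethod
    def steps(self, path_name: str) -> List[Step]: ...


class MutablePathGraph(PathGraph):
    """Adds mutation on top of the read-only layers."""

    @abstractmethod
    def add_node(self, node_id: int, sequence: str) -> None: ...

    @abstractmethod
    def append_step(self, path_name: str, step: Step) -> None: ...


class DictGraph(MutablePathGraph):
    """Toy dictionary-backed implementation of all three layers."""

    def __init__(self) -> None:
        self._seqs: Dict[int, str] = {}
        self._paths: Dict[str, List[Step]] = {}

    def node_ids(self):
        return self._seqs.keys()

    def sequence(self, node_id):
        return self._seqs[node_id]

    def path_names(self):
        return self._paths.keys()

    def steps(self, path_name):
        return list(self._paths[path_name])

    def add_node(self, node_id, sequence):
        self._seqs[node_id] = sequence

    def append_step(self, path_name, step):
        self._paths.setdefault(path_name, []).append(step)


def step_offsets(graph: PathGraph, path_name: str) -> List[int]:
    """Start offset of every step along a path, using only the read-only API."""
    offsets, position = [], 0
    for node_id, _is_reverse in graph.steps(path_name):
        offsets.append(position)
        position += len(graph.sequence(node_id))
    return offsets
```

Because `step_offsets` is written against `PathGraph` only, it would work unchanged on any implementation of that layer, which is the point of separating the read-only, path and mutable interfaces.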

Two implementations are based on reading GFA files into a self index and exposing aspects of this API on top of them (xg and odgi). We have a study in progress to compare implementations.

It should be easy enough to add a simpler fixed C and C++ interface on top of these. I don't think the semantics become radically different. There is a mismatch with the number of coordinate spaces. There are some semantic mismatches with rGFA, but they can be resolved.

from gfatools.

lh3 commented on September 25, 2024

An important question is about the scope of the library. vg is too large. I think in its current form, libhandlegraph is too small. My preference is to include at least a GFA parser and an in-memory data structure like handle graph. I don't have a strong opinion on serialization, indexing and other features.
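To give a sense of how small such a core could be, here is a rough sketch of a GFA parser feeding an in-memory graph. It assumes GFA 1 input and reads only `S` lines (name, sequence) and `L` lines (from, from-orientation, to, to-orientation, overlap); tags, overlaps and all other record types are ignored. The function name `parse_gfa` and the dictionary layout are hypothetical and do not reflect gfatools or libhandlegraph.

```python
from collections import defaultdict

_FLIP = {"+": "-", "-": "+"}


def parse_gfa(lines):
    """Parse S and L lines of GFA 1 into (segments, links).

    segments maps a segment name to its sequence.
    links maps an oriented segment (name, orientation) to the list of
    oriented segments reachable from it in one step.
    """
    segments = {}
    links = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S" and len(fields) >= 3:
            segments[fields[1]] = fields[2]
        elif fields[0] == "L" and len(fields) >= 5:
            src = (fields[1], fields[2])
            dst = (fields[3], fields[4])
            links[src].append(dst)
            # A link also implies its complement: leaving dst on the
            # opposite strand arrives at src on the opposite strand.
            comp_src = (dst[0], _FLIP[dst[1]])
            comp_dst = (src[0], _FLIP[src[1]])
            if (comp_src, comp_dst) != (src, dst):
                links[comp_src].append(comp_dst)
    return segments, dict(links)
```

Storing both a link and its complement lets a traversal walk either strand with the same lookup, at the cost of roughly doubling the adjacency lists.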

Another question is about the terminology. The use of "(sequence) segment" and "link" can be traced back to the discussion on the FASTG format. Richard and I wanted to avoid "vertex", "node", "edge" and "arc" because in the assembly world, people always have different opinions. In a de Bruijn graph, "vertex" and "edge" are interchangeable to some extent, and as a result, a graph simplified from a de Bruijn graph is more often represented in the "edge way", with sequences put on edges instead of nodes. Adopting the GFA terminology will help to avoid such confusion.

ekg commented on September 25, 2024

bricoletc commented on September 25, 2024

How about https://github.com/edawson/gfakluge ? Though I don't think it supports rGFA.
