Bio4j bioinformatics graph data platform

Bio4j is a bioinformatics graph data platform that integrates most of the data available in UniProt KB (Swiss-Prot + TrEMBL), Gene Ontology (GO), UniRef (50, 90, 100), NCBI Taxonomy, and Expasy Enzyme DB.

Bio4j provides a new and powerful framework for querying and managing protein-related information. Its graph-based data model makes it possible to store and query data in a way that semantically represents the data's own structure. In contrast, traditional relational models and databases must flatten the data they represent into tables, creating artificial ids to connect the different tuples, which can in some cases lead to domain models that have almost nothing to do with the actual structure of the data.

Project structure and overview

Bio4j can look a bit intimidating at first, with so many similarly named repositories; here is a guided tour:

bio4j/bio4j

In the bio4j/bio4j repository you will find the generic Bio4j model and API. Entities, relationships and their properties are modeled using a typed property graph model. For example, there are vertex types for Protein and GoTerm, and a GoAnnotation edge type going from Protein to GoTerm. This graph schema is split into separate graphs, one for each data source (UniProt, Go, UniRef, ...) and one for each connection between them (UniProtGo, UniProtUniRef, ...).

The API, based on bio4j/angulillos, lets you write generic typed traversals over this graph schema:

protein.uniref50Member_outV()
  .map(
    UniRef50Cluster::uniRef50Member_inV
  )
  .map(
    prts -> prts.map(
      Protein::goAnnotation_outV
    )
  );

which can later be executed on a particular backend. This repository also contains generic data import code, which can load the data through any implementation of angulillos.
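To make the shape of such a traversal concrete, here is a minimal in-memory sketch. Everything in it is an assumption made for illustration: the classes below are hypothetical stand-ins, not the real Bio4j or angulillos types, and the use of Optional and Stream as return types is only a guess suggested by the chained map calls above. It also uses flatMap in the last step to flatten the result, where the snippet above uses map.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical stand-ins for illustration only; NOT the real Bio4j classes.
final class TraversalSketch {

  static final class GoTerm {
    final String id;
    GoTerm(String id) { this.id = id; }
  }

  static final class UniRef50Cluster {
    private final List<Protein> members = new ArrayList<>();

    // Registers a protein as a member and links it back to this cluster.
    void add(Protein p) { members.add(p); p.cluster = this; }

    // Sources of the incoming member edges: all proteins in this cluster.
    Stream<Protein> uniRef50Member_inV() { return members.stream(); }
  }

  static final class Protein {
    final String accession;
    final List<GoTerm> goTerms;
    UniRef50Cluster cluster; // null when the protein belongs to no cluster

    Protein(String accession, List<GoTerm> goTerms) {
      this.accession = accession;
      this.goTerms = goTerms;
    }

    // At most one UniRef50 cluster per protein, hence Optional (an assumption).
    Optional<UniRef50Cluster> uniref50Member_outV() { return Optional.ofNullable(cluster); }

    // GO terms this protein is annotated with.
    Stream<GoTerm> goAnnotation_outV() { return goTerms.stream(); }
  }

  // Sorted, distinct GO term ids of every protein sharing a UniRef50 cluster
  // with `protein` (including `protein` itself); empty if it has no cluster.
  static List<String> goTermsOfClusterMates(Protein protein) {
    return protein.uniref50Member_outV()
      .map(UniRef50Cluster::uniRef50Member_inV)
      .map(prts -> prts.flatMap(Protein::goAnnotation_outV))
      .orElseGet(Stream::empty)
      .map(term -> term.id)
      .distinct()
      .sorted()
      .collect(Collectors.toList());
  }
}
```

In this sketch the compiler checks each step: uniRef50Member_inV can only be applied to a cluster, and goAnnotation_outV only to a protein.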

bio4j/angulillos

You can think of bio4j/angulillos as a strongly typed version of the property graph model. You can describe graph schemas and write generic traversals over them which are guaranteed to be well-typed, so that, for example,

  • you cannot retrieve the outgoing edges of an edge
  • and you can get the tweets that a user tweeted, but not the users that a tweet follows!
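
The idea behind these guarantees can be sketched in plain Java generics. This is not the angulillos API: the Vertex, Edge, User, Tweet, Tweeted and Follows types and the out helper below are hypothetical names, chosen only to show how recording the source and target vertex types on each edge type lets the compiler reject ill-typed traversals.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of typed vertices and edges; NOT the angulillos API.
final class TypedGraphSketch {

  interface Vertex {}

  // An edge type fixes its source type S and its target type T.
  // Edge does not implement Vertex, so an edge can never be a traversal source.
  static abstract class Edge<S extends Vertex, T extends Vertex> {
    final S source;
    final T target;
    Edge(S source, T target) { this.source = source; this.target = target; }
  }

  static final class User implements Vertex {
    final String name;
    User(String name) { this.name = name; }
  }

  static final class Tweet implements Vertex {
    final String text;
    Tweet(String text) { this.text = text; }
  }

  // Tweeted goes from User to Tweet; Follows goes from User to User.
  static final class Tweeted extends Edge<User, Tweet> {
    Tweeted(User user, Tweet tweet) { super(user, tweet); }
  }

  static final class Follows extends Edge<User, User> {
    Follows(User follower, User followed) { super(follower, followed); }
  }

  // Targets of the edges in `edges` that leave `source`.
  // The source must have exactly the source type declared by the edge type.
  static <S extends Vertex, T extends Vertex> List<T> out(
      S source, List<? extends Edge<S, T>> edges) {
    return edges.stream()
      .filter(e -> e.source == source)
      .map(e -> e.target)
      .collect(Collectors.toList());
  }

  // Both of these are rejected at compile time:
  //   out(someTweet, followsEdges);   // a Tweet is not the source of a Follows edge
  //   out(someFollowsEdge, anyEdges); // an Edge is not a Vertex
}
```

With these definitions, out(user, tweetedEdges) has type List<Tweet> and out(user, followsEdges) has type List<User>, with no casts needed.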

bio4j/bio4j-titan

In bio4j/bio4j-titan you will find a Titan-based Bio4j distribution. This is the default distribution, and we also provide database binaries, with all data already loaded, through AWS S3. Go there if you want to stop reading and use Bio4j now!

bio4j/angulillos-titan

bio4j/angulillos-titan is an implementation of the angulillos API using Titan.

Documentation

Community and contact

Licensing

Bio4j is an open source platform released under the AGPLv3 license.

bio4j's Projects

bio4j

Bio4j abstract model and general entry point to the project

bio4j-json

JSON serialization related classes for Bio4j entities used in Java programs

deprecated-bio4jexplorer

DEPRECATED Actionscript/Flex viewer that allows to explore the node types of Bio4j and the possible relationship types between them. Please go to: https://github.com/bio4j/bio4j for up to date information

deprecated-bio4jgotools

DEPRECATED Bio4j Go Tools AIR app. Please go to: https://github.com/bio4j/bio4j for up to date information

deprecated-bio4jgotoolsweb

DEPRECATED Bio4j Go Tools web swf project. Please go to: https://github.com/bio4j/bio4j for up to date information

deprecated-bio4jmodel

DEPRECATED Model classes repo. Please go to: https://github.com/bio4j/Bio4j for up to date information

deprecated-bio4jserviceswebgui

DEPRECATED Flash viewer for general Bio4j Web services. Please go to: https://github.com/bio4j/bio4j for up to date information

deprecated-bio4jtestserver

DEPRECATED Bio4jTestServer project includes a group of Services using Bio4j as back-end. Please go to: https://github.com/bio4j/bio4j for up to date information

el-grafo

GSoC 2014 project - D3-based Bio4j data model visualization
