Code Monkey home page Code Monkey logo

csrgraph's Introduction

Build Status

CSRGraphs

Fast and memory efficient library for large read-only graphs.

By exploiting CSR Sparse matrices, we can efficiently pack and traverse graphs in memory (as long as we don't add/delete nodes often).

Installation

pip install csrgraph

Reading in Graphs

You can read a graph from an edgelist:

import csrgraph as cg

G = cg.read_edgelist("path_to_file.csv", directed=False, sep=',')
# Node names are stored and accessible
G['cat']

Other objects that can be put in the csrgraph constructor:

  • NetworkX Graph. CSRGraph will support edge weights natively.
  • Numpy dense matrix Should be square, one row/column per node.
  • CSR Matrix. Same as above. This also supports the (data, indices, indptr) format.

Methods on graphs

Currently, we support very fast random walks with G.random_walks(). The GloVe, GraRep and GGVec graph embedding algorithms are also directly accessible from the graph object.

CSGraph methods

You can access the underlying scipy.sparse.csr_matrix with the .mat accessor.

import csrgraph as cg

G = cg.csrgraph(data)
G.mat # This is the underlying scipy.sparse.csr_matrix

All the procedures in scipy csgraph module here will function directly on the G.mat object.

Gotchas

All graphs are directed. We support undirected graphs by adding "return edges" on each edge. The only issue is that this doubles the number of edges. You'll generally find CSRGraph's efficient memory use and fast methods makes up for this design.

4.2B Node limit. You can't currently have more nodes than UINT32_MAX (around 4.2billion) -- you'll run out of node IDs. You can have as many edges as will fit in RAM however.

Only float edge weights Eventually we might support complex edge weight objects, but for now we only support 32bit floats.

Accessing nodes by their name is O(log2(n)) Internally, CSRGraphs' map of node names is a (binary searched) sorted array indexing into node ID. Since node names are accessed much more often than than node IDs by name, this is a worthy tradeoff, but it may be unexpected if you heavily use the G[node_name] operator.

csrgraph's People

Contributors

vhranger avatar cthoyt avatar rtbs-dev avatar eloaf avatar maq-ravijit-ramana avatar h920032 avatar jdm365 avatar wangbingnan136 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.