Code Monkey home page Code Monkey logo

super_pag's Introduction

Super_PAG

The super-PAGs will become a new and comprehensive publicly accessible non-redundant super-geneset source with curated biological context.

TO-DO List

  • Run the sample code in PAGER
  • Get familiar with the PAGER 3.0 API
  • Run the top-down optimization method in clustering_algorithms
  • Make the optimation code work with PAGER 3.0 API
  • Looking for the literature on clustering algorithms
  • Propose some new clustering methods to apply to our research
  • Visualize the subgraph, which is the result of the optimization algorithm.
  • Scraping the PAGER 3.0 to get a tree hierarchy of the PAGs like MESH.

Lead professor

Dr. Yue from Harrison College of Pharmacy. He is an assistant research professor in the Harrison College of Pharmacy's Department of Health Outcomes Research and Policy. He joined the college in March 2023. Dr. Yue's past research has focused on utilizing biological data mining, systems biology, network biology, machine learning, visual analytics, and translational informatics to improve human health. He has also constructed several web servers and applications for functional genomics downstream analysis and drug repositioning.

Research Strategies

Visualization with Python

1.Networkx

2.PyVis

The algorithms for partitioning a weighted graph

1. BSF(Breadth-First-Search)

The well-known BFS (Breadth-First-Search) algorithm can also be used for graph partitioning. BFS algorithm traverses the graph level by level and marks each vertex with the level in which it was visited. After completion of the traversal, the set of vertices of the graph is portioned into two parts V1 and V2 by putting all vertices with level less than or equal to a pre-determined threshold L in the set V1 and putting the remaining vertices (with level greater than L) in the set V2. L is so chosen that |V1| is close to |V2|.

2. Kernighan-Lin Algorithm

The Kernighan-Lin algorithm (KL algorithm hereafter) is one of the oldest heuristic graph partitioning algorithms proposed in 1970 [6]. In the simplest possible setting, the KL algorithm takes an edge-weighted graph G = (V, E, edge-weight function c) with 2n vertices and an initial bi-partition (V1, V2) of the vertex set V where |V1| = |V2| = n and produces a new partition (V1’, V2’) such that |V1’| = |V2’| = n and the total cost of the new partition is lower than (or equal to) the cost of the original partition.

3. Fiduccia-Mattheyses Algorithm

The Fiduccia-Mattheyses algorithm (FM algorithm hereafter) is a significant advancement in the field of graph partitioning, introduced by C.M. Fiduccia and R.M. Mattheyses in 1982. This algorithm is an improvement over the Kernighan-Lin algorithm and is specifically designed for partitioning large-scale VLSI circuits.

The FM algorithm operates on an edge-weighted graph G = (V, E, edge-weight function c) and aims to optimize the partitioning of the graph's vertices into two subsets, V1 and V2, while maintaining a balance between the sizes of these subsets. Unlike the Kernighan-Lin algorithm, which swaps pairs of vertices between partitions, the FM algorithm moves individual vertices across the partition boundary. This approach allows for more granular adjustments, often leading to better optimization in fewer iterations.

A key feature of the FM algorithm is its use of a data structure called a gain bucket, which efficiently identifies the vertices whose movement would most decrease the cut size (i.e., the number of edges crossing the partition boundary). The algorithm iteratively moves the vertex with the highest gain from one partition to the other, updating the gains of adjacent vertices as it proceeds.

4. Spectral Bisection Algorithm

The Spectral Bisection Algorithm is a method used in graph partitioning, which utilizes the spectral properties of graphs to find an optimal bisection. This algorithm is particularly effective for partitioning sparse graphs and is widely used in various applications, including parallel computing, VLSI design, and network analysis.

The Spectral Bisection Algorithm operates based on the following steps:

4.1 Laplacian Matrix: It begins by constructing the Laplacian matrix L of the graph G. The Laplacian matrix is defined as L = D - A, where D is the degree matrix and A is the graph's adjacency matrix.

4.2 Eigenvalues and Eigenvectors: The algorithm then computes the eigenvalues and eigenvectors of the Laplacian matrix. The second smallest eigenvalue (known as the Fiedler value) and its corresponding eigenvector (the Fiedler vector) are of particular interest.

4.3 Partitioning: The graph is partitioned into two sets based on the sign of the components of the Fiedler vector. Nodes corresponding to positive components of the Fiedler vector are placed in one set, and those corresponding to negative components are placed in the other set.

PAGER 3.0 Dateset API

Relate works

  1. PAGER: constructing PAGs and new PAG–PAG relationships for network biology
  2. PAGER 2.0: an update to the pathway, annotated-list and gene-signature electronic repository for Human Network Biology

super_pag's People

Contributors

qilimk avatar

Watchers

 avatar Carlos Rabat avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.