interplex's Introduction

tdaverse

Motivation

Topological data analysis (TDA) relies heavily on mature libraries like PHAT, Dionysus, and GUDHI. While these libraries have interfaces to Python and, through the {TDA} package, R, they have been developed primarily by and for statistical topologists. As TDA matures and standard workflows emerge, the need arises for more accessible and modular implementations. The SciKit-TDA project, an extension of SciPy, is underway in Python for this purpose. The tdaverse collection is intended to meet these needs in R through a tidyverse lens.

The tidyverse consists of numerous R packages that are built upon a shared set of syntactic and grammatical conventions and designed to interface naturally with each other. With its sibling collections r-lib and tidymodels, it provides a comprehensive toolkit for building advanced data analysis and modeling pipelines. The goal of tdaverse is to provide the data structures, computational engines, statistical models, and visualization tools needed to efficiently explore and analyze topological data in R and to integrate these tasks into tidy workflows.

Packages

Published packages

interplex

Inspired by {intergraph}, the {interplex} package provides coercers between different data structures that encode simplicial complexes, and also converts between these and graph and network structures (with the loss of 2- and higher-dimensional simplices).
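
The loss of higher-dimensional simplices when coercing to a graph can be illustrated with a minimal Python sketch. It is not the {interplex} API: the function name `one_skeleton` and the list-of-tuples input format are assumptions made for illustration only.

```python
from itertools import combinations

def one_skeleton(simplices):
    """Return (vertices, edges) of the graph underlying a simplicial complex.

    `simplices` is an iterable of vertex tuples; listing only the maximal
    simplices is enough, since every face is implied.
    """
    vertices = set()
    edges = set()
    for simplex in simplices:
        verts = sorted(set(simplex))
        vertices.update(verts)
        # Every pair of vertices in a simplex spans an edge (a 1-simplex).
        edges.update(combinations(verts, 2))
    return sorted(vertices), sorted(edges)

# A filled triangle, a dangling edge, and an isolated vertex.
complex_ = [(1, 2, 3), (3, 4), (5,)]
vertices, edges = one_skeleton(complex_)
```

The triangle (1, 2, 3) contributes its three edges to the graph, but the fact that it was filled is not recoverable from `vertices` and `edges` alone.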

This package will enable tdaverse users to couple functionality from other packages into their workflows, for example layout algorithms from {igraph} and simplicial filtrations from GUDHI (via {reticulate}).

simplextree

{simplextree} is an R package aimed at simplifying computation for simplicial complexes. The package provides R bindings to a simplex tree data structure implemented in C++11 and exported as an Rcpp module. Instances can be created from abstract or geometric data and exported and imported via serialization, and they can be efficiently inspected, queried, modified, and traversed using both Rcpp and S3 methods. The underlying library implementation also exports a C++ header, which can be specified as a dependency and used in other packages via {Rcpp} attributes.

simplextree will interface with other packages for various tasks: to sample geometric complexes based on arbitrary manifolds with {tdaunif}, to construct and update the nerves of mappers in {Mapper}, and to perform computations involving simplicial complexes stored in other formats via {interplex}.

ripserr

{ripserr} ports the Ripser and Cubical Ripser persistent homology computational engines from C++ via Rcpp. It can be used as a convenient and efficient tool in TDA pipelines involving point cloud data (Ripser) or image and volume data (Cubical Ripser).

ripserr is designed as a minimal standalone package and will be called to compute persistence data when underlying simplicial filtrations are not needed.

TDAstats

Persistent homology can be used in hypothesis testing to compare the topological structure of two point clouds. {TDAstats} uses a permutation test in conjunction with the Wasserstein metric for nonparametric statistical inference.
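
The general shape of a permutation test can be sketched in Python as follows. This is a generic illustration, not the {TDAstats} implementation: the function names are hypothetical, and the toy statistic `mean_gap` on scalar summaries stands in for a Wasserstein-based statistic on persistence diagrams, which would be supplied through the `statistic` argument.

```python
import random

def permutation_test(group_a, group_b, statistic, n_perm=999, seed=0):
    """Estimate a p-value for the null hypothesis that group labels are exchangeable."""
    rng = random.Random(seed)
    observed = statistic(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        # Reassign group labels at random by shuffling the pooled sample.
        rng.shuffle(pooled)
        if statistic(pooled[:n_a], pooled[n_a:]) >= observed:
            at_least_as_extreme += 1
    # The +1 terms count the observed labeling as one of the permutations.
    return (at_least_as_extreme + 1) / (n_perm + 1)

def mean_gap(a, b):
    """Toy statistic (an assumption): absolute difference of group means."""
    return abs(sum(a) / len(a) - sum(b) / len(b))

p_same = permutation_test([1.0, 1.0, 1.0], [1.0, 1.0, 1.0], mean_gap)
p_diff = permutation_test([0.0] * 8, [10.0] * 8, mean_gap)
```

Because the test only needs a statistic that compares two groups, the same loop applies whether the inputs are numbers, as here, or topological summaries.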

TDAstats was originally designed with three goals in mind: the calculation, statistical inference, and visualization of persistent homology. Since its release, calculation has been moved to engine ports like {ripserr} and {ggplot2}-style visualization has been moved to {ggtda}. Ongoing development of TDAstats will focus on statistical inference.

tdaunif

Methods for detecting topological structure from point cloud data sets are often validated by applying them to point clouds sampled from spaces with known topology. Functions that generate such samples are therefore valuable to developers of topological–statistical software. The goal of {tdaunif} is to assemble a comprehensive collection of such samplers for convenient use.
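
As an example of what such a sampler involves, the Python sketch below draws points uniformly with respect to surface area from a torus embedded in 3-space, using rejection sampling. It is an illustration under stated assumptions and not necessarily how {tdaunif} implements its samplers; the function name `sample_torus` and the parameters `R` (major radius) and `r` (minor radius) are hypothetical.

```python
import math
import random

def sample_torus(n, R=2.0, r=1.0, seed=0):
    """Sample n points uniformly (by area) from a torus with radii R > r > 0."""
    rng = random.Random(seed)
    points = []
    while len(points) < n:
        theta = rng.uniform(0.0, 2.0 * math.pi)  # angle around the tube
        # The area element is proportional to R + r*cos(theta), so accept
        # theta with probability (R + r*cos(theta)) / (R + r).
        if rng.random() * (R + r) > R + r * math.cos(theta):
            continue
        phi = rng.uniform(0.0, 2.0 * math.pi)  # angle around the central axis
        x = (R + r * math.cos(theta)) * math.cos(phi)
        y = (R + r * math.cos(theta)) * math.sin(phi)
        z = r * math.sin(theta)
        points.append((x, y, z))
    return points

cloud = sample_torus(200)
```

Sampling both angles uniformly without the rejection step would oversample the inner ring of the torus, which is why a dedicated sampler is useful for validation.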

In addition to testing TDA software, tdaunif will be used with {simplextree} to generate geometric random simplicial complexes and on its own as an educational tool for the study of ≥3-dimensional manifolds.

Incubating packages

landmark

The {landmark} package provides functions to calculate landmark sets for finite metric spaces using the maxmin procedure (for fixed-radius balls) or an adaptation of it for rank data (for roughly fixed-cardinality nearest neighborhoods). These procedures can also return membership lists for the covers centered at these landmark sets. These covering method engines will be invoked by {Mapper} and other arbitrary cover–based constructions.
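
The maxmin procedure mentioned above can be sketched in a few lines of Python. This is a minimal illustration with Euclidean distances, not the {landmark} API: the function name `maxmin_landmarks` and its parameters are assumptions.

```python
import math

def maxmin_landmarks(points, num, seed_index=0):
    """Greedily pick landmark indices, each one farthest from those already chosen."""
    landmarks = [seed_index]
    # Distance from every point to its nearest landmark so far.
    nearest = [math.dist(p, points[seed_index]) for p in points]
    while len(landmarks) < min(num, len(points)):
        # The next landmark maximizes the minimum distance to the current set.
        nxt = max(range(len(points)), key=nearest.__getitem__)
        landmarks.append(nxt)
        nearest = [min(d, math.dist(p, points[nxt]))
                   for d, p in zip(nearest, points)]
    return landmarks

points = [(0, 0), (1, 0), (5, 0), (10, 0)]
chosen = maxmin_landmarks(points, num=3)
```

Starting from the first point, the procedure selects the farthest point (index 3) and then the point farthest from both (index 2), so the landmarks spread across the data rather than cluster.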

Mapper

The {Mapper} package provides a set of tools for computing the mapper construction. Previous versions of this package included the simplex tree class and the maxmin procedure, which have been or are being spun off and expanded as the {simplextree} and {landmark} packages.

ggtda

The {ggtda} package provides {ggplot2} layers (statistical transformations and geometric elements) and themes for publication-quality plots of data arising from topological objects and models. Persistent homology can be computed for continuous functions and Reeb graphs as well as point clouds, and ggtda layers are in development for numerous plot types that have been proposed to gain insight from persistence data. ggtda also provides layers to conveniently plot ball covers, Vietoris–Rips complexes, and Čech complexes for 2-dimensional point clouds.

plt

{plt} provides an {Rcpp} interface to the Persistence Landscapes Toolbox. The C++ class for persistence landscapes is exposed as an Rcpp module and wrapped as an S4 class. Vector space operations and additional routines are provided through R.

Conceived packages

Cover

Covers of data sets are ubiquitous in lower-level topological methods, including mapper-like constructions. In order to allow more flexible implementations, the object-oriented package {Cover} would spin off the CoverRef R6 class from {Mapper} and introduce tools for efficiently storing and analyzing towers and other aggregates of covers.

reebit ?

Reeb graphs can be represented as graphs with height or value attributes, but few methods are available to perform basic computations like critical point pairing. For starters, an R wrapper of the ReebGraphPairing Java program could produce (extended) persistent homology for downstream analysis.

perrrsist / filtratr ?

A great advantage of GUDHI is the ability to work directly with simplicial filtrations, including to construct them from raw data and to compute persistence data from them. {ripserr} sidesteps these objects, but these operations can be performed using {TDA}. The idea for this package is to port different engines for computing and processing filtrations, analogously to {parsnip}.

morphom ?

A variety of vectorizations for persistence data have been proposed and validated, often to achieve stability properties. This package would consolidate them.

tidyplex

Analogous to {tidygraph}, this package would provide a "tidy" API to print, summarize, annotate, and perhaps visualize simplicial complexes and filtrations.

phoment ?

This package would wrap the vectorizations of {morphom} into a set of preprocessing steps for use with {recipes}.

People

Contribute

To learn more and contribute to package design or development, please visit the GitHub repositories and consider commenting on or creating an issue! Or check this list of low-, medium-, and high-hanging fruit.

Coordinators

  • Raoul R. Wadhwa (Cleveland Clinic Lerner College of Medicine, Case Western Reserve University)
  • Matt Piekenbrock (Khoury College of Computer Sciences, Northeastern University)
  • Jason Cory Brunson (Laboratory for Systems Medicine, Division of Pulmonary, Critical Care, and Sleep Medicine, University of Florida)

interplex's Issues

ensure that uses of as_list are version-specified

Between v0.9.1 and v1.0.2, the as_list() method in {simplextree} changed from producing a list of vertex vectors, one for each simplex, to producing a list of matrices, one for each dimension. Use utils::packageVersion() internally to decide whether the output needs to be reformatted.
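
The relationship between the two output formats can be sketched in Python. This is an illustration only, not {simplextree} code: the function names are hypothetical, and each "matrix" is represented as a list of vertex tuples because the source does not say whether simplices are stored as rows or columns.

```python
def group_by_dimension(simplices):
    """Convert one vertex tuple per simplex into one collection per dimension."""
    by_dim = {}
    for simplex in simplices:
        verts = tuple(sorted(simplex))
        # A simplex on k + 1 vertices has dimension k.
        by_dim.setdefault(len(verts) - 1, []).append(verts)
    return [sorted(by_dim[d]) for d in sorted(by_dim)]

def flatten(matrices):
    """Convert the per-dimension format back to one vertex tuple per simplex."""
    return [simplex for matrix in matrices for simplex in matrix]

per_simplex = [(1,), (2,), (3,), (1, 2), (2, 3), (1, 3), (1, 2, 3)]
per_dimension = group_by_dimension(per_simplex)
round_trip = flatten(per_dimension)
```

A version check would then decide which of the two conversions, if any, to apply before handing the result to downstream coercers.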

Possibly extend the list of usable formats to include the list of matrices and possibly the output of serialize(), which are not classes but have rigorous formatting rules.

support for simplicial filtrations

The package should support coercion of simplicial filtrations as well as complexes.

If we include graphs/networks annotated with compatible values, then every data structure included in the current version supports simplicial filtrations. In some cases the structures are different from those for complexes, and in some cases the structure includes information on how the filtration was derived.

Most cases should be straightforward. Some exceptions:

  • Coercers to graphs/networks will require a new parameter values analogous to index whose character argument gives the name of the attribute that will store the values/weights/heights. Probably no new generics are needed.
  • Coercers from graphs will have values receive the name of the attribute from which values are obtained. New generics (e.g. as_rcpp_filtration()) are probably warranted, to avoid behavior that depends on whether this parameter is used. Question: Should the method check that the values respect inclusion? I believe so.
  • Coercers to and from TDA-style lists should ensure that $increasing agrees with whether the filtration is sublevel or superlevel. Question: How is this encoded in the simplex tree implementations, or are superlevel filtrations not supported?
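
The check raised in the second item, that values respect inclusion, can be sketched in Python for the sublevel case. This is a hypothetical illustration rather than proposed package code: the function name `respects_inclusion` and the dictionary input (sorted vertex tuples mapped to filtration values) are assumptions.

```python
from itertools import combinations

def respects_inclusion(values):
    """Check that no simplex enters a sublevel filtration before one of its faces.

    `values` maps each simplex, given as a sorted vertex tuple, to its value.
    """
    for simplex, value in values.items():
        # Checking codimension-1 faces suffices: if every face is present and
        # passes, the condition holds for all lower faces by transitivity.
        for face in combinations(simplex, len(simplex) - 1):
            if not face:
                continue  # a vertex has no nonempty proper faces
            if face not in values or values[face] > value:
                return False
    return True

good = {(1,): 0.0, (2,): 0.0, (1, 2): 1.0}
bad = {(1,): 0.0, (2,): 2.0, (1, 2): 1.0}
ok_good = respects_inclusion(good)
ok_bad = respects_inclusion(bad)
```

In `bad`, the edge (1, 2) has value 1.0 while its vertex 2 has value 2.0, so the edge would appear before one of its endpoints. For a superlevel filtration the inequality would be reversed.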
