topologic's Introduction

Topologic

Topologic is now Deprecated

Please use Graspologic instead.

Release Status

Version 0.1.1 is released!

Installation Instructions

pip install topologic

Plans

  • None - see deprecation message above

Bugs or Feature Requests

Please file a new issue if you find any bugs, either in the code or the documentation.

Development Setup Instructions

Topologic was developed for Python 3.6+ and makes extensive use of type hints and f-strings throughout. Python 2.7 is not supported.

Topologic is known to work with 64-bit Python 3.6, 3.7, and 3.8 on Windows and Ubuntu, and is presumed to work on MacOS as well. Please file a new issue for any problems found on these versions.

Windows

py -m venv venv
venv\Scripts\activate.bat
pip install -U setuptools wheel pip
pip install -r requirements.txt

You may need to install Visual Studio Build Tools for some of the topologic dependencies. Some dependencies, such as scipy and numpy, contain C code that must be compiled for your version of Python. If you see errors after installing the requirements, follow the directions in your console and then try again.

MacOS

python3 -m venv venv
source venv/bin/activate
pip install -U setuptools wheel pip
pip install -r requirements.txt

Ubuntu

sudo apt-get update && sudo apt-get install python3-pip python3-dev
python3 -m venv venv
source venv/bin/activate
pip install -U setuptools wheel pip
pip install -r requirements.txt

Running Tests

mypy ./topologic
mypy ./tests
flake8 ./topologic ./tests
pytest tests topologic

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repositories using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

Privacy

topologic does not collect, store, or transmit any information of any kind back to Microsoft.

For your convenience, here is the link to the general Microsoft Privacy Statement.

topologic's People

Contributors

daxpryce, microsoft-github-operations[bot], microsoftopensource, nicaurvi

topologic's Issues

Update for Sphinx 3.x+

Sphinx 3.0.3 has been released, but it doesn't seem to work well with recommonmark just yet. Regardless, we need to find a way to make it work.

Remove metadata types and type registry from the IO package

This well-meaning concept tried to provide type information for every attribute added to a graph object via the io & projections package. While neat, I'm not entirely sure I've ever seen or heard of anyone ever using it. It's a needless complexity if it isn't providing immediate and obvious value, and we should remove it.

Add leiden native code to topologic

Researchers at Leiden University improved on the community detection algorithm first developed at Université catholique de Louvain, in the paper From Louvain to Leiden: guaranteeing well-connected communities, and also provided a Java reference implementation released under the MIT license.

We now have a working version of this written in Rust and can use pyo3 to provide it as a native Python module. This should be pulled into topologic for others to use.

It should be noted that this initial release is still somewhat fluid, but we're excited to bring this to everyone!

tc.io.from_dataset is ignoring empty graphs provided

A user has found a bug where a new nx.DiGraph object passed into tc.io.from_dataset is not used, and an undirected graph is returned instead. This is rather surprising, since we have unit tests around graph objects being passed in and updated.
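The cause is not stated here, so the following is only a guess at one pattern that produces exactly this symptom. networkx graphs define __len__ as their node count, so a brand-new graph is falsy and a check like `graph or nx.Graph()` silently replaces it. The sketch uses small stand-in classes rather than networkx itself, and load_buggy and load_fixed are hypothetical names, not topologic functions.

```python
from typing import Optional


class Graph:
    """Stand-in for nx.Graph: like networkx graphs, len() is the node count."""

    def __init__(self) -> None:
        self.nodes = set()

    def __len__(self) -> int:
        return len(self.nodes)


class DiGraph(Graph):
    """Stand-in for nx.DiGraph."""


def load_buggy(graph: Optional[Graph] = None) -> Graph:
    # An empty graph has len() == 0, so it is falsy and gets replaced here.
    return graph or Graph()


def load_fixed(graph: Optional[Graph] = None) -> Graph:
    # Only fall back to a default when the caller passed nothing at all.
    return Graph() if graph is None else graph


empty_directed = DiGraph()
buggy_result = load_buggy(empty_directed)
fixed_result = load_fixed(empty_directed)
```

If this is what is happening, it would also explain why the existing tests pass: a graph that already contains nodes is truthy and survives the check.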

Explain `cut_process` in docs

I was trying to use ts.statistics.cut_edges_by_weight and I got a bit confused by the cut_process argument here - figured it out, but seems worth explaining in the docstring for this function. Could these be string arguments as well?

Modularity components returning wrong value

Some wrong assumptions about networkx edge iteration behavior are resulting in incorrect modularity values being returned. This was noticed during the initial implementation but incorrectly chalked up to a floating point arithmetic inconsistency that compounded to the point where we were only getting accuracy within 10E-3; it is instead a definite logic error.

Modularity Components

We've had a request to add a function that returns a dictionary mapping each community to the portion of modularity it contributes to the full modularity calculation.
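A minimal sketch of what such a function could look like, assuming an undirected weighted graph and the standard modularity definition, where community c contributes (intra-community weight / m) - (degree sum / 2m)^2 and m is the total edge weight. The name modularity_components and the edge-list input are assumptions for illustration, not the topologic API. With networkx, the edge list could come from G.edges(data="weight", default=1.0).

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple

Edge = Tuple[Hashable, Hashable, float]


def modularity_components(
    edges: Iterable[Edge], partition: Dict[Hashable, Hashable]
) -> Dict[Hashable, float]:
    """Map each community to its share of the total modularity.

    Each undirected edge must appear exactly once in `edges`.
    """
    total_weight = 0.0
    intra_weight = defaultdict(float)  # weight of edges inside each community
    degree_sum = defaultdict(float)    # summed weighted degree per community
    for source, target, weight in edges:
        total_weight += weight
        degree_sum[partition[source]] += weight
        degree_sum[partition[target]] += weight
        if partition[source] == partition[target]:
            intra_weight[partition[source]] += weight
    if total_weight == 0:
        raise ValueError("modularity is undefined for a graph with no edge weight")
    return {
        community: intra_weight[community] / total_weight
        - (degree_sum[community] / (2 * total_weight)) ** 2
        for community in set(partition.values())
    }


# Two triangles joined by a single bridge edge.
example_edges = [
    (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
    (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
    (2, 3, 1.0),
]
example_partition = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
components = modularity_components(example_edges, example_partition)
total_modularity = sum(components.values())
```

Summing the returned values gives the full modularity, which makes it easy to check the components against an existing modularity function.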

Prep for including rust native code into topologic

Our Leiden community partitioning is ready for inclusion into topologic. Rather than move all of the existing topologic code around and include the native code all in one PR, I want to ensure we focus on making sure the new structure works first, and then focus on the logic, style, etc of the native code in a separate PR.

Move Louvain

The louvain.py file is one method for creating a global partition structure of the graph. We already plan on including the Leiden algorithm for global partitioning, and there could be others.

There's also the ability to partition a graph based on the embeddings, which is currently present in the topologic.embedding.clustering package. A more unified package like topologic.partitioning should probably be created, with louvain.py being moved into it. We should also rename best_partition to louvain(), and induce_graph_by_communities should be moved out of the louvain.py module.

In the future I envision this partitioning package to also include ways of taking the output of an embedding clustering and inducing a graph from that as well.

In short, we have some short-sighted naming going on, and we should use this opportunity to fix it.

Add PyPI location to README

We're publishing to PyPI - prerelease versions, at least. We should ensure our documentation reflects this.

Update ari_scores

  • Move into a module called similarity at topologic.similarity
  • Add support for Jaccard similarity
  • Remove calculate_ari_scores function (no more enriched graphs)
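For reference, Jaccard similarity between two sets is the size of their intersection divided by the size of their union. A minimal sketch follows; the function name and the choice to return 1.0 for two empty sets are assumptions, not decisions recorded in this issue.

```python
from typing import Hashable, Iterable


def jaccard_similarity(first: Iterable[Hashable], second: Iterable[Hashable]) -> float:
    """Return |first & second| / |first | second| for two collections of items."""
    first_set, second_set = set(first), set(second)
    union = first_set | second_set
    if not union:
        # Assumed convention: two empty sets are treated as identical.
        return 1.0
    return len(first_set & second_set) / len(union)


overlap_score = jaccard_similarity({1, 2, 3}, {2, 3, 4})
```

Here the two sets share 2 of 4 distinct members, so overlap_score is 0.5.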

Laplacian Spectral Embedding multiple dimension test fails on Mac OS

  • tests/embedding/test_laplacian_spectral_embedding.py -> test_laplacian_embedding_elbowcut_none

This test fails on Mac OS in the GitHub Actions matrix that runs tests on Linux, Windows, and Mac OS. The Linux and Windows tests return exactly as expected. I could not reproduce this issue on Mac OS Catalina running Python 3.7.

It is unclear to me whether this is an issue in the code, sklearn, or GitHub Actions. Reproducible steps would be ideal for tracking down the source of this issue.

Mark this resolved once we can remove the pytest.skip() from that test.

Move and update distance module

The distance module should be moved and reworked. As it is right now it's just a glorified alias to the scipy functions, which has very little utility (possibly better documentation and type hinting? Not enough to keep around by itself).

However, we do know how we use these:

  • Given a vector, we iterate through a list of vectors and return the distances
  • Given a vector, we iterate through a list of vectors, sort them, and return the top N

I would argue that these are the utilities we should be providing, rather than a wrapper for calling scipy.spatial.distance.cosine.

At most we should support the 3 functions we currently do (cosine, euclidean, and mahalanobis) via a single function with a parameter that selects the metric - then users can toggle based on a configuration value in their code rather than swapping the function(s) out.
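A rough sketch of that shape, covering the two usage patterns listed above with a single metric parameter. All names here (distances, nearest, the metric strings) are hypothetical. The metrics are written in plain Python to keep the sketch self-contained; a real version would more likely delegate to scipy.spatial.distance. Mahalanobis is left out because it also needs an inverse covariance matrix.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def _euclidean(first: Vector, second: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(first, second)))


def _cosine(first: Vector, second: Vector) -> float:
    dot = sum(x * y for x, y in zip(first, second))
    norms = math.sqrt(sum(x * x for x in first)) * math.sqrt(sum(y * y for y in second))
    if norms == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norms


# Metric names are illustrative; they could be read from a configuration value.
_METRICS: Dict[str, Callable[[Vector, Vector], float]] = {
    "cosine": _cosine,
    "euclidean": _euclidean,
}


def distances(source: Vector, candidates: Sequence[Vector], metric: str = "cosine") -> List[float]:
    """Distance from `source` to every candidate, in candidate order."""
    try:
        metric_function = _METRICS[metric]
    except KeyError:
        raise ValueError(f"unknown metric: {metric!r}") from None
    return [metric_function(source, candidate) for candidate in candidates]


def nearest(
    source: Vector, candidates: Sequence[Vector], n: int, metric: str = "cosine"
) -> List[Tuple[int, float]]:
    """The n closest candidates as (index, distance) pairs, closest first."""
    scored = enumerate(distances(source, candidates, metric))
    return sorted(scored, key=lambda pair: pair[1])[:n]


example_candidates = [(3.0, 4.0), (1.0, 0.0), (0.0, 2.0)]
example_distances = distances((0.0, 0.0), example_candidates, metric="euclidean")
example_nearest = nearest((0.0, 0.0), example_candidates, n=2, metric="euclidean")
```

Returning indices alongside distances lets callers map results back to whatever labels they keep for the candidate vectors.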

Rework version

We set up a very complicated version file for an internal CI/CD process that published new versions on merge to master. Switching to open source and requiring a human to be involved in publication obviates this complication, and we should pull it out and do versions like a normal open source Python project.
