chemdash

Chemical Dataset Exploration Tool Built on Plotly Dash

The chemdash app is a simple analysis application built on the Plotly Dash framework. As input, the application takes a CSV file that must contain both smiles and Compound_id fields; any other fields are retained and displayed in the data table portion of the application. A UMAP plot (a dimensionality-reduction plot similar to t-SNE or PCA) is generated, as well as a Molecular Grid Map (MGM). Depending on the column selected in the data table, a distribution plot for that column is also displayed. Selecting data points in either the UMAP or MGM plot filters the data table and highlights those points in the distribution plot. For more information on this demo app, please check out my blog post.
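Since the app requires the smiles and Compound_id fields, it can save a slow startup to check a CSV header before launching. The following is a minimal sketch, not part of chemdash itself: the helper name `missing_columns` is hypothetical, the sample row is made up for illustration, and it assumes the field names must match exactly, including case.

```python
import csv
import io

# Field names taken from this README; exact, case-sensitive matching is an assumption.
REQUIRED_COLUMNS = ("smiles", "Compound_id")


def missing_columns(csv_file):
    """Return the required column names that are absent from the CSV header."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []  # fieldnames is None for an empty file
    return [name for name in REQUIRED_COLUMNS if name not in header]


# Made-up example input: one compound plus an extra column that would be retained.
sample = io.StringIO("Compound_id,smiles,logP\nCPD-1,CCO,-0.31\n")
print(missing_columns(sample))  # prints []
```

To check a real file, pass an open file handle instead of the `io.StringIO` object, for example `open("dataset_800.csv", newline="")`.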

Installation & Execution

Download the repository locally, then build or install from the top of the git repo using one of the methods below:

Docker

This works on platforms that support Docker. Note: it also works on Macs with Apple processors, but be sure to keep the platform set to linux/amd64 so that Intel-based packages work.

docker buildx build --platform linux/amd64 -t chemdash:latest .

To start chemdash from the CLI:

docker run --rm --name chemdash -p 8000:8000 -v /tmp:/tmp -v /var/tmp:/var/tmp -d chemdash:latest dataset_800.csv  --port 8000 --debug 

Then open a browser while watching the logs in the terminal:

open http://0.0.0.0:8000
docker logs -f chemdash

To stop it from the CLI:

docker kill chemdash

Non-docker install and run:

A non-docker install can be useful for debugging.

Install non-Python dependencies on Ubuntu (note: 22.04 is required, since some modules have problems on 24.04, which ships Python 3.12):

./install-deps-ubuntu-22.04.sh

Install non-Python dependencies on Mac:

./install-deps-osx.sh

Install the Python dependencies in a virtualenv:

python3.10 -m venv venv
. ./venv/bin/activate
pip install -r requirements.txt -e git+https://github.com/gatagat/[email protected]#egg=lap


To run, make sure your new environment is active, navigate to the directory containing _chemdash.py_, and run it:

. venv/bin/activate
python3 ./chemdash.py dataset_800.csv --port 8000 --debug


### PyCharm

The repo contains two PyCharm run configurations, both of which use the venv, each with different test datasets.
Set up either environment as described above, then select either `chemdash venv` or `chemdash docker` to run the respective way.

NOTE: PyCharm debugging works best with the `chemdash venv` option.


## Data Loading

### 1000 Compounds
The application takes about 30 seconds to load.

### 10000 Compounds 
The application takes ~20 minutes to load. As the blog post mentioned above explains, this is due to the Molecular Grid Map. If you alter the code slightly, you should be able to turn that feature off, and 10k compounds will likely load in a few minutes.
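Given the load times above, one way to explore a large dataset without changing the chemdash code is to load a random subset of its rows first. The sketch below is an illustration under that assumption, not a chemdash feature: the helper name `sample_rows` is hypothetical and the compound rows are made up.

```python
import csv
import io
import random


def sample_rows(csv_file, out_file, n, seed=0):
    """Copy the header plus a random sample of at most n data rows."""
    reader = csv.reader(csv_file)
    header = next(reader)  # first row holds the column names
    rows = list(reader)
    # A fixed seed makes the same subset come out on every run.
    picked = random.Random(seed).sample(rows, min(n, len(rows)))
    writer = csv.writer(out_file)
    writer.writerow(header)
    writer.writerows(picked)
    return len(picked)


# Made-up example input with five compounds; keep two of them.
source = io.StringIO(
    "Compound_id,smiles\n"
    "CPD-1,CCO\nCPD-2,CCN\nCPD-3,CCC\nCPD-4,CCCl\nCPD-5,c1ccccc1\n"
)
target = io.StringIO()
kept = sample_rows(source, target, n=2)
print(kept)  # prints 2
```

With real files, open both the input and the output with `newline=""`, as the `csv` module documentation recommends, and pass the sampled file to chemdash in place of the full dataset.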


## Usage

When the app is running, open your browser to http://0.0.0.0:8000
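Because the app can take a while to finish loading, the page may not respond right after startup. The following is a small sketch of polling the address until it answers, using only the Python standard library. The function name `wait_until_up` and its default timeouts are assumptions for illustration and are not part of chemdash.

```python
import time
import urllib.error
import urllib.request


def wait_until_up(url, timeout_s=60.0, interval_s=1.0):
    """Poll url until it returns any HTTP response or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval_s):
                return True
        except urllib.error.HTTPError:
            # The server answered, even though the status was an error.
            return True
        except (urllib.error.URLError, OSError):
            # Not reachable yet (for example, connection refused); retry.
            time.sleep(interval_s)
    return False
```

For example, `wait_until_up("http://0.0.0.0:8000", timeout_s=120)` returns `True` once the server responds and `False` if it has not responded within two minutes.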

