ClusterATAC

ClusterATAC is a cancer subtyping tool based on ATAC-seq profiles. The input to the framework is high-dimensional omics data (such as ATAC-peak data) for all tumor samples. The output is a subclass label for each sample. ClusterATAC consists of two main components: 1. A GAN-based feature extraction module that learns abstract, low-dimensional features from the high-dimensional raw input data. 2. A GMM-based clustering module that determines the number of clusters and assigns a cluster label to each input sample.

# The raw input file is all.txt; run the following command to complete the whole pipeline:
python ClusterATAC.py -i ./all.txt
# The clustering output is stored in ./results/all.clusteratac

To run only the feature extraction module:

python ClusterATAC.py -m feature -i ./all.txt  
# The low-dimensional features encoded by the neural network are stored in ./fea/all.clusteratac

ClusterATAC's GMM clustering module is used as follows:

python ClusterATAC.py -m cluster -n 22 -i ./all.txt  
# The class label of each sample is written to ./results/all.clusteratac
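
The idea behind a GMM-based clustering step can be sketched with scikit-learn's GaussianMixture. This is a minimal illustration, not ClusterATAC's actual code: the function name `cluster_with_gmm`, the use of BIC to pick the number of clusters, the diagonal covariance, and the toy data are all assumptions made for the example.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def cluster_with_gmm(features, candidate_ks, seed=0):
    """Fit one GMM per candidate k and keep the one with the lowest BIC.

    Hypothetical helper: BIC-based selection is an assumption here,
    not necessarily the criterion ClusterATAC uses.
    """
    best_model, best_bic = None, np.inf
    for k in candidate_ks:
        model = GaussianMixture(
            n_components=k, covariance_type="diag", random_state=seed
        )
        model.fit(features)
        bic = model.bic(features)  # lower BIC means a better trade-off
        if bic < best_bic:
            best_model, best_bic = model, bic
    # Return the chosen number of clusters and one label per sample.
    return best_model.n_components, best_model.predict(features)


# Toy stand-in for low-dimensional features: three separated blobs.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
features = np.vstack(
    [c + rng.normal(scale=0.5, size=(50, 2)) for c in centers]
)

k, labels = cluster_with_gmm(features, range(1, 6))
print("selected number of clusters:", k)
print("first ten labels:", labels[:10])
```

Because the toy data is drawn from three well-separated blobs, BIC should tend to favor three components, although the exact choice depends on the fit.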

ClusterATAC's performance comparison module (using the autoencoder method as an example) is used as follows:

python ClusterATAC.py -m ae -i ./all.txt
# The class label of each sample is written to ./TCGA_ATAC_peak_Log2Counts_dedup_sample.spectral
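
For scripted runs, the invocations above could be assembled programmatically. The sketch below only uses the flags shown in this README (`-m`, `-n`, `-i`); the helper names `build_command` and `run` are hypothetical, and the meaning of `-n` is not documented here, so the value is passed through unchanged.

```python
import subprocess


def build_command(input_path, mode=None, n=None, script="ClusterATAC.py"):
    """Build an argument list for a ClusterATAC invocation.

    Hypothetical helper: it mirrors the command lines shown in the
    README and does not validate the arguments.
    """
    cmd = ["python", script]
    if mode is not None:
        cmd += ["-m", mode]  # e.g. "feature", "cluster", or "ae"
    if n is not None:
        cmd += ["-n", str(n)]  # README example uses 22 with "-m cluster"
    cmd += ["-i", input_path]
    return cmd


def run(cmd):
    """Run a command and raise if it exits with a non-zero status."""
    subprocess.run(cmd, check=True)


# Print the commands instead of running them, so the sketch is safe
# to execute without ClusterATAC installed.
for cmd in (
    build_command("./all.txt"),
    build_command("./all.txt", mode="feature"),
    build_command("./all.txt", mode="cluster", n=22),
    build_command("./all.txt", mode="ae"),
):
    print(" ".join(cmd))
```

Calling `run(build_command("./all.txt"))` from the repository directory would then launch the full pipeline.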

ClusterATAC is written in Python. The generative adversarial network is implemented with the open-source libraries scikit-learn 0.21.3, Keras 2.2.4, and TensorFlow 1.14.0 (GPU version). The framework has been tested and works correctly on Ubuntu Linux 18.04. Because the raw data is high-dimensional, the neural network is very large. We trained the model on an NVIDIA TITAN Xp (12 GB). If the GPU does not have enough memory to run the tool, we suggest simplifying the encoder's network structure.
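
To see why simplifying the encoder helps, it can be useful to estimate how many parameters a fully connected encoder has. The sketch below is a rough back-of-the-envelope calculation under stated assumptions: the input dimension and layer widths are hypothetical (not ClusterATAC's actual architecture), the layers are assumed to be plain dense layers with biases, and weights are assumed to be 32-bit floats.

```python
def dense_param_count(input_dim, layer_widths):
    """Count weights and biases of a stack of fully connected layers."""
    total = 0
    prev = input_dim
    for width in layer_widths:
        total += prev * width + width  # weight matrix plus bias vector
        prev = width
    return total


def approx_weight_memory_mb(n_params, bytes_per_param=4):
    """Memory needed to store the parameters alone, in MiB."""
    return n_params * bytes_per_param / 1024 ** 2


# Hypothetical numbers for illustration only.
INPUT_DIM = 500_000  # assumed number of input features (peaks)
large = dense_param_count(INPUT_DIM, [1024, 256, 64])
small = dense_param_count(INPUT_DIM, [256, 64])

for name, n_params in (("large", large), ("small", small)):
    mem = approx_weight_memory_mb(n_params)
    print(f"{name} encoder: {n_params:,} parameters, ~{mem:,.0f} MiB")
```

The first layer dominates the count because it multiplies the input dimension by the first hidden width, so narrowing that layer gives the largest saving. Training needs several times the weight memory, since gradients and optimizer state are stored as well.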

Contributors

haiyang1986
