ShiftSmoothedAttributions

Ian Nielsen, Systems Devices and Algorithms in Bioinformatics, Final Project, Spring 2021

For the methodology and results, see ShiftSmooth: Locally Smoothed Attribution for Genomic CNNs.

Install

Clone the repository.

git clone https://github.com/nielseni6/ShiftSmoothedAttributions.git

Make sure the following required libraries are installed.

Python 3.6

PyTorch 1.2.0

torchvision 0.4.0

matplotlib 3.2.0

numpy 1.16.3

Pillow 6.0.0

pyseqlogo
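A quick way to compare the current environment against the list above is sketched here. It is only a convenience check, not part of the repository: the import names (for example `PIL` for Pillow) and the `check_requirements` helper are assumptions, and some packages may not expose a `__version__` attribute.

```python
import importlib

# Versions copied from the requirements list above. The import names are
# assumptions (Pillow, for example, is imported as "PIL").
REQUIRED = {
    "torch": "1.2.0",
    "torchvision": "0.4.0",
    "matplotlib": "3.2.0",
    "numpy": "1.16.3",
    "PIL": "6.0.0",
    "pyseqlogo": None,  # no version is pinned in the list above
}


def check_requirements(required=REQUIRED):
    """Return {module: (installed_version_or_None, listed_version)}."""
    report = {}
    for module_name, wanted in required.items():
        try:
            module = importlib.import_module(module_name)
            installed = getattr(module, "__version__", "unknown")
        except ImportError:
            installed = None  # module cannot be imported in this environment
        report[module_name] = (installed, wanted)
    return report


if __name__ == "__main__":
    for name, (installed, wanted) in check_requirements().items():
        status = "missing" if installed is None else installed
        print(f"{name}: installed={status}, listed={wanted or 'any'}")
```

Newer versions than the ones listed may also work, but only the listed versions are stated in this README.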

If you get errors with pyseqlogo, download the pyseqlogo files from its GitHub repository (https://github.com/saketkc/pyseqlogo) and place a copy of the pyseqlogo folder into both the ShiftSmoothedAttributions\Codon_Detection and ShiftSmoothedAttributions\Human_Goldfish_Classification folders.
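The copy step can also be scripted. The sketch below assumes you already have a local checkout of pyseqlogo and pass the path to its inner `pyseqlogo` package folder; the helper name `copy_pyseqlogo` and its arguments are hypothetical, not something the repository provides.

```python
import shutil
from pathlib import Path

# Task folder names as they appear in this README.
TASK_FOLDERS = ("Codon_Detection", "Human_Goldfish_Classification")


def copy_pyseqlogo(package_dir, repo_root, task_folders=TASK_FOLDERS):
    """Copy the pyseqlogo package folder into each task folder.

    package_dir: the "pyseqlogo" folder inside a local pyseqlogo checkout
                 (assumed layout).
    repo_root:   the local ShiftSmoothedAttributions clone.
    Returns the list of folders that were created.
    """
    package_dir = Path(package_dir)
    copied = []
    for task in task_folders:
        target = Path(repo_root) / task / package_dir.name
        if target.exists():
            continue  # leave an existing copy untouched
        shutil.copytree(str(package_dir), str(target))
        copied.append(target)
    return copied


# Example call (paths are placeholders):
# copy_pyseqlogo("pyseqlogo/pyseqlogo", "ShiftSmoothedAttributions")
```

Running it a second time does nothing, because folders that already exist are skipped.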

Getting Started

This documentation is split into two parts: Quickstart and Train Model. To recreate the results from the precalculated attribution maps, begin with Quickstart. To recreate all experiments from scratch, including formatting the dataset, training the model, and generating attribution maps, begin with Train Model.

Quickstart:

Codon Detection Task

Move to the Codon_Detection folder of the repository.

cd ShiftSmoothedAttributions\Codon_Detection

To recreate Experiment 1 (Shifting Invariance), run the shift experiment file.

python shift_experiment.py

To recreate Experiment 2 (Are the Areas of Interest Being Highlighted?), run the display logo file to display the attribution maps provided in this repo.

python disp_attr_motif_logo.py

Human/Goldfish Classification Task

Move to the Human_Goldfish_Classification folder of the repository.

cd ShiftSmoothedAttributions\Human_Goldfish_Classification

To recreate Experiment 1 (Shifting Invariance), run the shift experiment file.

python shift_experiment.py

To recreate Experiment 2 (Are the Areas of Interest Being Highlighted?), run the display logo file to display the attribution maps provided in this repo. Note: this experiment cannot validate the method the way the codon detection task does, because the important features are not known for human/goldfish classification.

python disp_attr_logo.py
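The commands above use Windows-style backslashes in the folder paths. As a minimal sketch for running the same Quickstart steps on any operating system, the snippet below builds the paths with `pathlib` and launches the scripts with `subprocess`. The script and folder names come from this README; the helper names `quickstart_commands` and `run_quickstart` are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

# Script names are taken from the Quickstart commands above.
QUICKSTART_SCRIPTS = {
    "Codon_Detection": ["shift_experiment.py", "disp_attr_motif_logo.py"],
    "Human_Goldfish_Classification": ["shift_experiment.py", "disp_attr_logo.py"],
}


def quickstart_commands(repo_root, task):
    """Return (working_directory, [command, ...]) for one task."""
    workdir = Path(repo_root) / task  # pathlib uses the right separator per OS
    commands = [[sys.executable, script] for script in QUICKSTART_SCRIPTS[task]]
    return workdir, commands


def run_quickstart(repo_root, task):
    """Run both experiments for a task, stopping at the first failure."""
    workdir, commands = quickstart_commands(repo_root, task)
    for command in commands:
        # cwd mirrors the "cd" step; check=True raises if a script fails.
        subprocess.run(command, cwd=str(workdir), check=True)


# Example call (assumes the clone sits in the current directory):
# run_quickstart("ShiftSmoothedAttributions", "Codon_Detection")
```

Using `sys.executable` keeps the scripts on the same Python interpreter that has the required libraries installed.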

Train Model:

If you would like to train the model yourself, follow these steps.

Codon Detection Task

  1. Go to https://www.ncbi.nlm.nih.gov/nuccore/CM000663.2 and click on FASTA.

  2. From here click Send To -> File -> Create File, then Save File.

  3. Once the file has finished downloading, rename it to human_genome_c1.txt and place it in the Codon_Detection\raw_data folder.

  4. Now that the data is downloaded, move to the Codon_Detection folder of the repository.

cd ShiftSmoothedAttributions\Codon_Detection

  5. Run the dataset formatter until you are satisfied with the size of the dataset.

python generate_dataset.py

  6. Generate attribution maps using the trained model.

python getattributions_motif.py

  7. Follow the Quickstart steps for the Codon Detection Task to run the experiments using the newly generated attribution maps.
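As an alternative to the manual download in the first three steps, the FASTA record can probably be fetched through NCBI's E-utilities `efetch` endpoint. This is a sketch under assumptions: the helper names are hypothetical, and it has not been verified that the file produced this way is identical to the one saved from the browser, so treat the manual steps as the reference procedure. The chromosome file is large, so the download can take a while.

```python
import shutil
import urllib.parse
import urllib.request
from pathlib import Path

# NCBI E-utilities endpoint for fetching records.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def fasta_url(accession):
    """Build an efetch URL that requests a nucleotide record as FASTA text."""
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urllib.parse.urlencode(params)


def download_fasta(accession, destination):
    """Stream the FASTA record for `accession` into `destination`."""
    destination = Path(destination)
    destination.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(fasta_url(accession)) as response:
        with open(str(destination), "wb") as out:
            shutil.copyfileobj(response, out)  # avoids loading it all in memory
    return destination


# Example call (accession and file name are the ones used in the steps above):
# download_fasta("CM000663.2",
#                Path("Codon_Detection") / "raw_data" / "human_genome_c1.txt")
```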

Human/Goldfish Classification Task

  1. Go to https://www.ncbi.nlm.nih.gov/nuccore/CM000663.2 and click on FASTA.

  2. From here click Send To -> File -> Create File, then Save File.

  3. Once the file has finished downloading, rename it to human_genome_c1.txt and place it in the Human_Goldfish_Classification\raw_data folder.

Steps 4 through 6 repeat steps 1 through 3, except that this time the goldfish genome is downloaded rather than the human genome.

  4. Go to https://www.ncbi.nlm.nih.gov/nuccore/CM010432.1 and click on FASTA.

  5. From here click Send To -> File -> Create File, then Save File.

  6. Once the file has finished downloading, rename it to goldfish_genome_c1.txt and place it in the Human_Goldfish_Classification\raw_data folder.

  7. Now that the data is downloaded, move to the Human_Goldfish_Classification folder of the repository.

cd ShiftSmoothedAttributions\Human_Goldfish_Classification

  8. Run the dataset formatter until you are satisfied with the size of the dataset.

python generate_dataset.py

  9. Generate attribution maps using the trained model.

python getattributions.py

  10. Follow the Quickstart steps for the Human/Goldfish Classification Task to run the experiments using the newly generated attribution maps.
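Before running the dataset formatter, it can help to confirm that both downloaded files are in place and look like FASTA data. The sketch below is a sanity check only: the file names come from the steps above, while `summarize_fasta` and `check_raw_data` are hypothetical helpers, and how `generate_dataset.py` itself reads these files is not shown here.

```python
from pathlib import Path

# File names used in the download steps above.
EXPECTED_FILES = ("human_genome_c1.txt", "goldfish_genome_c1.txt")


def summarize_fasta(path):
    """Return (first_header, total_sequence_length) for a FASTA file."""
    header = None
    base_count = 0
    with open(str(path), "r") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is None:
                    header = line[1:]  # keep only the first record's header
                continue
            base_count += len(line)  # sequence lines hold the bases
    if header is None:
        raise ValueError(f"{path} has no FASTA header line")
    return header, base_count


def check_raw_data(raw_data_dir, expected=EXPECTED_FILES):
    """Return {filename: (header, length)}; raise if a file is missing."""
    raw_data_dir = Path(raw_data_dir)
    summary = {}
    for name in expected:
        path = raw_data_dir / name
        if not path.is_file():
            raise FileNotFoundError(f"missing {path}")
        summary[name] = summarize_fasta(path)
    return summary


# Example call (path is relative to the repository root):
# print(check_raw_data(Path("Human_Goldfish_Classification") / "raw_data"))
```

For the Codon Detection Task, the same check applies with `expected=("human_genome_c1.txt",)` and the Codon_Detection raw_data folder.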
