twang15 / platoacademy
Free thoughts live
Survey of academic resources in the US on this topic
Yidi Sun, Bioinformatics, CAS
Su-In Lee, U of Washington
Scott Lundberg, MSR
Christoph Molnar, U of Munich
Jerome H. Friedman (https://statweb.stanford.edu/~jhf/)
Preprocessing:
i) Bad quality -> Tool: Use “FASTQ Quality Filter” and/or “FASTQ Quality
ii) Flagged Kmer Content: About 100% of the first six bases are the same sequence -> Tool: Use “FASTQ Trimmer”
Quality control: Run fastqc on the processed samples to see if the problem has been removed. Tool: fastqc
Library complexity: the fraction of unique fragments present in a given library. A proxy is to look at the sequence
duplication levels on the FastQC report.
Low library complexity may be an indicator that:
– A new sample and a new library should be prepared.
– We have to find a better Ab to perform the IP.
– We cannot sequence the same sample any further because we will not find new sequences.
In certain experimental settings we may expect a low library complexity, e.g. when we are profiling a protein that binds to a small subset of the genome.
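A minimal sketch of the library complexity idea described above, assuming reads are available as plain sequence strings. It treats identical read sequences as duplicates, which is only a proxy for unique fragments (the same proxy the duplication-level view relies on). The function names `library_complexity` and `duplication_levels` are hypothetical and not part of any tool named here.

```python
from collections import Counter


def library_complexity(reads):
    """Return the fraction of distinct sequences among all reads."""
    if not reads:
        raise ValueError("no reads given")
    return len(set(reads)) / len(reads)


def duplication_levels(reads):
    """Map each duplication level (copies per distinct sequence)
    to the number of distinct sequences seen at that level."""
    copies = Counter(reads)
    return dict(Counter(copies.values()))


# Toy input: three distinct sequences among six reads.
reads = ["ACGT", "ACGT", "ACGT", "TTGA", "GGCC", "GGCC"]
print(library_complexity(reads))  # 0.5
print(duplication_levels(reads))  # {3: 1, 1: 1, 2: 1}
```

A value close to 1 means most reads are distinct. A value close to 0 means the library is dominated by duplicates, which is the situation the list above warns about.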
Mapping (alignment): Treat IP and control the same way (preprocessing and mapping). Tool: bowtie 1 or bowtie 2 (use end-to-end mode) or bwa
– map the reads and remove unmapped reads
– filter mapped reads by mapping quality score
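The two filtering steps above can be sketched on SAM-formatted text, assuming the aligner output is an uncompressed SAM file read line by line. In SAM, the second column is the FLAG (bit 0x4 marks an unmapped read) and the fifth column is the mapping quality (MAPQ). The function name `filter_sam_lines` and the threshold of 30 are illustrative assumptions, not values taken from these notes.

```python
def filter_sam_lines(lines, min_mapq=30):
    """Yield header lines plus alignments that are mapped
    and have MAPQ >= min_mapq (threshold is an assumption)."""
    for line in lines:
        if line.startswith("@"):      # header lines pass through unchanged
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])         # column 2: bitwise FLAG
        mapq = int(fields[4])         # column 5: mapping quality
        if flag & 0x4:                # 0x4 set means the read is unmapped
            continue
        if mapq >= min_mapq:
            yield line


# Toy records: one header, one good alignment, one unmapped, one low MAPQ.
sam = [
    "@HD\tVN:1.6\tSO:unsorted",
    "r1\t0\tchr1\t100\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r3\t16\tchr1\t200\t5\t4M\t*\t0\t0\tACGT\tIIII",
]
for kept in filter_sam_lines(sam):
    print(kept.split("\t")[0])
```

Applying the same function with the same threshold to both the IP and the control keeps the two samples treated identically, as the mapping step requires.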
Peak calling
i) Read extension and signal profile generation: Estimation of the fragment length using Strand cross-correlation analysis
ii) Peak assignment and evaluation
– Look for fold enrichment of the sample over input or expected background
– Estimate the significance of the fold enrichment using
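As a rough illustration of step i) only, the sketch below estimates fragment length from the strand shift that best aligns forward-strand and reverse-strand read ends. It is a simplified, count-based stand-in for strand cross-correlation: it counts coinciding positions per shift instead of computing a correlation over per-base coverage. The function name `estimate_fragment_length`, the input lists, and `max_shift` are hypothetical.

```python
from collections import Counter


def estimate_fragment_length(forward_starts, reverse_ends, max_shift=500):
    """Return the shift (in bases) at which forward-strand 5' ends,
    moved downstream, coincide most often with reverse-strand 5' ends."""
    fwd = Counter(forward_starts)
    rev = Counter(reverse_ends)
    best_shift, best_score = 0, -1
    for shift in range(max_shift + 1):
        # Number of (forward, reverse) read pairs that line up at this shift.
        score = sum(n * rev.get(pos + shift, 0) for pos, n in fwd.items())
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift


# Toy data: every reverse-strand end lies 150 bases downstream
# of a forward-strand start, mimicking 150 bp fragments.
forward = [100, 320, 710, 1500]
reverse = [p + 150 for p in forward]
print(estimate_fragment_length(forward, reverse))  # 150
```

The estimated shift is the length to which reads would be extended when building the signal profile.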
Linear regression in R
Formula syntax
The law of large numbers states that if the same experiment or study is repeated independently a large number of times, the average of the results of the trials tends to be close to the expected value, and converges to it as the number of trials grows. Note that the theorem only concerns a large number of trials: the average of an experiment repeated a small number of times might be substantially different from the expected value. On average, however, each additional trial increases the precision of the average result.
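A small simulation can make this concrete. The sketch below assumes a fair six-sided die, whose expected value is 3.5, and tracks the running average after each roll. The function name `running_means` and the fixed seed are illustrative choices.

```python
import random


def running_means(n_trials, seed=0):
    """Simulate n_trials rolls of a fair die and return
    the average of the first i rolls for i = 1..n_trials."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    total = 0
    means = []
    for i in range(1, n_trials + 1):
        total += rng.randint(1, 6)
        means.append(total / i)
    return means


means = running_means(100_000)
for n in (10, 1_000, 100_000):
    # Distance between the running average and the expected value 3.5.
    print(n, abs(means[n - 1] - 3.5))
```

The distance after 10 rolls can be large, while the distance after 100,000 rolls is typically small, which matches the statement about few versus many trials.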
This issue is to explore the overview of Bioinformatics
What are mRNA, rRNA and tRNA?
Insulin 100
Frederick G. Banting
In insulin, there is enough glory for all
Invisible Frontier
Two seemingly conflicting goals: interpretability and feature importance
STAT 504 - Analysis of Discrete Data: https://online.stat.psu.edu/stat504/node/70/
Understanding Machine Learning: From Theory to Algorithms