Applied Computational Genomics Course at UU: Spring 2017

  • Faculty: Aaron Quinlan (aquinlan at genetics.utah.edu)
  • Teaching assistants: Tom Sasani ([email protected]), Julie Feusier ([email protected]) and Jingtao Guo ([email protected])
  • Meets Tu and Th from 9:10-10:30 in HSEB 2948; January 10 - April 25
  • TA Hours: M and W from 4-5 PM, meet in 2nd floor HSEB lobby

Overview

This course provides a comprehensive introduction to fundamental concepts and experimental approaches in the analysis and interpretation of genomics data. It is structured as a series of lectures covering key concepts and analytical strategies. A diverse range of biological questions enabled by modern DNA sequencing technologies will be explored, including sequence alignment, the identification of genetic variation, structural variation, and ChIP-seq and RNA-seq analysis. Students will learn and apply the fundamental data formats and analysis strategies that underlie computational genomics research. The primary goal of the course is for students to be grounded in theory and to leave the course empowered to conduct independent genomic analyses.

Prerequisites

  • Online introduction to Linux. Students must complete one of the following online tutorials (or both) before class begins.
  • Undergraduate or graduate level Molecular Biology (for future versions of the class)
  • An Apple or Linux computer, or a Windows machine with PuTTY installed.

Course lecture slides

Homework

Syllabus

  • Class 1 (Tu Jan 10; Quinlan): Course overview and Intro to UNIX

    • Class 1 Slides
    • Required Reading Prior to Lecture:
    • Topics covered
      • Brief history of computational biology
      • Course computing environment
      • Intro. to UNIX: Part 1
        • Logging in
        • The "shell"
        • "Home"
        • Navigation
        • File system
        • Files
        • Basic commands: ls, pwd, cd, mkdir, head
  • Class 2 (Th Jan 12; Quinlan): Intro to UNIX Part 2

  • Class 3 (Tu Jan 17; Quinlan): The human genome

    • Class 3 Slides
    • Required Reading Prior to Lecture:
    • Topics covered
      • Karyotype
      • Chromosome structure
      • Centromeres
      • Banding
      • Chromatin
      • How was the genome sequenced?
        • sequencing technology
        • assembly strategy
      • Chromosomes
        • size
        • gene content
        • centromeres
      • Haplotypes
      • Genes and transcripts
      • Repeat content
        • mobile elements
        • simple repeats
      • GC content, banding
      • CpG islands
  • Class 4 (Th Jan 19; Quinlan): Using UNIX to find patterns in a genome

    • Required Reading Prior to Lecture:
      • None.
    • Topics covered
      • The UNIX PATH
      • Environment variables
      • Basic regular expressions with grep
      • sort
      • uniq
    • Homework 2 (finding biological patterns in FASTA files with UNIX) assigned
  • Class 5 (Tu Jan 24; Quinlan): Genetic variation: mutations, polymorphisms, and haplotypes

    • Required Reading Prior to Lecture:
    • Topics covered
      • Genetic variation: what, why, etc.
      • Mutation vs. polymorphism
      • De novo mutation
        • Human mutation rates
      • Polymorphism
      • SNPs and INDELs
        • abundance
        • frequency
        • examples
        • 1000 Genomes
        • Site frequency spectrum
      • Population stratification
      • Intro to haplotypes and recombination
  • Class 6 (Th Jan 26; Quinlan): Modern DNA sequencing technologies

  • Class 7 (Tu Jan 31; Quinlan): DNA sequence mapping and alignment (https://docs.google.com/presentation/d/1RskyGhXx4Lc6wSvvb_ZuCUJGUiP2RAr9X8bGh9Kz77I/edit?usp=sharing)

    • Required Reading Prior to Lecture:
    • Topics covered
      • Sequence alignment
        • Theory
        • Mapping versus alignment
        • Local versus global alignment
          • Smith-Waterman
          • Needleman-Wunsch
        • Advanced algorithms
        • Alignment for RNA-seq
        • Alignment for SV detection
        • Tools
          • BWA, etc.
  • Class 8 (Th Feb 2; Quinlan): SAM/BAM format, samtools, and IGV (https://docs.google.com/presentation/d/1_iT3btOZqjPmVb8Ryk5ssMBCMxoQ0MVmasZ6G0luA-c/edit?usp=sharing)

    • The SAM/BAM format
    • Samtools
    • IGV
    • Homework 4 (creating and working with SAM/BAM files with samtools and IGV) assigned
  • Class 9 (Tu Feb 7; Quinlan): SNP and INDEL discovery (part 1) (https://docs.google.com/presentation/d/1D4XY9XxQiyYcwwhomRRONxCPr_bJvcC0WM4sb8vouZM/edit?usp=sharing)

  • Class 10 (Th Feb 9; Quinlan): SNP and INDEL discovery (part 2) (https://docs.google.com/presentation/d/12jeJQPbntPPPGYszIH1l9u83mXFVU1XdJw-bNgbFu28/edit?usp=sharing)

    • Required Reading Prior to Lecture:
    • Topics covered
      • VCF format
        • Attributes
        • Genotypes
      • Population calling
      • Basic annotations
    • Landscape of human genetic variation
      • Alleles and genotypes
      • Allele frequency spectrum
      • Hardy-Weinberg equilibrium
      • More on haplotypes and recombination
    • Exploring the format
      • examples
      • IGV
    • Manipulating VCF with bcftools
    • Homework 5 (variant calling and working with VCF files with bcftools and UNIX) assigned
  • Class 11 (Tu Feb 14; Quinlan): VCF format, Hardy-Weinberg equilibrium, VCF toolkits

    • Topics covered
      • VCF Format
      • Allele frequencies
      • Genotype frequencies
      • Hardy-Weinberg equilibrium
  • Class 12 (Th Feb 16; Quinlan): VCF annotation and interpretation

  • Class 13 (Tu Feb 21; Quinlan): Variation in genome structure

  • Class 14 (Th Feb 23; Marth): Somatic mutation in cancer

  • Class 15 (Tu Feb 28; Quinlan): Genome annotation

    • Required Reading Prior to Lecture:
      • None
    • Topics covered
      • How and why do we annotate a genome?
      • Conservation
      • CpG islands
      • RepeatMasker
      • Chromatin modifications
      • DNA methylation
      • Linkage blocks
  • Class 15 (Th Mar 2; Quinlan): Genome data formats and genome arithmetic

    • Required Reading Prior to Lecture:
      • None
    • Topics covered
      • The genome as a coordinate system
      • BED format
      • GFF format
      • VCF format
      • UCSC and Biomart to retrieve genome annotations
      • UCSC and IGV to visualize
      • a bit of awk
  • Class 16 (Tu Mar 7; Quinlan): Applied genome arithmetic with bedtools; part 1

  • Class 17 (Tu Mar 7; Quinlan): Applied genome arithmetic with bedtools; part 2

  • Class 18 (Th Mar 9; Quinlan): Digging deeper into UNIX, part 1

    • awk
    • sed
    • tr
    • PATH
    • .bashrc
  • Class 19 (Tu Mar 21; Quinlan): ChIP-seq analysis

    • experimental design
    • protocols
    • examples
  • Class 20 (Th Mar 23; Quinlan): RNA-seq analysis

    • analyses
    • toolsets
    • Class project assignment
  • Class 21 (Tu Mar 28; Quinlan): Basic probability

    • Probability with coins and dice
    • Probability with DNA
    • Conditional probabilities
    • Use R for examples
  • Class 22 (Th Mar 30; Quinlan): Statistical tests

    • Gaussian
      • Z scores
    • Chi-squared
    • Fisher
    • KS test
    • Rank tests
    • Applications
  • Class 23 (Tu Apr 4; Quinlan): How do I know if my observation is significant?

    • Models
    • Expectation
    • Tests for significance
  • Class 24 (Th Apr 6; Quinlan): Data visualization, part 1

    • Why
    • Pattern recognition
    • Detect problems
    • Anscombe’s quartet
    • Introduce class projects
  • Class 25 (Tu Apr 11; Quinlan): Data visualization, part 2

  • Class 26 (Th Apr 13; Quinlan): Digging deeper into UNIX, part 2

    • loops
    • shuffling
    • randomization
    • advanced commands
    • basic scripts and pipelines
  • Class 27 (Tu Apr 18; Quinlan): Advanced topics

  • Class 28 (Th Apr 20; Quinlan): Group Presentations, part 1

  • Class 29 (Tu Apr 25; Quinlan): Group Presentations, part 2
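
As a small illustration of the allele frequency, genotype frequency, and Hardy-Weinberg topics listed for Class 11, the following Python sketch computes allele frequencies from genotype counts and compares the observed counts with those expected under Hardy-Weinberg equilibrium. It is not taken from the course materials: the function names and the example counts are hypothetical, and a single biallelic site in a diploid population is assumed.

```python
def hardy_weinberg_expected(n_hom_ref, n_het, n_hom_alt):
    """Return (p, q, expected genotype counts) for one biallelic site.

    Hypothetical helper for illustration; assumes diploid individuals.
    """
    n = n_hom_ref + n_het + n_hom_alt
    if n == 0:
        raise ValueError("at least one genotyped individual is required")
    # Each individual carries two alleles, so there are 2 * n alleles in total.
    p = (2 * n_hom_ref + n_het) / (2 * n)  # reference allele frequency
    q = 1.0 - p                            # alternate allele frequency
    expected = {
        "hom_ref": p * p * n,      # p^2
        "het": 2 * p * q * n,      # 2pq
        "hom_alt": q * q * n,      # q^2
    }
    return p, q, expected


def chi_squared(observed, expected):
    """Pearson chi-squared statistic over matching observed/expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


if __name__ == "__main__":
    observed = (50, 40, 10)  # made-up counts: hom-ref, het, hom-alt
    p, q, exp = hardy_weinberg_expected(*observed)
    stat = chi_squared(observed, (exp["hom_ref"], exp["het"], exp["hom_alt"]))
    print(f"p={p:.3f} q={q:.3f} chi2={stat:.3f}")
```

A statistic near zero suggests the observed genotype counts are close to the Hardy-Weinberg expectation; turning it into a p-value would require a chi-squared distribution with one degree of freedom, which this sketch leaves out.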
