Biostatistics 753 and 754

Class GitHub repository for Biostatistics 753 and 754, doctoral classes in the Department of Biostatistics at Johns Hopkins. You may use the material according to the Creative Commons Share-Alike 2.0 (CC-SA 2.0) license.

Course Information

This course is the third term of the intensive introduction to methods for applied statistics. The goal of this sequence is to develop Ph.D.-level biostatisticians who are capable of both applied data analysis and developing the next generation of statistical methods. Both data analysis and methods development require substantial hands-on experience, so the focus of this class will be on hands-on data analysis.

Learning objectives

Upon completion of this course students will be able to:

  1. Obtain, clean, transform, and process raw data into usable formats
  2. Formulate quantitative models to address scientific questions
  3. Organize and perform a complete data analysis, from exploration, to analysis, to synthesis, to communication
  4. Understand and apply a range of statistical methods for inference and prediction
  5. Develop ideas for new statistical methods, tools, and analyses

Students will also be encouraged to independently read and apply statistical methods from texts and the scientific literature that are not covered in the course. They will also be encouraged to think of improvements or variations on existing methods to address specific scientific questions.

Evaluation and feedback

  • 35% = Data analysis (peer graded/instructor summarized)
  • 20% = Bi-weekly problems (graded by TA)
  • 10% = Data analysis review (completion)
  • 25% = Final Project (graded by instructor)

You will receive grades on the methods problems, feedback from your peers, and brief (< 1 paragraph + grade) feedback from me within a week of submitting your assignment. If you would like further feedback on your assignments, please schedule time to meet with me. I will try to keep Fridays from 10am-3pm available in 20-minute slots. You may book up to 3 slots at a time:

http://jtleek.youcanbook.me/

Grading philosophy

I believe the purpose of the Ph.D. is to train you to be able to think for yourself and initiate and complete your own projects. I am super excited to talk to you about ideas, work out solutions with you, and help you to figure out statistical methods and/or data analysis. I don't think that graduate school grades are important for this purpose. This means that I don't care very much about graduate student grades.

That being said, the purpose of this course is to prepare you for the qualifying exam. I will therefore assign grades on a three level scale:

  1. A - Excellent
  2. B - Passing
  3. C - Needs improvement

If you receive A's and B's and perform in a similar way on the qualifying exam, I anticipate that you will pass. If you receive C's, that is my way of letting you know that your work would not pass the qualifying exam. I don't feel comfortable assigning percentages to data analyses, but to be able to calculate grades at the completion of the course I will use the following percentages: A = 100%, B = 85%, C = 75% of available points.
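As a rough illustration of the arithmetic, the sketch below converts letter grades to the percentages listed above and combines them using the component weights from the Evaluation and feedback section. The function and dictionary names are hypothetical, one letter per component is a simplification, and dividing by the sum of the listed weights (which total 90%) is an assumption of this sketch rather than a stated course policy.

```python
# Letter-to-percentage mapping stated in the grading philosophy.
LETTER_TO_PERCENT = {"A": 100.0, "B": 85.0, "C": 75.0}

# Component weights from "Evaluation and feedback" (they sum to 90).
WEIGHTS = {
    "data_analysis": 35.0,
    "biweekly_problems": 20.0,
    "data_analysis_review": 10.0,
    "final_project": 25.0,
}


def course_percentage(letters):
    """Weighted average of letter grades, one letter per component.

    Assumption: the result is normalized by the total listed weight.
    """
    total_weight = sum(WEIGHTS.values())
    earned = sum(
        WEIGHTS[name] * LETTER_TO_PERCENT[letters[name]] for name in WEIGHTS
    )
    return earned / total_weight


# Hypothetical student: A on the analyses and reviews, B elsewhere.
example = {
    "data_analysis": "A",
    "biweekly_problems": "B",
    "data_analysis_review": "A",
    "final_project": "B",
}
print(round(course_percentage(example), 2))
```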

Data analysis assignments

(For more on my project philosophy see: http://bit.ly/wQT5uI)

Each student will be required to perform two data analysis projects during the course of the class. Students will be given 2 weeks to perform each analysis. The project assignments will consist of a scientific description of the problem. Students are responsible for all stages of each data analysis from obtaining the data to the final report. At the conclusion of each analysis each student must turn in:

  1. A write-up of their data analysis in a synthesized format, with numbered figures and references. (You may also include supplementary material for detailed additional calculations/analyses)
  2. A reproducible Rmd file that produces all of the numbers, figures and results in your write-up.

All documents should be submitted electronically. Each analysis will be evaluated on the following criteria:

  1. Did you answer the scientific question? (30%)
  2. Did you use appropriate statistical methods? (40%)
  3. Was your write-up simple, clear, and precise? (20%)
  4. Was your code reproducible? (10%)

Keep in mind that this is a methods class. In some cases standard methodology will be sufficient to answer the question of interest. You may speak to your fellow students about specific statistical questions related to the projects, but the overall idea, analysis, and write-up should be your own individual work. You should cite any help you get from fellow students/TAs in your report in standard citation format.

Data Analysis Reviews

After the data analyses are turned in, each one will be randomly assigned to another student for review. Your review will be due one week after it is assigned. Your comments should have the format of a typical peer review. You should include a summary of the analyses and conclusions in the project you are reviewing, any major revisions, and any minor revisions. I will also evaluate each data analysis independently to assign a grade. Synthesized comments will be made available for each project.
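One way to picture a random assignment in which nobody reviews their own analysis is sketched below. This is only an illustration under stated assumptions, not the procedure actually used in the course: the function name `assign_reviewers`, the `seed` parameter, and the student labels are all hypothetical, and student identifiers are assumed to be unique.

```python
import random


def assign_reviewers(students, seed=None):
    """Return a dict mapping each reviewer to the author they review.

    Shuffling and then pairing each student with the next one in the
    shuffled order guarantees nobody reviews their own analysis.
    """
    if len(students) < 2:
        raise ValueError("need at least two students")
    rng = random.Random(seed)
    order = list(students)
    rng.shuffle(order)
    return {order[i]: order[(i + 1) % len(order)] for i in range(len(order))}


# Hypothetical class of five students.
assignments = assign_reviewers(["s1", "s2", "s3", "s4", "s5"], seed=1)
for reviewer, author in assignments.items():
    print(f"{reviewer} reviews {author}")
```

The shuffle-and-rotate approach is simple and always valid, but it only produces assignments that form a single cycle, so it does not sample uniformly from every possible assignment.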

Homework

Every two weeks you will be assigned one or more mathematical, directed problems focused on a statistical method we have covered. These problems may have multiple parts. The solutions should be submitted as PDF files.

Final Project

The final project will have the same format as the data analyses. It will be slightly longer than the data analysis projects in terms of space and more in depth in terms of analysis. For 753, the final project will be assigned to you. For 754, you will be able to select your own data set and project.

For 754, the choice of your final project is up to you. The project should involve data/code that you can obtain, process, analyze, and synthesize yourself. Keep in mind that real scientists make their own data. You may use any of the methods you learn during the course, or any other methods you know or look up.

Grading for the final project will be weighted by the difficulty of the project you undertake. The more difficult the project you take on, the greater the multiplier of your final score. The maximum possible score will still be 100%.
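The difficulty weighting described above amounts to scaling a raw score and capping the result at 100%. A minimal sketch follows. The function name and the multiplier values are hypothetical, since the syllabus does not say how multipliers are chosen.

```python
def weighted_final_score(raw_percent, difficulty_multiplier):
    """Scale a raw score by a difficulty multiplier, capped at 100%."""
    if raw_percent < 0 or difficulty_multiplier <= 0:
        raise ValueError("score must be >= 0 and multiplier must be > 0")
    return min(100.0, raw_percent * difficulty_multiplier)


# Hypothetical multipliers for a harder and a much harder project.
print(weighted_final_score(85.0, 1.1))
print(weighted_final_score(95.0, 1.2))  # capped at the 100% maximum
```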

Structure of Class Time

Class will consist of both lectures on statistical methodology and hands-on practice. The hands-on practice problems will be assigned in advance of each lecture, giving you time to look them over and come up with questions. The plan is for students to work on the problem and ask questions, followed by the instructor or a chosen student presenting their solution.

Tentative syllabus (753 and 754)

  • Obtaining data and data processing
  • Exploratory data analysis
  • Regression and generalizations
  • Smoothing
  • Prediction
  • High dimensional analysis
  • Simulation studies

Books
