This lesson summarizes the topics we'll be covering in section 20 and why they'll be important to you as a data scientist.
You will be able to:
- Understand and explain what is covered in this section
- Understand and explain why the section will help you to become a data scientist
In this section, we'll be looking at experimental design and hypothesis testing.
Without good experimental design, it's very easy to draw the wrong conclusions from your experiments. Because of that, we kick this section off by looking at the scientific method and the key elements of good experimental design: forming null and alternative hypotheses, conducting an experiment, analyzing the results for statistical significance, and drawing conclusions.
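The hypothesis-testing workflow above can be sketched with a minimal example. Everything here is hypothetical (the 60-second historical mean and the simulated sample are made up for illustration); the point is the flow from hypotheses, to test, to conclusion:

```python
import numpy as np
from scipy import stats

# Hypothetical question: does a new landing page change average session
# time from the historical mean of 60 seconds?
# H0 (null):        mean session time == 60
# H1 (alternative): mean session time != 60
rng = np.random.default_rng(42)
sessions = rng.normal(loc=64, scale=10, size=50)  # simulated sample data

# Run a one-sample t-test against the historical mean
t_stat, p_value = stats.ttest_1samp(sessions, popmean=60)

alpha = 0.05  # significance threshold chosen before the experiment
if p_value < alpha:
    conclusion = "reject the null hypothesis"
else:
    conclusion = "fail to reject the null hypothesis"
print(f"t = {t_stat:.2f}, p = {p_value:.4f} -> {conclusion}")
```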
We then look at how to calculate and interpret the size of the difference between control and test groups. We'll see how the "effect size" can be used to communicate the practical significance of experimental results, to perform meta-analyses of multiple studies, and to perform power analysis to determine the number of participants a study would require to achieve a certain probability of finding a true effect. We'll also look at t-tests and how they can be used to compare two averages and assess how significant the difference between two sets of results is.
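As a sketch of these two ideas together, the snippet below computes Cohen's d (one common effect size measure) alongside an independent two-sample t-test. The control and test group data are simulated, hypothetical numbers:

```python
import numpy as np
from scipy import stats

def cohens_d(group1, group2):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * np.var(group1, ddof=1) +
                  (n2 - 1) * np.var(group2, ddof=1)) / (n1 + n2 - 2)
    return (np.mean(group1) - np.mean(group2)) / np.sqrt(pooled_var)

rng = np.random.default_rng(0)
control = rng.normal(100, 15, size=40)     # hypothetical control group
treatment = rng.normal(108, 15, size=40)   # hypothetical test group

d = cohens_d(treatment, control)                  # practical significance
t_stat, p_value = stats.ttest_ind(treatment, control)  # statistical significance
print(f"Cohen's d = {d:.2f}, t = {t_stat:.2f}, p = {p_value:.4f}")
```

Note that the effect size and the p-value answer different questions: the p-value tells you whether a difference is likely real, while Cohen's d tells you how large it is in practical terms.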
From there, we introduce the concepts of type I (false positive) and type II (false negative) errors and the inherent trade-off between them.
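The trade-off can be made concrete with a small simulation: lowering the significance threshold alpha makes false positives rarer but false negatives more common. The effect size and sample size used here are arbitrary illustrative choices:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_sims, n = 2000, 30
true_effect = 0.5  # mean under the alternative (the null mean is 0)

def error_rates(alpha):
    """Estimate type I and type II error rates for a one-sample t-test by simulation."""
    type1 = type2 = 0
    for _ in range(n_sims):
        # Under H0 the null is true, so rejecting it is a type I error
        null_sample = rng.normal(0, 1, n)
        if stats.ttest_1samp(null_sample, 0).pvalue < alpha:
            type1 += 1
        # Under H1 a real effect exists, so failing to reject is a type II error
        alt_sample = rng.normal(true_effect, 1, n)
        if stats.ttest_1samp(alt_sample, 0).pvalue >= alpha:
            type2 += 1
    return type1 / n_sims, type2 / n_sims

for alpha in (0.10, 0.05, 0.01):
    t1, t2 = error_rates(alpha)
    print(f"alpha={alpha:.2f}: type I rate ~{t1:.3f}, type II rate ~{t2:.3f}")
```

As alpha shrinks, the type I rate falls (it tracks alpha itself) while the type II rate climbs: you can't minimize both simultaneously without more data or a larger effect.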
We then introduce the concept of the power of a statistical test - the test's ability to detect a difference when one exists. We look at how it relates to p-values and effect size for hypothesis testing, and get some practice calculating statistical power using SciPy. We then pull together all of the previous ideas and ask you to design an experiment for a political campaign.
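One way to estimate power with SciPy is by simulation: generate many experiments in which a real effect exists, and count how often the test detects it. This sketch (with an arbitrary effect size of d = 0.5) also shows how power grows with sample size:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

def simulated_power(effect_size, n_per_group, alpha=0.05, n_sims=2000):
    """Estimate power as the fraction of simulated experiments where p < alpha."""
    rejections = 0
    for _ in range(n_sims):
        control = rng.normal(0, 1, n_per_group)
        treatment = rng.normal(effect_size, 1, n_per_group)
        if stats.ttest_ind(treatment, control).pvalue < alpha:
            rejections += 1
    return rejections / n_sims

# For a fixed effect size, power increases with the number of participants
for n in (20, 50, 100):
    print(f"n per group = {n:3d}: power ~ {simulated_power(0.5, n):.2f}")
```

Running the loop in reverse - fixing a target power (say 0.80) and searching over n - is exactly the power analysis described above for choosing how many participants a study needs.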
From there, we look at some of the issues that arise when performing multiple comparisons - from the risks of spurious correlations to the importance of corrections, such as the Bonferroni correction, for controlling those risks.
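The Bonferroni correction itself is simple: divide the significance threshold by the number of comparisons. The p-values below are hypothetical, chosen to show how the correction prunes away likely-spurious "discoveries":

```python
import numpy as np

# Hypothetical p-values from 10 independent comparisons
p_values = np.array([0.001, 0.008, 0.012, 0.041, 0.049,
                     0.090, 0.210, 0.340, 0.620, 0.880])
alpha = 0.05

# Uncorrected: every p-value under alpha looks "significant"
uncorrected = p_values < alpha

# Bonferroni correction: divide alpha by the number of comparisons
bonferroni_alpha = alpha / len(p_values)   # 0.05 / 10 = 0.005
corrected = p_values < bonferroni_alpha

print(f"Significant without correction: {uncorrected.sum()}")  # 5
print(f"Significant with Bonferroni:    {corrected.sum()}")    # 1
```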
Next up is A/B testing. We start by introducing the concept of an A/B test, and then, building on our recent experience of experimental design, we go through the process of designing, structuring, and running an A/B test.
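Analyzing an A/B test often comes down to comparing two conversion rates. One common approach is a two-proportion z-test, sketched below with made-up traffic and conversion numbers:

```python
import numpy as np
from scipy import stats

# Hypothetical A/B test: conversions out of visitors for each variant
conversions_a, visitors_a = 200, 4000   # control:  5.0% conversion
conversions_b, visitors_b = 248, 4000   # variant:  6.2% conversion

p_a = conversions_a / visitors_a
p_b = conversions_b / visitors_b

# Pooled proportion under the null hypothesis (no real difference)
p_pool = (conversions_a + conversions_b) / (visitors_a + visitors_b)
se = np.sqrt(p_pool * (1 - p_pool) * (1 / visitors_a + 1 / visitors_b))

z = (p_b - p_a) / se
p_value = 2 * stats.norm.sf(abs(z))  # two-sided p-value
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

Just as with the earlier experiments, the sample size per variant should be chosen up front via a power analysis, not by peeking at the results as they come in.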
We then take a little bit of time to consider the implications of misusing metrics - even if our experiments are initially sound.
Finally, we spend some time introducing the ANOVA (Analysis of Variance) method for generalizing the previous discussion of statistical tests to multiple groups.
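Where a t-test compares two means, a one-way ANOVA asks whether any of several group means differ. A minimal sketch, using made-up test scores for three hypothetical teaching methods:

```python
from scipy import stats

# Hypothetical test scores for three teaching methods
method_a = [82, 85, 88, 75, 90, 84]
method_b = [78, 74, 80, 72, 77, 75]
method_c = [88, 92, 85, 94, 89, 91]

# One-way ANOVA: is at least one group mean different from the others?
f_stat, p_value = stats.f_oneway(method_a, method_b, method_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```

A significant result only says that some difference exists among the groups; follow-up pairwise comparisons (with an appropriate multiple-comparison correction) are needed to say which groups differ.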
Without a good understanding of experimental design, it's easy to end up mistaking spurious correlations for meaningful results, or placing too much (or too little) weight on the results of any given test. In this section, we cover a range of tools and techniques to ensure that you design your experiments rigorously and interpret them thoughtfully.