The Hypothesis of Testing: Paradoxes arising out of reported coronavirus case-counts

Coronavirus case-count data has influenced government policies and drives most epidemiological forecasts. Limited testing is cited as the key reason so little is known about the COVID-19 pandemic. While expanded testing is laudable, measurement error and selection bias are the two greatest problems limiting our understanding of the pandemic; neither can be fully addressed by increased testing capacity. In this paper, we demonstrate their impact on estimation of point prevalence and the effective reproduction number. We show that estimates based on the millions of molecular tests in the US have the same mean squared error as a small simple random sample. To address this, we present a procedure that combines case-count data and random samples over time to estimate selection propensities based on key covariate information. We then combine these selection propensities with epidemiological forecast models to construct a doubly robust estimation method that accounts for both measurement error and selection bias. This method is then applied to estimate Indiana's prevalence using case-count, hospitalization, and death data with cumulative demographic information, the United States's only statewide random molecular sample (collected April 25-29), and Facebook's COVID-19 symptom survey. We end with a series of recommendations based on the proposed methodology.
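The claim that a very large self-selected testing sample can be no better than a small simple random sample can be illustrated with a short simulation. The sketch below is not taken from this repository (whose code is in R); it is a standalone Python illustration under made-up assumptions: a fixed population of 100,000 with 2% true prevalence, infected people tested with probability 0.30 and uninfected people with probability 0.10, and a perfectly accurate test. The function name `simulate` and all parameter values are hypothetical, and measurement error is deliberately left out.

```python
import random


def simulate(n_pop=100_000, prevalence=0.02, p_test_infected=0.30,
             p_test_healthy=0.10, n_srs=500, n_reps=20, seed=1):
    """Compare the MSE of a naive case-count estimate with that of a small SRS.

    All parameter values are illustrative assumptions, not estimates from data.
    """
    rng = random.Random(seed)
    n_infected = int(n_pop * prevalence)
    # Fixed population: 1 = infected, 0 = not infected.
    population = [1] * n_infected + [0] * (n_pop - n_infected)
    true_prev = n_infected / n_pop

    sq_err_biased, sq_err_srs, tested_sizes = [], [], []
    for _ in range(n_reps):
        # Self-selected testing: infected people are more likely to be tested.
        tested = [y for y in population
                  if rng.random() < (p_test_infected if y else p_test_healthy)]
        est_biased = sum(tested) / len(tested)  # naive positivity rate

        # Simple random sample without replacement from the same population.
        srs = rng.sample(population, n_srs)
        est_srs = sum(srs) / n_srs

        sq_err_biased.append((est_biased - true_prev) ** 2)
        sq_err_srs.append((est_srs - true_prev) ** 2)
        tested_sizes.append(len(tested))

    return {
        "true_prevalence": true_prev,
        "mean_tested": sum(tested_sizes) / n_reps,
        "mse_biased": sum(sq_err_biased) / n_reps,
        "mse_srs": sum(sq_err_srs) / n_reps,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.6g}")
```

Under these assumptions the self-selected sample is many times larger than the random sample, yet its error is dominated by a bias that does not shrink as more tests are added, whereas the random sample's error comes only from sampling variance.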

Project Description

This project includes the code needed to reproduce the results: (A) sourcing both US and world testing data, (B) algorithmic development, and (C) application of models to the cleaned datasets. If you use this code, please cite the paper using the following BibTeX entry:

@article{dempsey:2020,
author = {Dempsey, Walter},
title = {The Hypothesis of Testing: Paradoxes arising out of reported coronavirus case-counts},
journal = {arXiv},
year = {2020}}

Code Description

The code is organized as follows:

Dependencies: all code is developed in R.
Datasets and exploratory data analysis

The methods directory contains all code relevant to this project.
Final reports can be found in the write-up directory.

Repository: wdempsey/covid-umich