WXML 2020 covid-modeling learning guide

This page provides a survey of some influential mathematical models being used to track and forecast COVID-19 in the United States. It was produced by Jarod Alper, an Associate Professor of Mathematics at the University of Washington whose research lies in algebraic geometry. He is not an expert in mathematical epidemiology and has no prior experience with infectious disease modeling. This page was put together as a learning guide for a Spring 2020 Washington Experimental Mathematics Lab (WXML) project.

Last updated: May 6, 2020

Influential modeling groups (in no particular order)
• Imperial College London (ICL)

• Institute for Health Metrics and Evaluation (IHME)
• The IHME model has 4 components:
1. identifying and processing covid data,
2. a statistical model in which, for each location, a curve is fit to population death rates as a function of time since the death rate first exceeded a given threshold in that location,
3. predicting the time a given location reaches this given threshold if it hasn't already, and
4. an individual age-structured microsimulation to model hospital usage.
• For the curve fit in (2), two different S-shaped (or sigmoid) curves were considered, each depending on 3 parameters. You can see the curves and change the parameter values in this desmos graph.
• The model is regularly updated and provides predictions on deaths and hospital usage in the US and Europe.
• projections, paper, faq, update notes, CurveFit documentation.
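As a rough illustration of what a 3-parameter sigmoid looks like, here is a minimal sketch. The two functional forms below (a scaled logistic and a scaled Gaussian error function) and the parameter names alpha (growth rate), beta (inflection time) and p (final level) are assumptions chosen for illustration; the desmos graph and the CurveFit documentation are the authoritative source for the curves IHME actually used.

```python
import math

def logistic_curve(t, alpha, beta, p):
    """Scaled logistic: p / (1 + exp(-alpha * (t - beta)))."""
    return p / (1.0 + math.exp(-alpha * (t - beta)))

def erf_curve(t, alpha, beta, p):
    """Scaled Gaussian error function: (p / 2) * (1 + erf(alpha * (t - beta)))."""
    return 0.5 * p * (1.0 + math.erf(alpha * (t - beta)))

if __name__ == "__main__":
    # Illustrative parameter values only, not fitted to any data.
    alpha, beta, p = 0.1, 30.0, 500.0
    for t in range(0, 61, 10):
        print(t,
              round(logistic_curve(t, alpha, beta, p), 1),
              round(erf_curve(t, alpha, beta, p), 1))
```

Both curves pass through p/2 at t = beta and level off at p, so p plays the role of the final cumulative death rate and beta the time of the peak in daily deaths.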

• Institute for Disease Modeling (IDM)
• IDM research reports modeling transmission dynamics in various locations including King & Snohomish counties, Oregon and sub-Saharan Africa.
• King & Snohomish counties March 10 report: Working paper – model-based estimates of COVID-19 burden in King and Snohomish counties through April 7, 2020 by Klein, et al.
• Provides projections on infections and deaths through April 7 using a stochastic SEIR model with model parameters taken from the scientific literature. To monitor hospital usage, each model uses a discrete compartmentalized event care usage model.
• Not age-structured or spatially located but simulations use a negative binomial transmission model with heterogeneity parameter k = 0.54.
• Considers the effects of various social distancing measures, given as a percentage reduction in transmission rates (usual, 25%, 50%, 75%).
• King county March 29 reports: Understanding the Impact of COVID-19 Policy Change in the Greater Seattle Area using Mobility Data by Burstein, et al and Social distancing and mobility reductions have reduced COVID-19 transmission in King County, WA by Thakkar, et al.
• Uses mobility data from Facebook Data For Good to determine the effect of policy changes. Namely, mobility data is used to determine (1) population flux between day & night, (2) percent difference in daytime residential occupancy, (3) percent difference in commuting, and (4) percent difference in daytime non-residential occupancy.
• Uses a stochastic SEIR model to estimate the effective reproductive rate R_e through March 24 from case data. Unknown parameters (assumed to depend on time) such as the transmission rate and reporting rate (% of infections detected) are fit to case data.
• King county April 10 report: Physical distancing is working and still needed to prevent COVID-19 resurgence in King, Snohomish, and Pierce counties by Thakkar, et al.
• Uses a method similar to the March 29 reports with updated data to estimate the effective reproductive rate through March 30.
• King county April 21 report: Short report: Updated analysis of COVID-19 transmission in King County, WA by Thakkar, et al.
• Uses a method similar to the March 29 reports with updated data to estimate the effective reproductive rate through April 9.
• King county April 29 report: Sustained reductions in transmission have led to declining COVID-19 prevalence in King County, WA by Thakkar, et al.
• Estimates the effective reproductive rate through April 15.
• Uses a stochastic SEIR model as in the March 29 report but incorporates a case detection rate p_t (the probability an infected case is reported) with independent values before and after March 10. It also assumes that the transmission rate varies over time with the structure inferred by the positive vs negative testing data. Assumes that an unknown quantity z_t of infections were imported to King County on January 15. The model determines the unknown parameters p_t and z_t by fitting them to the observed data.
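To make the phrase "stochastic SEIR model with a case detection rate" concrete, here is a minimal chain-binomial sketch. It is not IDM's code: the daily time step, the parameter values, and the names `simulate_seir` and `detect_prob` are assumptions for illustration, and the step change in `detect_prob` only mimics the idea of independent detection rates before and after a fixed date.

```python
import math
import random

def binomial(rng, n, p):
    """Draw from Binomial(n, p) by summing n Bernoulli trials (fine for small n)."""
    return sum(1 for _ in range(n) if rng.random() < p)

def simulate_seir(days, population, initial_exposed, beta, sigma, gamma,
                  detect_prob, seed=0):
    """Chain-binomial SEIR with daily steps.

    beta: transmission rate per day
    sigma: rate of leaving E, gamma: rate of leaving I
    detect_prob: function day -> probability a newly infectious case is reported
    Returns a list of (S, E, I, R, reported_cases), one tuple per day.
    """
    rng = random.Random(seed)
    s, e, i, r = population - initial_exposed, initial_exposed, 0, 0
    history = []
    for day in range(days):
        p_infect = 1.0 - math.exp(-beta * i / population)
        new_exposed = binomial(rng, s, p_infect)
        new_infectious = binomial(rng, e, 1.0 - math.exp(-sigma))
        new_recovered = binomial(rng, i, 1.0 - math.exp(-gamma))
        reported = binomial(rng, new_infectious, detect_prob(day))
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        history.append((s, e, i, r, reported))
    return history

if __name__ == "__main__":
    # Hypothetical detection rate: 10% before day 20, 30% afterwards.
    run = simulate_seir(60, 2000, 10, 0.5, 0.2, 0.1,
                        lambda day: 0.1 if day < 20 else 0.3)
    print(run[-1])
```

In the reports, quantities like the detection rate are unknowns fitted to observed case data; here they are simply fixed inputs so that the forward simulation is easy to read.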

• Global Epidemic and Mobility Project (GLEAM) with contributions from Northeastern
• The effect of travel restrictions on the spread of the 2019 novel coronavirus (COVID-19) outbreak (preprint March 6, published April 24), Science, by Chinazzi M., Davis J.T., et al. Additional details can be found in the supplement.
• Model description: individual-based, stochastic, and spatial-based model based on a metapopulation network approach which subdivides the world's roughly 200 countries into 3200 geographic subpopulations.
• Uses airline transportation data from Official Aviation Guide (OAG) and International Air Transport Association (IATA), and ground mobility from statistics offices for 30 countries on 5 continents.
• Visualization dashboard for number of infections, deaths, hospital and ICU beds needed in the US.
• GLEAM also offers:
• EpiRisk: "EpiRisk is a computational platform designed to allow a quick estimate of the probability of exporting infected individuals from sites affected by a disease outbreak to other areas in the world through the airline transportation network and the daily commuting patters. It also lets the user to explore the effects of potential restrictions applied to airline traffic and commuting flows."
• GLEAMviz: a desktop application allowing you to configure, run, and analyze your own simulations.

• Centre for the Mathematical Modeling of Infectious Diseases (CMMID) and the London School of Hygiene & Tropical Medicine (LSHTM)

• COVID-Projections by Youyang Gu (YYG)
• model details
• Uses a basic SEIR/SEIS model to simulate the covid epidemic in each country/state/region. The model is not age-structured and does not model hospital usage. The parameters of the model are learned using machine learning techniques that attempt to minimize the error between the actual data on the number of deaths (as reported by Johns Hopkins CSSE) and the model's projections.
• model summary with comparison to IHME
• projections of their model
• tracker of the basic reproduction number R_0 and number of infections
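The idea of learning SEIR parameters by minimizing the error against reported deaths can be sketched as follows. This is only a toy version under stated assumptions: a deterministic daily-step SEIR, a fixed fatality fraction, synthetic "observed" data, and a plain grid search over a single parameter standing in for whatever machine learning techniques the YYG model actually uses. All function names and parameter values are hypothetical.

```python
def seir_deaths(beta, days, population=1_000_000, initial_exposed=10.0,
                sigma=0.2, gamma=0.1, fatality=0.01):
    """Deterministic daily-step SEIR; returns expected daily deaths.

    Deaths are taken as a fixed fraction of those leaving I (a simplification).
    """
    s, e, i = population - initial_exposed, initial_exposed, 0.0
    deaths = []
    for _ in range(days):
        new_e = beta * s * i / population
        new_i = sigma * e
        removed = gamma * i
        s -= new_e
        e += new_e - new_i
        i += new_i - removed
        deaths.append(fatality * removed)
    return deaths

def squared_error(predicted, observed):
    """Sum of squared differences between two equally long series."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))

def fit_beta(observed, candidates):
    """Return the candidate beta minimizing squared error against observed deaths."""
    return min(candidates,
               key=lambda b: squared_error(seir_deaths(b, len(observed)), observed))

if __name__ == "__main__":
    observed = seir_deaths(0.35, 60)  # synthetic "data" generated with beta = 0.35
    candidates = [0.20 + 0.01 * k for k in range(31)]
    print(round(fit_beta(observed, candidates), 2))
```

With real data the error surface is noisy and several parameters are fitted at once, which is where more careful optimization and validation become necessary.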

• University of Texas COVID-19 Modeling Consortium
• Report: UT COVID-19 Mortality Forecasting Model (April 16)
• Uses a statistical curve-fitting approach to fit a sigmoid curve (in fact, the same 3-parameter curve as the IHME model) and a probabilistic error model to observed death rates. In comparison to the IHME model, they "reformulated the approach in a generalized linear model framework to correct a statistical flaw that leads to the underestimation of uncertainty in the IHME forecasts."
• Uses local mobile-phone GPS data from SafeGraph from each state to quantify the effects of social-distancing measures
• projections
• reports and publications

• Los Alamos National Laboratory (LANL)
• model description and forecasts
• Uses a basic SIR model combined with a statistical process where for each state, the model attempts to determine/learn (1) the "growth parameter" (i.e. transmissibility rate) as a function of time (assumed to decrease over time) based on trends in the number of new cases reported, (2) the case fatality rate determined by the number of new deaths and reported cases, and (3) the discrepancy between the actual and reported number of cases/deaths as a function of time.
• In forecasting, the model assumes the case fatality rate to be constant and that deaths are synchronous with a positive test.

• Columbia University
• Severe COVID-19 Risk Mapping : This tool shows projections for hospital demand for each US county and the expected date of peak capacity.
• paper summarizing results: Flattening the curve before it flattens us: hospital critical care capacity limits and mortality from novel coronavirus (SARS-CoV2) cases in US counties - 3 and 6 week projections from April 2, 2020
• paper with modeling details: Simulation of SARS-CoV2 Spread and Intervention Effects in the Continental US with Variable Contact Rates, March 24, 2020
• Uses a metapopulation SEIR model (similar to this Science paper) with S, E, I, R components for each county incorporating commuting and random movements of individuals using data from the US Census Bureau on county-to-county commuting patterns. Transmissions are separated into daytime (8 hours) and nighttime (16 hours).
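The day/night structure of a metapopulation SEIR model can be sketched as below. This is a heavily simplified deterministic version, not the Columbia model: it ignores random movement, uses forward Euler steps, and assumes a hypothetical `commute` matrix whose entry `commute[a][b]` is the fraction of residents of county a who spend the daytime in county b. Compartments are tracked by home county as (S, E, I, R) tuples.

```python
def step(state, commute, beta, sigma, gamma, dt, daytime):
    """Advance home-county compartments by dt days (forward Euler)."""
    n = len(state)
    pop = [sum(county) for county in state]
    if daytime:
        # People and infectious individuals physically present in each county.
        present_n = [sum(commute[a][b] * pop[a] for a in range(n)) for b in range(n)]
        present_i = [sum(commute[a][b] * state[a][2] for a in range(n)) for b in range(n)]
        # Force of infection felt by residents of a, averaged over where they work.
        force = [sum(commute[a][b] * beta * present_i[b] / present_n[b]
                     for b in range(n)) for a in range(n)]
    else:
        # At night everyone mixes only within their home county.
        force = [beta * state[a][2] / pop[a] for a in range(n)]
    new_state = []
    for (s, e, inf, r), lam in zip(state, force):
        new_e, new_i, new_r = lam * s * dt, sigma * e * dt, gamma * inf * dt
        new_state.append((s - new_e, e + new_e - new_i, inf + new_i - new_r, r + new_r))
    return new_state

def simulate(state, commute, beta, sigma, gamma, days):
    """Alternate an 8-hour daytime step and a 16-hour nighttime step."""
    for _ in range(days):
        state = step(state, commute, beta, sigma, gamma, 8 / 24, daytime=True)
        state = step(state, commute, beta, sigma, gamma, 16 / 24, daytime=False)
    return state

if __name__ == "__main__":
    start = [(9990.0, 0.0, 10.0, 0.0), (5000.0, 0.0, 0.0, 0.0)]
    commute = [[0.8, 0.2], [0.1, 0.9]]  # illustrative commuting fractions
    for county in simulate(start, commute, 0.5, 0.2, 0.1, 30):
        print([round(x, 1) for x in county])
```

With the identity matrix in place of `commute`, the second county never sees an infection, which shows that commuting is the only channel coupling the counties in this sketch.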

• University of Minnesota
• Model with links to video briefing, slides, faq, infographic and technical documentation
• Estimates number of daily cases & deaths and ICU usage in Minnesota and predicts peak usage. Examines impact of social distancing (assumed to reduce transmission rates by 50%) and shelter in place (assumed to reduce transmission rates by 80%).
• Uses an age- and comorbidity-structured compartmentalized SEIR model.

• Geneva
• COVID-19 Epidemic Forecasting Dashboard
• "We calculate the growth rate of cumulative cases (resp. deaths) between two days ago and today. If it's greater than 5%, we use an exponential model to forecast the cumulative number of cases (resp. deaths), and then derive the daily number of cases (resp. deaths). If it's less than 5%, we use an linear model instead."
• No other descriptive model documentation is provided but there is a public GitHub repository.
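The quoted rule is simple enough to sketch directly. The version below is one possible reading of it, not the dashboard's code: the quote does not say how the exponential or linear model is fitted, so passing each curve through the counts from two days ago and today is an assumption, as are the function names.

```python
def forecast_cumulative(cumulative, horizon, threshold=0.05):
    """Forecast cumulative counts `horizon` days ahead.

    growth = (today - two_days_ago) / two_days_ago, one reading of the quoted rule.
    Above the threshold an exponential model is used, otherwise a linear one.
    """
    if len(cumulative) < 3 or cumulative[-3] <= 0:
        raise ValueError("need three observations and a positive count two days ago")
    today, two_days_ago = cumulative[-1], cumulative[-3]
    growth = (today - two_days_ago) / two_days_ago
    if growth > threshold:
        daily_factor = (today / two_days_ago) ** 0.5  # geometric mean daily growth
        return [today * daily_factor ** k for k in range(1, horizon + 1)]
    daily_increment = (today - two_days_ago) / 2.0    # average daily increase
    return [today + daily_increment * k for k in range(1, horizon + 1)]

def daily_from_cumulative(last_observed, forecast):
    """Difference the cumulative forecast to get daily counts."""
    previous = [last_observed] + forecast[:-1]
    return [f - p for f, p in zip(forecast, previous)]

if __name__ == "__main__":
    fast = forecast_cumulative([100, 110, 121], 3)  # 21% growth: exponential branch
    slow = forecast_cumulative([100, 101, 102], 3)  # 2% growth: linear branch
    print([round(x, 2) for x in fast], daily_from_cumulative(102, slow))
```

A switch like this keeps the forecast from extrapolating exponential growth once the cumulative curve has flattened, at the cost of a discontinuity in behavior at the 5% threshold.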
Modeling papers (in no particular order)
• Kissler (Harvard), et al. Science paper
• Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period (preprint March 6, published in Science April 14)
• Supplementary materials
• Explores the role of seasonality, length of immunity, cross-immunity (with two other coronaviruses, HCoV-OC43 and HCoV-HKU1), social distancing and adding critical care capacity to model the transmission dynamics through 2025.
• Uses a compartmentalized deterministic SEIRS model with gamma-distributed waiting times.
• To estimate the incidence of each coronavirus, used the percentage of positive tests multiplied by the population-weighted proportion of influenza-like illnesses.
• homogeneous model: no age structure, not spatially located, did not include school closings.

• Dandekar (MIT)--Barbastathis (MIT) paper in PNAS
• Quantifying the effect of quarantine control in Covid-19 infectious spread using machine learning article (preprint April 3, 2020, published in PNAS)
• Uses a classical SIR approach with "a neural network added as a non linear function approximator (Rackauckas et al. 2020) informs the infected variable I in the SIR model. This neural network encodes information about the quarantine strength function in the locale where the model is implemented. The neural network is trained from publicly available infection and population data for Covid-19 for a specific region under study".
• The neural network is a two-layer fully connected network that learns Q(t), the fraction of the infected population I(t) that is in quarantine at time t, so that Q(t)I(t) is the number of people in the quarantine compartment. The function Q(t) is calculated from the (S, I, R, T) vector by the formula Q = f o A_2 o f o A_1 where A_2: R^10 -> R and A_1: R^4 -> R^10 are affine-linear transformations fitted to data, and f is the nonlinear activation function ReLU. This is a rather small neural network, with only 61 trainable parameters (40 + 10 for A_1 and 10 + 1 for A_2).
• Compares their SIR + neural network controlling quarantine strength Q to a classical SEIR approach (which does not incorporate a compartment for quarantine)
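The composition Q = f o A_2 o f o A_1 and the parameter count can be checked with a few lines of code. This is a sketch of the network's shape only: the weights are random rather than trained, the inputs are made up, and the helper names are hypothetical. It does not reproduce the training procedure from the paper.

```python
import random

def relu(v):
    """Apply ReLU componentwise."""
    return [max(0.0, x) for x in v]

def affine(weights, bias, v):
    """Compute W v + b for a weight matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]

def init_layer(rng, n_in, n_out):
    """Random (untrained) affine map R^n_in -> R^n_out."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, bias

def count_parameters(layers):
    """Total number of weights and biases across all layers."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)

def quarantine_strength(layers, s, i, r, t):
    """Q = f(A_2(f(A_1(x)))) with x = (S, I, R, T) and f = ReLU."""
    v = [s, i, r, t]
    for weights, bias in layers:
        v = relu(affine(weights, bias, v))
    return v[0]

rng = random.Random(0)
layers = [init_layer(rng, 4, 10), init_layer(rng, 10, 1)]  # A_1 and A_2
print(count_parameters(layers))  # 4*10 + 10 + 10*1 + 1 = 61
```

Because the outer activation is ReLU, the output Q is always nonnegative, which is consistent with interpreting it as a quarantine strength.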
• Giordano (Trento), et al paper in Nature Medicine
• Modeling the COVID-19 epidemic and implementation of population-wide interventions in Italy (published April 22, 2020)
• Uses a compartmentalized SIDARTHE model; this is a variation of the classical SIR/SEIR epidemiological model but here S=susceptible, I=infected, D=diagnosed, A=ailing, R=recognized, T=threatened, H=healed, E=extinct. This distinguishes infected individuals by whether they have been diagnosed and by the severity of their symptoms.
• They run simulations using data from Italy, modeling the effects of social distancing, population-wide testing and contact tracing.

The CDC has a very informative summary of COVID-19 forecasts and models.

Comparison of models

Recommended books
• Keeling and Rohani, Modeling infectious diseases in humans and animals
• This is one of the standard texts in mathematical epidemiology.
• Kiss, Miller, and Simon, Mathematics of Epidemics on Networks: From Exact to Approximate Models

Online COVID-related lectures and courses

Basic data and visualizations

Some educational and informative visualizations
• Modeling COVID-19 Spread vs Healthcare Capacity by Alison Hill (Harvard),
• An informative model from NY Times regarding impacts of social distancing
• article by Nicholas Kristof and Stuart A. Thompson
• model created with Gabriel Goh, Steven De Keninck, Ashleigh Tuite and David N. Fisman
• An interactive visualization by John Burn-Murdoch
• tracks the exponential spread of COVID-19 by country/state
• COVID-19 Scenarios
• by Richard Neher, et al at the Biozentrum, University of Basel
• basic homogeneous SEIR model with compartments including various hospital stages.
• individual model parameters can be varied.
• Impacts of social distancing measures are considered.
• Washington Post covid simulator
• article by Harry Stevens
• Most viewed Washington Post article ever.
• Covid Trends
• By Aatish Bhatia in collaboration with Minute Physics.
• Visualization of spread by country and US state.
• Breaking the wave
• by Jon McClure, a Reuters Graphics editor
• explains how the rate of increase in covid deaths is decreasing
• Forecasting s-curves is hard by Constance Crozier
• Several interactive notebooks by Yong-Yeol Ahn