There’s some uproar this week because the estimates provided by the model apparently most used by the Trump administration have been revised significantly downward. I’m not clear on exactly what was done, but a revision is not entirely surprising.
Let me give some background on modeling generally and the model that the administration seems to be using. I say “seems to be using” because the White House task force has not been clear about what model is being used.
The model whose results have changed is the IHME model (Institute for Health Metrics and Evaluation, University of Washington), which seems to be the primary model used by the White House response team. However, the figure President Donald Trump repeated in briefings last week, of as many as 2.2 million deaths, is the same as that given by the Imperial College model. Dr. Deborah Birx has mentioned both the IHME model and what seems to be an internal model.
There are a plethora of models, with more showing up every day, including at the state level. Modeling epidemics is not terribly complex; fitting a curve or a few differential equations will do. It’s useful to have several models to cross-check, but that’s not happening much, given the other pressures on states.
I think of models as being of one of two types, from the bottom up and from the top down. A model built from the top down chooses a curve to fit to a data set and then uses that curve to look at other data. One built from the bottom up takes component parts that go into the progress of the epidemic – how effectively the virus is transferred from one person to another, the effect of social distancing – gives them each a mathematical representation, and combines those representations into a model. Gradations between the two are possible.
The influence of various factors is easier to see in a bottom-up model. In a top-down model, the factors may be mixed with each other and harder to separate. The types of assumptions are different for the two types of model. A bottom-up model can separate the parts of the transmission process: interactions among people, susceptibility to the virus, the infectiousness of carriers, along with damping down of transmission by distancing and acquiring immunity. Both types of model can be useful. Matching the models with data and with each other helps to firm up parameters and assumptions.
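To make the bottom-up idea concrete, here is a minimal sketch of a textbook SIR (susceptible, infected, recovered) model. It is not the Imperial College model or any model named in this post. The function name, the parameter values, and the single "distancing" factor that scales down transmission are all assumptions chosen for illustration.

```python
def simulate_sir(beta, gamma, days, population=1_000_000,
                 initial_infected=10, distancing=0.0):
    """Daily-step SIR sketch; every number here is an illustrative assumption.

    beta       -- transmission rate per day with no distancing
    gamma      -- recovery rate per day (1 / days spent infectious)
    distancing -- fraction by which distancing cuts transmission (0 to 1)
    """
    susceptible = float(population - initial_infected)
    infected = float(initial_infected)
    recovered = 0.0
    infected_by_day = []
    for _ in range(days):
        # Each component of the process gets its own term.
        effective_beta = beta * (1.0 - distancing)
        new_infections = effective_beta * susceptible * infected / population
        new_recoveries = gamma * infected
        susceptible -= new_infections
        infected += new_infections - new_recoveries
        recovered += new_recoveries
        infected_by_day.append(infected)
    return infected_by_day


baseline = simulate_sir(beta=0.3, gamma=0.1, days=300)
distanced = simulate_sir(beta=0.3, gamma=0.1, days=300, distancing=0.5)
print("peak infected, no distancing:", round(max(baseline)))
print("peak infected, with distancing:", round(max(distanced)))
```

Because distancing enters as its own term, its effect can be read off directly by changing one argument. That is the kind of separation a fitted curve does not offer.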
The Imperial College model is bottom-up. It starts with the ways the SARS-CoV-2 virus might be transferred among people, works through how social distancing affects that, and builds separate equations for different parts of the process. The IHME model is top-down. It fits curves representing deaths in various locations with four parameters. It then works back from numbers of deaths to the need for hospitalization and equipment.
I’ve looked at the Imperial College model in some detail.
As I understand it, the function describing the curve in the IHME model was derived by fitting cumulative death data from Wuhan, China. Then the same function was fit to data from other places. Of the four variables, two are interpolated from Wuhan data (the change now presumably incorporating data from seven regions in Italy and Spain), and two vary with the specific location.
Because of those last two, the uncertainty band will be larger early on, when fewer location-specific data are available, than later with more data. The IHME projections are said to be updated as new data come in.
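The effect of having little local data can be shown with a toy example. This is a sketch under stated assumptions, not the IHME method: I use a logistic curve as a stand-in for whatever function IHME fits, synthetic data rather than real death counts, one "shared" parameter (the growth rate) held fixed, and two "local" parameters (final size and midpoint) searched on a coarse grid. All names and numbers are hypothetical.

```python
import math


def logistic(t, final_size, rate, midpoint):
    """Cumulative curve that levels off at final_size (a stand-in shape)."""
    return final_size / (1.0 + math.exp(-rate * (t - midpoint)))


def plausible_final_sizes(observed, rate, tolerance=2.0):
    """Final sizes for which some midpoint fits the data within tolerance.

    The fit quality is the root-mean-square error in deaths. The grid and
    the tolerance are arbitrary choices made for this illustration.
    """
    accepted = []
    for final_size in range(200, 5001, 200):
        best_rms = float("inf")
        for half_day in range(0, 201):
            midpoint = half_day / 2.0
            squared_error = sum(
                (logistic(t, final_size, rate, midpoint) - y) ** 2
                for t, y in enumerate(observed)
            )
            best_rms = min(best_rms, math.sqrt(squared_error / len(observed)))
        if best_rms <= tolerance:
            accepted.append(final_size)
    return accepted


# Synthetic "cumulative deaths": true final size 1000, midpoint day 40.
data = [round(logistic(t, 1000, 0.2, 40)) for t in range(60)]

early = plausible_final_sizes(data[:20], rate=0.2)  # 20 days of local data
late = plausible_final_sizes(data[:60], rate=0.2)   # 60 days of local data
print("final sizes consistent with 20 days:", min(early), "to", max(early))
print("final sizes consistent with 60 days:", min(late), "to", max(late))
```

With only the early, nearly exponential part of the curve, a large final size with a late midpoint fits about as well as a small final size with an early midpoint, so the range of plausible totals is wide. Once the data pass the midpoint, the range narrows.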
The uncertainty bands are enormous, for many reasons. The discussion in the description of the model gives some sources of uncertainty, but there are other sources that may not be included. I hope to discuss all the things we don’t know and how they affect modeling in another post.
Although the United States plot reaches its maximum in mid-April, the maxima for individual states range from this week to mid-May. The Imperial College model projects a maximum for June and July.
Perhaps the most questionable assumption of the IHME model is that distancing will be as strict as the lockdowns observed in Wuhan and Italy. The United States has had less population distancing than those places. I’m not at all clear how this quantitatively fits into the four parameters of the primary equation of the IHME model. It looks to me like it is implicit in using the Wuhan and other data, but some of the discussion in the paper sounds like it is more explicitly included without saying how.
When I’ve listened to Birx speak about the models in the press conferences, I’ve had a concern that she doesn’t really understand them. She is said to have worked with the modeling of the AIDS epidemic, but that was some time back, and computer capabilities have greatly increased what can be done with models.
In the IHME model, I find it misleading that the hospital resource projections are given to three and four significant figures when the error bands are so large. Those numbers imply a precision that cannot be part of the model. That, the large adjustment just made, and the wide range of predictions from the different models will discredit models generally with the public.
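As a rough illustration of the significant-figures point, here is a small sketch that rounds an estimate to the number of figures its error band can support. The rule of thumb in `supported_figures` is my own assumption, not a standard and not anything IHME uses, and the numbers in the example are invented, not taken from any projection.

```python
import math


def round_to_significant(value, figures):
    """Round value to the given number of significant figures."""
    if value == 0:
        return 0
    magnitude = math.floor(math.log10(abs(value)))
    factor = 10 ** (magnitude - figures + 1)
    return round(value / factor) * factor


def supported_figures(estimate, low, high):
    """Rule-of-thumb count of significant figures an error band supports.

    Assumption: one figure for each factor of ten by which the estimate
    exceeds the half-width of the band, with a minimum of one figure.
    """
    relative_half_width = (high - low) / 2.0 / abs(estimate)
    return max(1, math.floor(-math.log10(relative_half_width)) + 1)


# Invented example: a projection of 16,323 with a band of 7,977 to 31,221.
estimate, low, high = 16323, 7977, 31221
figures = supported_figures(estimate, low, high)
print("figures supported:", figures)
print("reported value:", round_to_significant(estimate, figures))
```

With a band that wide, the sketch supports a single significant figure, so the projection would be reported as roughly 20,000 rather than 16,323.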
Here’s an article on epidemiological simulations generally that you may find informative.
Cross-posted to Balloon Juice