Wednesday, April 8, 2020

Predictions

Some of our readers are understandably having a hard time following the numbers and where the analysis is coming from. I thought I would try something a little different by making a few predictions - well, extrapolations really - and showing where they come from. I will focus for the moment on the United States, since that is a little more like comparing apples to apples than, say, showing models for Italy and Turkey side-by-side.
First up, New York (State). New York was hit early and hit hard. It still has 1/3 of the active cases in the US right now. Also, a highly cited model keeps predicting that NY is about to peak in the next few days. Let's take a look at the past two weeks and see how things are looking right now. The blue line is the active confirmed cases in NY over the past two weeks, while the purple and red lines extrapolate (predict) those numbers into the next few days. The purple line is a straight-line estimate (linear growth). This estimate has predicted the past week pretty well, though it tends to be a little conservative. The red line is an N-squared (quadratic) model, which assumes some curvature in the data. It has more flexibility, so it can easily overfit the data, and in fact it has done a slightly worse job of prediction over the past week, typically overshooting the real numbers. The reality will probably be somewhere between these two lines, but hugging much closer to the linear (purple) line. That means by the end of the day today, we expect the number of active cases to be between 127k-131k. By April 12, the expectation is between 155k-167k.
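
For readers who want to see the mechanics, here is a minimal sketch of the two extrapolations described above: a straight-line fit and an N-squared (quadratic) fit, each projected a few days forward, plus a simple check that fits on the earlier days and scores the prediction on the most recent week. The counts are made up purely for illustration (they are not the New York data), and the helper names `polyfit`, `predict`, and `backtest_error` are my own. Because the made-up series is exactly quadratic, it says nothing about which model does better on real numbers.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit; returns coefficients, lowest power first."""
    n = degree + 1
    # Normal equations A c = b, with A[i][j] = sum(x^(i+j)) and b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def predict(coeffs, x):
    """Evaluate the fitted polynomial at day x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def backtest_error(xs, ys, degree, holdout):
    """Fit on all but the last `holdout` days; mean absolute error on those days."""
    coeffs = polyfit(xs[:-holdout], ys[:-holdout], degree)
    errors = [abs(predict(coeffs, x) - y)
              for x, y in zip(xs[-holdout:], ys[-holdout:])]
    return sum(errors) / holdout


# Hypothetical data: 14 days of active cases with a little curvature.
days = list(range(14))
active = [1000 + 500 * d + 10 * d * d for d in days]

linear = polyfit(days, active, 1)     # the "purple line"
quadratic = polyfit(days, active, 2)  # the "red line"

# Extrapolate both models over the next four days.
for d in range(14, 18):
    print(d, round(predict(linear, d)), round(predict(quadratic, d)))

# How well would each model have predicted the most recent week?
print("linear backtest error:", backtest_error(days, active, 1, 7))
print("quadratic backtest error:", backtest_error(days, active, 2, 7))
```

With real data you would replace `active` with the reported daily counts. The backtest is the part that tells you how much to trust each line: a model that fits the past closely but misses the held-out week is overfitting.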

The next chart shows the behavior of some of the states with the most active infections. Each of these is transitioning from exponential growth to linear, and the N-squared model has been a good fit so far. Extrapolating out, we see that they are all behaving very similarly, with only Michigan breaking from the pack and growing slightly faster than the rest.
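
One simple way to see a transition from exponential to linear growth in a series is to look at two things side by side: the day-over-day ratio and the day-over-day increase. Under exponential growth the ratio stays roughly constant while the increase keeps growing. Under linear growth the increase stays roughly constant while the ratio drifts down toward 1. The sketch below uses a made-up series (not data from any state), and the function names are my own.

```python
def daily_increase(counts):
    """Difference between each day's count and the previous day's."""
    return [b - a for a, b in zip(counts, counts[1:])]


def growth_ratio(counts):
    """Each day's count divided by the previous day's."""
    return [b / a for a, b in zip(counts, counts[1:])]


# Hypothetical series: doubling for five days, then adding 800 per day.
counts = [100, 200, 400, 800, 1600, 2400, 3200, 4000, 4800]

for day, (inc, ratio) in enumerate(
        zip(daily_increase(counts), growth_ratio(counts)), start=1):
    print(f"day {day}: +{inc} cases, x{ratio:.2f}")
```

In the early days the ratio holds steady while the increase doubles. In the later days the increase holds steady while the ratio shrinks. Real data is noisier, so in practice you would look at the trend over several days rather than any single value.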

Finally, a few of the other states, including both states with linear growth and states with transitional models. These states were chosen specifically because they have been stable over the past ten days. States where new events have had a sudden impact on the model, or where the data is very noisy, are not shown, as they aren't good examples for our explanations. However, the same principles apply.
