“Reported” versus “Actual”: Two Different Things
Very few states or countries report the “actual date” of death. Aggregate databases such as The COVID Tracking Project typically present deaths by “reported date” instead. This data is often choppy: few deaths are reported over the weekend, and many are reported early in the work week. Reporting by “reported date” is certainly the simplest method, because it never requires going back to revise the counts for earlier days.
Unfortunately, the “reported date” of death is often misconstrued as the “actual date” of death. Misrepresented death-date data can lead to poor decisions and instill panic (see Just the News: “How COVID-19 fatality reports are distorting the data on daily death rates”). Plotted by reported date, the data can show a sharp rise in deaths. In reality, the spike is likely a smaller hump that reflects deaths that occurred earlier.
A Methodology for Estimating Actual Date of Death from Reported Death Date Data
Here we present a simple methodology to accurately estimate the trend line for the actual date of death. Using a random number generator, we assign each reported death a date in the past. The delay is drawn from either a normal distribution (specified by a mean and sigma) or a uniform distribution (specified by a mean and upper and lower bounds). We can add extra delay or spread depending on the day of the week.
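A single redistribution pass along these lines can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the symmetric uniform bounds (mean plus or minus a half-width), the rounding to whole days, and the clamping of negative delays to zero are all assumptions made for the example.

```python
import random
from collections import Counter
from datetime import date, timedelta


def sample_delay_days(rng, mean, spread, dist="normal"):
    """Draw one reporting delay in whole days (hypothetical helper).

    For "normal", spread is sigma. For "uniform", spread is the half-width
    of a symmetric interval around the mean (an assumption; the bounds
    could equally be asymmetric).
    """
    if dist == "normal":
        delay = rng.gauss(mean, spread)
    elif dist == "uniform":
        delay = rng.uniform(mean - spread, mean + spread)
    else:
        raise ValueError(f"unknown distribution: {dist}")
    # Assumption: a death cannot be reported before it happens.
    return max(0, round(delay))


def redistribute_once(reported, mean, spread, dist="normal", seed=None):
    """Move every reported death back in time by a random delay.

    reported maps a report date to the number of deaths reported that day.
    Returns a Counter mapping estimated actual dates to death counts.
    """
    rng = random.Random(seed)
    estimated = Counter()
    for report_date, count in reported.items():
        for _ in range(count):
            delay = sample_delay_days(rng, mean, spread, dist)
            estimated[report_date - timedelta(days=delay)] += 1
    return estimated
```

Each pass preserves the total number of deaths and only changes the dates they are attributed to. A single pass is noisy, which is why averaging several passes is useful.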
In the figure below, deaths by actual date as reported by the Arizona Department of Public Health are shown in blue, plotted against the moving average of reported deaths per day from The COVID Tracking Project in orange. The moving average by “reported date” spikes sharply. It traces a different curve from the “actual date” of death, which shows a less severe increase that lies further in the past. The “actual date” of death derived by our algorithm is shown in green and closely matches the Arizona “actual date” data in blue.
Our algorithm applies a normal (Gaussian) distribution to each data sample and spreads the reported dates back in time. We averaged 100 iterations through the data set. The mean delay was adjusted by day of the week to account for differences in reporting delay. Below is the table of parameters used for this exercise that yielded a close match to the actual date of death data.
|Day||Mean Delay||Sigma Delay|
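The averaging procedure described above might look like the sketch below. The per-weekday parameter values are placeholders invented for illustration and are not the values from the table. The function name, the keying by the weekday of the report, and the clamping of negative delays to zero are likewise assumptions.

```python
import random
from collections import defaultdict
from datetime import date, timedelta

# PLACEHOLDER parameters, not the values from the table above.
# Keyed by weekday of the report date (Monday=0): (mean delay, sigma delay) in days.
HYPOTHETICAL_PARAMS = {weekday: (7.0, 3.0) for weekday in range(7)}
HYPOTHETICAL_PARAMS[1] = (9.0, 3.0)  # e.g. extra delay for a Tuesday backlog


def estimate_actual_dates(reported, params, iterations=100, seed=0):
    """Average several random redistributions of reported deaths.

    reported maps a report date to the number of deaths reported that day.
    Returns a dict mapping estimated actual dates to average deaths per day.
    """
    rng = random.Random(seed)
    totals = defaultdict(float)
    for _iteration in range(iterations):
        for report_date, count in reported.items():
            mean, sigma = params[report_date.weekday()]
            for _death in range(count):
                # Assumption: delays are whole days and never negative.
                delay = max(0, round(rng.gauss(mean, sigma)))
                totals[report_date - timedelta(days=delay)] += 1
    # Divide by the iteration count to get the mean curve.
    return {day: n / iterations for day, n in sorted(totals.items())}
```

Called with a series such as `{date(2020, 7, 14): 120, date(2020, 7, 15): 95}`, the function returns a smoothed per-day curve whose total still equals the number of reported deaths. Tuning the placeholder parameters against a jurisdiction that does publish actual dates is what produces a close match.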
In the absence of accurate date-of-death tracking, this algorithm can be applied to derive a close approximation of the actual date of death. Where only “reported date” data is available, the derived curve gives a much more accurate view of when deaths actually occurred.