TY - JOUR
T1 - Estimating county-level mortality rates using highly censored data from CDC WONDER
AU - Quick, Harrison
N1 - Publisher Copyright: © 2019 Centers for Disease Control and Prevention (CDC).
PY - 2019
Y1 - 2019
N2 - Introduction: CDC WONDER is a system developed to promote information-driven decision making and provide access to detailed public health information to the general public. Although CDC WONDER contains a wealth of data, any counts fewer than 10 are suppressed for confidentiality reasons, resulting in left-censored data. The objective of this analysis was to describe methods for the analysis of highly censored data. Methods: A substitution approach was compared with 1) a simple, nonspatial Bayesian model that smooths rates toward their statewide averages and 2) a more complex Bayesian model that accounts for spatial and between-age sources of dependence. Age group-specific county-level data on heart disease mortality were used for the comparisons. Results: Although the substitution and nonspatial approach provided age-standardized rate estimates that were more highly correlated with the true rate estimates, the estimates from the spatial Bayesian model provided a superior compromise between goodness-of-fit and model complexity, as measured by the deviance information criterion. In addition, the spatial Bayesian model provided rate estimates with greater precision than the nonspatial approach; in contrast, the substitution approach did not provide estimates of uncertainty. Conclusion: Because of the ability to account for multiple sources of dependence and the flexibility to include covariate information, the use of spatial Bayesian models should be considered when analyzing highly censored data from CDC WONDER.
AB - Introduction: CDC WONDER is a system developed to promote information-driven decision making and provide access to detailed public health information to the general public. Although CDC WONDER contains a wealth of data, any counts fewer than 10 are suppressed for confidentiality reasons, resulting in left-censored data. The objective of this analysis was to describe methods for the analysis of highly censored data. Methods: A substitution approach was compared with 1) a simple, nonspatial Bayesian model that smooths rates toward their statewide averages and 2) a more complex Bayesian model that accounts for spatial and between-age sources of dependence. Age group-specific county-level data on heart disease mortality were used for the comparisons. Results: Although the substitution and nonspatial approach provided age-standardized rate estimates that were more highly correlated with the true rate estimates, the estimates from the spatial Bayesian model provided a superior compromise between goodness-of-fit and model complexity, as measured by the deviance information criterion. In addition, the spatial Bayesian model provided rate estimates with greater precision than the nonspatial approach; in contrast, the substitution approach did not provide estimates of uncertainty. Conclusion: Because of the ability to account for multiple sources of dependence and the flexibility to include covariate information, the use of spatial Bayesian models should be considered when analyzing highly censored data from CDC WONDER.
UR - http://www.scopus.com/inward/record.url?scp=85068148216&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85068148216&partnerID=8YFLogxK
U2 - 10.5888/pcd16.180441
DO - 10.5888/pcd16.180441
M3 - Article
C2 - 31198162
AN - SCOPUS:85068148216
SN - 1545-1151
VL - 16
JO - Preventing Chronic Disease
JF - Preventing Chronic Disease
IS - 6
M1 - 180441
ER -