TY - JOUR
T1 - Bayesian multivariate areal wombling for multiple disease boundary analysis
AU - Ma, Haijun
AU - Carlin, Bradley P.
PY - 2007
Y1 - 2007
N2 - Multivariate data summarized over areal units (counties, zip codes, etc.) are common in the field of public health. Estimation or testing of geographic boundaries for such data may have varied goals. For example, for data on multiple disease outcomes, we may be interested in a single set of "composite" boundaries for all diseases, separate boundaries for each disease, or both. Different areal wombling (boundary analysis) techniques are needed to meet these different requirements. But in any case, the underlying statistical model needs to account for correlations across both diseases and locations. Utilizing recent developments in multivariate conditionally autoregressive (MCAR) distributions and spatial structural equation modeling, we suggest a variety of Bayesian hierarchical models for multivariate areal boundary analysis, including some that incorporate random neighborhood structure. Many of our models can be implemented via standard software, namely WinBUGS for posterior sampling and R for summarization and plotting. We illustrate our methods using Minnesota county-level esophagus, larynx, and lung cancer data, comparing models that account for both, only one, or neither of the aforementioned correlations. We identify both composite and cancer-specific boundaries, selecting the best statistical model using the DIC criterion. Our results indicate primary boundaries in both the composite and cancer-specific response surface separating the mining- and tourism-oriented northeast counties from the remainder of the state, as well as secondary (residual) boundaries in the Twin Cities metro area.
AB - Multivariate data summarized over areal units (counties, zip codes, etc.) are common in the field of public health. Estimation or testing of geographic boundaries for such data may have varied goals. For example, for data on multiple disease outcomes, we may be interested in a single set of "composite" boundaries for all diseases, separate boundaries for each disease, or both. Different areal wombling (boundary analysis) techniques are needed to meet these different requirements. But in any case, the underlying statistical model needs to account for correlations across both diseases and locations. Utilizing recent developments in multivariate conditionally autoregressive (MCAR) distributions and spatial structural equation modeling, we suggest a variety of Bayesian hierarchical models for multivariate areal boundary analysis, including some that incorporate random neighborhood structure. Many of our models can be implemented via standard software, namely WinBUGS for posterior sampling and R for summarization and plotting. We illustrate our methods using Minnesota county-level esophagus, larynx, and lung cancer data, comparing models that account for both, only one, or neither of the aforementioned correlations. We identify both composite and cancer-specific boundaries, selecting the best statistical model using the DIC criterion. Our results indicate primary boundaries in both the composite and cancer-specific response surface separating the mining- and tourism-oriented northeast counties from the remainder of the state, as well as secondary (residual) boundaries in the Twin Cities metro area.
KW - Areal data
KW - Cancer
KW - Surveillance, Epidemiology and End Results (SEER) data
KW - Multivariate conditionally autoregressive (MCAR) model
UR - http://www.scopus.com/inward/record.url?scp=72249091018&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=72249091018&partnerID=8YFLogxK
U2 - 10.1214/07-BA211
DO - 10.1214/07-BA211
M3 - Article
AN - SCOPUS:72249091018
SN - 1936-0975
VL - 2
SP - 281
EP - 302
JO - Bayesian Analysis
JF - Bayesian Analysis
IS - 2
ER -