TY - JOUR
T1 - A graphical model for skewed matrix-variate non-randomly missing data
AU - Zhang, Lin
AU - Bandyopadhyay, Dipankar
N1 - Publisher Copyright: © 2018 The Authors. Published by Oxford University Press. All rights reserved.
PY - 2020/4/1
Y1 - 2020/4/1
AB - Epidemiological studies on periodontal disease (PD) collect relevant bio-markers, such as the clinical attachment level (CAL) and the probed pocket depth (PPD), at pre-specified tooth sites clustered within a subject's mouth, along with various other demographic and biological risk factors. Routine cross-sectional evaluations are conducted under a linear mixed model (LMM) framework with underlying normality assumptions on the random terms. However, a careful investigation reveals considerable non-normality manifested in those random terms, in the form of skewness and tail behavior. In addition, PD progression is hypothesized to be spatially referenced, i.e. disease status at proximal tooth-sites may be different from distally located sites, and tooth missingness is non-random (or informative), given that the number and location of missing teeth inform about the periodontal health in that region. To mitigate these complexities, we consider a matrix-variate skew-t formulation of the LMM with a Markov graphical embedding to handle the site-level spatial associations of the bivariate (PPD and CAL) responses. Within the same framework, the non-randomly missing responses are imputed via a latent probit regression of the missingness indicator over the responses. Our hierarchical Bayesian framework, powered by relevant Markov chain Monte Carlo steps, addresses the aforementioned complexities within a unified paradigm, and estimates model parameters with seamless sharing of information across various stages of the hierarchy. Using both synthetic and real clinical data assessing PD status, we demonstrate a significantly improved fit of our proposition over various alternative models.
KW - Bayesian
KW - MCMC
KW - Matrix-variate data
KW - Non-random missingness
KW - Skew-t
KW - Spatial
UR - http://www.scopus.com/inward/record.url?scp=85081950264&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85081950264&partnerID=8YFLogxK
U2 - 10.1093/biostatistics/kxy056
DO - 10.1093/biostatistics/kxy056
M3 - Article
C2 - 30371748
AN - SCOPUS:85081950264
SN - 1465-4644
VL - 21
SP - E80-E97
JO - Biostatistics
JF - Biostatistics
IS - 2
ER -