Dichotomizing continuous variables is a popular technique for estimating the effect of risk factors on health outcomes in multivariate regression settings. Researchers follow this practice to simplify data analysis, which it unquestionably does. However, the thresholds used to dichotomize these variables are usually ad hoc, based on expert opinion or on mean, median, or quantile splits, and they can bias the estimated effect of the risk factors on specific outcomes and lead to its underestimation. In this paper, we suggest using a semi-parametric method and visualization to improve threshold selection in variable dichotomization while accounting for mixture distributions in the outcome of interest and adjusting for covariates. For clinicians, these empirically based thresholds of risk factors, if they exist, could be informative by indicating the highest or lowest value of a risk factor beyond which no additional impact on the outcome should be expected.
Bibliographical note
Funding Information:
Acknowledgments: The authors would like to thank the MESA investigators and staff for their flexibility regarding the use of their data for this work, and the participants of the MESA study for their valuable contributions. This work was supported by the National Heart, Lung, and Blood Institute Grant 1 R21 HL081175-01A1. MESA was supported by contracts N01-HC-95159 through N01-HC-95165 and N01-HC-95169 from the National Heart, Lung, and Blood Institute.
- Generalized additive model
- Recycled prediction
- Smearing estimates
- Threshold detection