The ecological fallacy is related to Simpson’s paradox (1951): relationships among group means may be counterintuitive and may differ substantially from relationships within groups. The groups are usually geographic entities such as census tracts. We consider the problem of estimating the correlation between two jointly normal random variables when only ecological data (group means) are available. Two empirical Bayes estimators and one fully Bayesian estimator are derived and compared with the usual ecological estimator, which is simply the Pearson correlation coefficient of the group sample means. We assess the bias and mean squared error of these estimators by simulation, and also give an example using a dataset for which the individual-level data are available for model checking. The results indicate that the empirical Bayes estimators are superior in a variety of practical situations in which individual-level data are lacking but other relevant prior information is available.
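As an illustration of the usual ecological estimator only (not of the empirical Bayes or fully Bayesian estimators, which the abstract does not specify), the following minimal sketch computes the Pearson correlation of group sample means and compares it with the pooled individual-level correlation. The simulation design is an assumption made for this example: group centers and within-group deviations are both drawn from bivariate normal distributions, and the names `ecological_correlation`, `simulate_grouped_data`, `rho_between`, and `rho_within` are hypothetical.

```python
import numpy as np


def ecological_correlation(x, y, groups):
    """Pearson correlation coefficient of the group sample means."""
    labels = np.unique(groups)
    x_means = np.array([x[groups == g].mean() for g in labels])
    y_means = np.array([y[groups == g].mean() for g in labels])
    return np.corrcoef(x_means, y_means)[0, 1]


def simulate_grouped_data(n_groups=50, n_per_group=100,
                          rho_between=0.9, rho_within=0.2, seed=0):
    """Draw jointly normal (x, y) pairs nested in groups.

    Assumed design (not from the source): each group has a bivariate
    normal center with correlation rho_between, and individuals deviate
    from their group center with correlation rho_within.
    """
    rng = np.random.default_rng(seed)
    cov_between = [[1.0, rho_between], [rho_between, 1.0]]
    cov_within = [[1.0, rho_within], [rho_within, 1.0]]
    centers = rng.multivariate_normal([0.0, 0.0], cov_between, size=n_groups)
    groups = np.repeat(np.arange(n_groups), n_per_group)
    deviations = rng.multivariate_normal(
        [0.0, 0.0], cov_within, size=n_groups * n_per_group
    )
    data = centers[groups] + deviations
    return data[:, 0], data[:, 1], groups


x, y, groups = simulate_grouped_data()

# Correlation using all individual-level observations (rarely available).
r_individual = np.corrcoef(x, y)[0, 1]

# Correlation using only the group means (the ecological data).
r_ecological = ecological_correlation(x, y, groups)

print(f"individual-level correlation: {r_individual:.3f}")
print(f"ecological correlation:       {r_ecological:.3f}")
```

Under these assumed settings, averaging within groups removes most of the within-group variation, so the ecological correlation tends to sit close to `rho_between` and can overstate the individual-level correlation considerably.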
Bibliographical note
Funding Information:
This work was partially supported by University of Connecticut Research Foundation grant #168. The authors thank Professor Alan Gelfand, an anonymous referee and the associate editor for many helpful comments.
- ecological correlation
- ecological fallacy
- empirical Bayes