Subject: Simpson's paradox

Simpson's paradox refers to the reversal of a relationship when a multi-way contingency table is collapsed over a heterogeneous factor. More recently, it has come to refer to ANY change of relationship (i.e., in magnitude as well as in sign) following such collapsing. It can be seen as a special case of the more general problem of inappropriate cross-level inference, of which the "ecological fallacy" and the "individualistic fallacy" are also special cases. It is also a very old problem, recognised at least as early as Yule (1903).

An intriguing recent example is provided by Wardrop (1995), who argues that the widely believed but fallacious "hot hand" in basketball arises from just such collapsing. That is, collapsed over heterogeneous shooters, there is an apparent "hot hand", but only at that aggregated level of analysis; with individual shooters as the unit of analysis, no such relationship exists (as Gilovich and Tversky have long maintained).

Anil Menon (Syracuse University, School of Engg. and Computer Science) compiled this reference list and placed it on a "Simpson's Paradox" web-site he had created. Unfortunately, the web-site URL has changed (or no longer exists):

Articles on Simpson's Paradox and Related topics

Last updated: 19/03/96

I got interested in Simpson's paradox while studying deception in Genetic Algorithms. Here is a list of articles that might be useful. Fortunately, John Vokey at the Department of Psychology, University of Lethbridge, was kind enough to post most of these references, saving me an ascii adventure.

I have grouped the bibliography in several ways:

* A Beginner's Guide,
* Chronologically,
* Alphabetically.

Eventually, I may put up a topically organized list as well... Please inform me if I have left out any pertinent articles.

Some related links are:

* Simple example based on drug tests.
* A news group discussion (may have been removed).
* Graphical Methods for Categorical Data.
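For readers new to the idea, the reversal under collapsing described at the top of this post can be sketched in a few lines of Python. This is only an illustration: the counts are invented and are not taken from any of the articles listed here, and the names TABLE, stratum_rates, collapsed_rates and is_reversal are made up for the sketch.

```python
from fractions import Fraction

# Invented counts for illustration only: (successes, trials) for two
# hypothetical treatments "A" and "B" within two hypothetical strata.
TABLE = {
    "mild":   {"A": (8, 10),   "B": (70, 100)},
    "severe": {"A": (40, 100), "B": (3, 10)},
}


def stratum_rates(table):
    """Success rate of each treatment within each stratum."""
    return {
        stratum: {t: Fraction(s, n) for t, (s, n) in cells.items()}
        for stratum, cells in table.items()
    }


def collapsed_rates(table):
    """Success rate of each treatment after summing counts over strata."""
    totals = {}
    for cells in table.values():
        for t, (s, n) in cells.items():
            s0, n0 = totals.get(t, (0, 0))
            totals[t] = (s0 + s, n0 + n)
    return {t: Fraction(s, n) for t, (s, n) in totals.items()}


def is_reversal(table, first="A", second="B"):
    """True if `first` beats `second` in every stratum but loses when collapsed."""
    within = stratum_rates(table)
    pooled = collapsed_rates(table)
    wins_everywhere = all(r[first] > r[second] for r in within.values())
    return wins_everywhere and pooled[first] < pooled[second]


if __name__ == "__main__":
    for stratum, rates in stratum_rates(TABLE).items():
        print(stratum, {t: float(r) for t, r in rates.items()})
    print("collapsed", {t: float(r) for t, r in collapsed_rates(TABLE).items()})
    print("reversal:", is_reversal(TABLE))
```

In this made-up table, A does better than B within each stratum, yet B does better once the strata are pooled, because A was tried mostly on the "severe" cases and B mostly on the "mild" ones. Exact fractions are used so that the comparisons do not depend on floating-point rounding.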
_________________________________________________________________

A Beginner's Guide

I would recommend that the newcomer start off with:

* Authors : Blyth, C. R.
  Title : On Simpson's paradox and the sure thing principle.
  Source : Journal of the American Statistical Association, 67, 364-381, 1972.

For some real-life examples of Simpson's paradox, see Keyfitz's classic book.

* Authors : Keyfitz, N.
  Booktitle : Applied mathematical demography, Wiley, New York, pp. 385-391, 1977.

My favorite analysis of Simpson's paradox is the one in Simon and Blume's excellent book:

* Authors : Simon, C. P. and Blume, L.
  Booktitle : Mathematics for Economists, W. W. Norton and Company, New York, pp. 368-371, pp. 784-791, 1994.

They explain it using Don Saari's results. The importance of his work in the study of ``social paradoxes'' cannot be over-emphasized. A good starting point to Saari's remarkable theorem is:

* Authors : Saari, D. G.
  Title : The source of some paradoxes from social choice and probability.
  Source : Journal of Economic Theory, 41(1), 1-22, 1987.

Shyam Sunder's paper gives Yuji Ijiri's necessary and sufficient condition for Simpson's paradox to occur in the ``simplest possible case''. This condition is a special case of Saari's theorem, but is particularly clear and simple to use in practice. I had no idea accountants worried about such matters.

* Authors : Sunder, S.
  Title : Simpson's reversal paradox and cost allocation.
  Source : Journal of Accounting Research, 21, 222-233, 1983.

Finally, I urge the reader to take a look at Vaupel and Yashin's very readable paper on the pernicious effects of heterogeneity on statistical decision making. It reads like a Stephen King novel (and is equally horrifying).

* Authors : Vaupel, J. W. and Yashin, A. I.
  Title : Heterogeneity's ruses: some surprising effects of selection on population dynamics.
  Source : The American Statistician, 39(3), 176-185, 1985.
_________________________________________________________________

Chronological Bibliography

The 1900's

Authors : Yule, G. U.
Title : Notes on the theory of association of attributes in statistics.
Source : Biometrika, 2, 121-134, 1903.

The 1930's

Authors : Thorndike, E. L.
Title : On the fallacy of imputing the correlations found for groups to individuals or smaller groups composing them.
Source : American Journal of Psychology, 52, 122-124, 1939.

The 1940's

Authors : Deming, W. E. and Stephan, F. F.
Title : On a least squares adjustment of a sampled frequency table when the expected marginal totals are known.
Source : Annals of Mathematical Statistics, 11, 427-444, 1940.

Authors : Lindquist, E. F.
Title : Statistical analysis in educational research.
Source : Boston: Houghton Mifflin, 1940.

Authors : Deming, W. E.
Title : Statistical adjustment of data.
Source : New York: Dover Publications, Inc., 1943.

The 1950's

Authors : Robinson, W. S.
Title : Ecological correlations and the behavior of individuals.
Source : American Sociological Review, 15, 351-357, 1950.

Authors : Simpson, E. H.
Title : The interpretation of interaction in contingency tables.
Source : Journal of the Royal Statistical Society, Ser. B., 13, 238-241, 1951.

The 1960's

Authors : Mosteller, F.
Title : Association and estimation in contingency tables.
Source : Journal of the American Statistical Association, 63, 1-28, 1968.

The 1970's

Authors : Goodman, L. A.
Title : The multivariate analysis of qualitative data: interactions among multiple classifications.
Source : Journal of the American Statistical Association, 65, 226-256, 1970.

Authors : Blyth, C. R.
Title : On Simpson's paradox and the sure thing principle.
Source : Journal of the American Statistical Association, 67, 364-381, 1972.

Authors : Bickel, P. J., Hammel, E. A., and O'Connell, J. W.
Title : Sex bias in graduate admissions: Data from Berkeley.
Source : Science, 187, 398-404, 1975.

Authors : Bishop, Y. M. M., Fienberg, S. E., and Holland, P. W.
Title : Discrete multivariate analysis: Theory and practice.
Source : Cambridge, Massachusetts: The MIT Press, 1975.

Authors : Gardner, M.
Title : On the fabric of inductive logic and some probability paradoxes.
Source : Scientific American, 234, 119-124, 1976.

Authors : Fienberg, S. E.
Title : The analysis of cross-classified categorical data.
Source : Cambridge, Massachusetts: The MIT Press, 1977.

Authors : Keyfitz, N.
Booktitle : Applied mathematical demography, Wiley, New York, pp. 385-391, 1977.

Authors : Knapp, T. R.
Title : The unit-of-analysis problem in applications of simple correlation analysis to educational research.
Source : Journal of Educational Statistics, 2, 171-186, 1977.

Authors : Freedman, D., Pisani, R., and Purves, R.
Title : Statistics.
Source : W.W. Norton & Company, New York, 1978.

Authors : Whittemore, A. S.
Title : Collapsibility of multidimensional contingency tables.
Source : Journal of the Royal Statistical Society, Ser. B., 40, 328-340, 1978.

The 1980's

Authors : Hintzman, D. L.
Title : Simpson's paradox and the analysis of memory retrieval.
Source : Psychological Review, 87, 398-410, 1980.

Authors : Flexser, A. J.
Title : Homogenizing the 2 X 2 contingency table: A method for removing dependencies due to subject and item differences.
Source : Psychological Review, 88, 327-339, 1981.

Authors : Martin, E.
Title : Simpson's paradox resolved: A reply to Hintzman.
Source : Psychological Review, 88, 372-374, 1981.

Authors : Mantel, N.
Title : Simpson's paradox in reverse.
Source : The American Statistician, 36, 395, 1982.

Authors : Saari, D. G.
Title : Inconsistencies of weighted summation voting systems.
Source : Mathematics of Operations Research, 7(4), 479-490, 1982.

Authors : Shapiro, S. H.
Title : Collapsing contingency tables -- a geometric approach.
Source : The American Statistician, 36, 43-46, 1982.

Authors : Wagner, C. H.
Title : Simpson's paradox in real life.
Source : The American Statistician, 36, 46-48, 1982.

Authors : Kennedy, J. J.
Title : Analyzing qualitative data: Introductory log-linear analysis for behavioral research.
Source : New York: Praeger Publishers, 1983.

Authors : Sunder, S.
Title : Simpson's reversal paradox and cost allocation.
Source : Journal of Accounting Research, 21, 222-233, 1983.

Authors : Knapp, T. R.
Title : Instances of Simpson's paradox.
Source : College Mathematics Journal, 16, 209-211, 1985.

Authors : Paik, M.
Title : A graphic representation of a three-way contingency table: Simpson's paradox and correlation.
Source : The American Statistician, 39, 53-54, 1985.

Authors : Vaupel, J. W. and Yashin, A. I.
Title : The deviant dynamics of death in heterogeneous populations.
Source : Sociological Methodology, Tuma, N. B. (ed), pp. 179-211, 1985.

Authors : Vaupel, J. W. and Yashin, A. I.
Title : Heterogeneity's ruses: some surprising effects of selection on population dynamics.
Source : The American Statistician, 39(3), 176-185, 1985.

Authors : Cohen, J. E.
Title : An uncertainty principle in demography and the unisex issue.
Source : The American Statistician, 40, 32-39, 1986.

Authors : Saari, D. G.
Title : The source of some paradoxes from social choice and probability.
Source : Journal of Economic Theory, 41(1), 1-22, 1987.

Authors : Saari, D. G.
Title : Symmetry, Voting and Social Choice.
Source : The Mathematical Intelligencer, 10(3), 32-42, 1988.

Authors : Kaigh, W. D.
Title : A category representation paradox.
Source : The American Statistician, 43(2), 92-97, 1989.

Authors : Wermuth, N.
Title : Moderating effects of subgroups in linear models.
Source : Biometrika, 76, 81-92, 1989.

The 1990's

Authors : Freehling, J. S.
Title : Simpson's paradox and database profiling.
Source : Direct Marketing, 53(5), 26-27, 1990.

Authors : Haunsperger, D. B. and Saari, D. G.
Title : The lack of consistency for statistical decision procedures.
Source : The American Statistician, 45(3), 252-255, 1991.

Authors : Klay, M. P. and Wesley, L. P.
Title : Simpson's paradox: a maximum likelihood solution.
Source : SRI International Technical Report, No. 502, 1-11, 1991.

Authors : Mittal, Y.
Title : Homogeneity of subpopulations and Simpson's Paradox.
Source : Journal of the American Statistical Association, 86(413), 167-172, 1991.

Authors : Abramson, N. S., Kelsey, S. F., Safar, P., and Sutton-Tyrrell, K.
Title : Simpson's paradox and clinical trials: What you find is not necessarily what you prove.
Source : Annals of Emergency Medicine, 21, 1480-1482, 1992.

Authors : DeBlois, B. M.
Title : Simpson's Paradox.
Source : Mathematica Militaris, 3(1), 1992.

Authors : Mehrez, A., Brown, J. R., and Khouja, M.
Title : Aggregate efficiency measures and Simpson's paradox.
Source : Contemporary Accounting Research, 9(1), 329-342, 1992.

Authors : Rogers, A.
Title : Heterogeneity and selection in multistate population analysis.
Source : Demography, 29(1), 31-38, 1992.

Authors : Gunter, B.
Title : A trio of statistical double takes.
Source : Quality Progress, 26(6), 84-86, 1993.

Authors : Simon, C. P. and Blume, L.
Booktitle : Mathematics for Economists, W. W. Norton and Company, New York, pp. 368-371, pp. 784-791, 1994.

Authors : Wardrop, R. L.
Title : Simpson's Paradox and the Hot Hand in Basketball.
Source : The American Statistician, 49, 24-28, 1995.

_________________________________________________________________

Alphabetical Bibliography

Authors : Abramson, N. S., Kelsey, S. F., Safar, P., and Sutton-Tyrrell, K.
Title : Simpson's paradox and clinical trials: What you find is not necessarily what you prove.
Source : Annals of Emergency Medicine, 21, 1480-1482, 1992.

Authors : Bickel, P. J., Hammel, E. A., and O'Connell, J. W.
Title : Sex bias in graduate admissions: Data from Berkeley.
Source : Science, 187, 398-404, 1975.

Authors : Bishop, Y. M. M., Fienberg, S. E., and Holland, P. W.
Title : Discrete multivariate analysis: Theory and practice.
Source : Cambridge, Massachusetts: The MIT Press, 1975.

Authors : Blyth, C. R.
Title : On Simpson's paradox and the sure thing principle.
Source : Journal of the American Statistical Association, 67, 364-381, 1972.

Authors : Cohen, J. E.
Title : An uncertainty principle in demography and the unisex issue.
Source : The American Statistician, 40, 32-39, 1986.

Authors : DeBlois, B. M.
Title : Simpson's Paradox.
Source : Mathematica Militaris, 3(1), 1992.

Authors : Deming, W. E.
Title : Statistical adjustment of data.
Source : New York: Dover Publications, Inc., 1943.

Authors : Deming, W. E. and Stephan, F. F.
Title : On a least squares adjustment of a sampled frequency table when the expected marginal totals are known.
Source : Annals of Mathematical Statistics, 11, 427-444, 1940.

Authors : Fienberg, S. E.
Title : The analysis of cross-classified categorical data.
Source : Cambridge, Massachusetts: The MIT Press, 1977.

Authors : Flexser, A. J.
Title : Homogenizing the 2 X 2 contingency table: A method for removing dependencies due to subject and item differences.
Source : Psychological Review, 88, 327-339, 1981.

Authors : Freedman, D., Pisani, R., and Purves, R.
Title : Statistics.
Source : W.W. Norton & Company, New York, 1978.

Authors : Freehling, J. S.
Title : Simpson's paradox and database profiling.
Source : Direct Marketing, 53(5), 26-27, 1990.

Authors : Gardner, M.
Title : On the fabric of inductive logic and some probability paradoxes.
Source : Scientific American, 234, 119-124, 1976.

Authors : Goodman, L. A.
Title : The multivariate analysis of qualitative data: interactions among multiple classifications.
Source : Journal of the American Statistical Association, 65, 226-256, 1970.

Authors : Gunter, B.
Title : A trio of statistical double takes.
Source : Quality Progress, 26(6), 84-86, 1993.

Authors : Haunsperger, D. B. and Saari, D. G.
Title : The lack of consistency for statistical decision procedures.
Source : The American Statistician, 45(3), 252-255, 1991.

Authors : Hintzman, D. L.
Title : Simpson's paradox and the analysis of memory retrieval.
Source : Psychological Review, 87, 398-410, 1980.

Authors : Kaigh, W. D.
Title : A category representation paradox.
Source : The American Statistician, 43(2), 92-97, 1989.

Authors : Kennedy, J. J.
Title : Analyzing qualitative data: Introductory log-linear analysis for behavioral research.
Source : New York: Praeger Publishers, 1983.

Authors : Keyfitz, N.
Booktitle : Applied mathematical demography, Wiley, New York, pp. 385-391, 1977.

Authors : Klay, M. P. and Wesley, L. P.
Title : Simpson's paradox: a maximum likelihood solution.
Source : SRI International Technical Report, No. 502, 1-11, 1991.

Authors : Knapp, T. R.
Title : The unit-of-analysis problem in applications of simple correlation analysis to educational research.
Source : Journal of Educational Statistics, 2, 171-186, 1977.

Authors : Knapp, T. R.
Title : Instances of Simpson's paradox.
Source : College Mathematics Journal, 16, 209-211, 1985.

Authors : Lindquist, E. F.
Title : Statistical analysis in educational research.
Source : Boston: Houghton Mifflin, 1940.

Authors : Mantel, N.
Title : Simpson's paradox in reverse.
Source : The American Statistician, 36, 395, 1982.

Authors : Martin, E.
Title : Simpson's paradox resolved: A reply to Hintzman.
Source : Psychological Review, 88, 372-374, 1981.

Authors : Mehrez, A., Brown, J. R., and Khouja, M.
Title : Aggregate efficiency measures and Simpson's paradox.
Source : Contemporary Accounting Research, 9(1), 329-342, 1992.

Authors : Mittal, Y.
Title : Homogeneity of subpopulations and Simpson's Paradox.
Source : Journal of the American Statistical Association, 86(413), 167-172, 1991.

Authors : Mosteller, F.
Title : Association and estimation in contingency tables.
Source : Journal of the American Statistical Association, 63, 1-28, 1968.

Authors : Paik, M.
Title : A graphic representation of a three-way contingency table: Simpson's paradox and correlation.
Source : The American Statistician, 39, 53-54, 1985.

Authors : Robinson, W. S.
Title : Ecological correlations and the behavior of individuals.
Source : American Sociological Review, 15, 351-357, 1950.

Authors : Rogers, A.
Title : Heterogeneity and selection in multistate population analysis.
Source : Demography, 29(1), 31-38, 1992.

Authors : Saari, D. G.
Title : Inconsistencies of weighted summation voting systems.
Source : Mathematics of Operations Research, 7(4), 479-490, 1982.

Authors : Saari, D. G.
Title : The source of some paradoxes from social choice and probability.
Source : Journal of Economic Theory, 41(1), 1-22, 1987.

Authors : Saari, D. G.
Title : Symmetry, Voting and Social Choice.
Source : The Mathematical Intelligencer, 10(3), 32-42, 1988.

Authors : Shapiro, S. H.
Title : Collapsing contingency tables -- a geometric approach.
Source : The American Statistician, 36, 43-46, 1982.

Authors : Simon, C. P. and Blume, L.
Booktitle : Mathematics for Economists, W. W. Norton and Company, New York, pp. 368-371, pp. 784-791, 1994.

Authors : Simpson, E. H.
Title : The interpretation of interaction in contingency tables.
Source : Journal of the Royal Statistical Society, Ser. B., 13, 238-241, 1951.

Authors : Sunder, S.
Title : Simpson's reversal paradox and cost allocation.
Source : Journal of Accounting Research, 21, 222-233, 1983.

Authors : Thorndike, E. L.
Title : On the fallacy of imputing the correlations found for groups to individuals or smaller groups composing them.
Source : American Journal of Psychology, 52, 122-124, 1939.

Authors : Vaupel, J. W. and Yashin, A. I.
Title : Heterogeneity's ruses: some surprising effects of selection on population dynamics.
Source : The American Statistician, 39(3), 176-185, 1985.

Authors : Vaupel, J. W. and Yashin, A. I.
Title : The deviant dynamics of death in heterogeneous populations.
Source : Sociological Methodology, Tuma, N. B. (ed), pp. 179-211, 1985.

Authors : Wagner, C. H.
Title : Simpson's paradox in real life.
Source : The American Statistician, 36, 46-48, 1982.

Authors : Wardrop, R. L.
Title : Simpson's Paradox and the Hot Hand in Basketball.
Source : The American Statistician, 49, 24-28, 1995.

Authors : Wermuth, N.
Title : Moderating effects of subgroups in linear models.
Source : Biometrika, 76, 81-92, 1989.

Authors : Whittemore, A. S.
Title : Collapsibility of multidimensional contingency tables.
Source : Journal of the Royal Statistical Society, Ser. B., 40, 328-340, 1978.

Authors : Yule, G. U.
Title : Notes on the theory of association of attributes in statistics.
Source : Biometrika, 2, 121-134, 1903.

--
Dr. John R. Vokey, Associate Professor, Department of Psychology
University of Lethbridge, Lethbridge, Alberta, CANADA T1K 3M4
mailto:vokey@hg.uleth.ca http://www.uleth.ca/~vokey