Sociology 505: Causal Inference in the Social Sciences

Instructor: Glenn Shafer gshafer@soil

Although it is the goal of most statistical investigation, causal inference has traditionally been ignored by statistical theory. Fortunately, there is now intense activity in a number of fields, including sociology, psychology, econometrics, philosophy, and artificial intelligence, aimed at correcting this situation. This course will sample broadly from the burgeoning literature on causality in these fields. Topics will include the problem of selectivity, the use of concomitants, the causal interpretation of independence, the role of randomized experiments, and path analysis.

Each student will be expected to lead the discussion of at least one paper, but the instructor will lecture on basic topics and on relatively technical papers.


Week 1. Introduction

How can causal statements be justified in the social sciences? How far can regression and other statistical methods go towards justifying causal claims, and how valid is the run-of-the-mill use of these methods in sociology? These readings provide a good starting point for addressing these general questions. Marini and Singer survey the philosophical literature on causality and discuss its application to the social sciences. Freedman and his discussants look at the state of the art in sociology. Lieberson's excellent book looks at the shortcomings of statistical methodology in sociology and suggests some alternatives.


Week 2. The Fallacy of Observational Control

"Correlation cannot prove causation." "Observational studies cannot substitute for experiments." "Statistical adjustment can do more harm than good when we are trying to understand causal relations." These strictures on the use of statistical evidence are more often repeated than understood. This week's reading examines them more carefully, beginning with the more technical chapters of Lieberson's book.


Week 3. Strategies for Studying Selectivity

This week's readings provide useful case studies of how social scientists and epidemiologists struggle with the problem of selectivity. Goldman analyzes flaws in arguments used to assess whether marriage or high socio-economic status causes the greater longevity with which it is associated. Robins is concerned with the effect of long-term occupational hazards, the study of which is often bedeviled by the "healthy survivor effect": workers with greater exposure may be healthier because less robust individuals were not able to stay on the job long enough to acquire high levels of exposure.
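The healthy survivor effect can be illustrated with a small simulation. This is only a sketch under invented assumptions, not a model taken from Robins: the robustness scale, the rule linking robustness to years on the job, the harm coefficient of -0.01, and the function names are all hypothetical. Exposure is built to be harmful, yet the high-exposure group may still look healthier because only robust workers stay long enough to accumulate exposure.

```python
import random


def simulate_workers(n=10000, seed=0):
    """Return (exposure, health) pairs for n hypothetical workers."""
    rng = random.Random(seed)
    workers = []
    for _ in range(n):
        robustness = rng.random()  # 0 = frail, 1 = robust (assumed scale)
        # Assumption: more robust workers stay on the job longer.
        years_on_job = 1 + int(20 * robustness + 5 * rng.random())
        exposure = years_on_job  # one unit of exposure per year worked
        # Assumption: exposure truly harms health (coefficient -0.01).
        health = robustness - 0.01 * exposure + rng.gauss(0, 0.1)
        workers.append((exposure, health))
    return workers


def mean_health_by_exposure(workers, cutoff=12):
    """Mean health of workers below the exposure cutoff and at or above it."""
    low = [h for e, h in workers if e < cutoff]
    high = [h for e, h in workers if e >= cutoff]
    return sum(low) / len(low), sum(high) / len(high)


low, high = mean_health_by_exposure(simulate_workers())
print(f"mean health, low exposure:  {low:.2f}")
print(f"mean health, high exposure: {high:.2f}")
```

A naive comparison of the two group means would suggest that exposure is protective, even though the simulated effect of exposure on health is negative by construction.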


Week 4. The Most Famous Example: Smoking and Lung Cancer

The example of smoking and lung cancer is cited so often in discussions of causal inference from statistical evidence that anyone who wants to participate in such discussions needs to know something about the evidence against smoking and how it has been used. The 1959 paper by Cornfield still provides the best overview of the evidence. Additional perspective is provided by papers by Cook and Stolley, which examine the role the famous statistician R. A. Fisher played in the controversy.


Week 5. Path Analysis

The graphical saliency of its causal interpretation makes path analysis a dependable producer of debate on causality. This week's readings begin with a debate led by David Freedman, which focuses on both the practical use of path analysis models and their technical meaning. The article by Kang and Seneta clarifies what is asserted by these models.


Week 6. The Causal Interpretation of Conditional Independence

Causal models, including the simultaneous-equations models used in econometrics and the path-analytic models used in sociology, are often explained in terms of conditional independence or partial uncorrelatedness. Conditional independence, it seems, indicates some absence of causal connection. A closer look shows that an independence relation or a path analysis model usually has more than one causal interpretation. Consequently, causal claims for path analysis and other models usually need to be made more specific before they can be subjected to real tests.
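A standard textbook illustration of why one independence relation admits several causal readings is given here. The variable names X, Y, Z are chosen for illustration and do not come from the readings. Three different causal orderings of the variables each imply the same relation, that X and Y are independent given Z:

```latex
% Chain X -> Z -> Y
\[ p(x, z, y) = p(x)\, p(z \mid x)\, p(y \mid z) \]
% Reversed chain Y -> Z -> X
\[ p(x, z, y) = p(y)\, p(z \mid y)\, p(x \mid z) \]
% Common cause X <- Z -> Y
\[ p(x, z, y) = p(z)\, p(x \mid z)\, p(y \mid z) \]
% Each factorization yields the same conditional independence:
\[ p(x, y \mid z) = p(x \mid z)\, p(y \mid z) \]
```

Since all three structures imply this conditional independence, observing it cannot by itself tell us which causal story holds.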


Week 7. The Rubin-Holland Interpretation of Causality

Don Rubin's interpretation of causality in terms of manipulation, real or hypothetical, has been increasingly influential in the social sciences. Here we look at Rubin's original essay and Paul Holland's widely read summary of the approach.
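In the notation commonly used to present this approach (the symbols below are an assumed convention, not necessarily those of the assigned readings), each unit u has one potential response under treatment t and another under control c, and causal effects are defined as comparisons between them:

```latex
% Unit-level causal effect of t relative to c on unit u
\[ Y_u(t) - Y_u(c) \]
% Only one of Y_u(t), Y_u(c) can be observed for any unit,
% so attention shifts to the average causal effect
\[ T = E[Y(t)] - E[Y(c)] \]
% If the assignment S is independent of the potential responses,
% as under randomization, T is identified by the observed difference
\[ E[Y \mid S = t] - E[Y \mid S = c] = E[Y(t)] - E[Y(c)] \]
```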


Week 8. Structural Equation Models and Latent Variables

What is the meaning of latent variables in causal models? As this week's readings will show, this question can be answered using either the instructor's probability-tree picture or Rubin's counterfactual picture.


Week 9. Conjecturing Causal Relations from Large Data Sets

Structural equation models are usually formulated a priori. Statistical evidence may be used to choose among a small number of models, but the emphasis is on testing and estimation. A more ambitious approach has been formulated recently in artificial intelligence. In this approach, models are inferred from data, using algorithms that search through a large number of models in order to match the conditional independence or uncorrelatedness relations observed in the data.
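The search algorithms themselves are not reproduced here. The sketch below only illustrates the kind of relation they try to match: a partial correlation near zero. The data-generating chain X -> Z -> Y, its unit-variance Gaussian noise, and all function names are assumptions made for the example.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y given z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


def simulate_chain(n=5000, seed=0):
    """Draw data from the assumed chain X -> Z -> Y."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    z = [xi + rng.gauss(0, 1) for xi in x]
    y = [zi + rng.gauss(0, 1) for zi in z]
    return x, y, z


x, y, z = simulate_chain()
print(f"corr(X, Y)     = {pearson(x, y):.3f}")
print(f"corr(X, Y | Z) = {partial_corr(x, y, z):.3f}")
```

In this setup X and Y are clearly correlated, but the correlation should largely vanish once Z is held fixed. A search procedure would treat such a pattern as evidence for models in which Z separates X from Y.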


Week 10. Philosophical Accounts of Causal Explanation

Although there are many philosophical accounts of causality, those in the Reichenbach tradition are closest to the concerns of the social sciences. Humphreys and Salmon, with their emphasis on causal explanation, are especially relevant.


Week 11. Psychological Accounts of Causality

Within psychology, at least three traditions have investigated causality in quite distinct ways. Developmental psychologists have been concerned with how children develop judgments of causality, behavioral psychologists with how adults judge contingency, and social psychologists with how people make attributions. The following articles represent a thin sampling from the developmental and social psychology traditions.


Week 12. Conclusion

Having taken our own tour of the literature on causal inference, we now review what we have learned using surveys by two authorities, a statistician and a sociologist.



sociolog@princeton.edu Jan '95