16  Program Evaluation

16.1 The Goal of Program Evaluation

The goal of program evaluation is to study the effectiveness of interventions. Decision makers, including policymakers and industry leaders, need to know to what extent (possibly costly) interventions will have their intended effects.

In public policy and economics, emblematic questions in program evaluation include:

  • Do programs that transfer cash to individuals reduce poverty?
  • Does increasing the minimum wage affect employment?
  • Does health insurance improve health outcomes?

In industry, questions requiring causal analysis might be:

  • Does changing the price of one item affect the overall amount customers spend?
  • How does changing an interface affect user engagement?
  • Do personalized discounts or recommendations result in additional spending?

To consider these questions, we cannot rely solely on basic regression analysis.

16.2 Causality vs. Correlation

Regression captures correlation, not causality. While this may be intuitive, reviewing the fundamental concepts gives a more complete understanding of why regression alone may be inappropriate for drawing causal conclusions from data.

16.2.1 Causality

In everyday situations, there can be a clear understanding of cause and effect. If I enter the PIN for my door code, that causes the front door of my house to unlock. There is no question here about which is the cause and which is the effect. Decision makers are interested in understanding causality when it is not so obvious.

Causality refers to effects within a causal model, holding other conditions the same (ceteris paribus). A causal model contains the following elements:

  • Variables determined inside the model (\(Y\)). These are called outcome or dependent variables.
  • Variables determined outside the model \((X, U)\). These are called covariates, regressors, or independent variables.
  • Functional relationships between \((X, U)\) and \(Y\). This can be written generally as \(Y = g(X,U)\).

In other words, causality is about an inherent relationship between the causes \((X, U)\) and the effect \(Y\).
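As a minimal sketch of these elements (using an invented linear form for \(g\) and numpy for the simulation), we can generate \((X, U)\) outside the model and let the model determine \(Y\):

```python
import numpy as np

rng = np.random.default_rng(0)

# Variables determined outside the model: an observed covariate X
# and an unobserved factor U.
n = 1_000
X = rng.normal(size=n)
U = rng.normal(size=n)

# A hypothetical structural function g: here Y = g(X, U) = 2*X + U.
def g(x, u):
    return 2 * x + u

# The outcome Y is determined inside the model.
Y = g(X, U)

# Ceteris paribus: changing X by one unit while holding U fixed
# moves Y by exactly 2, the causal effect built into g.
print(np.allclose(g(X + 1, U) - g(X, U), 2.0))  # True
```

Holding \(U\) fixed while changing \(X\) is exactly the ceteris paribus comparison described above; the difficulty in practice is that we never observe \(g\) or \(U\) directly.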

16.2.2 Correlation

Correlation refers to a statistical relationship between \(X\) and \(Y\). Mathematically, it is

\[\begin{equation*} \rho_{X, Y} = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y}. \end{equation*}\]

  • \(\text{Cov}(X, Y)\) is the covariance between \(X\) and \(Y\). It is equal to \(\mathbb{E}\left[(X - \mu_X)(Y - \mu_Y)\right]\), where \(\mu_X\) and \(\mu_Y\) are the means of \(X\) and \(Y\). The covariance measures how \(X\) and \(Y\) vary together.
  • \(\sigma_X\) and \(\sigma_Y\) are the standard deviations of \(X\) and \(Y\).

Because the correlation is the covariance divided by the standard deviations, it can be thought of as a rescaled covariance. There are some important properties of correlation.

  • Correlation is symmetric: \(\rho_{X, Y} = \rho_{Y, X}\). This is one clear reason why correlation and causality are separate concepts, since cause and effect are not interchangeable.
  • The sign of the correlation is equal to the sign of the covariance.
  • The correlation is between \(-1\) and \(1\). This helps give a sense of the direction and magnitude of the linear relationship between \(X\) and \(Y\).
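A quick numerical check of these properties (simulated data; numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=10_000)
Y = 0.5 * X + rng.normal(size=10_000)

# Correlation as a rescaled covariance: Cov(X, Y) / (sigma_X * sigma_Y).
cov_xy = np.cov(X, Y)[0, 1]
rho = cov_xy / (np.std(X, ddof=1) * np.std(Y, ddof=1))

print(rho)                      # matches np.corrcoef(X, Y)[0, 1]
print(np.corrcoef(X, Y)[0, 1])  # between -1 and 1
print(np.corrcoef(Y, X)[0, 1])  # symmetric: swapping X and Y changes nothing
```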

16.3 Challenges of Program Evaluation

To understand why program evaluation is challenging, it is useful to define the potential outcomes framework.

16.3.1 Potential Outcomes

Suppose there is an outcome \(Y\). Imagine that we could “fix” units at some value of \(X\), call it \(x\). This does not mean that we actually observe units with this value of \(x\) in real life; it is a thought experiment in which we set \(X\) to a certain value. The potential outcome is then \(Y(x)\). Using the function defined above, this equals \(g(x, U)\). We can make this more concrete.

Consider a job training program. Let \(D\) indicate whether an individual receives the treatment (\(D = 1\)) or not (\(D = 0\)). The potential outcome \(Y(1)\) is the outcome (wage) when the individual is fixed to receiving the treatment, and \(Y(0)\) is the outcome when the individual is fixed to not receiving it.

16.3.2 Fundamental Problem of Causal Analysis

The potential outcomes framework makes it simple to understand why causal analysis is hard. The fundamental problem is that we can never observe \(Y(1)\) and \(Y(0)\) at the same time for the same unit, so we can never compare them at the individual level from direct observation. The counterfactual, what would have happened in the absence of treatment, is always missing. There is no way around this problem, so we must rely on techniques of causal analysis that allow us to speak to causality, but only under assumptions.
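A small simulation (with invented numbers) makes the missing-data nature of the problem concrete: every unit has both potential outcomes, but the observed outcome reveals only one of them.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5

# Potential wages: Y(0) without training, Y(1) with training.
y0 = rng.normal(loc=20, scale=2, size=n)
y1 = y0 + 3          # in this simulation, training raises wages by 3

# Treatment assignment D (who actually receives training).
d = rng.integers(0, 2, size=n)

# Observed outcome: Y = D * Y(1) + (1 - D) * Y(0).
y_obs = d * y1 + (1 - d) * y0

# The individual effect y1 - y0 is known here only because we simulated
# both potential outcomes; in real data one of the two is always missing.
print(np.column_stack([d, y0, y1, y_obs]))
```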

16.4 Common Threats to Causal Analysis

16.4.1 Selection Bias

Selection bias is a very common issue in program evaluation. If individuals can choose their value of \(X\), then the chosen \(X\) may be related to the unobserved \(U\) that also determines \(Y\). For example, people who are employed are a selected sample, even after conditioning on observed characteristics.

Suppose we evaluate a job training program and find that participants have higher wages than non-participants. This does not necessarily mean the program is effective. The people who sign up may already be more motivated or skilled.
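A hedged sketch of this mechanism (all numbers invented): if unobserved motivation raises wages and also makes enrollment more likely, the raw difference in means between participants and non-participants overstates the effect of training.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000

motivation = rng.normal(size=n)                 # unobserved U
enroll = (motivation + rng.normal(size=n)) > 0  # motivated people enroll more often

true_effect = 1.0
wage = 20 + 2 * motivation + true_effect * enroll + rng.normal(size=n)

naive = wage[enroll].mean() - wage[~enroll].mean()
print(f"naive difference in means: {naive:.2f}")  # well above 1.0
print(f"true effect of training:   {true_effect:.2f}")
```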

16.4.2 Omitted Variable Bias

Omitted variable bias arises when an unobserved factor influences both treatment and the outcome.

Example: We might observe that students in smaller classes perform better on standardized tests. However, if wealthier districts tend to have smaller classes, the true cause of higher test scores could be better funding, not class size.
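A rough illustration with simulated districts (least squares via numpy): when funding is omitted from the regression, the class-size coefficient absorbs part of the funding effect.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50_000

funding = rng.normal(size=n)                        # omitted variable
class_size = 25 - 3 * funding + rng.normal(size=n)  # richer districts have smaller classes
scores = 70 + 5 * funding + 0 * class_size + rng.normal(size=n)  # class size has no true effect

# Short regression: scores on class_size only (plus an intercept).
X_short = np.column_stack([np.ones(n), class_size])
b_short, *_ = np.linalg.lstsq(X_short, scores, rcond=None)

# Long regression: controlling for funding.
X_long = np.column_stack([np.ones(n), class_size, funding])
b_long, *_ = np.linalg.lstsq(X_long, scores, rcond=None)

print(b_short[1])  # spuriously negative: smaller classes appear to raise scores
print(b_long[1])   # approximately zero once funding is controlled for
```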

16.4.3 Reverse Causality

Reverse causality occurs when the outcome influences treatment rather than the other way around.

Example: A study finds that hospitals with more doctors per patient have higher mortality rates. Does this mean more doctors cause worse health outcomes? Likely not—hospitals with high mortality rates might hire more doctors in response to sicker patients.
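A toy simulation of this story (hypothetical hospitals): staffing responds to mortality rather than the other way around, yet the two are positively correlated.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 2_000

severity = rng.normal(size=n)                     # how sick the patients are
mortality = 0.1 + 0.05 * severity                 # driven by severity only
doctors = 1 + 0.5 * mortality + 0.02 * rng.normal(size=n)  # hospitals staff up in response

# Positive correlation, even though more doctors do not cause deaths here.
print(np.corrcoef(doctors, mortality)[0, 1])
```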

16.4.4 Measurement Error

If variables are measured inaccurately, our estimates of causal effects may be biased.

Example: If self-reported income is frequently understated, a study examining the effect of education on earnings might misrepresent the true relationship.
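A closely related and standard case is classical measurement error in a regressor, which attenuates the estimated slope toward zero. A quick simulated check with invented numbers, where education rather than income is measured with error:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 100_000

educ_true = rng.normal(loc=12, scale=2, size=n)
earnings = 5 + 1.5 * educ_true + rng.normal(size=n)

# Education reported with random noise (classical measurement error).
educ_reported = educ_true + rng.normal(scale=2, size=n)

slope_true = np.cov(educ_true, earnings)[0, 1] / np.var(educ_true, ddof=1)
slope_noisy = np.cov(educ_reported, earnings)[0, 1] / np.var(educ_reported, ddof=1)

print(slope_true)   # close to 1.5
print(slope_noisy)  # attenuated toward zero (about half as large here)
```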

16.4.5 Attrition

When individuals drop out of a study non-randomly, it can distort results.

Example: In a medical trial, if only the healthiest patients continue treatment while the sickest drop out, the program may appear more effective than it actually is.
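A simple sketch (hypothetical trial, invented numbers): when the sickest treated patients drop out, a comparison among those who remain overstates the effect.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 50_000

health = rng.normal(size=n)                         # baseline health
treated = rng.integers(0, 2, size=n).astype(bool)
outcome = health + 0.5 * treated                    # true effect is 0.5

# Non-random attrition: the sickest treated patients drop out.
stays = ~treated | (health > -0.5)

full_sample = outcome[treated].mean() - outcome[~treated].mean()
completers = outcome[treated & stays].mean() - outcome[~treated & stays].mean()

print(f"effect in full sample:   {full_sample:.2f}")  # about 0.5
print(f"effect among completers: {completers:.2f}")   # inflated above 0.5
```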

16.4.6 Spillover Effects

Sometimes, a treatment affects individuals who are not directly receiving it, making causal estimates difficult.

Example: If a school introduces a new reading program, students in neighboring schools might also benefit if teachers share materials. The true effect of the program is then underestimated.
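A short simulation (invented schools, with the strong simplifying assumption that every control school receives some spillover): the treated-versus-control comparison understates the true effect.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 10_000

treated = rng.integers(0, 2, size=n).astype(bool)
true_effect = 2.0
spillover = 0.8  # control schools receive part of the benefit

reading_score = (
    50
    + true_effect * treated
    + spillover * ~treated   # neighboring (control) schools also improve
    + rng.normal(size=n)
)

estimate = reading_score[treated].mean() - reading_score[~treated].mean()
print(f"estimated effect: {estimate:.2f}")  # about 1.2, below the true 2.0
print(f"true effect:      {true_effect:.2f}")
```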

16.5 Further Reading

These notes draw from Matt Masten's Identification and Causality notes.