18  Discontinuity Designs

Randomized controlled trials are considered the “gold standard”: when feasible, they produce credible and useful causal estimates for the question at hand. However, randomization is not always possible, desirable, or ethical. In some situations there is “quasi-randomness,” meaning an argument can be made that the intervention was assigned as if it were random. Discontinuity designs are one example of a quasi-random design.

18.1 Defining a Discontinuity

Suppose the treatment variable is \(D_i\). If \(D_i = 1\), then unit \(i\) is treated. If \(D_i = 0\), then unit \(i\) is in the control group. However, in this case, treatment was not randomly assigned. Instead, there is an underlying, continuous variable, \(W_i\), that determines \(D_i\):

\[\begin{equation} D_i = \begin{cases} 1 & W_i \geq c \\ 0 & W_i < c. \end{cases} \end{equation}\]

Given some threshold \(c\), any value of \(W_i\) above the threshold implies treatment status and any value below implies control status. This setup allows us to use a regression discontinuity design.
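In code, this assignment rule is a single comparison against the threshold. A minimal sketch (the cutoff and the running-variable values here are illustrative, not from any dataset):

```r
# Illustrative cutoff and running-variable values
cutoff <- 0
W <- c(-0.8, -0.1, 0.0, 0.3, 0.9)

# Sharp assignment: treated if and only if W >= cutoff
D <- as.integer(W >= cutoff)
D
# [1] 0 0 1 1 1
```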

The general idea is that units just below and just above the cutoff are similar, both in observed and unobserved ways. The average treatment effect for units with \(W_i = c\) will therefore provide a good approximation to the treatment effect near the cutoff.

The regression model is:

\[ Y_i = \beta_0 + \beta_1 D_i + \beta_2 W_i + U_i. \tag{18.1}\]

This is a sharp RDD because assignment is deterministic: as long as we know \(W_i\), we know with certainty what \(D_i\) will be.

The identifying assumption is that units cannot precisely manipulate \(W_i\) around the cutoff. For example, suppose \(W_i\) is a test score and the cutoff \(c\) determines whether a student enters a gifted program. If students can study before the test, take the test multiple times, or appeal to slightly change their score, then the quasi-randomness is less plausible. Students who partake in these “manipulations” may be unobservably different from students who do not.
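One informal diagnostic for manipulation is to check whether the density of \(W\) is smooth at the cutoff, in the spirit of the McCrary density test. A minimal sketch with simulated scores and no manipulation (the bin width of 0.05 is an arbitrary illustrative choice):

```r
set.seed(470500)
W <- runif(1000, -1, 1)  # simulated running variable, no manipulation

# Count observations in narrow bins just below and just above the cutoff
bw <- 0.05
n_below <- sum(W >= -bw & W < 0)
n_above <- sum(W >= 0 & W < bw)

# With no bunching, the two counts should be similar; a sharp jump in the
# density at the cutoff would suggest units sorting across the threshold
c(below = n_below, above = n_above)
```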

18.2 Illustration with Simulated Data

Simulate and visualize the data. There is a cutoff at \(W = 0\).

library(dplyr)

library(ggplot2)

# Set seed
set.seed(470500)

# Sample data
W <- runif(1000, -1, 1)
Y <- 3 + 2 * W + 10 * (W >= 0) + rnorm(1000)

# Construct the data
df <- data.frame(Y = Y, W = W) %>%
  as_tibble()

df <- df %>%
  mutate(D = ifelse(W >= 0, 1, 0))

# Plot the data
ggplot(data = df) +
  geom_point(aes(x = W, y = Y, color = as.factor(D))) +
  geom_vline(xintercept = 0) +
  labs(color = "Treatment Status")

Scatterplot of outcome Y against running variable W, showing a sharp jump at W = 0 by treatment status.

The simplest way to estimate the model is to take Equation 18.1 at face value and fit it directly. We are interested in the coefficient on \(D\): holding \(W\) fixed, the estimated effect of treatment is 10.137 units.

\[ Y = \alpha + \tau D + \beta W + \epsilon \]

rdd <- lm(formula = Y ~ D + W,
   data = df)

summary(rdd)

Call:
lm(formula = Y ~ D + W, data = df)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.2590 -0.6539 -0.0217  0.6412  2.9480 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  2.93787    0.07114   41.30   <2e-16 ***
D           10.13690    0.12467   81.31   <2e-16 ***
W            1.82822    0.10773   16.97   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.9982 on 997 degrees of freedom
Multiple R-squared:  0.9732,    Adjusted R-squared:  0.9731 
F-statistic: 1.807e+04 on 2 and 997 DF,  p-value: < 2.2e-16

This model extrapolates linearly across the threshold. It imposes that the treatment and control groups share the same slope (the coefficient on \(W\)). The fitted lines below are parallel, so this restriction is plausible here, but that is not always the case. In those settings, researchers use more flexible models.

ggplot(data = df) +
  geom_point(aes(x = W, y = Y, color = as.factor(D))) +
  geom_vline(xintercept = 0) +
  geom_smooth(aes(x = W, y = Y, group = D), method = "lm", se = FALSE) +
  labs(color = "Treatment Status")

Scatterplot with fitted lines showing a discontinuity in Y at W = 0, by treatment status.

Here is an example with more complex data.

# Set seed
set.seed(470500)

# Sample data
W <- runif(1000, -1, 1)
Y <- 3 + 2 * W + 15 * (W >= 0) + 3 * W^2 + rnorm(1000)

# Construct the data
df <- data.frame(Y = Y, W = W) %>%
  as_tibble()

df <- df %>%
  mutate(D = ifelse(W >= 0, 1, 0))

# Plot the data
ggplot(data = df) +
  geom_point(aes(x = W, y = Y, color = as.factor(D))) +
  geom_vline(xintercept = 0)  +
  geom_smooth(aes(x = W, y = Y, group = D), method = "lm", se = FALSE) +
  labs(color = "Treatment Status")

Scatterplot with linear fits showing a sharp jump and differing slopes in Y at W = 0 by treatment status.

Here it is clear there are different slopes above and below the threshold. We can interact the running variable (\(W\)) with the treatment variable (\(D\)).

lm(formula = Y ~ D + W:D,
   data = df) %>% 
  summary()

Call:
lm(formula = Y ~ D + W:D, data = df)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.4561 -0.6953 -0.0607  0.6742  3.1522 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  3.04936    0.04739   64.35   <2e-16 ***
D           14.57101    0.10195  142.92   <2e-16 ***
D:W          4.73620    0.15891   29.80   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.043 on 997 degrees of freedom
Multiple R-squared:  0.9852,    Adjusted R-squared:  0.9852 
F-statistic: 3.321e+04 on 2 and 997 DF,  p-value: < 2.2e-16

Often, researchers add a quadratic term to relax the assumption that the fit is linear.

lm(formula = Y ~ D + W:D + I(W^2) + I(W^2):D,
   data = df) %>% 
  summary()

Call:
lm(formula = Y ~ D + W:D + I(W^2) + I(W^2):D, data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-3.13868 -0.67245 -0.02409  0.63367  2.96582 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  2.64185    0.06809  38.802  < 2e-16 ***
D           15.48166    0.14458 107.077  < 2e-16 ***
I(W^2)       1.17407    0.14629   8.025 2.84e-15 ***
D:W          1.65227    0.59487   2.778  0.00558 ** 
D:I(W^2)     1.94160    0.59915   3.241  0.00123 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.9979 on 995 degrees of freedom
Multiple R-squared:  0.9865,    Adjusted R-squared:  0.9864 
F-statistic: 1.815e+04 on 4 and 995 DF,  p-value: < 2.2e-16
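Another common strategy is to restrict estimation to a narrow bandwidth around the cutoff, where a linear approximation is most credible. A minimal sketch using the same simulated data as above (the bandwidth of 0.25 is an arbitrary illustrative choice, not a data-driven one):

```r
set.seed(470500)

# Re-create the simulated data from above
W <- runif(1000, -1, 1)
Y <- 3 + 2 * W + 15 * (W >= 0) + 3 * W^2 + rnorm(1000)
df <- data.frame(Y = Y, W = W, D = ifelse(W >= 0, 1, 0))

# Keep only observations within a bandwidth h of the cutoff
h <- 0.25
df_local <- subset(df, abs(W) < h)

# Local linear fit with separate slopes on each side of the cutoff
local_fit <- lm(Y ~ D + W + W:D, data = df_local)
coef(local_fit)["D"]  # estimated jump at the cutoff
```

In practice, data-driven bandwidth selectors (e.g., those implemented in the `rdrobust` package) are typically used instead of an ad hoc choice.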

18.3 Examples of RDD

See example papers.

18.4 Further Reading

18.4.1 References