Mediator#
A mediator is a variable that sits in the causal pathway between an exposure and an outcome, explaining the mechanism through which the exposure exerts its effect on the outcome.
Graphical Summary#

Key Formula#
The role of a mediator is summarized by the causal diagram:
\[X \rightarrow W \rightarrow Y\]
Where:
\(X\) is the independent variable (e.g., genetic variant)
\(W\) is the mediator variable
\(Y\) is the dependent variable (e.g., trait)
The arrows (\(\rightarrow\)) indicate the direction of causal influence
This diagram illustrates that a mediator (\(W\)) lies in the causal pathway between the independent variable (\(X\)) and the dependent variable (\(Y\)). The mediator transmits the effect of \(X\) on \(Y\), creating a causal pathway through which \(X\) affects \(Y\).
Technical Details#
The Mediation Framework#
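A mediator explains the mechanism by which a genetic variant affects an outcome, representing the actual biological pathway. The total effect of the variant splits into a mediated part and a remainder:
\[\text{Total Effect} = \text{Effect through Mediator} + \text{Other Effects}\]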
Where:
Total Effect: SNP -> Outcome (\(\beta\) without controlling for mediator)
Effect through Mediator: SNP -> Mediator -> Outcome (the mediated pathway = \(a \times b\))
Other Effects: Effect NOT through the mediator (\(\beta\) when controlling for mediator - includes unmeasured pleiotropy)
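The decomposition can be checked numerically. The following is a minimal sketch on simulated data; the sample size, effect sizes, and the variable names `snp`, `mediator`, and `outcome` are illustrative assumptions and not part of the worked example later on this page. For OLS fits on the same sample, the total effect equals the direct effect plus \(a \times b\).

```r
set.seed(1)
n <- 500

# Simulate a partially mediated effect (all numbers are arbitrary assumptions)
snp      <- rbinom(n, size = 2, prob = 0.3)          # additive coding: 0, 1, or 2 alt alleles
mediator <- 0.8 * snp + rnorm(n)                     # path a: SNP -> Mediator
outcome  <- 0.5 * mediator + 0.2 * snp + rnorm(n)    # path b plus a direct SNP effect

total_fit  <- lm(outcome ~ snp)              # total effect
a_fit      <- lm(mediator ~ snp)             # path a
direct_fit <- lm(outcome ~ snp + mediator)   # path b and the remaining (direct) effect

total_effect    <- unname(coef(total_fit)["snp"])
a               <- unname(coef(a_fit)["snp"])
b               <- unname(coef(direct_fit)["mediator"])
direct_effect   <- unname(coef(direct_fit)["snp"])
indirect_effect <- a * b                     # effect through the mediator

c(total = total_effect, indirect = indirect_effect, direct = direct_effect)

# The two sides of the decomposition agree up to floating-point error
all.equal(total_effect, indirect_effect + direct_effect)
```

The equality is a property of linear regression with a single mediator; it is not guaranteed for non-linear models such as logistic regression.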
Evidence for Mediation#
Evidence for mediation is strongest when:
Effect size shrinks: Total effect > Effect after controlling for mediator
Significance is lost: the SNP p-value increases substantially after controlling for the mediator
Biological plausibility: Mediator is in a known pathway
Analysis Steps#
Estimate total effect:
lm(Outcome ~ SNP) (should be significant)
Test for mediation:
lm(Outcome ~ SNP + Mediator) (SNP effect should reduce/disappear)
Interpretation:
If SNP effect disappears -> Complete mediation
If SNP effect reduces -> Partial mediation
If SNP effect unchanged -> No mediation
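The steps above can be sketched on simulated data where the answer is known. This is an illustration only: the sample size, allele frequency, and effect sizes are assumptions chosen so that the SNP acts on the outcome solely through the mediator.

```r
set.seed(2)
n <- 1000

# Complete mediation by construction (all numbers are arbitrary assumptions)
SNP      <- rbinom(n, size = 2, prob = 0.4)
Mediator <- 1.0 * SNP + rnorm(n)
Outcome  <- 0.7 * Mediator + rnorm(n)   # no direct SNP term

# Step 1: total effect
total_model <- lm(Outcome ~ SNP)

# Step 2: effect after controlling for the mediator
adjusted_model <- lm(Outcome ~ SNP + Mediator)

# Step 3: compare the SNP row of both coefficient tables
total_row    <- summary(total_model)$coefficients["SNP", ]
adjusted_row <- summary(adjusted_model)$coefficients["SNP", ]

comparison <- rbind(total = total_row, adjusted = adjusted_row)
comparison[, c("Estimate", "Pr(>|t|)")]
```

Comparing the estimates side by side, not only the p-values, helps separate a real drop in effect size from a loss of precision caused by adding a correlated covariate.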
Mediation vs Other Variable Types#
| Type | Question | Action | Structure | Examples |
|---|---|---|---|---|
| Confounder | Does this affect both SNP and outcome? | Must control to remove bias | SNP ← Confounder → Outcome | Population ancestry, age, sex, environmental exposures |
| Collider | Is this caused by both SNP and outcome? | Never control; doing so creates bias | SNP → Collider ← Outcome | Study participation, hospital admission, survival to study age |
| Mediator | Does this explain HOW SNP affects outcome? | Can control to isolate direct effects | SNP → Mediator → Outcome | Gene expression, protein levels, hormone levels, enzyme activity |
Important: Whether a variable is a confounder, mediator, or collider depends on your research question and the causal structure of your specific analysis. The same variable can play different roles in different analyses.
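To see why the recommended action differs by variable type, the sketch below contrasts the mediator case with a collider. It uses simulated data in which the SNP has no effect on the outcome at all; the variable names and effect sizes are illustrative assumptions. Adjusting for the collider manufactures an association that is not there.

```r
set.seed(3)
n <- 2000

# SNP and outcome are independent by construction (assumed toy setup)
snp      <- rbinom(n, size = 2, prob = 0.3)
outcome  <- rnorm(n)
collider <- snp + outcome + rnorm(n)   # caused by both SNP and outcome

# Without adjustment the SNP coefficient is close to zero
unadjusted <- unname(coef(lm(outcome ~ snp))["snp"])

# Adjusting for the collider pulls the SNP coefficient away from zero
adjusted <- unname(coef(lm(outcome ~ snp + collider))["snp"])

c(unadjusted = unadjusted, adjusted = adjusted)
```

The regression call is the same one used to test for mediation, which is why the causal structure has to be decided before choosing covariates.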
Example#
A genetic variant is associated with height—but how does it influence height? We have 5 individuals and suspect growth hormone is the mediator. If growth hormone is the only pathway, then controlling for it should eliminate the variant’s association with height. If the association disappears, we have complete mediation; if it merely reduces, we have partial mediation.
Setup#
# Clear the environment
rm(list = ls())
set.seed(16)
# Define genotypes for 5 individuals at 3 variants
# These represent actual alleles at each position
# For example, Individual 1 has genotypes: CC, CT, AT
genotypes <- c(
  "CC", "CT", "AT", # Individual 1
  "TT", "TT", "AA", # Individual 2
  "CT", "CT", "AA", # Individual 3
  "CC", "TT", "AA", # Individual 4
  "CC", "CC", "TT"  # Individual 5
)
# Reshape into a matrix
N = 5
M = 3
geno_matrix <- matrix(genotypes, nrow = N, ncol = M, byrow = TRUE)
rownames(geno_matrix) <- paste("Individual", 1:N)
colnames(geno_matrix) <- paste("Variant", 1:M)
alt_alleles <- c("T", "C", "T")
# Convert to raw genotype matrix using the additive model
Xraw_additive <- matrix(0, nrow = N, ncol = M) # count number of non-reference alleles
rownames(Xraw_additive) <- rownames(geno_matrix)
colnames(Xraw_additive) <- colnames(geno_matrix)
for (i in 1:N) {
  for (j in 1:M) {
    # Split the genotype string into its two alleles and count the alt allele
    alleles <- strsplit(geno_matrix[i, j], "")[[1]]
    Xraw_additive[i, j] <- sum(alleles == alt_alleles[j])
  }
}
X <- scale(Xraw_additive, center = TRUE, scale = TRUE)
We generate growth hormone levels for each individual from variant 3:
# Generate growth hormone levels FROM variant 3 (mediator pathway)
GH_raw <- 6 + 2 * Xraw_additive[, 3] + rnorm(N, 0, 0.1) # Variant 3 affects GH
GH <- scale(GH_raw)
Then we generate height for each individual, with variant 3 acting only through growth hormone:
# Create mediator structure: Variant 3 -> Growth Hormone -> Height
# Height is caused by:
# 1. Direct effect from growth hormone (the mediator)
# 2. Small effects from variants 1 and 2 (not mediated)
# 3. NO direct effect from variant 3 (fully mediated through GH)
height_raw <- 160 +      # Base height
  3 * GH +               # Growth hormone effect (mediator pathway)
  1 * X[, 1] +           # Small direct effect from variant 1
  0.5 * X[, 2] +         # Small direct effect from variant 2
  0 * X[, 3] +           # NO direct effect from variant 3 (fully mediated)
  rnorm(N, 0, 0.5)       # Small noise
Y <- scale(height_raw)
OLS Regression#
Then we perform OLS regression for the third variant:
# Perform OLS regression for the third variant
SNP <- X[, 3] # Extract genotype for SNP 3
model <- lm(Y ~ SNP) # OLS regression: Trait ~ SNP
adjusted_model <- lm(Y ~ SNP + GH) # Adjust for GH, OLS regression: Trait ~ SNP + GH
summary_model <- summary(model)
summary_adjusted_model <- summary(adjusted_model)
p_value <- summary_model$coefficients[2, 4] # p-value for SNP effect
beta <- summary_model$coefficients[2, 1] # Estimated beta coefficient
p_value_adjusted <- summary_adjusted_model$coefficients[2, 4] # p-value for SNP effect adjusted for growth hormone
beta_adjusted <- summary_adjusted_model$coefficients[2, 1] # Estimated beta coefficient adjusted for growth hormone
# Create results table
results <- data.frame(Variant = "Variant 3", Beta = beta, P_Value = p_value,
Beta_Adjusted = beta_adjusted, P_Value_Adjusted = p_value_adjusted)
results
| Variant | Beta | P_Value | Beta_Adjusted | P_Value_Adjusted |
|---|---|---|---|---|
| <chr> | <dbl> | <dbl> | <dbl> | <dbl> |
| Variant 3 | 0.9277802 | 0.02304393 | 2.220295 | 0.6211939 |
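Variant 3 shows the pattern expected under complete mediation: its association with height (p = 0.023) is no longer significant when controlling for growth hormone (p = 0.62). This matches how the data were simulated, where Variant 3 affects height only through growth hormone. The adjusted coefficient (2.22) is larger than the unadjusted one (0.93) rather than smaller. With only 5 individuals and a mediator that is almost perfectly correlated with the SNP by construction, the adjusted estimate is very imprecise, so the loss of significance here is illustrative and does not by itself prove the absence of a direct pathway.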