Meta-Analysis Random Effects#
Random-effects meta-analysis accounts for variability in effect sizes across studies by assuming each study estimates a different effect drawn from a shared distribution, thus modeling both within-study error and between-study heterogeneity.
Graphical Summary#

Key Formula#
In random-effects meta-analysis, we assume that the true effect sizes vary across studies. The weighted mean effect size is calculated as:

\[\hat{\beta} = \frac{\sum_{k=1}^{K} w_k^* \hat{\beta}_k}{\sum_{k=1}^{K} w_k^*}\]
Where:
- \(\hat{\beta}\) is the combined effect estimate across all studies
- \(\hat{\beta}_k\) is the effect estimate from study \(k\)
- \(w_k^* = \frac{1}{\text{SE}_k^2 + \tau^2}\) is the random-effects weight for study \(k\)
- \(\tau^2\) is the between-study variance (heterogeneity)
- \(K\) is the number of studies
The key difference from the fixed-effects model is that the weights now include \(\tau^2\), which accounts for true heterogeneity between studies.
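As a minimal numerical sketch (the effect estimates, standard errors, and \(\tau^2\) below are made-up illustrative values, not from any real study), the formula can be applied directly to summary statistics:

# Hypothetical summary statistics from three studies (illustrative values only)
beta_hat <- c(0.8, 1.2, 1.0)    # per-study effect estimates
se       <- c(0.10, 0.15, 0.20) # per-study standard errors
tau2     <- 0.05                # assumed between-study variance

w_star      <- 1 / (se^2 + tau2)                    # random-effects weights
beta_pooled <- sum(w_star * beta_hat) / sum(w_star) # weighted mean effect
se_pooled   <- sqrt(1 / sum(w_star))                # SE of the pooled effect
c(beta_pooled, se_pooled)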
Technical Details#
Heterogeneity#
Heterogeneity refers to the variability in effect sizes across different studies beyond what we’d expect from random sampling error alone. In other words, are the studies telling us the same story, or are they finding genuinely different effects? Common sources of heterogeneity include differences in genetic ancestry, study design or measurement methods, and environmental context.
When heterogeneity is present, we assume each study estimates its own true effect drawn from a common distribution.
Assumption: the true effects follow a common distribution

\[\beta_k \sim N(\beta, \tau^2)\]
Where:
- \(\beta\) = mean effect across all possible studies (what we want to estimate)
- \(\tau^2\) = between-study variance (how much true effects vary across studies)
- Each study \(k\) has its own true effect \(\beta_k\)
Measuring Heterogeneity#
\(I^2\) statistic: the most intuitive measure of heterogeneity; it tells us what percentage of the observed variation comes from real differences between studies rather than from random chance.
Interpretation:
- \(I^2 = 0\%\): Studies are consistent - variation is just due to random sampling
- \(I^2 = 25\%\): Low heterogeneity - studies are mostly similar
- \(I^2 = 50\%\): Moderate heterogeneity - some real differences between studies
- \(I^2 = 75\%\): High heterogeneity - studies are finding quite different effects
Example: If \(I^2 = 60\%\), this means 60% of the variation we see across studies reflects real differences in effect sizes, while only 40% is due to random sampling error.
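For reference, both quantities used in the code below follow the standard definitions: Cochran’s \(Q\) statistic and the \(I^2\) derived from it, where \(w_k = 1/\text{SE}_k^2\) are the fixed-effects weights and \(\hat{\beta}_F\) is the fixed-effects pooled estimate:

\[Q = \sum_{k=1}^{K} w_k (\hat{\beta}_k - \hat{\beta}_F)^2, \qquad I^2 = \max\left(0,\ \frac{Q - (K - 1)}{Q}\right) \times 100\%\]

Under the null hypothesis of no heterogeneity, \(Q\) follows a \(\chi^2\) distribution with \(K - 1\) degrees of freedom, which is how the heterogeneity p-value is obtained.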
Comparison Between Fixed and Random Effects#
| Aspect | Fixed Effect | Random Effect |
|---|---|---|
| Assumption | All studies estimate the same true effect | Studies estimate different true effects drawn from a distribution |
| Weight Formula | \(\frac{1}{\text{SE}_k^2}\) (only within-study variance) | \(\frac{1}{\text{SE}_k^2 + \tau^2}\) (within-study + between-study variance) |
| When to Use | Studies are very similar in design and population | Significant heterogeneity between studies |
| Results | Narrower confidence intervals; estimates the effect in the specific populations studied | Wider confidence intervals; estimates the average effect across broader populations |
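To make the weight formulas in the table above concrete, here is a minimal sketch with made-up standard errors and an assumed \(\tau^2\); adding \(\tau^2\) to every study’s variance makes the weights more even, so large studies dominate less:

# Hypothetical standard errors for a small and a large study (illustrative values)
se   <- c(0.20, 0.05)
tau2 <- 0.10  # assumed between-study variance

w_fixed  <- 1 / se^2            # fixed-effect weights
w_random <- 1 / (se^2 + tau2)   # random-effects weights

round(w_fixed / sum(w_fixed), 2)   # 0.06 0.94 - the large study dominates
round(w_random / sum(w_random), 2) # 0.42 0.58 - weights are much more even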
Example#
We simulate two studies with different true effect sizes (drawn from a distribution with τ² = 0.3) to demonstrate random-effects meta-analysis. This scenario mimics real heterogeneity where populations differ in genetic backgrounds or environmental factors, making the random-effects model more appropriate than the fixed-effects model.
Setup#
rm(list=ls())
set.seed(18)
# Simulate 2 diverse cohorts where TRUE EFFECT SIZES are drawn from a distribution
K <- 2 # Number of studies
N <- c(5000, 8000) # Different sample sizes
# Different MAFs reflecting population diversity
mafs <- c(0.25, 0.40)
# RANDOM EFFECTS MODEL: True effect sizes drawn from distribution
# This is the key difference - betas are RANDOM, not fixed
beta_mean <- 1.0 # Mean effect size across all possible studies
tau_squared_true <- 0.3 # True between-study variance
# Draw true effect sizes from distribution
true_betas <- rnorm(K, mean = beta_mean, sd = sqrt(tau_squared_true))
Then we generate the data for each study and create a summary table.
# Generate data for each study
studies_data <- list()
for(i in 1:K) {
  # Generate genotypes
  genotypes <- rbinom(N[i], 2, mafs[i])
  # Generate phenotypes using the RANDOM true effect
  phenotypes <- true_betas[i] * genotypes + rnorm(N[i], 0, 3)
  # Run regression
  lm_result <- lm(phenotypes ~ genotypes)
  # Store results
  studies_data[[i]] <- list(
    study_id = i,
    n = N[i],
    maf = mafs[i],
    true_beta = true_betas[i],
    observed_beta = coef(lm_result)["genotypes"],
    se = summary(lm_result)$coefficients["genotypes", "Std. Error"]
  )
}
# Create summary table
studies <- data.frame(
  Study = 1:K,
  N = sapply(studies_data, function(x) x$n),
  MAF = sapply(studies_data, function(x) x$maf),
  True_Beta = sapply(studies_data, function(x) x$true_beta),
  Observed_Beta = sapply(studies_data, function(x) x$observed_beta),
  SE = sapply(studies_data, function(x) x$se)
)
studies$P_Value <- 2 * pnorm(-abs(studies$Observed_Beta / studies$SE))
studies
| Study | N | MAF | True_Beta | Observed_Beta | SE | P_Value |
|---|---|---|---|---|---|---|
| <int> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> |
| 1 | 5000 | 0.25 | 1.507443 | 1.507900 | 0.06967405 | 7.1981e-104 |
| 2 | 8000 | 0.40 | 1.998400 | 1.973478 | 0.04827749 | 0.0000e+00 |
Heterogeneity#
We first assess the heterogeneity for this meta-analysis:
# Calculate inverse variance weights for heterogeneity testing
w <- 1 / studies$SE^2
studies$Weight <- w / sum(w)
# Calculate naive weighted average (fixed-effects estimate)
beta_naive <- sum(studies$Observed_Beta * w) / sum(w)
# Calculate Q statistic (test for heterogeneity)
Q <- sum(w * (studies$Observed_Beta - beta_naive)^2)
df <- K - 1
p_heterogeneity <- 1 - pchisq(Q, df)
# I^2 statistic (percentage of variation due to heterogeneity)
I_squared <- max(0, (Q - df) / Q) * 100
cat("Heterogeneity Statistics:\n")
cat("Q statistic:", round(Q, 3), "\n")
cat("p-value for heterogeneity:", round(p_heterogeneity, 4), "\n")
cat("I^2 statistic:", round(I_squared, 1), "%\n\n")
if(I_squared > 25) {
  cat("HETEROGENEITY DETECTED - Random-effects model is appropriate!\n")
} else {
  cat("Low heterogeneity detected\n")
}
Heterogeneity Statistics:
Q statistic: 30.168
p-value for heterogeneity: 0
I^2 statistic: 96.7 %
HETEROGENEITY DETECTED - Random-effects model is appropriate!
Meta-analysis Random Effects#
Now we conduct the random-effects meta-analysis of the two studies:
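The between-study variance is estimated with the DerSimonian-Laird (DL) method, which the code below implements; each study is then re-weighted by its total (within-study plus between-study) variance:

\[\hat{\tau}^2 = \max\left(0,\ \frac{Q - (K - 1)}{\sum_{k} w_k - \frac{\sum_{k} w_k^2}{\sum_{k} w_k}}\right), \qquad w_k = \frac{1}{\text{SE}_k^2}\]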
# Estimate between-study variance (tau^2) using DerSimonian-Laird method
sum_w <- sum(w)
sum_w_squared <- sum(w^2)
tau_squared_est <- max(0, (Q - df) / (sum_w - sum_w_squared/sum_w))
# Calculate random-effects weights
w_random <- 1 / (studies$SE^2 + tau_squared_est)
studies$Weight_Random <- w_random / sum(w_random)
# Random-effects estimate
beta_random <- sum(studies$Observed_Beta * w_random) / sum(w_random)
se_random <- sqrt(1 / sum(w_random))
z_random <- beta_random / se_random
p_random <- 2 * pnorm(-abs(z_random))
cat("Random-Effects Meta-Analysis Results:\n")
results <- data.frame(
  Estimate = round(beta_random, 4),
  SE = round(se_random, 4),
  Z_score = round(z_random, 4),
  P_value = format(p_random, scientific = TRUE, digits = 3)
)
results
Random-Effects Meta-Analysis Results:
| Estimate | SE | Z_score | P_value |
|---|---|---|---|
| <dbl> | <dbl> | <dbl> | <chr> |
| 1.7434 | 0.2328 | 7.4897 | 6.9e-14 |
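As an optional cross-check (not part of the analysis above), the same DerSimonian-Laird fit can be reproduced with the metafor package, assuming it is installed:

# Optional cross-check with the metafor package (assumes metafor is installed)
library(metafor)
rma(yi = studies$Observed_Beta, sei = studies$SE, method = "DL")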
Meta-analysis Fixed Effects#
We also perform a fixed-effects meta-analysis for comparison:
# Also perform fixed-effects meta-analysis for comparison
beta_fixed <- sum(studies$Observed_Beta * w) / sum(w)
se_fixed <- sqrt(1 / sum(w))
z_fixed <- beta_fixed / se_fixed
p_fixed <- 2 * pnorm(-abs(z_fixed))
cat("\nFixed-Effects Meta-Analysis Results:\n")
results_fixed <- data.frame(
  Estimate = round(beta_fixed, 4),
  SE = round(se_fixed, 4),
  Z_score = round(z_fixed, 4),
  P_value = format(p_fixed, scientific = TRUE, digits = 3)
)
results_fixed
Fixed-Effects Meta-Analysis Results:
| Estimate | SE | Z_score | P_value |
|---|---|---|---|
| <dbl> | <dbl> | <dbl> | <chr> |
| 1.8225 | 0.0397 | 45.9261 | 0e+00 |
Comparison of Results#
# Compare the two approaches
comparison <- data.frame(
  Model = c("Fixed-Effects", "Random-Effects"),
  Estimate = c(round(beta_fixed, 4), round(beta_random, 4)),
  SE = c(round(se_fixed, 4), round(se_random, 4)),
  CI_Width = c(round(1.96 * se_fixed * 2, 4), round(1.96 * se_random * 2, 4)),
  P_value = c(format(p_fixed, scientific = TRUE, digits = 3),
              format(p_random, scientific = TRUE, digits = 3))
)
comparison
| Model | Estimate | SE | CI_Width | P_value |
|---|---|---|---|---|
| <chr> | <dbl> | <dbl> | <dbl> | <chr> |
| Fixed-Effects | 1.8225 | 0.0397 | 0.1556 | 0e+00 |
| Random-Effects | 1.7434 | 0.2328 | 0.9125 | 6.9e-14 |
With high heterogeneity (\(I^2 = 96.7\%\)), the fixed-effects and random-effects models yield different results:
- Fixed-effects (\(\hat{\beta}=1.82\), SE = 0.04): assumes a single shared true effect, producing a smaller SE
- Random-effects (\(\hat{\beta}=1.74\), SE = 0.23): accounts for between-study variance (\(\tau^2\)), producing wider confidence intervals
The random-effects estimate represents the average effect across populations, while the wider SE appropriately reflects both within-study and between-study uncertainty.
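As a small follow-up sketch (using the objects defined above), the 95% confidence intervals behind this comparison can be computed directly from the pooled estimates and their standard errors:

# 95% confidence intervals for the two pooled estimates
ci_fixed  <- beta_fixed  + c(-1, 1) * 1.96 * se_fixed   # roughly 1.74 to 1.90
ci_random <- beta_random + c(-1, 1) * 1.96 * se_random  # roughly 1.29 to 2.20
round(ci_fixed, 3)
round(ci_random, 3)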