PRS-CS: a polygenic prediction method that infers posterior SNP effect sizes under continuous shrinkage (CS) priors

This function is a wrapper for the PRS-CS method implemented in C++. It takes marginal effect size estimates from regression and an external LD reference panel and infers posterior SNP effect sizes using Bayesian regression with continuous shrinkage priors.

Usage

prs_cs(
  bhat,
  LD,
  n,
  a = 1,
  b = 0.5,
  phi = NULL,
  maf = NULL,
  n_iter = 1000,
  n_burnin = 500,
  thin = 5,
  verbose = FALSE,
  seed = NULL
)

Arguments

bhat: A vector of marginal effect sizes.
LD: A list of LD blocks, where each element is a matrix representing an LD block.
n: Sample size of the GWAS.
a: Shape parameter for the prior distribution of psi. Default is 1.
b: Scale parameter for the prior distribution of psi. Default is 0.5.
phi: Global shrinkage parameter. If NULL, it will be estimated automatically. Default is NULL.
maf: A vector of minor allele frequencies, if available, will standardize the effect sizes by MAF. Default is NULL.
n_iter: Number of MCMC iterations. Default is 1000.
n_burnin: Number of burn-in iterations. Default is 500.
thin: Thinning factor for MCMC. Default is 5.
verbose: Whether to print verbose output. Default is FALSE.
seed: Random seed for reproducibility. Default is NULL.

Value

A list containing the posterior estimates: - beta_est: Posterior estimates of SNP effect sizes. - psi_est: Posterior estimates of psi (shrinkage parameters). - sigma_est: Posterior estimate of the residual variance. - phi_est: Posterior estimate of the global shrinkage parameter.

Examples

# Generate example data
set.seed(985115)
n <- 350
p <- 16
sigmasq_error <- 0.5
zeroes <- rbinom(p, 1, 0.6)
beta.true <- rnorm(p, 1, sd = 4)
beta.true[zeroes] <- 0

X <- cbind(matrix(rnorm(n * p), nrow = n))
X <- scale(X, center = TRUE, scale = FALSE)
y <- X %*% matrix(beta.true, ncol = 1) + rnorm(n, 0, sqrt(sigmasq_error))
y <- scale(y, center = TRUE, scale = FALSE)

# Calculate sufficient statistics
XtX <- t(X) %*% X
Xty <- t(X) %*% y
yty <- t(y) %*% y

# Set the prior
K <- 9
sigma0 <- c(0.001, .1, .5, 1, 5, 10, 20, 30, .005)
omega0 <- rep(1 / K, K)

# Calculate summary statistics
b.hat <- sapply(1:p, function(j) {
  summary(lm(y ~ X[, j]))$coefficients[-1, 1]
})
s.hat <- sapply(1:p, function(j) {
  summary(lm(y ~ X[, j]))$coefficients[-1, 2]
})
R.hat <- cor(X)
var_y <- var(y)
sigmasq_init <- 1.5

# Run PRS CS
maf <- rep(0.5, length(b.hat)) # fake MAF
LD <- list(blk1 = R.hat)
out <- prs_cs(b.hat, LD, n, maf = maf)
# In sample prediction correlations
cor(X %*% out$beta_est, y) # 0.9944553
#>           [,1]
#> [1,] 0.9939295