Skip to contents

This function is a wrapper for the SDPR C++ implementation, which performs Markov Chain Monte Carlo (MCMC) for estimating effect sizes and heritability based on summary statistics and reference LD matrices.

Usage

sdpr(
  bhat,
  LD,
  n,
  per_variant_sample_size = NULL,
  array = NULL,
  a = 0.1,
  c = 1,
  M = 1000,
  a0k = 0.5,
  b0k = 0.5,
  iter = 1000,
  burn = 200,
  thin = 5,
  n_threads = 1,
  opt_llk = 1,
  verbose = TRUE
)

Arguments

bhat

A vector of marginal beta values for each SNP.

LD

A list of LD matrices, where each matrix corresponds to a subset of SNPs.

n

The total sample size of the GWAS.

per_variant_sample_size

(Optional) A vector of sample sizes for each SNP. If NULL (default), it will be initialized to a vector of length equal to `bhat`, with all values set to `n`.

array

(Optional) A vector of genotyping array information for each SNP. If NULL (default), it will be initialized to a vector of 1's with length equal to `bhat`.

a

Factor to shrink the reference LD matrix. Default is 0.1.

c

Factor to correct for the deflation. Default is 1.

M

Max number of variance components. Default is 1000.

a0k

Hyperparameter for inverse gamma distribution. Default is 0.5.

b0k

Hyperparameter for inverse gamma distribution. Default is 0.5.

iter

Number of iterations for MCMC. Default is 1000.

burn

Number of burn-in iterations for MCMC. Default is 200.

thin

Thinning interval for MCMC. Default is 5.

n_threads

Number of threads to use. Default is 1.

opt_llk

Which likelihood to evaluate. 1 for equation 6 (slightly shrink the correlation of SNPs) and 2 for equation 5 (SNPs genotyped on different arrays in a separate cohort). Default is 1.

verbose

Whether to print verbose output. Default is true.

Value

A list containing the estimated effect sizes (beta) and heritability (h2).

Note

This function is a wrapper for the SDPR C++ implementation, which is a rewritten and adopted version of the SDPR package. The original SDPR documentation is available at https://htmlpreview.github.io/?https://github.com/eldronzhou/SDPR/blob/main/doc/Manual.html

Examples

# Generate example data
set.seed(985115)
n <- 350
p <- 16
sigmasq_error <- 0.5
zeroes <- rbinom(p, 1, 0.6)
beta.true <- rnorm(p, 1, sd = 4)
beta.true[zeroes] <- 0

X <- cbind(matrix(rnorm(n * p), nrow = n))
X <- scale(X, center = TRUE, scale = FALSE)
y <- X %*% matrix(beta.true, ncol = 1) + rnorm(n, 0, sqrt(sigmasq_error))
y <- scale(y, center = TRUE, scale = FALSE)

# Calculate sufficient statistics
XtX <- t(X) %*% X
Xty <- t(X) %*% y
yty <- t(y) %*% y

# Set the prior
K <- 9
sigma0 <- c(0.001, .1, .5, 1, 5, 10, 20, 30, .005)
omega0 <- rep(1 / K, K)

# Calculate summary statistics
b.hat <- sapply(1:p, function(j) {
  summary(lm(y ~ X[, j]))$coefficients[-1, 1]
})
s.hat <- sapply(1:p, function(j) {
  summary(lm(y ~ X[, j]))$coefficients[-1, 2]
})
R.hat <- cor(X)
var_y <- var(y)
sigmasq_init <- 1.5

# Run SDPR
LD <- list(blk1 = R.hat)
out <- sdpr(b.hat, LD, n)
# In sample prediction correlations
cor(X %*% out$beta_est, y) #
#>           [,1]
#> [1,] 0.9983291