Train eQTL weights using multiple RSS methods (OTTERS Stage I)

Implements the training stage of the OTTERS framework (Omnibus Transcriptome Test using Expression Reference Summary data, Zhang et al. 2024). Trains eQTL effect size weights for a gene region using multiple summary-statistics-based methods in parallel, enabling downstream omnibus TWAS testing.

Usage

otters_weights(
  sumstats,
  LD,
  n,
  methods = list(lassosum_rss = list(), prs_cs = list(phi = 1e-04, n_iter = 1000,
    n_burnin = 500, thin = 5), sdpr = list(iter = 1000, burn = 200, thin = 1, verbose =
    FALSE)),
  p_thresholds = c(0.001, 0.05),
  check_ld_method = "eigenfix"
)

Arguments

sumstats

A data.frame of eQTL summary statistics. Must contain either a column z (z-scores) or both beta and se; when z is absent, z-scores are computed as beta / se.
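The fallback described above can be sketched in base R. This is a minimal illustration on made-up numbers, not the package's internal code; the data values are hypothetical.

```r
# Hypothetical summary statistics that carry beta and se but no z column
sumstats <- data.frame(
  beta = c(0.10, -0.25, 0.03),
  se   = c(0.05,  0.10, 0.06)
)

# Derive z-scores only when they are not already supplied
if (!"z" %in% names(sumstats)) {
  sumstats$z <- sumstats$beta / sumstats$se
}
```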

LD

LD correlation matrix R for the gene region (single matrix, not a list). Should have row/column names matching variant identifiers if variant alignment is desired.

n

eQTL study sample size (scalar).

methods

Named list of RSS methods and their extra arguments. Each element name must correspond to a *_weights function in pecotmr (without the _weights suffix). Defaults match the original OTTERS pipeline (Zhang et al. 2024):

lassosum_rss: s grid = c(0.2, 0.5, 0.9, 1.0), lambda from 0.0001 to 0.1 (20 values on log scale)
prs_cs: phi = 1e-4 (fixed, not learned), 1000 iterations, 500 burn-in, thin = 5
sdpr: 1000 iterations, 200 burn-in, thin = 1 (no thinning)

To add learners (e.g., mr_ash_rss), simply append to this list.
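As a sketch of how the list might be extended, the snippet below rebuilds the defaults shown under Usage, appends mr_ash_rss with no extra arguments, and overrides one PRS-CS setting. The variable names default_methods, methods and methods_long are illustrative, and whether mr_ash_rss needs additional arguments is not stated here.

```r
# Defaults copied from the Usage section
default_methods <- list(
  lassosum_rss = list(),
  prs_cs = list(phi = 1e-04, n_iter = 1000, n_burnin = 500, thin = 5),
  sdpr = list(iter = 1000, burn = 200, thin = 1, verbose = FALSE)
)

# Append an extra learner; an empty list means "no extra arguments"
methods <- c(default_methods, list(mr_ash_rss = list()))

# Override a single PRS-CS argument while keeping the other defaults
# (modifyList() merges nested lists recursively)
methods_long <- modifyList(methods, list(prs_cs = list(n_iter = 2000)))
```

The resulting list can then be passed as the methods argument of otters_weights().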

p_thresholds

Numeric vector of p-value thresholds for P+T. Set to NULL to skip P+T. Default: c(0.001, 0.05).

check_ld_method

LD quality check method passed to check_ld. Default "eigenfix" sets negative eigenvalues to zero; the Cholesky factorization used by PRS-CS requires this, and it mirrors the SVD-based positive-definite forcing in OTTERS. Set to NULL to skip checking.
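To illustrate what "sets negative eigenvalues to zero" means, here is a minimal base-R sketch. The function eigenfix_sketch is hypothetical and is not the check_ld implementation, which may differ in details such as tolerances or rescaling.

```r
# Hypothetical helper: clip negative eigenvalues of a symmetric matrix at zero
eigenfix_sketch <- function(R) {
  eig <- eigen(R, symmetric = TRUE)
  vals <- pmax(eig$values, 0)                  # negative eigenvalues -> 0
  R_fixed <- eig$vectors %*% diag(vals, nrow = length(vals)) %*% t(eig$vectors)
  dimnames(R_fixed) <- dimnames(R)             # keep variant identifiers
  R_fixed
}

# A toy "correlation" matrix that is not positive semidefinite
R_bad <- matrix(c(1.0,  0.9,  0.9,
                  0.9,  1.0, -0.9,
                  0.9, -0.9,  1.0), nrow = 3)

R_fixed <- eigenfix_sketch(R_bad)
min_before <- min(eigen(R_bad, symmetric = TRUE)$values)
min_after  <- min(eigen(R_fixed, symmetric = TRUE)$values)
```

After the fix the smallest eigenvalue is zero up to floating-point error, so the matrix is positive semidefinite rather than strictly positive definite.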

Value

A named list of weight vectors (one per method). Each vector has length equal to nrow(sumstats). P+T results are named PT_<threshold>.

Details

Methods are dispatched dynamically via do.call(paste0(method, "_weights"), ...), so any function following the *_weights(stat, LD, ...) convention can be used (e.g., lassosum_rss_weights, prs_cs_weights, sdpr_weights, mr_ash_rss_weights).
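The dispatch pattern can be sketched with a toy learner. Everything below is illustrative: toy_weights is a hypothetical function, and the structure of stat is an assumption, since the exact object passed to the real *_weights functions is not described here.

```r
# Hypothetical learner following the *_weights(stat, LD, ...) convention
toy_weights <- function(stat, LD, shrink = 0.5, ...) {
  stat$z * shrink                      # simply shrink the z-scores
}

# Method names map to their extra arguments, as in the methods argument
methods <- list(toy = list(shrink = 0.1))

stat <- list(z = c(2, -1, 0.5))        # assumed shape of the statistics object
LD <- diag(3)

# Build the function name from the method name and call it with do.call()
results <- lapply(names(methods), function(m) {
  do.call(paste0(m, "_weights"),
          c(list(stat = stat, LD = LD), methods[[m]]))
})
names(results) <- names(methods)
```

Because the function is looked up by name at call time, a new learner only needs a function with the matching name and argument convention.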

P+T (pruning and thresholding) is handled internally: for each threshold, SNPs with eQTL p-value below the threshold are selected, and their marginal z-scores (scaled to correlation units: z / sqrt(n)) are used as weights.
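The thresholding step can be sketched as follows. Two details are assumptions rather than statements from this page: p-values are taken as two-sided normal p-values of the z-scores, and SNPs that fail the threshold receive a weight of zero so that each vector keeps one entry per variant. The function pt_weights_sketch is hypothetical.

```r
# Hypothetical P+T helper: keep z / sqrt(n) for SNPs passing the threshold
pt_weights_sketch <- function(z, n, threshold) {
  pval <- 2 * pnorm(-abs(z))                   # assumed two-sided p-value
  ifelse(pval < threshold, z / sqrt(n), 0)     # assumed zero for unselected SNPs
}

z <- c(4, -0.5, 2.5, -3.5)
n <- 400
p_thresholds <- c(0.001, 0.05)

pt <- lapply(p_thresholds, function(thr) pt_weights_sketch(z, n, thr))
names(pt) <- paste0("PT_", p_thresholds)       # "PT_0.001", "PT_0.05"
```

With these toy inputs, the stricter threshold keeps only the first and last SNPs, while the looser one also keeps the third.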

Examples

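set.seed(42)
n <- 500  # eQTL sample size
p <- 20   # number of variants
z <- rnorm(p, sd = 2)  # simulated z-scores
R <- diag(p)           # identity LD: variants treated as independent
sumstats <- data.frame(z = z)
# Train lassosum_rss with default settings, plus P+T at p < 0.05
weights <- otters_weights(sumstats, R, n,
  methods = list(lassosum_rss = list()),
  p_thresholds = c(0.05))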