Implements the training stage of the OTTERS framework (Omnibus Transcriptome Test using Expression Reference Summary data, Zhang et al. 2024). Trains eQTL effect size weights for a gene region using multiple summary-statistics-based methods in parallel, enabling downstream omnibus TWAS testing.

Usage

otters_weights(
  sumstats,
  LD,
  n,
  methods = list(
    lassosum_rss = list(),
    prs_cs = list(phi = 1e-04, n_iter = 1000, n_burnin = 500, thin = 5),
    sdpr = list(iter = 1000, burn = 200, thin = 1, verbose = FALSE)
  ),
  p_thresholds = c(0.001, 0.05),
  check_ld_method = "eigenfix"
)

Arguments

sumstats

A data.frame of eQTL summary statistics. Must contain either a column z (z-scores) or both columns beta and se. If z is absent but beta and se are present, z-scores are computed as beta / se.
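The z-score fallback described above can be sketched as follows. This is a minimal illustration, not the package's internal code: the helper name ensure_z is hypothetical, and the error raised when neither z nor beta/se is available is an assumption.

```r
# Hypothetical helper (not part of pecotmr): make sure a z column exists.
ensure_z <- function(sumstats) {
  if (!"z" %in% names(sumstats)) {
    # Assumption: fail loudly when z cannot be derived.
    if (!all(c("beta", "se") %in% names(sumstats))) {
      stop("sumstats needs either 'z' or both 'beta' and 'se'")
    }
    sumstats$z <- sumstats$beta / sumstats$se
  }
  sumstats
}

# Toy input with effect sizes and standard errors only
ss <- data.frame(beta = c(0.20, -0.10), se = c(0.05, 0.05))
ss <- ensure_z(ss)
```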

LD

LD correlation matrix R for the gene region (single matrix, not a list). Should have row/column names matching variant identifiers if variant alignment is desired.

n

eQTL study sample size (scalar).

methods

Named list of RSS methods and their extra arguments. Each element name must correspond to a *_weights function in pecotmr (without the _weights suffix). Defaults match the original OTTERS pipeline (Zhang et al. 2024):

  • lassosum_rss: s grid = c(0.2, 0.5, 0.9, 1.0), lambda from 0.0001 to 0.1 (20 values on log scale)

  • prs_cs: phi = 1e-4 (fixed, not learned), 1000 iterations, 500 burn-in, thin = 5

  • sdpr: 1000 iterations, 200 burn-in, thin = 1 (no thinning)

To add a learner (e.g., mr_ash_rss), append a named element to this list.
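As a sketch of how the methods list might be customised: the snippet below rebuilds the defaults from the Usage section, appends mr_ash_rss, and overrides one PRS-CS setting. Whether mr_ash_rss runs with an empty argument list is an assumption. The last line reconstructs a 20-point log-spaced lambda grid matching the description above; how the package builds its grid internally is not shown in this page.

```r
# Defaults copied from the Usage section.
default_methods <- list(
  lassosum_rss = list(),
  prs_cs = list(phi = 1e-04, n_iter = 1000, n_burnin = 500, thin = 5),
  sdpr = list(iter = 1000, burn = 200, thin = 1, verbose = FALSE)
)

# Append another learner; an empty list means "use that function's defaults"
# (assumption: mr_ash_rss needs no extra arguments).
methods <- c(default_methods, list(mr_ash_rss = list()))

# Override a single argument while keeping the remaining PRS-CS defaults.
methods$prs_cs <- modifyList(methods$prs_cs, list(n_iter = 2000))

# Illustrative lambda grid: 20 values from 0.0001 to 0.1, evenly spaced on the log scale.
lambda_grid <- exp(seq(log(1e-4), log(0.1), length.out = 20))
```

The resulting list would then be passed as the methods argument of otters_weights.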

p_thresholds

Numeric vector of p-value thresholds for P+T. Set to NULL to skip P+T. Default: c(0.001, 0.05).

check_ld_method

LD quality check method passed to check_ld. The default, "eigenfix", sets negative eigenvalues to zero; this is required for the Cholesky step in PRS-CS and mirrors the SVD-based PD forcing used in OTTERS. Set to NULL to skip checking.
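The idea behind zeroing negative eigenvalues can be illustrated in base R. This is a sketch under assumptions, not the check_ld implementation: the function name eigenfix_sketch is hypothetical, and any further steps the package may apply (for example rescaling the diagonal) are not covered.

```r
# Hypothetical illustration: clip negative eigenvalues of a symmetric matrix to zero.
eigenfix_sketch <- function(R) {
  eig <- eigen(R, symmetric = TRUE)
  vals <- pmax(eig$values, 0)  # negative eigenvalues become 0
  R_fixed <- eig$vectors %*% diag(vals, nrow = length(vals)) %*% t(eig$vectors)
  dimnames(R_fixed) <- dimnames(R)  # keep variant identifiers
  R_fixed
}

# A "correlation" matrix that is not positive semidefinite
R_bad <- matrix(c( 1.0,  0.9,  0.9,
                   0.9,  1.0, -0.9,
                   0.9, -0.9,  1.0), nrow = 3,
                dimnames = list(paste0("v", 1:3), paste0("v", 1:3)))
R_fixed <- eigenfix_sketch(R_bad)
```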

Value

A named list of weight vectors (one per method). Each vector has length equal to nrow(sumstats). P+T results are named PT_<threshold>.

Details

Methods are dispatched dynamically via do.call(paste0(method, "_weights"), ...), so any function following the *_weights(stat, LD, ...) convention can be used (e.g., lassosum_rss_weights, prs_cs_weights, sdpr_weights, mr_ash_rss_weights).
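The dispatch pattern can be sketched with toy functions. Everything named here is hypothetical: scaled_z_weights, zero_weights, and run_methods are stand-ins written for illustration, and the structure of stat (a list with z and n) is an assumption rather than the package's actual input format.

```r
# Toy learners following the *_weights(stat, LD, ...) convention (illustrative only).
scaled_z_weights <- function(stat, LD, shrink = 1) shrink * stat$z / sqrt(stat$n)
zero_weights <- function(stat, LD) rep(0, length(stat$z))

# Look up "<name>_weights" for each entry and call it with that entry's extra arguments.
run_methods <- function(stat, LD, methods) {
  out <- lapply(names(methods), function(m) {
    do.call(paste0(m, "_weights"), c(list(stat = stat, LD = LD), methods[[m]]))
  })
  names(out) <- names(methods)
  out
}

stat <- list(z = c(2, -4), n = 100)
res <- run_methods(stat, diag(2),
                   methods = list(scaled_z = list(shrink = 0.5), zero = list()))
```

Because the function is resolved by name at call time, a new learner only needs a function with the matching suffix and signature.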

P+T (pruning and thresholding) is handled internally: for each threshold, SNPs with eQTL p-value below the threshold are selected, and their marginal z-scores (scaled to correlation units: z / sqrt(n)) are used as weights.
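The thresholding step described above might look like the following sketch. The function name pt_weights_sketch is hypothetical, and three details are assumptions: p-values are taken as two-sided normal p-values of z, unselected SNPs receive a weight of zero so that each vector keeps full length, and result names are formed with paste0("PT_", threshold).

```r
# Hypothetical sketch of P+T weights from z-scores.
pt_weights_sketch <- function(z, n, p_thresholds = c(0.001, 0.05)) {
  pval <- 2 * pnorm(-abs(z))  # two-sided p-value (assumption)
  out <- lapply(p_thresholds, function(thr) {
    # Selected SNPs get z / sqrt(n); the rest get zero (assumption)
    ifelse(pval < thr, z / sqrt(n), 0)
  })
  names(out) <- paste0("PT_", p_thresholds)
  out
}

pt <- pt_weights_sketch(z = c(0.5, -3.5, 2.1, 4.0), n = 400)
```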

Examples

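set.seed(42)
n <- 500  # eQTL study sample size
p <- 20   # number of variants in the region
z <- rnorm(p, sd = 2)  # simulated z-scores
R <- diag(p)           # identity LD: variants treated as independent
sumstats <- data.frame(z = z)
# Train lassosum_rss weights plus P+T at a single threshold
weights <- otters_weights(sumstats, R, n,
  methods = list(lassosum_rss = list()),
  p_thresholds = c(0.05))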