TWAS Weights Pipeline — twas_weights

Computes weights for a transcriptome-wide association study (TWAS). The pipeline filters variants against those in a linkage disequilibrium reference panel, fits models using SuSiE and other methods, and calculates TWAS weights and predictions. Optionally, it cross-validates the TWAS weights.

Usage

twas_weights_pipeline(
  X,
  y,
  susie_fit = NULL,
  fitted_models = NULL,
  cv_folds = 5,
  sample_partition = NULL,
  weight_methods = "default",
  max_cv_variants = -1,
  cv_threads = 1,
  cv_weight_methods = NULL,
  ensemble = TRUE,
  ensemble_r2_threshold = 0.01,
  ensemble_solver = "quadprog",
  ensemble_alpha = 1,
  estimate_pi = TRUE,
  verbose = 1
)

Arguments

X: A matrix of genotype data where rows represent samples and columns represent genetic variants.
y: A vector of phenotype measurements for each sample.
susie_fit: An object returned by the SuSiE function, containing the SuSiE model fit.
fitted_models: Optional named list of fitted fine-mapping models, such as list(susie = susie_fit, susie_inf = susie_inf_fit).
cv_folds: The number of folds to use for cross-validation. Set to 0 to skip cross-validation. Defaults to 5.
sample_partition: Optional data frame with Sample and Fold columns for cross-validation. If NULL, a random partition is generated.
weight_methods: List of methods used to compute TWAS weights, along with their parameters.
max_cv_variants: The maximum number of variants to be included in cross-validation. Defaults to -1 which means no limit.
cv_threads: The number of threads to use for parallel computation in cross-validation. Defaults to 1.
cv_weight_methods: List of methods to use for cross-validation. If NULL, uses the same methods as weight_methods.
ensemble: Logical. If TRUE and cv_folds > 1, learn ensemble combination weights via stacked regression (SR-TWAS). Requires at least two individual methods to have been run and to pass the R-squared cutoff. Defaults to TRUE.
ensemble_r2_threshold: Minimum cross-validated R-squared for an individual method to be included in the ensemble. Methods below this threshold are excluded. Defaults to 0.01.
ensemble_solver: Character string specifying the optimization backend for ensemble learning. One of "quadprog", "nnls", "lbfgsb", or "glmnet". Passed to ensemble_weights. Defaults to "quadprog".
ensemble_alpha: Elastic net mixing parameter, used only when ensemble_solver = "glmnet". Defaults to 1 (lasso).
estimate_pi: If TRUE, estimate spike-and-slab sparsity from mr.ash before running Bayesian alphabet methods that need inclusion probabilities.
verbose: Integer controlling verbosity level: 0 = suppress all messages, 1 = show pecotmr messages but suppress external package messages (default), 2 = show all messages including those from external packages.
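
The ensemble arguments above describe stacked regression over cross-validated predictions. The base R sketch below illustrates the general idea only: it is not the package's ensemble_weights implementation. The simulated predictions, the use of squared correlation as R-squared, and the final normalization to sum to one are all assumptions made for illustration. The box-constrained optim call stands in for the "lbfgsb" solver option.

```r
set.seed(42)
n <- 200
signal <- rnorm(n)
y <- signal + rnorm(n, sd = 0.5)

# Hypothetical out-of-fold predictions from three methods (simulated here)
cv_pred <- cbind(
  method_a = signal + rnorm(n, sd = 0.3),
  method_b = signal + rnorm(n, sd = 0.8),
  method_c = rnorm(n)  # unrelated to y
)

# Step 1: keep methods whose cross-validated R-squared meets the cutoff
# (squared correlation is an assumed definition of R-squared)
ensemble_r2_threshold <- 0.01
r2 <- apply(cv_pred, 2, function(p) cor(p, y)^2)
P <- cv_pred[, r2 >= ensemble_r2_threshold, drop = FALSE]
stopifnot(ncol(P) >= 2)  # stacking needs at least two methods

# Step 2: non-negative least squares via box-constrained L-BFGS-B
loss <- function(w) sum((y - P %*% w)^2)
grad <- function(w) as.vector(-2 * crossprod(P, y - P %*% w))
fit <- optim(rep(1 / ncol(P), ncol(P)), loss, grad,
             method = "L-BFGS-B", lower = 0)

# Step 3: normalize so the combination weights sum to one (an assumption)
w <- fit$par / sum(fit$par)
names(w) <- colnames(P)
round(w, 3)
```

In this toy setup the less noisy predictor should receive the larger weight, which is the behavior stacking is meant to provide.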

Value

A list containing results from the TWAS pipeline, including TWAS weights, predictions, and optionally cross-validation results.

Examples

# Example usage (assuming appropriate objects for X, y, and susie_fit are available):
twas_results <- twas_weights_pipeline(X, y, susie_fit)
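
The example above assumes that X, y, and a SuSiE fit already exist. As a minimal sketch, inputs of the documented shapes can be simulated in base R. The sample and variant names, the effect sizes, and the assumption that the Sample column should hold the row names of X with integer fold labels in Fold are illustrative choices, not requirements confirmed by the package.

```r
set.seed(1)
n <- 100  # samples
p <- 20   # variants

# Genotype dosages (0, 1, 2): rows are samples, columns are variants
X <- matrix(
  rbinom(n * p, size = 2, prob = 0.3),
  nrow = n, ncol = p,
  dimnames = list(paste0("sample_", seq_len(n)),
                  paste0("variant_", seq_len(p)))
)

# Phenotype driven by two variants plus noise (hypothetical effect sizes)
beta <- numeric(p)
beta[c(3, 11)] <- c(0.8, -0.5)
y <- as.vector(X %*% beta + rnorm(n))
names(y) <- rownames(X)

# Fixed partition with the documented Sample and Fold columns
cv_folds <- 5
sample_partition <- data.frame(
  Sample = sample(rownames(X)),
  Fold = rep(seq_len(cv_folds), length.out = n)
)
table(sample_partition$Fold)

# With the package loaded, the call might then look like this (not run here):
# twas_results <- twas_weights_pipeline(X, y, cv_folds = cv_folds,
#                                       sample_partition = sample_partition)
```

Passing a fixed sample_partition instead of relying on the random default keeps fold assignments identical across runs, which makes cross-validation results comparable between methods.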