This function performs univariate analysis for fine-mapping and Transcriptome-Wide Association Study (TWAS) with optional cross-validation. Fine-mapping fits SuSiE-inf first and then fits SuSiE initialized from the SuSiE-inf result.

Usage

univariate_analysis_pipeline(
  X,
  Y,
  maf,
  X_scalar = 1,
  Y_scalar = 1,
  X_variance = NULL,
  other_quantities = list(),
  imiss_cutoff = 1,
  maf_cutoff = NULL,
  xvar_cutoff = 0,
  ld_reference_meta_file = NULL,
  pip_cutoff_to_skip = 0,
  L = 20,
  L_greedy = 5,
  signal_cutoff = 0.025,
  coverage = c(0.95, 0.7, 0.5),
  min_abs_corr = 0.8,
  finemapping_extra_opts = list(refine = TRUE),
  twas_weights = TRUE,
  sample_partition = NULL,
  max_cv_variants = -1,
  cv_folds = 5,
  cv_threads = 1,
  verbose = 0
)

Arguments

X

A matrix of genotype data where rows represent samples and columns represent genetic variants.

Y

A vector of phenotype measurements.

maf

A vector of minor allele frequencies for each variant in X.
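
As an illustration of what maf could contain, the following minimal sketch derives minor allele frequencies from a small genotype matrix in base R. It assumes genotypes are coded as 0/1/2 allele counts, which this page does not state; the matrix and its sample and variant names are made up for the example.

```r
# Toy genotype matrix: 6 samples (rows) x 3 variants (columns).
# The 0/1/2 allele-count coding is an assumption of this sketch.
X <- matrix(
  c(0, 1, 2, 0, 1, 0,
    2, 2, 1, 2, 2, 1,
    0, 0, 0, 1, 0, NA),
  nrow = 6,
  dimnames = list(paste0("s", 1:6), paste0("v", 1:3))
)

# Frequency of the counted allele per variant, ignoring missing calls.
allele_freq <- colMeans(X, na.rm = TRUE) / 2

# Fold onto the minor allele so every value lies in [0, 0.5].
maf <- pmin(allele_freq, 1 - allele_freq)
maf
```

The result is a named numeric vector with one entry per column of X, in the same order as the columns.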

X_scalar

A scalar or vector to rescale X to its original scale.

Y_scalar

A scalar to rescale Y to its original scale.

X_variance

Optional variance of X. Default is NULL.

other_quantities

A list of other quantities to be carried into fine-mapping post-processing. Default is an empty list.

imiss_cutoff

Individual missingness cutoff. Default is 1.0.

maf_cutoff

Minor allele frequency cutoff. Default is NULL.

xvar_cutoff

Variance cutoff for X. Default is 0.

ld_reference_meta_file

An optional path to a file containing linkage disequilibrium reference data. Default is NULL.

pip_cutoff_to_skip

Cutoff value for skipping analysis based on PIP values. Default is 0.

L

Maximum number of components in SuSiE. Default is 20.

L_greedy

Initial greedy number of components in SuSiE. Default is 5.

signal_cutoff

Cutoff value for signal identification in PIP values. Default is 0.025.

coverage

A vector of coverage probabilities for credible sets. Default is c(0.95, 0.7, 0.5).

min_abs_corr

Minimum absolute correlation for credible set purity filtering. Default is 0.8, which is stricter than the susieR default of 0.5.
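
To make the purity threshold concrete, the sketch below computes the minimum absolute pairwise correlation within a candidate set of variants using base R only. It is a simplified stand-in for the purity check, not the susieR implementation; the helper purity() and the simulated variants are hypothetical.

```r
set.seed(42)
n <- 200
z <- rnorm(n)

# v1 and v2 share a common signal; v3 is independent noise.
X <- cbind(
  v1 = z + rnorm(n, sd = 0.2),
  v2 = z + rnorm(n, sd = 0.2),
  v3 = rnorm(n)
)

# Hypothetical helper: smallest absolute pairwise correlation in a set.
# A single-variant set is treated as perfectly pure.
purity <- function(X, idx) {
  if (length(idx) < 2) return(1)
  r <- abs(cor(X[, idx, drop = FALSE]))
  min(r[upper.tri(r)])
}

min_abs_corr <- 0.8
tight_set <- c("v1", "v2")        # strongly correlated pair
loose_set <- c("v1", "v2", "v3")  # adds an unrelated variant

purity(X, tight_set) >= min_abs_corr
purity(X, loose_set) >= min_abs_corr
```

A set passes the filter only if every pair of its variants is correlated at or above the threshold, so adding one weakly correlated variant is enough to fail it.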

finemapping_extra_opts

Additional options passed to susieR::susie(). SuSiE-inf is always fitted with refine = FALSE; the ordinary SuSiE fit keeps these options and is initialized with model_init.

twas_weights

Whether to compute TWAS weights. Default is TRUE.

sample_partition

Optional data frame with Sample and Fold columns for cross-validation. Default is NULL.
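
A sample_partition data frame with the two documented columns can be built in base R as sketched below. The sample identifiers are placeholders; that they need to match the row names of X is an assumption, not something stated on this page.

```r
set.seed(1)

# Placeholder sample IDs; assumed to correspond to the rows of X.
sample_ids <- paste0("s", 1:10)
n_folds <- 5

# Assign folds in balanced fashion, then shuffle the assignment.
fold <- sample(rep(seq_len(n_folds), length.out = length(sample_ids)))

sample_partition <- data.frame(
  Sample = sample_ids,
  Fold = fold,
  stringsAsFactors = FALSE
)

# Number of samples in each fold.
table(sample_partition$Fold)
```

Fixing the partition this way keeps fold membership identical across repeated runs, which makes cross-validation results comparable.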

max_cv_variants

The maximum number of variants to be included in cross-validation. Default is -1 (no limit).

cv_folds

The number of folds to use for cross-validation. Default is 5.

cv_threads

The number of threads to use for parallel computation in cross-validation. Default is 1.

verbose

Verbosity level. Default is 0.

Value

A list containing the univariate analysis results.
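
The following sketch shows how the documented arguments might fit together on simulated data. The 0/1/2 genotype coding, the sample and variant names, and the simulated effect are all assumptions; any naming or format requirements the function places on X beyond what is documented above are not known here. The call is guarded so the block runs even when the function is not loaded.

```r
set.seed(2024)
n <- 100  # samples
p <- 20   # variants

# Simulated genotypes coded as 0/1/2 allele counts (assumed coding),
# with placeholder sample and variant names.
X <- matrix(
  rbinom(n * p, size = 2, prob = 0.3),
  nrow = n,
  dimnames = list(paste0("sample", seq_len(n)), paste0("variant", seq_len(p)))
)

# Phenotype driven by the third variant plus noise.
Y <- as.numeric(0.5 * X[, 3] + rnorm(n))

# Minor allele frequency for each column of X.
af <- colMeans(X) / 2
maf <- pmin(af, 1 - af)

# Run the pipeline only if the function is available in this session.
if (exists("univariate_analysis_pipeline", mode = "function")) {
  res <- univariate_analysis_pipeline(
    X = X,
    Y = Y,
    maf = maf,
    L = 10,
    twas_weights = FALSE  # skip TWAS weights and cross-validation
  )
  str(res, max.level = 1)  # inspect the top-level elements of the list
}
```

Setting twas_weights = FALSE keeps the run limited to fine-mapping, which is a reasonable first check before enabling cross-validation.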