
This function performs univariate analysis for fine-mapping and Transcriptome-Wide Association Study (TWAS) with optional cross-validation.

Usage

univariate_analysis_pipeline(
  X,
  Y,
  maf,
  X_scalar = 1,
  Y_scalar = 1,
  X_variance = NULL,
  other_quantities = list(),
  imiss_cutoff = 1,
  maf_cutoff = NULL,
  xvar_cutoff = 0,
  ld_reference_meta_file = NULL,
  pip_cutoff_to_skip = 0,
  init_L = 5,
  max_L = 20,
  l_step = 5,
  signal_cutoff = 0.025,
  coverage = c(0.95, 0.7, 0.5),
  min_abs_corr = 0.8,
  finemapping_extra_opts = list(refine = TRUE),
  twas_weights = TRUE,
  sample_partition = NULL,
  max_cv_variants = -1,
  cv_folds = 5,
  cv_threads = 1,
  verbose = 0
)

Arguments

X

A matrix of genotype data where rows represent samples and columns represent genetic variants.

Y

A vector of phenotype measurements.

maf

A vector of minor allele frequencies for each variant in X.

X_scalar

A scalar or vector to rescale X to its original scale.

Y_scalar

A scalar to rescale Y to its original scale.

X_variance

Optional variance of X. Default is NULL.

other_quantities

A list of other quantities to be passed to susie_post_processor. Default is an empty list.

imiss_cutoff

Individual missingness cutoff. Default is 1.0.

maf_cutoff

Minor allele frequency cutoff. Default is NULL.

xvar_cutoff

Variance cutoff for X. Default is 0.

ld_reference_meta_file

An optional path to a file containing linkage disequilibrium reference data. Default is NULL.

pip_cutoff_to_skip

Cutoff value for skipping analysis based on PIP values. Default is 0.

init_L

Initial number of components for SuSiE model optimization. Default is 5.

max_L

The maximum number of components in SuSiE. Default is 20.

l_step

Step size for increasing the number of components during SuSiE optimization. Default is 5.

signal_cutoff

Cutoff value for signal identification in PIP values. Default is 0.025.

coverage

A vector of coverage probabilities for credible sets. Default is c(0.95, 0.7, 0.5).

min_abs_corr

Minimum absolute correlation for credible set purity filtering. Default is 0.8, which is stricter than the susieR default of 0.5.

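finemapping_extra_opts

A list of additional options for fine-mapping. Default is list(refine = TRUE).

twas_weights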

Whether to compute TWAS weights. Default is TRUE.

sample_partition

Sample partition for cross-validation. Default is NULL.

max_cv_variants

The maximum number of variants to be included in cross-validation. Default is -1 (no limit).

cv_folds

The number of folds to use for cross-validation. Default is 5.

cv_threads

The number of threads to use for parallel computation in cross-validation. Default is 1.

verbose

Verbosity level. Default is 0.
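
Several of the arguments above describe quantities that can be computed directly from X. The base R sketch below illustrates three of them on simulated data: minor allele frequencies of the kind passed as maf, a plausible schedule of component counts implied by init_L, l_step, and max_L, and the minimum absolute correlation that min_abs_corr is compared against. It assumes that X holds 0/1/2 allele dosages and that the number of components grows in fixed steps; neither is stated on this page, and the helper names (compute_maf, l_schedule, min_abs_correlation) are hypothetical rather than part of the package.

```r
set.seed(1)

# Simulated genotype matrix: 100 samples x 8 variants, coded as 0/1/2 dosages
# (an assumption; the function may accept other encodings).
n <- 100
p <- 8
true_af <- runif(p, 0.1, 0.9)
X <- sapply(true_af, function(f) rbinom(n, size = 2, prob = f))
colnames(X) <- paste0("variant_", seq_len(p))

# Minor allele frequency: allele frequency folded so it never exceeds 0.5.
compute_maf <- function(X) {
  af <- colMeans(X, na.rm = TRUE) / 2
  pmin(af, 1 - af)
}
maf <- compute_maf(X)

# Candidate numbers of components, ASSUMING L grows from init_L
# in increments of l_step and never exceeds max_L.
l_schedule <- function(init_L = 5, max_L = 20, l_step = 5) {
  seq(init_L, max_L, by = l_step)
}

# Purity of a set of variants: the smallest absolute pairwise correlation.
# A single-variant set is treated as perfectly pure.
min_abs_correlation <- function(X, idx) {
  if (length(idx) < 2) return(1)
  r <- abs(cor(X[, idx, drop = FALSE]))
  min(r[upper.tri(r)])
}

print(round(maf, 3))
print(l_schedule())
print(min_abs_correlation(X, c(1, 2, 3)))
```

Under this reading, a candidate credible set whose minimum absolute correlation falls below min_abs_corr would be filtered out. Independently simulated variants such as these are only weakly correlated, so a set like the one above would not pass a cutoff of 0.8.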

Value

A list containing the univariate analysis results.
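
Examples

The following is a minimal sketch of how the inputs might be assembled and passed to the function, not an example taken from the package. The simulated data, the 0/1/2 dosage coding, and the choice of which arguments to set are all assumptions. The call is guarded so that the script still runs when the function is not available in the session.

```r
set.seed(42)

# Simulated inputs (all values are illustrative).
n <- 200
p <- 30
X <- matrix(rbinom(n * p, size = 2, prob = 0.3), nrow = n, ncol = p)
rownames(X) <- paste0("sample_", seq_len(n))
colnames(X) <- paste0("variant_", seq_len(p))

# Phenotype driven by a single variant plus Gaussian noise.
Y <- 0.5 * X[, 5] + rnorm(n)

# Minor allele frequency per variant, assuming 0/1/2 dosage coding.
af <- colMeans(X) / 2
maf <- pmin(af, 1 - af)

# Only attempt the call if the function is available in the session.
if (exists("univariate_analysis_pipeline", mode = "function")) {
  res <- univariate_analysis_pipeline(
    X = X,
    Y = Y,
    maf = maf,
    coverage = c(0.95, 0.7, 0.5),
    min_abs_corr = 0.8,
    cv_folds = 5
  )
  # Inspect the top-level structure of the returned list.
  str(res, max.level = 1)
}
```

All arguments not shown keep the defaults listed under Usage. Any input requirements beyond those documented above, such as naming or pre-scaling of X and Y, are not covered by this sketch.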