This function computes Transcriptome-Wide Association Study (TWAS) weights by fitting mvSuSiE and mr.mash models. Optionally, the TWAS weights can be computed with cross-validation using only a limited number of variants selected by mvSuSiE fine-mapping.
Usage
multivariate_analysis_pipeline(
X,
Y,
maf,
X_variance = NULL,
other_quantities = list(),
imiss_cutoff = 1,
maf_cutoff = 0.01,
xvar_cutoff = 0.01,
ld_reference_meta_file = NULL,
pip_cutoff_to_skip = 0,
max_L = -1,
data_driven_prior_matrices = NULL,
data_driven_prior_matrices_cv = NULL,
data_driven_prior_weights_cutoff = 1e-04,
canonical_prior_matrices = TRUE,
mrmash_max_iter = 5000,
mvsusie_max_iter = 200,
signal_cutoff = 0.025,
coverage = c(0.95, 0.7, 0.5),
min_abs_corr = 0.8,
twas_weights = TRUE,
sample_partition = NULL,
max_cv_variants = -1,
cv_folds = 5,
cv_threads = 1,
verbose = 0
)
Arguments
- X
A matrix of genotype data where rows represent samples and columns represent genetic variants.
- Y
A matrix of phenotype measurements where rows represent samples and columns represent conditions.
- maf
A list of vectors for minor allele frequencies for each variant in X.
- ld_reference_meta_file
An optional path to a file containing linkage disequilibrium reference data. If provided, variants in X are filtered based on this reference.
- pip_cutoff_to_skip
Cutoff value for skipping conditions based on PIP values. Default is 0.
- max_L
The maximum number of components in mvSuSiE. The usage above sets the default to -1.
- data_driven_prior_matrices
A list of data-driven covariance matrices for mr.mash weights.
- data_driven_prior_matrices_cv
A list of data-driven covariance matrices for mr.mash weights in cross-validation.
- data_driven_prior_weights_cutoff
The minimum weight for prior covariance matrices. Default is 1e-4.
- canonical_prior_matrices
If set to TRUE, will compute canonical covariance matrices and add them into the prior covariance matrix list in mrmash_wrapper. Default is TRUE.
- mrmash_max_iter
The maximum number of iterations for mr.mash. Default is 5000.
- mvsusie_max_iter
The maximum number of iterations for mvSuSiE. Default is 200.
- signal_cutoff
Cutoff value for signal identification in PIP values for susie_post_processor. Default is 0.025.
- coverage
A vector of coverage probabilities, with the first element being the primary coverage and the rest being secondary coverage probabilities for credible set refinement. Defaults to c(0.95, 0.7, 0.5).
- min_abs_corr
Minimum absolute correlation for credible set purity filtering. Default is 0.8, which is stricter than the susieR default of 0.5.
- sample_partition
Sample partition for cross-validation.
- max_cv_variants
The maximum number of variants to be included in cross-validation. Defaults to -1 which means no limit.
- cv_folds
The number of folds to use for cross-validation. Set to 0 to skip cross-validation. Default is 5.
- cv_threads
The number of threads to use for parallel computation in cross-validation. Defaults to 1.
- verbose
Verbosity level. Default is 0.
- min_cv_maf
The minimum minor allele frequency for variants to be included in cross-validation. Default is 0.05.
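The `min_abs_corr` argument filters credible sets by purity. As a rough illustration, the sketch below assumes purity is the smallest absolute pairwise correlation among the variants in a credible set, which is how susieR defines it. The helper `cs_purity` and the toy genotype vectors are hypothetical and are not part of pecotmr.

```r
# Hypothetical helper (not a pecotmr function): purity of a credible set,
# taken as the smallest absolute pairwise correlation among its variants.
cs_purity <- function(X, cs_idx) {
  if (length(cs_idx) < 2) return(1)  # a single-variant set is treated as pure
  R <- abs(cor(X[, cs_idx, drop = FALSE]))
  min(R[upper.tri(R)])
}

# Toy dosage vectors for 100 samples (deterministic, for illustration only).
g1 <- rep(c(0, 1, 2, 1), 25)
g2 <- g1
g2[1] <- 1                      # nearly identical to g1
g3 <- rep(c(1, 0, 1, 2), 25)    # uncorrelated with g1 by construction
X_toy <- cbind(g1, g2, g3)

purity_tight <- cs_purity(X_toy, c(1, 2))     # two tightly linked variants
purity_loose <- cs_purity(X_toy, c(1, 2, 3))  # includes an unlinked variant

# Apply the documented default threshold of 0.8.
keep_tight <- purity_tight >= 0.8
keep_loose <- purity_loose >= 0.8
```

Under this assumption, the set of two tightly correlated variants passes the 0.8 threshold, while adding the uncorrelated variant drops purity to about zero and the set would be filtered out.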
Examples
library(pecotmr)
data(multitrait_data)
attach(multitrait_data)
data_driven_prior_matrices <- list(
U = prior_matrices,
w = rep(1 / length(prior_matrices), length(prior_matrices))
)
data_driven_prior_matrices_cv <- lapply(prior_matrices_cv, function(x) {
list(U = x, w = rep(1 / length(x), length(x)))
})
# maf must lie between 0 and 1. Assuming X stores 0/1/2 allele dosages,
# the column mean divided by 2 is the allele frequency; fold it to the minor allele.
af <- colMeans(multitrait_data$X) / 2
result <- multivariate_analysis_pipeline(
X = multitrait_data$X,
Y = multitrait_data$Y,
maf = pmin(af, 1 - af),
X_variance = multitrait_data$X_variance,
max_L = 10,
ld_reference_meta_file = NULL,
max_cv_variants = -1,
pip_cutoff_to_skip = 0,
signal_cutoff = 0.025,
data_driven_prior_matrices = data_driven_prior_matrices,
data_driven_prior_matrices_cv = data_driven_prior_matrices_cv,
canonical_prior_matrices = TRUE,
sample_partition = NULL,
cv_folds = 5,
cv_threads = 2,
data_driven_prior_weights_cutoff = 1e-4
)
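The function requires `maf` values between 0 and 1, so raw column means of a dosage matrix are not suitable on their own. The following is a minimal sketch of deriving minor allele frequencies, assuming genotypes are coded as 0/1/2 allele dosages. The helper `compute_maf` and the toy matrix are hypothetical and are not part of pecotmr; if `X` has been centered or scaled, this calculation does not apply.

```r
# Hypothetical helper (not a pecotmr function): minor allele frequency per
# variant, assuming X holds 0/1/2 allele dosages (samples in rows).
compute_maf <- function(X) {
  af <- colMeans(X, na.rm = TRUE) / 2  # mean dosage / 2 = allele frequency
  pmin(af, 1 - af)                     # fold so values lie in [0, 0.5]
}

# Toy dosage matrix: 4 samples, 3 variants (labels are illustrative).
X_toy <- cbind(
  v1 = c(0, 1, 2, 1),
  v2 = c(2, 2, 2, 1),
  v3 = c(0, 0, 1, 0)
)
maf_toy <- compute_maf(X_toy)
```

Folding with `pmin` keeps the result at or below 0.5 regardless of which allele the dosage counts.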