Multivariate Fine-Mapping with mvSuSiE and mr.mash#
Multivariate fine-mapping using mvSuSiE and mr.mash is also available in our pipeline.
Input#
--genoFile
: path to a text file contatining information on genotype files. For example:
#id #path
21 $PATH/protocol_example.genotype.chr21_22.21.bed
22 $PATH/protocol_example.genotype.chr21_22.22.bed
--phenoFile
: a tab delimited file containing chr, start, end, ID and path for the regions. For example:
#chr start end ID path
chr21 0 14120807 TADB_1297 $PATH/protocol_example.ha.bed.gz
chr21 10840000 16880069 TADB_1298 $PATH/protocol_example.ha.bed.gz
--covFile
: path to a gzipped file containing covariates in the rows, and sample ids in the columns.
--customized-association-windows
: a tab delimited file containing chr, start, end, and ID regions. For example:
#chr start end ID
chr21 0 14120807 TADB_1297
chr21 10840000 16880069 TADB_1298
--region-name
: if you only wish to analyze one region, then include the ID of a region found in the customized-association-windows
file
--mixture_prior
: rds file from mr.mash
Output#
*.multicontext_data.rds
> str(readRDS("/restricted/projectnb/xqtl/xqtl_protocol/toy_xqtl_protocol/output/mnm_old/data/test_mnm.chr11_ENSG00000073921.multicontext_data.rds"))
List of 8
$ residual_Y : num [1:150, 1:6] 0.1241 0.2981 -0.6054 0.0435 0.0388 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:150] "sample0" "sample1" "sample10" "sample100" ...
.. ..$ : chr [1:6] "Ast" "Exc" "Inh" "Mic" ...
$ residual_Y_scalar: num [1:6] 1 1 1 1 1 1
$ dropped_sample :List of 3
..$ X :List of 6
.. ..$ : chr(0)
.. ..$ : chr(0)
.. ..$ : chr(0)
.. ..$ : chr(0)
.. ..$ : chr(0)
.. ..$ : chr(0)
..$ Y :List of 6
.. ..$ : chr "strand"
.. ..$ : chr "strand"
.. ..$ : chr "strand"
.. ..$ : chr "strand"
.. ..$ : chr "strand"
.. ..$ : chr "strand"
..$ covar:List of 6
.. ..$ : chr(0)
.. ..$ : chr(0)
.. ..$ : chr(0)
.. ..$ : chr(0)
.. ..$ : chr(0)
.. ..$ : chr(0)
$ X : num [1:150, 1:9970] 0 0 1 1 1 0 0 1 1 1 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:150] "sample0" "sample1" "sample10" "sample100" ...
.. ..$ : chr [1:9970] "chr11:84957209:G:C" "chr11:84957879:A:G" "chr11:84958236:C:A" "chr11:84958954:T:C" ...
$ maf : Named num [1:9970] 0.33 0.327 0.103 0.343 0.343 ...
..- attr(*, "names")= chr [1:9970] "chr11:84957209:G:C" "chr11:84957879:A:G" "chr11:84958236:C:A" "chr11:84958954:T:C" ...
$ chrom : chr "chr11"
$ grange : chr [1:2] "85957174" "86069881"
$ X_variance : Named num [1:9970] 0.441 0.443 0.192 0.458 0.458 ...
..- attr(*, "names")= chr [1:9970] "chr11:84957209:G:C" "chr11:84957879:A:G" "chr11:84958236:C:A" "chr11:84958954:T:C" ...
*.multicontext_bvsr.rds
> str(readRDS("/restricted/projectnb/xqtl/xqtl_protocol/toy_xqtl_protocol/output/mnm_old/multivariate_fine_mapping/test_mnm.chr11_ENSG00000073921.multicontext_bvsr.rds"),max.level=2)
List of 1
$ ENSG00000073921:List of 10
..$ mrmash_fitted :List of 14
.. ..- attr(*, "class")= chr [1:2] "mr.mash" "list"
..$ reweighted_mixture_prior :Classes 'MashInitializer', 'R6' <MashInitializer>
Public:
clone: function (deep = FALSE)
compute_prior_inv: function ()
initialize: function (Ulist, grid, prior_weights = NULL, null_weight = 0,
n_component: active binding
n_condition: active binding
precompute_cov_matrices: function (d, algorithm = c("R", "cpp"))
precomputed: active binding
prior_variance: active binding
remove_precomputed: function ()
scale_prior_variance: function (sigma)
Private:
inv_mats: NULL
U: NULL
xU: list
..$ reweighted_mixture_prior_cv:List of 5
..$ mvsusie_fitted :List of 28
.. ..- attr(*, "class")= chr "mvsusie"
..$ variant_names : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
..$ analysis_script : chr "\nlibrary(pecotmr)\n\nphenotype_files = c(\"/restricted/projectnb/xqtl/xqtl_protocol/toy_xqtl_protocol/output/p"| __truncated__
..$ other_quantities :List of 1
..$ context_names : chr [1:6] "Ast" "Exc" "Inh" "Mic" ...
..$ total_time_elapsed : 'proc_time' Named num [1:5] 104.38 0.653 105.828 0 0
.. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
..$ region_info :List of 3
*.multicontext_twas_weights.rds
> str(readRDS("/restricted/projectnb/xqtl/xqtl_protocol/toy_xqtl_protocol/output/mnm_old/multivariate_twas_weights/test_mnm.chr11_ENSG00000073921.multicontext_twas_weights.rds"), max.level=3)
List of 1
$ ENSG00000073921:List of 6
..$ Ast_ENSG00000073921:List of 6
.. ..$ twas_weights :List of 2
.. ..$ twas_predictions :List of 2
.. ..$ variant_names : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. ..$ twas_cv_result :List of 4
.. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 1645.11 3.4 1652.73 0.01 0.02
.. .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
.. ..$ region_info :List of 3
..$ Exc_ENSG00000073921:List of 6
.. ..$ twas_weights :List of 2
.. ..$ twas_predictions :List of 2
.. ..$ variant_names : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. ..$ twas_cv_result :List of 4
.. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 1645.11 3.4 1652.73 0.01 0.02
.. .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
.. ..$ region_info :List of 3
..$ Inh_ENSG00000073921:List of 6
.. ..$ twas_weights :List of 2
.. ..$ twas_predictions :List of 2
.. ..$ variant_names : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. ..$ twas_cv_result :List of 4
.. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 1645.11 3.4 1652.73 0.01 0.02
.. .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
.. ..$ region_info :List of 3
..$ Mic_ENSG00000073921:List of 6
.. ..$ twas_weights :List of 2
.. ..$ twas_predictions :List of 2
.. ..$ variant_names : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. ..$ twas_cv_result :List of 4
.. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 1645.11 3.4 1652.73 0.01 0.02
.. .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
.. ..$ region_info :List of 3
..$ OPC_ENSG00000073921:List of 6
.. ..$ twas_weights :List of 2
.. ..$ twas_predictions :List of 2
.. ..$ variant_names : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. ..$ twas_cv_result :List of 4
.. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 1645.11 3.4 1652.73 0.01 0.02
.. .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
.. ..$ region_info :List of 3
..$ Oli_ENSG00000073921:List of 6
.. ..$ twas_weights :List of 2
.. ..$ twas_predictions :List of 2
.. ..$ variant_names : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. ..$ twas_cv_result :List of 4
.. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 1645.11 3.4 1652.73 0.01 0.02
.. .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
.. ..$ region_info :List of 3
Minimal Working Example Steps#
iv. Run the Fine-Mapping with mvSuSiE#
sos run pipeline/mnm_regression.ipynb mnm \
--name test_mnm --cwd output/mnm \
--genoFile output/genotype_by_chrom/wgs.merged.plink_qc.genotype_by_chrom_files.txt \
--phenoFile output/phenotype/phenotype_by_chrom_for_cis/bulk_rnaseq.phenotype_by_chrom_files.region_list.txt \
output/phenotype/phenotype_by_chrom_for_cis/bulk_rnaseq.phenotype_by_chrom_files.region_list.txt \
output/phenotype/phenotype_by_chrom_for_cis/bulk_rnaseq.phenotype_by_chrom_files.region_list.txt \
output/phenotype/phenotype_by_chrom_for_cis/bulk_rnaseq.phenotype_by_chrom_files.region_list.txt \
output/phenotype/phenotype_by_chrom_for_cis/bulk_rnaseq.phenotype_by_chrom_files.region_list.txt \
output/phenotype/phenotype_by_chrom_for_cis/bulk_rnaseq.phenotype_by_chrom_files.region_list.txt \
--covFile output/covariate/bulk_rnaseq_tpm_matrix.low_expression_filtered.outlier_removed.tmm.expression.covariates.wgs.merged.plink_qc.plink_qc.prune.pca.Marchenko_PC.gz \
output/covariate/bulk_rnaseq_tpm_matrix.low_expression_filtered.outlier_removed.tmm.expression.covariates.wgs.merged.plink_qc.plink_qc.prune.pca.Marchenko_PC.gz \
output/covariate/bulk_rnaseq_tpm_matrix.low_expression_filtered.outlier_removed.tmm.expression.covariates.wgs.merged.plink_qc.plink_qc.prune.pca.Marchenko_PC.gz \
output/covariate/bulk_rnaseq_tpm_matrix.low_expression_filtered.outlier_removed.tmm.expression.covariates.wgs.merged.plink_qc.plink_qc.prune.pca.Marchenko_PC.gz \
output/covariate/bulk_rnaseq_tpm_matrix.low_expression_filtered.outlier_removed.tmm.expression.covariates.wgs.merged.plink_qc.plink_qc.prune.pca.Marchenko_PC.gz \
output/covariate/bulk_rnaseq_tpm_matrix.low_expression_filtered.outlier_removed.tmm.expression.covariates.wgs.merged.plink_qc.plink_qc.prune.pca.Marchenko_PC.gz \
--customized-association-windows reference_data/TAD/TADB_enhanced_cis.bed \
--region-name ENSG00000073921 --save-data --no-skip-twas-weights \
--phenotype-names Ast Exc Inh Mic OPC Oli \
--mixture_prior output/multivariate_mixture/MWE_ed_bovy.EE.prior.rds \
--max_cv_variants 5000 \
--ld_reference_meta_file data/ld_meta_file_with_bim.tsv
Anticipated Results#
For each gene, multivariate finemapping will produce a file containing results for the top hits and a file containing twas weights produced by susie.
test_mnm.chr11_ENSG00000073921.multicontext_bvsr.rds
:
For each gene of interest, this file contains:
mrmash_fitted
reweighted_mixture_prior
reweighted_mixture_prior_cv
mvsusie_fitted
variant_names
analysis_script
other_quantities
context_names
top_loci
susie_result_trimmed
total_time_elapsed
region_info
test_mnm.chr11_ENSG00000073921.multicontext_data.rds
: (from the –save-data argument)
see pecotmr code for description
test_mnm.chr11_ENSG00000073921.multicontext_twas_weights.rds
:
For each gene of interest and phenotype, this file contains:
twas_weights - weights mrmash and mvsusie methods
twas_predictions - twas predictions for mrmash and mvsusie methods
variant_names
twas_cv_result
total_time_elapsed
region_info