Multivariate Fine-Mapping with mvSuSiE and mr.mash

Multivariate Fine-Mapping with mvSuSiE and mr.mash#

Multivariate fine-mapping using mvSuSiE and mr.mash is also available in our pipeline.

Input#

--genoFile: path to a text file contatining information on genotype files. For example:

#id     #path
21      $PATH/protocol_example.genotype.chr21_22.21.bed
22      $PATH/protocol_example.genotype.chr21_22.22.bed

--phenoFile: a tab delimited file containing chr, start, end, ID and path for the regions. For example:

#chr    start   end     ID      path
chr21   0       14120807        TADB_1297       $PATH/protocol_example.ha.bed.gz
chr21   10840000        16880069        TADB_1298       $PATH/protocol_example.ha.bed.gz

--covFile: path to a gzipped file containing covariates in the rows, and sample ids in the columns.
--customized-association-windows: a tab delimited file containing chr, start, end, and ID regions. For example:

#chr    start   end     ID
chr21   0       14120807        TADB_1297
chr21   10840000        16880069        TADB_1298

--region-name: if you only wish to analyze one region, then include the ID of a region found in the customized-association-windows file

--mixture_prior: rds file from mr.mash

Output#

  • *.multicontext_data.rds

> str(readRDS("/restricted/projectnb/xqtl/xqtl_protocol/toy_xqtl_protocol/output/mnm_old/data/test_mnm.chr11_ENSG00000073921.multicontext_data.rds"))
List of 8
 $ residual_Y       : num [1:150, 1:6] 0.1241 0.2981 -0.6054 0.0435 0.0388 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:150] "sample0" "sample1" "sample10" "sample100" ...
  .. ..$ : chr [1:6] "Ast" "Exc" "Inh" "Mic" ...
 $ residual_Y_scalar: num [1:6] 1 1 1 1 1 1
 $ dropped_sample   :List of 3
  ..$ X    :List of 6
  .. ..$ : chr(0) 
  .. ..$ : chr(0) 
  .. ..$ : chr(0) 
  .. ..$ : chr(0) 
  .. ..$ : chr(0) 
  .. ..$ : chr(0) 
  ..$ Y    :List of 6
  .. ..$ : chr "strand"
  .. ..$ : chr "strand"
  .. ..$ : chr "strand"
  .. ..$ : chr "strand"
  .. ..$ : chr "strand"
  .. ..$ : chr "strand"
  ..$ covar:List of 6
  .. ..$ : chr(0) 
  .. ..$ : chr(0) 
  .. ..$ : chr(0) 
  .. ..$ : chr(0) 
  .. ..$ : chr(0) 
  .. ..$ : chr(0) 
 $ X                : num [1:150, 1:9970] 0 0 1 1 1 0 0 1 1 1 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:150] "sample0" "sample1" "sample10" "sample100" ...
  .. ..$ : chr [1:9970] "chr11:84957209:G:C" "chr11:84957879:A:G" "chr11:84958236:C:A" "chr11:84958954:T:C" ...
 $ maf              : Named num [1:9970] 0.33 0.327 0.103 0.343 0.343 ...
  ..- attr(*, "names")= chr [1:9970] "chr11:84957209:G:C" "chr11:84957879:A:G" "chr11:84958236:C:A" "chr11:84958954:T:C" ...
 $ chrom            : chr "chr11"
 $ grange           : chr [1:2] "85957174" "86069881"
 $ X_variance       : Named num [1:9970] 0.441 0.443 0.192 0.458 0.458 ...
  ..- attr(*, "names")= chr [1:9970] "chr11:84957209:G:C" "chr11:84957879:A:G" "chr11:84958236:C:A" "chr11:84958954:T:C" ...
  • *.multicontext_bvsr.rds

> str(readRDS("/restricted/projectnb/xqtl/xqtl_protocol/toy_xqtl_protocol/output/mnm_old/multivariate_fine_mapping/test_mnm.chr11_ENSG00000073921.multicontext_bvsr.rds"),max.level=2)
List of 1
 $ ENSG00000073921:List of 10
  ..$ mrmash_fitted              :List of 14
  .. ..- attr(*, "class")= chr [1:2] "mr.mash" "list"
  ..$ reweighted_mixture_prior   :Classes 'MashInitializer', 'R6' <MashInitializer>
  Public:
    clone: function (deep = FALSE) 
    compute_prior_inv: function () 
    initialize: function (Ulist, grid, prior_weights = NULL, null_weight = 0, 
    n_component: active binding
    n_condition: active binding
    precompute_cov_matrices: function (d, algorithm = c("R", "cpp")) 
    precomputed: active binding
    prior_variance: active binding
    remove_precomputed: function () 
    scale_prior_variance: function (sigma) 
  Private:
    inv_mats: NULL
    U: NULL
    xU: list 
  ..$ reweighted_mixture_prior_cv:List of 5
  ..$ mvsusie_fitted             :List of 28
  .. ..- attr(*, "class")= chr "mvsusie"
  ..$ variant_names              : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
  ..$ analysis_script            : chr "\nlibrary(pecotmr)\n\nphenotype_files = c(\"/restricted/projectnb/xqtl/xqtl_protocol/toy_xqtl_protocol/output/p"| __truncated__
  ..$ other_quantities           :List of 1
  ..$ context_names              : chr [1:6] "Ast" "Exc" "Inh" "Mic" ...
  ..$ total_time_elapsed         : 'proc_time' Named num [1:5] 104.38 0.653 105.828 0 0
  .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
  ..$ region_info                :List of 3
  • *.multicontext_twas_weights.rds

> str(readRDS("/restricted/projectnb/xqtl/xqtl_protocol/toy_xqtl_protocol/output/mnm_old/multivariate_twas_weights/test_mnm.chr11_ENSG00000073921.multicontext_twas_weights.rds"), max.level=3)
List of 1
 $ ENSG00000073921:List of 6
  ..$ Ast_ENSG00000073921:List of 6
  .. ..$ twas_weights      :List of 2
  .. ..$ twas_predictions  :List of 2
  .. ..$ variant_names     : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
  .. ..$ twas_cv_result    :List of 4
  .. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 1645.11 3.4 1652.73 0.01 0.02
  .. .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
  .. ..$ region_info       :List of 3
  ..$ Exc_ENSG00000073921:List of 6
  .. ..$ twas_weights      :List of 2
  .. ..$ twas_predictions  :List of 2
  .. ..$ variant_names     : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
  .. ..$ twas_cv_result    :List of 4
  .. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 1645.11 3.4 1652.73 0.01 0.02
  .. .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
  .. ..$ region_info       :List of 3
  ..$ Inh_ENSG00000073921:List of 6
  .. ..$ twas_weights      :List of 2
  .. ..$ twas_predictions  :List of 2
  .. ..$ variant_names     : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
  .. ..$ twas_cv_result    :List of 4
  .. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 1645.11 3.4 1652.73 0.01 0.02
  .. .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
  .. ..$ region_info       :List of 3
  ..$ Mic_ENSG00000073921:List of 6
  .. ..$ twas_weights      :List of 2
  .. ..$ twas_predictions  :List of 2
  .. ..$ variant_names     : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
  .. ..$ twas_cv_result    :List of 4
  .. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 1645.11 3.4 1652.73 0.01 0.02
  .. .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
  .. ..$ region_info       :List of 3
  ..$ OPC_ENSG00000073921:List of 6
  .. ..$ twas_weights      :List of 2
  .. ..$ twas_predictions  :List of 2
  .. ..$ variant_names     : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
  .. ..$ twas_cv_result    :List of 4
  .. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 1645.11 3.4 1652.73 0.01 0.02
  .. .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
  .. ..$ region_info       :List of 3
  ..$ Oli_ENSG00000073921:List of 6
  .. ..$ twas_weights      :List of 2
  .. ..$ twas_predictions  :List of 2
  .. ..$ variant_names     : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
  .. ..$ twas_cv_result    :List of 4
  .. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 1645.11 3.4 1652.73 0.01 0.02
  .. .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
  .. ..$ region_info       :List of 3

Minimal Working Example Steps#

iv. Run the Fine-Mapping with mvSuSiE#

sos run pipeline/mnm_regression.ipynb mnm \
    --name test_mnm --cwd output/mnm \
    --genoFile output/genotype_by_chrom/wgs.merged.plink_qc.genotype_by_chrom_files.txt \
    --phenoFile output/phenotype/phenotype_by_chrom_for_cis/bulk_rnaseq.phenotype_by_chrom_files.region_list.txt \
    output/phenotype/phenotype_by_chrom_for_cis/bulk_rnaseq.phenotype_by_chrom_files.region_list.txt \
    output/phenotype/phenotype_by_chrom_for_cis/bulk_rnaseq.phenotype_by_chrom_files.region_list.txt \
    output/phenotype/phenotype_by_chrom_for_cis/bulk_rnaseq.phenotype_by_chrom_files.region_list.txt \
    output/phenotype/phenotype_by_chrom_for_cis/bulk_rnaseq.phenotype_by_chrom_files.region_list.txt \
    output/phenotype/phenotype_by_chrom_for_cis/bulk_rnaseq.phenotype_by_chrom_files.region_list.txt \
    --covFile output/covariate/bulk_rnaseq_tpm_matrix.low_expression_filtered.outlier_removed.tmm.expression.covariates.wgs.merged.plink_qc.plink_qc.prune.pca.Marchenko_PC.gz \
    output/covariate/bulk_rnaseq_tpm_matrix.low_expression_filtered.outlier_removed.tmm.expression.covariates.wgs.merged.plink_qc.plink_qc.prune.pca.Marchenko_PC.gz \
    output/covariate/bulk_rnaseq_tpm_matrix.low_expression_filtered.outlier_removed.tmm.expression.covariates.wgs.merged.plink_qc.plink_qc.prune.pca.Marchenko_PC.gz \
    output/covariate/bulk_rnaseq_tpm_matrix.low_expression_filtered.outlier_removed.tmm.expression.covariates.wgs.merged.plink_qc.plink_qc.prune.pca.Marchenko_PC.gz \
    output/covariate/bulk_rnaseq_tpm_matrix.low_expression_filtered.outlier_removed.tmm.expression.covariates.wgs.merged.plink_qc.plink_qc.prune.pca.Marchenko_PC.gz \
    output/covariate/bulk_rnaseq_tpm_matrix.low_expression_filtered.outlier_removed.tmm.expression.covariates.wgs.merged.plink_qc.plink_qc.prune.pca.Marchenko_PC.gz \
    --customized-association-windows reference_data/TAD/TADB_enhanced_cis.bed \
    --region-name ENSG00000073921 --save-data --no-skip-twas-weights \
    --phenotype-names Ast Exc Inh Mic OPC Oli \
    --mixture_prior output/multivariate_mixture/MWE_ed_bovy.EE.prior.rds \
    --max_cv_variants 5000 \
	--ld_reference_meta_file data/ld_meta_file_with_bim.tsv 

Anticipated Results#

For each gene, multivariate finemapping will produce a file containing results for the top hits and a file containing twas weights produced by susie.

test_mnm.chr11_ENSG00000073921.multicontext_bvsr.rds:

  • For each gene of interest, this file contains:

    1. mrmash_fitted

    2. reweighted_mixture_prior

    3. reweighted_mixture_prior_cv

    4. mvsusie_fitted

    5. variant_names

    6. analysis_script

    7. other_quantities

    8. context_names

    9. top_loci

    10. susie_result_trimmed

    11. total_time_elapsed

    12. region_info

test_mnm.chr11_ENSG00000073921.multicontext_data.rds: (from the –save-data argument)

test_mnm.chr11_ENSG00000073921.multicontext_twas_weights.rds:

  • For each gene of interest and phenotype, this file contains:

    1. twas_weights - weights mrmash and mvsusie methods

    2. twas_predictions - twas predictions for mrmash and mvsusie methods

    3. variant_names

    4. twas_cv_result

    5. total_time_elapsed

    6. region_info