Multivariate Fine-Mapping for multiple genes#

Description#

The pipeline can also perform multi-gene fine-mapping and TWAS. In this mode, multiple genes are analyzed jointly within specified TAD windows.

This step is similar to multivariate fine-mapping, with two main differences. 1) TAD windows containing multiple genes need to be defined. The --pheno_id_map_file parameter is used for this. 2) To speed up the analysis, genes are filtered out if they do not have a univariate fine-mapped region. Genes may also be filtered out if they do have a univariate fine-mapped signal but that signal is far from the signals of the other genes. The --skip-analysis-pip-cutoff parameter is used for this.

Input#

--genoFile: path to a PLINK bed file containing genotypes. Include the .bed extension.

--phenoFile: a tab delimited file containing chr, start, end, ID and path for the regions. For example:

#chr    start   end     ID      path
chr12   389319  389320  ENSG00000073614 $PATH/snuc_pseudo_bulk.Mic.mega.normalized.log2cpm.bed.gz
chr12   752578  752579  ENSG00000060237 $PATH/snuc_pseudo_bulk.Mic.mega.normalized.log2cpm.bed.gz

--covFile: path to a gzipped file containing covariates in the rows, and sample ids in the columns.

--customized-association-windows: a tab delimited file containing chr, start, end, and ID regions. For example:

#chr    start   end     TAD_id
chr1    0       10985501        chr1_0_10985501
chr1    5101787 11630758        chr1_5101787_11630758
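Before a long run it can be worth checking that the windows file is well formed. The following is only a sketch and not part of the pipeline: the function name check_windows and the temporary demo file are hypothetical, and it assumes four tab-delimited columns with "#"-prefixed header lines, as in the example above.

```shell
# Hypothetical helper (not part of the pipeline): report malformed rows in a
# tab-delimited windows file with columns chr, start, end, TAD_id.
check_windows() {
    awk -F'\t' '
        /^#/ { next }                              # skip header/comment lines
        NF != 4 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ || $2 + 0 >= $3 + 0 {
            print "line " NR ": " $0               # show the offending row
            bad = 1
        }
        END { exit bad }                           # non-zero exit if any row failed
    ' "$1"
}

# Demo file built from the two example rows above.
windows_demo=$(mktemp)
printf '#chr\tstart\tend\tTAD_id\n' > "$windows_demo"
printf 'chr1\t0\t10985501\tchr1_0_10985501\n' >> "$windows_demo"
printf 'chr1\t5101787\t11630758\tchr1_5101787_11630758\n' >> "$windows_demo"

check_windows "$windows_demo" && echo "windows file looks OK"
```

The function returns a non-zero exit status when any row fails, so it can be used as a guard in front of the pipeline command.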

--phenotype-names: the names of the phenotypes if multiple are included. There should be one for each phenotype file you include.

--max-cv-variants: maximum number of variants for cross-validation.

--ld_reference_meta_file: path to file containing chrom, start, end and path for linkage disequilibrium region information. For example:

#chrom  start   end     path
chr1    101384274       104443097       chr1/chr1_101384274_104443097.cor.xz,chr1/chr1_101384274_104443097.cor.xz.bim
chr1    104443097       106225286       chr1/chr1_104443097_106225286.cor.xz,chr1/chr1_104443097_106225286.cor.xz.bim

--independent_variant_list: a gzipped file containing variant information. These should be independent from one another in terms of linkage disequilibrium. For example:

chrom   pos     alt     ref     variant_id
chr1    16206   T       A       chr1:16206:T:A
chr1    16433   C       G       chr1:16433:C:G
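In the example rows, variant_id is the first four columns joined with colons. The sketch below checks whether a file follows the same pattern. It is not part of the pipeline: the function name check_variant_ids and the demo file are hypothetical, and the chrom:pos:alt:ref convention is inferred from the two example rows only.

```shell
# Hypothetical helper (not part of the pipeline): flag rows of a gzipped variant
# list whose variant_id differs from chrom:pos:alt:ref (pattern inferred from
# the example rows; an assumption, not a documented requirement).
check_variant_ids() {
    gzip -dc "$1" | awk '
        NR == 1 { next }                                   # skip the header row
        $5 != $1 ":" $2 ":" $3 ":" $4 { print "line " NR ": " $0; bad = 1 }
        END { exit bad }                                   # non-zero exit on any mismatch
    '
}

# Demo file built from the two example rows above.
variants_demo=$(mktemp)
{
    printf 'chrom\tpos\talt\tref\tvariant_id\n'
    printf 'chr1\t16206\tT\tA\tchr1:16206:T:A\n'
    printf 'chr1\t16433\tC\tG\tchr1:16433:C:G\n'
} | gzip -c > "$variants_demo"

check_variant_ids "$variants_demo" && echo "variant IDs are consistent"
```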

--fine_mapping_meta: A file containing a list of gene and region information and other conditions. For example:

#chr    start   end     region_id       TSS     original_data   combined_data   combined_data_sumstats  conditions      conditions_top_loci
chr1    0       6480000 ENSG00000008128 1724356 KNIGHT_pQTL.ENSG00000008128.univariate_susie_twas_weights.rds,MiGA_eQTL.ENSG00000008128.univariate_susie_twas_weights.rds,MSBB_eQTL.ENSG00000008128.univariate_susie_twas_weights.rds,ROSMAP_Bennett_Klein_pQTL.ENSG00000008128.univariate_susie_twas_weights.rds,ROSMAP_DeJager_eQTL.ENSG00000008128.univariate_susie_twas_weights.rds,ROSMAP_Kellis_eQTL.ENSG00000008128.univariate_susie_twas_weights.rds,ROSMAP_mega_eQTL.ENSG00000008128.univariate_susie_twas_weights.rds,STARNET_eQTL.ENSG00000008128.univariate_susie_twas_weights.rds  $PATH/Fungen_xQTL.ENSG00000008128.cis_results_db.export.rds        $PATH/Fungen_xQTL.ENSG00000008128.cis_results_db.export_sumstats.rds       Knight_eQTL_brain,MiGA_GFM_eQTL,MiGA_GTS_eQTL,MiGA_SVZ_eQTL,MiGA_THA_eQTL,BM_10_MSBB_eQTL,BM_22_MSBB_eQTL,BM_36_MSBB_eQTL,BM_44_MSBB_eQTL,monocyte_ROSMAP_eQTL,Mic_DeJager_eQTL,Ast_DeJager_eQTL,Oli_DeJager_eQTL,Exc_DeJager_eQTL,Inh_DeJager_eQTL,DLPFC_DeJager_eQTL,PCC_DeJager_eQTL,AC_DeJager_eQTL,Mic_Kellis_eQTL,Ast_Kellis_eQTL,Oli_Kellis_eQTL,OPC_Kellis_eQTL,Exc_Kellis_eQTL,Inh_Kellis_eQTL,Ast_mega_eQTL,Exc_mega_eQTL,Inh_mega_eQTL,Oli_mega_eQTL,STARNET_eQTL_Mac       Knight_eQTL_brain,MiGA_GFM_eQTL,MiGA_GTS_eQTL,MiGA_SVZ_eQTL,MiGA_THA_eQTL,BM_10_MSBB_eQTL,BM_22_MSBB_eQTL,BM_36_MSBB_eQTL,BM_44_MSBB_eQTL,monocyte_ROSMAP_eQTL,Mic_DeJager_eQTL,Ast_DeJager_eQTL,Oli_DeJager_eQTL,Exc_DeJager_eQTL,Inh_DeJager_eQTL,DLPFC_DeJager_eQTL,PCC_DeJager_eQTL,AC_DeJager_eQTL,Mic_Kellis_eQTL,Ast_Kellis_eQTL,Oli_Kellis_eQTL,OPC_Kellis_eQTL,Exc_Kellis_eQTL,Inh_Kellis_eQTL,Ast_mega_eQTL,Exc_mega_eQTL,Inh_mega_eQTL,Oli_mega_eQTL,STARNET_eQTL_Mac

--phenoIDFile: A bed file listing genes and the region each belongs to (identified by TAD_id in the example below). For example:

TAD_id  ID
chr19_0_13957223        ENSG00000172270
chr19_0_13957223        ENSG00000099864
chr19_0_13957223        ENSG00000011304
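Since this analysis targets windows that hold several genes, it can help to count the genes per window before running the pipeline. This is a minimal sketch, not part of the pipeline: the function name count_genes_per_tad and the demo file are hypothetical, and it assumes a tab-delimited file with a single header row, as in the example above.

```shell
# Hypothetical helper (not part of the pipeline): count distinct genes per TAD
# window in a two-column, tab-delimited file with a header row (TAD_id, ID).
count_genes_per_tad() {
    awk -F'\t' '
        NR == 1 { next }                       # skip the header row
        !seen[$1 FS $2]++ { genes[$1]++ }      # count each TAD/gene pair once
        END { for (tad in genes) print tad "\t" genes[tad] }
    ' "$1" | sort
}

# Demo file built from the three example rows above.
tad_demo=$(mktemp)
printf 'TAD_id\tID\n' > "$tad_demo"
printf 'chr19_0_13957223\tENSG00000172270\n' >> "$tad_demo"
printf 'chr19_0_13957223\tENSG00000099864\n' >> "$tad_demo"
printf 'chr19_0_13957223\tENSG00000011304\n' >> "$tad_demo"

count_genes_per_tad "$tad_demo"
```

For the demo rows this prints the window chr19_0_13957223 with a count of 3. Windows with a count of 1 contain a single gene.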

--skip-analysis-pip-cutoff: a numeric PIP (posterior inclusion probability) cutoff used to decide which genes are skipped, as described above.

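--coverage: requested coverage of the credible sets (0.95 in the example below).

--maf: minor allele frequency (MAF) threshold for variants (0.01 in the example below).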

--pheno_id_map_file: A file containing IDs and genes. For example:

ID      gene
chr20:50940933:50941105:clu_44490_-:ENSG00000000419     ENSG00000000419
chr20:50940933:50941129:clu_44490_-:ENSG00000000419     ENSG00000000419
chr20:50936262:50942031:clu_44490_-:ENSG00000000419     ENSG00000000419
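In the example rows, the gene is the last colon-separated field of each ID. If your phenotype IDs follow the same convention (an assumption; check your own data), the map file can be generated from a plain list of IDs. The function name make_pheno_id_map and the demo file below are hypothetical and not part of the pipeline.

```shell
# Hypothetical helper (not part of the pipeline): build a two-column ID/gene map,
# assuming the gene is the last colon-separated field of each phenotype ID.
make_pheno_id_map() {
    printf 'ID\tgene\n'
    # Skip empty lines; split the ID on ":" and print the ID with its last field.
    awk 'NF { n = split($1, parts, ":"); print $1 "\t" parts[n] }' "$1"
}

# Demo input: the three IDs from the example above, one per line.
ids_demo=$(mktemp)
cat > "$ids_demo" <<'EOF'
chr20:50940933:50941105:clu_44490_-:ENSG00000000419
chr20:50940933:50941129:clu_44490_-:ENSG00000000419
chr20:50936262:50942031:clu_44490_-:ENSG00000000419
EOF

make_pheno_id_map "$ids_demo"
```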

--prior-canonical-matrices

--save-data: whether to save intermediate data or not

--twas-cv-folds: number of folds (K) for K-fold cross-validation of the TWAS weights. Set this to 0 to skip cross-validation.

--trans-analysis: Include this if doing trans-analysis (not using phenotypic coordinate information)

--region-name: if you only wish to analyze one region, then include the ID of a region found in the customized-association-windows file

--cwd: output file path

Output#

  • *.multigene_bvsr.rds

> str(readRDS("ROSMAP_Ast_mega_eQTL.chr11_chr11_77324757_82556425.multigene_bvsr.rds"))
List of 1
 $ chr11_77324757_82556425:List of 12
  ..$ mrmash_fitted              :List of 14
  .. ..$ mu1          : num [1:14830, 1:5] 3.02e-06 3.81e-06 -1.59e-05 -1.59e-05 -1.36e-05 ...
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:14830] "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" "chr11:77326475:G:C" ...
  .. .. .. ..$ : chr [1:5] "ENSG00000074201" "ENSG00000048649" "ENSG00000159063" "ENSG00000188997" ...
  .. ..$ S1           : num [1:5, 1:5, 1:14830] 6.07e-07 1.55e-09 1.78e-09 1.62e-09 1.44e-09 ...
  .. .. ..- attr(*, "dimnames")=List of 3
  .. .. .. ..$ : chr [1:5] "ENSG00000074201" "ENSG00000048649" "ENSG00000159063" "ENSG00000188997" ...
  .. .. .. ..$ : chr [1:5] "ENSG00000074201" "ENSG00000048649" "ENSG00000159063" "ENSG00000188997" ...
  .. .. .. ..$ : chr [1:14830] "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" "chr11:77326475:G:C" ...
  .. ..$ w1           : num [1:14830, 1:50] 0.999 0.998 0.997 0.997 0.999 ...
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:14830] "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" "chr11:77326475:G:C" ...
  .. .. .. ..$ : chr [1:50] "null" "singleton1_grid1" "singleton1_grid2" "singleton1_grid3" ...
  .. ..$ V            : num [1:5, 1:5] 0.40797 0.00349 0.016 -0.01287 0.00417 ...
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:5] "ENSG00000074201" "ENSG00000048649" "ENSG00000159063" "ENSG00000188997" ...
  .. .. .. ..$ : chr [1:5] "ENSG00000074201" "ENSG00000048649" "ENSG00000159063" "ENSG00000188997" ...
  .. ..$ w0           : Named num [1:50] 9.98e-01 7.08e-05 1.12e-04 2.20e-04 4.38e-04 ...
  .. .. ..- attr(*, "names")= chr [1:50] "null" "singleton1_grid1" "singleton1_grid2" "singleton1_grid3" ...
  .. ..$ S0           : num [1:5, 1:5, 1:50] 1e-08 0e+00 0e+00 0e+00 0e+00 0e+00 1e-08 0e+00 0e+00 0e+00 ...
  .. .. ..- attr(*, "dimnames")=List of 3
  .. .. .. ..$ : chr [1:5] "ENSG00000074201" "ENSG00000048649" "ENSG00000159063" "ENSG00000188997" ...
  .. .. .. ..$ : chr [1:5] "ENSG00000074201" "ENSG00000048649" "ENSG00000159063" "ENSG00000188997" ...
  .. .. .. ..$ : chr [1:50] "null" "singleton1_grid1" "singleton1_grid2" "singleton1_grid3" ...
  .. ..$ intercept    : Named num [1:5] 0.0913 -0.0112 -0.3494 -0.1176 -0.1019
  .. .. ..- attr(*, "names")= chr [1:5] "ENSG00000074201" "ENSG00000048649" "ENSG00000159063" "ENSG00000188997" ...
  .. ..$ fitted       : num [1:737, 1:5] 0.0207 0.0475 0.1043 -0.1477 0.0628 ...
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:737] "MAP15387421" "MAP26637867" "MAP34726040" "MAP46246604" ...
  .. .. .. ..$ : chr [1:5] "ENSG00000074201" "ENSG00000048649" "ENSG00000159063" "ENSG00000188997" ...
  .. ..$ G            : num [1:5, 1:5] 0.00883 0.00382 0.01266 0.00541 0.00374 ...
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:5] "ENSG00000074201" "ENSG00000048649" "ENSG00000159063" "ENSG00000188997" ...
  .. .. .. ..$ : chr [1:5] "ENSG00000074201" "ENSG00000048649" "ENSG00000159063" "ENSG00000188997" ...
  .. ..$ pve          : Named num [1:5] 0.0212 0.0123 0.1862 0.0254 0.0548
  .. .. ..- attr(*, "names")= chr [1:5] "ENSG00000074201" "ENSG00000048649" "ENSG00000159063" "ENSG00000188997" ...
  .. ..$ progress     :'data.frame':    55 obs. of  5 variables:
  .. .. ..$ iter        : num [1:55] 1 2 3 4 5 6 7 8 9 10 ...
  .. .. ..$ timing      : num [1:55] 13.1 13.6 14 14 14 ...
  .. .. ..$ mu1_max.diff: num [1:55] 0.11127 0.06216 0.04167 0.00867 0.00708 ...
  .. .. ..$ ELBO_diff   : num [1:55] Inf 49.14 8.237 1.527 0.788 ...
  .. .. ..$ ELBO        : num [1:55] -3296 -3247 -3239 -3237 -3236 ...
  .. ..$ converged    : logi TRUE
  .. ..$ ELBO         : num -3233
  .. ..$ analysis_time: Named num 546
  .. .. ..- attr(*, "names")= chr "elapsed"
  .. ..- attr(*, "class")= chr [1:2] "mr.mash" "list"
  ..$ reweighted_mixture_prior   :Classes 'MashInitializer', 'R6' <environment: 0x300d828> 
  ..$ reweighted_mixture_prior_cv:List of 2
  .. ..$ :Classes 'MashInitializer', 'R6' <environment: 0x300d828> 
  .. ..$ :Classes 'MashInitializer', 'R6' <environment: 0x300d828> 
  ..$ mvsusie_fitted             :List of 28
  .. ..$ alpha             : num [1:20, 1:14830] 3.97e-46 4.03e-12 1.38e-05 5.70e-05 5.88e-05 ...
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:20] "l1" "l2" "l3" "l4" ...
  .. .. .. ..$ : chr [1:14830] "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" "chr11:77326475:G:C" ...
  .. ..$ b1                : num [1:20, 1:14830, 1:5] -7.80e-49 1.28e-14 -5.73e-08 0.00 0.00 ...
  .. .. ..- attr(*, "dimnames")=List of 3
  .. .. .. ..$ single_effect: chr [1:20] "l1" "l2" "l3" "l4" ...
  .. .. .. ..$ variable     : chr [1:14830] "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" "chr11:77326475:G:C" ...
  .. .. .. ..$ condition    : chr [1:5] "ENSG00000074201" "ENSG00000048649" "ENSG00000159063" "ENSG00000188997" ...
  .. ..$ b2                : num [1:20, 1:14830, 1:5] 4.66e-50 7.22e-16 2.27e-09 0.00 0.00 ...
  .. .. ..- attr(*, "dimnames")=List of 3
  .. .. .. ..$ single_effect: chr [1:20] "l1" "l2" "l3" "l4" ...
  .. .. .. ..$ variable     : chr [1:14830] "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" "chr11:77326475:G:C" ...
  .. .. .. ..$ condition    : chr [1:5] "ENSG00000074201" "ENSG00000048649" "ENSG00000159063" "ENSG00000188997" ...
  .. ..$ KL                : num [1:20] 17.4903 11.5316 7.7359 0.109 0.0886 ...
  .. ..$ lbf               : num [1:20] 92.5381 14.961 0.6533 -0.109 -0.0886 ...
  .. ..$ lbf_variable      : num [1:20, 1:14830] -2.396 -1.671 -0.936 0 0 ...
  .. ..$ V                 : num [1:20] 0.03724 0.01042 0.00352 0 0 ...
  .. ..$ sigma2            : num [1:5, 1:5] 0.40797 0.00349 0.016 -0.01287 0.00417 ...
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:5] "ENSG00000074201" "ENSG00000048649" "ENSG00000159063" "ENSG00000188997" ...
  .. .. .. ..$ : chr [1:5] "ENSG00000074201" "ENSG00000048649" "ENSG00000159063" "ENSG00000188997" ...
  .. ..$ elbo              : num [1:31] -3285 -3227 -3223 -3221 -3219 ...
  .. ..$ niter             : int 31
  .. ..$ convergence       :List of 2
  .. .. ..$ delta    : num 0.000904
  .. .. ..$ converged: logi TRUE
  .. ..$ coef              : num [1:14831, 1:5] 7.65e-02 -8.66e-08 -1.61e-08 -1.66e-06 -1.66e-06 ...
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:14831] "(Intercept)" "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" ...
  .. .. .. ..$ : chr [1:5] "ENSG00000074201" "ENSG00000048649" "ENSG00000159063" "ENSG00000188997" ...
  .. ..$ b1_rescaled       : num [1:20, 1:14831, 1:5] -4.34e-03 6.76e-02 1.32e-02 5.85e-18 5.85e-18 ...
  .. ..$ mixture_weights   : num [1:20, 1:14830, 1:11] 0 0 0 0 0 0 0 0 0 0 ...
  .. .. ..- attr(*, "dimnames")=List of 3
  .. .. .. ..$ : NULL
  .. .. .. ..$ : NULL
  .. .. .. ..$ : NULL
  .. ..$ conditional_lfsr  : num [1:20, 1:14830, 1:5] 0.812 0.814 0.702 0.614 0.604 ...
  .. .. ..- attr(*, "dimnames")=List of 3
  .. .. .. ..$ : NULL
  .. .. .. ..$ : NULL
  .. .. .. ..$ : NULL
  .. ..$ lfsr              : num [1:14830, 1:5] 1 1 1 1 1 ...
  .. ..$ single_effect_lfsr: num [1:20, 1:5] 4.37e-01 3.90e-07 4.55e-01 5.82e-01 5.82e-01 ...
  .. ..$ alpha_history     :List of 3
  .. .. ..$ : NULL
  .. .. ..$ : NULL
  .. .. ..$ : NULL
  .. ..$ lbf_history       :List of 3
  .. .. ..$ : NULL
  .. .. ..$ : NULL
  .. .. ..$ : NULL
  .. ..$ prior_history     :List of 3
  .. .. ..$ : NULL
  .. .. ..$ : NULL
  .. .. ..$ : NULL
  .. ..$ fitted            : num [1:737, 1:5] 0.0687 0.069 0.0714 -0.2329 0.0696 ...
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:737] "MAP15387421" "MAP26637867" "MAP34726040" "MAP46246604" ...
  .. .. .. ..$ : chr [1:5] "ENSG00000074201" "ENSG00000048649" "ENSG00000159063" "ENSG00000188997" ...
  .. ..$ intercept         : Named num [1:5] 0.0765 0.0335 -0.3115 -0.1172 -0.1102
  .. .. ..- attr(*, "names")= chr [1:5] "ENSG00000074201" "ENSG00000048649" "ENSG00000159063" "ENSG00000188997" ...
  .. ..$ pip               : Named num [1:14830] 1.38e-05 1.07e-05 1.05e-04 1.05e-04 1.61e-05 ...
  .. .. ..- attr(*, "names")= chr [1:14830] "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" "chr11:77326475:G:C" ...
  .. ..$ walltime          : 'proc_time' Named num [1:5] 364.51 4.63 370.42 0 0
  .. .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
  .. ..$ sets              :List of 5
  .. .. ..$ cs                :List of 2
  .. .. .. ..$ L1: int [1:11] 1814 1827 1839 1856 1863 1892 1919 1937 1940 1943 ...
  .. .. .. ..$ L2: int [1:242] 458 467 470 474 510 513 516 517 520 521 ...
  .. .. ..$ purity            :'data.frame':    2 obs. of  3 variables:
  .. .. .. ..$ min.abs.corr   : num [1:2] 0.979 0.898
  .. .. .. ..$ mean.abs.corr  : num [1:2] 0.989 0.975
  .. .. .. ..$ median.abs.corr: num [1:2] 0.988 0.973
  .. .. ..$ cs_index          : int [1:2] 1 2
  .. .. ..$ coverage          : num [1:2] 0.954 0.951
  .. .. ..$ requested_coverage: num 0.95
  .. ..$ residual_variance : num [1:5, 1:5] 0.40797 0.00349 0.016 -0.01287 0.00417 ...
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:5] "ENSG00000074201" "ENSG00000048649" "ENSG00000159063" "ENSG00000188997" ...
  .. .. .. ..$ : chr [1:5] "ENSG00000074201" "ENSG00000048649" "ENSG00000159063" "ENSG00000188997" ...
  .. ..$ condition_names   : chr [1:5] "ENSG00000074201" "ENSG00000048649" "ENSG00000159063" "ENSG00000188997" ...
  .. ..$ variable_names    : chr [1:14830] "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" "chr11:77326475:G:C" ...
  .. ..- attr(*, "class")= chr "mvsusie"
  ..$ variant_names              : chr [1:14830] "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" "chr11:77326475:G:C" ...
  ..$ analysis_script            : chr "\nlibrary(data.table)\nlibrary(dplyr)\nlibrary(pecotmr)\ncombine_result_list <- function(univariate_finemapping"| __truncated__
  ..$ other_quantities           :List of 1
  .. ..$ dropped_samples: NULL
  ..$ context_names              : chr [1:5] "ENSG00000074201" "ENSG00000048649" "ENSG00000159063" "ENSG00000188997" ...
  ..$ top_loci                   :'data.frame': 263 obs. of  4 variables:
  .. ..$ variant_id      : chr [1:263] "chr11:77534902:C:T" "chr11:77539867:G:A" "chr11:77543221:C:T" "chr11:77546759:C:T" ...
  .. ..$ maf             : num [1:263] 0.107 0.107 0.107 0.107 0.107 ...
  .. ..$ pip             : num [1:263] 0.000857 0.000857 0.000932 0.000857 0.000857 ...
  .. ..$ cs_coverage_0.95: num [1:263] 2 2 2 2 2 2 2 2 2 2 ...
  ..$ susie_result_trimmed       :List of 12
  .. ..$ pip           : num [1:14830] 1.38e-05 1.07e-05 1.05e-04 1.05e-04 1.61e-05 ...
  .. ..$ sets          :List of 5
  .. .. ..$ cs                :List of 2
  .. .. .. ..$ L1: int [1:11] 1814 1827 1839 1856 1863 1892 1919 1937 1940 1943 ...
  .. .. .. ..$ L2: int [1:242] 458 467 470 474 510 513 516 517 520 521 ...
  .. .. ..$ purity            :'data.frame':    2 obs. of  3 variables:
  .. .. .. ..$ min.abs.corr   : num [1:2] 0.979 0.877
  .. .. .. ..$ mean.abs.corr  : num [1:2] 0.989 0.972
  .. .. .. ..$ median.abs.corr: num [1:2] 0.988 0.973
  .. .. ..$ cs_index          : int [1:2] 1 2
  .. .. ..$ coverage          : num [1:2] 0.954 0.951
  .. .. ..$ requested_coverage: num 0.95
  .. ..$ cs_corr       : num [1:2, 1:2] 1 -0.184 -0.184 1
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:2] "L1" "L2"
  .. .. .. ..$ : chr [1:2] "L1" "L2"
  .. ..$ sets_secondary: NULL
  .. ..$ alpha         : num [1:3, 1:14830] 3.97e-46 4.03e-12 1.38e-05 3.77e-46 3.82e-12 ...
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:3] "l1" "l2" "l3"
  .. .. .. ..$ : chr [1:14830] "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" "chr11:77326475:G:C" ...
  .. ..$ lbf_variable  : num [1:3, 1:14830] -2.396 -1.671 -0.936 -2.449 -1.726 ...
  .. ..$ V             : num [1:3] 0.03724 0.01042 0.00352
  .. ..$ niter         : int 31
  .. ..$ max_L         : int 20
  .. ..$ b1_rescaled   : num [1:3, 1:14831, 1:5] -4.34e-03 6.76e-02 1.32e-02 -1.18e-48 1.94e-14 ...
  .. ..$ coef          : num [1:14831, 1:5] 7.65e-02 -8.66e-08 -1.61e-08 -1.66e-06 -1.66e-06 ...
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:14831] "(Intercept)" "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" ...
  .. .. .. ..$ : chr [1:5] "ENSG00000074201" "ENSG00000048649" "ENSG00000159063" "ENSG00000188997" ...
  .. ..$ clfsr         : num [1:3, 1:14830, 1:5] 0.812 0.814 0.702 0.872 0.862 ...
  .. .. ..- attr(*, "dimnames")=List of 3
  .. .. .. ..$ : NULL
  .. .. .. ..$ : NULL
  .. .. .. ..$ : NULL
  .. ..- attr(*, "class")= chr "susie"
  ..$ total_time_elapsed         : 'proc_time' Named num [1:5] 982 13 999 0 0
  .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
  ..$ region_info                :List of 3
  .. ..$ region_coord:'data.frame':     1 obs. of  3 variables:
  .. .. ..$ chrom: chr "11"
  .. .. ..$ start: int 77324757
  .. .. ..$ end  : int 82556425
  .. ..$ grange      :'data.frame':     1 obs. of  3 variables:
  .. .. ..$ chrom: chr "11"
  .. .. ..$ start: int 77324757
  .. .. ..$ end  : int 82556425
  .. ..$ region_name : chr [1:12] "chr11_77324757_82556425" "ENSG00000149269" "ENSG00000074201" "ENSG00000048649" ...

  • *.multigene_twas_weights.rds

> str(readRDS("ROSMAP_Ast_mega_eQTL.chr11_chr11_77324757_82556425.multigene_twas_weights.rds"))
List of 1
 $ chr11_77324757_82556425:List of 5
  ..$ ENSG00000074201:List of 5
  .. ..$ twas_weights      :List of 2
  .. .. ..$ mrmash_weights : Named num [1:14830] 3.02e-06 3.81e-06 -1.59e-05 -1.59e-05 -1.36e-05 ...
  .. .. .. ..- attr(*, "names")= chr [1:14830] "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" "chr11:77326475:G:C" ...
  .. .. ..$ mvsusie_weights: Named num [1:14830] -8.66e-08 -1.61e-08 -1.66e-06 -1.66e-06 -8.16e-08 ...
  .. .. .. ..- attr(*, "names")= chr [1:14830] "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" "chr11:77326475:G:C" ...
  .. ..$ twas_predictions  :List of 2
  .. .. ..$ mrmash_predicted : Named num [1:737] -0.0706 -0.0438 0.013 -0.239 -0.0285 ...
  .. .. .. ..- attr(*, "names")= chr [1:737] "MAP15387421" "MAP26637867" "MAP34726040" "MAP46246604" ...
  .. .. ..$ mvsusie_predicted: Named num [1:737] -0.00776 -0.00753 -0.00509 -0.30936 -0.00694 ...
  .. .. .. ..- attr(*, "names")= chr [1:737] "MAP15387421" "MAP26637867" "MAP34726040" "MAP46246604" ...
  .. ..$ variant_names     : chr [1:14830] "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" "chr11:77326475:G:C" ...
  .. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 0.725 0.179 0.981 0.011 0.018
  .. .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
  .. ..$ region_info       :List of 3
  .. .. ..$ region_coord:'data.frame':  1 obs. of  3 variables:
  .. .. .. ..$ chrom: chr "11"
  .. .. .. ..$ start: int 77324757
  .. .. .. ..$ end  : int 82556425
  .. .. ..$ grange      :'data.frame':  1 obs. of  3 variables:
  .. .. .. ..$ chrom: chr "11"
  .. .. .. ..$ start: int 77324757
  .. .. .. ..$ end  : int 82556425
  .. .. ..$ region_name : chr [1:12] "chr11_77324757_82556425" "ENSG00000149269" "ENSG00000074201" "ENSG00000048649" ...
  ..$ ENSG00000048649:List of 5
  .. ..$ twas_weights      :List of 2
  .. .. ..$ mrmash_weights : Named num [1:14830] -5.97e-07 -1.07e-06 -1.76e-05 -1.76e-05 -2.98e-06 ...
  .. .. .. ..- attr(*, "names")= chr [1:14830] "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" "chr11:77326475:G:C" ...
  .. .. ..$ mvsusie_weights: Named num [1:14830] -1.01e-07 -4.81e-08 -1.17e-06 -1.17e-06 -4.58e-08 ...
  .. .. .. ..- attr(*, "names")= chr [1:14830] "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" "chr11:77326475:G:C" ...
  .. ..$ twas_predictions  :List of 2
  .. .. ..$ mrmash_predicted : Named num [1:737] 0.07298 0.03071 0.00939 -0.07178 0.04762 ...
  .. .. .. ..- attr(*, "names")= chr [1:737] "MAP15387421" "MAP26637867" "MAP34726040" "MAP46246604" ...
  .. .. ..$ mvsusie_predicted: Named num [1:737] 2.05e-02 1.89e-02 4.65e-06 -1.70e-01 1.98e-02 ...
  .. .. .. ..- attr(*, "names")= chr [1:737] "MAP15387421" "MAP26637867" "MAP34726040" "MAP46246604" ...
  .. ..$ variant_names     : chr [1:14830] "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" "chr11:77326475:G:C" ...
  .. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 0.725 0.179 0.981 0.011 0.018
  .. .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
  .. ..$ region_info       :List of 3
  .. .. ..$ region_coord:'data.frame':  1 obs. of  3 variables:
  .. .. .. ..$ chrom: chr "11"
  .. .. .. ..$ start: int 77324757
  .. .. .. ..$ end  : int 82556425
  .. .. ..$ grange      :'data.frame':  1 obs. of  3 variables:
  .. .. .. ..$ chrom: chr "11"
  .. .. .. ..$ start: int 77324757
  .. .. .. ..$ end  : int 82556425
  .. .. ..$ region_name : chr [1:12] "chr11_77324757_82556425" "ENSG00000149269" "ENSG00000074201" "ENSG00000048649" ...
  ..$ ENSG00000159063:List of 5
  .. ..$ twas_weights      :List of 2
  .. .. ..$ mrmash_weights : Named num [1:14830] -2.51e-06 -9.33e-06 -3.67e-04 -3.67e-04 4.71e-06 ...
  .. .. .. ..- attr(*, "names")= chr [1:14830] "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" "chr11:77326475:G:C" ...
  .. .. ..$ mvsusie_weights: Named num [1:14830] -2.12e-07 -2.77e-07 -3.14e-05 -3.14e-05 2.22e-07 ...
  .. .. .. ..- attr(*, "names")= chr [1:14830] "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" "chr11:77326475:G:C" ...
  .. ..$ twas_predictions  :List of 2
  .. .. ..$ mrmash_predicted : Named num [1:737] 0.4017 0.3971 -0.0145 -0.0315 0.4016 ...
  .. .. .. ..- attr(*, "names")= chr [1:737] "MAP15387421" "MAP26637867" "MAP34726040" "MAP46246604" ...
  .. .. ..$ mvsusie_predicted: Named num [1:737] 0.3727 0.3725 -0.0192 -0.1086 0.3728 ...
  .. .. .. ..- attr(*, "names")= chr [1:737] "MAP15387421" "MAP26637867" "MAP34726040" "MAP46246604" ...
  .. ..$ variant_names     : chr [1:14830] "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" "chr11:77326475:G:C" ...
  .. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 0.725 0.179 0.981 0.011 0.018
  .. .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
  .. ..$ region_info       :List of 3
  .. .. ..$ region_coord:'data.frame':  1 obs. of  3 variables:
  .. .. .. ..$ chrom: chr "11"
  .. .. .. ..$ start: int 77324757
  .. .. .. ..$ end  : int 82556425
  .. .. ..$ grange      :'data.frame':  1 obs. of  3 variables:
  .. .. .. ..$ chrom: chr "11"
  .. .. .. ..$ start: int 77324757
  .. .. .. ..$ end  : int 82556425
  .. .. ..$ region_name : chr [1:12] "chr11_77324757_82556425" "ENSG00000149269" "ENSG00000074201" "ENSG00000048649" ...
  ..$ ENSG00000188997:List of 5
  .. ..$ twas_weights      :List of 2
  .. .. ..$ mrmash_weights : Named num [1:14830] -1.31e-07 9.65e-08 -1.04e-05 -1.04e-05 -7.16e-07 ...
  .. .. .. ..- attr(*, "names")= chr [1:14830] "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" "chr11:77326475:G:C" ...
  .. .. ..$ mvsusie_weights: Named num [1:14830] -9.39e-08 4.32e-08 2.20e-06 2.20e-06 -5.78e-08 ...
  .. .. .. ..- attr(*, "names")= chr [1:14830] "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" "chr11:77326475:G:C" ...
  .. ..$ twas_predictions  :List of 2
  .. .. ..$ mrmash_predicted : Named num [1:737] 0.14819 0.14679 -0.00484 -0.02042 0.15488 ...
  .. .. .. ..- attr(*, "names")= chr [1:737] "MAP15387421" "MAP26637867" "MAP34726040" "MAP46246604" ...
  .. .. ..$ mvsusie_predicted: Named num [1:737] 0.1523 0.1519 -0.02 -0.0413 0.1534 ...
  .. .. .. ..- attr(*, "names")= chr [1:737] "MAP15387421" "MAP26637867" "MAP34726040" "MAP46246604" ...
  .. ..$ variant_names     : chr [1:14830] "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" "chr11:77326475:G:C" ...
  .. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 0.725 0.179 0.981 0.011 0.018
  .. .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
  .. ..$ region_info       :List of 3
  .. .. ..$ region_coord:'data.frame':  1 obs. of  3 variables:
  .. .. .. ..$ chrom: chr "11"
  .. .. .. ..$ start: int 77324757
  .. .. .. ..$ end  : int 82556425
  .. .. ..$ grange      :'data.frame':  1 obs. of  3 variables:
  .. .. .. ..$ chrom: chr "11"
  .. .. .. ..$ start: int 77324757
  .. .. .. ..$ end  : int 82556425
  .. .. ..$ region_name : chr [1:12] "chr11_77324757_82556425" "ENSG00000149269" "ENSG00000074201" "ENSG00000048649" ...
  ..$ ENSG00000033327:List of 5
  .. ..$ twas_weights      :List of 2
  .. .. ..$ mrmash_weights : Named num [1:14830] -7.89e-07 -2.21e-06 -3.37e-05 -3.37e-05 1.50e-06 ...
  .. .. .. ..- attr(*, "names")= chr [1:14830] "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" "chr11:77326475:G:C" ...
  .. .. ..$ mvsusie_weights: Named num [1:14830] -1.02e-07 -1.15e-07 -3.05e-06 -3.05e-06 1.75e-07 ...
  .. .. .. ..- attr(*, "names")= chr [1:14830] "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" "chr11:77326475:G:C" ...
  .. ..$ twas_predictions  :List of 2
  .. .. ..$ mrmash_predicted : Named num [1:737] 0.11803 0.12448 -0.00522 -0.00542 0.12055 ...
  .. .. .. ..- attr(*, "names")= chr [1:737] "MAP15387421" "MAP26637867" "MAP34726040" "MAP46246604" ...
  .. .. ..$ mvsusie_predicted: Named num [1:737] 0.126205 0.127183 -0.005335 0.000314 0.12657 ...
  .. .. .. ..- attr(*, "names")= chr [1:737] "MAP15387421" "MAP26637867" "MAP34726040" "MAP46246604" ...
  .. ..$ variant_names     : chr [1:14830] "chr11:77325990:C:T" "chr11:77326116:G:A" "chr11:77326354:A:G" "chr11:77326475:G:C" ...
  .. ..$ total_time_elapsed: 'proc_time' Named num [1:5] 0.725 0.179 0.981 0.011 0.018
  .. .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
  .. ..$ region_info       :List of 3
  .. .. ..$ region_coord:'data.frame':  1 obs. of  3 variables:
  .. .. .. ..$ chrom: chr "11"
  .. .. .. ..$ start: int 77324757
  .. .. .. ..$ end  : int 82556425
  .. .. ..$ grange      :'data.frame':  1 obs. of  3 variables:
  .. .. .. ..$ chrom: chr "11"
  .. .. .. ..$ start: int 77324757
  .. .. .. ..$ end  : int 82556425
  .. .. ..$ region_name : chr [1:12] "chr11_77324757_82556425" "ENSG00000149269" "ENSG00000074201" "ENSG00000048649" ...

Minimal Working Example Steps#

ii. Run the Fine-Mapping with mvSuSiE#

sos run pipeline/mnm_regression.ipynb mnm_genes \
    --name ROSMAP_Ast_mega_eQTL \
    --genoFile data/mnm_genes/ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.11.bed \
    --phenoFile data/mnm_genes/snuc_pseudo_bulk.Ast.mega.normalized.log2cpm.region_list.txt \
    --covFile data/mnm_genes/snuc_pseudo_bulk.Ast.mega.normalized.log2cpm.rosmap_cov.ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.snuc_pseudo_bulk_mega.related.plink_qc.extracted.pca.projected.Marchenko_PC.gz \
    --customized-association-windows data/mnm_genes/extended_TADB.bed \
    --phenotype-names Ast_mega_eQTL \
    --max-cv-variants 5000 \
    --ld_reference_meta_file data/ld_meta_file_with_bim.tsv \
    --independent_variant_list data/mnm_genes/ld_pruned_variants.txt.gz \
    --fine_mapping_meta data/mnm_genes/combined_data_updated.tsv \
    --phenoIDFile data/mnm_genes/phenoIDFile_extended_TADB.bed \
    --region-name chr11_77324757_82556425 \
    --skip-analysis-pip-cutoff 0 \
    --maf 0.01 \
    --coverage 0.95 \
    --pheno_id_map_file data/mnm_genes/pheno_id_map_file.txt \
    --prior-canonical-matrices \
    --twas-cv-folds 0 \
    --trans-analysis \
    --cwd output/mnm_regression/mnm_genes -s build

Anticipated Results#

For each region, multivariate multi-gene fine-mapping produces a file containing the fine-mapping results, including the top loci, and a file containing TWAS weights for each gene from mr.mash and mvSuSiE.

ROSMAP_Ast_DeJager_eQTL.chr11_77324757_86627922.multigene_bvsr.rds:

  • for each region name, includes:

  1. mrmash_fitted

  2. reweighted_mixture_prior

  3. reweighted_mixture_prior_cv

  4. mvsusie_fitted

  5. variant_names

  6. analysis_script

  7. other_quantities

  8. context_names

  9. top_loci

  10. susie_result_trimmed

  11. total_time_elapsed

  12. region_info

ROSMAP_Ast_DeJager_eQTL.chr11_77324757_86627922.multigene_data.rds: (from the --save-data argument)

ROSMAP_Ast_DeJager_eQTL.chr11_77324757_86627922.multigene_twas_weights.rds:

  • for each region name and for each gene within that region, includes:

  1. twas_weights - from mrmash and mvsusie

  2. twas_predictions - from mrmash and mvsusie

  3. variant_names

  4. total_time_elapsed

  5. region_info