Univariate Fine-Mapping and TWAS with SuSiE#
Description#
Our pipeline is capable of performing univariate fine-mapping with SuSiE with TWAS weights. The TWAS portion makes use of TWAS weights, linkage disequilibrium data and GWAS summary statistics. Preset variants used are taken from the linkage disequilibrium data and used only for TWAS. TWAS cross validation tell us which of the four methods (enet, lasso, mrash, SuSiE) are best to use. By default, we limit to under 5000 variants for cross validation. In cross validation, the data is split into five parts. Training is done on four parts, and prediction is done on the fifth. Linear regression is used to assess the results and get r squared and pvalues.
Fine mapping with SuSiE follows the formulat y=xb+e where x has many highly correlated variables due to linkage disequilibrium. Therefore, true effects (b), are very sparse. The SuSiE wrapper looks for five independent signals in each region to increase convergence speed. However, if five signals are found, then the the upper limit is increased. SuSiE does not allow for the inclusion of covariates. Therefore, covariates are regressed in.
Input:#
--genoFile
: path to a plink bed file containing genotypes.
--phenoFile
: at least one tab delimited file containing chr, start, end, ID and path for the regions. For example:
#chr start end ID path
chr12 389272 389273 ENSG00000120647 $PATH/Ast.log2cpm.bed.gz
chr12 389319 389320 ENSG00000073614 $PATH/Ast.log2cpm.bed.gz
--covFile
: at least one tab delimited file containing covariates in the rows and samples in the columns.
--customized-association-windows
: a tab delimited file containing chr, start, end, and gene_id. For example:
#chr start end gene_id
chr1 0 6480000 ENSG00000008128
chr1 0 6480000 ENSG00000008130
----phenotype-names
: the names of the phenotypes if multiple are included. There should be one for each phenotype file you include.
--max-cv-variants
: maximum number of variants for cross-validation.
--ld_reference_meta_file
: path to file containing chrom, start, end and path for linkage disequilibrium region information. For example:
#chrom start end path
chr1 101384274 104443097 $PATH/chr1_101384274_104443097.cor.xz.bim
chr1 104443097 106225286 $PATH/chr1_104443097_106225286.cor.xz.bim
--region-name
: if you only wish to analyze one region, then include the gene_id of a region found in the customized-association-windows
file
Output#
*.univariate_bvrs.rds
> str(readRDS("/restricted/projectnb/xqtl/xqtl_protocol/toy_xqtl_protocol/output/mnm_regression/susie_twas/fine_mapping/test_susie_twas.chr11_ENSG00000073921.univariate_bvsr.rds"))
List of 1
$ ENSG00000073921:List of 1
..$ test_pheno_ENSG00000073921:List of 10
.. ..$ variant_names : chr [1:9970] "chr11:84957209:G:C" "chr11:84957879:A:G" "chr11:84958236:C:A" "chr11:84958954:T:C" ...
.. ..$ analysis_script : chr "options(warn=1)\nlibrary(pecotmr)\nphenotype_files = c(\"/restricted/projectnb/xqtl/xqtl_protocol/toy_xqtl_prot"| __truncated__
.. ..$ other_quantities :List of 1
.. .. ..$ dropped_samples:List of 3
.. .. .. ..$ X : NULL
.. .. .. ..$ y : NULL
.. .. .. ..$ covar: NULL
.. ..$ sumstats :List of 5
.. .. ..$ betahat : Named num [1:9970] 0.00479 0.00573 0.09947 0.02035 0.02035 ...
.. .. .. ..- attr(*, "names")= chr [1:9970] "chr11:84957209:G:C" "chr11:84957879:A:G" "chr11:84958236:C:A" "chr11:84958954:T:C" ...
.. .. ..$ sebetahat: Named num [1:9970] 0.0563 0.0568 0.0791 0.0551 0.0551 ...
.. .. .. ..- attr(*, "names")= chr [1:9970] "chr11:84957209:G:C" "chr11:84957879:A:G" "chr11:84958236:C:A" "chr11:84958954:T:C" ...
.. .. ..$ z_scores : Named num [1:9970] 0.085 0.101 1.258 0.369 0.369 ...
.. .. .. ..- attr(*, "names")= chr [1:9970] "chr11:84957209:G:C" "chr11:84957879:A:G" "chr11:84958236:C:A" "chr11:84958954:T:C" ...
.. .. ..$ p_values : Named num [1:9970] 0.932 0.92 0.208 0.712 0.712 ...
.. .. .. ..- attr(*, "names")= chr [1:9970] "chr11:84957209:G:C" "chr11:84957879:A:G" "chr11:84958236:C:A" "chr11:84958954:T:C" ...
.. .. ..$ q_values : Named num [1:9970] 0.922 0.92 0.633 0.859 0.859 ...
.. .. .. ..- attr(*, "names")= chr [1:9970] "chr11:84957209:G:C" "chr11:84957879:A:G" "chr11:84958236:C:A" "chr11:84958954:T:C" ...
.. ..$ sample_names : chr [1:150] "sample0" "sample1" "sample10" "sample100" ...
.. ..$ top_loci :'data.frame': 3 obs. of 9 variables:
.. .. ..$ variant_id : chr [1:3] "chr11:85179671:G:A" "chr11:85266101:G:C" "chr11:85314573:G:A"
.. .. ..$ betahat : num [1:3] -0.743 -0.747 -0.747
.. .. ..$ sebetahat : num [1:3] 0.147 0.147 0.147
.. .. ..$ z : num [1:3] -5.06 -5.1 -5.1
.. .. ..$ maf : num [1:3] 0.0235 0.0233 0.0233
.. .. ..$ pip : num [1:3] 0.292 0.349 0.349
.. .. ..$ cs_coverage_0.95: num [1:3] 1 1 1
.. .. ..$ cs_coverage_0.7 : num [1:3] 1 1 1
.. .. ..$ cs_coverage_0.5 : num [1:3] 0 1 1
.. ..$ susie_result_trimmed :List of 12
.. .. ..$ pip : num [1:9970] 0.000431 0.00043 0.000673 0.000431 0.000431 ...
.. .. ..$ sets :List of 5
.. .. .. ..$ cs :List of 1
.. .. .. .. ..$ L1: int [1:3] 653 910 1069
.. .. .. ..$ purity :'data.frame': 1 obs. of 3 variables:
.. .. .. .. ..$ min.abs.corr : num 1
.. .. .. .. ..$ mean.abs.corr : num 1
.. .. .. .. ..$ median.abs.corr: num 1
.. .. .. ..$ cs_index : int 1
.. .. .. ..$ coverage : num 0.989
.. .. .. ..$ requested_coverage: num 0.95
.. .. ..$ cs_corr : logi NA
.. .. ..$ sets_secondary :List of 2
.. .. .. ..$ coverage_0.7:List of 2
.. .. .. .. ..$ sets :List of 5
.. .. .. .. .. ..$ cs :List of 1
.. .. .. .. .. .. ..$ L1: int [1:3] 653 910 1069
.. .. .. .. .. ..$ purity :'data.frame': 1 obs. of 3 variables:
.. .. .. .. .. .. ..$ min.abs.corr : num 1
.. .. .. .. .. .. ..$ mean.abs.corr : num 1
.. .. .. .. .. .. ..$ median.abs.corr: num 1
.. .. .. .. .. ..$ cs_index : int 1
.. .. .. .. .. ..$ coverage : num 0.989
.. .. .. .. .. ..$ requested_coverage: num 0.7
.. .. .. .. ..$ cs_corr: logi NA
.. .. .. ..$ coverage_0.5:List of 2
.. .. .. .. ..$ sets :List of 5
.. .. .. .. .. ..$ cs :List of 1
.. .. .. .. .. .. ..$ L1: int [1:2] 910 1069
.. .. .. .. .. ..$ purity :'data.frame': 1 obs. of 3 variables:
.. .. .. .. .. .. ..$ min.abs.corr : num 1
.. .. .. .. .. .. ..$ mean.abs.corr : num 1
.. .. .. .. .. .. ..$ median.abs.corr: num 1
.. .. .. .. .. ..$ cs_index : int 1
.. .. .. .. .. ..$ coverage : num 0.697
.. .. .. .. .. ..$ requested_coverage: num 0.5
.. .. .. .. ..$ cs_corr: logi NA
.. .. ..$ alpha : num [1:8, 1:9970] 1.52e-07 6.35e-05 6.05e-05 5.93e-05 5.99e-05 ...
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : NULL
.. .. .. .. ..$ : chr [1:9970] "chr11:84957209:G:C" "chr11:84957879:A:G" "chr11:84958236:C:A" "chr11:84958954:T:C" ...
.. .. ..$ lbf_variable : num [1:8, 1:9970] -1.697 -0.402 -0.437 -0.451 -0.444 ...
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : NULL
.. .. .. .. ..$ : chr [1:9970] "chr11:84957209:G:C" "chr11:84957879:A:G" "chr11:84958236:C:A" "chr11:84958954:T:C" ...
.. .. ..$ V : num [1:8] 0.019 0.00097 0.0011 0.00115 0.00112 ...
.. .. ..$ niter : int 10
.. .. ..$ max_L : int 8
.. .. ..$ X_column_scale_factors: Named num [1:9970] 0.517 0.512 0.366 0.528 0.528 ...
.. .. .. ..- attr(*, "names")= chr [1:9970] "chr11:84957209:G:C" "chr11:84957879:A:G" "chr11:84958236:C:A" "chr11:84958954:T:C" ...
.. .. ..$ mu : num [1:8, 1:9970] 0.00227 0.00643 0.00672 0.00683 0.00677 ...
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : NULL
.. .. .. .. ..$ : chr [1:9970] "chr11:84957209:G:C" "chr11:84957879:A:G" "chr11:84958236:C:A" "chr11:84958954:T:C" ...
.. .. ..$ mu2 : num [1:8, 1:9970] 0.000638 0.000432 0.000455 0.000464 0.00046 ...
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : NULL
.. .. .. .. ..$ : chr [1:9970] "chr11:84957209:G:C" "chr11:84957879:A:G" "chr11:84958236:C:A" "chr11:84958954:T:C" ...
.. .. ..- attr(*, "class")= chr "susie"
.. ..$ total_time_elapsed : 'proc_time' Named num [1:5] 32.189 0.339 43.517 0 0
.. .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
.. ..$ region_info :List of 3
.. .. ..$ region_coord:'data.frame': 1 obs. of 3 variables:
.. .. .. ..$ chrom: chr "11"
.. .. .. ..$ start: int 85957174
.. .. .. ..$ end : int 86069881
.. .. ..$ grange :'data.frame': 1 obs. of 3 variables:
.. .. .. ..$ chrom: chr "11"
.. .. .. ..$ start: int 84957175
.. .. .. ..$ end : int 87360000
.. .. ..$ region_name : chr [1:2] "ENSG00000073921" "ENSG00000073921"
.. ..$ preset_variants_result:List of 9
.. .. ..$ susie_fitted :List of 22
.. .. .. ..$ alpha : num [1:8, 1:6454] 0.00015 0.000148 0.000148 0.000148 0.000149 ...
.. .. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. .. ..$ : NULL
.. .. .. .. .. ..$ : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. .. ..$ mu : num [1:8, 1:6454] 0.000134 0.000169 0.000177 0.000169 0.000152 ...
.. .. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. .. ..$ : NULL
.. .. .. .. .. ..$ : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. .. ..$ mu2 : num [1:8, 1:6454] 5.39e-05 6.83e-05 7.19e-05 6.85e-05 6.13e-05 ...
.. .. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. .. ..$ : NULL
.. .. .. .. .. ..$ : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. .. ..$ Xr : num [1:150] 0.000696 -0.000919 -0.002423 -0.00093 0.000305 ...
.. .. .. ..$ KL : num [1:8] 0.0345 0.0445 0.047 0.0446 0.0395 ...
.. .. .. ..$ lbf : num [1:8] 0.000194 0.000317 0.000353 0.000319 0.000253 ...
.. .. .. ..$ lbf_variable : num [1:8, 1:6454] -0.0334 -0.0427 -0.045 -0.0428 -0.0381 ...
.. .. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. .. ..$ : NULL
.. .. .. .. .. ..$ : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. .. ..$ sigma2 : num 0.124
.. .. .. ..$ V : num [1:8] 5.76e-05 7.44e-05 7.87e-05 7.46e-05 6.61e-05 ...
.. .. .. ..$ pi : num [1:6454] 0.000155 0.000155 0.000155 0.000155 0.000155 ...
.. .. .. ..$ null_index : num 0
.. .. .. ..$ correct_zR_discrepancy:List of 5
.. .. .. .. ..$ to_correct : logi FALSE
.. .. .. .. ..$ outlier_index : logi(0)
.. .. .. .. ..$ is_init : logi TRUE
.. .. .. .. ..$ outlier_stabilize : num 5
.. .. .. .. ..$ outlier_stable_count: num 0
.. .. .. ..$ force_iterate : logi FALSE
.. .. .. ..$ converged : logi TRUE
.. .. .. ..$ elbo : num [1:3] -56.5 -56.5 -56.5
.. .. .. ..$ niter : int 3
.. .. .. ..$ intercept : num -0.00273
.. .. .. ..$ fitted : Named num [1:150] 0.000696 -0.000919 -0.002423 -0.00093 0.000305 ...
.. .. .. .. ..- attr(*, "names")= chr [1:150] "sample0" "sample1" "sample10" "sample100" ...
.. .. .. ..$ sets :List of 3
.. .. .. .. ..$ cs : NULL
.. .. .. .. ..$ coverage : NULL
.. .. .. .. ..$ requested_coverage: num 0.95
.. .. .. ..$ pip : Named num [1:6454] 0.0012 0.00124 0.0012 0.0012 0.0012 ...
.. .. .. .. ..- attr(*, "names")= chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. .. ..$ X_column_scale_factors: Named num [1:6454] 0.664 0.438 0.677 0.702 0.677 ...
.. .. .. .. ..- attr(*, "names")= chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. .. ..$ time_elapsed : 'proc_time' Named num [1:5] 0.851 0.008 0.861 0 0
.. .. .. .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
.. .. .. ..- attr(*, "class")= chr "susie"
.. .. ..$ variant_names : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. ..$ analysis_script : chr "options(warn=1)\nlibrary(pecotmr)\nphenotype_files = c(\"/restricted/projectnb/xqtl/xqtl_protocol/toy_xqtl_prot"| __truncated__
.. .. ..$ other_quantities :List of 1
.. .. .. ..$ dropped_samples:List of 3
.. .. .. .. ..$ X : NULL
.. .. .. .. ..$ y : NULL
.. .. .. .. ..$ covar: NULL
.. .. ..$ sumstats :List of 5
.. .. .. ..$ betahat : Named num [1:6454] 0.0029 0.06935 0.01237 -0.00911 0.01237 ...
.. .. .. .. ..- attr(*, "names")= chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. .. ..$ sebetahat: Named num [1:6454] 0.0438 0.0661 0.0429 0.0414 0.0429 ...
.. .. .. .. ..- attr(*, "names")= chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. .. ..$ z_scores : Named num [1:6454] 0.0662 1.0485 0.288 -0.22 0.288 ...
.. .. .. .. ..- attr(*, "names")= chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. .. ..$ p_values : Named num [1:6454] 0.947 0.294 0.773 0.826 0.773 ...
.. .. .. .. ..- attr(*, "names")= chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. .. ..$ q_values : Named num [1:6454] 0.998 0.892 0.991 0.998 0.991 ...
.. .. .. .. ..- attr(*, "names")= chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. ..$ sample_names : chr [1:150] "sample0" "sample1" "sample10" "sample100" ...
.. .. ..$ susie_result_trimmed:List of 12
.. .. .. ..$ pip : num [1:6454] 0.0012 0.00124 0.0012 0.0012 0.0012 ...
.. .. .. ..$ sets :List of 3
.. .. .. .. ..$ cs : NULL
.. .. .. .. ..$ coverage : NULL
.. .. .. .. ..$ requested_coverage: num 0.95
.. .. .. ..$ cs_corr : logi NA
.. .. .. ..$ sets_secondary :List of 2
.. .. .. .. ..$ coverage_0.7:List of 2
.. .. .. .. .. ..$ sets :List of 3
.. .. .. .. .. .. ..$ cs : NULL
.. .. .. .. .. .. ..$ coverage : NULL
.. .. .. .. .. .. ..$ requested_coverage: num 0.7
.. .. .. .. .. ..$ cs_corr: logi NA
.. .. .. .. ..$ coverage_0.5:List of 2
.. .. .. .. .. ..$ sets :List of 3
.. .. .. .. .. .. ..$ cs : NULL
.. .. .. .. .. .. ..$ coverage : NULL
.. .. .. .. .. .. ..$ requested_coverage: num 0.5
.. .. .. .. .. ..$ cs_corr: logi NA
.. .. .. ..$ alpha : num [1:8, 1:6454] 0.00015 0.000148 0.000148 0.000148 0.000149 ...
.. .. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. .. ..$ : NULL
.. .. .. .. .. ..$ : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. .. ..$ lbf_variable : num [1:8, 1:6454] -0.0334 -0.0427 -0.045 -0.0428 -0.0381 ...
.. .. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. .. ..$ : NULL
.. .. .. .. .. ..$ : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. .. ..$ V : num [1:8] 5.76e-05 7.44e-05 7.87e-05 7.46e-05 6.61e-05 ...
.. .. .. ..$ niter : int 3
.. .. .. ..$ max_L : int 8
.. .. .. ..$ X_column_scale_factors: Named num [1:6454] 0.664 0.438 0.677 0.702 0.677 ...
.. .. .. .. ..- attr(*, "names")= chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. .. ..$ mu : num [1:8, 1:6454] 0.000134 0.000169 0.000177 0.000169 0.000152 ...
.. .. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. .. ..$ : NULL
.. .. .. .. .. ..$ : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. .. ..$ mu2 : num [1:8, 1:6454] 5.39e-05 6.83e-05 7.19e-05 6.85e-05 6.13e-05 ...
.. .. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. .. ..$ : NULL
.. .. .. .. .. ..$ : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. .. ..- attr(*, "class")= chr "susie"
.. .. ..$ total_time_elapsed : 'proc_time' Named num [1:5] 10.991 0.012 11.046 0 0
.. .. .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
.. .. ..$ region_info :List of 3
.. .. .. ..$ region_coord:'data.frame': 1 obs. of 3 variables:
.. .. .. .. ..$ chrom: chr "11"
.. .. .. .. ..$ start: int 85957174
.. .. .. .. ..$ end : int 86069881
.. .. .. ..$ grange :'data.frame': 1 obs. of 3 variables:
.. .. .. .. ..$ chrom: chr "11"
.. .. .. .. ..$ start: int 84957175
.. .. .. .. ..$ end : int 87360000
.. .. .. ..$ region_name : chr [1:2] "ENSG00000073921" "ENSG00000073921"
*.univariate_data.rds
> str(readRDS("output/mnm_regression/susie_twas/data/test_susie_twas.chr11_ENSG00000073921.univariate_data.rds"))
List of 1
$ ENSG00000073921:List of 10
..$ residual_Y :List of 1
.. ..$ test_pheno: num [1:150, 1] 0.1241 0.2981 -0.6054 0.0435 0.0388 ...
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : chr [1:150] "sample0" "sample1" "sample10" "sample100" ...
.. .. .. ..$ : chr "ENSG00000073921"
..$ residual_X :List of 1
.. ..$ : num [1:150, 1:9970] -0.2386 -0.8293 0.4173 0.7408 -0.0231 ...
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : chr [1:150] "sample0" "sample1" "sample10" "sample100" ...
.. .. .. ..$ : chr [1:9970] "chr11:84957209:G:C" "chr11:84957879:A:G" "chr11:84958236:C:A" "chr11:84958954:T:C" ...
..$ residual_Y_scalar: num 1
..$ residual_X_scalar: num 1
..$ dropped_sample :List of 3
.. ..$ X :List of 1
.. .. ..$ : chr(0)
.. ..$ Y :List of 1
.. .. ..$ : chr "strand"
.. ..$ covar:List of 1
.. .. ..$ : chr(0)
..$ maf :List of 1
.. ..$ : Named num [1:9970] 0.33 0.327 0.103 0.343 0.343 ...
.. .. ..- attr(*, "names")= chr [1:9970] "chr11:84957209:G:C" "chr11:84957879:A:G" "chr11:84958236:C:A" "chr11:84958954:T:C" ...
..$ X : num [1:150, 1:9970] 0 0 1 1 1 0 0 1 1 1 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:150] "sample0" "sample1" "sample10" "sample100" ...
.. .. ..$ : chr [1:9970] "chr11:84957209:G:C" "chr11:84957879:A:G" "chr11:84958236:C:A" "chr11:84958954:T:C" ...
..$ chrom : chr "chr11"
..$ grange : chr [1:2] "85957174" "86069881"
..$ X_variance :List of 1
.. ..$ : Named num [1:9970] 0.267 0.262 0.134 0.279 0.279 ...
.. .. ..- attr(*, "names")= chr [1:9970] "chr11:84957209:G:C" "chr11:84957879:A:G" "chr11:84958236:C:A" "chr11:84958954:T:C" ...
*.univariate_twas_weights.rds
> str(readRDS("/restricted/projectnb/xqtl/xqtl_protocol/toy_xqtl_protocol/output/mnm_regression/susie_twas/twas_weights/test_susie_twas.chr11_ENSG00000073921.univariate_twas_weights.rds"))
List of 1
$ ENSG00000073921:List of 1
..$ test_pheno_ENSG00000073921:List of 7
.. ..$ susie_weights_intermediate:List of 4
.. .. ..$ mu : num [1:8, 1:6454] 0.000134 0.000169 0.000177 0.000169 0.000152 ...
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : NULL
.. .. .. .. ..$ : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. ..$ lbf_variable : num [1:8, 1:6454] -0.0334 -0.0427 -0.045 -0.0428 -0.0381 ...
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : NULL
.. .. .. .. ..$ : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. ..$ X_column_scale_factors: Named num [1:6454] 0.664 0.438 0.677 0.702 0.677 ...
.. .. .. ..- attr(*, "names")= chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. ..$ pip : Named num [1:6454] 0.0012 0.00124 0.0012 0.0012 0.0012 ...
.. .. .. ..- attr(*, "names")= chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. ..$ twas_weights :List of 6
.. .. ..$ enet_weights : num [1:6454, 1] 0 0 0 0 0 0 0 0 0 0 ...
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. .. .. ..$ : chr "ENSG00000073921"
.. .. ..$ lasso_weights : num [1:6454, 1] 0 0 0 0 0 0 0 0 0 0 ...
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. .. .. ..$ : chr "ENSG00000073921"
.. .. ..$ bayes_r_weights: num [1:6454, 1] 4.31e-05 4.73e-04 2.44e-04 -8.48e-05 2.30e-04 ...
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. .. .. ..$ : chr "ENSG00000073921"
.. .. ..$ bayes_l_weights: num [1:6454, 1] 2.04e-05 3.66e-04 1.03e-04 -1.54e-04 3.47e-04 ...
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. .. .. ..$ : chr "ENSG00000073921"
.. .. ..$ mrash_weights : num [1:6454, 1] 6.01e-05 2.05e-04 7.02e-05 -4.71e-05 7.02e-05 ...
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. .. .. ..$ : chr "ENSG00000073921"
.. .. ..$ susie_weights : num [1:6454, 1] 2.55e-07 5.89e-06 1.00e-06 -7.53e-07 1.00e-06 ...
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. .. .. .. ..$ : chr "ENSG00000073921"
.. ..$ twas_predictions :List of 6
.. .. ..$ enet_predicted : num [1:150, 1] 0 0 0 0 0 0 0 0 0 0 ...
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:150] "sample0" "sample1" "sample10" "sample100" ...
.. .. .. .. ..$ : chr "ENSG00000073921"
.. .. ..$ lasso_predicted : num [1:150, 1] 0 0 0 0 0 0 0 0 0 0 ...
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:150] "sample0" "sample1" "sample10" "sample100" ...
.. .. .. .. ..$ : chr "ENSG00000073921"
.. .. ..$ bayes_r_predicted: num [1:150, 1] 0.1971 0.1333 -0.0176 0.1234 0.2194 ...
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:150] "sample0" "sample1" "sample10" "sample100" ...
.. .. .. .. ..$ : chr "ENSG00000073921"
.. .. ..$ bayes_l_predicted: num [1:150, 1] 0.20151 0.13565 -0.00533 0.13823 0.23318 ...
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:150] "sample0" "sample1" "sample10" "sample100" ...
.. .. .. .. ..$ : chr "ENSG00000073921"
.. .. ..$ mrash_predicted : num [1:150, 1] 0.08183 0.05465 -0.00379 0.05346 0.09283 ...
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:150] "sample0" "sample1" "sample10" "sample100" ...
.. .. .. .. ..$ : chr "ENSG00000073921"
.. .. ..$ susie_predicted : num [1:150, 1] 0.003423 0.001808 0.000304 0.001797 0.003032 ...
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:150] "sample0" "sample1" "sample10" "sample100" ...
.. .. .. .. ..$ : chr "ENSG00000073921"
.. ..$ twas_cv_result :List of 4
.. .. ..$ sample_partition:'data.frame': 150 obs. of 2 variables:
.. .. .. ..$ Sample: chr [1:150] "sample108" "sample52" "sample103" "sample29" ...
.. .. .. ..$ Fold : int [1:150] 1 1 1 1 1 1 1 1 1 1 ...
.. .. ..$ prediction :List of 4
.. .. .. ..$ bayes_r_predicted: num [1:150, 1] 0.3606 -0.0263 0.0687 0.0989 0.1158 ...
.. .. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. .. ..$ : chr [1:150] "sample0" "sample1" "sample10" "sample100" ...
.. .. .. .. .. ..$ : chr "ENSG00000073921"
.. .. .. ..$ bayes_l_predicted: num [1:150, 1] 0.2274 -0.0389 0.0379 0.0513 0.0706 ...
.. .. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. .. ..$ : chr [1:150] "sample0" "sample1" "sample10" "sample100" ...
.. .. .. .. .. ..$ : chr "ENSG00000073921"
.. .. .. ..$ mrash_predicted : num [1:150, 1] 0.0692 -0.0021 0.0183 0.0231 0.0353 ...
.. .. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. .. ..$ : chr [1:150] "sample0" "sample1" "sample10" "sample100" ...
.. .. .. .. .. ..$ : chr "ENSG00000073921"
.. .. .. ..$ susie_predicted : num [1:150, 1] 0.05794 -0.00105 -0.01131 0 0 ...
.. .. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. .. ..$ : chr [1:150] "sample0" "sample1" "sample10" "sample100" ...
.. .. .. .. .. ..$ : chr "ENSG00000073921"
.. .. ..$ performance :List of 4
.. .. .. ..$ bayes_r_performance: num [1, 1:6] 0.022483 0.000505 -0.006248 0.784784 0.40514 ...
.. .. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. .. ..$ : chr "ENSG00000073921"
.. .. .. .. .. ..$ : chr [1:6] "corr" "rsq" "adj_rsq" "pval" ...
.. .. .. ..$ bayes_l_performance: num [1, 1:6] -8.04e-03 6.47e-05 -6.69e-03 9.22e-01 3.91e-01 ...
.. .. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. .. ..$ : chr "ENSG00000073921"
.. .. .. .. .. ..$ : chr [1:6] "corr" "rsq" "adj_rsq" "pval" ...
.. .. .. ..$ mrash_performance : num [1, 1:6] 0.05206 0.00271 -0.00403 0.52697 0.35394 ...
.. .. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. .. ..$ : chr "ENSG00000073921"
.. .. .. .. .. ..$ : chr [1:6] "corr" "rsq" "adj_rsq" "pval" ...
.. .. .. ..$ susie_performance : num [1, 1:6] -0.03878 0.0015 -0.00524 0.6375 0.35463 ...
.. .. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. .. ..$ : chr "ENSG00000073921"
.. .. .. .. .. ..$ : chr [1:6] "corr" "rsq" "adj_rsq" "pval" ...
.. .. ..$ time_elapsed : 'proc_time' Named num [1:5] 188.023 0.581 189.175 0 0
.. .. .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
.. ..$ total_time_elapsed : 'proc_time' Named num [1:5] 242.761 0.972 249.471 0.013 0.024
.. .. ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
.. ..$ variant_names : chr [1:6454] "chr11:84957209:G:C" "chr11:84958236:C:A" "chr11:84958954:T:C" "chr11:84959602:T:C" ...
.. ..$ region_info :List of 3
.. .. ..$ region_coord:'data.frame': 1 obs. of 3 variables:
.. .. .. ..$ chrom: chr "11"
.. .. .. ..$ start: int 85957174
.. .. .. ..$ end : int 86069881
.. .. ..$ grange :'data.frame': 1 obs. of 3 variables:
.. .. .. ..$ chrom: chr "11"
.. .. .. ..$ start: int 84957175
.. .. .. ..$ end : int 87360000
.. .. ..$ region_name : chr [1:2] "ENSG00000073921" "ENSG00000073921"
Minimal Working Example Steps#
i. Run the Fine-Mapping and TWAS with SuSiE#
sos run pipeline/mnm_regression.ipynb susie_twas \
--name test_susie_twas \
--genoFile output/genotype_by_chrom/wgs.merged.plink_qc.1.bed \
--phenoFile output/phenotype/phenotype_by_chrom_for_cis/bulk_rnaseq.phenotype_by_chrom_files.region_list.txt \
--covFile output/covariate/bulk_rnaseq_tmp_matrix.low_expression_filtered.outlier_removed.tmm.expression.covariates.wgs.merged.plink_qc.plink_qc.prune.pca.Marchenko_PC.gz \
--customized-association-windows reference_data/TAD/TADB_enhanced_cis.bed \
--phenotype-names test_pheno \
--max-cv-variants 5000 --ld_reference_meta_file data/ld_meta_file_with_bim.tsv \
--region-name ENSG00000049246 ENSG00000054116 ENSG00000116678 \
--save-data \
--cwd output/mnm_regression/susie_twas
INFO: Running get_analysis_regions:
Loading customized association analysis window from reference_data/TAD/TADB_enhanced_cis.bed
INFO: get_analysis_regions is completed.
INFO: get_analysis_regions output: regional_data
INFO: Running susie_twas:
INFO: susie_twas (index=0) is completed.
INFO: susie_twas (index=1) is completed.
INFO: susie_twas (index=2) is completed.
INFO: susie_twas output: /restricted/projectnb/xqtl/xqtl_protocol/toy_xqtl_protocol/output/mnm_regression/susie_twas/fine_mapping/test_susie_twas.chr1_ENSG00000049246.univariate_bvsr.rds /restricted/projectnb/xqtl/xqtl_protocol/toy_xqtl_protocol/output/mnm_regression/susie_twas/twas_weights/test_susie_twas.chr1_ENSG00000049246.univariate_twas_weights.rds... (6 items in 3 groups)
INFO: Workflow susie_twas (ID=w487c342397d09ccb) is executed successfully with 2 completed steps and 4 completed substeps.
Anticipated Results#
Univariate finemapping will produce a file containing results for the top hits and a file containing twas weights produced by susie.
ROSMAP_mega_eQTL.chr11_ENSG00000073921.univariate_bvrs.rds
:
For each gene of interest and phenotype, this file contains:
susie_fitted
variant_names
analysis_script
other quantities - information on dropped samples, if any
summary statistics (beta and se) for each variant
sample names
summary statistics for top loci
susie results trimmed - includes pip, sets, cs_corr, sets_secondary, alpha, lbf_variable, V, niter, X_column_scale_factors, mu, mu2
total time elapsed
region info
preset_variants_results
ROSMAP_mega_eQTL.chr11_ENSG00000073921.univariate_data.rds
(from the –save-data argument):
For each gene of interest, contains residuals for each sample and phenotype
see pecotmr code for description at
load_regional_association_data
function
ROSMAP_mega_eQTL.chr11_ENSG00000073921.univariate_twas_weights.rds
:
For each gene of interest and phenotype, this file contains:
twas_weights - weights for enet, lasso, bayes r, mrash and susie methods
twas_predictions - twas predictions for enet, lasso, bayes r, mrash and susie methods
twas_cv_result
total_time_elapsed
variant_names
region_info