Enrichment analysis of molecular QTL in genetic variants associated with complex traits
Ru Feng
2026-03-01
Source:vignettes/xqtl_enrichment.Rmd
xqtl_enrichment.RmdThis vignette demonstrates the enrichment analysis part of the
pecotmr package, which largely follows from
fastENLOC (https://github.com/xqwen/fastenloc) but uses
susieR fitted objects as input to estimate priors for use
with the coloc package (coloc v5, aka SuSiE-coloc). The
main differences are: 1) now enrichment is based on all QTL variants
whether or not they are inside signal clusters; 2) Causal QTL are
sampled from SuSiE single effects, not signal clusters; 3) Allow a
variant to be QTL for not only multiple conditions (eg cell types) but
also multiple regions (eg genes).
The example uses de-identified data shipped with the package. All sample names, variant positions, and identifiers are synthetic.
Set up environment and Load data
Load the pecotmr package for this enrichment
analysis.
Load fine-mapping results from the package example data and write
them to temporary RDS files, since xqtl_enrichment_wrapper
expects file paths.
Input data requirements
GWAS data requirements
In the GWAS data, the file is stored as a list, with the fine mapping results already stored under the first layer.
names(gwas[[1]])## [1] "alpha" "pip" "V" "sets"
Posterior inclusion probabilities of GWAS data:
plot(gwas[[1]]$pip, ylab = "PIP", xlab = "Variant index", main = "GWAS PIPs", pch = 16, cex = 0.6)
QTL data requirements
In the QTL data, the file is stored as a list and is more
complicated. The first layer is named as the molecular trait region ID,
followed by different contexts as the second layer. The variant names
and fine mapping results can be found under these contexts
(variant_names and susie_result_trimmed).
names(qtl)## [1] "region_1"
names(qtl[["region_1"]])## [1] "context_1"
names(qtl[["region_1"]][["context_1"]])## [1] "susie_result_trimmed" "variant_names"
names(qtl[["region_1"]][["context_1"]][["susie_result_trimmed"]])## [1] "alpha" "pip" "V" "sets"
Posterior inclusion probabilities of QTL data:
plot(qtl[["region_1"]][["context_1"]][["susie_result_trimmed"]]$pip,
ylab = "PIP", xlab = "Variant index", main = "QTL PIPs", pch = 16, cex = 0.6)
Perform enrichment analysis
The input for xqtl_enrichment_wrapper requires the paths
for GWAS and QTL fine mapping data. You need to specify the list name of
SuSiE results in each dataset as xqtl_finemapping_obj and
the list name of variant names as xqtl_varname_obj. For
example, if SuSiE results for QTL data are under
qtl[["region_1"]][["context_1"]][["susie_result_trimmed"]],
then xqtl_finemapping_obj would be
c("context_1", "susie_result_trimmed"), and
xqtl_varname_obj would be
c("context_1", "variant_names") since the variant names are
stored directly under context_1. For GWAS data, the fine
mapping results and variant names are stored under the first layer
directly, so you do not need to specify
gwas_finemapping_obj and gwas_varname_obj.
However, if your data does not follow this structure, you may need to
specify these parameters accordingly.
enrich_res <- xqtl_enrichment_wrapper(
gwas_files = gwas_path, xqtl_files = qtl_path,
xqtl_finemapping_obj = c("context_1", "susie_result_trimmed"),
xqtl_varname_obj = c("context_1", "variant_names")
)## Warning in compute_qtl_enrichment(gwas_pip = dat$gwas_pip, susie_qtl_regions =
## dat$xqtl_data, : num_gwas is not provided. Estimating pi_gwas from the data.
## Note that this estimate may be biased if the input gwas_pip does not contain
## genome-wide variants.
## Estimated pi_gwas: 0.00141
## Warning in compute_qtl_enrichment(gwas_pip = dat$gwas_pip, susie_qtl_regions =
## dat$xqtl_data, : pi_qtl is not provided. Estimating pi_qtl from the data. Note
## that this estimate may be biased if either 1) the input susie_qtl_regions does
## not have enough data, or 2) the single effects only include variables inside of
## credible sets or signal clusters.
## Estimated pi_qtl: 0.00071
## Fine-mapped GWAS and QTL data loaded successfully for enrichment analysis!
## Proportion of xQTL missing from GWAS variants: 0 in MI round 0
## Proportion of xQTL missing from GWAS variants: 0 in MI round 1
## Proportion of xQTL missing from GWAS variants: 0 in MI round 2
## Proportion of xQTL missing from GWAS variants: 0 in MI round 3
## Proportion of xQTL missing from GWAS variants: 0 in MI round 4
## Proportion of xQTL missing from GWAS variants: 0 in MI round 5
## Proportion of xQTL missing from GWAS variants: 0 in MI round 6
## Proportion of xQTL missing from GWAS variants: 0 in MI round 7
## Proportion of xQTL missing from GWAS variants: 0 in MI round 8
## Proportion of xQTL missing from GWAS variants: 0 in MI round 9
## Proportion of xQTL missing from GWAS variants: 0 in MI round 10
## Proportion of xQTL missing from GWAS variants: 0 in MI round 11
## Proportion of xQTL missing from GWAS variants: 0 in MI round 12
## Proportion of xQTL missing from GWAS variants: 0 in MI round 13
## Proportion of xQTL missing from GWAS variants: 0 in MI round 14
## Proportion of xQTL missing from GWAS variants: 0 in MI round 15
## Proportion of xQTL missing from GWAS variants: 0 in MI round 16
## Proportion of xQTL missing from GWAS variants: 0 in MI round 17
## Proportion of xQTL missing from GWAS variants: 0 in MI round 18
## Proportion of xQTL missing from GWAS variants: 0 in MI round 19
## Proportion of xQTL missing from GWAS variants: 0 in MI round 20
## Proportion of xQTL missing from GWAS variants: 0 in MI round 21
## Proportion of xQTL missing from GWAS variants: 0 in MI round 22
## Proportion of xQTL missing from GWAS variants: 0 in MI round 23
## Proportion of xQTL missing from GWAS variants: 0 in MI round 24
## EM updates completed!
print(enrich_res[[1]])## $`Alternative (coloc) p1`
## [1] 0.001413425
##
## $`Alternative (coloc) p12`
## [1] 1.000238e-06
##
## $`Alternative (coloc) p2`
## [1] 0.0007061743
##
## $`Effective MI rounds`
## [1] 24
##
## $`Enrichment (no shrinkage)`
## [1] -7.950078
##
## $`Enrichment (w/ shrinkage)`
## [1] -7.953078e-06
##
## $Intercept
## [1] -6.559616
##
## $`sd (intercept)`
## [1] 0.5002662
##
## $`sd (no shrinkage)`
## [1] 999.8109
##
## $`sd (w/ shrinkage)`
## [1] 0.9999995
## NULL
The enrichment analysis output is stored as a 2-layer list. The first layer contains the enrichment results, which will be used as priors in coloc analysis. The second layer stores the QTL variants that were not detected in the GWAS dataset.
Session information
Here are some details about the computing environment, including the versions of R, and the R packages, used to generate these results.
## R version 4.4.3 (2025-02-28)
## Platform: x86_64-conda-linux-gnu
## Running under: Ubuntu 24.04.3 LTS
##
## Matrix products: default
## BLAS/LAPACK: /tmp/pixi/.pixi/envs/r44/lib/libopenblasp-r0.3.30.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
## [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
## [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
## [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] pecotmr_0.3.21
##
## loaded via a namespace (and not attached):
## [1] tidyselect_1.2.1 viridisLite_0.4.3 bigsnpr_1.12.21
## [4] dplyr_1.2.0 farver_2.1.2 viridis_0.6.5
## [7] fastmap_1.2.0 reshape_0.8.10 bigassertr_0.1.7
## [10] flock_0.7 digest_0.6.39 lifecycle_1.0.5
## [13] magrittr_2.0.4 compiler_4.4.3 rlang_1.1.7
## [16] sass_0.4.10 rngtools_1.5.2 tools_4.4.3
## [19] yaml_2.3.12 data.table_1.17.8 knitr_1.51
## [22] doRNG_1.8.6.3 htmlwidgets_1.6.4 bit_4.6.0
## [25] plyr_1.8.9 RColorBrewer_1.1-3 rmio_0.4.0
## [28] coloc_5.2.3 withr_3.0.2 bigsparser_0.7.3
## [31] purrr_1.2.1 bigstatsr_1.6.2 BiocGenerics_0.52.0
## [34] desc_1.4.3 grid_4.4.3 stats4_4.4.3
## [37] susieR_0.15.51 future_1.69.0 ggplot2_3.5.2
## [40] globals_0.19.0 scales_1.4.0 iterators_1.0.14
## [43] cli_3.6.5 rmarkdown_2.30 crayon_1.5.3
## [46] ragg_1.5.0 generics_0.1.4 otel_0.2.0
## [49] future.apply_1.20.2 tzdb_0.5.0 cachem_1.1.0
## [52] stringr_1.6.0 parallel_4.4.3 matrixStats_1.5.0
## [55] vctrs_0.7.1 Matrix_1.7-4 jsonlite_2.0.0
## [58] IRanges_2.40.0 hms_1.1.4 S4Vectors_0.44.0
## [61] mixsqp_0.3-54 bit64_4.6.0-1 irlba_2.3.7
## [64] listenv_0.10.0 systemfonts_1.3.1 foreach_1.5.2
## [67] tidyr_1.3.2 jquerylib_0.1.4 bigparallelr_0.3.2
## [70] glue_1.8.0 parallelly_1.46.1 pkgdown_2.2.0
## [73] codetools_0.2-20 cowplot_1.2.0 stringi_1.8.7
## [76] gtable_0.3.6 doFuture_1.2.1 tibble_3.3.1
## [79] pillar_1.11.1 furrr_0.3.1 htmltools_0.5.9
## [82] R6_2.6.1 textshaping_1.0.4 doParallel_1.0.17
## [85] vroom_1.6.7 evaluate_1.0.5 lattice_0.22-9
## [88] readr_2.2.0 bslib_0.10.0 Rcpp_1.1.1
## [91] gridExtra_2.3 xfun_0.56 fs_1.6.6
## [94] pkgconfig_2.0.3