Joint multi-outcome fine-mapping
Gao Wang, Anjing Liu and William Denault
Source:vignettes/mfsusie_intro.Rmd
mfsusie_intro.Rmdmfsusie() extends fsusie() from one
functional response to many. Each outcome m is a matrix
n x T_m; outcomes are typically functional (DNAm tracks,
RNA-seq coverage, ATAC-seq profiles) and T_m may differ
across outcomes. The fit returns one credible set list and one PIP
vector per variable, combining evidence from every outcome. Scalar
outcomes (T_m = 1) are supported through the same interface
but susieR::susie() is typically the better tool for that
case; the strength of mfsusie() is joint fine-mapping
across multiple functional layers.
Multi-outcome example
multiomic_example is a four-outcome simulated cis-QTL
example. We use the two functional outcomes here: DNAm with
T = 64 irregular CpGs and RNA-seq with T = 32
exon-body positions. Two causal SNPs are shared across both outcomes;
per-outcome shapes and signs differ. Genotypes use the
susieR::N3finemapping LD scaffold for realistic
correlation.
Fit
fit <- mfsusie(multiomic_example$X, Y_list, pos = pos_list,
L = 15, L_greedy = 5,
verbose = TRUE)
#> HINT: ncol(Y) is not 2^J or positions are unevenly spaced; interpolated to a regular dyadic grid.
#> iter ELBO delta sigma2 mem V extras
#> 1 -77467.8280 - [0.998, 0.998, 0.999] 0.16 GB [1.00e+00, 1.00e+00, 1.00e+00, 1.00e+00, 1.00e+00] pi_null=[1.00, 1.00]
#> iter 2: max|dPIP|=1.19e-02, V=[1.00e+00, 1.00e+00, 1.00e+00, 1.00e+00, 1.00e+00] [mem: 0.16 GB]
#> iter 3: max|dPIP|=4.48e-04, V=[1.00e+00, 1.00e+00, 1.00e+00, 1.00e+00, 1.00e+00] [mem: 0.16 GB]
#> iter 4: max|dPIP|=1.04e-05, V=[1.00e+00, 1.00e+00, 1.00e+00, 1.00e+00, 1.00e+00] -- converged [mem: 0.16 GB]
#> [L_greedy] 1 round, greedy_lbf_cutoff=0.100, final L=5
#> round L min(lbf) action
#> 1 5 0.000 saturated
fit$niter; fit$converged
#> [1] 4
#> [1] TRUE
length(fit$sets$cs)
#> [1] 2
fit$pip[multiomic_example$causal_snps]
#> [1] 1.0000000 0.9938718Post-hoc per-outcome causal-configuration assessment
For each credible set, ask which outcomes share the causal variant.
susie_post_outcome_configuration(fit, by = "outcome", method = ...)
runs one of two complementary post-hoc analyses on the joint mfsusie
fit; pick the one you want via method and call again to get
the other:
-
method = "susiex"(default) – combinatorial enumeration: posterior probability of each of the2^Mbinary “which-outcomes-share-the-causal” patterns, plus a per-outcome marginal probability of being active. One list element per CS index shared across all outcomes (so forMoutcomes that shareKCSes,length(configs$susiex) == K). -
method = "coloc_pairwise"– standard H0/H1/H2/H3/H4 coloc posteriors for every outcome pair x CS pair (one data.frame).
summary() tidies either result into a colour-coded
table: green for active / shared, yellow for ambiguous, dim for
below-threshold. By default it filters to rows that show signal (pass
signal_only = FALSE to see everything).
configs <- susieR::susie_post_outcome_configuration(fit, by = "outcome",
method = "susiex")
summary(configs)#>
#> SuSiEx: per-trait marginal P(active) per CS tuple
#> prob_thresh = 0.80, ambiguous = [0.50, 0.80)
#>
#> CS tuple trait_1_outcome_1 trait_1_outcome_2 top pattern top P
#> -------- ----------------- ----------------- ----------- -----
#> (1,1) 1.000 1.000 11 1.000
#> (2,2) 1.000 1.000 11 1.000
For the coloc_pairwise analysis the table has one row
per (outcome pair) x (CS pair), which on a 4-outcome fit with 3 shared
CSes means choose(4, 2) * 3 * 3 = 54 rows. We pass
signal_only = TRUE (the default) so H0-dominant rows drop
out and we see only the pairs that carry colocalisation evidence:
pairwise <- susieR::susie_post_outcome_configuration(fit, by = "outcome",
method = "coloc_pairwise")
summary(pairwise)#>
#> Coloc pairwise: dominant hypothesis per (trait, trait', l1, l2)
#> H0 no signal | H1 trait1-only | H2 trait2-only | H3 distinct | H4 shared
#>
#> trait1 trait2 l1 l2 hit1 hit2 PP.H0 PP.H1 PP.H2 PP.H3 PP.H4 verdict
#> ----------------- ----------------- -- -- ------ ------ ----- ----- ----- ----- ----- -----------
#> trait_1_outcome_1 trait_1_outcome_2 1 1 snp_97 snp_97 0.000 0.000 0.000 0.001 0.999 H4 shared
#> trait_1_outcome_1 trait_1_outcome_2 1 2 snp_97 snp_42 0.000 0.000 0.000 1.000 0.000 H3 distinct
#> trait_1_outcome_1 trait_1_outcome_2 2 1 snp_42 snp_97 0.000 0.000 0.000 1.000 0.000 H3 distinct
#> trait_1_outcome_1 trait_1_outcome_2 2 2 snp_42 snp_42 0.000 0.000 0.000 0.000 1.000 H4 shared
Index the raw objects directly when you need full per-CS detail:
configs$susiex[[1L]]$marginal_prob # per-outcome marginal P(active)
#> trait_1_outcome_1 trait_1_outcome_2
#> 1 1
configs$susiex[[1L]]$config_prob # 2^M configuration probabilities
#> [1] 2.295544e-260 1.042913e-108 2.201089e-152 1.000000e+00
head(pairwise$coloc_pairwise) # raw coloc data.frame
#> trait1 trait2 l1 l2 hit1 hit2 PP.H0
#> 1 trait_1_outcome_1 trait_1_outcome_2 1 1 snp_97 snp_97 4.416536e-255
#> 2 trait_1_outcome_1 trait_1_outcome_2 1 2 snp_97 snp_42 5.760360e-184
#> 3 trait_1_outcome_1 trait_1_outcome_2 2 1 snp_42 snp_97 1.436498e-160
#> 4 trait_1_outcome_1 trait_1_outcome_2 2 2 snp_42 snp_42 9.408969e-95
#> PP.H1 PP.H2 PP.H3 PP.H4
#> 1 2.127384e-107 5.212850e-151 0.0005119812 9.994880e-01
#> 2 2.774685e-36 2.076041e-148 1.0000000000 1.066998e-38
#> 3 8.472402e-105 1.695502e-56 1.0000000000 9.699808e-62
#> 4 5.549370e-39 3.391004e-59 0.0000000000 1.000000e+00Inspecting the fit
mfsusie_plot() shows everything in one call: PIPs in the
top-left tile (CS-colored), then one effect-curve panel per outcome with
TI-smoothed curves, 95% credible bands, and affected-region bars at the
bottom of each panel.
fit_s <- mf_post_smooth(fit, method = "TI")When the fit has both M > 1 outcomes and several
credible sets, facet_cs = "stack" lays out one effect panel
per (outcome, CS) pair stacked vertically below the PIP
panel, which keeps the per-CS curves at full width. The default
facet_cs = "auto" falls back to a per-outcome overlay grid
when M > 1; switch to "stack" when you want
a separated view per CS.
mfsusie_plot_dimensions() returns a
(width, height) recommendation in inches sized to the
number of panels the fit will produce. Pass it through to the chunk so
each cell keeps a legible amount of vertical space.
plot_dims <- mfsusie_plot_dimensions(fit_s, facet_cs = "stack")
plot_dims
#> $width
#> [1] 8
#>
#> $height
#> [1] 15
#>
#> $n_cells
#> [1] 5
mfsusie_plot(fit_s, facet_cs = "stack",
effect_variables = multiomic_example$causal_snps)
fit$sets$cs
#> $L1
#> [1] 97
#>
#> $L2
#> [1] 42coef(fit) returns a list of length M.
Element m is an L x T_m matrix containing the
per-effect curve for outcome m.
Choosing prior_variance_scope
-
"per_outcome"(default; used above): one mixture-of-normals prior shared across every wavelet scale of an outcome. Appropriate when scales are exchangeable a priori, and roughlyS_m-fold cheaper per IBSS iter (one mixsqp call per outcome instead of one per (outcome, scale)). -
"per_scale": one mixture per (outcome, scale), letting different scales adapt independently. More flexible at the cost of anS_m-fold larger M-step; useful when shape varies systematically with scale.
Switch between them by changing one argument; the rest of the API and the fit object structure are identical.
Wavelet-domain preprocessing
Two arguments control preprocessing of the per-outcome wavelet matrix before the IBSS loop:
-
wavelet_magnitude_cutoff(default0): wavelet columns withmedian(|column|) <= wavelet_magnitude_cutoffare masked, contributing(Bhat = 0, Shat = 1)to the SER step. Use a small positive value when the response has sparse non-zero entries (e.g., ATAC-seq read pileups). -
wavelet_qnorm(defaultTRUE): applies a column-wise rank-based normal quantile transform to the wavelet matrix. Useful when the per-outcome residuals are heavy-tailed or visibly non-Gaussian. Set toFALSEto skip the transform when the wavelet coefficients are already approximately Gaussian.
Both flags route through mf_adjust_for_covariates() as
well.
When to reach for mfSuSiE over fSuSiE+COLOC
Use mfsusie() when:
- A variable plausibly affects multiple distinct phenotypes (different tissues, different molecular layers).
- The phenotypes have heterogeneous shapes (some scalar, some
functional, some at different sampling resolutions). mfSuSiE does not
require a shared
T. - You want a single credible set list combining all the evidence, rather than ad-hoc colocalization across separate fsusie fits. ## Multi-trait data with missing samples
In multi-tissue or multi-cell-type studies some samples are not
profiled for every outcome. Set the corresponding rows of
Y[[m]] to NA before calling
mfsusie(); the package automatically restricts every
regression and variance update to the observed rows for that trait.
Both wavelet_qnorm = TRUE (the default) and
wavelet_qnorm = FALSE handle missing rows correctly.
set.seed(42)
n <- 80; p <- 60; T_m <- 8; M <- 3
X_miss <- matrix(rbinom(n * p, 2, 0.3), n, p)
X_miss <- scale(X_miss)
# Plant one causal SNP shared across all three traits.
causal_j <- 5L
signal <- as.numeric(X_miss[, causal_j]) * 2
Y_miss <- lapply(seq_len(M), function(m)
matrix(signal, n, T_m) + matrix(rnorm(n * T_m, sd = 0.5), n, T_m))
# Trait 2 is missing 20 % of rows; trait 3 is missing 30 %.
Y_miss[[2]][sample(n, 16), ] <- NA
Y_miss[[3]][sample(n, 24), ] <- NA
# Per-trait observed sample sizes after masking.
sapply(Y_miss, function(y) sum(complete.cases(y)))
#> [1] 80 64 56
wavelet_qnorm = FALSE
t_noq <- system.time(
fit_noq <- mfsusie(X_miss, Y_miss, L = 10,
wavelet_qnorm = FALSE, verbose = FALSE)
)
cat(sprintf("Runtime: %.1f s niter: %d CS: %d\n",
t_noq["elapsed"], fit_noq$niter, length(fit_noq$sets$cs)))
#> Runtime: 0.2 s niter: 2 CS: 1
fit_noq$pip[causal_j] # causal PIP
#> [1] 1
wavelet_qnorm = TRUE (default)
t_q <- system.time(
fit_q <- mfsusie(X_miss, Y_miss, L = 10, verbose = FALSE)
)
cat(sprintf("Runtime: %.1f s niter: %d CS: %d\n",
t_q["elapsed"], fit_q$niter, length(fit_q$sets$cs)))
#> Runtime: 0.2 s niter: 2 CS: 1
fit_q$pip[causal_j]
#> [1] 1Runtime summary
data.frame(
path = c("qnorm=FALSE", "qnorm=TRUE"),
runtime = c(t_noq["elapsed"], t_q["elapsed"]),
niter = c(fit_noq$niter, fit_q$niter),
n_cs = c(length(fit_noq$sets$cs), length(fit_q$sets$cs)),
causal_pip = c(fit_noq$pip[causal_j], fit_q$pip[causal_j])
)
#> path runtime niter n_cs causal_pip
#> 1 qnorm=FALSE 0.233 2 1 1
#> 2 qnorm=TRUE 0.224 2 1 1Session info
This is the version of R and the packages that were used to generate these results.
sessionInfo()
#> R version 4.4.3 (2025-02-28)
#> Platform: x86_64-conda-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#>
#> Matrix products: default
#> BLAS/LAPACK: /home/runner/work/mfsusieR/mfsusieR/.pixi/envs/r44/lib/libopenblasp-r0.3.32.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
#> [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
#> [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
#> [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] susieR_0.16.1 mfsusieR_0.0.2
#>
#> loaded via a namespace (and not attached):
#> [1] generics_0.1.4 sass_0.4.10 ashr_2.2-63
#> [4] lattice_0.22-9 magrittr_2.0.5 digest_0.6.39
#> [7] evaluate_1.0.5 grid_4.4.3 RColorBrewer_1.1-3
#> [10] fastmap_1.2.0 plyr_1.8.9 jsonlite_2.0.0
#> [13] Matrix_1.7-5 reshape_0.8.10 mixsqp_0.3-54
#> [16] fansi_1.0.7 scales_1.4.0 truncnorm_1.0-9
#> [19] invgamma_1.2 textshaping_1.0.5 jquerylib_0.1.4
#> [22] cli_3.6.6 rlang_1.2.0 zigg_0.0.2
#> [25] crayon_1.5.3 LaplacesDemon_16.1.8 cachem_1.1.0
#> [28] yaml_2.3.12 otel_0.2.0 tools_4.4.3
#> [31] SQUAREM_2026.1 parallel_4.4.3 dplyr_1.2.1
#> [34] wavethresh_4.7.3 ggplot2_4.0.3 Rfast_2.1.5.2
#> [37] vctrs_0.7.3 R6_2.6.1 matrixStats_1.5.0
#> [40] lifecycle_1.0.5 fs_2.1.0 htmlwidgets_1.6.4
#> [43] MASS_7.3-65 ragg_1.5.2 irlba_2.3.7
#> [46] pkgconfig_2.0.3 desc_1.4.3 pillar_1.11.1
#> [49] pkgdown_2.2.0 RcppParallel_5.1.11-2 bslib_0.10.0
#> [52] gtable_0.3.6 glue_1.8.1 Rcpp_1.1.1-1.1
#> [55] systemfonts_1.3.2 tidyselect_1.2.1 tibble_3.3.1
#> [58] xfun_0.57 knitr_1.51 dichromat_2.0-0.1
#> [61] farver_2.1.2 htmltools_0.5.9 rmarkdown_2.31
#> [64] compiler_4.4.3 S7_0.2.2