Skip to contents

Fit the multi-functional, multi-outcome Sum of Single Effects (mfSuSiE) regression model. Each outcome m carries a per-sample functional response Y_m of length T_m; T_m may differ across outcomes, and either T_m = 1 (univariate response) or T_m > 1 (functional response on a regular grid) is supported; for the latter mfsusieR runs a per-outcome wavelet decomposition, then dispatches into susieR's IBSS workhorse with mfsusieR-specific S3 methods that handle the per-(scale, outcome) mixture-of-normals prior and residual variance.

Usage

mfsusie(
  X,
  Y,
  pos = NULL,
  L = 20,
  prior_variance_grid = NULL,
  prior_variance_scope = c("per_outcome", "per_scale", "per_scale_normal",
    "per_scale_laplace"),
  null_prior_init = 0,
  cross_outcome_prior = NULL,
  prior_weights = NULL,
  residual_variance = NULL,
  residual_variance_scope = c("per_outcome", "per_scale"),
  standardize = TRUE,
  intercept = TRUE,
  max_iter = 50,
  tol = 1e-04,
  coverage = 0.95,
  min_abs_corr = 0.5,
  L_greedy = 5,
  greedy_lbf_cutoff = 0.1,
  estimate_prior_variance = TRUE,
  convergence_method = c("pip", "elbo"),
  pip_stall_window = 5L,
  estimate_residual_variance = TRUE,
  verbose = TRUE,
  track_fit = FALSE,
  max_padded_log2 = 10,
  wavelet_basis_order = 10,
  wavelet_family = "DaubLeAsymm",
  wavelet_magnitude_cutoff = 0,
  wavelet_standardize = TRUE,
  wavelet_qnorm = FALSE,
  control_mixsqp = NULL,
  mixture_null_weight = NULL,
  alpha_thin_eps = 5e-05,
  model_init = NULL,
  small_sample_correction = FALSE,
  max_inner_em_steps = 5L,
  attach_smoothing_inputs = TRUE
)

Arguments

X

numeric matrix n x p of covariates (e.g., genotype dosages).

Y

list of length M; each element a numeric matrix n x T_m (or a length-n vector when T_m = 1).

pos

optional list of length M; each element a numeric vector of length T_m recording the sampling positions for that outcome. Defaults to seq_len(T_m) per outcome.

L

integer, maximum number of single effects to fit. Default 20. With the default L_greedy = 5, the IBSS workhorse grows the effect count from 5 toward this cap in steps of 5 until the fit saturates (min(lbf) < greedy_lbf_cutoff); set L_greedy = NULL to fit the full L effects in one round.

prior_variance_grid

optional length-K vector of mixture variances for the scale-mixture-of-normals prior. When NULL, the data-driven path (compute_marginal_bhat_shat + ash) selects the grid per outcome.

prior_variance_scope

one of "per_outcome" (default), "per_scale", "per_scale_normal", or "per_scale_laplace". Controls the mixture-weight layout and the M-step solver. "per_outcome" collapses across scales (one weight vector per outcome) and fits the K+1-component scale-mixture-of-normals prior with mixsqp; this is the default because it is roughly S_m-fold cheaper per IBSS iter (M mixsqp calls per iter instead of MS_m). "per_scale" keeps a separate K+1 weight vector per (outcome, scale), still fit with mixsqp. "per_scale_normal" and "per_scale_laplace" switch the per-(outcome, scale) prior to a 2-component point- spike- and-slab fit by ebnm::ebnm_point_normal() or ebnm::ebnm_point_laplace() respectively; they are the right choice when each wavelet scale has a single characteristic effect-size scale and the K+1-component mixsqp fit is under-determined on coarse scales.

null_prior_init

numeric in [0, 1], initial pi[null] for the scale-mixture prior on b_l at IBSS iter 1. The EM M-step overwrites it within a few iterations. Default 0.

cross_outcome_prior

optional object that controls how the per-outcome log-Bayes factors are combined into the joint log-Bayes factor used by the SER step. Each IBSS effect first computes a length-p log-BF per outcome m (one entry per variable, summed over wavelet positions); these are then reduced to a single length-p joint log-BF before the posterior alpha is taken as a softmax over variables. Defaults to outcome independence (cross_outcome_prior_independent()), under which the joint log-BF is the elementwise sum of the per-outcome log-BFs; equivalently, the joint BF is the product of per-outcome BFs. The argument is an extension point for non-independence combiners (e.g. modality-covariance priors) registered as S3 methods on combine_outcome_lbfs; only the independence combiner is shipped, so most users leave this NULL.

prior_weights

optional length-p numeric vector, the variable-selection prior. Defaults to uniform 1/p.

residual_variance

optional list of length M, initial residual variance per outcome. Defaults to the per-outcome sample variance.

residual_variance_scope

"per_outcome" (default) or "per_scale". Controls the sigma2 update shape: a single scalar per outcome (default) or one scalar per wavelet scale per outcome.

standardize

logical, scale X columns to unit variance.

intercept

logical, center X columns to mean zero.

max_iter

integer, maximum IBSS iterations. Default 50.

tol

numeric, ELBO change tolerance for convergence.

coverage

numeric in (0, 1), credible-set coverage.

min_abs_corr

numeric, minimum variable-to-variable correlation inside a credible set (CS purity threshold).

L_greedy

integer or NULL. When non-NULL (default 5), run susieR's greedy outer loop, growing the effect count from L_greedy up to L in linear steps until the fit saturates (min(lbf) < greedy_lbf_cutoff). Set NULL to fit the full L effects in one round.

greedy_lbf_cutoff

numeric saturation threshold for the greedy outer loop. Default 0.1.

estimate_prior_variance

logical. When TRUE (default), run the per-effect empirical-Bayes update of the mixture weights pi_V[[m]] per (outcome, scale) using the mixsqp solver. When FALSE, hold the prior fixed at the initial prior_variance_grid / null_prior_init; useful when collapsing mfsusie() to susieR::susie() for sanity checks.

convergence_method

one of "pip" (default) or "elbo". "pip" stops when max(abs(prev_alpha - alpha)) < tol, with stall detection via pip_stall_window. "elbo" stops when the per-iter ELBO change is below tol.

pip_stall_window

integer, number of consecutive iterations without PIP improvement after which "pip" convergence declares done. Default 5.

estimate_residual_variance

logical. When TRUE (default), update sigma2 per IBSS iteration via update_variance_components. When FALSE, hold sigma2 at the initial value across iterations; useful for sensitivity analysis.

verbose

logical, default TRUE.

track_fit

logical, retain a per-iteration tracking list on the fit. Default FALSE.

max_padded_log2

integer, log2 cap on the post-remap grid length per outcome. Default 10.

wavelet_basis_order

integer; selects the wavelet basis member within wavelet_family. For Daubechies families this equals the number of vanishing moments (higher = smoother, longer-support filter). Forwarded to wavethresh::wd's filter.number; see also filter.select. Default 10.

wavelet_family

character; selects the wavelet family. Forwarded to wavethresh::wd's family. Default "DaubLeAsymm" (Daubechies least-asymmetric, a.k.a. Symmlet).

wavelet_magnitude_cutoff

non-negative numeric. After the wavelet decomposition, each wavelet column t of the response has its median absolute value median(|Y_wd[, t]|) compared against this cutoff; columns at or below are zeroed in the data and treated as Bhat = 0, Shat = 1 at every IBSS iteration. Useful for sparse-coverage assays where high-frequency wavelet coefficients are dominated by zeros. Default 0; only columns with a strictly-zero absolute-value median are masked.

wavelet_standardize

logical. When TRUE, centers and scales each packed wavelet coefficient column before the IBSS loop, after the position-space DWT and before optional wavelet_qnorm. This mirrors the established functional fine-mapping preprocessing while preserving invertible coefficient back-transforms. Default TRUE.

wavelet_qnorm

logical. When TRUE, applies a column-wise rank-based normal quantile transform to the wavelet-domain response before the IBSS loop. Useful for non-Gaussian wavelet coefficients arising from heavy-tailed assays. Default FALSE.

control_mixsqp

optional named list of mixsqp control arguments forwarded to the per-(outcome, scale) M-step.

mixture_null_weight

numeric in [0, 1] or NULL, ratio of null-pseudo-weight to data weight in the mixsqp M-step. Regularizes the EB mixture prior toward null. 0 is unregularized MLE. Internally scaled by M so the null:data balance is invariant to outcome count; single- outcome fits are unchanged. NULL (default) resolves to 0.05 internally. Ignored on prior_variance_scope \in {"per_scale_normal", "per_scale_laplace"} (the spike weight pi_0 is fit by ebnm); a non-NULL value passed on those paths emits a warning.

alpha_thin_eps

numeric, threshold below which a variable's per-effect posterior alpha[l, j] is dropped from the M-step input. Applies to every M-step solver (mixsqp for per_outcome / per_scale, ebnm for per_scale_normal / per_scale_laplace). Set to 0 to use every variable. Default 5e-5.

model_init

optional mfsusie fit object from a prior call. When supplied, the IBSS loop is seeded from the supplied alpha, mu, mu2, KL, lbf, V, pi_V, sigma2, fitted, and intercept rather than the cold- start zero state. The supplied fit must have the same L as the requested L; otherwise the call errors. Useful for resuming a long-running fit after a per-call iteration budget. Default NULL (cold start).

small_sample_correction

logical. When TRUE, replaces the per-variable Wakefield Normal marginal Bayes factor in the SER step with a Johnson 2005 scaled Student-t marginal with df = n - 1 degrees of freedom. Useful for small sample sizes where the Wakefield approximation under-propagates residual-variance uncertainty into the per-variable BF, inflating PIPs at null variants. The correction acts on variable selection probabilities only; posterior moments given inclusion are unchanged. Default FALSE.

max_inner_em_steps

integer, number of M-step + alpha-update sweeps run inside the post-loglik prior hook per (effect, outer iter), to keep mixture pi and alpha in sync per effect per outer iter, an issue unique to the prior update methods in mixsqp based model. The per_scale_normal path already converges in few outer iters (each scale's ebnm prior is fit independently and decouples the pi-alpha drift), so the inner sweep adds little there but does no harm.

attach_smoothing_inputs

logical. When TRUE (default), the fit carries Y_grid (post-remap position-space Y) and X_eff (per-effect alpha-weighted aggregate of X) so that mf_post_smooth(fit) runs without re-passing the data. Set FALSE to drop these and call mf_post_smooth(fit, X = X, Y = Y, ...) instead; useful when sharing fits where the per-individual data should not travel with the fit.

Value

A list of class c("mfsusie", "susie") carrying:

alpha

L x p posterior inclusion probabilities.

mu, mu2

list[L] of list[M] of p x T_basis[m] matrices: per-effect, per-outcome posterior mean and second moment in the wavelet domain.

pi_V

list[M] of S_m x K mixture-weight matrices.

G_prior

list[M] of list[S_m] ash-shaped prior records (mutated by the IBSS loop).

sigma2

list[M]. Element m is either a scalar (when residual_variance_scope = "per_outcome", the default) or a length-S_m vector of per-scale residual variances (when residual_variance_scope = "per_scale"). The richer per-scale shape is mfsusieR-specific; downstream code reading sigma2 should handle both.

elbo

numeric vector of ELBO values per iteration.

niter, converged

IBSS termination metadata.

pip

length-p posterior inclusion probabilities.

sets

credible sets via susie_get_cs.

fitted

list[M] of running per-outcome fits in the wavelet domain. Used by model_init to warm-start a follow-up call.

Y_grid

list[M], the post-remap position-space response on the padded grid. Attached when attach_smoothing_inputs = TRUE (default).

X_eff

list[L] of per-effect alpha-weighted X aggregates. Attached when attach_smoothing_inputs = TRUE.

lbf_variable_outcome

L x p x M array of per-(effect, variant, outcome) log Bayes factors. Parallels lbf_variable (which is L x p, the joint composite summed across scales and outcomes); lbf_variable_outcome keeps the M axis intact. Always attached; consumed by susie_post_outcome_configuration(fit, by = "outcome").

References

Manuscript: methods/online_method.tex (mfsusieR algorithm).