Fit the multi-functional, multi-outcome Sum of Single Effects
(mfSuSiE) regression model. Each outcome m carries a per-sample
functional response Y_m of length T_m; T_m may differ across
outcomes, and either T_m = 1 (univariate response) or T_m > 1
(functional response on a regular grid) is supported; for the latter
mfsusieR runs a per-outcome wavelet decomposition, then dispatches into susieR's
IBSS workhorse with mfsusieR-specific S3 methods that handle the
per-(scale, outcome) mixture-of-normals prior and residual
variance.
Usage
mfsusie(
X,
Y,
pos = NULL,
L = 20,
prior_variance_grid = NULL,
prior_variance_scope = c("per_outcome", "per_scale", "per_scale_normal",
"per_scale_laplace"),
null_prior_init = 0,
cross_outcome_prior = NULL,
prior_weights = NULL,
residual_variance = NULL,
residual_variance_scope = c("per_outcome", "per_scale"),
standardize = TRUE,
intercept = TRUE,
max_iter = 50,
tol = 1e-04,
coverage = 0.95,
min_abs_corr = 0.5,
L_greedy = 5,
greedy_lbf_cutoff = 0.1,
estimate_prior_variance = TRUE,
convergence_method = c("pip", "elbo"),
pip_stall_window = 5L,
estimate_residual_variance = TRUE,
verbose = TRUE,
track_fit = FALSE,
max_padded_log2 = 10,
wavelet_basis_order = 10,
wavelet_family = "DaubLeAsymm",
wavelet_magnitude_cutoff = 0,
wavelet_standardize = TRUE,
wavelet_qnorm = FALSE,
control_mixsqp = NULL,
mixture_null_weight = NULL,
alpha_thin_eps = 5e-05,
model_init = NULL,
small_sample_correction = FALSE,
max_inner_em_steps = 5L,
attach_smoothing_inputs = TRUE
)Arguments
- X
numeric matrix
n x pof covariates (e.g., genotype dosages).- Y
list of length
M; each element a numeric matrixn x T_m(or a length-n vector whenT_m = 1).- pos
optional list of length
M; each element a numeric vector of lengthT_mrecording the sampling positions for that outcome. Defaults toseq_len(T_m)per outcome.- L
integer, maximum number of single effects to fit. Default
20. With the defaultL_greedy = 5, the IBSS workhorse grows the effect count from 5 toward this cap in steps of 5 until the fit saturates (min(lbf) < greedy_lbf_cutoff); setL_greedy = NULLto fit the fullLeffects in one round.- prior_variance_grid
optional length-K vector of mixture variances for the scale-mixture-of-normals prior. When
NULL, the data-driven path (compute_marginal_bhat_shat+ash) selects the grid per outcome.- prior_variance_scope
one of
"per_outcome"(default),"per_scale","per_scale_normal", or"per_scale_laplace". Controls the mixture-weight layout and the M-step solver."per_outcome"collapses across scales (one weight vector per outcome) and fits the K+1-component scale-mixture-of-normals prior withmixsqp; this is the default because it is roughlyS_m-fold cheaper per IBSS iter (M mixsqp calls per iter instead of MS_m)."per_scale"keeps a separate K+1 weight vector per (outcome, scale), still fit withmixsqp."per_scale_normal"and"per_scale_laplace"switch the per-(outcome, scale) prior to a 2-component point- spike- and-slab fit byebnm::ebnm_point_normal()orebnm::ebnm_point_laplace()respectively; they are the right choice when each wavelet scale has a single characteristic effect-size scale and the K+1-component mixsqp fit is under-determined on coarse scales.- null_prior_init
numeric in
[0, 1], initialpi[null]for the scale-mixture prior onb_lat IBSS iter 1. The EM M-step overwrites it within a few iterations. Default0.- cross_outcome_prior
optional object that controls how the per-outcome log-Bayes factors are combined into the joint log-Bayes factor used by the SER step. Each IBSS effect first computes a length-
plog-BF per outcomem(one entry per variable, summed over wavelet positions); these are then reduced to a single length-pjoint log-BF before the posterioralphais taken as a softmax over variables. Defaults to outcome independence (cross_outcome_prior_independent()), under which the joint log-BF is the elementwise sum of the per-outcome log-BFs; equivalently, the joint BF is the product of per-outcome BFs. The argument is an extension point for non-independence combiners (e.g. modality-covariance priors) registered as S3 methods oncombine_outcome_lbfs; only the independence combiner is shipped, so most users leave thisNULL.- prior_weights
optional length-p numeric vector, the variable-selection prior. Defaults to uniform
1/p.- residual_variance
optional list of length
M, initial residual variance per outcome. Defaults to the per-outcome sample variance.- residual_variance_scope
"per_outcome"(default) or"per_scale". Controls the sigma2 update shape: a single scalar per outcome (default) or one scalar per wavelet scale per outcome.- standardize
logical, scale
Xcolumns to unit variance.- intercept
logical, center
Xcolumns to mean zero.- max_iter
integer, maximum IBSS iterations. Default
50.- tol
numeric, ELBO change tolerance for convergence.
- coverage
numeric in (0, 1), credible-set coverage.
- min_abs_corr
numeric, minimum variable-to-variable correlation inside a credible set (CS purity threshold).
- L_greedy
integer or
NULL. When non-NULL(default5), run susieR's greedy outer loop, growing the effect count fromL_greedyup toLin linear steps until the fit saturates (min(lbf) < greedy_lbf_cutoff). SetNULLto fit the fullLeffects in one round.- greedy_lbf_cutoff
numeric saturation threshold for the greedy outer loop. Default 0.1.
- estimate_prior_variance
logical. When
TRUE(default), run the per-effect empirical-Bayes update of the mixture weightspi_V[[m]]per (outcome, scale) using themixsqpsolver. WhenFALSE, hold the prior fixed at the initialprior_variance_grid/null_prior_init; useful when collapsingmfsusie()tosusieR::susie()for sanity checks.- convergence_method
one of
"pip"(default) or"elbo"."pip"stops whenmax(abs(prev_alpha - alpha)) < tol, with stall detection viapip_stall_window."elbo"stops when the per-iter ELBO change is belowtol.- pip_stall_window
integer, number of consecutive iterations without PIP improvement after which
"pip"convergence declares done. Default5.- estimate_residual_variance
logical. When
TRUE(default), updatesigma2per IBSS iteration viaupdate_variance_components. WhenFALSE, holdsigma2at the initial value across iterations; useful for sensitivity analysis.- verbose
logical, default
TRUE.- track_fit
logical, retain a per-iteration tracking list on the fit. Default
FALSE.- max_padded_log2
integer, log2 cap on the post-remap grid length per outcome. Default 10.
- wavelet_basis_order
integer; selects the wavelet basis member within
wavelet_family. For Daubechies families this equals the number of vanishing moments (higher = smoother, longer-support filter). Forwarded towavethresh::wd'sfilter.number; see alsofilter.select. Default10.- wavelet_family
character; selects the wavelet family. Forwarded to
wavethresh::wd'sfamily. Default"DaubLeAsymm"(Daubechies least-asymmetric, a.k.a. Symmlet).- wavelet_magnitude_cutoff
non-negative numeric. After the wavelet decomposition, each wavelet column
tof the response has its median absolute valuemedian(|Y_wd[, t]|)compared against this cutoff; columns at or below are zeroed in the data and treated asBhat = 0,Shat = 1at every IBSS iteration. Useful for sparse-coverage assays where high-frequency wavelet coefficients are dominated by zeros. Default0; only columns with a strictly-zero absolute-value median are masked.- wavelet_standardize
logical. When
TRUE, centers and scales each packed wavelet coefficient column before the IBSS loop, after the position-space DWT and before optionalwavelet_qnorm. This mirrors the established functional fine-mapping preprocessing while preserving invertible coefficient back-transforms. DefaultTRUE.- wavelet_qnorm
logical. When
TRUE, applies a column-wise rank-based normal quantile transform to the wavelet-domain response before the IBSS loop. Useful for non-Gaussian wavelet coefficients arising from heavy-tailed assays. DefaultFALSE.- control_mixsqp
optional named list of
mixsqpcontrol arguments forwarded to the per-(outcome, scale) M-step.- mixture_null_weight
numeric in
[0, 1]orNULL, ratio of null-pseudo-weight to data weight in the mixsqp M-step. Regularizes the EB mixture prior toward null.0is unregularized MLE. Internally scaled byMso the null:data balance is invariant to outcome count; single- outcome fits are unchanged.NULL(default) resolves to0.05internally. Ignored onprior_variance_scope \in {"per_scale_normal", "per_scale_laplace"}(the spike weightpi_0is fit by ebnm); a non-NULLvalue passed on those paths emits a warning.- alpha_thin_eps
numeric, threshold below which a variable's per-effect posterior
alpha[l, j]is dropped from the M-step input. Applies to every M-step solver (mixsqpforper_outcome/per_scale,ebnmforper_scale_normal/per_scale_laplace). Set to0to use every variable. Default5e-5.- model_init
optional
mfsusiefit object from a prior call. When supplied, the IBSS loop is seeded from the suppliedalpha,mu,mu2,KL,lbf,V,pi_V,sigma2,fitted, andinterceptrather than the cold- start zero state. The supplied fit must have the sameLas the requestedL; otherwise the call errors. Useful for resuming a long-running fit after a per-call iteration budget. DefaultNULL(cold start).- small_sample_correction
logical. When
TRUE, replaces the per-variable Wakefield Normal marginal Bayes factor in the SER step with a Johnson 2005 scaled Student-t marginal withdf = n - 1degrees of freedom. Useful for small sample sizes where the Wakefield approximation under-propagates residual-variance uncertainty into the per-variable BF, inflating PIPs at null variants. The correction acts on variable selection probabilities only; posterior moments given inclusion are unchanged. DefaultFALSE.- max_inner_em_steps
integer, number of M-step + alpha-update sweeps run inside the post-loglik prior hook per (effect, outer iter), to keep mixture pi and alpha in sync per effect per outer iter, an issue unique to the prior update methods in mixsqp based model. The
per_scale_normalpath already converges in few outer iters (each scale's ebnm prior is fit independently and decouples the pi-alpha drift), so the inner sweep adds little there but does no harm.- attach_smoothing_inputs
logical. When
TRUE(default), the fit carriesY_grid(post-remap position-space Y) andX_eff(per-effect alpha-weighted aggregate of X) so thatmf_post_smooth(fit)runs without re-passing the data. SetFALSEto drop these and callmf_post_smooth(fit, X = X, Y = Y, ...)instead; useful when sharing fits where the per-individual data should not travel with the fit.
Value
A list of class c("mfsusie", "susie") carrying:
alphaL x pposterior inclusion probabilities.mu,mu2list[L]oflist[M]ofp x T_basis[m]matrices: per-effect, per-outcome posterior mean and second moment in the wavelet domain.pi_Vlist[M]ofS_m x Kmixture-weight matrices.G_priorlist[M]oflist[S_m]ash-shaped prior records (mutated by the IBSS loop).sigma2list[M]. Element m is either a scalar (whenresidual_variance_scope = "per_outcome", the default) or a length-S_mvector of per-scale residual variances (whenresidual_variance_scope = "per_scale"). The richer per-scale shape is mfsusieR-specific; downstream code readingsigma2should handle both.elbonumeric vector of ELBO values per iteration.
niter,convergedIBSS termination metadata.
piplength-p posterior inclusion probabilities.
setscredible sets via
susie_get_cs.fittedlist[M]of running per-outcome fits in the wavelet domain. Used bymodel_initto warm-start a follow-up call.Y_gridlist[M], the post-remap position-space response on the padded grid. Attached whenattach_smoothing_inputs = TRUE(default).X_efflist[L]of per-effect alpha-weighted X aggregates. Attached whenattach_smoothing_inputs = TRUE.lbf_variable_outcomeL x p x Marray of per-(effect, variant, outcome) log Bayes factors. Parallelslbf_variable(which isL x p, the joint composite summed across scales and outcomes);lbf_variable_outcomekeeps the M axis intact. Always attached; consumed bysusie_post_outcome_configuration(fit, by = "outcome").