Fit the multi-functional, multi-outcome Sum of Single Effects
(mfSuSiE) regression model. Each outcome m carries a per-sample
functional response Y_m of length T_m; T_m may differ across
outcomes, and either T_m = 1 (univariate response) or T_m > 1
(functional response on a regular grid) is supported. mfsusieR runs
a per-outcome wavelet decomposition, then dispatches into susieR's
IBSS workhorse with mfsusieR-specific S3 methods that handle the
per-(scale, outcome) mixture-of-normals prior and residual
variance.
Usage
mfsusie(
X,
Y,
pos = NULL,
L = 20,
prior_variance_grid = NULL,
prior_variance_scope = c("per_outcome", "per_scale"),
null_prior_weight = 2,
cross_outcome_prior = NULL,
prior_weights = NULL,
residual_variance = NULL,
residual_variance_scope = c("per_outcome", "per_scale"),
standardize = TRUE,
intercept = TRUE,
max_iter = 50,
tol = 1e-04,
coverage = 0.95,
min_abs_corr = 0.5,
L_greedy = 5,
greedy_lbf_cutoff = 0.1,
estimate_prior_variance = TRUE,
convergence_method = c("pip", "elbo"),
pip_stall_window = 5L,
estimate_residual_variance = TRUE,
verbose = FALSE,
track_fit = FALSE,
max_padded_log2 = 10,
wavelet_basis_order = 10,
wavelet_family = "DaubLeAsymm",
wavelet_magnitude_cutoff = 0,
wavelet_qnorm = TRUE,
control_mixsqp = NULL,
mixsqp_null_penalty = 0.1,
mixsqp_alpha_eps = 1e-06,
model_init = NULL,
small_sample_correction = FALSE,
attach_smoothing_inputs = TRUE
)Arguments
- X
numeric matrix
n x pof covariates (e.g., genotype dosages).- Y
list of length
M; each element a numeric matrixn x T_m(or a length-n vector whenT_m = 1).- pos
optional list of length
M; each element a numeric vector of lengthT_mrecording the sampling positions for that outcome. Defaults toseq_len(T_m)per outcome.- L
integer, maximum number of single effects to fit. Default
20. With the defaultL_greedy = 5, the IBSS workhorse grows the effect count from 5 toward this cap in steps of 5 until the fit saturates (min(lbf) < greedy_lbf_cutoff); setL_greedy = NULLto fit the fullLeffects in one round.- prior_variance_grid
optional length-K vector of mixture variances for the scale-mixture-of-normals prior. When
NULL, the data-driven path (compute_marginal_bhat_shat+ash) selects the grid per outcome.- prior_variance_scope
"per_outcome"(default) or"per_scale". Controls whether the mixture-weight matrixpi_V[[m]]collapses across scales (one weight vector per outcome) or keeps a separate row per wavelet scale. The per-outcome scope is the default because it is roughlyS_m-fold cheaper per IBSS iter (M mixsqp calls per iter instead of M*S_m); switch to"per_scale"only when you need scale-specific mixture weights for power on shape- varying signals.- null_prior_weight
numeric, weight on the null prior component. Default 2.
- cross_outcome_prior
optional object that controls how the per-outcome log-Bayes factors are combined into the joint log-Bayes factor used by the SER step. Each IBSS effect first computes a length-
plog-BF per outcomem(one entry per variable, summed over wavelet positions); these are then reduced to a single length-pjoint log-BF before the posterioralphais taken as a softmax over variables. Defaults to outcome independence (cross_outcome_prior_independent()), under which the joint log-BF is the elementwise sum of the per-outcome log-BFs; equivalently, the joint BF is the product of per-outcome BFs. The argument is an extension point for non-independence combiners (e.g. modality-covariance priors) registered as S3 methods oncombine_outcome_lbfs; only the independence combiner is shipped, so most users leave thisNULL.- prior_weights
optional length-p numeric vector, the variable-selection prior. Defaults to uniform
1/p.- residual_variance
optional list of length
M, initial residual variance per outcome. Defaults to the per-outcome sample variance.- residual_variance_scope
"per_outcome"(default) or"per_scale". Controls the sigma2 update shape: a single scalar per outcome (default) or one scalar per wavelet scale per outcome.- standardize
logical, scale
Xcolumns to unit variance.- intercept
logical, center
Xcolumns to mean zero.- max_iter
integer, maximum IBSS iterations. Default
50.- tol
numeric, ELBO change tolerance for convergence.
- coverage
numeric in (0, 1), credible-set coverage.
- min_abs_corr
numeric, minimum variable-to-variable correlation inside a credible set (CS purity threshold).
- L_greedy
integer or
NULL. When non-NULL(default5), run susieR's greedy outer loop, growing the effect count fromL_greedyup toLin linear steps until the fit saturates (min(lbf) < greedy_lbf_cutoff). SetNULLto fit the fullLeffects in one round.- greedy_lbf_cutoff
numeric saturation threshold for the greedy outer loop. Default 0.1.
- estimate_prior_variance
logical. When
TRUE(default), run the per-effect empirical-Bayes update of the mixture weightspi_V[[m]]per (outcome, scale) using themixsqpsolver. WhenFALSE, hold the prior fixed at the initialprior_variance_grid/null_prior_weight; useful when collapsingmfsusie()tosusieR::susie()for sanity checks.- convergence_method
one of
"pip"(default) or"elbo". Selects the IBSS convergence criterion."pip"stops whenmax(abs(prev_alpha - alpha))falls belowtol(PIP-difference convergence) with stall detection – the default because alpha is robust to the small per-iteration ELBO oscillations that arise from mixsqp's approximate M-step (Generalized EM residual:ELBO(t+1) >= ELBO(t) - O(eps_mixsqp * L * M * S_m), Neal & Hinton 1998)."elbo"stops when the change in ELBO falls belowtol; the per-iteration ELBO is a coherent variational free energy thanks to theget_objective.mfsusiepost-iteration KL refresh (refresh_lbf_kl.mf_individual) which evaluates per- effectKL[l]against the iter-final pi_V rather than the pi_V state at the moment effect l was updated.- pip_stall_window
integer. Number of consecutive iterations without PIP-difference improvement after which the PIP-based convergence path declares convergence even when
tolhas not been reached. Default 5. Only consulted whenconvergence_method = "pip"(or when ELBO produces non-finite values and the PIP fallback fires).- estimate_residual_variance
logical. When
TRUE(default), updatesigma2per IBSS iteration viaupdate_variance_components. WhenFALSE, holdsigma2at the initial value across iterations; useful for sensitivity analysis.- verbose
logical.
- track_fit
logical, retain a per-iteration tracking list on the fit. Default
FALSE.- max_padded_log2
integer, log2 cap on the post-remap grid length per outcome. Default 10.
- wavelet_basis_order
integer; selects the wavelet basis member within
wavelet_family. For Daubechies families this equals the number of vanishing moments (higher = smoother, longer-support filter). Forwarded towavethresh::wd'sfilter.number; see alsofilter.select. Default10.- wavelet_family
character; selects the wavelet family. Forwarded to
wavethresh::wd'sfamily. Default"DaubLeAsymm"(Daubechies least-asymmetric, a.k.a. Symmlet).- wavelet_magnitude_cutoff
non-negative numeric. After the wavelet decomposition, each wavelet column
tof the response has its median absolute valuemedian(|Y_wd[, t]|)compared against this cutoff; columns at or below are zeroed in the data and treated asBhat = 0,Shat = 1at every IBSS iteration. Useful for sparse-coverage assays where high-frequency wavelet coefficients are dominated by zeros. Default0; only columns with a strictly-zero absolute-value median are masked.- wavelet_qnorm
logical. When
TRUE(default), applies a column-wise rank-based normal quantile transform to the wavelet-domain response before the IBSS loop. Useful for non-Gaussian wavelet coefficients arising from heavy-tailed assays.- control_mixsqp
optional named list of
mixsqpcontrol arguments forwarded to the per-(outcome, scale) M-step.- mixsqp_null_penalty
numeric, per-coefficient pseudo-count on the null component of the mixsqp M-step. The total null pseudo-count fed to mixsqp is
mixsqp_null_penalty * idx_size, versus a total data weight ofidx_sizefrom thezeta- weighted variable rows; the ratiomixsqp_null_penalty : 1is therefore the null:data weight ratio at the M-step. Default0.1corresponds to a 10% null:data ratio, anchored toashr::ash.workhorse(nullweight = 10)over a typicalidx_size = 128(ratio ~0.08). For multi-outcome fits the penalty is scaled by the number of outcomesMso that the null:data balance stays invariant as the per-effect variable posterior concentrates linearly inM; single-outcome fits are unchanged. Default0.1.- mixsqp_alpha_eps
numeric, threshold below which a variable's per-effect posterior
alpha[l, j]is dropped from the mixsqp M-step input. The truncation error on the M-step gradient is bounded bysum_{j outside} alpha_j * max_k(L_jk), well under floating-point precision for typical concentrated SuSiE posteriors. Set to0to use every variable regardless ofalpha[l, j]. Default1e-6.- model_init
optional
mfsusiefit object from a prior call. When supplied, the IBSS loop is seeded from the suppliedalpha,mu,mu2,KL,lbf,V,pi_V,sigma2,fitted, andinterceptrather than the cold- start zero state. The supplied fit must have the sameLas the requestedL; otherwise the call errors. Useful for resuming a long-running fit after a per-call iteration budget. DefaultNULL(cold start).- small_sample_correction
logical. When
TRUE, replaces the per-variable Wakefield Normal marginal Bayes factor in the SER step with a Johnson 2005 scaled Student-t marginal withdf = n - 1degrees of freedom. Useful for small sample sizes where the Wakefield approximation under-propagates residual-variance uncertainty into the per-variable BF, inflating PIPs at null variants. The correction acts on variable selection probabilities only; posterior moments given inclusion are unchanged. DefaultFALSE.- attach_smoothing_inputs
logical. When
TRUE(default), the fit carriesY_grid(post-remap position-space Y) andX_eff(per-effect alpha-weighted aggregate of X) so thatmf_post_smooth(fit)runs without re-passing the data. SetFALSEto drop these and callmf_post_smooth(fit, X = X, Y = Y, ...)instead; useful when sharing fits where the per-individual data should not travel with the fit.
Value
A list of class c("mfsusie", "susie") carrying:
alphaL x pposterior inclusion probabilities.mu,mu2list[L]oflist[M]ofp x T_basis[m]matrices: per-effect, per-outcome posterior mean and second moment in the wavelet domain.pi_Vlist[M]ofS_m x Kmixture-weight matrices.G_priorlist[M]oflist[S_m]ash-shaped prior records (mutated by the IBSS loop).sigma2list[M]. Element m is either a scalar (whenresidual_variance_scope = "per_outcome", the default) or a length-S_mvector of per-scale residual variances (whenresidual_variance_scope = "per_scale"). The richer per-scale shape is mfsusieR-specific; downstream code readingsigma2should handle both.elbonumeric vector of ELBO values per iteration.
niter,convergedIBSS termination metadata.
piplength-p posterior inclusion probabilities.
setscredible sets via
susie_get_cs.fittedlist[M]of running per-outcome fits in the wavelet domain. Used bymodel_initto warm-start a follow-up call.Y_gridlist[M], the post-remap position-space response on the padded grid. Attached whenattach_smoothing_inputs = TRUE(default).X_efflist[L]of per-effect alpha-weighted X aggregates. Attached whenattach_smoothing_inputs = TRUE.lbf_variable_outcomeL x p x Marray of per-(effect, variant, outcome) log Bayes factors. Parallelslbf_variable(which isL x p, the joint composite summed across scales and outcomes);lbf_variable_outcomekeeps the M axis intact. Always attached; consumed bysusie_post_outcome_configuration(fit, by = "outcome").