This wrapper keeps the direct colocboost() argument surface: all ColocBoost inputs and model parameters are supplied through the ... argument. When no QC options are requested, the call is passed directly to colocboost(). When QC options are requested, the wrapper inspects the named X/Y and/or sumstat/LD/X_ref arguments in ..., runs the relevant reusable QC step, and then calls ColocBoost on the cleaned inputs. If the required named inputs are not available, QC is skipped with a warning and the original call is forwarded to colocboost() unchanged.

Usage

colocboost_analysis(
  ...,
  missing_rate_thresh = NULL,
  maf_cutoff = NULL,
  xvar_cutoff = NULL,
  ld_reference_meta_file = NULL,
  pip_cutoff_to_skip_ind = NULL,
  keep_indel = TRUE,
  pip_cutoff_to_skip_sumstat = NULL,
  qc_method = NULL,
  impute = FALSE,
  impute_opts = list(rcond = 0.01, R2_threshold = 0.6, minimum_ld = 5, lamb = 0.01),
  LD_reference_info = NULL,
  variant_convention = c("A2_A1", "A1_A2")
)

Arguments

...

Arguments passed to colocboost(), including data inputs such as X, Y, sumstat, LD, X_ref, dict_YX, dict_sumstatLD, outcome_names, and all ColocBoost model/post-processing options. QC can only inspect inputs that are supplied by name.
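
Because QC relies on argument names, it may help to see why positional inputs cannot be inspected. The following is a minimal base-R sketch, not the wrapper's actual code; named_inputs() is a hypothetical helper introduced only for illustration.

```r
# Hypothetical helper (not part of pecotmr or colocboost): list the
# names under which inputs were supplied through `...`.
named_inputs <- function(...) {
  args <- list(...)
  nms <- names(args)
  # names() is NULL when every argument is positional
  if (is.null(nms)) nms <- rep("", length(args))
  nms[nzchar(nms)]
}

X <- matrix(rnorm(20), nrow = 5)
Y <- rnorm(5)

named_inputs(X = X, Y = Y)  # "X" and "Y" are visible by name
named_inputs(X, Y)          # positional: no names to inspect
```

Under this sketch, a call such as colocboost_analysis(X, Y, maf_cutoff = 0.01) would give the wrapper nothing to match against, which is consistent with the documented behavior of skipping QC with a warning.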

missing_rate_thresh, maf_cutoff, xvar_cutoff, ld_reference_meta_file, pip_cutoff_to_skip_ind

Individual-level QC controls. If all are NULL, individual-level QC is not run.

keep_indel, pip_cutoff_to_skip_sumstat, qc_method, impute, impute_opts

Summary-statistic QC controls. qc_method = "none" runs basic allele harmonization without LD-mismatch outlier detection. Imputation is only run when impute = TRUE.

LD_reference_info

Optional LD reference information for summary-statistic QC. This is only needed when the native LD matrix row/column names or X_ref column names are missing or are not parseable genomic variant IDs. It can be a .bim/.pvar/.pvar.zst file path, a data.frame with variant metadata, or a load_LD_matrix() result. This is a QC-only argument and is not passed to colocboost().

variant_convention

Allele order used by native ColocBoost-style sumstat$variant and LD/X_ref names when deriving QC inputs: "A2_A1" for pecotmr canonical chr:pos:A2:A1, or "A1_A2" for chr:pos:A1:A2.
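
The two conventions differ only in the order of the last two colon-separated fields. As an illustration of the string layouts (not of how the package converts them internally), the sketch below swaps the allele fields of an ID; swap_allele_order() is a hypothetical helper and the example IDs are made up.

```r
# Hypothetical helper: swap the last two fields of a colon-separated
# variant ID, i.e. chr:pos:A1:A2 <-> chr:pos:A2:A1.
swap_allele_order <- function(variant_id) {
  parts <- strsplit(variant_id, ":", fixed = TRUE)
  vapply(parts, function(p) {
    # IDs without exactly four fields are treated as not parseable
    if (length(p) != 4L) return(NA_character_)
    paste(p[1], p[2], p[4], p[3], sep = ":")
  }, character(1))
}

ids <- c("chr1:1000:A:G", "chr1:2000:T:C", "rs123")
swap_allele_order(ids)
```

Here the first two IDs come back with their alleles swapped, and "rs123" yields NA because it is not a four-field genomic variant ID. IDs of that kind are the case where LD_reference_info is described as necessary.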

Value

The object returned by colocboost().

Details

Use colocboost_analysis() the same way you would use colocboost(): pass the native ColocBoost arguments by name, for example X, Y, sumstat, LD, X_ref, dict_YX, dict_sumstatLD, outcome_names, focal_outcome_idx, effect_est, effect_se, effect_n, M, and other ColocBoost model or post-processing options. These arguments are forwarded unchanged unless one or more QC controls are requested.

Individual-level QC is attempted only when at least one individual-level QC control is non-NULL and named X and Y inputs are available in .... Summary-statistic QC is attempted only when two conditions hold: at least one of qc_method, pip_cutoff_to_skip_sumstat, impute = TRUE, or LD_reference_info is supplied, and a named sumstat is available together with at least one of LD, X_ref, or LD_reference_info. qc_method = "none" runs basic allele/variant harmonization only; it does not run SLALOM/DENTIST LD-mismatch QC. RAISS imputation is controlled separately by impute = TRUE.
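
The gating rules in the preceding paragraph can be written out as a small decision function. This is a sketch under the assumption that the rules apply exactly as stated; qc_plan() is a hypothetical name, and the real wrapper's internals may differ.

```r
# Hypothetical sketch of the documented gating rules. `args` stands in
# for the named list captured from `...`.
qc_plan <- function(args,
                    missing_rate_thresh = NULL, maf_cutoff = NULL,
                    xvar_cutoff = NULL, ld_reference_meta_file = NULL,
                    pip_cutoff_to_skip_ind = NULL,
                    pip_cutoff_to_skip_sumstat = NULL, qc_method = NULL,
                    impute = FALSE, LD_reference_info = NULL) {
  has <- function(nm) !is.null(args[[nm]])

  # Individual-level QC: at least one individual control is non-NULL.
  ind_requested <- !all(vapply(
    list(missing_rate_thresh, maf_cutoff, xvar_cutoff,
         ld_reference_meta_file, pip_cutoff_to_skip_ind),
    is.null, logical(1)
  ))

  # Summary-statistic QC: at least one trigger is supplied ...
  sum_requested <- !is.null(qc_method) ||
    !is.null(pip_cutoff_to_skip_sumstat) ||
    isTRUE(impute) ||
    !is.null(LD_reference_info)
  # ... and sumstat plus some LD source is available.
  sum_inputs <- has("sumstat") &&
    (has("LD") || has("X_ref") || !is.null(LD_reference_info))

  list(
    individual = ind_requested && has("X") && has("Y"),
    sumstat    = sum_requested && sum_inputs
  )
}

qc_plan(list(X = 1, Y = 2))                             # no QC at all
qc_plan(list(X = 1, Y = 2), maf_cutoff = 0.01)          # individual QC
qc_plan(list(sumstat = 1, LD = 2), qc_method = "none")  # sumstat QC
qc_plan(list(sumstat = 1), impute = TRUE)               # no LD source
```

In the last call summary-statistic QC is requested but no LD, X_ref, or LD_reference_info is available, which corresponds to the documented case where QC is skipped with a warning.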

If no QC controls are supplied, this function is a thin direct call to colocboost(...). When QC removes outcomes, outcome_names and focal_outcome_idx are updated to match the post-QC outcome order. If the requested focal outcome is removed by QC, focal_outcome_idx is set to NULL with a warning.
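
To make the index bookkeeping concrete, the sketch below remaps outcome_names and focal_outcome_idx after some outcomes are dropped. It assumes focal_outcome_idx is a single 1-based position in the original outcome order; remap_outcomes() and its kept argument are hypothetical and only illustrate the documented behavior.

```r
# Hypothetical helper: `kept` holds the original positions of the
# outcomes that survive QC, in their post-QC order.
remap_outcomes <- function(outcome_names, focal_outcome_idx, kept) {
  new_focal <- NULL
  if (!is.null(focal_outcome_idx)) {
    pos <- match(focal_outcome_idx, kept)
    if (is.na(pos)) {
      warning("Focal outcome was removed by QC; focal_outcome_idx set to NULL.")
    } else {
      new_focal <- pos
    }
  }
  list(outcome_names = outcome_names[kept], focal_outcome_idx = new_focal)
}

# The second of three outcomes is dropped; the focal outcome moves
# from position 3 to position 2.
res <- remap_outcomes(c("eQTL", "pQTL", "GWAS"),
                      focal_outcome_idx = 3, kept = c(1, 3))
res$outcome_names
res$focal_outcome_idx
```

If the focal outcome itself is not in kept, this sketch returns focal_outcome_idx = NULL and emits a warning, mirroring the behavior described above.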

Examples

if (FALSE) { # \dontrun{
# Direct ColocBoost call without QC.
fit <- colocboost_analysis(X = X, Y = Y, M = 500)

# Summary-statistic input with basic allele/variant harmonization only.
fit <- colocboost_analysis(sumstat = sumstat, LD = LD,
                           qc_method = "none", M = 500)

# Summary-statistic input with LD-mismatch QC and RAISS imputation.
fit <- colocboost_analysis(sumstat = sumstat, LD = LD,
                           qc_method = "slalom", impute = TRUE)

# Use richer LD metadata from load_LD_matrix() for QC, while still passing
# ColocBoost's native LD input.
ld_data <- load_LD_matrix(ld_meta_file, region)
fit <- colocboost_analysis(sumstat = sumstat, LD = ld_data$LD_matrix,
                           LD_reference_info = ld_data, qc_method = "none")

# Individual-level input with explicit genotype QC thresholds.
fit <- colocboost_analysis(X = X, Y = Y,
                           missing_rate_thresh = 0.1,
                           maf_cutoff = 0.0005)
} # }