
Detect outliers in GWAS summary statistics using LD-based iterative imputation. Provide either an LD correlation matrix R or a genotype matrix X (from which LD and sample size are derived automatically).

Usage

dentist_single_window(
  zScore,
  R = NULL,
  X = NULL,
  nSample = NULL,
  pValueThreshold = 5e-08,
  propSVD = 0.4,
  gcControl = FALSE,
  nIter = 10,
  gPvalueThreshold = 0.05,
  duprThreshold = 0.99,
  ncpus = 1,
  correct_chen_et_al_bug = TRUE
)

Arguments

zScore

Numeric vector of z-scores.

R

Square LD correlation matrix. Provide either R or X.

X

Genotype matrix (samples x SNPs). If provided, LD is computed via compute_LD(X) and nSample defaults to nrow(X).

nSample

Number of samples in the LD reference panel (NOT the GWAS sample size). Controls the SVD truncation rank. Required when R is provided; inferred from X when X is provided.
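As a minimal sketch of the two input modes described above, the following builds a toy reference panel and derives R and nSample from it. The simulated data and the variable names n_ref and n_snp are hypothetical, and base R's cor() is used only as a stand-in for compute_LD(X), whose exact behaviour is not documented here. The calls to dentist_single_window() are guarded so the snippet also runs when the package is not loaded.

```r
set.seed(1)

# Hypothetical toy reference panel: 200 samples x 50 SNPs, genotypes coded 0/1/2.
n_ref <- 200
n_snp <- 50
X <- matrix(rbinom(n_ref * n_snp, size = 2, prob = 0.3),
            nrow = n_ref, ncol = n_snp)

# Assumption: Pearson correlation between SNP columns as a stand-in for compute_LD(X).
R <- cor(X)

# Placeholder z-scores drawn under the null; real input would come from a GWAS.
zScore <- rnorm(n_snp)

# nSample is the size of the LD reference panel, not the GWAS sample size.
nSample <- nrow(X)

# Only call the function if it is available in the current session.
if (exists("dentist_single_window")) {
  # Mode 1: supply the LD matrix, so nSample must be given explicitly.
  res_from_R <- dentist_single_window(zScore, R = R, nSample = nSample)
  # Mode 2: supply genotypes; LD and nSample are derived from X.
  res_from_X <- dentist_single_window(zScore, X = X)
}
```

Passing X is the more convenient route when genotypes are at hand. Passing R suits precomputed LD matrices, provided the reference panel size is known.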

pValueThreshold

P-value threshold for outlier detection. Default is 5e-8.

propSVD

SVD truncation proportion. Default is 0.4.

gcControl

Logical; apply genomic control. Default is FALSE.

nIter

Number of iterations. Default is 10.

gPvalueThreshold

Grouping p-value threshold. Default is 0.05.

duprThreshold

LD r-squared threshold above which variants are treated as duplicates. Default is 0.99.

ncpus

Number of CPU cores. Default is 1.

correct_chen_et_al_bug

Logical; if TRUE, correct the operator! bug present in the original DENTIST implementation. Default is TRUE.

Value

Data frame with columns: original_z, imputed_z, iter_to_correct, rsq, is_duplicate, outlier_stat, outlier.
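The returned data frame can be post-processed with ordinary data frame operations. The sketch below uses a hand-made mock result with the documented column names. All values are hypothetical, and it assumes that outlier and is_duplicate are logical columns, which the source does not state.

```r
# Mock result with the documented columns; values are illustrative only.
res <- data.frame(
  original_z      = c(1.2, -0.4, 6.1, 0.3),
  imputed_z       = c(1.0, -0.5, 1.5, 0.3),
  iter_to_correct = c(0L, 0L, 1L, 0L),
  rsq             = c(0.90, 0.85, 0.80, 0.99),
  is_duplicate    = c(FALSE, FALSE, FALSE, TRUE),
  outlier_stat    = c(0.4, 0.07, 105.8, 0),
  outlier         = c(FALSE, FALSE, TRUE, FALSE)  # assumed logical flag
)

# Row indices of variants flagged as outliers.
flagged <- which(res$outlier)

# z-scores retained after dropping flagged variants.
clean_z <- res$original_z[!res$outlier]
```

With a real result, the same indexing applies to whatever variant identifiers accompany the input z-scores, assuming rows are returned in input order.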

See also