Detect outliers in GWAS summary statistics using LD-based iterative imputation.
Provide either an LD correlation matrix R or a genotype matrix X
(from which LD and sample size are derived automatically).
Usage
dentist_single_window(
zScore,
R = NULL,
X = NULL,
nSample = NULL,
pValueThreshold = 5e-08,
propSVD = 0.4,
gcControl = FALSE,
nIter = 10,
gPvalueThreshold = 0.05,
duprThreshold = 0.99,
ncpus = 1,
correct_chen_et_al_bug = TRUE
)
Arguments
- zScore
Numeric vector of z-scores.
- R
Square LD correlation matrix. Provide either R or X.
- X
Genotype matrix (samples x SNPs). If provided, LD is computed via compute_LD(X) and nSample defaults to nrow(X).
- nSample
Number of samples in the LD reference panel (NOT the GWAS sample size). Controls the SVD truncation rank. Required when R is provided; inferred from X when X is provided.
- pValueThreshold
P-value threshold for outlier detection. Default is 5e-8.
- propSVD
SVD truncation proportion. Default is 0.4.
- gcControl
Logical; apply genomic control. Default is FALSE.
- nIter
Number of iterations. Default is 10.
- gPvalueThreshold
Grouping p-value threshold. Default is 0.05.
- duprThreshold
Duplicate r-squared threshold. Default is 0.99.
- ncpus
Number of CPU cores. Default is 1.
- correct_chen_et_al_bug
Logical; correct the operator! bug present in the original DENTIST implementation. Default is TRUE.
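To give a feel for the scale of pValueThreshold and propSVD, the sketch below converts the default p-value threshold into a chi-square cutoff and derives a truncation rank. Both relationships are assumptions made for illustration (a 1-df chi-square outlier statistic, and a rank equal to propSVD times the smaller of nSample and the number of variants); this page does not state the exact formulas, and nSnp and the panel size are hypothetical values.

```r
# Illustration only: the two relationships below are assumptions,
# not taken from the package source.
pValueThreshold <- 5e-8
propSVD <- 0.4
nSample <- 500   # hypothetical LD reference panel size
nSnp <- 2000     # hypothetical number of variants in the window

# Assumption: the outlier statistic is compared against a 1-df chi-square,
# so the p-value threshold corresponds to this cutoff.
chisq_cutoff <- qchisq(pValueThreshold, df = 1, lower.tail = FALSE)

# Assumption: the SVD is truncated to a proportion of the smaller dimension,
# which is why a small reference panel limits the usable rank.
svd_rank <- floor(propSVD * min(nSample, nSnp))

print(chisq_cutoff)
print(svd_rank)
```

Under these assumptions, a smaller reference panel lowers the rank directly, which is consistent with nSample being described as the quantity that controls the SVD truncation.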
Value
Data frame with columns: original_z, imputed_z, iter_to_correct, rsq, is_duplicate, outlier_stat, outlier.
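Examples
A minimal sketch of the two calling conventions described above, using simulated data. The genotypes, phenotype, and window size are made up for illustration, cor(X) stands in for an LD matrix and may not match what compute_LD(X) returns, and the calls are guarded so the script still runs when the package providing dentist_single_window is not loaded. Whether a window this small gives meaningful results is not something this page states.

```r
set.seed(1)

# Simulated genotype matrix (samples x SNPs); values are hypothetical.
n <- 200
p <- 50
X <- matrix(rbinom(n * p, size = 2, prob = 0.3), nrow = n, ncol = p)

# Simulated null phenotype and marginal z-scores from per-SNP correlations.
y <- rnorm(n)
r <- as.vector(cor(X, y))
zScore <- r * sqrt((n - 2) / (1 - r^2))

# LD correlation matrix; a stand-in for compute_LD(X), which may differ.
R <- cor(X)

if (exists("dentist_single_window")) {
  # Option 1: pass genotypes; LD and nSample are derived from X.
  res_X <- dentist_single_window(zScore, X = X)

  # Option 2: pass the LD matrix; nSample must then be supplied.
  res_R <- dentist_single_window(zScore, R = R, nSample = nrow(X))

  # Inspect the documented columns and tabulate the outlier flag.
  print(head(res_X))
  print(table(res_X$outlier))
}
```

Here the same simulated panel plays the role of both the GWAS sample and the LD reference. In practice nSample should be the size of the LD reference panel, which is usually a different data set from the GWAS.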