This function is part of the statistical library for SNP imputation from: https://gitlab.pasteur.fr/statistical-genetics/raiss/-/blob/master/raiss/stat_models.py It is an R implementation of the imputation model described in the paper by Bogdan Pasaniuc, Noah Zaitlen, et al., titled "Fast and accurate imputation of summary statistics enhances evidence of functional enrichment", published in Bioinformatics in 2014.
Usage
raiss(
ref_panel,
known_zscores,
LD_matrix = NULL,
genotype_matrix = NULL,
lamb = 0.01,
rcond = 0.01,
svd_tol = 1e-08,
R2_threshold = 0.6,
minimum_ld = 5,
verbose = TRUE
)
Arguments
- ref_panel
A data frame containing 'chrom', 'pos', 'variant_id', 'A1', and 'A2'.
- known_zscores
A data frame containing 'chrom', 'pos', 'variant_id', 'A1', 'A2', and 'z' values.
- LD_matrix
Either a square matrix or a list of matrices for LD blocks. Provide either
LD_matrix or genotype_matrix, not both.
- genotype_matrix
A centered and scaled genotype matrix (n x p) as an alternative to
LD_matrix. Column order must match the variant order in ref_panel. When provided, the imputation uses an SVD-based approach that avoids forming the p x p LD matrix.
- lamb
Regularization term added to the diagonal of the LD_matrix.
- rcond
Threshold for filtering eigenvalues in the pseudo-inverse computation (only used with the LD_matrix path).
- svd_tol
Relative tolerance for filtering small singular values (only used with the genotype_matrix path).
- R2_threshold
R-squared threshold below which SNPs are filtered from the output.
- minimum_ld
Minimum LD score threshold for SNP filtering.
- verbose
Logical indicating whether to print progress information.
Value
A list containing the filtered and unfiltered results and the filtered LD matrix (LD_mat is NULL when the genotype_matrix path is used).
Details
This function can process either a single LD matrix or a list of LD matrices for different blocks. For a list of matrices, it processes each block separately and combines the results. Alternatively, it can accept a genotype matrix X directly, which avoids forming the p x p LD matrix and saves memory and computation when n << p.
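To illustrate why the two input paths can give the same imputed z-scores, here is a minimal base-R sketch under stated assumptions. It is not the package's implementation. It uses simulated toy data, a plain ridge-regularized solve in place of the rcond-filtered pseudo-inverse, and hypothetical variable names (known, unknown, z_ld, z_svd). It computes only the imputed z-scores, not the variance, R-squared, or LD-score outputs.

```r
set.seed(1)
n <- 50   # samples in the toy reference panel (hypothetical)
p <- 8    # variants (hypothetical)

# Toy genotype matrix, centered and scaled column-wise
X <- scale(matrix(rnorm(n * p), n, p))

known   <- c(1, 2, 4, 5, 7)              # variants with observed z-scores
unknown <- setdiff(seq_len(p), known)    # variants to impute
z_known <- rnorm(length(known))          # toy observed z-scores
lamb    <- 0.01                          # ridge term added to the LD diagonal

# Path 1: form the p x p LD matrix explicitly, then solve
LD       <- crossprod(X) / (n - 1)
sigma_tt <- LD[known, known] + lamb * diag(length(known))
sigma_ut <- LD[unknown, known, drop = FALSE]
z_ld     <- drop(sigma_ut %*% solve(sigma_tt, z_known))

# Path 2: work from the SVD of the known-variant genotype columns,
# never forming the LD matrix
sv   <- svd(X[, known, drop = FALSE])
keep <- sv$d > 1e-8 * max(sv$d)          # relative tolerance, like svd_tol
d    <- sv$d[keep]
U    <- sv$u[, keep, drop = FALSE]
V    <- sv$v[, keep, drop = FALSE]
s    <- d^2 / (n - 1)                    # eigenvalues of the known-variant LD

# X_t (Sigma_tt + lamb I)^{-1} z_t  ==  U diag(d / (s + lamb)) V' z_t
w     <- U %*% ((d / (s + lamb)) * crossprod(V, z_known))
z_svd <- drop(crossprod(X[, unknown, drop = FALSE], w)) / (n - 1)

# Both paths should agree up to numerical precision
stopifnot(isTRUE(all.equal(z_ld, z_svd)))
```

In this sketch the SVD path only ever multiplies n-row matrices, which is where the savings come from when n << p. The real function may differ in how it thresholds small eigenvalues or singular values.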