This function performs an end-to-end RSS analysis pipeline, including data loading, preprocessing, quality control, imputation, and SuSiE RSS analysis. It provides flexibility in specifying various analysis options and parameters.

Usage

rss_analysis_pipeline(
  sumstat_path,
  column_file_path,
  LD_data,
  n_sample = 0,
  n_case = 0,
  n_control = 0,
  region = NULL,
  skip_region = NULL,
  extract_region_name = NULL,
  region_name_col = NULL,
  qc_method = c("dentist", "slalom"),
  finemapping_method = c("susie_rss", "single_effect", "bayesian_conditional_regression"),
  finemapping_opts = list(init_L = 5, max_L = 20, l_step = 5, coverage = c(0.95, 0.7,
    0.5), signal_cutoff = 0.025, min_abs_corr = 0.8),
  impute = TRUE,
  impute_opts = list(rcond = 0.01, R2_threshold = 0.6, minimum_ld = 5, lamb = 0.01),
  pip_cutoff_to_skip = 0,
  remove_indels = FALSE,
  comment_string = "#",
  diagnostics = FALSE
)

Arguments

sumstat_path

File path to the summary statistics.

column_file_path

File path to the column file for mapping.

LD_data

A list containing combined LD variants data that is generated by load_LD_matrix.

n_sample

User-specified sample size. If unknown, set to 0 to retrieve it from the sumstat file.

n_case

User-specified number of cases.

n_control

User-specified number of controls.

region

The region that tabix uses to subset the input dataset.

skip_region

A character vector specifying regions to be skipped in the analysis (optional). Each region should be in the format "chrom:start-end" (e.g., "1:1000000-2000000").
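The "chrom:start-end" format can be checked before the vector is passed in. The following is a minimal sketch in base R; `parse_region` is a hypothetical helper written for illustration and is not part of the package.

```r
# Hypothetical helper (not part of the package): split "chrom:start-end"
# strings into a data frame and reject malformed entries.
parse_region <- function(regions) {
  pattern <- "^([^:]+):([0-9]+)-([0-9]+)$"
  ok <- grepl(pattern, regions)
  if (!all(ok)) {
    stop("Malformed region(s): ", paste(regions[!ok], collapse = ", "))
  }
  out <- data.frame(
    chrom = sub(pattern, "\\1", regions),
    # as.numeric avoids integer overflow for very large coordinates
    start = as.numeric(sub(pattern, "\\2", regions)),
    end = as.numeric(sub(pattern, "\\3", regions)),
    stringsAsFactors = FALSE
  )
  if (any(out$start > out$end)) {
    stop("Region start must not exceed region end")
  }
  out
}

skip_region <- c("1:1000000-2000000", "6:25000000-35000000")
parsed <- parse_region(skip_region)
```

Validating up front makes a malformed entry fail with a clear message rather than somewhere inside the pipeline.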

extract_region_name

User-specified gene/phenotype name used to further subset the phenotype data.

region_name_col

Name of the column to filter on when matching extract_region_name.

qc_method

Quality control method to use. Options are "dentist" or "slalom" (default: "dentist").
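A vector default such as c("dentist", "slalom") is conventionally resolved in R with match.arg(), which selects the first element when the argument is omitted. Whether this function uses match.arg() internally is an assumption; the sketch below only illustrates the convention with a hypothetical wrapper, `choose_qc_method`.

```r
# Hypothetical wrapper illustrating match.arg(); not the package's own code.
choose_qc_method <- function(qc_method = c("dentist", "slalom")) {
  # Returns the first choice when qc_method is left at its default,
  # and raises an error for values outside the allowed set.
  match.arg(qc_method)
}

default_method <- choose_qc_method()
explicit_method <- choose_qc_method("slalom")
```

Passing a single string explicitly, as in the second call, avoids relying on the default resolution.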

finemapping_opts

A list of fine-mapping options: init_L, max_L, l_step, coverage, signal_cutoff, and min_abs_corr (minimum absolute correlation for credible set purity, default 0.8; susieR default is 0.5).
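To change only some of these options, one approach is to start from the defaults shown in Usage and override individual entries, then pass the complete list. This is a sketch under the assumption that the function expects a full list; whether a partial list is merged with the defaults internally is not stated here.

```r
# Defaults copied from the Usage section.
default_opts <- list(
  init_L = 5, max_L = 20, l_step = 5,
  coverage = c(0.95, 0.7, 0.5),
  signal_cutoff = 0.025, min_abs_corr = 0.8
)

# Override two entries and keep the rest; the values chosen here are
# illustrative only.
my_opts <- utils::modifyList(default_opts, list(max_L = 30, min_abs_corr = 0.5))
```

The resulting `my_opts` can then be supplied as the finemapping_opts argument.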

impute

Logical; if TRUE, performs imputation for outliers identified in the analysis (default: TRUE).

impute_opts

A list of imputation options including rcond, R2_threshold, minimum_ld, and lamb (default: list(rcond = 0.01, R2_threshold = 0.6, minimum_ld = 5, lamb = 0.01)).

pip_cutoff_to_skip

PIP cutoff to skip imputation (default: 0).

init_L

Initial number of causal configurations to consider in the analysis; set through finemapping_opts (default: 5).

max_L

Maximum number of causal configurations to consider when dynamically adjusting L; set through finemapping_opts (default: 20).

l_step

Step size for increasing L when the limit is reached during dynamic adjustment; set through finemapping_opts (default: 5).
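The interplay between the initial L, max_L, and l_step can be sketched as a simple loop. This is an illustration of the general idea only, under the assumption that L is increased whenever the fit uses every allowed effect; `mock_fit` and `adjust_L` are hypothetical and do not reproduce the package's actual stopping rule.

```r
# Hypothetical stand-in for a fine-mapping fit: reports how many credible
# sets are found when at most L effects are allowed.
mock_fit <- function(L, n_true_signals) min(L, n_true_signals)

adjust_L <- function(n_true_signals, init_L = 5, max_L = 20, l_step = 5) {
  L <- init_L
  repeat {
    n_cs <- mock_fit(L, n_true_signals)
    # Stop when the fit did not use every allowed effect, or L cannot grow.
    if (n_cs < L || L >= max_L) break
    L <- min(L + l_step, max_L)
  }
  list(L = L, n_cs = n_cs)
}

few <- adjust_L(3)      # fewer signals than the initial L
many <- adjust_L(12)    # needs two increases of l_step
capped <- adjust_L(50)  # more signals than max_L allows
```

In this sketch a larger l_step reaches max_L in fewer refits, at the cost of possibly fitting with a larger L than needed.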

finemapping_method

Fine-mapping method to use. Options are "susie_rss", "single_effect", or "bayesian_conditional_regression" (default: "susie_rss").

coverage

Coverage levels for SuSiE RSS analysis; set through finemapping_opts (default: c(0.95, 0.7, 0.5)).

signal_cutoff

Signal cutoff for susie_post_processor; set through finemapping_opts (default: 0.025).

Value

A list containing final_result and input_rss_data.

- final_result: A list containing the results of various SuSiE RSS analyses.
- input_rss_data: A processed data frame containing summary statistics after preprocessing.
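A call and the handling of its return value might look like the sketch below. `run_rss_example` is a hypothetical wrapper; it assumes the package providing rss_analysis_pipeline is already attached and that LD_data was produced by load_LD_matrix. Only arguments listed in Usage are used, and the chosen values are illustrative.

```r
# Hypothetical wrapper: runs the pipeline and splits the returned list
# into its two documented components.
run_rss_example <- function(sumstat_path, column_file_path, LD_data) {
  res <- rss_analysis_pipeline(
    sumstat_path = sumstat_path,
    column_file_path = column_file_path,
    LD_data = LD_data,
    qc_method = "slalom",
    finemapping_method = "susie_rss",
    impute = TRUE
  )
  list(
    fit = res$final_result,        # fine-mapping results
    sumstats = res$input_rss_data  # preprocessed summary statistics
  )
}
```

Defining the wrapper does not run the pipeline; it is only evaluated when called with real file paths and LD data.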