Unified entry point for loading LD data from a metadata TSV file.

Usage

load_LD_matrix(
  LD_meta_file_path,
  region,
  extract_coordinates = NULL,
  return_genotype = FALSE,
  n_sample = NULL
)

Arguments

LD_meta_file_path

Path to the LD metadata TSV file.

region

Region of interest: "chr:start-end" string or data.frame with chrom/start/end.
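As an illustration of the two accepted forms, the sketch below converts a "chr:start-end" string into the equivalent one-row data.frame. The helper parse_region() is hypothetical and not part of the package; it keeps the chromosome label exactly as written, since the expected naming (for example "chr1" versus "1") is not stated here.

```r
# Hypothetical helper, for illustration only: normalise a region given either
# as a "chr:start-end" string or as a data.frame with chrom/start/end columns.
parse_region <- function(region) {
  if (is.data.frame(region)) {
    return(region[, c("chrom", "start", "end")])
  }
  # Capture groups: chromosome label, start position, end position.
  m <- regmatches(region, regexec("^([^:]+):([0-9]+)-([0-9]+)$", region))[[1]]
  if (length(m) != 4) {
    stop("region must look like 'chr:start-end'")
  }
  data.frame(
    chrom = m[2],
    start = as.integer(m[3]),
    end = as.integer(m[4]),
    stringsAsFactors = FALSE
  )
}

region_df <- parse_region("chr1:1000-2000")
print(region_df)
```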

extract_coordinates

Optional data.frame with columns "chrom" and "pos" listing specific coordinates to extract (only used for pre-computed LD blocks).

return_genotype

Controls what LD_matrix contains in the return value. FALSE (default): always return correlation matrix R. TRUE: return genotype matrix X (only valid for PLINK sources). "auto": return X for PLINK sources, R for pre-computed sources.

n_sample

Optional sample size used to compute the variance column as 2*p*(1-p)*n/(n-1), where p is the allele frequency and n is n_sample. If NULL, ref_panel will not include the variance or n_nomiss columns. Only used for PLINK genotype sources.
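The variance formula can be checked by hand with a few made-up values. The sketch below assumes p is the allele frequency of each variant and n is the value passed as n_sample; the numbers are illustrative and do not come from any reference panel.

```r
# Illustrative allele frequencies and sample size (assumed values).
p <- c(0.10, 0.25, 0.50)
n <- 500

# Variance as stated in the n_sample description: 2*p*(1-p)*n/(n-1).
variance <- 2 * p * (1 - p) * n / (n - 1)

print(data.frame(allele_freq = p, variance = variance))
```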

Value

A list with:

LD_variants

Character vector of variant IDs (canonical format).

LD_matrix

LD correlation matrix R (or genotype matrix X when return_genotype is TRUE or "auto" with PLINK source).

ref_panel

Data.frame with variant metadata (chrom, pos, A2, A1, variant_id, and optionally allele_freq, variance, n_nomiss).

is_genotype

Logical: TRUE if LD_matrix contains genotype X, FALSE if correlation R.

block_metadata

Data.frame with region/block info. For pre-computed LD: one row per block. For PLINK: a single row spanning the loaded region.
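Because LD_matrix may hold either R or X, downstream code can branch on is_genotype. The sketch below uses a mock return value rather than a real call, and the helper as_correlation() is hypothetical. It assumes the genotype matrix X has samples in rows and variants in columns, which is not stated above, and the variant IDs are placeholders rather than the canonical format.

```r
# Hypothetical helper: return a correlation matrix whichever form was loaded.
as_correlation <- function(ld) {
  if (isTRUE(ld$is_genotype)) {
    # Assumption: rows are samples, columns are variants.
    cor(ld$LD_matrix, use = "pairwise.complete.obs")
  } else {
    ld$LD_matrix
  }
}

# Mock result mimicking the documented list elements (made-up data).
X <- cbind(
  c(0, 1, 2, 0, 1, 2),
  c(0, 1, 2, 1, 0, 2),
  c(2, 1, 0, 0, 1, 2)
)
mock <- list(
  LD_variants = c("v1", "v2", "v3"),  # placeholder IDs
  LD_matrix = X,
  is_genotype = TRUE
)

R <- as_correlation(mock)
dimnames(R) <- list(mock$LD_variants, mock$LD_variants)
print(round(R, 3))
```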

Details

The metadata TSV must have the columns chrom, start, end, and path. Two formats are supported:

  • Pre-computed LD blocks: many rows per chromosome, with block boundaries in start/end and path pointing to a .cor.xz file (optionally followed by a comma and a .bim path).

  • PLINK genotype files: one row per chromosome with start=0, end=0, and path pointing to a per-chromosome PLINK prefix (.pgen/.pvar[.zst]/.psam or .bed/.bim/.fam). LD is computed on the fly via compute_LD().
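The two layouts can be sketched as small data.frames written out as tab-separated files. All file names, block boundaries, and chromosome labels below are made up for illustration; only the column names and the start=0/end=0 convention come from the description above.

```r
# Pre-computed LD blocks: several rows per chromosome, one per block.
# The first path pairs a .cor.xz file with a .bim file via a comma.
precomputed_meta <- data.frame(
  chrom = c("chr1", "chr1"),
  start = c(1000L, 50000L),
  end = c(50000L, 120000L),
  path = c(
    "blocks/chr1_1000_50000.cor.xz,blocks/chr1_1000_50000.bim",
    "blocks/chr1_50000_120000.cor.xz"
  ),
  stringsAsFactors = FALSE
)

# PLINK genotype files: one row per chromosome, start = 0 and end = 0,
# with path holding a per-chromosome PLINK prefix.
plink_meta <- data.frame(
  chrom = c("chr1", "chr2"),
  start = 0L,
  end = 0L,
  path = c("geno/chr1", "geno/chr2"),
  stringsAsFactors = FALSE
)

# Write one of them as a TSV and read it back to confirm the layout.
meta_file <- tempfile(fileext = ".tsv")
write.table(plink_meta, meta_file, sep = "\t", quote = FALSE, row.names = FALSE)
print(read.delim(meta_file, stringsAsFactors = FALSE))

# With real files in place, the call would look like this (not run here):
# res <- load_LD_matrix(meta_file, region = "chr1:1000-2000")
```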