Fine-mapping with pecotmr

Overview

fineMappingPipeline() is a single pipeline that can accept any of the following inputs:

Input class	What it represents	Methods supported
`QtlDataset`	Single-study individual-level data	susie, susieInf, susieAsh, mvsusie, fsusie, susieSer
`MultiStudyQtlDataset`	Multiple `QtlDataset` objects (optionally with an embedded `QtlSumStats`)	susie, susieInf, susieAsh, mvsusie, fsusie (individual-level only), susieSer (individual-level only)
`QtlSumStats`	QTL summary statistics annotated by `(study, context, trait)`	susie, susieInf, susieAsh, mvsusie
`GwasSumStats`	GWAS summary statistics annotated by `study`	susie, susieInf, susieAsh, mvsusie

Method arguments are the same across input classes; fineMappingPipeline() routes methods = "susie" to susieR::susie for individual-level data and to susieR::susie_rss for summary-statistics inputs automatically.
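The routing idea can be illustrated with plain S4 dispatch. The sketch below is not pecotmr's implementation: ToyIndividual, ToySumStats, and toyBackend() are hypothetical names invented for this example, and the methods only return a label naming the backend that would be chosen rather than calling susieR.

```r
library(methods)

# Hypothetical toy classes standing in for individual-level and
# summary-statistics inputs; these are not pecotmr classes.
setClass("ToyIndividual", representation(X = "matrix", y = "numeric"))
setClass("ToySumStats",   representation(z = "numeric", R = "matrix"))

# One generic with one method per input class: the caller passes the same
# `methods` value, and S4 dispatch picks the backend label.
setGeneric("toyBackend",
           function(x, methods = "susie") standardGeneric("toyBackend"))
setMethod("toyBackend", "ToyIndividual",
          function(x, methods = "susie") paste0(methods, " -> susieR::susie"))
setMethod("toyBackend", "ToySumStats",
          function(x, methods = "susie") paste0(methods, " -> susieR::susie_rss"))

ind <- new("ToyIndividual", X = matrix(0, 2, 2), y = c(0, 0))
ss  <- new("ToySumStats", z = c(0, 0), R = diag(2))

routeInd <- toyBackend(ind)  # label for the individual-level route
routeSs  <- toyBackend(ss)   # label for the summary-statistics route
```

The point of the design is that user code does not branch on the input type; only the class of the first argument changes.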

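The output is always a FineMappingResult, a DFrame subclass that collects one FineMappingEntry payload per tuple. In the examples below it prints as QtlFineMappingResult for QTL inputs and as GwasFineMappingResult for GWAS inputs.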

library(pecotmr)

Bundled inputs

data(qtl_dataset_example,
     qtl_sumstats_example,
     gwas_sumstats_s4_example,
     multi_study_qtl_dataset_example)

qtl_dataset_example

## QtlDataset for study 'study1'
##   1 context(s): brain
##   1 unique traits across contexts
##   Genotypes: plink1 @ pecotmr://extdata/toy_canonical
##   Genotype covariates: 0 cols
##   Scale residuals: TRUE

The *_sumstats_example objects ship with a non-empty qcInfo slot so they bypass the summaryStatsQc() gate. In a real analysis you’d construct the raw QtlSumStats / GwasSumStats from your data, then run summaryStatsQc() to populate that slot (see the summary-statistics QC vignette).

Individual-level fine-mapping

fineMappingPipeline() always needs to know which genotype window to fit on. With an individual-level QtlDataset, pass cisWindow (in base pairs around each trait’s TSS), supply an explicit region, or set both in the same call.

fmr <- fineMappingPipeline(qtl_dataset_example,
                            methods   = "susie",
                            cisWindow = 1e6)
fmr

## QtlFineMappingResult: 1 entries
##   1 studies, 1 contexts, 1 traits, 1 methods
##   LD sketch: NULL (individual-level fit)
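To make the window arithmetic concrete, the sketch below computes a cis window around a transcription start site and parses a region string of the form chr:start-end. It uses base R only. cisWindowRange() and parseRegion() are hypothetical helper names, not pecotmr functions, the TSS position is made up, and treating cisWindow as a symmetric distance clipped at position 1 is an assumption about the convention.

```r
# Hypothetical helpers for illustration; not part of pecotmr.

# Symmetric window of `cisWindow` base pairs on each side of the TSS,
# clipped so the start never drops below position 1.
cisWindowRange <- function(tss, cisWindow) {
  c(start = max(1, tss - cisWindow), end = tss + cisWindow)
}

# Parse "chr:start-end" into chromosome, start, and end.
parseRegion <- function(region) {
  m <- regmatches(region, regexec("^([^:]+):([0-9]+)-([0-9]+)$", region))[[1]]
  if (length(m) != 4L) stop("region must look like 'chr:start-end'")
  list(chrom = m[2], start = as.numeric(m[3]), end = as.numeric(m[4]))
}

win <- cisWindowRange(tss = 15600000, cisWindow = 1e6)  # made-up TSS
reg <- parseRegion("chr22:25000000-26000000")
```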

Per-tuple PIPs:

pip <- getPip(fmr, study = "study1", context = "brain",
              trait = "ENSG_example", method = "susie")
head(sort(pip, decreasing = TRUE), 5)

## chr22:15611133:C:G chr22:15570730:C:T chr22:15569194:A:G chr22:15567276:C:T 
##          0.2111065          0.1427403          0.1361272          0.1270685 
## chr22:15594669:T:C 
##          0.1169922

Credible sets (this toy fit reports none at 95% coverage, so the result has zero rows):

getCs(fmr, study = "study1", context = "brain",
      trait = "ENSG_example", method = "susie", coverage = 0.95)

##  [1] variant_id chrom      pos        A1         A2         N         
##  [7] af         beta       se         pip        logBF     
## <0 rows> (or 0-length row.names)
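For readers new to credible sets, here is a minimal base-R sketch of how a coverage level translates into a variant set for one single-effect posterior, and how purity is usually measured as the minimum absolute pairwise correlation within the set. credibleSet() and purity() are hypothetical helpers, and the alpha vector and correlation matrix are made up; this is not how getCs() is implemented.

```r
# Illustration only: hypothetical helpers, not pecotmr or susieR functions.

# Smallest set of variants whose cumulative posterior reaches `coverage`.
credibleSet <- function(alpha, coverage = 0.95) {
  ord <- order(alpha, decreasing = TRUE)
  n <- which(cumsum(alpha[ord]) >= coverage)[1]
  names(alpha)[ord[seq_len(n)]]
}

# Purity: minimum absolute pairwise correlation among set members.
purity <- function(R, ids) {
  if (length(ids) < 2L) return(1)
  sub <- abs(R[ids, ids])
  min(sub[upper.tri(sub)])
}

# Made-up single-effect posterior over five variants (sums to 1).
alpha <- c(v1 = 0.50, v2 = 0.30, v3 = 0.12, v4 = 0.05, v5 = 0.03)

# Made-up LD matrix with correlation decaying as 0.9^|i - j|.
R <- 0.9^abs(outer(1:5, 1:5, "-"))
dimnames(R) <- list(names(alpha), names(alpha))

cs95 <- credibleSet(alpha, 0.95)  # broadest set
cs70 <- credibleSet(alpha, 0.70)
cs50 <- credibleSet(alpha, 0.50)  # narrowest set
purity95 <- purity(R, cs95)
purity70 <- purity(R, cs70)
```

Lower coverage gives a narrower set, and a broader set tends to have lower purity because it spans variants in weaker LD with each other.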

The topLoci table consolidates per-method PIPs, credible-set membership, and summary statistics into one long-format data.frame:

head(getTopLoci(fmr, study = "study1", context = "brain",
                trait = "ENSG_example", method = "susie"), 5)

##           variant_id chrom      pos A1 A2   N        af        beta         se
## 1 chr22:15567276:C:T    22 15567276  T  C 165 0.1993865 0.033635519 0.09184189
## 2 chr22:15569194:A:G    22 15569194  G  A 165 0.2006173 0.036217636 0.09504913
## 3 chr22:15570730:C:T    22 15570730  T  C 165 0.2000000 0.038109622 0.09729834
## 4 chr22:15583103:G:A    22 15583103  A  G 165 0.2639752 0.006303249 0.03957304
## 5 chr22:15594252:T:C    22 15594252  C  T 165 0.1737805 0.021518343 0.07410824
##          pip    logBF   cs_95   cs_70   cs_50 cs_95_purity cs_70_purity
## 1 0.12706853 5.418456 susie_0 susie_0 susie_0            0            0
## 2 0.13612718 5.487319 susie_0 susie_0 susie_0            0            0
## 3 0.14274027 5.534756 susie_0 susie_0 susie_0            0            0
## 4 0.02713469 3.874542 susie_0 susie_0 susie_0            0            0
## 5 0.08392514 5.003655 susie_0 susie_0 susie_0            0            0
##   cs_50_purity within_cs_pip method gene event grange_start grange_end
## 1            0            NA  susie <NA>  <NA>           NA         NA
## 2            0            NA  susie <NA>  <NA>           NA         NA
## 3            0            NA  susie <NA>  <NA>           NA         NA
## 4            0            NA  susie <NA>  <NA>           NA         NA
## 5            0            NA  susie <NA>  <NA>           NA         NA

Multi-study fine-mapping

For a MultiStudyQtlDataset, the pipeline runs once per study and concatenates the results into a single QtlFineMappingResult:

msFmr <- fineMappingPipeline(multi_study_qtl_dataset_example,
                              methods   = "susie",
                              cisWindow = 1e6)
table(msFmr$study)

## 
## study1 study2 
##      1      1

QTL summary-statistics fine-mapping

fmrSs <- fineMappingPipeline(qtl_sumstats_example, methods = "susie")
fmrSs

## QtlFineMappingResult: 1 entries
##   1 studies, 1 contexts, 1 traits, 1 methods
##   LD sketch: plink1 @ pecotmr://extdata/toy_canonical

The QtlSumStats collection carries an ldSketch GenotypeHandle; the pipeline pulls the per-region LD from there automatically.

GWAS summary-statistics fine-mapping

fmrGwas <- fineMappingPipeline(gwas_sumstats_s4_example,
                                methods = "susie")
fmrGwas

## GwasFineMappingResult: 1 entries
##   1 studies, 1 methods
##   LD sketch: plink1 @ pecotmr://extdata/toy_canonical

The GWAS path mirrors the QTL summary-statistics path, but results are annotated by study only, with no context or trait.

Common parameters

fineMappingPipeline(
  qtl_dataset_example,
  methods            = c("susie", "susieInf"),
  contexts           = "brain",
  traitId            = "ENSG_example",
  region             = "chr22:25000000-26000000",
  cisWindow          = 5e5,
  coverage           = 0.95,
  secondaryCoverage  = c(0.7, 0.5),
  signalCutoff       = 0.025,
  minAbsCorr         = 0.8,
  addSusieInf        = TRUE,
  naAction           = "drop",   # "drop" (default) or "impute"
  verbose            = 1)

naAction controls how phenotype NAs are handled by getResidualizedPhenotypes(): "drop" removes samples with any NA across the requested traits, "impute" mean-imputes each trait independently.
addSusieInf = TRUE runs SuSiE-inf first and uses its result to initialise the susie fit, which can improve credible-set purity in regions with background polygenic signal.
secondaryCoverage adds secondary CS columns at the requested coverages (alongside the primary coverage); useful when you want both narrow and broad CS reports.
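The two naAction behaviours described above can be sketched in base R on a toy phenotype matrix. The sample names, trait names, and values are made up, and this only mirrors the description; it is not the code inside getResidualizedPhenotypes().

```r
# Toy phenotype matrix: 4 samples x 2 traits, filled column by column.
pheno <- matrix(c(1.0, NA, 3.0, 4.0,
                  2.0, 2.5, NA, 4.5),
                nrow = 4,
                dimnames = list(paste0("s", 1:4), c("traitA", "traitB")))

# "drop": keep only samples with no NA in any requested trait.
dropped <- pheno[stats::complete.cases(pheno), , drop = FALSE]

# "impute": replace each trait's NAs with that trait's observed mean,
# independently of the other traits.
imputed <- apply(pheno, 2, function(y) {
  y[is.na(y)] <- mean(y, na.rm = TRUE)
  y
})
```

With "drop" the sample set shrinks as more traits are requested, whereas "impute" keeps every sample at the cost of shrinking imputed values toward the trait mean.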

Joint multi-context / multi-trait fits

jointSpecification switches the pipeline to a joint fit across one axis: mvsusie for cross-context fits, and mvsusie or fsusie for multi-trait fits:

fineMappingPipeline(
  multi_study_qtl_dataset_example,
  methods            = "mvsusie",
  jointSpecification = list(axis = "context",
                             contexts = c("brain", "liver")))

See ?jointSpecification for the specification grammar.

Reading the result

FineMappingResult is a DFrame subclass: subsetting with [, concatenating with c(), and column access via $ all work as on a DataFrame. The per-tuple payload lives in the entry column as a FineMappingEntry:

entry <- fmr$entry[[1L]]
class(entry)

## [1] "FineMappingEntry"
## attr(,"package")
## [1] "pecotmr"

slotNames(entry)

## [1] "variantIds" "susieFit"   "topLoci"    "cvResult"

Each entry carries:

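variantIds: the LD-aligned variant set the fit was computed on
susieFit: a slim (trimmed) list containing alpha, lbf_variable, mu, mu2, X_column_scale_factors, pip, and credible-set metadata
topLoci: the long-format per-variant summary used by downstream pipelines and vcfWriter, including the summary statistics (beta, se) passed through for reporting
cvResult: the fourth slot listed by slotNames() above; presumably optional cross-validation output (see the FineMappingEntry class documentation)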
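As a reminder of how alpha and pip relate in a SuSiE-style fit, the sketch below applies the standard formula PIP_j = 1 - prod_l (1 - alpha[l, j]) to a made-up alpha matrix. It uses base R only; whether the stored pip was derived in exactly this way (for example, after dropping pruned effects) is an assumption.

```r
# Made-up alpha: L = 2 single effects (rows) x 3 variants (columns).
# Each row is a posterior over variants and sums to 1.
alpha <- rbind(c(0.7, 0.2, 0.1),
               c(0.1, 0.1, 0.8))
colnames(alpha) <- c("v1", "v2", "v3")

# A variant is included if at least one single effect selects it:
# PIP_j = 1 - prod over effects l of (1 - alpha[l, j]).
pip <- 1 - apply(1 - alpha, 2, prod)
```

Here v1 gets 1 - 0.3 * 0.9 = 0.73, v2 gets 1 - 0.8 * 0.9 = 0.28, and v3 gets 1 - 0.9 * 0.2 = 0.82.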

Next steps

QtlFineMappingResult feeds colocPipeline for QTL-GWAS colocalization and twasWeightsPipeline, where fine-mapping weights join the TWAS ensemble weights model alongside regularized regression methods.
Run the full causalInferencePipeline (TWAS + MR) with a FineMappingResult and GwasSumStats.

pecotmr authors

2026-07-30