Univariate Fine-Mapping of Functional (Epigenomic) Data with fSuSiE

Description

Univariate fine-mapping of functional (epigenomic) data is performed with fSuSiE. The workflow is similar to standard univariate fine-mapping; the main difference is the use of epigenomic data.

Input

--genoFile: path to a text file listing the genotype files (ID and path). For example:

#id     #path
21      $PATH/protocol_example.genotype.chr21_22.21.bed
22      $PATH/protocol_example.genotype.chr21_22.22.bed
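
A list like the one above can be generated rather than typed by hand. The following is a minimal sketch, assuming one PLINK .bed file per chromosome named <prefix>.<chrom>.bed and a tab-delimited list. The temporary directory, the placeholder files, and the output file name are hypothetical and exist only to make the example runnable.

```shell
# Hypothetical layout: one PLINK .bed file per chromosome in $geno_dir,
# named <prefix>.<chrom>.bed as in the example above.
geno_dir=$(mktemp -d)
prefix=protocol_example.genotype.chr21_22
touch "$geno_dir/$prefix.21.bed" "$geno_dir/$prefix.22.bed"   # empty placeholder files for the demo

geno_list="$geno_dir/genotype_by_chrom_files.txt"             # assumed output name
printf '#id\t#path\n' > "$geno_list"
for bed in "$geno_dir/$prefix".*.bed; do
    chrom=${bed%.bed}        # strip the .bed extension
    chrom=${chrom##*.}       # keep the text after the last dot, i.e. the chromosome
    printf '%s\t%s\n' "$chrom" "$bed" >> "$geno_list"
done
cat "$geno_list"
```

The resulting file can then be passed to --genoFile.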

--phenoFile: a tab-delimited file containing the chr, start, end, ID and path for each region. For example:

#chr    start   end     ID      path
chr21   0       14120807        TADB_1297       $PATH/protocol_example.ha.bed.gz
chr21   10840000        16880069        TADB_1298       $PATH/protocol_example.ha.bed.gz
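
Before launching a run, it can help to check that the region list is well-formed. The sketch below is an illustrative check, not part of the pipeline: it assumes five tab-separated fields per row and start smaller than end, and it builds a demo copy of the example above with a placeholder path column.

```shell
# Demo copy of the example region list (tab-delimited); the path column is a placeholder.
pheno_list=$(mktemp)
printf '#chr\tstart\tend\tID\tpath\n' > "$pheno_list"
printf 'chr21\t0\t14120807\tTADB_1297\tprotocol_example.ha.bed.gz\n' >> "$pheno_list"
printf 'chr21\t10840000\t16880069\tTADB_1298\tprotocol_example.ha.bed.gz\n' >> "$pheno_list"

# Collect rows that do not have five fields or whose start is not below end.
bad_rows=$(awk -F'\t' '
    /^#/ { next }    # skip the header line
    NF != 5 || $2 + 0 >= $3 + 0 { print NR ": " $0 }
' "$pheno_list")

if [ -z "$bad_rows" ]; then
    echo "region list looks well-formed"
else
    echo "malformed rows:"
    echo "$bad_rows"
fi
```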

--covFile: path to a gzipped file with covariates in the rows and sample IDs in the columns.

--customized-association-windows: a tab-delimited file containing the chr, start, end, and ID of each region. For example:

#chr    start   end     ID
chr21   0       14120807        TADB_1297
chr21   10840000        16880069        TADB_1298

--region-name: to restrict the analysis to specific regions, provide one or more region IDs found in the customized-association-windows file.
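
To confirm that an ID passed to --region-name is present in the windows file, a quick lookup can be sketched as follows. This assumes the four-column tab-delimited layout shown above; the temporary file and the variable names are hypothetical.

```shell
# Demo copy of the example windows file (tab-delimited).
windows=$(mktemp)
printf '#chr\tstart\tend\tID\n' > "$windows"
printf 'chr21\t0\t14120807\tTADB_1297\n' >> "$windows"
printf 'chr21\t10840000\t16880069\tTADB_1298\n' >> "$windows"

region=TADB_1298
# Print chr:start-end for the row whose ID column matches the requested region.
match=$(awk -F'\t' -v id="$region" '$4 == id { print $1 ":" $2 "-" $3 }' "$windows")

if [ -n "$match" ]; then
    echo "$region -> $match"
else
    echo "$region not found in $windows" >&2
fi
```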

Output

  • *.fsusie_mixture_normal_TI__top_pc_weights.rds


Minimal Working Example Steps

iii. Run the Fine-Mapping with fSuSiE

sos run pipeline/mnm_regression.ipynb fsusie \
    --cwd output/fsusie/ \
    --name test_fsusie \
    --genoFile output/genotype_by_chrom/wgs.merged.plink_qc.genotype_by_chrom_files.txt \
    --phenoFile output/phenotype/phenotype_by_chrom_for_cis/bulk_rnaseq.phenotype_by_chrom_files.region_list.txt \
    --covFile output/covariate/bulk_rnaseq_tpm_matrix.low_expression_filtered.outlier_removed.tmm.expression.covariates.wgs.merged.plink_qc.plink_qc.prune.pca.Marchenko_PC.gz \
    --numThreads 8 \
    --customized-association-windows reference_data/TAD/TADB_enhanced_cis.bed \
    --save-data \
    --region-name ENSG00000049246 ENSG00000054116 ENSG00000116678 ENSG00000073921 ENSG00000186891

Anticipated Results

Univariate fine-mapping of functional data produces a file containing results for the top hits and a file containing residuals from SuSiE.

protocol_example_methylation.chr21_10840000_16880069.fsusie_mixture_normal_top_pc_weights.rds:

  • For each region of interest, this file contains:

    1. susie_on_top_pc - ?

    2. twas_weights - for each variant (for enet, lasso and mrash methods). no susie?

    3. twas predictions - for each sample (for enet, lasso, mrash methods)

    4. twas cross-validation results - information on the best method; the data are split into five parts

    5. fsusie results - ?

    6. Y coordinates - ?

    7. fsusie summary - ?

    8. total time elapsed

    9. region info - information on the region specified

protocol_example_methylation.chr21_10840000_16880069.16_marks.dataset.rds:

  • For each gene of interest, contains residuals for each sample and phenotype

  • See the pecotmr code for a description: fSuSiE uses the load_regional_functional_data function, and an explanation of its arguments can be found under the similar load_regional_association_data function.