Univariate Fine-Mapping of Functional (Epigenomic) Data with fSuSiE#
Description#
Univariate fine-mapping of functional (epigenomic) data is conducted with fSuSiE. It is similar to standard univariate fine-mapping; the main difference is the use of epigenomic data.
Input#
--genoFile
: path to a text file containing information on genotype files. For example:
#id #path
21 $PATH/protocol_example.genotype.chr21_22.21.bed
22 $PATH/protocol_example.genotype.chr21_22.22.bed
--phenoFile
: a tab-delimited file containing chr, start, end, ID and path for the regions. For example:
#chr start end ID path
chr21 0 14120807 TADB_1297 $PATH/protocol_example.ha.bed.gz
chr21 10840000 16880069 TADB_1298 $PATH/protocol_example.ha.bed.gz
--covFile
: path to a gzipped file with covariates in the rows and sample IDs in the columns.
--customized-association-windows
: a tab-delimited file containing chr, start, end, and ID for each region. For example:
#chr start end ID
chr21 0 14120807 TADB_1297
chr21 10840000 16880069 TADB_1298
--region-name
: if you only wish to analyze one region, then include the ID of a region found in the customized-association-windows file
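Before passing an ID to --region-name, it can be looked up in the windows file. The following is a minimal sketch, assuming the four-column tab-delimited layout shown above with a header line starting with #. The function name find_region is hypothetical and not part of the pipeline.

```shell
# find_region WINDOWS_FILE REGION_ID
# Prints "chr<TAB>start<TAB>end" for the matching ID and returns 0.
# Returns 1 if the ID does not appear in the fourth column.
find_region() {
    local windows_file="$1" region_id="$2"
    awk -F'\t' -v id="$region_id" '
        /^#/ { next }                                  # skip the header line
        $4 == id { print $1 "\t" $2 "\t" $3; found = 1 }
        END { exit !found }                            # non-zero when no match
    ' "$windows_file"
}
```

For example, `find_region reference_data/TAD/TADB_enhanced_cis.bed TADB_1298` would print the coordinates of that region if the ID is present in the file.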
Output#
*.fsusie_mixture_normal_TI__top_pc_weights.rds
Minimal Working Example Steps#
iii. Run the Fine-Mapping with fSuSiE#
sos run pipeline/mnm_regression.ipynb fsusie \
--cwd output/fsusie/ \
--name test_fsusie \
--genoFile output/genotype_by_chrom/wgs.merged.plink_qc.genotype_by_chrom_files.txt \
--phenoFile output/phenotype/phenotype_by_chrom_for_cis/bulk_rnaseq.phenotype_by_chrom_files.region_list.txt \
--covFile output/covariate/bulk_rnaseq_tpm_matrix.low_expression_filtered.outlier_removed.tmm.expression.covariates.wgs.merged.plink_qc.plink_qc.prune.pca.Marchenko_PC.gz \
--numThreads 8 \
--customized-association-windows reference_data/TAD/TADB_enhanced_cis.bed \
--save-data \
--region-name ENSG00000049246 ENSG00000054116 ENSG00000116678 ENSG00000073921 ENSG00000186891
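Before launching the command above, it may help to confirm that the files named in the --genoFile and --phenoFile lists exist. The following is a rough sketch, assuming the path is the last whitespace-separated column of each non-header line, that paths contain no spaces, and that they are absolute or relative to the current directory. The function name list_missing_paths is hypothetical.

```shell
# list_missing_paths LIST_FILE
# Prints "missing: <path>" for every listed path that does not exist.
# Returns 1 if at least one path is missing, 0 otherwise.
list_missing_paths() {
    local list_file="$1" missing
    # Take the last column of every non-empty, non-header line and test it.
    missing=$(awk '!/^#/ && NF { print $NF }' "$list_file" |
        while IFS= read -r path; do
            [ -e "$path" ] || printf 'missing: %s\n' "$path"
        done)
    if [ -n "$missing" ]; then
        printf '%s\n' "$missing"
        return 1
    fi
}
```

It could be run once per list file, for example on the file given to --genoFile and then on the file given to --phenoFile.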
Anticipated Results#
Univariate fine-mapping for functional data will produce a file containing results for the top hits and a file containing residuals from SuSiE.
protocol_example_methylation.chr21_10840000_16880069.fsusie_mixture_normal_top_pc_weights.rds
: For each region of interest, this file contains:
susie_on_top_pc - ?
twas_weights - for each variant (for enet, lasso and mrash methods). no susie?
twas predictions - for each sample (for enet, lasso, mrash methods)
twas cross validation results - information on the best method, with the data split into five parts for cross-validation
fsusie results - ?
Y coordinates - ?
fsusie summary - ?
total time elapsed
region info - information on the region specified
protocol_example_methylation.chr21_10840000_16880069.16_marks.dataset.rds
: For each gene of interest, this file contains residuals for each sample and phenotype.
See the pecotmr code for a description. fSuSiE uses the load_regional_functional_data function; an explanation of its arguments can be found under the similar load_regional_association_data function.
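One way to explore what the output files contain is to print the structure of the stored R object. The following is a minimal sketch, assuming R is installed and Rscript is on the PATH. The function name inspect_rds is hypothetical; readRDS() and str() are base R functions.

```shell
# inspect_rds RDS_FILE
# Prints the top two levels of the object stored in an .rds file.
inspect_rds() {
    local rds_file="$1"
    if [ ! -f "$rds_file" ]; then
        echo "file not found: $rds_file" >&2
        return 1
    fi
    # The path is passed as a command-line argument rather than being
    # interpolated into the R expression.
    Rscript -e 'str(readRDS(commandArgs(trailingOnly = TRUE)[1]), max.level = 2)' "$rds_file"
}
```

Calling it on one of the .rds files named above lists the element names and types without printing the full data.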