# Multivariate Fine-Mapping with mvSuSiE and mr.mash


Multivariate fine-mapping using mvSuSiE and mr.mash is also available in our pipeline.


## Input


    
`--genoFile`: path to a text file listing the genotype files, with one ID and one path per line. For example:
```
#id     #path
21      $PATH/protocol_example.genotype.chr21_22.21.bed
22      $PATH/protocol_example.genotype.chr21_22.22.bed
```
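
A genotype list like the one above can be generated rather than typed by hand. The sketch below is only an illustration: `make_geno_list` is a hypothetical helper, not part of the pipeline, and it assumes that the file is tab-delimited and that the per-chromosome files are named `<prefix>.<chr>.bed`, as in the example. Redirect its output to a file and pass that file to `--genoFile`.

```shell
# Hypothetical helper (not part of the pipeline): print a two-column,
# tab-delimited genotype list for the given chromosomes.
# Assumes per-chromosome files are named <prefix>.<chr>.bed.
# Usage: make_geno_list PREFIX CHR [CHR ...] > genotype_list.txt
make_geno_list() {
    prefix=$1
    shift
    # Header line, matching the example above
    printf '#id\t#path\n'
    for chr in "$@"; do
        printf '%s\t%s.%s.bed\n' "$chr" "$prefix" "$chr"
    done
}
```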
`--phenoFile`: a tab-delimited file containing the chr, start, end, ID, and path for each region. For example:
```
#chr    start   end     ID      path
chr21   0       14120807        TADB_1297       $PATH/protocol_example.ha.bed.gz
chr21   10840000        16880069        TADB_1298       $PATH/protocol_example.ha.bed.gz
```

`--covFile`: path to a gzipped file containing covariates in the rows, and sample ids in the columns.  
`--customized-association-windows`: a tab-delimited file containing the chr, start, end, and ID of each region. For example:
```
#chr    start   end     ID
chr21   0       14120807        TADB_1297
chr21   10840000        16880069        TADB_1298
```
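
Because a malformed window file can be hard to diagnose once the pipeline is running, a quick format check may be worth doing first. The following is a minimal sketch, not a pipeline feature: `check_windows` is a hypothetical helper that assumes the four-column, tab-delimited layout shown above, with header lines starting with `#`. It prints one message per problem row and returns a non-zero status if any row fails.

```shell
# Hypothetical helper (not part of the pipeline): sanity-check a
# tab-delimited window file with columns chr, start, end, ID.
# Usage: check_windows windows.bed
check_windows() {
    awk -F'\t' '
        /^#/ { next }    # skip header lines
        NF != 4 {
            printf "line %d: expected 4 columns, found %d\n", NR, NF
            bad = 1
            next
        }
        $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ {
            printf "line %d: start and end must be non-negative integers\n", NR
            bad = 1
            next
        }
        $2 + 0 >= $3 + 0 {
            printf "line %d: start must be smaller than end\n", NR
            bad = 1
        }
        END { exit bad + 0 }    # non-zero exit status if any row failed
    ' "$1"
}
```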
`--region-name`: to analyze a single region, pass the ID of a region listed in the `--customized-association-windows` file.

`--mixture_prior`: path to the mixture prior, an RDS file from mr.mash.

## Minimal Working Example Steps

### iv. Run the Fine-Mapping with mvSuSiE

```
sos run $PATH/protocol/pipeline/mnm_regression.ipynb mnm \
    --name ROSMAP_mega_eQTL --cwd $PATH/output/ \
    --genoFile $PATH/genofile/ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.11.bed \
    --phenoFile $PATH/phenofile/Mic/analysis_ready/phenotype_preprocessing/snuc_pseudo_bulk.Mic.mega.normalized.log2cpm.region_list.txt \
                $PATH/phenofile/Ast/analysis_ready/phenotype_preprocessing/snuc_pseudo_bulk.Ast.mega.normalized.log2cpm.region_list.txt \
                $PATH/phenofile/Oli/analysis_ready/phenotype_preprocessing/snuc_pseudo_bulk.Oli.mega.normalized.log2cpm.region_list.txt \
                $PATH/phenofile/OPC/analysis_ready/phenotype_preprocessing/snuc_pseudo_bulk.OPC.mega.normalized.log2cpm.region_list.txt \
                $PATH/phenofile/Exc/analysis_ready/phenotype_preprocessing/snuc_pseudo_bulk.Exc.mega.normalized.log2cpm.region_list.txt \
                $PATH/phenofile/Inh/analysis_ready/phenotype_preprocessing/snuc_pseudo_bulk.Inh.mega.normalized.log2cpm.region_list.txt \
    --covFile $PATH/phenofile/Mic/analysis_ready/covariate_preprocessing/snuc_pseudo_bulk.Mic.mega.normalized.log2cpm.rosmap_cov.ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.snuc_pseudo_bulk_mega.related.plink_qc.extracted.pca.projected.Marchenko_PC.gz \
              $PATH/phenofile/Ast/analysis_ready/covariate_preprocessing/snuc_pseudo_bulk.Ast.mega.normalized.log2cpm.rosmap_cov.ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.snuc_pseudo_bulk_mega.related.plink_qc.extracted.pca.projected.Marchenko_PC.gz \
              $PATH/phenofile/Oli/analysis_ready/covariate_preprocessing/snuc_pseudo_bulk.Oli.mega.normalized.log2cpm.rosmap_cov.ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.snuc_pseudo_bulk_mega.related.plink_qc.extracted.pca.projected.Marchenko_PC.gz \
              $PATH/phenofile/OPC/analysis_ready/covariate_preprocessing/snuc_pseudo_bulk.OPC.mega.normalized.log2cpm.rosmap_cov.ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.snuc_pseudo_bulk_mega.related.plink_qc.extracted.pca.projected.Marchenko_PC.gz \
              $PATH/phenofile/Exc/analysis_ready/covariate_preprocessing/snuc_pseudo_bulk.Exc.mega.normalized.log2cpm.rosmap_cov.ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.snuc_pseudo_bulk_mega.related.plink_qc.extracted.pca.projected.Marchenko_PC.gz \
              $PATH/phenofile/Inh/analysis_ready/covariate_preprocessing/snuc_pseudo_bulk.Inh.mega.normalized.log2cpm.rosmap_cov.ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.snuc_pseudo_bulk_mega.related.plink_qc.extracted.pca.projected.Marchenko_PC.gz \
    --customized-association-windows $PATH/windows/TADB_enhanced_cis.coding.bed \
    --region-name ENSG00000073921 --save_data --no-skip-twas-weights \
    --phenotype-names Mic_mega_eQTL Ast_mega_eQTL Oli_mega_eQTL OPC_mega_eQTL Exc_mega_eQTL Inh_mega_eQTL \
    --mixture_prior /data/analysis_result/mash/mixture_prior.EZ.prior.rds \
    --max_cv_variants 5000 \
    --ld_reference_meta_file $PATH/ldref/ld_meta_file.tsv
```
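
The command above repeats the same path pattern for each of the six cell types, so the `--phenoFile` and `--phenotype-names` arguments can be generated in a loop. The sketch below shows one way to do that. The helper names `pheno_paths` and `phenotype_names` and the variable `BASE_DIR` are hypothetical, introduced only for illustration. `BASE_DIR` stands in for the `$PATH` placeholder used in this document, since in a real shell session `$PATH` holds the executable search path. The sketch also assumes that no path contains whitespace, because the usage lines rely on word splitting.

```shell
# Hypothetical helpers (not part of the pipeline) that rebuild the repeated
# per-cell-type arguments from the command above.

# Print one region-list path per cell type, one per line.
# Usage: pheno_paths BASE_DIR CELLTYPE [CELLTYPE ...]
pheno_paths() {
    base=$1
    shift
    for ct in "$@"; do
        printf '%s/phenofile/%s/analysis_ready/phenotype_preprocessing/snuc_pseudo_bulk.%s.mega.normalized.log2cpm.region_list.txt\n' \
            "$base" "$ct" "$ct"
    done
}

# Print the matching phenotype names on one line, separated by spaces.
# Usage: phenotype_names CELLTYPE [CELLTYPE ...]
phenotype_names() {
    names=""
    for ct in "$@"; do
        names="${names:+$names }${ct}_mega_eQTL"
    done
    printf '%s\n' "$names"
}

# Example use inside the sos command (unquoted on purpose, so that each
# path and each name becomes a separate argument):
#   --phenoFile $(pheno_paths "$BASE_DIR" Mic Ast Oli OPC Exc Inh) \
#   --phenotype-names $(phenotype_names Mic Ast Oli OPC Exc Inh) \
```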

## Anticipated Results

For each gene, multivariate fine-mapping produces a file containing the fine-mapping results, including the top loci, and a file containing TWAS weights from the mr.mash and mvSuSiE fits.

`ROSMAP_mega_eQTL.chr11_ENSG00000073921.multivariate_bvrs.rds`:
* For each gene of interest, this file contains:
    1. mrmash_fitted
    2. reweighted_mixture_prior
    3. reweighted_mixture_prior_cv
    4. mvsusie_fitted
    5. variant_names
    6. analysis_script
    7. other_quantities
    8. context_names
    9. top_loci
    10. susie_result_trimmed
    11. total_time_elapsed
    12. region_info

`ROSMAP_mega_eQTL.chr11_ENSG00000073921.multivariate_data.rds`: (written when the `--save_data` argument is set)
* See the [pecotmr code](https://github.com/statfungen/pecotmr/blob/68d87ca1d0a059022bf4e55339621cbddc8993cc/R/file_utils.R#L461) for a description.

`ROSMAP_mega_eQTL.chr11_ENSG00000073921.multivariate_twas_weights.rds`:
* For each gene of interest and phenotype, this file contains:
    1. twas_weights - weights for the mrmash and mvsusie methods
    2. twas_predictions - TWAS predictions for the mrmash and mvsusie methods
    3. variant_names
    4. twas_cv_result
    5. total_time_elapsed
    6. region_info
