{ "cells": [ { "cell_type": "markdown", "metadata": { "kernel": "SoS" }, "source": [ "# MASH analysis pipeline with data-driven prior matrices\n", "\n", "In this notebook, we utilize the MASH prior, referred to as the [mixture_prior](https://github.com/statfungen/xqtl-protocol/blob/6c637645ce16aee2aa7dc86bbc334fb6bb66b9d9/code/multivariate/MASH/mixture_prior.ipynb#L4), from a previous step. Our objective is to conduct a multivariate analysis under the MASH model. After fitting the model, we subsequently compute the posteriors for our variables of interest." ] }, { "cell_type": "markdown", "metadata": { "kernel": "SoS" }, "source": [ "## Methods" ] }, { "cell_type": "markdown", "metadata": { "kernel": "SoS" }, "source": [ "### Multivariate adaptive shrinkage (MASH) analysis of eQTL data\n", "\n", "\n", "Since we published Urbut 2019, we have improved implementation of MASH algorithm and made a new R package, [`mashr`](https://github.com/stephenslab/mashr). Major improvements compared to Urbut 2019 are:\n", "\n", "1. Faster computation of likelihood and posterior quantities via matrix algebra tricks and a C++ implementation.\n", "2. Faster computation of MASH mixture via convex optimization.\n", "3. New ways to estimate prior in place of the `SFA` approach, see `mixture_prior.ipynb` for details.\n", "4. Improve estimate of residual variance $\\hat{V}$.\n", "\n", "At this point, the input data have already been converted from the original association summary statistics to a format convenient for analysis in MASH." ] }, { "cell_type": "markdown", "metadata": { "kernel": "SoS" }, "source": [ "## MWE Data\n", "\n", "Avaiable on [synapse.org](https://www.synapse.org/#!Synapse:syn52624471)" ] }, { "cell_type": "markdown", "metadata": { "kernel": "Bash" }, "source": [ "## Multivariate analysis with MASH model\n", "\n", "Using MWE with [prior previously generated](https://github.com/statfungen/xqtl-protocol/blob/main/code/multivariate/MASH/mixture_prior.ipynb)." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "kernel": "Bash" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "INFO: Running \u001b[32mmash_1\u001b[0m: Fit MASH mixture model (time estimate: <15min for 70K by 49 matrix)\n", "INFO: \u001b[32mmash_1\u001b[0m (index=0) is \u001b[32mignored\u001b[0m due to saved signature\n", "INFO: \u001b[32mmash_1\u001b[0m output: \u001b[32m/home/gw/Documents/xQTL/output/mashr/protocol_example.EZ.mash_model.rds\u001b[0m\n", "INFO: Running \u001b[32mmash_2\u001b[0m: Compute posterior for the \"strong\" set of data as in Urbut et al 2017. 
This is optional because most of the time we want to apply the MASH model learned on much larger data-set.\n", "INFO: \u001b[32mmash_2\u001b[0m is \u001b[32mcompleted\u001b[0m.\n", "INFO: \u001b[32mmash_2\u001b[0m output: \u001b[32m/home/gw/Documents/xQTL/output/mashr/protocol_example.EZ.posterior.rds\u001b[0m\n", "INFO: Workflow mash (ID=wdb1a3918af0fd3c3) is executed successfully with 1 completed step and 1 ignored step.\n" ] } ], "source": [ "sos run pipeline/mash_fit.ipynb mash \\\n", " --output-prefix protocol_example \\\n", " --data output/protocol_example.mashr_input.rds \\\n", " --vhat-data output/mashr/protocol_example.ed_mixture.EZ.V_simple.rds \\\n", " --prior-data output/mashr/protocol_example.ed_mixture.EZ.prior.rds \\\n", " --compute-posterior \\\n", " --cwd output/mashr \\\n", " --container oras://ghcr.io/statfungen/stephenslab_apptainer:latest" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "kernel": "R" }, "outputs": [ { "data": { "text/html": [ "
| ALL | Ast | End | Exc | Inh | Mic | OPC | Oli |
|---|---|---|---|---|---|---|---|
| 1.000725e-06 | 1.206902e-01 | 0.38024501 | 2.341416e-05 | 0.021238026 | 2.438671e-02 | 2.822754e-01 | 4.950394e-02 |
| 6.195229e-01 | 6.725207e-01 | 0.65046008 | 6.835928e-01 | 0.582288619 | 6.287649e-01 | 6.903822e-01 | 4.929722e-01 |
| 5.502451e-08 | 6.952073e-05 | 0.42849153 | 3.848081e-09 | 0.001349139 | 2.974207e-01 | 2.267577e-01 | 2.193946e-01 |
| 1.141256e-02 | 3.673827e-01 | 0.29984283 | 8.920542e-03 | 0.062520281 | 1.557580e-01 | 1.438823e-01 | 3.652727e-01 |
| 4.803276e-01 | 1.793390e-01 | 0.01916176 | 1.752551e-01 | 0.384661176 | 3.138391e-01 | 4.888775e-01 | 3.793039e-01 |
| 3.240321e-01 | 1.006760e-04 | 0.28423516 | 2.692014e-10 | 0.464610846 | 8.359305e-04 | 8.654122e-21 | 1.358764e-09 |
| 4.744668e-01 | 3.330669e-16 | 0.35728521 | 2.052281e-05 | 0.437127648 | 3.887724e-02 | 1.843843e-05 | 1.374751e-06 |
| 3.166308e-03 | 0.000000e+00 | 0.34573821 | 2.111164e-04 | 0.424465824 | 1.686171e-02 | 9.992007e-16 | 2.119553e-07 |
| 4.826046e-04 | 3.121070e-02 | 0.31763916 | 3.181940e-01 | 0.000107790 | 6.340894e-19 | 2.832298e-02 | 3.138958e-02 |
| 2.426264e-01 | 3.950080e-01 | 0.16192905 | 1.023271e-01 | 0.284297524 | 3.167406e-04 | 1.449052e-02 | 4.348748e-01 |
| 3.999852e-09 | 8.353939e-02 | 0.47399869 | 2.161030e-06 | 0.017915140 | 1.450814e-01 | 2.240465e-01 | 4.253355e-01 |
| 8.830382e-06 | 5.623546e-03 | 0.23046666 | 7.952619e-08 | 0.465964274 | 9.821045e-02 | 6.871004e-02 | 2.039190e-01 |
| 4.616982e-05 | 2.434419e-02 | 0.32092168 | 2.148510e-03 | 0.001318834 | 4.224729e-04 | 1.671158e-01 | 2.667831e-01 |
| 1.929503e-01 | 2.808606e-01 | 0.00177875 | 2.362496e-02 | 0.327244770 | 3.820119e-01 | 3.156853e-01 | 1.851656e-01 |
| 2.744579e-02 | 5.054301e-02 | 0.15913565 | 1.581546e-04 | 0.086397136 | 3.507312e-01 | 9.995466e-02 | 1.706981e-01 |
| 2.349326e-01 | 2.796353e-01 | 0.24466478 | 2.804030e-01 | 0.251516032 | 3.344815e-01 | 1.402243e-02 | 3.982418e-01 |
| 1.711193e-09 | 6.569349e-03 | 0.03457580 | 4.631637e-01 | 0.422135645 | 3.970892e-01 | 2.515601e-01 | 5.054854e-14 |
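
The table above previews a posterior summary matrix, one row per effect and one column per cell type (values of this kind are what a summary such as the local false sign rate looks like). For readers who want to see roughly what the `mash_fit.ipynb` workflow does under the hood, below is a minimal `mashr` sketch of the same two steps: fit the mixture weights on one data set, then compute posteriors for the "strong" effects with the fitted model held fixed. This is an illustration under assumptions, not the pipeline itself: the element names inside the `.rds` files (`random.b`, `random.s`, `strong.b`, `strong.s`) are hypothetical, and `alpha = 1` is used only because the output file names carry the `EZ` label.

```r
# Minimal sketch of the two MASH steps using the mashr package directly.
# Paths come from the command above; the object layout inside each .rds
# file (random.b, random.s, strong.b, strong.s) is an assumption made
# for illustration -- inspect the actual files before running.
library(mashr)

dat   <- readRDS("output/protocol_example.mashr_input.rds")
prior <- readRDS("output/mashr/protocol_example.ed_mixture.EZ.prior.rds")    # assumed: list of prior covariance matrices
vhat  <- readRDS("output/mashr/protocol_example.ed_mixture.EZ.V_simple.rds") # assumed: estimated residual correlation matrix

# Step 1 (mash_1): fit the mixture weights; alpha = 1 corresponds to the EZ model.
d.random <- mash_set_data(dat$random.b, dat$random.s, V = vhat, alpha = 1)
m        <- mash(d.random, Ulist = prior, outputlevel = 1)

# Step 2 (mash_2): apply the fitted model to the "strong" effects and
# extract posterior summaries.
d.strong <- mash_set_data(dat$strong.b, dat$strong.s, V = vhat, alpha = 1)
m.strong <- mash(d.strong, g = get_fitted_g(m), fixg = TRUE)

head(get_lfsr(m.strong))  # local false sign rate, one column per cell type
head(get_pm(m.strong))    # posterior mean effect sizes
```

In the workflow log above, these two steps correspond to `mash_1` (writing `protocol_example.EZ.mash_model.rds`) and `mash_2` (writing `protocol_example.EZ.posterior.rds`), which stores posterior matrices of the kind returned by `get_lfsr()` and `get_pm()` here.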