{ "cells": [ { "cell_type": "markdown", "id": "4ab9c2ef-b677-4d31-a979-f89eef9bad3f", "metadata": { "kernel": "SoS", "tags": [] }, "source": [ "# Phenotype data simulation\n", "\n", "## Goal\n", "\n", "Here we use two strategies to simulate phenotype data (Y matrix) based on individual-level genotype data (X matrix)." ] }, { "cell_type": "markdown", "id": "36576543-2bc8-4a16-aee7-a2e46584202e", "metadata": { "kernel": "SoS" }, "source": [ "## Input\n", "\n", "`genofile`: PLINK file of real genotypes, `/mnt/vast/hpc/csg/FunGen_xQTL/ROSMAP/Genotype/plink_by_gene/extended_cis_before_winsorize_plink_files/*.bim`\n", "\n", "The other parameters are described in the simxQTL repo: `https://github.com/StatFunGen/simxQTL`." ] }, { "cell_type": "markdown", "id": "46219b10-544a-4799-8e20-c53ee667ab53", "metadata": { "kernel": "SoS" }, "source": [ "## Output\n", "\n", "An RDS file containing the genotype matrix X (dimension: m * n, m: number of samples, n: number of SNPs) and the phenotype (trait) matrix Y (dimension: m * a, m: number of samples, a: number of simulated traits).\n", "\n", "Example output:" ] }, { "cell_type": "code", "execution_count": 2, "id": "d7af7a25-eb58-4629-b550-c1aa87f2249e", "metadata": { "kernel": "R", "tags": [] }, "outputs": [], "source": [ "result = readRDS(\"/home/hs3393/cb_Mar/simulation_data/real_simulation_5trait/sample_999_real_simulation_3_ncausal_5_trait.rds\")" ] }, { "cell_type": "code", "execution_count": 5, "id": "8320e406-bb6f-416d-89f3-72349f5ebc36", "metadata": { "kernel": "R", "tags": [] }, "outputs": [ { "data": { "text/html": [ "\n", "