This function is adapted from those written by Peter Sørensen in the qgg package. The following prior distributions are provided:

Bayes N: Assigning a Gaussian prior to marker effects implies that the posterior means are the BLUP estimates (same as Ridge Regression).

Bayes L: Assigns a double-exponential (Laplace) prior to marker effects, which is the density used in the Bayesian LASSO.

Bayes A: Similar to ridge regression, but with a t-distribution prior (rather than Gaussian) for the marker effects; the marker variance comes from an inverse-chi-square distribution instead of being fixed. Estimation is via Gibbs sampling.

Bayes C: Uses a “rounded spike” (a low-variance Gaussian) at the origin, so that many small effects can contribute to the polygenic component. This reduces the dimensionality of the model, which makes Gibbs sampling feasible.

Bayes R: Hierarchical Bayesian mixture model with 4 Gaussian components, with variances scaled by 0, 0.0001, 0.001, and 0.01.
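To make the Bayes R mixture concrete, the following base R sketch draws marker effects from a four-component Gaussian mixture using the variance scaling factors listed above. It is an illustration only, not the package's sampler: the number of markers `m`, the common variance `vb`, and the mixture proportions `pi_mix` are assumed values chosen for the example.

```r
# Illustrative draw from a BayesR-style mixture prior (base R only).
# All values below except `gamma` are assumptions made for this sketch.
set.seed(1)

m      <- 1000                        # number of markers (assumed)
vb     <- 1                           # common marker variance (assumed)
gamma  <- c(0, 0.0001, 0.001, 0.01)   # variance scaling factors from the text
pi_mix <- c(0.95, 0.02, 0.02, 0.01)   # mixture proportions (assumed)

# Assign each marker to one of the four components.
component <- sample(seq_along(gamma), size = m, replace = TRUE, prob = pi_mix)

# Draw effects; the first component has zero variance,
# so markers assigned to it have an effect of exactly zero.
b <- rnorm(m, mean = 0, sd = sqrt(gamma[component] * vb))

table(component)   # number of markers per component
mean(b == 0)       # share of markers with a zero effect
```

Most markers fall in the zero-variance component, which is what gives the prior its sparsity, while the remaining components allow for small, moderate, and larger effects.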

Usage

gbayes_rss(
  sumstats = NULL,
  LD = NULL,
  variant_ids = NULL,
  nit = 100,
  nburn = 0,
  nthin = 4,
  method = "bayesR",
  vg = NULL,
  vb = NULL,
  ve = NULL,
  ssg_prior = NULL,
  ssb_prior = NULL,
  sse_prior = NULL,
  lambda = NULL,
  h2 = NULL,
  pi = 0.001,
  updateB = TRUE,
  updateG = TRUE,
  updateE = TRUE,
  updatePi = TRUE,
  adjustE = TRUE,
  nug = 4,
  nub = 4,
  nue = 4,
  mask = NULL,
  ve_prior = NULL,
  vg_prior = NULL,
  algorithm = "mcmc",
  tol = 0.001,
  nit_local = NULL,
  nit_global = NULL
)

Arguments

sumstats

data frame with marker summary statistics. Required columns: beta coefficient (beta), standard error of the beta coefficient (se), and GWAS sample size (n). Optional columns: variant_id or rsid, alleles (A1 and A2), and minor allele frequency (maf).

LD

is the LD matrix corresponding to the same markers as in the sumstats data frame

variant_ids

is an optional character vector of variant ids or rsids, provided outside of the sumstats data frame

nit

is the number of iterations

nburn

is the number of burn-in iterations

nthin

is the thinning parameter

method

specifies the method used; one of "bayesN", "bayesA", "bayesL", "bayesC", or "bayesR"

vg

is a scalar or matrix of genetic (co)variances

vb

is a scalar or matrix of marker (co)variances

ve

is a scalar or matrix of residual (co)variances

ssg_prior

is a scalar or matrix of prior genetic (co)variances

ssb_prior

is a scalar or matrix of prior marker (co)variances

sse_prior

is a scalar or matrix of prior residual (co)variances

lambda

is a vector or matrix of lambda values

h2

is the trait heritability

pi

is the proportion of markers in each marker variance class

updateB

is a logical for updating marker (co)variances

updateG

is a logical for updating genetic (co)variances

updateE

is a logical for updating residual (co)variances

updatePi

is a logical for updating pi

adjustE

is a logical for adjusting residual variance

nug

is a scalar or vector of prior degrees of freedom for genetic (co)variances

nub

is a scalar or vector of prior degrees of freedom for marker (co)variances

nue

is a scalar or vector of prior degrees of freedom for residual (co)variances

mask

is a vector or matrix of TRUE/FALSE values specifying whether each marker should be ignored

ve_prior

is a scalar or matrix of prior residual (co)variances

vg_prior

is a scalar or matrix of prior genetic (co)variances

algorithm

is the algorithm to use; one of "mcmc" or "em-mcmc"

tol

is the tolerance, i.e. the convergence criterion used in gbayes

nit_local

is the number of local iterations

nit_global

is the number of global iterations
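As a sketch of how the `sumstats` and `LD` inputs described above might be assembled, the example below simulates a small genotype matrix, runs a marginal regression per marker, and collects the required columns (beta, se, n) plus the optional rsid and maf columns. The simulated data, the marker names, and the use of a plain correlation matrix for `LD` are assumptions made for illustration; the exact LD format the function expects is not stated here. The call to `gbayes_rss` is left commented out because it requires the package to be installed.

```r
# Build toy inputs with base R (simulated data; all values are illustrative).
set.seed(42)

n <- 500                                   # GWAS sample size
m <- 20                                    # number of markers
p <- runif(m, 0.1, 0.5)                    # allele frequencies used to simulate
W <- sapply(p, function(f) rbinom(n, 2, f))  # n x m genotype matrix (0/1/2)
colnames(W) <- paste0("rs", seq_len(m))    # hypothetical marker names

b_true <- c(0.3, -0.2, rep(0, m - 2))      # two causal markers
y <- as.numeric(scale(W) %*% b_true + rnorm(n))

# Marginal regression of the phenotype on each marker in turn.
fit_one <- function(j) {
  coef(summary(lm(y ~ W[, j])))[2, c("Estimate", "Std. Error")]
}
est <- t(sapply(seq_len(m), fit_one))

af <- colMeans(W) / 2
sumstats <- data.frame(
  rsid = colnames(W),
  beta = est[, "Estimate"],
  se   = est[, "Std. Error"],
  n    = n,
  maf  = pmin(af, 1 - af),
  stringsAsFactors = FALSE
)

# Assumption: LD is supplied as the marker correlation matrix.
LD <- cor(W)

# With the package loaded, a call using the documented arguments might look like:
# fit <- gbayes_rss(sumstats = sumstats, LD = LD, method = "bayesR")
```

The rows of `sumstats` and the rows and columns of `LD` are kept in the same marker order, since the LD matrix must correspond to the markers in the summary statistics.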

Value

Returns a list structure including

bm

vector of posterior means for marker effects

dm

vector of posterior means for marker inclusion probabilities

vbs

scalar or vector (t) of posterior means for marker variances

vgs

scalar or vector (t) of posterior means for genomic variances

ves

scalar or vector (t) of posterior means for residual variances

pis

vector of probabilities for each mcmc iteration

pim

posterior distribution probabilities

r

vector of residuals

b

vector of estimates from the final mcmc iteration

param

a list of current parameters (same information as the items listed above), used to restart the analysis

stat

matrix (m x t) of marker information and effects used for genomic risk scoring

method

the method used

mask

which loci were masked from analysis

conv

dataframe of convergence metrics

post

posterior parameter estimates

ve

mean residual variance

vg

mean genomic variance
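The returned elements `bm` and `dm` can be used directly in downstream steps. The sketch below uses a hand-made list as a stand-in for the function's output (only `bm` and `dm` are filled in, with made-up values) to show a simple genomic score and a selection of markers by inclusion probability. The genotype matrix `W_new` and the 0.5 threshold are assumptions for illustration, and the example assumes the genotype coding matches the scale on which the effects were estimated.

```r
# Hypothetical stand-in for the returned list; values are made up.
set.seed(7)
m <- 5
fit <- list(
  bm = c(0.12, 0.00, -0.08, 0.00, 0.03),   # posterior mean marker effects
  dm = c(0.95, 0.02, 0.81, 0.01, 0.30)     # posterior inclusion probabilities
)

# Hypothetical genotypes for 10 new individuals at the same m markers.
W_new <- matrix(rbinom(10 * m, 2, 0.3), nrow = 10, ncol = m)

# Genomic score: weighted sum of genotypes using posterior mean effects.
score <- as.numeric(W_new %*% fit$bm)

# Markers with posterior inclusion probability above an assumed threshold.
selected <- which(fit$dm > 0.5)   # markers 1 and 3
```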