Filters a vector `z` using a correlation matrix `LD` and a correlation threshold `rThreshold`. Among elements whose absolute correlation with one another is greater than the threshold, only one is kept; the others are treated as duplicates and removed.

Usage

find_duplicate_variants(z, LD, rThreshold)

Arguments

z

A numeric vector to be filtered.

LD

A square correlation matrix with dimensions equal to the length of `z`.

rThreshold

The correlation threshold for filtering. Elements whose absolute correlation is greater than this value are treated as duplicates.

Value

A list containing the following elements:

filteredZ

The filtered vector `z` based on the correlation threshold.

filteredLD

The filtered matrix `LD` based on the correlation threshold.

dupBearer

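A vector with one entry per element of `z` indicating its duplicate status. In the example below, retained elements are marked `-1`, and each removed element appears to carry the position in `filteredZ` of the retained element it duplicates.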

corABS

A vector storing, for each duplicate, its absolute correlation with the retained element. In the example below, entries for retained elements are 0.

sign

A vector storing the sign of the correlation values (-1 for negative, 1 for positive).

minValue

The minimum absolute correlation value encountered.
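
The returned vectors can be combined to see which retained element stands in for each removed one. The following is a minimal sketch that hard-codes the values printed in the Examples section rather than calling the function. It assumes that `dupBearer` indexes into `filteredZ`, which is inferred from that example output and not stated by the package; the name `mapping` and its column names are illustrative only.

```r
# Values copied from the printed example output on this page.
z         <- c(1, 2, 3, 4, 5)
filteredZ <- c(1, 3, 5)
dupBearer <- c(-1, 1, -1, 2, -1)
corABS    <- c(0.0, 0.8, 0.0, 0.6, 0.0)

# Positions in z that were removed as duplicates (anything not marked -1).
removed <- which(dupBearer != -1)

# ASSUMPTION: dupBearer holds a position in filteredZ, not in z.
mapping <- data.frame(
  removed_value = z[removed],
  kept_value    = filteredZ[dupBearer[removed]],
  abs_cor       = corABS[removed]
)
print(mapping)
```

Under that assumption, element 2 is represented by element 1 (absolute correlation 0.8) and element 4 by element 3 (absolute correlation 0.6), which agrees with the entries of `LD` in the example.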

Examples

z <- c(1, 2, 3, 4, 5)
LD <- matrix(c(
  1.0, 0.8, 0.2, 0.1, 0.3,
  0.8, 1.0, 0.4, 0.2, 0.5,
  0.2, 0.4, 1.0, 0.6, 0.1,
  0.1, 0.2, 0.6, 1.0, 0.3,
  0.3, 0.5, 0.1, 0.3, 1.0
), nrow = 5, ncol = 5)
rThreshold <- 0.5

result <- find_duplicate_variants(z, LD, rThreshold)
print(result)
#> $filteredZ
#> [1] 1 3 5
#> 
#> $filteredLD
#>      [,1] [,2] [,3]
#> [1,]  1.0  0.2  0.3
#> [2,]  0.2  1.0  0.1
#> [3,]  0.3  0.1  1.0
#> 
#> $dupBearer
#> [1] -1  1 -1  2 -1
#> 
#> $corABS
#> [1] 0.0 0.8 0.0 0.6 0.0
#> 
#> $sign
#> [1] 1 1 1 1 1
#> 
#> $minValue
#> [1] 0.1
#>
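
The filtering described at the top of this page can be sketched as a single greedy pass over `z`. This is not the package's implementation: `prune_by_correlation` is a hypothetical helper, and the rule it uses (compare each element only against elements already retained, and record the first match) is an assumption chosen because it reproduces the output shown above. `minValue` is omitted because its exact definition cannot be determined from this page.

```r
# Hypothetical re-implementation for illustration only.
prune_by_correlation <- function(z, LD, rThreshold) {
  n <- length(z)
  stopifnot(is.matrix(LD), nrow(LD) == n, ncol(LD) == n)

  kept      <- integer(0)    # original indices of retained elements
  dupBearer <- rep(-1L, n)   # -1 marks a retained element
  corABS    <- numeric(n)    # stays 0 for retained elements
  corSign   <- rep(1, n)     # stays 1 for retained elements

  for (i in seq_len(n)) {
    r   <- LD[i, kept]                   # correlations with retained elements
    hit <- which(abs(r) > rThreshold)    # strictly greater than the threshold
    if (length(hit) > 0) {
      j <- hit[1]                        # ASSUMPTION: first match wins
      dupBearer[i] <- j                  # position within the retained set
      corABS[i]    <- abs(r[j])
      corSign[i]   <- if (r[j] < 0) -1 else 1
    } else {
      kept <- c(kept, i)
    }
  }

  list(
    filteredZ  = z[kept],
    filteredLD = LD[kept, kept, drop = FALSE],
    dupBearer  = dupBearer,
    corABS     = corABS,
    sign       = corSign
  )
}

# Same inputs as the example above.
z <- c(1, 2, 3, 4, 5)
LD <- matrix(c(
  1.0, 0.8, 0.2, 0.1, 0.3,
  0.8, 1.0, 0.4, 0.2, 0.5,
  0.2, 0.4, 1.0, 0.6, 0.1,
  0.1, 0.2, 0.6, 1.0, 0.3,
  0.3, 0.5, 0.1, 0.3, 1.0
), nrow = 5, ncol = 5)

res <- prune_by_correlation(z, LD, rThreshold = 0.5)
```

With these inputs the sketch retains elements 1, 3 and 5, the same as `filteredZ` above. Because each element is compared only with those already retained, the result depends on the order of `z`; whether the real function behaves the same way on other inputs is not established here.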