This function aligns variant names between two character vectors of variant names in the format
"chr:pos:A1:A2" or "chr:pos_A1_A2". The first vector should be the "source" and the second
should be the "reference".
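To make the two accepted name formats concrete, here is a minimal sketch of splitting such names into their four fields. The helper `parse_variant_names()` is hypothetical and not part of the package. The allele columns are named by position only, so the sketch makes no claim about which allele is A1 or A2, and it assumes the chromosome label itself contains no ":" or "_".

```r
# Hypothetical helper (not part of the package): split variant names into fields.
parse_variant_names <- function(x) {
  # Split on either ":" or "_" so both documented formats are handled.
  parts <- strsplit(x, "[:_]")
  # Assumption: every name has exactly four fields (no build suffix, plain chromosome label).
  stopifnot(all(lengths(parts) == 4))
  m <- do.call(rbind, parts)
  data.frame(
    chrom = m[, 1],
    pos = as.integer(m[, 2]),
    first_allele = m[, 3],
    second_allele = m[, 4],
    stringsAsFactors = FALSE
  )
}

parsed <- parse_variant_names(c("1:123:A:C", "2:456_G_T"))
```

Both input styles end up in the same columns, which is what makes a comparison between source and reference possible regardless of separator.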
Usage
align_variant_names(
source,
reference,
remove_indels = FALSE,
remove_build_suffix = TRUE
)
Arguments
- source
A character vector of variant names in the format "chr:pos:A2:A1" or "chr:pos_A2_A1".
- reference
A character vector of variant names in the format "chr:pos:A2:A1" or "chr:pos_A2_A1".
- remove_indels
Whether to remove indel variants. Default FALSE.
- remove_build_suffix
Whether to strip trailing genome build suffixes like ":b38" or "_b38" before alignment. Default TRUE.
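As an illustration of what stripping a build suffix could look like, the sketch below uses a base R regular expression. The helper `strip_build_suffix()` is hypothetical, and the pattern (a ":" or "_" followed by "b" and digits at the end of the name) is an assumption based on the ":b38" and "_b38" examples above, not the package's actual rule.

```r
# Hypothetical helper (not part of the package): drop a trailing build suffix.
strip_build_suffix <- function(x) {
  # Assumed suffix pattern: ":" or "_", then "b" and one or more digits, at the end of the name.
  sub("[:_]b[0-9]+$", "", x)
}

stripped <- strip_build_suffix(c("1:123:A:C:b38", "2:456_G_T_b38", "3:789:C:A"))
```

Names without a suffix pass through unchanged, so the helper can be applied to a mixed vector.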
Value
A list with two elements:
- aligned_variants: A character vector of aligned variant names.
- unmatched_indices: A vector of indices for the variants in the source that could not be matched.
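A short sketch of how the returned list might be used to keep only the matched variants. The result is written out by hand in the documented shape, using the values from the Examples section, rather than produced by calling the function. It assumes that aligned_variants has the same length and order as the source vector, which the example output suggests but this page does not state explicitly.

```r
source <- c("1:123:A:C", "2:456:G:T", "3:789:C:A")

# Hand-written result in the documented shape (not produced by calling the function).
res <- list(
  aligned_variants = c("1:123:A:C", "2:456:T:G", "3:789:C:A"),
  unmatched_indices = 3L
)

# setdiff() is used instead of negative indexing: x[-integer(0)] would return
# an empty vector when no variant is unmatched.
matched_idx <- setdiff(seq_along(source), res$unmatched_indices)
matched <- res$aligned_variants[matched_idx]
```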
Examples
source <- c("1:123:A:C", "2:456:G:T", "3:789:C:A")
reference <- c("1:123:A:C", "2:456:T:G", "4:101:G:C")
align_variant_names(source, reference)
#> $aligned_variants
#> [1] "1:123:A:C" "2:456:T:G" "3:789:C:A"
#>
#> $unmatched_indices
#> [1] 3
#>
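The output above can be reproduced with a toy sketch that first looks for an exact name match and then for a match with the two alleles swapped. This is only an illustration of the behavior seen in the example, not the package implementation: `toy_align()` is a hypothetical name, it ignores `remove_indels` and `remove_build_suffix`, and it always writes swapped names with ":" separators.

```r
# Toy sketch, NOT the package implementation: exact match first, then swapped alleles.
toy_align <- function(source, reference) {
  parts <- strsplit(source, "[:_]")
  # Rebuild each source name with its two alleles swapped (":"-separated).
  swapped <- vapply(
    parts,
    function(p) paste(p[1], p[2], p[4], p[3], sep = ":"),
    character(1)
  )
  exact <- source %in% reference
  flipped <- !exact & swapped %in% reference
  list(
    # Use the swapped name where only the swapped form is in the reference.
    aligned_variants = ifelse(flipped, swapped, source),
    # Variants found in neither form are reported by their position in source.
    unmatched_indices = which(!exact & !flipped)
  )
}

toy_result <- toy_align(
  c("1:123:A:C", "2:456:G:T", "3:789:C:A"),
  c("1:123:A:C", "2:456:T:G", "4:101:G:C")
)
```

Here the second variant is rewritten to the reference's allele order, and the third is left unchanged and reported as unmatched.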