Analysis and comparison of the STR genotypes called with HipSTR, STRait Razor and toaSTR by using next generation sequencing data in a Brazilian population sample

Valle-Silva, Guilherme
Frontanilla, Tamara Soledad
Ayala, Jesús
Donadi, Eduardo Antonio
Simões, Aguinaldo Luiz
Castelli, Erick C. [UNESP]
Mendes-Junior, Celso Teixeira

Short tandem repeats (STRs) are particularly difficult to genotype with rapidly evolving next-generation sequencing (NGS) technology. Long amplicons containing repetitive sequences result in alignment and genotyping errors. Stutter arising from polymerase slippage often results in reads with additional or missing repeat copies. Many tools are available for the analysis of STR markers from NGS data. This study evaluated the concordance of the HipSTR, STRait Razor, and toaSTR tools for STR genotype calling, using NGS data obtained from a highly genetically diverse Brazilian population sample. We found that toaSTR retrieves the largest share of genotypes (93.8%), whereas HipSTR (84.9%) and STRait Razor (75.3%) call considerably fewer. Accuracy levels for genotype calling are very similar across the three methods (identical genotypes ~95% and correct alleles ~97.5%). All markers presenting the same genotype across the three methods are in Hardy–Weinberg equilibrium. The combined match probability and combined power of exclusion are 2.90 × 10⁻²⁸ and 0.99999999982, respectively. Despite locus-specific differences and the better overall performance of toaSTR, the three programs are reliable genotyping tools. Nevertheless, additional effort is necessary to improve the accuracy of genotype calling from next-generation sequencing datasets.
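The abstract does not state how the combined forensic parameters were computed. As a minimal sketch, assuming the conventional approach in which loci are treated as independent, the combined match probability is the product of the per-locus match probabilities, and the combined power of exclusion is one minus the product of the per-locus non-exclusion probabilities. The function names and the per-locus values below are hypothetical and serve only as illustration; they are not taken from the study.

```python
from math import prod


def combined_match_probability(per_locus_mp):
    """Product of per-locus match probabilities (assumes independent loci)."""
    return prod(per_locus_mp)


def combined_power_of_exclusion(per_locus_pe):
    """One minus the product of per-locus non-exclusion probabilities."""
    return 1.0 - prod(1.0 - pe for pe in per_locus_pe)


# Hypothetical per-locus values for illustration only; NOT data from the study.
example_mp = [0.05, 0.10, 0.02]
example_pe = [0.50, 0.60, 0.75]

cmp_value = combined_match_probability(example_mp)
cpe_value = combined_power_of_exclusion(example_pe)

print(f"Combined match probability: {cmp_value:.2e}")
print(f"Combined power of exclusion: {cpe_value:.4f}")
```

Because both quantities are products over loci, adding more independent markers drives the combined match probability toward zero and the combined power of exclusion toward one, which is why a multi-locus panel can reach values as extreme as those reported above.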



Bioinformatics, Brazil, CODIS, Forensic genetics, Massively parallel sequencing, Short tandem repeats

Forensic Science International: Genetics, v. 58.