Benchmarking freely available HLA typing algorithms across varying genes, coverages and typing resolutions

Abstract

Identifying the specific human leukocyte antigen (HLA) allele combination of an individual is crucial in organ donation, risk assessment of autoimmune and infectious diseases, and cancer immunotherapy. However, due to the high genetic polymorphism in this region, HLA typing requires specialized methods. We investigated the performance of five next-generation sequencing (NGS) based HLA typing tools with a non-restricted license, namely HLA*LA, Optitype, HISAT-genotype, Kourami and STC-Seq. This evaluation was done for the five HLA loci HLA-A, -B, -C, -DRB1 and -DQB1 using whole-exome sequencing (WES) samples from 829 individuals. The robustness of the tools to lower depth of coverage (DOC) was evaluated by subsampling and HLA typing 230 WES samples at DOC ranging from 1X to 100X. The HLA typing accuracy was measured across four typing resolutions. Among these, we present two clinically relevant typing resolutions (P group and pseudo-sequence), which specifically focus on the peptide binding region. On average, across the five HLA loci examined, HLA*LA was found to have the highest typing accuracy. For the individual loci HLA-A, -B and -C, Optitype's typing accuracy was the highest, whereas HLA*LA had the highest typing accuracy for HLA-DRB1 and -DQB1. The tools' robustness to lower DOC data varied widely and further depended on the specific HLA locus. For all Class I loci, Optitype had a typing accuracy above 95% (according to the modification of the amino acids in the functionally relevant portion of the HLA molecule) at 50X, but increasing the DOC beyond even 100X could still improve the typing accuracy of HISAT-genotype, Kourami and STC-Seq across all five HLA loci, as well as HLA*LA's typing accuracy for HLA-DQB1. HLA typing is also used in studies of ancient DNA (aDNA), which are often based on sequencing data with lower quality and DOC. Interestingly, we found that Optitype's typing accuracy is not notably impaired by short read length or by DNA damage, which is typical of aDNA, as long as the DOC is sufficiently high.
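To make the notions of typing accuracy at a given resolution and of subsampling to a target DOC concrete, the following is a minimal Python sketch. It is an illustration under stated assumptions, not the scoring procedure used in the study: the function names are hypothetical, resolution is approximated by truncating allele names to a number of colon-separated fields (the P group and pseudo-sequence resolutions would require a mapping that is not given here), and accuracy is taken to be the fraction of gold-standard alleles recovered at one locus.

```python
from collections import Counter


def truncate(allele: str, fields: int) -> str:
    """Reduce an allele name such as 'A*02:01:01:01' to its first `fields` fields."""
    gene, _, rest = allele.partition("*")
    return gene + "*" + ":".join(rest.split(":")[:fields])


def matched_alleles(predicted, truth, fields=2):
    """Count the true alleles recovered by a prediction, ignoring allele order.

    A multiset intersection is used so that a homozygous truth
    (two identical alleles) needs two identical calls for full credit.
    """
    pred = Counter(truncate(a, fields) for a in predicted)
    gold = Counter(truncate(a, fields) for a in truth)
    return sum((pred & gold).values())


def typing_accuracy(calls, gold, fields=2):
    """Fraction of gold-standard alleles recovered across samples at one locus.

    `calls` and `gold` map sample IDs to allele tuples; samples without a
    call contribute zero matches (an assumption of this sketch).
    """
    total = sum(len(alleles) for alleles in gold.values())
    correct = sum(
        matched_alleles(calls.get(sample, ()), alleles, fields)
        for sample, alleles in gold.items()
    )
    return correct / total if total else float("nan")


def subsample_fraction(original_doc: float, target_doc: float) -> float:
    """Fraction of reads to keep so that the mean DOC drops to `target_doc`.

    Assumes DOC scales linearly with the number of reads retained.
    """
    if original_doc <= 0:
        raise ValueError("original_doc must be positive")
    return min(1.0, target_doc / original_doc)
```

For example, with a sample sequenced at a mean DOC of 200X, `subsample_fraction(200, 50)` gives the fraction of reads to retain for a 50X subsample, and `typing_accuracy(calls, gold, fields=1)` scores the same calls at a coarser resolution than the default of two fields.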


Author and article information

Contributors

Nikolas Hallberg Thuesen: URI: https://loop.frontiersin.org/people/1878491

Michael Schantz Klausen: URI: https://loop.frontiersin.org/people/1880145

Shyam Gopalakrishnan: URI: https://loop.frontiersin.org/people/816713

Thomas Trolle: URI: https://loop.frontiersin.org/people/1882281

Gabriel Renaud: URI: https://loop.frontiersin.org/people/724024

Journal

Journal ID (nlm-ta): Front Immunol

Journal ID (iso-abbrev): Front Immunol

Journal ID (publisher-id): Front. Immunol.

Title: Frontiers in Immunology

Publisher: Frontiers Media S.A.

ISSN (Electronic): 1664-3224

Publication date (Electronic): 08 November 2022

Publication date Collection: 2022

Volume: 13

Electronic Location Identifier: 987655

Affiliations

[1] Evaxion Biotech, Copenhagen, Denmark

[2] Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, Lyngby, Denmark

[3] Section for Hologenomics, Department of Biology, University of Copenhagen, Copenhagen, Denmark

Author notes

Edited by: Martin Maiers, National Marrow Donor Program, United States

Reviewed by: Nicolas Vince, INSERM U1064 Centre de Recherche en Transplantation et Immunologie, France; Jamie Duke, Children’s Hospital of Philadelphia, United States; Seik-Soon Khor, National Center For Global Health and Medicine, Japan

*Correspondence: Nikolas Hallberg Thuesen, nthu@evaxion-biotech.com

This article was submitted to Alloimmunity and Transplantation, a section of the journal Frontiers in Immunology

Article

DOI: 10.3389/fimmu.2022.987655

PMC ID: 9679531

PubMed ID: 36426357

SO-VID: 9f64d07f-6681-4d59-a8fa-9419e97f0d18

License:

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

History

Date received: 06 July 2022

Date accepted: 10 October 2022

Page count

Figures: 5, Tables: 1, Equations: 0, References: 57, Pages: 15, Words: 9680

Funding

Funded by: Novo Nordisk Fonden, doi 10.13039/501100009708

Award ID: NNF20OC0062491
