Full-Length Gene Sequencing Could Revolutionize HLA Typing

May 21, 2017

Focus on NGS

Swati Ranade, Ph.D.

Senior Manager, PacBio

The human leukocyte antigen (HLA) genes provide crucial information for matching a donor with a recipient patient before organ or hematopoietic stem cell transplantations. A histocompatible donor can reduce the chance of transplant rejection, giving the recipient years of extra life — but a bad match can be worse than no transplant, with an extensive list of complications, including death.

HLA typing, the technique used to establish histocompatibility, has been a challenging and evolving process. HLA genes are among the most complex and highly variable in the human genome, with more than 16,000 alleles identified so far across the two groups of genes (class I and II)1. However, the low resolution that traditional genotyping methods achieve in this complex region has limited our ability to accurately characterize the natural variation in the HLA genes and to understand how it might influence donor-patient matching.

Recently, some experts have turned to long-read DNA sequencing for complete resolution of the HLA gene alleles, offering an unprecedented view of genetic variation at this locus. Here, we look at the latest advances in understanding this region with single molecule, real-time (SMRT) sequencing technology.

HLA Complexity

We have known for decades that the HLA genes harbor considerable variation, but with the introduction of higher-resolution analysis methods, estimates of the number of variants have increased significantly. Thirty years ago, when HLA typing was performed with serology tests, the community knew of more than 100 HLA antigens, a number that seemed quite large at the time. Today, there are 40 times more alleles recognized for just one of the six class I HLA genes (the B gene). As another example, a single HLA classification once produced by serology tests can now be resolved by DNA sequencing into more than 700 subtypes.

The HLA genes are hyper-polymorphic, and fully characterizing them will require advanced DNA analysis technology. Some scientists have applied short-read sequencing to these genes, but its read length and systematic bias have proven limiting2. The genomic region is quite GC-rich, making it hard for platforms with GC bias to resolve the sequence. The genes are also very long, stretching to 3 kb for a class I gene and more than 10 kb for some class II genes. Short-read sequencing methods cannot span these genes in a single read and instead rely on less accurate imputation algorithms to estimate HLA types. Mistakes in this process could lead to incorrect typing, with potentially dangerous consequences.

Another problem for resolving the HLA genes is phasing. For optimal precision in genotyping, it is key to assign each of the complex polymorphisms seen in the HLA gene family to the allele on which it actually occurs. Short-read sequencers cannot do this, particularly for variants that lie far apart, leading to ambiguity in typing.
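The phasing ambiguity can be illustrated with a small sketch. The Python below is a toy model, not a typing algorithm: the function names and the example bases are hypothetical, and the reads are assumed to be error-free. It enumerates every pair of haplotypes consistent with unphased heterozygous calls, then shows how reads that span all of the sites leave only one pair standing.

```python
from itertools import product


def possible_haplotype_pairs(het_sites):
    """Enumerate haplotype pairs consistent with unphased heterozygous calls.

    het_sites: list of (base_a, base_b) tuples, one per heterozygous position.
    Returns a set of frozensets, each holding the two haplotype strings.
    """
    first, rest = het_sites[0], het_sites[1:]
    pairs = set()
    # Fix the orientation of the first site so mirror-image pairs are not
    # counted twice, then try both orientations of every remaining site.
    for choice in product((0, 1), repeat=len(rest)):
        hap1 = first[0] + "".join(site[c] for site, c in zip(rest, choice))
        hap2 = first[1] + "".join(site[1 - c] for site, c in zip(rest, choice))
        pairs.add(frozenset((hap1, hap2)))
    return pairs


def resolve_with_spanning_reads(candidates, spanning_reads):
    """Keep only the candidate pair whose haplotypes match the observed reads.

    Assumes idealized, error-free reads that each cover every site.
    """
    observed = frozenset(spanning_reads)
    return [pair for pair in candidates if pair == observed]


# Two heterozygous positions observed separately (as with short reads):
candidates = possible_haplotype_pairs([("A", "G"), ("C", "T")])
# Reads covering both positions at once (as with long reads):
resolved = resolve_with_spanning_reads(candidates, ["AC", "GT", "AC"])
```

In this toy model, n heterozygous sites observed independently are compatible with 2^(n-1) distinct haplotype pairs, which is why ambiguity grows quickly in a hyper-polymorphic region unless single reads link the sites together.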

Long-Read Sequencing

Lately, some of the leading HLA labs have adopted long-read, SMRT sequencing as an alternative approach to HLA typing. Long reads can capture full-length HLA gene sequences, phase alleles and detect rare and novel variants that are difficult to discover with other methods.

At the Anthony Nolan Research Institute, for instance, scientists assessed the feasibility of using SMRT sequencing for HLA analysis with a pilot study of seven previously typed samples3. Each full-length gene sequenced to a mean quality value of 70 or greater produced high-resolution sequence information for all exons and introns. The researchers determined that SMRT sequencing yielded concordant results for all samples, and also detected novel alleles that had been missed by the other typing technologies. Remarkably, the long-read sequence information even identified and corrected an error in the reference sequence database. The entire workflow took just three days, generating results more quickly than established methods.
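To put the quality value in perspective, here is a minimal sketch that assumes the reported figure follows the standard Phred convention, in which a quality value Q corresponds to a per-base error probability of 10^(-Q/10). The article does not state which scale was used, so treat this as an assumption; the function names are illustrative only.

```python
def phred_to_error_probability(qv):
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def expected_errors(qv, length_bp):
    """Expected number of erroneous bases in a sequence of the given length."""
    return phred_to_error_probability(qv) * length_bp


# A 10 kb gene (roughly the class II length quoted earlier) at QV 70:
errors_in_gene = expected_errors(70, 10_000)
```

Under that assumption, a quality value of 70 corresponds to about one expected error per ten million bases, so a full-length gene of a few kilobases would be expected to contain far less than one erroneous base.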

In the paper reporting these results, the scientists wrote: “The implications of this technology in the field of HLA typing could be enormous, allowing for true allelic HLA typing in a single experimental set-up and making redundant the need for multiple experiments on different typing platforms, cross-referencing of results and/or the need for re-sequencing using an allele-specific protocol.”

A follow-up study from the Anthony Nolan team demonstrated that SMRT sequencing was successful in accurately phasing all the HLA alleles. An assessment of 45 samples from various DNA sources determined that this approach was 100% accurate and reliable for phasing variants in the HLA genotyping process4.

SMRT sequencing is also now routinely used at HistoGenetics, a commercial laboratory that has been a pioneer in sequence-based HLA typing services. Scientists there are deploying long-read sequencing to type thousands of HLA samples each week; in a recent project, they used this approach to type 60,000 samples for the National Marrow Donor Program’s Be the Match registry. During a conference presentation earlier this year, HistoGenetics CEO Nezih Cereb said that sequencing full-length HLA genes is now the gold standard for accurate HLA typing, replacing current exon-based genotyping methods and short-read sequencing methods that analyze only portions of the genes or known variants5.

Looking Ahead

The advent of full-length gene sequencing promises to dramatically improve the accuracy and resolution of HLA typing for several important clinical applications, including drug hypersensitivity research, autoimmune disease studies, and transplantation. For the first time, long-read sequencing allows scientists to analyze and phase variants throughout this complex genomic region for a better understanding of the biology and its impact on clinical function. This comprehensive approach will offer new insight into the HLA region — its ability to detect novel alleles even in previously typed samples is early evidence of that — and could lead to greater precision in matching donor organs to transplant recipients.



1. HLA Alleles Numbers (online database). http://hla.alleles.org/nomenclature/stats.html

2. Hosomichi K, Shiina T, et al. The impact of next-generation sequencing technologies on HLA research. Journal of Human Genetics (2015) 60, 665–673; doi:10.1038/jhg.2015.102 http://www.nature.com/jhg/journal/v60/n11/full/jhg2015102a.html

3. Mayor N, Robinson J, McWhinnie AJM, et al. HLA typing for the next generation. PLoS One. 2015. DOI:10.1371/journal.pone.0127153

4. Mayor N. Single Molecule Real-Time (SMRT®) DNA Sequencing at Anthony Nolan. Webinar, July 2015. http://stream.dcasf.com/webinar/single-molecule-real-time-smrt-dna-sequencing-at-anthony-nolan/

5. Cereb N. High-Throughput HLA Class I Whole Gene and HLA Class II Long Range Typing on PacBio RSII and Sequel Platforms. Conference presentation, February 2017.