The human leukocyte antigen (HLA) system or complex is a gene complex encoding the major histocompatibility complex (MHC) proteins in humans. These cell-surface proteins are responsible for the regulation of the immune system in humans. The HLA gene complex resides on a 3 Mbp stretch within chromosome 6p21. HLA genes are highly polymorphic, which means that they have many different alleles, allowing them to fine-tune the adaptive immune system. The proteins encoded by certain genes are also known as antigens, as a result of their historic discovery as factors in organ transplants. Different classes have different functions:
HLAs corresponding to MHC class I (A, B, and C) present peptides from inside the cell. For example, if the cell is infected by a virus, the HLA system brings fragments of the virus to the surface of the cell so that the cell can be destroyed by the immune system. These peptides are produced from digested proteins that are broken down in the proteasomes. In general, these particular peptides are small polymers, about 9 amino acids in length. Foreign antigens presented by MHC class I attract killer T cells (also called CD8-positive or cytotoxic T cells) that destroy cells. MHC class I proteins associate with β2-microglobulin, which, unlike the HLA proteins, is encoded by a gene on chromosome 15.
HLAs corresponding to MHC class II (DP, DM, DOA, DOB, DQ, and DR) present antigens from outside of the cell to T-lymphocytes. These particular antigens stimulate the multiplication of T-helper cells (also called CD4 positive T cells), which in turn stimulate antibody-producing B-cells to produce antibodies to that specific antigen. Self-antigens are suppressed by regulatory T cells.
HLAs corresponding to MHC class III encode components of the complement system.
HLAs have other roles. They are important in disease defense. They are the major cause of organ transplant rejections. They may protect against or fail to protect (if down-regulated by an infection) against cancers. Mutations in HLA may be linked to autoimmune disease (examples: type I diabetes, coeliac disease). HLA may also be related to people's perception of the odor of other people, and may be involved in mate selection, as at least one study found a lower-than-expected rate of HLA similarity between spouses in an isolated community.
Aside from the genes encoding the 6 major antigen-presenting proteins, there are a large number of other genes, many involved in immune function, located on the HLA complex. Diversity of HLAs in the human population is one aspect of disease defense, and, as a result, the chance of two unrelated individuals having identical HLA molecules at all loci is extremely low. HLA genes have historically been identified as a result of the ability to successfully transplant organs between HLA-similar individuals.
The proteins encoded by HLAs are those on the outer part of body cells that are (in effect) unique to that person. The immune system uses the HLAs to differentiate self cells and non-self cells. Any cell displaying that person's HLA type belongs to that person and, therefore, is not an invader.
When a foreign pathogen enters the body, specific cells called antigen-presenting cells (APCs) engulf the pathogen through a process called phagocytosis. Proteins from the pathogen are digested into small pieces (peptides) and loaded onto HLA antigens (to be specific, MHC class II). They are then displayed by the antigen-presenting cells to CD4+ helper T cells, which then produce a variety of effects to eliminate the pathogen.
Through a similar process, proteins (both native and foreign, such as the proteins of virus) produced inside most cells are displayed on HLAs (to be specific, MHC class I) on the cell surface. Infected cells can be recognized and destroyed by CD8+ T cells.
The image off to the side shows a piece of a poisonous bacterial protein (SEI peptide) bound within the binding cleft portion of the HLA-DR1 molecule. In the illustration far below, a different view, one can see an entire DQ with a bound peptide in a similar cleft, as viewed from the side. Disease-related peptides fit into these "slots" much like a hand fits into a glove. When bound, peptides are presented to T cells. T cells require presentation via MHC molecules to recognize foreign antigens, a requirement known as MHC restriction. These cells have receptors that are similar to B cell receptors, and each cell recognizes only a few class II-peptide combinations. Once a T cell recognizes a peptide within an MHC class II molecule, it can stimulate B cells that also recognize the same molecule in their B cell receptors. Thus, T cells help B cells make antibodies to the same foreign antigens. Each HLA can bind many peptides, and each person has 3 HLA types and can have 4 isoforms of DP, 4 isoforms of DQ, and 4 isoforms of DR (2 of DRB1, and 2 of DRB3, DRB4, or DRB5), for a total of 12 isoforms. In such heterozygotes, it is difficult for disease-related proteins to escape detection.
Any cell displaying some other HLA type is "non-self" and is seen as an invader by the body's immune system, resulting in the rejection of the tissue bearing those cells. This is particularly important in the case of transplanted tissue, because it could lead to transplant rejection. Because of the importance of HLA in transplantation, the HLA loci are some of the most frequently typed by serology and PCR.
|HLA allele||Diseases with increased risk||Relative risk|
| ||Acute anterior uveitis||15|
|HLA-DR2||Systemic lupus erythematosus||2 to 3|
| ||Primary Sjögren syndrome||10|
| ||Diabetes mellitus type 1||5|
| ||Systemic lupus erythematosus||2 to 3|
| ||Diabetes mellitus type 1||6|
| ||Diabetes mellitus type 1||15|
|HLA-DQ2 and HLA-DQ8||Coeliac disease||7|
HLA types are inherited, and some of them are connected with autoimmune disorders and other diseases. People with certain HLA antigens are more likely to develop certain autoimmune diseases, such as type 1 diabetes, ankylosing spondylitis, coeliac disease, SLE (systemic lupus erythematosus), myasthenia gravis, inclusion body myositis, Sjögren syndrome, and narcolepsy. HLA typing has led to some improvement and acceleration in the diagnosis of coeliac disease and type 1 diabetes; however, for DQ2 typing to be useful, it requires either high-resolution DQB1 typing (resolving *02:01 from *02:02), DQA1 typing, or DR serotyping. Current serotyping can resolve DQ8 in one step. HLA typing is increasingly used as a diagnostic tool in autoimmunity. In coeliac disease, it is the only effective means of distinguishing first-degree relatives who are at risk from those who are not, prior to the appearance of sometimes-irreversible symptoms such as allergies and secondary autoimmune disease.
Some HLA-mediated diseases are directly involved in the promotion of cancer. Gluten-sensitive enteropathy is associated with increased prevalence of enteropathy-associated T-cell lymphoma, and DR3-DQ2 homozygotes are within the highest risk group, with close to 80% of gluten-sensitive enteropathy-associated T-cell lymphoma cases. More often, however, HLA molecules play a protective role, recognizing increases in antigens that are not tolerated because of low levels in the normal state. Abnormal cells might be targeted for apoptosis, which is thought to mediate many cancers before diagnosis.
There is evidence for non-random mate choice with respect to certain genetic characteristics. This has led to a field known as genetic matchmaking.
MHC class I proteins form a functional receptor on most nucleated cells of the body.
There are 3 major and 3 minor MHC class I genes in HLA.
Major MHC class I
There are 3 major and 2 minor MHC class II proteins encoded by the HLA. The genes of the class II combine to form heterodimeric (αβ) protein receptors that are typically expressed on the surface of antigen-presenting cells.
Major MHC class II
The other MHC class II proteins, DM and DO, are used in the internal processing of antigens, loading the antigenic peptides generated from pathogens onto the HLA molecules of antigen-presenting cells.
Modern HLA alleles are typically noted with a variety of levels of detail. Most designations begin with HLA- and the locus name, then * and some (even) number of digits specifying the allele. The first two digits specify a group of alleles. Older typing methodologies often could not completely distinguish alleles and so stopped at this level. The third through fourth digits specify a nonsynonymous allele. Digits five through six denote any synonymous mutations within the coding frame of the gene. The seventh and eighth digits distinguish mutations outside the coding region. Letters such as L, N, Q, or S may follow an allele's designation to specify an expression level or other non-genomic data known about it. Thus, a completely described allele may be up to 9 digits long, not including the HLA-prefix and locus notation.
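The field layout described above can be illustrated with a small parser. This is a minimal sketch, assuming the colon-delimited form used for allele names elsewhere in this article (for example HLA-B*08:01 or A*24:02:01N); the names `HlaAllele` and `parse_allele` are hypothetical and do not come from any HLA library, and the pattern accepts any single trailing capital letter as an expression suffix.

```python
import re
from typing import NamedTuple, Optional

# Colon-delimited allele name: optional "HLA-" prefix, locus, "*",
# one to four numeric fields, and an optional one-letter suffix.
_ALLELE_RE = re.compile(
    r"^(?:HLA-)?(?P<locus>[A-Z][A-Z0-9]*)\*"
    r"(?P<fields>\d+(?::\d+){0,3})"
    r"(?P<suffix>[A-Z])?$"
)


class HlaAllele(NamedTuple):
    locus: str                  # e.g. "A", "B", "DRB1"
    allele_group: str           # first field: group of alleles
    protein: Optional[str]      # second field: nonsynonymous variant
    synonymous: Optional[str]   # third field: synonymous coding change
    noncoding: Optional[str]    # fourth field: change outside coding region
    suffix: Optional[str]       # e.g. "N" for a null allele


def parse_allele(name: str) -> HlaAllele:
    """Split an allele name into its locus, numeric fields, and suffix."""
    match = _ALLELE_RE.match(name.strip())
    if match is None:
        raise ValueError(f"not a recognised allele name: {name!r}")
    fields = match.group("fields").split(":")
    fields += [None] * (4 - len(fields))  # pad fields that were not typed
    return HlaAllele(match.group("locus"), *fields, match.group("suffix"))


print(parse_allele("A*24:02:01N"))
```

Fields that were not resolved by the typing method are returned as `None`, which mirrors the point that older methods often stopped at the allele-group level.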
MHC loci are some of the most genetically variable coding loci in mammals, and the human HLA loci are no exceptions. Despite the fact that the human population went through a constriction more than 150,000 years ago that was capable of fixing many loci, the HLA loci appear to have survived such a constriction with a great deal of variation. Of the 9 loci mentioned above, most retained a dozen or more allele-groups for each locus, far more preserved variation than the vast majority of human loci. This is consistent with a heterozygous or balancing selection coefficient for these loci. In addition, some HLA loci are among the fastest-evolving coding regions in the human genome. One mechanism of diversification has been noted in the study of Amazonian tribes of South America that appear to have undergone intense gene conversion between variable alleles and loci within each HLA gene class. Less frequently, longer-range productive recombinations through HLA genes have been noted producing chimeric genes.
Six loci have over 100 alleles that have been detected in the human population. Of these, the most variable are HLA-B and HLA-DRB1. As of 2012, the numbers of alleles that have been determined are listed in the table below. To interpret this table, it is necessary to consider that an allele is a variant of the nucleotide (DNA) sequence at a locus, such that each allele differs from all other alleles in at least one (single nucleotide polymorphism, SNP) position. Most of these changes alter the amino acid sequence, producing slight to major functional differences in the protein.
There are issues that limit this variation. Certain alleles like DQA1*05:01 and DQA1*05:05 encode proteins with identically processed products. Other alleles like DQB1*02:01 and DQB1*02:02 produce proteins that are functionally similar. For class II (DR, DP and DQ), amino acid variants within the receptor's peptide-binding cleft tend to produce molecules with different binding capability.
Number of variant alleles at class I loci according to the IMGT-HLA database, last updated July 2014:
|MHC class I|
Number of variant alleles at class II loci (DM, DO, DP, DQ, and DR):
|MHC class II|
|HLA||-A1||-B1||-B3 to -B5¹||Theor. possible|
|¹ DRB3, DRB4, DRB5 have variable presence in humans|
The large extent of variability in HLA genes poses significant challenges in investigating the role of HLA genetic variations in diseases. Disease association studies typically treat each HLA allele as a single complete unit, which does not illuminate the parts of the molecule associated with disease. Karp D. R. et al. describe a novel sequence feature variant type (SFVT) approach for HLA genetic analysis that categorizes HLA proteins into biologically relevant smaller sequence features (SFs) and their variant types (VTs). Sequence features are combinations of amino acid sites defined based on structural information (e.g., beta-sheet 1), functional information (e.g., peptide antigen-binding), and polymorphism. These sequence features can be overlapping and continuous or discontinuous in the linear sequence. Variant types for each sequence feature are defined based upon all known polymorphisms in the HLA locus being described. SFVT categorization of HLA is applied in genetic association analysis so that the effects and roles of the epitopes shared by several HLA alleles can be identified. Sequence features and their variant types have been described for all classical HLA proteins; the international repository of HLA SFVTs will be maintained at the IMGT/HLA database. A tool to convert HLA alleles into their component SFVTs can be found on the Immunology Database and Analysis Portal (ImmPort) website.
Although the number of individual HLA alleles that have been identified is large, approximately 40% of these alleles appear to be unique, having only been identified in single individuals. Roughly a third of alleles have been reported more than three times in unrelated individuals. Because of this variation in the rate at which individual HLA alleles are detected, attempts have been made to categorize alleles at each expressed HLA locus in terms of their prevalence. The result is a catalog of common and well-documented (CWD) HLA alleles, and a catalog of rare and very rare HLA alleles.
Common HLA alleles are defined as having been observed with a frequency of at least 0.001 in reference populations of at least 1500 individuals. Well-documented HLA alleles were originally defined as having been reported at least three times in unrelated individuals, and are now defined as having been detected at least five times in unrelated individuals via the application of a sequence-based typing (SBT) method, or at least three times via a SBT method and in a specific haplotype in unrelated individuals. Rare alleles are defined as those that have been reported one to four times, and very rare alleles as those reported only once.
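The thresholds above can be written as a small decision function. This is a sketch under stated assumptions: the function name, parameter names, and the order in which the rules are checked are hypothetical choices made for illustration, since the text gives the thresholds but not how overlapping cases (such as an allele reported exactly once, which also falls in the one-to-four range) should be resolved.

```python
def classify_allele(reports: int,
                    sbt_reports: int = 0,
                    in_haplotype: bool = False,
                    frequency: float = 0.0,
                    population_size: int = 0) -> str:
    """Assign a prevalence category using the thresholds quoted in the text.

    reports         -- times the allele was reported in unrelated individuals
    sbt_reports     -- of those, detections by sequence-based typing (SBT)
    in_haplotype    -- True if seen in a specific haplotype
    frequency       -- observed frequency in a reference population
    population_size -- size of that reference population
    """
    # Common: frequency of at least 0.001 in at least 1500 individuals.
    if frequency >= 0.001 and population_size >= 1500:
        return "common"
    # Well-documented: five SBT detections, or three plus a haplotype.
    if sbt_reports >= 5 or (sbt_reports >= 3 and in_haplotype):
        return "well-documented"
    # Assumption: a single report is treated as "very rare" rather than "rare".
    if reports == 1:
        return "very rare"
    if 2 <= reports <= 4:
        return "rare"
    return "uncategorized"


print(classify_allele(reports=3, sbt_reports=3, in_haplotype=True))
```

Checking the well-documented rule before the rare rule is a design choice of this sketch; it means an allele seen three times by SBT in a specific haplotype is not labeled rare.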
While the current CWD and rare or very rare designations were developed using different datasets and different versions of the IMGT/HLA Database, the approximate fraction of alleles at each HLA locus in each category is shown below.
|% common||% well-documented||% rare||No. very rare||% very rare||% alleles|
There are two parallel systems of nomenclature that are applied to HLA. The first, and oldest, system is based on serological (antibody-based) recognition. In this system, antigens were eventually assigned letters and numbers (e.g., HLA-B27 or, shortened, B27). A parallel system that allowed more refined definition of alleles was developed. In this system, "HLA" is used in conjunction with a letter, *, and a four-or-more-digit number (e.g., HLA-B*08:01, A*68:01, or A*24:02:01N, where N = null) to designate a specific allele at a given HLA locus. HLA loci can be further classified into MHC class I and MHC class II (or rarely, D locus). Every two years, a nomenclature is put forth to aid researchers in interpreting serotypes to alleles.
In order to create a typing reagent, blood from animals or humans would be taken, the blood cells allowed to separate from the serum, and the serum diluted to its optimal sensitivity and used to type cells from other individuals or animals. Thus, serotyping became a way of crudely identifying HLA receptors and receptor isoforms. Over the years, serotyping antibodies became more refined as techniques for increasing sensitivity improved, and new serotyping antibodies continue to appear. One of the goals of serotype analysis is to fill gaps in the analysis. It is possible to account for alleles that were not adequately typed by prediction based on the 'square root' method, the 'maximum-likelihood' method, or analysis of familial haplotypes. Studies using serotyping techniques frequently revealed a large number of null or blank serotypes, in particular for non-European or north East Asian populations. This was particularly problematic for the Cw locus until recently, and almost half of the Cw serotypes went untyped in the 1991 survey of the human population.
There are several types of serotypes. A broad antigen serotype is a crude measure of identity of cells. For example, HLA A9 serotype recognizes cells of A23- and A24-bearing individuals. It may also recognize cells that A23 and A24 miss because of small variations. A23 and A24 are split antigens, but antibodies specific to either are typically used more often than antibodies to broad antigens.
A representative cellular assay is the mixed lymphocyte culture (MLC), which is used to determine the HLA class II types. The cellular assay is more sensitive in detecting HLA differences than serotyping, because minor differences unrecognized by alloantisera can stimulate T cells. This typing is designated as Dw types. Serotyped DR1 is cellularly defined as either Dw1 or Dw20, and so on for other serotyped DRs. Table shows associated cellular specificities for DR alleles. However, cellular typing gives inconsistent reactions between individuals of the same cellular type, sometimes producing results different from those predicted. Because of this, and because of the difficulty of generating and maintaining cellular typing reagents, the cellular assay is being replaced by DNA-based typing methods.
Minor reactions to subregions that show similarity to other types can be observed to the gene products of alleles of a serotype group. The sequence of the antigens determines the antibody reactivities, and so having a good sequencing capability (or sequence-based typing) obviates the need for serological reactions. Therefore, different serotype reactions may indicate the need to sequence a person's HLA to determine a new gene sequence.
Broad antigen types are still useful, such as typing very diverse populations with many unidentified HLA alleles (Africa, Arabia, Southeastern Iran and Pakistan, India). Africa, Southern Iran, and Arabia show the difficulty in typing areas that were settled earlier. Allelic diversity makes it necessary to use broad antigen typing followed by gene sequencing because there is an increased risk of misidentifying by serotyping techniques.
In the end, a workshop, based on sequence, decides which new allele goes into which serogroup either by sequence or by reactivity. Once the sequence is verified, it is assigned a number. For example, a new allele of B44 may get a serotype (i.e., B44) and an allele ID (i.e., B*44:65), as it is the 65th B44 allele discovered. Marsh et al. (2005) can be considered a code book for HLA serotypes and genotypes, and a new book is published biannually, with monthly updates in Tissue Antigens.
Gene typing is different from gene sequencing and serotyping. With this strategy, PCR primers specific to a variant region of DNA are used (called SSP-PCR). If a product of the right size is found, the assumption is that the HLA allele has been identified. New gene sequences often result in an increasing appearance of ambiguity. Because gene typing is based on SSP-PCR, it is possible that new variants, in particular in the class I and DRB1 loci, may be missed.
For example, SSP-PCR within the clinical situation is often used for identifying HLA phenotypes. An example of an extended phenotype for a person might be:
A*01:01/*03:01, C*07:01/*07:02, B*07:02/*08:01, DRB1*03:01/*15:01, DQA1*05:01/*01:02, DQB1*02:01/*06:02
In general, this is identical to the extended serotype: A1, A3, B7, B8, DR3, DR15(2), DQ2, DQ6(1)
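The correspondence between the allele-level phenotype and the serotype list can be approximated in code. The sketch below assumes a deliberately naive rule, namely that the serotype is the locus prefix followed by the first numeric field of the allele; this rule is an illustration only and does not hold for every allele. The function names and the prefix table are hypothetical, and DQA1 is skipped because the serotype list above has no entry for it.

```python
PHENOTYPE = ("A*01:01/*03:01, C*07:01/*07:02, B*07:02/*08:01, "
             "DRB1*03:01/*15:01, DQA1*05:01/*01:02, DQB1*02:01/*06:02")

# Hypothetical mapping from gene locus to serotype prefix (illustrative only).
SEROTYPE_PREFIX = {"A": "A", "B": "B", "C": "Cw", "DRB1": "DR", "DQB1": "DQ"}


def parse_phenotype(text):
    """Return {locus: [allele, allele]} from a string like 'A*01:01/*03:01, ...'."""
    result = {}
    for entry in text.split(","):
        locus, alleles = entry.strip().split("*", 1)
        result[locus] = [a.lstrip("*") for a in alleles.split("/")]
    return result


def naive_serotypes(phenotype):
    """Map each allele to prefix + first field, dropping duplicates in order."""
    seen = []
    for locus, alleles in phenotype.items():
        prefix = SEROTYPE_PREFIX.get(locus)
        if prefix is None:  # locus without a serotype prefix in this sketch
            continue
        for allele in alleles:
            name = f"{prefix}{int(allele.split(':')[0])}"
            if name not in seen:
                seen.append(name)
    return seen


print(naive_serotypes(parse_phenotype(PHENOTYPE)))
```

The output also contains Cw7, which the serotype list in the text leaves out, and it does not reproduce the broad-antigen annotations DR15(2) and DQ6(1).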
For many populations, such as the Japanese or European populations, so many patients have been typed that new alleles are relatively rare, and thus SSP-PCR is more than adequate for allele resolution. Haplotypes can be obtained by typing family members in areas of the world where SSP-PCR is unable to recognize alleles and typing requires the sequencing of new alleles. Areas of the world where SSP-PCR or serotyping may be inadequate include Central Africa, Eastern Africa, parts of southern Africa, Arabia, S. Iran, Pakistan, and India.
An HLA haplotype is a series of HLA "genes" (loci-alleles) by chromosome, one passed from the mother and one from the father.
The phenotype given above is one of the more common in Ireland and is the result of two common genetic haplotypes:
A*01:01 ; C*07:01 ; B*08:01 ; DRB1*03:01 ; DQA1*05:01 ; DQB1*02:01 (By serotyping A1-Cw7-B8-DR3-DQ2)
which is called 'super B8' or 'ancestral haplotype' and
A*03:01 ; C*07:02 ; B*07:02 ; DRB1*15:01 ; DQA1*01:02 ; DQB1*06:02 (By serotyping A3-Cw7-B7-DR15-DQ6 or the older version "A3-B7-DR2-DQ1")
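That these two haplotypes account for the phenotype given earlier can be checked mechanically. The following sketch pairs the alleles of the two haplotypes locus by locus and compares the result with the phenotype; the helper names `parse_haplotype` and `combine` are hypothetical, and the expected phenotype is transcribed by hand from the text.

```python
HAPLOTYPE_1 = "A*01:01 ; C*07:01 ; B*08:01 ; DRB1*03:01 ; DQA1*05:01 ; DQB1*02:01"
HAPLOTYPE_2 = "A*03:01 ; C*07:02 ; B*07:02 ; DRB1*15:01 ; DQA1*01:02 ; DQB1*06:02"

# Phenotype from the text, as unordered allele pairs per locus.
EXPECTED_PHENOTYPE = {
    "A": {"01:01", "03:01"},
    "C": {"07:01", "07:02"},
    "B": {"07:02", "08:01"},
    "DRB1": {"03:01", "15:01"},
    "DQA1": {"05:01", "01:02"},
    "DQB1": {"02:01", "06:02"},
}


def parse_haplotype(text):
    """Return {locus: allele} for one chromosome's worth of alleles."""
    return dict(item.strip().split("*", 1) for item in text.split(";"))


def combine(hap_a, hap_b):
    """Pair the alleles of two haplotypes locus by locus, ignoring phase."""
    return {locus: frozenset((hap_a[locus], hap_b[locus])) for locus in hap_a}


phenotype = combine(parse_haplotype(HAPLOTYPE_1), parse_haplotype(HAPLOTYPE_2))
print(phenotype == EXPECTED_PHENOTYPE)
```

The phenotype alone does not record which allele sits on which chromosome; that phase information is what the haplotypes add, and it is why typing family members is useful for obtaining them.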
These haplotypes can be used to trace migrations in the human population because they are often much like a fingerprint of an event that has occurred in evolution. The Super-B8 haplotype is enriched in the Western Irish, declines along gradients away from that region, and is found only in areas of the world where Western Europeans have migrated. The "A3-B7-DR2-DQ1" haplotype is more widely spread, from Eastern Asia to Iberia. The Super-B8 haplotype is associated with a number of diet-associated autoimmune diseases. There are hundreds of thousands of extended haplotypes, but only a few show a visible and nodal character in the human population.
Studies of humans and animals imply a heterozygous selection mechanism operating on these loci as an explanation for this variability. One credible mechanism is sexual selection in which females are able to detect males with different HLA relative to their own type. While the DQ and DP encoding loci have fewer alleles, combinations of A1:B1 can produce a theoretical potential of 7,755 DQ and 5,270 DP heterodimers, respectively. While nowhere near this number of isoforms exist in the human population, each individual can carry 4 variable DQ and DP isoforms, increasing the potential number of antigens that these receptors can present to the immune system.
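The pairing arithmetic behind these figures can be sketched as follows. For one individual, every distinct α-chain allele can pair with every distinct β-chain allele, so a double heterozygote has 2 × 2 = 4 isoforms. For the population-level totals, the allele counts used below (47 DQA1, 165 DQB1, 34 DPA1, 155 DPB1) are not stated in the text; they are assumed here only because their products reproduce the quoted totals. The function name is hypothetical.

```python
from itertools import product


def heterodimers(alpha_alleles, beta_alleles):
    """All distinct alpha/beta pairings available to one individual."""
    alphas = sorted(set(alpha_alleles))
    betas = sorted(set(beta_alleles))
    return list(product(alphas, betas))


# DQ alleles of the example phenotype used elsewhere in this article.
dq = heterodimers(["DQA1*05:01", "DQA1*01:02"], ["DQB1*02:01", "DQB1*06:02"])
print(len(dq))  # heterozygous at both loci: four pairings

# ASSUMED allele counts (alpha, beta); not given in the text.
ASSUMED_ALLELE_COUNTS = {"DQ": (47, 165), "DP": (34, 155)}
theoretical = {name: a * b for name, (a, b) in ASSUMED_ALLELE_COUNTS.items()}
print(theoretical)
```

An individual who is homozygous at one of the two loci has only two distinct pairings, which is one way to see why heterozygosity increases the range of peptides that can be presented.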
Studies of the variable positions of DP, DR, and DQ reveal that peptide antigen contact residues on class II molecules are most frequently the site of variation in the protein primary structure. Therefore, through a combination of intense allelic variation and/or subunit pairing, the class II 'peptide' receptors are capable of binding an almost endless variation of peptides of 9 amino acids or longer in length, protecting interbreeding subpopulations from nascent or epidemic diseases. Individuals in a population frequently have different haplotypes, and this results in many combinations, even in small groups. This diversity enhances the survival of such groups, and thwarts evolution of epitopes in pathogens, which would otherwise be able to be shielded from the immune system.
HLA antibodies are typically not naturally occurring, and with few exceptions are formed as a result of an immunologic challenge to a foreign material containing non-self HLAs via blood transfusion, pregnancy (paternally inherited antigens), or organ or tissue transplant.
Antibodies against disease-associated HLA haplotypes have been proposed as a treatment for severe autoimmune diseases.
Donor-specific HLA antibodies have been found to be associated with graft failure in kidney, heart, lung, and liver transplantation.
In some diseases requiring hematopoietic stem cell transplantation, preimplantation genetic diagnosis may be used to give rise to a sibling with matching HLA, although there are ethical considerations.