Molecular Biology

The Genetic Basis of Lung Disease

Genetic factors play an important role in diseases that affect the airways (asthma, chronic obstructive pulmonary disease [COPD], cystic fibrosis, primary ciliary dyskinesia), parenchyma (pulmonary fibrosis, Birt-Hogg-Dubé syndrome, tuberous sclerosis), and vasculature (hereditary hemorrhagic telangiectasia) of the lung (Table 2-1). Such conditions include simple monogenic disorders such as Kartagener syndrome and α₁-antitrypsin deficiency, in which mutations of critical genes are sufficient to induce well-defined disease phenotypes. By contrast, many other disease processes affecting the lung are complex genetic traits in which inheritance subtly affects pathogenesis. This group of entities includes COPD, asthma, and idiopathic pulmonary fibrosis. Extending current understanding of the genetic basis of pulmonary conditions will be essential to provide new insights into their underlying pathophysiology, to make predictions about outcome, and to develop novel therapeutic strategies.

Table 2-1 Examples of Genetic Factors That Underlie Lung Disease

Identification of single-gene defects in families that show the same phenotype is now relatively straightforward, owing to completion of the Human Genome Project and improvements in DNA sequencing. Consequently, the past 20 years have seen rapid progress in elucidation of the genetic basis of disease. This rate of progress can be appreciated by considering the many years required to identify the gene associated with cystic fibrosis. Dorothy Hansine Andersen first defined the condition in 1938 when she described cystic fibrosis of the pancreas in association with lung and intestinal disease. Only later was it recognized to be a recessive condition. The sweat test that is used to diagnose the condition was developed after the detection of abnormal sweat electrolytes by Paul di Sant’Agnese in 1952. The search for the cystic fibrosis gene started in the early 1980s, and the gene was localized to chromosome 7 in 1985 through recognition of linkage with the highly polymorphic gene paraoxonase in many populations. This achievement was followed by the identification of additional markers more closely linked to the cystic fibrosis locus, MET and D7S8, allowing prenatal diagnosis of the disorder and eventually leading to the mapping of the causative gene in 1989 by teams headed by Lap-Chi Tsui, Francis Collins, and Jack Riordan. This gene was called the cystic fibrosis transmembrane conductance regulator (CFTR), and now more than 1000 different mutations have been identified that cause cystic fibrosis.

Today, by contrast, what once took many groups a decade to complete can be undertaken in a single laboratory in days. For example, modern exome sequencing enables all 180,000 exons encoded by the human genome to be characterized in an individual patient or an entire kindred. Although the exome equates to only 1% of the genome, or about 30 megabases, it is thought to contain 85% of the mutations responsible for mendelian disorders. This technology was recently used to identify the causative gene of Miller syndrome, a rare disorder that manifests with cleft palate, absent digits, and ocular anomalies. The entire exomes of four affected persons were sequenced, allowing mutations to be identified in the causative gene, which encodes dihydroorotate dehydrogenase (DHODH).
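The exome figures quoted above can be checked with simple arithmetic. The following is a minimal sketch, not part of the original text: it assumes a haploid genome of about 3 gigabases (a round figure that the text does not state), and the function names are illustrative only.

```python
# Back-of-envelope check of the exome figures quoted in the text.
GENOME_SIZE_BP = 3_000_000_000  # assumption: ~3 Gb haploid genome (not stated in the text)
EXOME_FRACTION = 0.01           # from the text: the exome is about 1% of the genome
EXON_COUNT = 180_000            # from the text: about 180,000 exons


def exome_size_bp(genome_bp: int = GENOME_SIZE_BP,
                  fraction: float = EXOME_FRACTION) -> float:
    """Approximate exome size in base pairs."""
    return genome_bp * fraction


def mean_exon_length_bp(exome_bp: float, exon_count: int = EXON_COUNT) -> float:
    """Average exon length implied by the exome size and exon count."""
    return exome_bp / exon_count


if __name__ == "__main__":
    exome = exome_size_bp()
    print(f"Exome size: {exome / 1e6:.0f} Mb")
    print(f"Mean exon length: {mean_exon_length_bp(exome):.0f} bp")
```

Under these assumptions the exome comes to about 30 megabases, in line with the text, and the implied average exon is on the order of 170 base pairs.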

The major challenges now are therefore no longer the single-gene disorders but complex genetic diseases such as cancer, COPD, asthma, and interstitial lung disease. These diseases are the result of interactions between multiple genes and environmental factors. Consequently, the diseases cluster within families but do not show a clear pattern of inheritance.

Single-Gene Disorders and Respiratory Disease

Many single-gene disorders have been linked with respiratory disease (see Table 2-1). They are perhaps best typified by the autosomal recessive condition α₁-antitrypsin deficiency. This condition shows a clear genotype-phenotype correlation, and current understanding of its molecular basis has provided new insights into the pathogenesis of disease. α₁-Antitrypsin is the archetypal member of the serine proteinase inhibitor (“serpin”) superfamily. It is synthesized in the liver and secreted into the plasma, where it is the most abundant circulating proteinase inhibitor. Most people of North European descent carry the normal M allele, but 1 in 25 carries the Z variant (Glu342Lys), which results in plasma α₁-antitrypsin levels in the homozygote that are 10% to 15% of those associated with the normal M allele. The Z mutation causes the accumulation of α₁-antitrypsin in the rough endoplasmic reticulum of the liver, predisposing the homozygote to the development of juvenile hepatitis, cirrhosis, and hepatocellular carcinoma. The greatly reduced circulating levels of α₁-antitrypsin are unable to protect the lungs against proteolytic damage by neutrophil elastase, predisposing the Z homozygote to the development of early-onset emphysema.
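The carrier figure above can be related to the expected frequency of Z homozygotes with a short calculation. This is a hedged sketch rather than a result from the text: it assumes Hardy-Weinberg equilibrium, treats "1 in 25" as the heterozygote frequency, and considers only two alleles (M and Z). The function names are illustrative.

```python
import math


def allele_freq_from_carrier_freq(carrier_freq: float) -> float:
    """Solve 2*q*(1 - q) = carrier_freq for the minor allele frequency q.

    Assumes Hardy-Weinberg equilibrium with two alleles; returns the smaller root.
    """
    discriminant = 1.0 - 2.0 * carrier_freq
    if discriminant < 0:
        raise ValueError("carrier frequency cannot exceed 0.5 under Hardy-Weinberg")
    return (1.0 - math.sqrt(discriminant)) / 2.0


def homozygote_freq(q: float) -> float:
    """Expected frequency of homozygotes for an allele of frequency q."""
    return q * q


if __name__ == "__main__":
    carrier = 1 / 25  # from the text, read here as the heterozygote frequency (assumption)
    q = allele_freq_from_carrier_freq(carrier)
    print(f"Estimated Z allele frequency: {q:.4f}")
    print(f"Expected Z homozygotes: about 1 in {1 / homozygote_freq(q):.0f}")
```

With these assumptions the Z allele frequency is about 0.02 and Z homozygotes are expected in roughly 1 in 2400 people. Real populations depart from the model, so this is only an order-of-magnitude estimate.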

The structure of α₁-antitrypsin is based on a dominant β-pleated sheet A and nine α-helices (Figure 2-1). This scaffold supports an exposed mobile reactive loop that presents a peptide sequence as a pseudosubstrate for the target proteinase. After docking, the proteinase is inactivated by a mousetrap-type action that swings it from the top to the bottom of the serpin in association with the insertion of an extra strand into β-sheet A (see Figure 2-1). This six-stranded protein bound to its target enzyme is then recognized by hepatic receptors and cleared from the circulation. The structure of α₁-antitrypsin is central to its role as an effective antiproteinase but also renders it liable to undergo conformational change in association with disease. The Z mutation is at residue P₁₇ (the 17th position counting back toward the amino terminus from, and including, the key P₁ amino acid that defines the inhibitory specificity of α₁-antitrypsin) at the head of a strand of β-sheet A and the base of the mobile reactive loop (see Figure 2-1). The mutation opens β-sheet A, thereby favoring the insertion of the reactive loop of a second α₁-antitrypsin molecule to form a dimer (see Figure 2-1). This dimer can then extend to form polymers that tangle in the endoplasmic reticulum of the liver to form the inclusion bodies resulting in liver disease. Support for this pathomechanism comes from the demonstration that Z α₁-antitrypsin formed chains of polymers when incubated under physiologic conditions. The rate was accelerated by raising the temperature to 41°C and could be blocked by peptides that compete with the loop for annealing to β-sheet A. The role of polymerization in vivo was clarified by the finding of α₁-antitrypsin polymers in inclusion bodies from the livers of Z α₁-antitrypsin homozygotes (see Figure 2-1).
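The P-site numbering used above can be tied to residue numbers with a small helper. This sketch relies on two points that the text does not state: that P₁ of α₁-antitrypsin is Met358, and that P sites are numbered from P₁ toward the amino terminus with P₁ itself counted as position 1. The function name is hypothetical.

```python
P1_RESIDUE = 358  # assumption: P1 of alpha-1-antitrypsin is Met358 (not stated in the text)


def p_site_to_residue(p_index: int, p1_residue: int = P1_RESIDUE) -> int:
    """Return the residue number of site P<p_index>.

    P1 is the anchor; higher P numbers lie toward the amino terminus,
    so P<n> sits (n - 1) residues before P1.
    """
    if p_index < 1:
        raise ValueError("P-site index must be 1 or greater")
    return p1_residue - (p_index - 1)


if __name__ == "__main__":
    print(p_site_to_residue(17))  # residue number of P17
```

Under these assumptions, `p_site_to_residue(17)` returns 342, which matches the position of the Z mutation (Glu342Lys).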

Figure 2-1 The molecular basis of α₁-antitrypsin deficiency. α₁-Antitrypsin may be considered to act by a mousetrap mechanism. A, After docking (left), the target proteinase (gray) is inactivated by movement from the upper to the lower pole of the protein (right). This is associated with insertion of the reactive loop (red) as an extra strand into β-sheet A (green). The mousetrap mechanism may be triggered spontaneously by point mutations in association with disease. The Z mutation (Glu342Lys) of α₁-antitrypsin is at the head of a strand of β-sheet A (green) and the base of the reactive loop. B, Mutations in this region can destabilize β-sheet A to allow the insertion of a reactive loop of a second molecule (middle). This dimer then extends to form long chains of polymers (right). Each molecule of α₁-antitrypsin in the polymer is shown in a different color. It is these polymers that tangle in the endoplasmic reticulum to cause inclusions resulting in liver disease. C, An inclusion body (arrow) from the liver of a patient with α₁-antitrypsin deficiency (left). The inclusions are composed of chains of molecules of α₁-antitrypsin (right).

(Modified from Gooptu B, Lomas DA: Conformational pathology of the serpins—themes, variations and therapeutic strategies, Annu Rev Biochem 78:147–176, 2009.)

Although many α₁-antitrypsin deficiency variants have been described, only three other mutants of α₁-antitrypsin have similarly been associated with plasma deficiency and hepatic inclusions: α₁-antitrypsin Siiyama (Ser53Phe), α₁-antitrypsin Mmalton (Phe52 deleted), and α₁-antitrypsin King’s (His334Asp). All of these mutants lie in the shutter domain that controls opening of β-sheet A. They destabilize the molecule to allow the formation of loop-sheet polymers in vivo. Further investigations have shown that polymerization also underlies the mild plasma deficiency of the S (Glu264Val) and I (Arg39Cys) variants of α₁-antitrypsin. The point mutations that are responsible for these variants have less effect on β-sheet A than does the Z variant. Thus, the associated rate of polymer formation is much slower than that for Z α₁-antitrypsin, which results in less retention of protein within hepatocytes, milder plasma deficiency, and the lack of a clinical phenotype. However, if a mild, slowly polymerizing I or S variant of α₁-antitrypsin is inherited with a rapidly polymerizing Z variant, then the two can interact to form heteropolymers within hepatocytes. These polymers underlie the inclusions that cause cirrhosis.

Emphysema associated with α₁-antitrypsin deficiency results from lack of protection against proteolytic attack in the lungs associated with reduced levels of circulating proteinase inhibitor. This is particularly the case with individuals who smoke tobacco. The Z α₁-antitrypsin that does escape from the liver into the circulation is less efficient in protecting the tissues from enzyme damage and, like M α₁-antitrypsin, may be inactivated by oxidation of the P₁ methionine residue. The demonstration that Z α₁-antitrypsin can undergo a spontaneous conformational transition in association with liver disease raised the possibility that this might also occur within the lung. Indeed, polymers have been detected in bronchoalveolar lavage fluid in patients with Z α₁-antitrypsin deficiency. This observation may have important implications for the pathogenesis of disease, because polymerization obscures the reactive loop of α₁-antitrypsin, rendering the protein inactive as an inhibitor of proteolytic enzymes. Thus, the spontaneous polymerization of α₁-antitrypsin within the lung will exacerbate the already reduced antiproteinase screen, thereby increasing the susceptibility of the tissues to proteolytic attack and increasing the rate of progression of emphysema. Finally, the α₁-antitrypsin polymers themselves are inflammatory for neutrophils, which will also increase the proteolytic load in the lung. Recent data suggest that cigarette smoke can induce the intrapulmonary polymerization of Z α₁-antitrypsin, thereby exacerbating the lung damage associated with smoking.