First author [reference] | Number of cases | Number of controls | Novel genes or genomic regions related to COPD | Replicated genes or regions
--- | --- | --- | --- | ---
Pillai [44] | 823 | 810 | CHRNA3/CHRNA5/IREB2, HHIP | -
Cho [46] | 2940 | 1380 | FAM13A | CHRNA3/CHRNA5/IREB2, HHIP
Cho [48] | 3499 | 1922 | Chromosome 19q13 | CHRNA3/CHRNA5/IREB2, HHIP, FAM13A
Wilk [51] | 3368 | 29,507 | HTR4 | CHRNA3/CHRNA5/IREB2, HHIP
Cho [28] | 6633 | 5704 | RIN3 | CHRNA3/CHRNA5/IREB2, HHIP, FAM13A
Cho (on severe COPD) [28] | 3125 | 1468 | MMP12, TGFB2 | CHRNA3/CHRNA5/IREB2, HHIP, FAM13A
Hobbs [100] | 6161 | 6004 | IL27 | CHRNA5, etc.
Although the genes or genotypes related to COPD susceptibility could have a critical role in COPD pathogenesis, whether and how these genes or genetic variations affect the pathogenesis should be investigated separately for each of them. Current evidence in this research field is concisely reviewed in a recent article [56]. CHRNA3 and CHRNA5 encode subunits of the nicotinic cholinergic receptor, and the proteins are responsive to nicotine. The association between nicotine addiction and genetic variants of these genes has also been reported, including by GWAS (Fig. 3.1a) [49, 57, 58]. A study using Chrna5 knockout mice also showed that nicotine activates the habenulo-interpeduncular pathway through alpha5-containing nAChRs, triggering an inhibitory motivational signal that acts to limit nicotine intake [59]. Thus, it is speculated that genetic variations of this gene affect COPD susceptibility by modifying the extent of nicotine dependence. Although this idea was also supported by mediation analysis of previous GWASs, CHRNA3 and CHRNA5 possibly affect COPD pathogenesis in both direct and indirect manners, as shown in Fig. 3.1b [60]. The SNPs of these genes are related to the expression levels of CHRNA3 and IREB2 in blood and sputum samples [61]. IREB2 is a protein that binds iron-responsive elements and maintains cellular iron metabolism. Although mice with a targeted disruption of Ireb2 have already been generated, it has only been reported that misregulation of iron metabolism in these mice leads to neurodegenerative disease, and there have been no reports on COPD pathogenesis in these mice [62].
Fig. 3.1
Susceptibility genes for smoking and COPD. (a) The various phenotypes of smoking and the genes (BDNF, CHRNA genes, CYP2A6, and PPP1R3C) whose variations are related to these phenotypes (Reproduced with permission from Ref. [58]). (b) A model of the association between the genes related to smoking behavior (e.g., CHRNA3) and their direct and indirect effects on smoking-related diseases [60] (Reproduced with permission from the American Thoracic Society. Copyright © 2016 American Thoracic Society. Ann Am Thorac Soc. 2014 Sep;11(7):1082–3. Official Journal of the American Thoracic Society). FTND, Fagerström Test for Nicotine Dependence.
HHIP encodes a membrane glycoprotein that is an endogenous antagonist of the hedgehog pathway, which is critical for the morphogenesis of the lung and other organs. One of the COPD-related SNPs, rs1828591, is located in the enhancer region of HHIP, and the expression of this protein in non-tumor lung specimens was reduced with the risk allele of this SNP [63]. Gene expression microarray analysis in a human bronchial epithelial cell line (Beas-2B) stably infected with HHIP shRNAs revealed differential expression of genes related to extracellular matrix and cell growth, and these genes were also differentially expressed in COPD lung tissues [64]. In mice with Hhip haploinsufficiency exposed to cigarette smoke, severe airspace enlargement and enhanced lymphocyte activation pathways in lung tissues were observed [65].
The function of FAM13A is still largely unknown. However, it was reported that this protein is related to the activation of the Wnt signaling pathway [66]. In GWASs, FAM13A was associated with pulmonary function in healthy and asthmatic populations [67] and with idiopathic pulmonary fibrosis [69]; it was also associated with chronic bronchitis but not with emphysema [68]. Thus, this gene possibly has a critical role in the pathogenesis of various lung diseases, but in a complicated manner. Since Fam13a knockout mice were recently generated [66], the role of FAM13A in COPD pathogenesis could be further elucidated with this mouse model in the near future. For the other genes, including TGFB2 and RIN3, the molecular mechanisms through which they affect COPD pathogenesis are still largely unknown.
3.4 Recent Progress of Studies of COPD Genetics in GWAS Era (I): Relevant Genes Determining the Susceptibility to Smoking
Since genes encoding nicotinic acetylcholine receptors, thought to be among the critical factors affecting smoking behavior, were reported to be associated with COPD by GWASs validated in several populations [28, 44] as described above, it is speculated that genes determining the susceptibility to smoking are also involved in COPD pathogenesis and progression in general. In addition, when an association between COPD and a gene related to smoking behavior, such as CHRNA3, is observed, it is unclear whether the gene affects COPD in a direct manner (e.g., by modifying the inflammation process in the local regions of the lungs) or in an indirect manner (e.g., by modifying susceptibility to smoking behavior). In fact, COPDGene and ECLIPSE investigators showed that the effects of two linked variants (rs1051730 and rs8034191) in the AGPHD1/CHRNA3 cluster on COPD development were significantly, yet not entirely, mediated by the smoking-related phenotypes, and they confirmed the existence of direct effects of the AGPHD1/CHRNA3, IREB2, FAM13A, and HHIP loci on COPD development [70], as shown in Fig. 3.1b [60]. They also reported that the association of the AGPHD1/CHRNA3 locus with COPD is significantly mediated by smoking-related phenotypes, whereas IREB2 appears to affect COPD independently of smoking [70]. Very recently, UK Biobank data were used to study the genetic causes of smoking behavior and lung health in the UK Biobank Lung Exome Variant Evaluation (UK BiLEVE) study, which selected 50,008 unique samples composed of 10,002 individuals with low FEV1, 10,000 with average FEV1, and 5002 with high FEV1 from each of the heavy-smoker and never-smoker groups [71] (Fig. 3.2 [72]). First, they showed a substantial sharing of genetic causes of low FEV1 between heavy smokers and never smokers and between individuals with and without doctor-diagnosed asthma. 
They also discovered six novel genome-wide significant signals of association with extremes of FEV1, including signals at four novel loci (KANSL1, TSEN54, TET2, and RBM19/TBX5) and independent signals at two previously reported loci (NPNT and HLA-DQB1/HLA-DQA2), and these variants also showed association with COPD, including in individuals with no history of smoking. In addition, they discovered five new genome-wide significant signals for smoking behavior, including a variant in NCAM1 and a variant on chromosome 2 (between TEX41 and PABPC1P2) that has a trans effect on expression of NCAM1 in brain tissue. This study is unique in its design, namely, sampling from the extremes of the lung function distribution in the very large UK Biobank population. In summary, this study showed that CHRNA3/CHRNA5 is associated with COPD pathogenesis through nicotine dependence, HHIP through disturbance of lung development, and GSTCD and several other genes through modification of oxidant stress and/or inflammation (Fig. 3.2) [72]. This study not only identified susceptibility genes of COPD but also revealed how these genes affect COPD pathogenesis, and it suggests a new strategy for GWAS-based studies of COPD genetics. Smoking behaviors include various phenotypes, including smoking initiation, increase in tobacco consumption, nicotine addiction, and smoking cessation, and these behaviors could be associated with different genes (Fig. 3.1a) [58]. 
Meta-analyses of genome-wide association studies for the number of cigarettes smoked per day (CPD) in smokers (n = 31,266) and smoking initiation (n = 46,481), using samples mainly from the ENGAGE Consortium, with replication in the Tobacco and Genetics (TAG) and Oxford-GlaxoSmithKline (Ox-GSK) consortium cohorts (n = 45,691 smokers) and in a third sample of European ancestry (n = 9040), revealed variants in three genomic regions associated with CPD: previously identified SNPs at 15q25 represented by rs1051730[A], and SNPs at 19q13 and 8p11, where genes encoding nicotine-metabolizing enzymes (CYP2A6 and CYP2B6) and nicotinic acetylcholine receptor subunits (CHRNB3 and CHRNA6) are located [49]. Since CHRNA3/CHRNA5 is so far the only locus related to both smoking behavior and COPD in GWASs, other genes related to smoking behaviors, e.g., CYP2A6, could also have a critical role in COPD pathogenesis. Finding novel genes related to both smoking and COPD could provide supportive evidence that therapeutics for nicotine addiction will also prevent COPD and its progression, which could lead to the development of novel therapeutics for both smoking and COPD.
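The distinction drawn above between direct effects of a variant on COPD and indirect effects mediated by smoking can be illustrated with a small numerical sketch. The code below is not the method of Ref. [70]; it is a minimal product-of-coefficients decomposition on simulated data, assuming a continuous outcome and linear effects. The allele frequency (0.35) and the effect sizes (0.5, 0.2, 0.6) are arbitrary illustrative values, not estimates from any study.

```python
import random


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def mediation_effects(genotype, smoking, outcome):
    """Split the total genotype effect into direct and smoking-mediated parts."""
    s_gg = covariance(genotype, genotype)
    s_mm = covariance(smoking, smoking)
    s_gm = covariance(genotype, smoking)
    s_gy = covariance(genotype, outcome)
    s_my = covariance(smoking, outcome)

    a = s_gm / s_gg          # slope of smoking on genotype
    total = s_gy / s_gg      # slope of outcome on genotype, smoking ignored
    det = s_gg * s_mm - s_gm ** 2
    # Coefficients of the two-predictor least-squares fit outcome ~ genotype + smoking
    direct = (s_gy * s_mm - s_my * s_gm) / det
    b = (s_my * s_gg - s_gy * s_gm) / det
    indirect = a * b
    return {
        "total": total,
        "direct": direct,
        "indirect": indirect,
        "proportion_mediated": indirect / total,
    }


# Simulated cohort (all parameters are hypothetical).
rng = random.Random(0)
n_subjects = 20000
genotype = [sum(rng.random() < 0.35 for _ in range(2)) for _ in range(n_subjects)]
smoking = [0.5 * g + rng.gauss(0, 1) for g in genotype]
outcome = [0.2 * g + 0.6 * m + rng.gauss(0, 1) for g, m in zip(genotype, smoking)]

effects = mediation_effects(genotype, smoking, outcome)
for name, value in effects.items():
    print(f"{name}: {value:.3f}")
```

In this linear setting the total effect equals the sum of the direct and indirect effects, so a proportion mediated between 0 and 1 corresponds to a variant whose effect is "significantly, yet not entirely" mediated by smoking. Analyses of a binary COPD outcome require different models, and this sketch does not cover them.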
Fig. 3.2
The several processes of COPD pathogenesis and the results of a GWAS by UK BiLEVE (Reproduced with permission from Ref. [72])
3.5 Recent Progress of Studies of COPD Genetics in GWAS Era (II): Gene Related to Critical Subtypes or Phenotypes of COPD
Though COPD is thought to result mostly from an accelerated decline in FEV1 over time (namely, the phenotype of “rapid decliner”), which is possibly caused by smoking and also by frequent exacerbations of COPD, a normal decline in FEV1 could also lead to COPD in persons whose maximally attained FEV1 is less than population norms. This idea was proposed by Burrows a long time ago [73] (Fig. 3.3), and it was later shown in actual cohorts that low FEV1 in early adulthood is important in the genesis of COPD and that accelerated decline in FEV1 is not an obligate feature of COPD [74]. Thus, genes related to COPD pathogenesis could affect COPD and its progression through either of these two models (namely, the pattern of rapid decliners and that of low maximally attained FEV1). On rapid decliners, case–control association studies on candidate genes were performed in a population from the Lung Health Study (283 rapid decliners (deltaFEV1 = −154 +/− 3 ml/year) and 308 nondecliners (deltaFEV1 = +15 +/− 2 ml/year) among smokers followed for 5 years), and associations between rapid decline of FEV1 and the MZ genotype of the alpha1-antitrypsin gene and a haplotype of the microsomal epoxide hydrolase gene were suggested [75]. A recent GWAS assessing genetic contributions to lung function decline over a 5-year period in 4048 European American Lung Health Study participants with largely mild COPD showed that two novel regions were associated with lung function decline in mild COPD and that genes within these regions (TMEM26, FOXA1, and ANK3) were expressed in relevant lung cells and their expression was related to airflow limitation [76]. However, the genes related to a maximally attained FEV1 below population norms have been less elucidated so far, partly because this phenotype was not thought to be associated with COPD. HHIP could be one of the candidate genes related to this phenotype or subgroup of COPD (Fig. 3.2 [72]). 
HHIP was reported to be associated with COPD by GWASs as described above [44], and according to the results of the UK BiLEVE study, HHIP is thought to affect COPD pathogenesis especially through modifying lung development [71]. The genes related to airflow obstruction were investigated in GWASs with larger populations than those of the COPD studies [52, 53, 77] and include not only HHIP but also several other genes. Thus, these genes other than HHIP, namely, TNS1, GSTCD, AGER, HTR4, THSD4, and others, are good candidates as COPD-related genes, and their possible roles in COPD pathogenesis, including their effect sizes, should be investigated.
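The two routes to COPD described above (accelerated decline from a normal peak versus normal decline from a low peak) can be compared with simple arithmetic. The following is a minimal sketch assuming a strictly linear decline after the peak; the peak values, decline rates, and the 2.5 L threshold are hypothetical round numbers chosen for illustration and are not taken from Refs. [73, 74].

```python
def years_to_threshold(peak_fev1_l, decline_ml_per_year, threshold_l):
    """Years after peak lung function until FEV1 reaches the threshold,
    assuming a constant (linear) rate of decline."""
    if decline_ml_per_year <= 0:
        raise ValueError("decline_ml_per_year must be positive")
    litres_to_lose = peak_fev1_l - threshold_l
    return max(0.0, litres_to_lose * 1000.0 / decline_ml_per_year)


THRESHOLD_L = 2.5  # hypothetical FEV1 level standing in for airflow limitation

# Normal peak, normal decline (reference trajectory).
reference = years_to_threshold(4.0, 30, THRESHOLD_L)
# Normal peak, accelerated decline ("rapid decliner").
rapid_decliner = years_to_threshold(4.0, 60, THRESHOLD_L)
# Low maximally attained FEV1, normal decline.
low_peak = years_to_threshold(3.25, 30, THRESHOLD_L)

print(reference, rapid_decliner, low_peak)
```

Under these assumed values, the person with a low peak and a normal rate of decline reaches the threshold as early as the rapid decliner, which is the point of the second model: genes acting on lung development could matter as much as genes acting on the rate of decline.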
Fig. 3.3
Possible “natural histories” of COPD. (a) A Fletcher-Peto curve, presented in the GOLD guideline. (b) “Natural histories” that can lead to severe COPD in various manners, as described by Burrows. Some with a frequent exacerbation phenotype may have repetition of exacerbation and remission leading to progression of COPD, shown as “B.” COPD chronic obstructive pulmonary disease (Reproduced with permission from Ref. [84])
Emphysema is one of the most important phenotypes in COPD, and the pathogenesis of emphysema should be investigated and clarified in detail, partly in order to develop new therapeutic strategies for this phenotype. Airway wall thickening and emphysema phenotypes showed independent aggregation within families of individuals with COPD, suggesting that different genetic factors influence these disease processes [78]. Further, because it is relatively easy to assess the extent of emphysema pathologically or by computed tomography (CT) in mice compared with assessing airway disease, research on the molecular mechanism of emphysema formation can realistically proceed using mice with knockdown of emphysema-related genes. By GWASs with ECLIPSE, NETT, GenKOLS, and COPDGene populations, BICD1, SNRPF, and PPT2 were associated with emphysema [79, 80], as were SERPINA10 and DLC1, as reported recently [81]. Combined pulmonary fibrosis and emphysema (CPFE) (reviewed in Chap. 18), a COPD phenotype with comorbid interstitial lung disease, is also a critical and definite category. The susceptibility genes of CPFE are speculated to be genes related to both emphysema and interstitial pulmonary fibrosis (e.g., FAM13A [82]), but this remains to be elucidated. Though research on the genes related to airway disease in COPD is also ongoing, the results are still unclear and should be investigated further [81].
Frequent exacerbators are also one of the critical subgroups of COPD. Although exacerbations become more frequent and more severe as COPD progresses, the rate at which they occur appears to reflect an independent phenotype [83]. Exacerbation is a leading cause of mortality and of decline in pulmonary function and quality of life, and it is also a major cost driver of COPD, especially through hospitalization [84]. Fletcher and Peto [85] demonstrated the natural course of COPD, the “Fletcher-Peto curve,” and they showed that expiratory airflow, measured as FEV1, declines with age throughout adulthood and that smoking, not mucus hypersecretion, accelerates its decline (Fig. 3.3a). In addition, Burrows [73] indicated that some individuals with COPD might have exacerbation and remission periods, with each episode leading to a progressive loss of function (Fig. 3.3b). Thus, the susceptibility genes of COPD exacerbations should also be elucidated. Several reports exist regarding the association between exacerbation susceptibility and variations in some genes, including those encoding surfactant protein B [86], mannose-binding lectin [87], and chemokine ligand 1 [88]; the proteins encoded by these genes are a surfactant protein, a lectin that acts as a pattern recognition receptor in serum, and a chemokine, respectively, and they mainly have the capacity to protect against bacteria or viruses. The genetic variations that increase this capacity are thought to reduce susceptibility to infection and thus COPD exacerbations. Loss of Siglec-14, a lectin likely involved in host defense, was also associated with a reduced COPD exacerbation risk in a Japanese population [89]. However, because this deletion polymorphism of SIGLEC14 is rare in Caucasians, it is uncertain whether this gene has a critical role in the pathogenesis of COPD exacerbations across ethnic groups. 
A protein that strengthens host defense, such as Siglec-14, could also trigger an exaggerated response and generate unwanted local and systemic inflammation, which could be detrimental to the host and could contribute to COPD with a frequent exacerbation phenotype, its progression, and its comorbidities (Fig. 3.4) [84]. Thus, the genes related to signal transduction in inflammation and immunity, like SIGLECs, and components of gap junctions are thought to be good candidates for case–control association studies of COPD exacerbation genetics. However, to find truly “novel” genes related to COPD exacerbations, GWASs should be performed in multiethnic populations with a clear definition of exacerbations.
Fig. 3.4
The hypothesized role of antibacterial but also proinflammatory molecules (e.g., Siglec-14) in the pathogenesis of COPD as a systemic disease (Reproduced with permission from Ref. [84])
Comorbidities are frequent in COPD and some of them negatively influence survival [90]. Cluster analysis showed that multimorbidity is common in patients with COPD and identified different comorbidity clusters ((1) less comorbidity, (2) cardiovascular, (3) cachectic, (4) metabolic, and (5) psychological) [91]. If we consider COPD a syndrome composed of these subgroups, each subgroup could have different susceptibility genes to be investigated. COPD with asthma is called asthma-COPD overlap syndrome (ACOS) and is thought to be relevant as a clinical entity. A GWAS in non-Hispanic whites identified single-nucleotide polymorphisms in the genes CSMD1 and SOX5, and a meta-analysis identified single-nucleotide polymorphisms in the gene GPR65, associated with ACOS [92].
3.6 Recent Progress of Studies of COPD Genetics in GWAS Era (III): Novel Strategy for Research on COPD Genetics with Multi-omics
Though a number of genes have been reported to be associated with COPD so far, the functional relevance of these genetic variations is unclear for most, or at least some, of them, similar to the results of GWASs on other common chronic diseases. In this situation, it is difficult to connect the results of GWASs with disease pathogenesis. Therefore, associations between two different types of omics data, such as GWAS and gene expression profiling, have been examined with the aim of relating the results to the disease [54] (Fig. 3.5 [54]).
The genetic variants related to COPD were identified by GWASs as described above. In addition, associations between genetic variations and mRNA expression, first in lymphoblastoid cells [93] and then in sputum and lung tissues, were also reported [63, 94]. If these kinds of information are combined, we could find the disease-related genes and simultaneously the molecular mechanisms through which the genes affect disease pathogenesis. Recently, the GWAS results on COPD from SpiroMeta-CHARGE, the consortium that undertook the largest GWAS so far (n = 48,201), were integrated with expression quantitative trait loci (eQTLs) in lung tissue from 1111 individuals. This group found that SNPs associated with lung function measures were more likely to be eQTLs, that the genes whose expression in lung tissues was regulated by these SNPs were enriched for developmental and inflammatory pathways, and that SNPs associated with lung function that were eQTLs in blood, but not in the lung, were only involved in inflammatory pathways [63, 94, 95].
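The statement that trait-associated SNPs are "more likely to be eQTLs" is an enrichment claim, and one common way to test such a claim is a one-sided hypergeometric (Fisher-type) test on the overlap between two SNP sets. The sketch below shows that calculation only; it is not the procedure of Ref. [95], and all counts are hypothetical.

```python
from math import comb


def enrichment_p_value(overlap, n_eqtl, n_gwas, n_total):
    """Upper-tail hypergeometric probability P(X >= overlap).

    n_total: SNPs tested in both data sets
    n_eqtl:  SNPs that are eQTLs
    n_gwas:  SNPs associated with the trait
    overlap: SNPs that are both trait-associated and eQTLs
    """
    largest_possible = min(n_eqtl, n_gwas)
    favourable = sum(
        comb(n_eqtl, k) * comb(n_total - n_eqtl, n_gwas - k)
        for k in range(overlap, largest_possible + 1)
    )
    return favourable / comb(n_total, n_gwas)


# Hypothetical counts: 1000 SNPs tested, 100 are eQTLs, 50 are trait-associated,
# and 15 are both (5 would be expected by chance).
p_enrichment = enrichment_p_value(overlap=15, n_eqtl=100, n_gwas=50, n_total=1000)
print(f"P(overlap >= 15) = {p_enrichment:.2e}")
```

A real analysis would also have to account for linkage disequilibrium and allele-frequency matching between SNP sets, which this sketch ignores.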
Another group developed a systematic approach to identify key regulators of COPD that integrates genome-wide DNA methylation, gene expression, and phenotype data in lung tissue from COPD and control samples [96]. They identified 126 key regulators of COPD, including EPAS1 as the only key regulator whose downstream genes significantly overlapped with multiple gene sets associated with COPD disease severity. EPAS1 was distinct from other key regulators in terms of methylation profile and downstream target genes. They also confirmed that EPAS1 protein levels are lower in human COPD lung tissue than in non-disease controls and that Epas1 gene expression is reduced in mice chronically exposed to cigarette smoke. This kind of methodology could be leveraged to directly identify novel key mediators of this pathophysiology.
Another group hypothesized that additional COPD susceptibility loci could be identified by applying unbiased weights derived from unique populations. They performed a homozygosity haplotype analysis on a group of subjects with and without COPD to identify regions of conserved homozygosity haplotype (RCHHs); weights were constructed based on the frequency of these RCHHs in cases versus controls and used to adjust the p-values from a large collaborative GWAS of COPD. They identified two SNPs in a novel gene (fibroblast growth factor-7 (FGF7)) that gained genome-wide significance, and the association with COPD was validated in an independent population. They also observed that increased lung tissue FGF7 expression was associated with worse measures of lung function [97].
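How external weights can move a SNP across the genome-wide significance threshold is easiest to see in a small example. The source does not describe the exact weighting scheme of Ref. [97], so the sketch below assumes a generic weighted-Bonferroni style adjustment in which weights are rescaled to average 1 and each p-value is divided by its weight. The SNP names, p-values, and raw weights are hypothetical.

```python
GENOME_WIDE_ALPHA = 5e-8  # conventional genome-wide significance threshold


def weighted_p_values(p_values, raw_weights):
    """Divide each p-value by its weight after rescaling weights to mean 1."""
    mean_weight = sum(raw_weights.values()) / len(raw_weights)
    adjusted = {}
    for snp, p in p_values.items():
        weight = raw_weights[snp] / mean_weight  # rescaled so weights average 1
        adjusted[snp] = min(1.0, p / weight)
    return adjusted


# Hypothetical GWAS p-values and prior weights (e.g., derived from an
# independent homozygosity analysis).
p_values = {"snp_a": 8e-8, "snp_b": 3e-8, "snp_c": 1e-2, "snp_d": 1e-3}
raw_weights = {"snp_a": 4.0, "snp_b": 1.0, "snp_c": 1.0, "snp_d": 2.0}

adjusted = weighted_p_values(p_values, raw_weights)
significant = [snp for snp, p in adjusted.items() if p < GENOME_WIDE_ALPHA]
print(adjusted)
print(significant)
```

Because the weights average 1, up-weighting some SNPs necessarily down-weights others: here the hypothetical snp_a gains significance while snp_b, which passed the unweighted threshold, loses it. This trade-off is why the weights need to come from data independent of the GWAS being adjusted.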
3.7 Recent Progress of Studies of COPD Genetics in GWAS Era (IV): Next-Generation Sequencing and Arrays
The technology to detect genetic variations has progressed rapidly over the past 10 years, including next-generation sequencing and GWAS with genotyping of more SNPs and imputation of genotypes. Genotypes of common variants (approximately 500,000 SNPs) were assessed in the initial GWASs partly because the CD/CV hypothesis was widely believed and partly because, at that time, it would have been technically and economically difficult to genotype many more SNPs, including not only common variants but also rare variants. The initial generation of GWAS was based on the data of the HapMap Project, which showed the allele frequencies of SNPs across the whole genome (not all SNPs but more than one million SNPs (approximately one tenth of the SNPs) with relatively high frequency) [43]. Based on these data, SNPs with an allele frequency of more than 5 % were selected in a genome-wide manner, and the association between these SNPs and disease phenotypes was investigated to find novel disease susceptibility genotypes. Thereafter, the 1000 Genomes Project was launched in January 2008 as an international research effort to establish the most detailed catalogue of human genetic variation, including rare variations, and it planned to sequence the genomes of at least one thousand anonymous participants from a number of different ethnic groups within the following 3 years, using newly developed technologies which were faster and less expensive, including next-generation sequencing. The project finished its pilot phase in 2010 [98] and was completed in 2015 [47]. 
In parallel, though genome-wide association studies have identified hundreds of genetic variants associated with complex human diseases and traits and have provided valuable insights into their genetic architecture, most variants identified so far confer relatively small increments in risk and explain only a small proportion of familial clustering, leading many to question how the remaining, “missing” heritability can be explained. Many explanations for this missing heritability have been suggested, including much larger numbers of variants of smaller effect yet to be found, rarer variants (possibly with larger effects) that are poorly detected by available genotyping arrays that focus on variants present in 5 % or more of the population, and structural variants poorly captured by existing arrays [99]. Thus, new GWASs genotyping minor alleles with an allele frequency of 0.5–5 % (approximately two million or more SNPs are genotyped) and exome- or genome-wide next-generation sequencing have been performed and are ongoing to further examine the genetic aspects of COPD, including missing heritability. In a very recent study, to identify coding variants associated with COPD, non-synonymous, splice, and stop variants with a minor allele frequency above 0.5 % were assessed for association with COPD in five study populations enriched for COPD, mainly Caucasians and including the COPDGene Study, and novel single-variant associations were validated in three additional COPD cohorts. The analysis included 6004 controls and 6161 COPD cases across the five analysis cohorts, and associations were found not only in genes reported previously (CHRNA5, AGER, MMP3, and SERPINA1) but also for a non-synonymous variant, rs181206, in IL27 [100]. We also reported, using an in vitro model in which gene expression of SIGLEC14 was modified, that serum IL27 levels are a promising biomarker for COPD [101]; SIGLEC14 was reported as a susceptibility gene of COPD exacerbations [89]. 
Thus, it is speculated that IL27 (and possibly SIGLECs) has a critical role in the pathogenesis of COPD and its exacerbations across ethnic groups. In another recent study, it was hypothesized that exome sequencing in families identified through a proband with severe, early-onset COPD would identify additional rare genetic determinants of large effect, and potential causal variants for COPD were investigated in whole exomes from 347 subjects in 49 extended pedigrees from the Boston Early-Onset COPD Study. However, no novel susceptibility or causal genes of COPD were found in this study, and the authors also demonstrated through simulation the limited power of this approach under genetic heterogeneity [102]. The missing heritability of COPD could be explained by much larger numbers of variants of smaller effect yet to be found, similar to the case of the heritability of height, for example [103]. Other types of genetic variation, e.g., copy number variants (CNVs), could also be associated with COPD. The effects of polymorphic CNVs on quantitative measures of pulmonary function and chest computed tomography phenotypes were investigated among subjects enrolled in COPDGene, and a polymorphic CNV on chromosome 5q35.2 located between two genes (FAM153B and SIMK1, but also harboring several pseudo-genes) reached genome-wide significance in tests of association with total lung capacity as measured by chest CT scans [104]; however, the associations between CNVs and COPD should be further elucidated in the near future.
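The allele-frequency thresholds used in this section (variants present in 5 % or more of the population for the initial GWASs, and minor alleles at 0.5–5 % for the newer arrays) can be made concrete with a short calculation of minor allele frequency from genotype counts. The sketch below is illustrative only; the genotype counts and the class labels "common", "low-frequency", and "rare" are assumptions for this example, with boundaries placed at the 5 % and 0.5 % values mentioned in the text.

```python
def minor_allele_frequency(hom_ref, het, hom_alt):
    """Minor allele frequency from counts of the three biallelic genotypes."""
    n_alleles = 2 * (hom_ref + het + hom_alt)
    alt_frequency = (het + 2 * hom_alt) / n_alleles
    return min(alt_frequency, 1.0 - alt_frequency)


def frequency_class(maf):
    """Classify a variant by the thresholds discussed in the text."""
    if maf >= 0.05:
        return "common"          # covered by the initial GWAS arrays
    if maf >= 0.005:
        return "low-frequency"   # targeted by newer arrays and imputation
    return "rare"                # mainly accessible by sequencing


# Hypothetical genotype counts (hom_ref, het, hom_alt) in 1000 subjects.
variants = {
    "variant_1": (810, 180, 10),
    "variant_2": (980, 20, 0),
    "variant_3": (998, 2, 0),
}

classes = {}
for name, counts in variants.items():
    maf = minor_allele_frequency(*counts)
    classes[name] = frequency_class(maf)
    print(f"{name}: MAF = {maf:.3f} ({classes[name]})")
```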