Introduction
This chapter reviews current understanding of the genetic architecture of coronary atherosclerosis as gleaned from Mendelian and common, complex forms of the disease. Newly identified pathways and biologic mechanisms are highlighted before discussing the present and future role of genetic testing for the diagnosis, prognosis, and treatment of patients with coronary artery disease (CAD).
Heritability of Coronary Artery Disease
Familial clustering of CAD has long been observed and suggests an inherited basis for CAD and its downstream complication of myocardial infarction (MI). In the offspring cohort of the Framingham Heart Study, a parental history of premature CAD conferred a two- to three-fold increase in the age-specific incidence of cardiovascular events after adjustment for conventional CAD risk factors, implying a genetic basis for the observed susceptibility to CAD. Twin and family studies have estimated that the heritability of CAD is approximately 40% to 60%. Heritable effects appear most pronounced for early-onset CAD, denoting the importance of inherited over acquired risk factors for the development of premature disease. Furthermore, several risk factors for CAD, including plasma lipid concentrations, blood pressure, and type 2 diabetes mellitus, are themselves heritable and as such contribute to the overall heritability of the CAD/MI phenotype.
Varying patterns of inheritance have provided insights into the genetic underpinnings of CAD. Some forms of CAD demonstrate a simple, Mendelian inheritance pattern, manifest at a young age without the influence of environmental risk factors, and are typified by a single causal gene with a large effect size. Candidate gene studies and linkage analyses have elucidated these monogenic disorders through the study of patients and families with extreme phenotypes to identify causal genes contributing to the disease of interest.
However, the majority of CAD in the population exhibits a more complex and multifactorial inheritance pattern inconsistent with the ratios of Mendel. Such polygenic forms of CAD involve the interplay of many common DNA variants of small to moderate effect sizes, together with nongenetic factors, including both lifestyle and environment. Advancements in high-throughput DNA microarray technologies have permitted the identification of nearly 60 common DNA variants associated with CAD/MI through large-scale genetic association studies, accounting for approximately 13% of the cumulative genetic variance of CAD. Next-generation sequencing technologies and additional studies interrogating potential gene-environment interactions have begun to bridge the gap on the missing heritability of CAD and its risk factors.
Mendelian Causes of Coronary Artery Disease
Examples of Mendelian forms of CAD largely involve gene defects that lead to extremely high plasma concentrations of low-density lipoprotein cholesterol (LDL-C). One such disease is familial hypercholesterolemia (FH), in which defects in the LDL receptor impair cellular uptake of LDL particles from the bloodstream (Table 3.1). Investigations of homozygous FH patients led to the sequencing of the LDL receptor (LDLR) gene and the identification of mutations that result in defective cellular uptake of LDL-C. LDLR mutations are associated with elevated plasma concentrations of LDL-C, typical physical stigmata of severe hypercholesterolemia (ie, tendon xanthomas and corneal arcus; see Fig. 7.4 and Fig. 7.5), and premature coronary atherosclerosis. FH is inherited in a codominant pattern where the number of abnormal allelic copies (1 or 2) correlates directly with the severity of the FH phenotype.
Disease | Causal Gene(s) | Inheritance Pattern | Prevalence | Metabolic Defect |
---|---|---|---|---|
Familial hypercholesterolemia | LDLR, APOB, PCSK9 | Autosomal dominant | HeFH: 1:500; HoFH: 1:1 × 10⁶ | Reduced LDL clearance |
Autosomal recessive hypercholesterolemia | LDLRAP1 | Autosomal recessive | < 1:5 × 10⁶ | Reduced LDL clearance |
Sitosterolemia | ABCG5, ABCG8 | Autosomal recessive | < 1:5 × 10⁶ | Reduced plant sterol clearance |
Subsequent studies in FH patients without LDLR mutations led to the discovery of additional causal mutations in the APOB and PCSK9 genes, which encode for apolipoprotein B (ApoB) and proprotein convertase subtilisin/kexin type 9 (PCSK9), respectively. ApoB is a key protein on the LDL particle that facilitates its binding to the LDL receptor for cellular uptake and degradation. APOB mutations associating with FH were found to interrupt the binding of the ApoB protein to LDL receptor on the cell surface, leading to reduced LDL uptake and higher plasma LDL concentrations.
PCSK9 is highly expressed in the liver and regulates cholesterol homeostasis by binding to the LDL receptor and inducing its degradation. Gain-of-function mutations in the PCSK9 gene are associated with FH presumably due to reduced LDL receptor availability and a resultant decrease in LDL particle uptake. Similar to LDLR and APOB, mutations in PCSK9 demonstrate an autosomal dominant inheritance pattern, with one copy of the mutant allele leading to an FH phenotype. Notably, loss-of-function mutations in PCSK9 are associated with upregulation of LDL receptors, a marked reduction in LDL-C concentrations, and an 88% reduction in CAD risk.
Other Mendelian disorders of hypercholesterolemia mediate CAD but through an autosomal recessive pattern of inheritance. Two aberrant copies of the LDLRAP1 gene are causative for autosomal recessive hypercholesterolemia (ARH), the mechanism of which remains uncertain but appears to involve a defect in an adapter protein that disrupts the interaction between the LDL receptor and clathrin-coated pits. Individuals with ARH manifest an intermediate form of hypercholesterolemia somewhere between that of LDLR heterozygotes and LDLR homozygotes. Sitosterolemia is a rare autosomal recessive disorder of plant sterol metabolism caused by a defect in the genes encoding ATP-binding cassette (ABC) transporter proteins involved in the excretion of dietary plant sterols. The disease shares several clinical features with FH, such as tendon xanthomas and premature development of CAD. However, unlike FH, the disease is characterized by elevated plant sterol levels, whereas total cholesterol levels may be normal.
Attempts to uncover Mendelian forms of CAD/MI independent of the aforementioned lipoprotein pathways have been unsuccessful. A 21-bp deletion within the MEF2A gene (which encodes the myocyte enhancer factor [MEF] 2A transcription factor strongly expressed in the coronary endothelium) was initially identified as a putative autosomal dominant form of CAD/MI in a 21-member family with 13 affected individuals. However, the noted deletion and others in the MEF2A gene failed to segregate with the disease in a subsequent cohort analysis, casting doubt on whether the gene leads to CAD/MI.
Common, Complex Forms of Coronary Artery Disease
Genome-Wide Association Studies
Beyond rare variants of Mendelian disorders that confer exceptional disease risk, common DNA variants (minor allele frequency [MAF] > 0.05) with more modest effect sizes have also been shown to impact CAD risk. Population-based association studies—ie, genome-wide association studies (GWAS)—compare the DNA profiles of CAD cases and control participants free of CAD to detect statistically significant differences. GWAS have been enabled by the systematic classification of millions of single-nucleotide polymorphisms (SNPs) in the human genome and the advent of high-throughput technologies permitting the interrogation of 1 million or more SNPs on a single microarray chip. Due to linkage disequilibrium—the nonrandom association of alleles at different loci—it is possible to cover the entire human genome of certain populations with approximately 500,000 marker SNPs for the detection of common DNA variants.
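The role of linkage disequilibrium in reducing the number of marker SNPs needed can be illustrated with a small calculation. The sketch below uses the standard pairwise measures D and r² for two biallelic loci; the function name and the haplotype and allele frequencies are hypothetical and chosen only for illustration.

```python
from typing import Tuple


def linkage_disequilibrium(p_ab: float, p_a: float, p_b: float) -> Tuple[float, float]:
    """Return (D, r_squared) for two biallelic loci.

    p_ab -- frequency of the haplotype carrying allele A at locus 1 and allele B at locus 2
    p_a  -- frequency of allele A at locus 1
    p_b  -- frequency of allele B at locus 2
    """
    # D is the excess of the observed haplotype frequency over the value
    # expected if the two alleles were inherited independently.
    d = p_ab - p_a * p_b
    # r^2 rescales D^2 by the allele-frequency variances, giving a value in [0, 1].
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r_squared


# Hypothetical frequencies: alleles A and B almost always travel together.
d_linked, r2_linked = linkage_disequilibrium(p_ab=0.28, p_a=0.30, p_b=0.30)

# Hypothetical frequencies: haplotype frequency equals the product of allele
# frequencies, so the loci are in linkage equilibrium.
d_indep, r2_indep = linkage_disequilibrium(p_ab=0.09, p_a=0.30, p_b=0.30)

print(f"linked:      D = {d_linked:.3f}, r^2 = {r2_linked:.3f}")
print(f"independent: D = {d_indep:.3f}, r^2 = {r2_indep:.3f}")
```

When r² between a genotyped marker and an ungenotyped variant is high, the marker serves as a proxy for that variant, which is why a subset of SNPs can capture most common variation.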
In GWAS, large populations are genotyped and allele frequencies of each SNP are compared in cases and controls to test for associations between common variants and a particular phenotype in a relatively unbiased manner. For GWAS of quantitative traits (eg, blood lipid concentrations), analysis is focused on whether SNPs associate with increases or decreases in the specific trait. Given the simultaneous interrogation of up to a million SNPs for association with the disease or quantitative trait, a stringent P-value threshold of 5 × 10⁻⁸ or less is required to achieve genome-wide significance. Accordingly, these studies have relied upon worldwide collaborations to recruit thousands of carefully phenotyped individuals with and without the disease of interest.
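The genome-wide significance threshold corresponds to a conventional significance level of 0.05 divided by roughly one million tests (a Bonferroni correction). The sketch below shows that arithmetic together with a simple allelic chi-square test for a single SNP. The allelic test is one common choice and is an assumption here, since the text does not specify the test statistic; the allele counts are hypothetical.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def allelic_association(case_risk, case_other, ctrl_risk, ctrl_other):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Returns (odds_ratio, chi_square, p_value).
    """
    n = case_risk + case_other + ctrl_risk + ctrl_other
    cross_diff = case_risk * ctrl_other - case_other * ctrl_risk
    chi_square = n * cross_diff ** 2 / (
        (case_risk + case_other)
        * (ctrl_risk + ctrl_other)
        * (case_risk + ctrl_risk)
        * (case_other + ctrl_other)
    )
    odds_ratio = (case_risk * ctrl_other) / (case_other * ctrl_risk)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return odds_ratio, chi_square, p_value


threshold = bonferroni_threshold(alpha=0.05, n_tests=1_000_000)

# Hypothetical SNP: 10,000 cases and 10,000 controls (20,000 alleles each),
# risk-allele frequency 0.52 in cases versus 0.49 in controls.
odds_ratio, chi_square, p_value = allelic_association(10_400, 9_600, 9_800, 10_200)

print(f"threshold = {threshold:.1e}")
print(f"OR = {odds_ratio:.3f}, chi-square = {chi_square:.2f}, p = {p_value:.2e}")
print("genome-wide significant:", p_value < threshold)
```

The example also illustrates why large cohorts are needed: a frequency difference of only three percentage points reaches the threshold here because of the 20,000 participants assumed.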
Genome-Wide Association Studies of Coronary Artery Disease/Myocardial Infarction
The first locus associated with CAD at a level of genome-wide significance was reported concurrently in 2007 by three independent groups employing distinct cohorts and genotype arrays. All three studies demonstrated a 58-kb interval on chromosome 9p21 containing multiple index SNPs strongly associated with CAD with high allele frequency and robust effect size. Approximately 20% to 25% of the population were found to be homozygous for the variant, with homozygosity conferring a greater than 60% increase in risk of CAD. The locus has also been associated with the extent and severity of CAD, as increased allele frequency has been reported among patients with premature, as well as multivessel, disease. Of note, it has been repeatedly demonstrated that the 9p21 locus is not associated with traditional CAD risk factors such as plasma lipids, blood pressure, diabetes, older age, or obesity. Furthermore, the 58-kb block does not harbor any annotated genes, which renders unclear the exact mechanism by which the locus confers an elevated risk of CAD. However, studies have associated the 9p21 locus with other vascular phenotypes including carotid atherosclerosis, abdominal aortic aneurysm, peripheral artery disease, and intracranial aneurysm, suggesting a pathogenic process related to vessel wall integrity.
Subsequent meta-analyses of GWAS have involved international collaborations such as the Myocardial Infarction Genetics Consortium (MIGen), the Coronary ARtery DIsease Genome-Wide Replication and Meta-Analysis (CARDIoGRAM) consortium, the Coronary Artery Disease Genetics Consortium (C4D), and the combined CARDIoGRAMplusC4D consortium. Together, these large cohorts identified 48 common variants attaining genome-wide significance for association with CAD. Whereas several of these CAD risk loci include genes linked to lipoprotein metabolism, hypertension, and other related pathways, a large proportion lie in gene regions not previously implicated in CAD pathogenesis. As expected for a complex phenotype with a multifactorial origin, most of these common variants have relatively small effect sizes, with only two of the susceptibility loci, the 9p21 locus and the LPA gene (which encodes lipoprotein(a)), conferring a greater than 15% increase in the risk of CAD.
The previous analyses were restricted to common SNPs (MAF > 0.05) derived from the International HapMap project. A GWAS published in 2015 leveraged more extensive human genetic data from the 1000 Genomes Project including lower frequency and insertion/deletion variants (indels). This GWAS meta-analysis comprised over 185,000 CAD cases and controls and interrogated 6.7 million common variants, as well as 2.7 million low-frequency (MAF = 0.005–0.05) variants. The study confirmed the majority of known CAD susceptibility loci and identified eight novel loci associated with CAD at a genome-wide level of significance, bringing the total number of replicated CAD susceptibility loci to 56 and accounting for approximately 13% of the overall heritability of CAD (Table 3.2). Of note, the eight novel CAD risk loci and all but one of the previously identified loci were represented by risk alleles with MAF greater than 0.05. This suggests that low-frequency variants and insertion/deletion polymorphisms do not contribute significantly to the missing heritability of this complex disease, further supporting the common disease-common variant hypothesis for CAD.
Chr | Locus Name | Lead SNP | EAF | Risk of CAD (OR) | Association of Gene Variant with Traditional Risk Factors |
---|---|---|---|---|---|
1p32 | PPAP2B | rs17114036 | 0.92 | 1.13 | |
1p32 | PCSK9 | rs11206510 | 0.85 | 1.08 | LDL |
1p13 | SORT1 | rs7528419 | 0.79 | 1.12 | LDL, HDL |
1q21 | IL6R | rs4845625 | 0.45 | 1.06 | |
1q41 | MIA3 | rs17465637 | 0.66 | 1.08 | |
2p24 | APOB | rs515135 | 0.75 | 1.07 | LDL |
2p21 | ABCG5-ABCG8 | rs6544713 | 0.32 | 1.05 | LDL |
2p11 | VAMP5-VAMP8-GGCX | rs1561198 | 0.46 | 1.06 | |
2q22 | ZEB2 | rs2252641 | 0.44 | 1.03 | |
2q33 | WDR12 | rs6725887 | 0.11 | 1.14 | LDL |
3q22 | MRAS | rs9818870 | 0.14 | 1.07 | |
4q31 | EDNRA | rs1878406 | 0.16 | 1.06 | |
4q32 | GUCY1A3 | rs7692387 | 0.81 | 1.07 | BP |
4q12 | REST-NOA1 | rs17087335 | 0.21 | 1.06 | |
5q31 | SLC22A4-SLC22A5 | rs273909 | 0.12 | 1.06 | LDL |
6p21 | ANKS1A | rs17609940 | 0.82 | 1.03 | |
6p24 | PHACTR1 | rs9369640 | 0.43 | 1.14 | |
6p21 | KCNK5 | rs10947789 | 0.78 | 1.05 | |
6q23 | TCF21 | rs12190287 | 0.62 | 1.06 | |
6q25 | SLC22A3-LPAL2-LPA | rs2048327; rs3789220 | 0.35; 0.02 | 1.06; 1.42 | LDL |
6q26 | PLG | rs4252120 | 0.74 | 1.03 | |
7p21 | HDAC9 | rs2023938 | 0.10 | 1.06 | |
7q22 | 7q22 | rs10953541 | 0.78 | 1.05 | |
7q32 | ZC3HC1 | rs11556924 | 0.69 | 1.08 | HDL, BP |
7q36 | NOS3 | rs3918226 | 0.06 | 1.14 | BP |
8p21 | LPL | rs264 | 0.85 | 1.06 | HDL, TG |
8q24 | TRIB1 | rs2954029 | 0.55 | 1.04 | LDL, HDL, TG |
9p21 | CDKN2BAS1 | rs4977574; rs3217992 | 0.49; 0.39 | 1.21; 1.14 | |
9q34 | ABO | rs579459 | 0.21 | 1.08 | LDL |
10p11 | KIAA1462 | rs2505083 | 0.40 | 1.06 | |
10p11 | CXCL12 | rs501120; rs2047009 | 0.81; 0.48 | 1.08; 1.06 | |
10q23 | LIPA | rs11203042; rs1412444 | 0.45; 0.37 | 1.04; 1.07 | |
10q24 | CYP17A1-CNNM2-NT5C2 | rs12413409 | 0.89 | 1.08 | BP, BMI |
11p15 | SWAP70 | rs10840293 | 0.55 | 1.06 | |
11q22 | PDGFD | rs974819 | 0.33 | 1.07 | |
11q23 | ZNF259-APOA5-APOA1 | rs964184 | 0.18 | 1.05 | LDL, HDL, TG |
12q24 | SH2B3 | rs3184504 | 0.42 | 1.07 | LDL, HDL, BP, BMI |
12q21 | ATP2B1 | rs7136259 | 0.43 | 1.04 | |
12q24 | KSR2 | rs11830157 | 0.36 | 1.12 | |
13q12 | FLT1 | rs9319428 | 0.31 | 1.04 | |
13q34 | COL4A1-COL4A2 | rs4773144; rs9515203 | 0.43; 0.76 | 1.05; 1.07 | |
14q32 | HHIPL1 | rs2895811 | 0.41 | 1.04 | |
15q25 | ADAMTS7 | rs7173743 | 0.56 | 1.08 | |
15q22 | SMAD3 | rs56062135 | 0.79 | 1.07 | |
15q26 | MFGE8-ABHD2 | rs8042271 | 0.90 | 1.10 | |
15q26 | FURIN-FES | rs17514846 | 0.44 | 1.05 | BP |
17q23 | BCAS3 | rs7212798 | 0.15 | 1.08 | |
17p11 | RAI1-PEMT-RASD1 | rs12936587 | 0.61 | 1.03 | |
17p13 | SMG6 | rs2281727 | 0.35 | 1.05 | BMI |
17q21 | UBE2Z | rs15563 | 0.51 | 1.04 | |
18q21 | PMAIP1-MC4R | rs663129 | 0.26 | 1.06 | HDL, BMI |
19p13 | LDLR | rs1122608 | 0.77 | 1.08 | LDL |
19q13 | APOE-APOC1 | rs2075650; rs445925 | 0.13; 0.09 | 1.07; 1.09 | LDL, HDL, TG, BMI |
19q13 | ZNF507-LOC400684 | rs12976411 | 0.09 | 0.67 | |
21q22 | KCNE2 | rs9982601 | 0.13 | 1.12 | |
22q11 | POM121L9P-ADORA2A | rs180803 | 0.97 | 1.20 | |
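Because each locus in Table 3.2 carries a small per-allele odds ratio, the combined contribution of several loci is often summarized as a weighted genetic risk score. The sketch below is a minimal illustration under a log-additive model with independent loci, which is an assumption not stated in the table. It uses three lead SNPs and odds ratios copied from Table 3.2; the genotype in the example is hypothetical.

```python
import math

# Per-allele odds ratios for three lead SNPs, taken from Table 3.2.
ODDS_RATIOS = {
    "rs4977574": 1.21,   # 9p21 (CDKN2BAS1)
    "rs11206510": 1.08,  # 1p32 (PCSK9)
    "rs7528419": 1.12,   # 1p13 (SORT1)
}


def genetic_risk_score(risk_allele_counts):
    """Sum of risk-allele counts (0, 1, or 2) weighted by ln(odds ratio).

    Assumes a log-additive (multiplicative) model and independent loci.
    """
    score = 0.0
    for snp, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: allele count must be 0, 1, or 2")
        score += count * math.log(ODDS_RATIOS[snp])
    return score


def relative_odds(score):
    """Odds of CAD relative to a person carrying no risk alleles at these loci."""
    return math.exp(score)


# Hypothetical individual: homozygous at 9p21, heterozygous at PCSK9,
# and no risk alleles at SORT1.
genotype = {"rs4977574": 2, "rs11206510": 1, "rs7528419": 0}
score = genetic_risk_score(genotype)

print(f"score = {score:.4f}, relative odds = {relative_odds(score):.3f}")
```

The result is a relative measure only. Converting it to absolute risk would require baseline incidence and the remaining risk factors, neither of which is modeled here.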
Genome-Wide Association Studies of Coronary Artery Disease Risk Factors
Common DNA variants impact several prominent CAD risk factors in a similar, polygenic manner, with GWAS over the past decade identifying numerous genetic loci associated with traits such as blood lipid levels, blood pressure (BP), type 2 diabetes, and some nontraditional CAD risk factors such as C-reactive protein (CRP).
The first reported GWAS of blood lipid concentrations assessed 2800 individuals genotyped at approximately 400,000 SNPs from the Diabetes Genetics Initiative and identified three loci reaching genome-wide significance, one associated with each of the three lipid traits: LDL-C, HDL-C, and triglyceride levels. Two of these common variants were mapped to known lipid regulators, thereby validating the GWAS approach. Specifically, the index SNP for LDL-C was mapped to a region near APOE (encoding the apolipoprotein that mediates cellular uptake of chylomicrons and very-low-density lipoprotein [VLDL]), and the index SNP for HDL-C was found near CETP (encoding cholesteryl ester transfer protein [CETP], a component that facilitates transfer of cholesteryl esters from HDL to other lipoproteins). In addition, the GWAS identified an index SNP associated with triglyceride levels within GCKR, which encodes glucokinase regulatory protein.
Additional GWAS have expanded the number of known lipid-related loci. In 2010, a large-scale GWAS of approximately 100,000 individuals identified 95 loci contributing to plasma LDL-C, HDL-C, and triglyceride levels, and a subsequent study of approximately 190,000 persons increased the loci count to 157. Of these loci, 9 demonstrated the strongest associations with LDL-C, 46 with HDL-C, 16 with triglycerides, and 18 with total cholesterol; numerous loci affected multiple lipid fractions. Among the discovered loci are those containing other well-characterized lipid regulators, including APOB, PCSK9, LDLR, LPL (encoding lipoprotein lipase, which is involved in triglyceride metabolism), and HMGCR (encoding 3-hydroxy-3-methylglutaryl-coenzyme A reductase, the pharmacologic target of statins). Several loci are known to harbor rare mutations involved in monogenic lipid disorders such as FH. Notably, many of these causal genes for Mendelian disorders also have common variants that induce subtle effects on gene function, resulting in more modest changes in plasma lipid levels.
Large genetic association studies of BP have identified 29 independent genetic variants associated with continuous BP and dichotomous hypertension at genome-wide levels of significance. The majority of these variants affect systolic blood pressure (SBP) and diastolic blood pressure (DBP) in a concordant direction, although variants at three loci are reported to have discordant effects. In most cases, the potential mechanistic links between each gene and the BP phenotype remain unclear. As compared to genetic variants for other CAD risk factors, variants identified for BP appear to exert less influence on the overall phenotype, as the 29 identified variants account for less than 1% of the variation in SBP and DBP. Notably, in a GWAS of 200,000 individuals, an aggregate genetic risk score composed of the aforementioned genetic variants correlated positively with phenotypes such as CAD and stroke but was not associated with chronic kidney disease or measures of kidney function, suggesting a strong causal relationship between elevated BP and cardiovascular disease but not between elevated BP and renal dysfunction.
Type 2 diabetes mellitus has been studied extensively through several GWAS, including early discovery of an association with TCF7L2 (encoding a transcription factor that regulates proglucagon gene expression in the gastrointestinal tract) through a relatively small GWAS of 2000 cases and 3000 controls. Subsequent larger meta-analyses have increased the total number of type 2 diabetes loci to 63, accounting for approximately 6% of the variation in disease risk. Several of these risk loci harbor genes that alter the processing and secretion of insulin by the pancreatic beta cell, whereas a smaller proportion of genes appear to mediate insulin resistance. SNPs at these loci have therefore been associated with fasting glucose and insulin levels as well as other metabolic traits, such as lipids and adiposity. Of note, variants in the risk loci for type 2 diabetes have little overlap with those for type 1 diabetes and are poor predictors of the latter disease process.
Several other risk biomarkers, including CRP, have also been studied via GWAS, with identification of many associated risk loci. CRP is a well-described inflammatory biomarker predictive of CAD/MI in epidemiologic cohorts. GWAS have identified at least 18 loci associated with circulating levels of CRP, including SNPs in the CRP gene encoding the protein of interest. Other annotated genes at the identified risk loci directly or indirectly involve immune response pathways, as well as various metabolic regulatory pathways implicated in diabetes mellitus.
Studies of Causal Inference—Mendelian Randomization
Since the initial association of total plasma cholesterol and CAD risk, observational epidemiologic studies have identified numerous additional soluble biomarkers associated with CAD. However, due to confounding and reverse causation, observational epidemiology as an approach is inherently limited in ability to draw causal inference. Distinguishing causal from noncausal biomarkers is particularly relevant for therapy, as only causal biomarkers have potential as therapeutic targets.
A study design termed Mendelian randomization affords the ability to assess causality by leveraging Mendel’s law of allele segregation—the random assortment of genetic variants to offspring at the time of conception. This principle results in a natural randomization process akin to that of randomized clinical trials and lessens concerns of confounding and reverse causation. DNA variants associated with the biomarker are utilized as instruments to assess whether an established epidemiologic association between a biomarker and disease reflects a causal relationship. The methodology relies on the premise that if a biomarker is causal for a disease, then the genetic determinants of that biomarker should also be associated with the disease. Furthermore, the magnitude of association should be commensurate with the known effect sizes of the DNA variant on the biomarker and the biomarker on the disease. Presuming adequate study power, lack of an association between a biomarker-related DNA variant (or set of variants) and the disease suggests that the given biomarker is not causal for disease pathogenesis.
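The requirement that the variant-disease association be commensurate with the other two effect sizes can be written as simple arithmetic. The sketch below first computes the variant-disease association expected if the biomarker were causal, then computes a ratio (Wald-type) estimate of the causal effect from an observed variant-disease association. The ratio estimator is a standard single-variant approach and is named here as an assumption, since the text does not specify an estimator; all numeric inputs are hypothetical.

```python
import math


def expected_variant_disease_log_or(beta_variant_biomarker, log_or_per_unit_biomarker):
    """Variant-disease log odds ratio predicted if the biomarker is causal.

    beta_variant_biomarker    -- change in biomarker per allele (eg, in SD units)
    log_or_per_unit_biomarker -- ln(OR) for disease per unit of biomarker
    """
    return beta_variant_biomarker * log_or_per_unit_biomarker


def wald_ratio(beta_variant_biomarker, beta_variant_disease, se_variant_disease):
    """Ratio estimate of the causal effect of the biomarker on disease.

    Returns (estimate, standard_error) on the log-odds scale per unit of
    biomarker. The standard error is a first-order approximation that ignores
    uncertainty in the variant-biomarker estimate.
    """
    estimate = beta_variant_disease / beta_variant_biomarker
    standard_error = se_variant_disease / abs(beta_variant_biomarker)
    return estimate, standard_error


# Hypothetical inputs: each allele raises the biomarker by 0.2 SD, and the
# observational odds ratio for disease is 1.5 per 1-SD higher biomarker.
expected = expected_variant_disease_log_or(0.2, math.log(1.5))
print(f"expected per-allele OR if causal: {math.exp(expected):.3f}")

# Hypothetical observed variant-disease association: ln(OR) = 0.10, SE = 0.02.
estimate, se = wald_ratio(0.2, 0.10, 0.02)
low, high = estimate - 1.96 * se, estimate + 1.96 * se
print(f"causal OR per SD: {math.exp(estimate):.2f} "
      f"(95% CI {math.exp(low):.2f} to {math.exp(high):.2f})")
```

A causal estimate close to the observational estimate supports causality, whereas an observed variant-disease association near zero despite adequate power argues against it.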
The Mendelian randomization approach has several limitations. Importantly, these studies rely on the effect size estimates used for the variant-biomarker and biomarker-disease associations, placing great emphasis on the studies from which these estimates are derived. Furthermore, it is imperative that the variants included in the genetic instrument affect the disease only through the biomarker of interest. The existence of pleiotropy—where a single gene affects a number of unrelated phenotypic traits—undermines the determination of causality, as a proposed causal biomarker may serve as proxy for a separate pathologic mechanism influenced by the genetic variants used.
Over the past decade, Mendelian randomization studies have systematically evaluated which plasma biomarkers causally relate to CAD risk. These studies have provided supportive evidence for LDL-C, triglycerides, and Lp(a) as causal CAD risk factors. In contrast, these studies have cast doubt on HDL and CRP as causal factors for CAD.
Low-Density Lipoprotein
Initial insights into the causal relationship between LDL-C and CAD were derived from studies of patients with FH. As described previously, pathogenic FH mutations in LDLR, APOB, and PCSK9 mediate increased plasma LDL-C concentrations and are associated with premature CAD, providing strong evidence for a causal link between LDL-C and CAD risk. A formal Mendelian randomization experiment was performed in 50,000 cases and controls employing a genetic score for LDL-C composed of 13 SNPs exclusively associated with LDL-C. Notably, genetically elevated LDL-C (a 1-standard deviation increase, ∼35 mg/dL) was associated with a 113% increase in risk of MI, exceeding the 54% increase in MI risk per 1-standard deviation increase in LDL-C expected from epidemiologic estimates (Fig. 3.1). These results affirmed the causal nature of the association between LDL-C and CAD/MI. In addition, these data suggest that cumulative, lifelong exposure to LDL-C (as captured by the genetic risk score) is particularly harmful.
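The comparison between the genetic and epidemiologic estimates can be made explicit on the log-odds scale. The sketch below assumes that the quoted percentage increases correspond to odds ratios of 2.13 and 1.54 per 1-standard deviation increase in LDL-C; that reading, and the function names, are assumptions made for illustration.

```python
import math

# Assumed odds ratios per 1-SD higher LDL-C, inferred from the quoted
# "113% increase" (genetic) and "54% increase" (epidemiologic) in MI risk.
GENETIC_OR_PER_SD = 2.13
OBSERVATIONAL_OR_PER_SD = 1.54


def percent_increase(odds_ratio):
    """Express an odds ratio as a percentage increase over the reference."""
    return (odds_ratio - 1) * 100


def log_odds_ratio_ratio(odds_ratio_a, odds_ratio_b):
    """How many times larger effect A is than effect B on the log-odds scale."""
    return math.log(odds_ratio_a) / math.log(odds_ratio_b)


print(f"genetic:       +{percent_increase(GENETIC_OR_PER_SD):.0f}%")
print(f"observational: +{percent_increase(OBSERVATIONAL_OR_PER_SD):.0f}%")

ratio = log_odds_ratio_ratio(GENETIC_OR_PER_SD, OBSERVATIONAL_OR_PER_SD)
print(f"genetic effect is about {ratio:.2f} times the observational effect (log-odds scale)")
```

Under these assumptions the genetic effect per standard deviation is larger than the observational one, consistent with the interpretation that lifelong exposure carries more risk than a single LDL-C measurement in adulthood would suggest.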