Key Points
- Myocardial infarction, especially early-onset myocardial infarction, and blood lipid concentrations are partly heritable traits.
- In genome-wide association studies of blood lipid concentrations, more than 30 chromosome regions associated with these traits have been identified.
- Genome-wide association studies have been performed for other risk factors for cardiovascular disease, including blood pressure, diabetes mellitus, and C-reactive protein.
- In genome-wide association studies of myocardial infarction and coronary artery disease, more than 12 associated chromosome loci, many of which are not linked to traditional cardiovascular risk factors, have been identified.
- Genetic risk scores that account for DNA variants associated with abnormal lipid levels or myocardial infarction are modestly predictive of disease but do not add to risk discrimination.
- The clinical utility of genetic markers to predict an individual’s risk for cardiovascular disease remains to be defined.
Heritability of Cardiovascular Disease
Coronary heart disease (CHD) and myocardial infarction (MI) are among the leading causes of death and infirmity worldwide. Traditional risk factors for MI include age, blood lipid concentrations, blood pressure, diabetes mellitus, and tobacco use. Family history is also an important risk factor for MI; individuals in the offspring cohort of the Framingham Heart Study who had at least one parent with early-onset cardiovascular disease (age at onset <55 in men and <65 in women) had a more than twofold increase in age-adjusted risk of suffering a cardiovascular event in comparison with individuals with no such family history. This increase in risk persisted even after adjustment for multiple traditional risk factors, which implies a genetic basis for the increased risk. Early-onset MI appears to be particularly heritable, which is suggestive of the importance of inherited risk factors for early manifestation of the disease, as opposed to “acquired” risk factors, such as age and tobacco use, that predispose to MI later in life.
Some of the heritability of MI can be attributed to heritability of various MI risk factors. As much as half of the interindividual variability in blood lipid concentrations—low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), and triglycerides—appears to result from inherited factors. Blood pressure and type 2 diabetes mellitus also appear to have substantial heritability.
The evidence for strong heritable components of MI and some of its risk factors has motivated the search for genetic loci that account for this heritability. In principle, investigation of all of the underlying genetic loci would enable researchers to quantify the level of inherited risk for each individual, which should greatly improve cardiovascular risk prediction. With the completion of the Human Genome Project and International Haplotype Map Project, it has become possible to perform large-scale genome-wide screens of common DNA sequence variants for association with phenotypes of interest; this approach is termed genome-wide association (GWA). Successful GWA studies have been performed for many clinical traits and diseases, including cardiovascular disease.
This chapter focuses primarily on GWA studies and the clinical implications of their results. A large body of work on the genetics of myocardial infarction and cardiovascular risk factors, which preceded the advent of the GWA approach and relied on approaches such as linkage analyses and candidate gene studies, is summarized in Chapter 8.
Genome-Wide Association Studies
GWA studies are designed to detect common DNA variants—those distributed widely in a given population, in contrast to rare mutations that exist in only a few individuals—that are associated with traits or diseases. For each of the traits and diseases that have been shown to be at least partly heritable, it is presumed that there are specific “causal” DNA variants that affect gene function and thereby contribute to the phenotype. Other common DNA variants that are noncausal but are located very close to a causal DNA variant “mark” the latter; variants that are in close proximity on a chromosome often remain linked to one another through many human generations, rather than becoming uncorrelated by the effects of homologous recombination that occurs during meiosis. In principle, in European populations, it is possible to cover the entire genome and detect any common causal DNA variants with about 500,000 “marker” DNA variants. (This number varies among ethnic groups because of differences in correlation structure among DNA variants in distinct ancestral populations.)
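The degree to which a marker variant "marks" a nearby causal variant is commonly summarized by the squared correlation r² between the two sites. The following is a minimal sketch of that calculation from haplotype counts; the function name and the counts are hypothetical and chosen only for illustration, not taken from any study discussed here.

```python
def ld_r_squared(n_AB, n_Ab, n_aB, n_ab):
    """Return r^2 between two biallelic variants from haplotype counts.

    A/a are the alleles at the marker SNP and B/b the alleles at a
    (hypothetical) causal variant; n_AB counts chromosomes carrying A and B.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / total   # frequency of allele A
    p_B = (n_AB + n_aB) / total   # frequency of allele B
    p_AB = n_AB / total           # frequency of the A-B haplotype
    d = p_AB - p_A * p_B          # disequilibrium coefficient D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("both variants must be polymorphic")
    return d * d / denom


# Hypothetical counts: alleles A and B almost always travel together.
tight = ld_r_squared(480, 20, 20, 480)
# Hypothetical counts: the four haplotypes are equally common.
independent = ld_r_squared(250, 250, 250, 250)
print(tight, independent)
```

With these illustrative counts, the first pair of variants has an r² of about 0.85, so genotyping the marker captures most of the information about the causal variant; the second pair has an r² of 0, so the marker is uninformative.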
Thus, GWA studies have been made possible by the cataloging of more than 3 million single-nucleotide polymorphisms (SNPs) in the human genome. In GWA studies, hundreds of thousands of SNPs are interrogated by genotyping arrays, and the variants at these SNPs are determined (a typical SNP has two possible variant alleles). This genome-wide genotyping is performed for thousands of individuals. For diseases, the study includes individuals with the disease and healthy control individuals; for quantitative traits such as blood lipid concentrations, the study cohort comprises people representing the full range of values for the trait.
Statistical analyses are performed to determine whether variants at any of the SNPs are associated with disease status or changes (higher or lower) in the quantitative trait. Because hundreds of thousands of SNPs are being used, each of which can be regarded as a unique statistical experiment, a corrected P value threshold of 5 × 10⁻⁸ (rather than the usual 0.05) is used to determine statistical significance. Any SNP meeting this stringent criterion (an “index” SNP on a chromosomal locus) is considered to be associated with the phenotype, although causality cannot be inferred because the SNP may simply be a marker for a nearby causal DNA variant.
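The logic of a single-SNP test and of the corrected threshold can be sketched as follows. This is an illustration under stated assumptions, not the method of any particular study: it uses a simple allelic chi-square test on a 2 × 2 table of allele counts (published GWA studies more often use regression models with covariates), the allele counts are hypothetical, and the figure of one million independent tests is the conventional rationale for the 5 × 10⁻⁸ threshold rather than a number given in this chapter.

```python
import math

NOMINAL_ALPHA = 0.05
ASSUMED_INDEPENDENT_TESTS = 1_000_000  # assumption: conventional rationale


def bonferroni_threshold(alpha, n_tests):
    """Per-test P value threshold that keeps the overall error rate near alpha."""
    return alpha / n_tests


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on allele counts.

    Returns the test statistic and its P value.
    """
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    numerator = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
    denominator = (
        (case_alt + case_ref)
        * (ctrl_alt + ctrl_ref)
        * (case_alt + ctrl_alt)
        * (case_ref + ctrl_ref)
    )
    chi2 = numerator / denominator
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


threshold = bonferroni_threshold(NOMINAL_ALPHA, ASSUMED_INDEPENDENT_TESTS)

# Hypothetical SNP: 2000 cases and 2000 controls (4000 alleles each).
chi2, p_value = allelic_chi_square(1400, 2600, 1200, 2800)
print(f"threshold={threshold:.1e} chi2={chi2:.2f} p={p_value:.1e}")
print("nominally significant:", p_value < NOMINAL_ALPHA)
print("genome-wide significant:", p_value < threshold)
```

In this hypothetical example the SNP would pass the usual 0.05 cutoff comfortably but would fall short of the genome-wide threshold, which shows why large sample sizes are needed to detect variants of modest effect.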
Genome-wide Association Studies of Blood Lipid Concentrations
In the first reported GWA study for blood lipid concentrations, the investigators used data from nearly 3000 individuals in the Diabetes Genetics Initiative. This initial study identified SNPs in three loci at genome-wide significance (P < 5 × 10⁻⁸), one for each of the three lipid traits: LDL-C, HDL-C, and triglyceride levels. The index SNP for LDL-C was near the APOE gene (which encodes the apolipoprotein E protein, a component responsible for cellular uptake of large lipoprotein particles such as chylomicrons and very low-density lipoproteins), and the index SNP for HDL-C was near the CETP gene (which encodes the cholesteryl ester transfer protein, a component responsible for facilitating the transfer of cholesteryl esters from HDL to other lipoproteins). Thus, this first GWA study provided internal validation of the technique by mapping common DNA variants in known lipid regulators.
In addition, the GWA study identified a triglyceride level–associated locus that harbored no genes previously known to be involved in lipoprotein metabolism. The index SNP for triglycerides was in an intron of GCKR (which encodes glucokinase regulatory protein), and results of subsequent analyses suggested that a coding missense variant (i.e., an alteration of a single amino acid) is responsible for the association with triglyceride levels.
A second set of lipid GWA studies built upon data from the first; the Finland-United States Investigation of NIDDM Genetics (FUSION) study and the SardiNIA Project, added to the Diabetes Genetics Initiative, included a total of almost 9000 individuals. In order to increase the power to detect statistically significant (P < 5 × 10⁻⁸) associations, the top-scoring SNPs in the initial 9000 participants were genotyped in more than 18,000 additional individuals from other cohorts. This staged approach revealed a total of 19 loci associated with one or more of the three lipid traits. In addition to the three loci already identified, these studies revealed loci containing well-characterized lipid regulators, including APOB (apolipoprotein B), APOA1 (apolipoprotein A-I), LDLR (LDL receptor), PCSK9 (proprotein convertase subtilisin/kexin type 9), LPL (lipoprotein lipase), and HMGCR (3-hydroxy-3-methylglutaryl–coenzyme A reductase). The last is of particular note because it is the drug target of the widely used statin class of LDL-C–lowering medications. These studies also identified six novel loci whose causal genes have yet to be characterized. Two of these novel loci were confirmed in simultaneously published, independent GWA studies on LDL-C (on chromosome 1p13) and triglyceride levels (on chromosome 7q11).
In a third wave of even larger GWA studies, genotyping was performed in up to 40,000 individuals from various prospective cohort studies, case-control studies (for conditions such as diabetes and coronary disease), and family-based studies. These studies identified more than 30 lipid-associated loci, of which about half harbor established lipid regulators (Table 4-1). A notable finding of these studies is that genes in 11 of the loci are known to harbor rare mutations that cause monogenic (mendelian) lipid disorders, such as familial hypercholesterolemia (see Table 4-1). These rare mutations have large effects on gene function, which leads to a phenotype (such as premature MI) that comes to clinical attention.
Unique Locus | Trait | Chr | SNP | Sample Size | P Value | Gene(s) of Interest within or near Associated Interval | Associated Mendelian Lipid Disorder |
---|---|---|---|---|---|---|---|
1 | LDL | 1p13 | rs12740374 | 19,648 | 2 × 10⁻⁴² | CELSR2-PSRC1-SORT1 | — |
2 | LDL | 2p24 | rs515135 | 19,648 | 5 × 10⁻²⁹ | APOB | Familial hypercholesterolemia |
3 | LDL | 19q13 | rs4420638 | 11,881 | 4 × 10⁻²⁷ | APOE-APOC1-APOC4-APOC2 | Type III hyperlipoproteinemia |
4 | LDL | 19p13 | rs6511720 | 19,648 | 2 × 10⁻²⁶ | LDLR | Familial hypercholesterolemia |
5 | LDL | 2p21 | rs6544713 | 23,456 | 2 × 10⁻²⁰ | ABCG5-ABCG8 | Sitosterolemia |
6 | LDL | 5q13 | rs3846663 | 19,648 | 8 × 10⁻¹² | HMGCR | — |
7 | LDL | 5q23 | rs1501908 | 27,280 | 1 × 10⁻¹¹ | TIMD4-HAVCR1 | — |
8 | LDL | 20q12 | rs6102059 | 28,895 | 4 × 10⁻⁹ | MAFB | — |
9 | LDL | 7p15 | rs12670798 | 17,797 | 6 × 10⁻⁹ | DNAH11 | — |
10 | LDL | 12q24 | rs2650000 | 39,340 | 2 × 10⁻⁸ | HNF1A | — |
11 | LDL | 1p32 | rs11206510 | 19,629 | 4 × 10⁻⁸ | PCSK9 | Familial hypercholesterolemia |
12 | HDL | 16q13 | rs1532624 | 21,412 | 9 × 10⁻⁹⁴ | CETP | CETP deficiency |
13 | HDL | 15q22 | rs1532085 | 21,412 | 1 × 10⁻³⁵ | LIPC | Hepatic lipase deficiency |
14 | HDL | 16q22 | rs2271293 | 21,412 | 8 × 10⁻¹⁶ | CTCF-PRMT8-LCAT | LCAT deficiency |
15 | HDL | 18q21 | rs4939883 | 19,785 | 7 × 10⁻¹⁵ | LIPG | — |
16 | HDL | 9q31 | rs3905000 | 21,412 | 9 × 10⁻¹³ | ABCA1 | Tangier disease |
17 | HDL | 11p11 | rs7395662 | 21,412 | 6 × 10⁻¹¹ | MADD-FOLH1 | — |
18 | HDL | 9p22 | rs471364 | 40,414 | 3 × 10⁻¹⁰ | TTC39B | — |
19 | HDL | 20q13 | rs1800961 | 30,714 | 8 × 10⁻¹⁰ | HNF4A | — |
20 | HDL | 12q24 | rs2338104 | 19,793 | 1 × 10⁻¹⁰ | MMAB-MVK | — |
21 | HDL | 19p13 | rs2967605 | 35,151 | 1 × 10⁻⁸ | ANGPTL4 | — |
22 | HDL | 1q42 | rs4846914 | 19,794 | 4 × 10⁻⁸ | GALNT2 | — |
23 | TG | 11q23 | rs964184 | 19,840 | 4 × 10⁻⁶² | APOA1-APOC3-APOA4-APOA5 | Primary hypoalphalipoproteinemia |
24 | TG | 8p21 | rs12678919 | 19,840 | 2 × 10⁻⁴¹ | LPL | Familial hyperchylomicronemia |
25 | TG | 2p23 | rs1260326 | 19,840 | 2 × 10⁻³¹ | GCKR | — |
26 | TG | 8q24 | rs2954029 | 19,840 | 3 × 10⁻¹⁹ | TRIB1 | — |
27 | TG | 7q11 | rs714052 | 19,840 | 3 × 10⁻¹⁵ | MLXIPL | — |
28 | TG | 11q12 | rs174547 | 38,846 | 2 × 10⁻¹⁴ | FADS1-FADS2-FADS3 | — |
29 | TG | 1p31 | rs1167998 | 17,815 | 2 × 10⁻¹² | DOCK7-ANGPTL3 | — |
30 | TG | 19p13 | rs17216525 | 19,840 | 4 × 10⁻¹¹ | NCAN-CILP2-PBX4 | — |
31 | TG | 20q13 | rs7679 | 38,561 | 7 × 10⁻¹¹ | PLTP | — |
32 | TG | 8p23 | rs7819412 | 33,336 | 3 × 10⁻⁸ | XKR6-AMAC1L2 | — |
One lesson from the GWA studies is that the same genes that cause mendelian disorders also have common variants that have more subtle effects on gene function and lead to small changes in lipid levels. GWA studies have been criticized for being able to discover only common variants that have little clinical importance; however, a GWA-identified gene can prove to be highly clinically relevant if the gene’s activity is modulated to a large degree, either by a naturally occurring rare mutation in an individual or a family or by deliberate targeting of the gene with a pharmacologic agent. A case in point is HMGCR: If statins had not been discovered before the GWA era, the finding that common variants in HMGCR lead to modest changes in LDL-C would have suggested inhibition of 3-hydroxy-3-methylglutaryl–coenzyme A reductase as a potential new therapeutic strategy. By this reasoning, some of the more than 15 novel GWA loci discovered to date may harbor clinically useful drug targets and, thus, merit functional investigation.
Increasingly large GWA studies with more than 100,000 participants of European descent (e.g., by the Global Lipids Genetics Consortium), as well as GWA studies in other ethnic groups (e.g., African Americans in the National Heart, Lung, and Blood Institute Candidate Gene Association Resource [NHLBI CARe]), are expected to uncover dozens more novel loci. Functional investigation of these loci is expected to reveal numerous causal genes, which should greatly enhance the understanding of lipoprotein metabolism and perhaps eventually lead to the development of new lipid-modifying medications.
Genome-wide Association Studies of Other Risk Factors for Myocardial Infarction
GWA studies have been performed for a number of cardiovascular risk factors besides blood lipid concentrations. Studies on blood pressure have identified more than a dozen loci with common DNA variants that are associated significantly (P < 5 × 10⁻⁸) with systolic blood pressure or diastolic blood pressure (Table 4-2). However, the effects of each SNP on blood pressure are quite small, in no case exceeding a 1 mm Hg change per allele (see Table 4-2), and in most cases, potential functional links between the genes in each locus and the phenotype remain obscure.