Many scientists have attempted to measure the aging of cells, and epigenetic clocks have proven successful at doing so. These clocks focus on effects on gene activity that are not encoded in the DNA sequence: they alter how genes are expressed without altering the sequence itself. Epigenetic clocks can estimate the biological ages of tissues, cell types, and organs, and they are expected to be useful to researchers for many reasons. They can test the validity of theories of biological aging. They can help diagnose age-related diseases and define cancer subtypes, and they may be able to predict the onset of disease. Epigenetic markers may also help evaluate therapies, including rejuvenation approaches. They will aid in studies of developmental biology and cell differentiation, and they may even have applications in forensics. Much good can come from developing an accurate epigenetic clock, which stands to contribute greatly to the study of aging, a fundamental characteristic of human beings.
The most prominent epigenetic clock is Horvath’s clock. It was developed by Steve Horvath, a professor of human genetics and biostatistics at UCLA. Horvath spent over four years collecting data and developing an effective procedure. The first article on Horvath’s clock was published on October 21, 2013.
Horvath’s clock focuses on cytosine-5 methylation within CpG dinucleotides, more commonly referred to as DNA methylation (DNAm). In this process, methyl groups are added to cytosine bases in the DNA. Epigenetic patterns vary between different types of tissues, and different tissues are affected differently by age-related changes. However, CpG signatures of age can be defined independently of sex, tissue type, and disease state. This robustness across tissue types makes DNA methylation a reliable way to measure aging. While it is unknown what DNA methylation age measures exactly, Horvath has hypothesized that it measures the overall effect of an epigenetic maintenance system. There is a correlation between DNA methylation age and mortality later in life, suggesting that DNA methylation age is related to aging. The 353 CpGs identified are referred to as clock CpGs, since their weighted average, with weights given by regression coefficients, forms an epigenetic clock.
Illumina Infinium platforms were used to measure DNAm levels at CpGs. These platforms quantify DNAm levels by the β value, using the ratio of intensities between methylated and unmethylated alleles. β values range from 0 (completely unmethylated) to 1 (completely methylated).
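As a rough illustration of the β value described above, the following Python sketch computes it from a pair of intensities. The function name and the `offset` parameter are assumptions made for this example; array software often adds a small constant to the denominator to stabilize low-intensity probes, and the default of 0 gives the plain ratio.

```python
def beta_value(methylated: float, unmethylated: float, offset: float = 0.0) -> float:
    """Return the beta value for one CpG site.

    methylated / unmethylated: signal intensities of the two alleles.
    offset: hypothetical stabilizing constant (an assumption of this sketch).
    """
    if methylated < 0 or unmethylated < 0 or offset < 0:
        raise ValueError("intensities and offset must be non-negative")
    total = methylated + unmethylated + offset
    if total == 0:
        raise ValueError("total intensity is zero; beta value is undefined")
    # Fraction of the total signal that comes from the methylated allele.
    return methylated / total


# Toy intensities, not real array data.
fully_unmethylated = beta_value(0, 500)    # 0.0
fully_methylated = beta_value(500, 0)      # 1.0
partly_methylated = beta_value(300, 100)   # 0.75
```

With a positive `offset`, the value of a fully methylated site falls slightly below 1, which is one reason measured β values rarely reach the extremes exactly.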
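In order to find DNAm age, it is helpful to first transform chronological age. The following function F is used to transform age:
F(age) = log(age + 1) - log(adult.age + 1), if age <= adult.age
F(age) = (age - adult.age) / (adult.age + 1), if age > adult.age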
The adult age was set to 20 for humans. F should be a continuous, increasing function that can be inverted. It should have a logarithmic dependence on age until adulthood and a linear dependence on age afterwards. It should also be defined for prenatal samples, whose ages are negative, which is achieved by adding 1 year to the age inside the logarithm. This is represented in the following figure.
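The properties listed above can be checked with a short Python sketch. It assumes the piecewise form log(age + 1) - log(adult age + 1) up to adulthood and (age - adult age) / (adult age + 1) afterwards, with natural logarithms; the function names are chosen for this example only.

```python
import math

ADULT_AGE = 20.0  # adult age for humans, as stated in the text


def transform_age(age: float, adult_age: float = ADULT_AGE) -> float:
    """Transform chronological age (in years): logarithmic up to adulthood, linear after."""
    if age <= -1:
        raise ValueError("age must be greater than -1 year for the logarithm to exist")
    if age <= adult_age:
        return math.log(age + 1) - math.log(adult_age + 1)
    return (age - adult_age) / (adult_age + 1)


def inverse_transform_age(value: float, adult_age: float = ADULT_AGE) -> float:
    """Invert transform_age: map a transformed value back to an age in years."""
    if value <= 0:
        # Logarithmic branch: value = log((age + 1) / (adult_age + 1)).
        return (adult_age + 1) * math.exp(value) - 1
    # Linear branch: value = (age - adult_age) / (adult_age + 1).
    return (adult_age + 1) * value + adult_age


# Round trip for a prenatal sample, a newborn, a child, and two adults.
for age in (-0.5, 0.0, 5.0, 20.0, 43.0, 80.0):
    assert abs(inverse_transform_age(transform_age(age)) - age) < 1e-9
```

Both branches give 0 at the adult age and have the same slope there, so the two pieces join smoothly, and the function is increasing throughout, which is what makes the inverse well defined.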
The function F is represented by a line passing through the weighted average of the CpGs.
Next, an elastic net regression model is used to regress the transformed age linearly on the β values of the CpGs, with coefficients b0, b1, ..., b353. The first few coefficient values can be found in the following chart.
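This results in the following linear regression:
F(chronological age) = b0 + b1*CpG1 + ... + b353*CpG353 + error
Here b0 is the intercept, b1, ..., b353 are the regression coefficients, and CpG1, ..., CpG353 are the β values of the 353 clock CpGs.
Based on these coefficient values, DNAm Age can be estimated as:
DNAm Age = F^(-1)(b0 + b1*CpG1 + ... + b353*CpG353)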
Therefore, this regression model can be used to predict transformed age by simply plugging in the β values for the selected CpGs. Applying the inverse of F to this predicted transformed age gives the estimated DNAm Age.
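The prediction step just described can be sketched in Python as a weighted sum followed by the inverse transformation. The CpG names, coefficients, intercept, and β values below are invented toy numbers, not the published clock coefficients, and the inverse assumes the log-then-linear form of F with an adult age of 20.

```python
import math
from typing import Dict


def inverse_transform_age(value: float, adult_age: float = 20.0) -> float:
    """Map a transformed age back to years (assumed log-then-linear form of F)."""
    if value <= 0:
        return (adult_age + 1) * math.exp(value) - 1
    return (adult_age + 1) * value + adult_age


def dnam_age(betas: Dict[str, float],
             coefficients: Dict[str, float],
             intercept: float,
             adult_age: float = 20.0) -> float:
    """Estimate DNAm age from beta values and regression coefficients."""
    missing = set(coefficients) - set(betas)
    if missing:
        raise KeyError(f"beta values missing for: {sorted(missing)}")
    # Linear predictor: intercept plus the weighted sum of beta values.
    predictor = intercept + sum(
        weight * betas[cpg] for cpg, weight in coefficients.items()
    )
    return inverse_transform_age(predictor, adult_age)


# Hypothetical two-CpG "clock" for illustration only.
toy_coefficients = {"cpg_a": 1.5, "cpg_b": -0.8}
toy_intercept = 0.1
toy_betas = {"cpg_a": 0.6, "cpg_b": 0.25}

estimate = dnam_age(toy_betas, toy_coefficients, toy_intercept)
# Predictor is about 0.1 + 0.9 - 0.2 = 0.8, so the estimate is about 36.8 years.
```

A negative predictor falls on the logarithmic branch and yields an estimate below the adult age, so the same function covers children and adults.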
To develop the clock, Horvath analyzed 7,844 non-cancer samples from 82 data sets and 51 different tissues and cell types. Cancer tissues were excluded since cancer affects DNA methylation levels. The first 39 data sets were used to “train” the age predictor. The training data were chosen to represent a variety of tissues and cell types and to involve samples whose mean age (43 years, SD = 25) was similar to that of the test data (42 years, SD = 25). Data sets 40 to 71 were used to test the age predictor, and data sets 72 to 82 were used for separate purposes, such as estimating the DNAm age of embryonic stem cells.
The columns of this data table represent the data set and corresponding color, source of the DNA, what Illumina platform was used in analysis, sample size, proportion of females, median age, age range, relevant citations, and public availability. The other columns show the correlation between chronological age and DNAm Age, the median error, and the median age acceleration for DNAm Age. The last two columns show the age correlation and median error when using a leave-one-data-set-out cross-validation analysis. In this approach, the DNAm age for each data set is estimated separately by fitting a multi-tissue age predictor to the remaining data sets, with the data set in question left out.
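The leave-one-data-set-out idea can be sketched as a generator of training and held-out splits. This only illustrates the splitting logic; the data set names and the sample layout (pairs of a hypothetical sample identifier and chronological age) are assumptions for the example, and fitting the age predictor itself is left out.

```python
from typing import Dict, Iterator, List, Tuple

Sample = Tuple[str, float]  # hypothetical layout: (sample id, chronological age)


def leave_one_dataset_out(
    samples_by_dataset: Dict[str, List[Sample]],
) -> Iterator[Tuple[str, List[Sample], List[Sample]]]:
    """Yield (held-out data set name, training samples, held-out samples)."""
    for held_out, held_out_samples in samples_by_dataset.items():
        # Train on every data set except the one being evaluated.
        training_samples = [
            sample
            for name, samples in samples_by_dataset.items()
            if name != held_out
            for sample in samples
        ]
        yield held_out, training_samples, list(held_out_samples)


# Toy data sets, invented for illustration.
toy_data = {
    "blood": [("b1", 34.0), ("b2", 61.0)],
    "brain": [("n1", 12.0), ("n2", 78.0), ("n3", 45.0)],
    "saliva": [("s1", 25.0)],
}

for name, train, held_out in leave_one_dataset_out(toy_data):
    # A predictor would be fit on `train` and evaluated on `held_out` here.
    print(name, len(train), len(held_out))
```

Because each data set is scored by a predictor that never saw it, the resulting correlation and error estimates indicate how well the approach generalizes to new data sets.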
Horvath then considered various measures of predictive accuracy. The first, ‘age correlation’, is the Pearson correlation coefficient between DNAm age and chronological age. The second, referred to as ‘error’, is the median absolute difference between DNAm age and chronological age. ‘Average age acceleration’, the average difference between DNAm age and chronological age, can also be used to see if the DNAm age of a tissue is consistently higher or lower than expected. According to these measures, Horvath’s clock was very accurate: he found an age correlation of 0.96 and an error of 3.6 years. The following graphs show the high correlation between chronological age and DNAm age in both the training and test data.
(A) Across all test data, the age correlation is 0.96 and the error is 3.6 years.
(B) CD4 T cells measured at birth (age zero) and at age 1 (cor = 0.78, error = 0.27 years)
(C) CD4 T cells and CD14 monocytes (cor = 0.90, error = 3.7)
(D) peripheral blood mononuclear cells (cor = 0.96, error = 1.9)
(E) whole blood (cor = 0.95, error = 3.7)
(F) cerebellar samples (cor = 0.92, error = 5.9)
(G) occipital cortex (cor = 0.98, error = 1.5)
(H) normal adjacent breast tissue (cor = 0.87, error = 13)
(I) buccal epithelium (cor = 0.83, error = 0.37)
(J) colon (cor = 0.85, error = 5.6)
(K) fat adipose (cor = 0.65, error = 2.7)
(L) heart (cor = 0.77, error = 12)
(M) kidney (cor = 0.86, error = 4.6)
(N) liver (cor = 0.89, error = 6.7)
(O) lung (cor = 0.87, error = 5.2)
(P) muscle (cor = 0.70, error = 18)
(Q) saliva (cor = 0.83, error = 2.7)
(R) uterine cervix (cor = 0.75, error = 6.2)
(S) uterine endometrium (cor = 0.55, error = 11)
(T) various blood samples composed of 10 Epstein-Barr virus transformed B cell, three naive B cell, and three peripheral blood mononuclear cell samples (cor = 0.46, error = 4.4)
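The three accuracy measures reported in these panels (age correlation, error, and average age acceleration) can be sketched in Python as follows. The function names and the toy ages are assumptions for this example, not data from the study.

```python
import math
import statistics
from typing import Sequence


def age_correlation(dnam_ages: Sequence[float], chrono_ages: Sequence[float]) -> float:
    """Pearson correlation between DNAm age and chronological age."""
    mean_d = statistics.fmean(dnam_ages)
    mean_c = statistics.fmean(chrono_ages)
    covariance = sum((d - mean_d) * (c - mean_c) for d, c in zip(dnam_ages, chrono_ages))
    spread_d = math.sqrt(sum((d - mean_d) ** 2 for d in dnam_ages))
    spread_c = math.sqrt(sum((c - mean_c) ** 2 for c in chrono_ages))
    return covariance / (spread_d * spread_c)


def median_error(dnam_ages: Sequence[float], chrono_ages: Sequence[float]) -> float:
    """Median absolute difference between DNAm age and chronological age."""
    return statistics.median(abs(d - c) for d, c in zip(dnam_ages, chrono_ages))


def average_age_acceleration(dnam_ages: Sequence[float], chrono_ages: Sequence[float]) -> float:
    """Mean signed difference; positive values mean the tissue looks older than expected."""
    return statistics.fmean(d - c for d, c in zip(dnam_ages, chrono_ages))


# Toy ages in years, invented for illustration.
chrono = [10.0, 20.0, 30.0, 40.0]
dnam = [12.0, 19.0, 33.0, 44.0]

print(age_correlation(dnam, chrono))
print(median_error(dnam, chrono))              # 2.5
print(average_age_acceleration(dnam, chrono))  # 2.0
```

The error uses the absolute difference, so it measures how far off the predictions are, while the signed average shows whether a tissue is systematically predicted as older or younger than its chronological age.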
Since the age predictor worked well across a wide spectrum of tissues, Horvath hypothesized that many of the clock CpGs would vary very little across tissues while correlating highly with age. To test this, he used an analysis of variance (ANOVA) across the training data sets, based on a multivariate regression model fitted to each CpG. Age was included as a covariate, since the different data sets had different age distributions. The ANOVA yields an F statistic for the tissue effect (not to be confused with the age transformation F above), which takes on a large value for CpGs that vary greatly across different tissues. By plotting tissue variance versus age variance and using the F statistic, Horvath found that CpGs with high positive or negative age correlations did not vary much across different tissues.
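To give a feel for how an F statistic separates tissue-stable from tissue-variable CpGs, here is a simplified Python sketch. It computes a plain one-way ANOVA F statistic across tissues and deliberately omits the age covariate described above, so it is a swapped-in simplification rather than the analysis itself; the tissue names and β values are invented.

```python
import statistics
from typing import Dict, List


def one_way_anova_f(groups: Dict[str, List[float]]) -> float:
    """One-way ANOVA F statistic for the beta values of one CpG, grouped by tissue."""
    k = len(groups)
    n = sum(len(values) for values in groups.values())
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more samples than groups")
    grand_mean = statistics.fmean(v for values in groups.values() for v in values)
    # Between-tissue sum of squares: how far each tissue mean is from the grand mean.
    ss_between = sum(
        len(values) * (statistics.fmean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Within-tissue sum of squares: scatter of samples around their own tissue mean.
    ss_within = sum(
        (v - statistics.fmean(values)) ** 2
        for values in groups.values()
        for v in values
    )
    if ss_within == 0:
        raise ValueError("no within-group variation; F statistic is undefined")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Toy beta values: one CpG similar across tissues, one very different.
stable_cpg = {"blood": [0.50, 0.52, 0.51], "brain": [0.51, 0.50, 0.52]}
variable_cpg = {"blood": [0.10, 0.12, 0.11], "brain": [0.80, 0.82, 0.81]}

print(one_way_anova_f(stable_cpg))    # small: tissue means barely differ
print(one_way_anova_f(variable_cpg))  # large: tissue means differ strongly
```

Adjusting for age, as the text describes, would require fitting a regression with tissue and age terms for each CpG, which is beyond this small example.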
The creation of Horvath’s clock allowed many other discoveries to be made, many by Horvath himself. The first example was the identification of tissues in which DNAm age was poorly calibrated. These included breast tissue, uterine endometrium, dermal fibroblasts, skeletal muscle tissue, and heart tissue. The high error in breast tissue may be explained by hormonal effects, and the high error in uterine endometrium can likely be explained by the menstrual cycle. Myosatellite cells may rejuvenate the DNAm age of skeletal muscle tissue, and the integration of stem cells into cardiac muscle may explain the poorly calibrated DNAm age of heart tissue. These hypotheses can all be tested in the future.
It was also found that cell passaging (subculturing cells) increases DNAm age. After a limited number of cell divisions, known as the Hayflick limit, most cells lose their proliferation and differentiation potential. However, DNAm age is not the same as mitotic age or cellular senescence (aging). DNAm age can track age in non-proliferative tissues, and short- and long-lived blood cells often have similar DNAm ages. DNAm age is also highly related to age in immortal, non-senescent cells. Therefore, it cannot be the same as mitotic age or cellular senescence.
The weighted average of the 353 clock CpGs varies greatly with age. The logarithmic dependence that slows to a linear dependence, established in the formula for DNAm Age, represents the ticking rate of the epigenetic clock. Organismal growth and frequent cell division cause a high ticking rate early in life. This slows down to a constant ticking rate after adulthood.
Epigenetic clocks have already been applied to specific areas. Cancer tissues have been shown to have both positive and negative age acceleration effects. The epigenetic clock has been used to study the relation between high BMI and DNAm age, finding age acceleration in the liver. Studies of trisomy 21 (Down syndrome) show increases in the age of blood and brain tissue by an average of 6.6 years. In numerous examples, the epigenetic clock has been associated with the biological age of the brain. Age acceleration of the prefrontal cortex has been correlated with neuropathological measurements associated with Alzheimer’s disease. The epigenetic age of blood is linked to cognitive functioning in the elderly, as well as to Parkinson’s disease. In contrast, the cerebellum has been shown to age quite slowly. Huntington’s disease increases the epigenetic aging rates of various regions of the brain. Epigenetic clocks have thus already proven helpful in a wide range of fields.
Horvath has proposed that DNAm age measures the cumulative effort of an epigenetic maintenance system (EMS). The tick rate is high initially, during organismal development, due to the power required to maintain epigenetic stability. If this is correct, DNAm age should be accelerated by disturbances to epigenetic stability. Age acceleration should also have some benefits, considering the protective role of the maintenance system.
By collecting DNA methylation data and forming a weighted average of 353 clock CpGs, Horvath was able to develop a multi-tissue predictor of age. He also concluded that DNA methylation age has the following properties: it is close to zero for embryonic stem cells, it correlates with cell passage number, it is highly heritable, and it is applicable to chimpanzee tissues.
It remains to be seen whether DNAm age is an adequate biomarker of aging. In order to be accepted as adequate, a biomarker must meet four criteria: (1) it must predict the rate of aging; (2) it must monitor a basic process underlying aging, not the effects of disease; (3) it must be able to be tested repeatedly without harming the subject; and (4) it must work in humans and laboratory animals. Given that Horvath’s clock is effective on chimpanzees, it may satisfy the 4th criterion. It is very likely that it satisfies the 3rd and the 2nd. Large cohort studies, which sample groups with a shared defining characteristic over time, will help to assess the fulfillment of the 1st criterion. They will need to test whether DNAm age predicts functional capability better than chronological age does.