TY - JOUR
T1 - Accurate age prediction from blood using a small set of DNA methylation sites and a cohort-based machine learning algorithm
AU - Varshavsky, Miri
AU - Harari, Gil
AU - Glaser, Benjamin
AU - Dor, Yuval
AU - Shemer, Ruth
AU - Kaplan, Tommy
N1 - Publisher Copyright: © 2023 The Author(s)
PY - 2023/9/25
Y1 - 2023/9/25
N2 - Chronological age prediction from DNA methylation sheds light on human aging, health, and lifespan. Current clocks are mostly based on linear models and rely upon hundreds of sites across the genome. Here, we present GP-age, an epigenetic non-linear cohort-based clock for blood, based upon 11,910 methylomes. Using 30 CpG sites alone, GP-age outperforms state-of-the-art models, with a median accuracy of ∼2 years on held-out blood samples, for both array and sequencing-based data. We show that aging-related changes occur at multiple neighboring CpGs, with implications for using fragment-level analysis of sequencing data in aging research. By training three independent clocks, we show enrichment of donors with consistent deviation between predicted and actual age, suggesting individual rates of biological aging. Overall, we provide a compact yet accurate alternative to array-based clocks for blood, with applications in longitudinal aging research, forensic profiling, and monitoring epigenetic processes in transplantation medicine and cancer.
KW - CP: Genetics
KW - CP: Systems biology
KW - DNA methylation
KW - aging
KW - computational biology
KW - epigenetics
KW - machine learning
UR - http://www.scopus.com/inward/record.url?scp=85171870122&partnerID=8YFLogxK
U2 - 10.1016/j.crmeth.2023.100567
DO - 10.1016/j.crmeth.2023.100567
M3 - Article
C2 - 37751697
AN - SCOPUS:85171870122
SN - 2667-2375
VL - 3
JO - Cell Reports Methods
JF - Cell Reports Methods
IS - 9
M1 - 100567
ER -