Methylation

DNA methylation patterns vary with age, disease, and tissue type, which makes them useful for both diagnostics and biological age prediction.

A database for methylation profiles

Ying et al. (2024) collected 220k human DNA methylation profiles from 5k datasets through the EWAS Data Hub and the Clockbase agent, then converted the raw files to Parquet. The datasets are available on GitHub in the MethylGPT repo. For reference, the EWAS Data Hub listed 180k samples as of April 2026:

[Image: EWAS Data Hub sample count]

Ying et al.’s datasets are also available at lamin.ai/laminlabs/methyldata with additional annotations. For example, you can filter by disease status, demographics, or tissue in the UI or with lamindb.

An example use case: age prediction

Here is a tutorial that walks through how to train an ML model to estimate age.
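To give a feel for what such a model does, here is a minimal sketch of age prediction as penalized linear regression on beta values (methylation fractions between 0 and 1). It is not the tutorial's code: the data are simulated, the per-site baselines, slopes, and noise level are made-up assumptions, and all function names are hypothetical. Published epigenetic clocks commonly use elastic net regression; this sketch swaps in ridge regression because it has a closed-form solution that fits in a few lines of standard-library Python.

```python
import random


def simulate_cohort(n_samples, n_sites, seed=0):
    """Simulate ages and beta values; all parameters are illustrative assumptions."""
    rng = random.Random(seed)
    baselines = [rng.uniform(0.3, 0.7) for _ in range(n_sites)]
    slopes = [rng.uniform(-0.003, 0.003) for _ in range(n_sites)]  # drift per year
    betas, ages = [], []
    for _ in range(n_samples):
        age = rng.uniform(20, 80)
        # Each site drifts linearly with age plus noise, clamped to [0, 1].
        row = [min(1.0, max(0.0, b + s * age + rng.gauss(0, 0.02)))
               for b, s in zip(baselines, slopes)]
        betas.append(row)
        ages.append(age)
    return betas, ages


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ridge(betas, ages, penalty):
    """Fit age ~ intercept + weights . betas with an L2 penalty on the weights."""
    n, p = len(betas), len(betas[0])
    site_means = [sum(row[j] for row in betas) / n for j in range(p)]
    age_mean = sum(ages) / n
    # Center features and target so the intercept is not penalized.
    xc = [[row[j] - site_means[j] for j in range(p)] for row in betas]
    yc = [age - age_mean for age in ages]
    gram = [[sum(r[i] * r[j] for r in xc) + (penalty if i == j else 0.0)
             for j in range(p)] for i in range(p)]
    xty = [sum(r[j] * y for r, y in zip(xc, yc)) for j in range(p)]
    weights = solve_linear_system(gram, xty)
    intercept = age_mean - sum(w * mu for w, mu in zip(weights, site_means))
    return weights, intercept


def predict_age(weights, intercept, row):
    return intercept + sum(w * v for w, v in zip(weights, row))


def mean_absolute_error(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


betas, ages = simulate_cohort(n_samples=200, n_sites=20, seed=0)
train_x, test_x = betas[:150], betas[150:]
train_y, test_y = ages[:150], ages[150:]

weights, intercept = fit_ridge(train_x, train_y, penalty=0.01)
predictions = [predict_age(weights, intercept, row) for row in test_x]
test_mae = mean_absolute_error(predictions, test_y)

# Baseline: always predict the mean training age.
mean_train_age = sum(train_y) / len(train_y)
baseline_mae = mean_absolute_error([mean_train_age] * len(test_y), test_y)

print(f"ridge MAE: {test_mae:.2f} years, baseline MAE: {baseline_mae:.2f} years")
```

Comparing against the mean-age baseline is a quick sanity check that the model has learned something from the methylation signal. With real profiles there are far more CpG sites than samples, so a sparsity-inducing penalty and proper cross-validation matter much more than in this toy setup.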