Abstract
Metabolomics offers a direct insight into the biochemical processes of a biological sample, providing a
snapshot of its metabolic state. By analyzing the complete set of metabolites – the small molecules involved in
cellular processes – metabolomics allows researchers to investigate organisms and uncover biomarkers
for disease diagnosis, progression, and treatment response. This powerful approach integrates complex
data from various analytical techniques, such as liquid chromatography-mass spectrometry (LC-MS)
and matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS), allowing screening
of both known and unknown molecules for a holistic search of new markers. A significant limitation of
metabolomics is the difficulty of separating variance that arises from technical and biological
confounders from the outcome variance of interest. Biological variance can stem from genetic diversity,
environmental factors, and general day-to-day fluctuations among samples. Technical variance, on the
other hand, originates from factors such as inconsistencies in sample preparation, instrument
performance, and computational data processing. In this thesis, I aim to explore and mitigate these
sources of variance to enhance the reliability and reproducibility of metabolomics studies. By
employing robust statistical models, machine learning techniques, and tailored experimental protocols,
I optimize the extraction of true biological signals amidst the noise. I show that large-scale
metabolomics studies are viable and use them to gain insights into aging processes,
osteoporosis progression, and antimicrobial resistance prediction. These studies illustrate the
methodological perspectives of large-scale untargeted metabolomics, longitudinal sampling, and
end-to-end workflows. Specifically, I modelled a person's age with a root mean squared error (RMSE) of
5.77 years in ten thousand untargeted metabolomics samples, predicted osteoporosis one year ahead of
the original diagnosis with an area under the receiver operating characteristic curve (ROC-AUC) of 0.72, and
demonstrated that end-to-end modeling of MALDI-MS data may predict antibiotic resistance with
near-perfect accuracy. The proposed approaches not only improve the precision of metabolic profiling but
also pave the way for more accurate biomarker discovery and better understanding of disease
mechanisms. Finally, I emphasize that the presented work needs methodological and biological validation
in future studies to assess its overall generalizability.
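The two performance measures quoted in the abstract, RMSE and ROC-AUC, can be illustrated with a minimal sketch. This is not the code used in the thesis: the function names `rmse` and `roc_auc` and all numbers below are made-up toy values, chosen only to show how each metric is defined. ROC-AUC is computed here through its pairwise interpretation, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Toy data (hypothetical, for illustration only).
true_ages = [30, 45, 60, 75]
predicted_ages = [33, 41, 60, 80]
diagnosis = [0, 0, 1, 1]            # 1 = later diagnosed, 0 = not diagnosed
risk_scores = [0.1, 0.4, 0.35, 0.8]

age_rmse = rmse(true_ages, predicted_ages)
risk_auc = roc_auc(diagnosis, risk_scores)
print(f"RMSE: {age_rmse:.2f} years, ROC-AUC: {risk_auc:.2f}")
```

An RMSE is expressed in the units of the outcome (years, in the age example), whereas ROC-AUC is unitless and ranges from 0.5 for random ranking to 1.0 for perfect ranking.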
| Translated title of the contribution | Identificering af varianskilder i storskala metabolomics |
|---|---|
| Original language | English |
| Qualification | PhD |
| Awarding Institution | |
| Supervisors/Advisors | |
| Award date | 8 Apr 2025 |
| Publisher | |
| Publication status | Published - 8 Apr 2025 |
Thesis title: Addressing sources of variance in large-scale metabolomics
Research output
- End-To-End Deep Learning Explains Antimicrobial Resistance in Peak-Picking-Free MALDI-MS Data
  Lassen, J. K. & Villesen, P., Feb 2025, In: Analytical Chemistry. 97, 5, p. 2795-2800, 6 p. Journal article, peer-reviewed.
- Large-Scale metabolomics: Predicting biological age using 10,133 routine untargeted LC–MS measurements
  Lassen, J. K., Wang, T., Nielsen, K. L., Hasselstrøm, J. B., Johannsen, M. & Villesen, P., May 2023, In: Aging Cell. 22, 5, e13813, 12 p. Journal article, peer-reviewed. Open Access.
- Statistical Modelling Investigation of MALDI-MSI-Based Approaches for Document Examination
  Kjeldbjerg Lassen, J., Bradshaw, R., Villesen, P. & Francese, S., Jul 2023, In: Molecules. 28, 13, 5207. Journal article, peer-reviewed. Open Access.