Omics technologies are gaining importance in modern science for the characterization and classification of biological samples, including food. In contrast to the genome and the transcriptome, the metabolome is strongly influenced by the environment of the sample under study. This holds especially for honey products, because honey is produced by bees that gather nectar and pollen from their immediate surroundings. A liquid chromatograph coupled to a mass spectrometer (LCMS) is well suited to analyzing the metabolome because of its high throughput, optional soft ionization and good coverage of certain compound classes. The goal of this project is to enable authenticity analysis of honey through appropriate bioinformatic analysis of LCMS data.
This project is divided into three parts. In the first part, a workflow is established for using LCMS data from different devices. Subsequently, chemometric approaches such as principal component analysis (PCA) and Soft Independent Modelling of Class Analogy (SIMCA), as well as machine learning methods such as random forests (RF), are applied to discriminate honey samples. In addition to the application of RF for classification, RF-based approaches are used to characterize the honey samples based on their LCMS data. This is accomplished by selecting important variables that characterize the variation between samples and can thus be used to interpret specific environmental influences. Furthermore, the complex data are investigated with the relation analysis provided by Surrogate Minimal Depth (SMD). SMD enables the identification of co-occurring metabolites or metabolic pathways that can be associated with specific environmental influences and therefore with the geographical origin of honey.
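As a rough illustration of how PCA can reveal group structure in such data, the following minimal sketch extracts the first principal component of a small feature table by power iteration. The intensity values, group labels and function names are hypothetical and chosen only for illustration; a real LCMS workflow would involve many more features, preprocessing steps and an established numerical library.

```python
import math

def center_columns(rows):
    """Subtract the column mean from every value (mean centering)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance_matrix(centered):
    """Sample covariance matrix of mean-centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_principal_component(cov, n_iter=500):
    """Dominant eigenvector and eigenvalue of cov via power iteration."""
    p = len(cov)
    # Arbitrary deterministic start vector (assumed not orthogonal to the result).
    v = [1.0 / (j + 1) for j in range(p)]
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    return v, eigenvalue

# Hypothetical feature intensities: three samples each from two origins.
samples = [
    [10.0, 5.0, 1.0], [11.0, 5.0, 1.2], [9.0, 5.1, 0.9],   # origin A
    [4.0, 5.0, 3.0], [5.0, 4.9, 3.1], [3.0, 5.1, 2.8],     # origin B
]
centered = center_columns(samples)
cov = covariance_matrix(centered)
loading, eigenvalue = first_principal_component(cov)

# Project every sample onto the first principal component.
scores = [sum(x * l for x, l in zip(row, loading)) for row in centered]
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))

scores_a, scores_b = scores[:3], scores[3:]
print("PC1 scores, origin A:", [round(s, 2) for s in scores_a])
print("PC1 scores, origin B:", [round(s, 2) for s in scores_b])
print("Explained variance ratio of PC1:", round(explained, 3))
```

In this toy table the two groups differ mainly in the first and third feature, so their scores on the first component fall on opposite sides of zero. PCA is unsupervised, so a separation of this kind reflects structure in the data rather than the labels.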
In the second part of this project, honey adulteration with sugar syrups is detected. To this end, multivariate regression models, e.g. partial least squares (PLS) regression and RF, are built to determine the proportion of syrup added to honey. Variable selection methods are also used in this context to identify specific markers of the sugar syrups.
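The idea of estimating a syrup proportion by regression can be sketched with a one-component PLS model (PLS1). The example below is a simplified illustration under strong assumptions: the honey and syrup profiles are invented, the mixtures are noise-free linear combinations, and the function names are hypothetical. Real data would require several components, validation on independent samples and a tested library implementation.

```python
import math

def fit_pls1_one_component(X, y):
    """Fit a single-component PLS1 model to mean-centered data."""
    n, p = len(X), len(X[0])
    x_mean = [sum(col) / n for col in zip(*X)]
    y_mean = sum(y) / n
    Xc = [[x - m for x, m in zip(row, x_mean)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in feature space with maximal covariance to y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores of the training samples along that direction.
    t = [sum(a * b for a, b in zip(row, w)) for row in Xc]
    # Regression coefficient of y on the scores.
    q = sum(a * b for a, b in zip(t, yc)) / sum(a * a for a in t)
    return x_mean, y_mean, w, q

def predict(model, x):
    """Predict the response for a new sample from the fitted model."""
    x_mean, y_mean, w, q = model
    score = sum((a - m) * b for a, m, b in zip(x, x_mean, w))
    return y_mean + q * score

# Invented intensity profiles over four features (assumption, not measured data).
honey = [8.0, 2.0, 1.0, 0.5]
syrup = [1.0, 6.0, 0.2, 3.0]

def mix(fraction):
    """Profile of honey adulterated with the given syrup fraction."""
    return [(1 - fraction) * h + fraction * s for h, s in zip(honey, syrup)]

train_fractions = [0.0, 0.1, 0.2, 0.4, 0.5]
model = fit_pls1_one_component([mix(f) for f in train_fractions],
                               train_fractions)

predicted = predict(model, mix(0.3))
print("Predicted syrup fraction:", round(predicted, 4))
```

Because the toy mixtures are exactly linear and noise-free, a single component recovers the held-out fraction of 0.3 up to floating-point error. Measured spectra will not behave this cleanly, which is why the number of components is normally chosen by cross-validation.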
In the third part of the project, the approaches developed in the first two parts will be applied to other food authentication issues, such as fruit juice adulteration.
This project is carried out by Jule Hansen.
Characterization of food based on its metabolome
The analysis of the metabolome is becoming increasingly important for the characterization and classification of biological samples, including food. The metabolome comprises several very different classes of molecules, which is why different analytical techniques have been developed for metabolomics. Most of them are based on nuclear magnetic resonance (NMR) spectroscopy or mass spectrometry. In food analysis, metabolomics is used to detect fraud, e.g. by determining the geographic origin and taxonomic characteristics of a product.
In this project, bioinformatic approaches are developed for the comprehensive analysis of various metabolomics data for food profiling and authentication. This is achieved by applying established methods such as principal component analysis (PCA) and machine learning methods, e.g. random forests (RF). In addition to the application of RF for classification, RF-based approaches are used to characterize the investigated food samples. This is accomplished by selecting important variables, which are then used to interpret differences between the classes and, hence, specific influences on the metabolome of foods. In this context, various variable selection methods are applied and compared. Furthermore, the complex properties of the food metabolome are investigated with the relation analysis provided by Surrogate Minimal Depth (SMD), which also enables the identification of co-occurring metabolites or metabolic pathways that can be associated with specific environmental influences on food. To include this external knowledge about functional relationships directly in the modeling process, pathway-guided RF approaches are also applied in this project.
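One generic way to rank variables by their relevance for a classifier is permutation importance: a feature is shuffled across samples, and the resulting loss in accuracy is taken as its importance. The sketch below illustrates the principle only. To stay free of external dependencies, it swaps in a simple nearest-centroid classifier for the random forest and evaluates on the training samples; the data and all function names are hypothetical. It does not implement SMD or any of the specific selection methods compared in the project.

```python
import random

def fit_centroids(X, y):
    """Nearest-centroid 'model': mean feature vector per class."""
    model = {}
    for label in sorted(set(y)):
        rows = [row for row, lab in zip(X, y) if lab == label]
        model[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return model

def predict(model, row):
    """Assign the class whose centroid is closest (squared Euclidean distance)."""
    return min(model, key=lambda lab: sum((a - b) ** 2
                                          for a, b in zip(row, model[lab])))

def accuracy(model, X, y):
    return sum(predict(model, row) == lab for row, lab in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:]
                      for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical data: feature 0 differs between classes, feature 1 does not.
X = [[10.0, 5.0], [11.0, 5.1], [9.0, 4.9], [10.5, 5.0],
     [4.0, 5.0], [5.0, 4.9], [3.0, 5.1], [4.5, 5.0]]
y = ["A"] * 4 + ["B"] * 4

model = fit_centroids(X, y)
importances = permutation_importance(model, X, y)
print("Permutation importance per feature:", importances)
```

Shuffling the discriminating feature lowers the accuracy, whereas shuffling the uninformative one leaves every prediction unchanged, so only the first feature would be selected. In practice the importance should be estimated on held-out or out-of-bag samples rather than on the training data.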
This project is carried out by Sören Wenck.
Bioinformatic profiling of written artefacts
This project is part of the Cluster of Excellence "Understanding Written Artefacts" and is carried out by Lucas Voges. Further information about the project can be found here.