Discovery of food identity markers by metabolomics and machine learning technology

Abstract Verification of food authenticity establishes consumer trust in food ingredients and components of processed food. Next to genetic or protein markers, chemicals are unique identifiers of food components. Non-targeted metabolomics is ideally suited to screen food markers when coupled to effi...

Description complète

Enregistré dans:
Détails bibliographiques
Auteurs principaux: Alexander Erban, Ines Fehrle, Federico Martinez-Seidel, Federico Brigante, Agustín Lucini Más, Veronica Baroni, Daniel Wunderlin, Joachim Kopka
Format: article
Langue:EN
Publié: Nature Portfolio 2019
Sujets:
R
Q
Accès en ligne:https://doaj.org/article/e43a47c8e9e649ffb640ab938f2e05ad
Tags: Ajouter un tag
Pas de tags, Soyez le premier à ajouter un tag!
Description
Résumé:Abstract Verification of food authenticity establishes consumer trust in food ingredients and components of processed food. Next to genetic or protein markers, chemicals are unique identifiers of food components. Non-targeted metabolomics is ideally suited to screen food markers when coupled to efficient data analysis. This study explored feasibility of random forest (RF) machine learning, specifically its inherent feature extraction for non-targeted metabolic marker discovery. The distinction of chia, linseed, and sesame that have gained attention as “superfoods” served as test case. Chemical fractions of non-processed seeds and of wheat cookies with seed ingredients were profiled. RF technology classified original seeds unambiguously but appeared overdesigned for material with unique secondary metabolites, like sesamol or rosmarinic acid in the Lamiaceae, chia. Most unique metabolites were diluted or lost during cookie production but RF technology classified the presence of the seed ingredients in cookies with 6.7% overall error and revealed food processing markers, like 4-hydroxybenzaldehyde for chia and succinic acid monomethylester for linseed additions. RF based feature extraction was adequate for difficult classifications but marker selection should not be without human supervision. Combination with alternative data analysis technologies is advised and further testing of a wide range of seeds and food processing methods.