Many diseases are driven by gene-environment interactions. One important environmental factor is the metabolic output of the human gut microbiota. A comprehensive catalog of microbe-derived metabolites found in humans is critical for data-driven approaches to understanding how microbial metabolism contributes to human health and disease.

U.S. scientists have presented a novel integrated approach to automatically extract and analyze microbial metabolites from 28 million published biomedical records.

First, they classified 28,851,232 MEDLINE records as either related to microbial metabolism or not. Second, they extracted candidate microbial metabolites from the records classified as related. Third, they developed signal prioritization algorithms to further differentiate microbial metabolites from metabolites originating from other sources.
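The three steps above can be illustrated with a toy sketch. It is not the study's implementation: the keyword list `MICROBIAL_TERMS`, the lexicon `METABOLITE_LEXICON`, and the scoring rule in `prioritize` are all hypothetical stand-ins, chosen only to show how classification, extraction, and prioritization might fit together. As a further simplification, the sketch counts mentions in all records so that each candidate can be scored by the share of its mentions that fall in microbial records.

```python
import re
from collections import Counter

# Hypothetical keyword list; the study's actual classifier is not described here.
MICROBIAL_TERMS = {"microbiota", "microbial", "bacteria", "bacterial", "microbiome"}

# Hypothetical toy lexicon of metabolite names, for illustration only.
METABOLITE_LEXICON = {"butyrate", "acetate", "glucose", "trimethylamine"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def is_microbial_record(text):
    """Step 1: flag a record as microbial metabolism-related (toy keyword rule)."""
    return any(token in MICROBIAL_TERMS for token in tokenize(text))


def extract_metabolites(text):
    """Step 2: collect lexicon matches as candidate metabolites."""
    return {token for token in tokenize(text) if token in METABOLITE_LEXICON}


def prioritize(records):
    """Step 3: rank candidates by the share of their mentions in microbial records."""
    microbial_mentions = Counter()
    total_mentions = Counter()
    for text in records:
        flagged = is_microbial_record(text)
        for metabolite in extract_metabolites(text):
            total_mentions[metabolite] += 1
            if flagged:
                microbial_mentions[metabolite] += 1
    scores = {m: microbial_mentions[m] / total_mentions[m] for m in total_mentions}
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

A metabolite mentioned mostly in records flagged as microbial ranks above one that also appears often in unrelated contexts, which is the intuition behind separating microbial metabolites from those of other origin.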

Finally, they systematically analyzed the interactions between the extracted microbial metabolites and human genes. A total of 11,846 metabolites were extracted from 28 million MEDLINE articles.

“Our study represents the first effort towards large-scale extraction and prioritization of microbial metabolites from over 28 million published biomedical articles. We analyzed the interactions between identified microbial metabolites and human genes, which may provide mechanistic insights into how gut microbiota contribute to human health and diseases. Our study will set the foundation for future microbial metabolite entity recognition and relationship extraction,” the researchers explain.

“A comprehensive list of microbial metabolites will also greatly facilitate data-driven studies of how gut microbial metabolites interact with host genetics in different human diseases. The identification of microbial metabolites and the understanding of their role as key mediators through which these bacteria are involved in disease pathogenesis will provide insight into the molecular mechanisms of human health and diseases and enable new possibilities for disease diagnosis, prevention, and treatment,” they conclude.

For more information: Nature