University of the Philippines (UP) researchers tested various artificial intelligence (AI) prediction models to determine the antimicrobial resistance of Escherichia coli (E. coli) using genetic data and laboratory results from the National Center for Biotechnology Information (NCBI) database.
E. coli is a common bacterium found in the intestines of people and animals, often used to identify fecal contamination within the environment.
It can also readily develop resistance to antibiotics, which makes it a useful organism for studying antimicrobial resistance. This is especially true in agricultural environments where fecal matter is used as manure or wastewater is reused.
According to Marco Christopher Lopez and Dr. Pierangeli Vital of the UP Diliman College of Science’s Natural Science Research Institute and Dr. Joseph Ryan Lansangan of the UP Diliman School of Statistics, traditional laboratory methods for analyzing antimicrobial resistance are often time-consuming, labor-intensive, and impractical for large-scale monitoring.
This led them and other researchers to explore faster approaches using whole-genome sequencing (WGS) and predictive modeling.
“We selected the models based on their strengths in handling biological and imbalanced data… These models were chosen to compare performance across different learning strategies and to identify which is most suitable for predicting antibiotic resistance,” Vital said.
The group used four AI models: Random Forest (RF), which is well suited to high-dimensional data; Support Vector Machine (SVM), which excels at classification tasks, particularly those with complex decision boundaries; and two boosting ensemble methods, Adaptive Boosting (AB) and Extreme Gradient Boosting (XGB), which improve accuracy by focusing on hard-to-classify samples.
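To illustrate what "focusing on hard-to-classify samples" means, here is a minimal sketch of one round of the classic AdaBoost weight update. It is not the researchers' code: the function name, the five toy isolates, and their labels are hypothetical. XGB works differently, fitting each new tree to the gradient of a loss function rather than reweighting samples explicitly.

```python
import math

def adaboost_reweight(weights, y_true, y_pred):
    """One discrete AdaBoost round for labels in {-1, +1}.

    `weights` is assumed to sum to 1. Returns (alpha, new_weights).
    """
    # Weighted error of the current weak learner.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
    alpha = 0.5 * math.log((1 - err) / err)
    # Misclassified samples (t * p == -1) are scaled up, correct ones down.
    raw = [w * math.exp(-alpha * t * p)
           for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(raw)
    return alpha, [r / total for r in raw]

# Hypothetical toy data: +1 = resistant, -1 = susceptible.
y_true = [1, 1, -1, -1, -1]
y_pred = [1, -1, -1, -1, -1]  # the weak learner misses the second isolate
weights = [0.2] * 5           # every isolate starts with equal weight

alpha, new_weights = adaboost_reweight(weights, y_true, y_pred)
```

After this round the missed isolate carries half of the total weight, so the next weak learner is pushed to classify it correctly.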
The models predicted resistance to streptomycin and tetracycline most accurately, reliably distinguishing resistant strains from susceptible ones.
However, resistance to ciprofloxacin was the “most challenging” to predict because only 4 percent of the samples in the data were resistant, which made resistant strains hard to identify and resulted in poor sensitivity.
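A short sketch shows why such a small share of resistant samples hurts sensitivity even when overall accuracy looks high. The 100 isolates below are hypothetical and are not the study's data. They only mirror the 4 percent resistant share, and the "model" simply labels every isolate as susceptible.

```python
def accuracy_and_sensitivity(y_true, y_pred):
    """Return (accuracy, sensitivity) for binary labels, 1 = resistant."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # resistant, caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # susceptible, correct
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # resistant, missed
    accuracy = (tp + tn) / len(pairs)
    # Sensitivity is the fraction of truly resistant isolates that were caught.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return accuracy, sensitivity

# Hypothetical set: 4 resistant and 96 susceptible isolates.
y_true = [1] * 4 + [0] * 96
# A trivial model that calls every isolate susceptible.
y_pred = [0] * 100

acc, sens = accuracy_and_sensitivity(y_true, y_pred)
print(acc, sens)  # 0.96 0.0
```

The trivial model is right 96 percent of the time yet finds none of the resistant isolates, which is why accuracy alone can mislead on imbalanced data.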
Among the models, AB and XGB delivered consistently good results, even when tested on imbalanced antimicrobial resistance data.
“We think that this strategy has great potential for real-time monitoring of antimicrobial resistance, particularly in agriculture,” Vital said, emphasizing the potential use of AI prediction models in the sector.
“As DNA sequencing becomes faster and cheaper, prediction models such as ours can pick up resistant bacteria early—before they lead to outbreaks. This can facilitate better decision-making in food safety, agriculture, and public health programs,” she added.
The researchers recommended including more diverse sample types and data sources, including metagenomic data (DNA from all microbes in a sample), to better understand and predict how bacteria develop resistance.
Vital likewise highlighted the value of collaboration between fields, namely how microbiologists and statisticians worked together in their study: “More so, the integration of (micro)biological concepts to statistics and predictive modelling to have an impactful result/outcome to the community, in this instance, agricultural food safety.”
The study, titled “Prediction models for antimicrobial resistance of Escherichia coli in an agricultural setting around Metro Manila, Philippines,” was published in the open-access, peer-reviewed Malaysian Journal of Microbiology. It was funded by the Natural Science Research Institute (NSRI) and the Department of Science and Technology’s (DOST) Grant to Outstanding Achievements in Science and Technology through the National Academy of Science and Technology.