Classifying the pesticides in foods between GC-amenable and LC-amenable using the prediction model with molecular descriptors
Posters | 2020 | Agilent Technologies
Instrumentation: GC/MSD, LC/MS
Industries: Food & Agriculture
Manufacturer: Agilent Technologies
Summary
Significance of the Topic
Choosing between gas chromatography–mass spectrometry (GC/MS) and liquid chromatography–mass spectrometry (LC/MS) is critical for reliable pesticide residue analysis in food matrices. Assigning a pesticide to the wrong technique can lead to inadequate detection, compromised food safety assessments, and inefficient method development.
Objectives and Study Overview
This study aimed to develop a quantitative structure–property relationship (QSPR) model that classifies pesticides as GC-amenable or LC-amenable. Data were compiled from two validation reports: an FDA study on 136 pesticides in avocado (LC/MS and GC/MS) and an EU Reference Laboratories (EURL) report on 127 pesticides in olive oil. Together the reports yielded 202 compounds; eight with inconsistent analytical assignments were excluded, leaving 194 for modeling.
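The compilation step can be illustrated with a minimal Python sketch. It assumes that "inconsistent analytical assignments" means a compound received different technique labels in the two reports, which the summary does not state explicitly. The compound names, labels, and function names are hypothetical placeholders, not data or code from the poster.

```python
# Hypothetical records: compound name -> technique assigned in that report.
# These names and labels are placeholders, not data from the poster.
fda_report = {"compound_a": "GC", "compound_b": "LC", "compound_c": "GC"}
eurl_report = {"compound_b": "LC", "compound_c": "LC", "compound_d": "GC"}


def merge_assignments(*reports):
    """Collect every technique label reported for each compound."""
    merged = {}
    for report in reports:
        for compound, technique in report.items():
            merged.setdefault(compound, set()).add(technique)
    return merged


def split_consistent(merged):
    """Separate compounds with a single agreed label from conflicting ones."""
    consistent = {c: next(iter(t)) for c, t in merged.items() if len(t) == 1}
    conflicting = sorted(c for c, t in merged.items() if len(t) > 1)
    return consistent, conflicting


merged = merge_assignments(fda_report, eurl_report)
labels, excluded = split_consistent(merged)
print(len(merged), "compounds,", len(labels), "kept, excluded:", excluded)
```

In this toy data, `compound_c` is labeled GC in one report and LC in the other, so it is dropped before modeling, in the same way the study reduced 202 compounds to 194.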
Methodology and Instrumentation
A canonical SMILES list of the 194 pesticides was retrieved from PubChem. Molecular descriptors were calculated using the rcdk package in R, which initially generated 224 descriptors; removing zero-variance descriptors reduced the set to 176. Descriptors were standardized prior to modeling. Classification employed 119 machine learning algorithms available in the caret package in R, including ordinary learning, sparse modeling, neural networks, decision trees, ensemble methods, and splines. Each algorithm was assessed by its 10-fold cross-validation accuracy and by its execution time, measured with R's system.time() function.
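The two preprocessing steps, dropping zero-variance descriptors and standardizing the rest, can be sketched as follows. The study performed them in R; this Python version is only an illustration, and the descriptor names, values, and function names are hypothetical. It scales by the sample standard deviation (n - 1 denominator), which is an assumption about the convention used.

```python
from statistics import mean, stdev

# Hypothetical descriptor table: rows are compounds, columns are descriptors.
# Values are made up for illustration; the third column is constant.
descriptor_names = ["desc_1", "desc_2", "desc_const"]
rows = [
    [1.0, 10.0, 0.0],
    [2.0, 30.0, 0.0],
    [3.0, 20.0, 0.0],
]


def drop_zero_variance(names, table):
    """Keep only columns whose values are not all identical."""
    columns = list(zip(*table))
    keep = [i for i, col in enumerate(columns) if len(set(col)) > 1]
    kept_names = [names[i] for i in keep]
    kept_rows = [[row[i] for i in keep] for row in table]
    return kept_names, kept_rows


def standardize(table):
    """Center each column on zero and scale it to unit sample std. dev."""
    columns = list(zip(*table))
    stats = [(mean(col), stdev(col)) for col in columns]
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in table]


kept_names, kept_rows = drop_zero_variance(descriptor_names, rows)
scaled = standardize(kept_rows)
print(kept_names)
print(scaled)
```

The filter has to run before scaling, because a constant column has a standard deviation of zero and would cause a division by zero in `standardize`.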
Key Results and Discussion
Across all methods, the average cross-validation accuracy was 77%. Ensemble spline methods exhibited high variability, with the bagEarth variants performing poorly. The top 20 algorithms achieved 83–85.5% accuracy. AdaBoost.M1 was the most accurate at 85.5%, but it took about 5,600 seconds (roughly 93 minutes) to run. The XGBoost methods balanced accuracy and speed: xgbTree reached 84.6% accuracy in under 2 minutes, and xgbDART reached 85.0% in about 8.5 minutes. The study recommends xgbDART on this basis. GLM with stepwise feature selection (glmStepAIC) was the slowest method, requiring over 3 hours.
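The accuracy and time trade-off among the three methods with both figures reported can be made concrete with a small Python sketch. The numbers are those quoted in this summary, converted to seconds; entering "under 2 minutes" as 120 s is an assumption, and the function name `pareto_front` is illustrative rather than anything used in the study.

```python
# Figures quoted in the summary: (accuracy in %, execution time in seconds).
# "Under 2 minutes" is entered as its 120 s upper bound (an assumption).
results = {
    "AdaBoost.M1": (85.5, 5600),
    "xgbTree": (84.6, 120),
    "xgbDART": (85.0, 510),
}


def pareto_front(table):
    """Return methods that no other method beats on both accuracy and time."""
    front = []
    for name, (acc, secs) in table.items():
        dominated = any(
            other_acc >= acc
            and other_secs <= secs
            and (other_acc > acc or other_secs < secs)
            for other, (other_acc, other_secs) in table.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return sorted(front)


for name, (acc, secs) in sorted(results.items(), key=lambda kv: kv[1][1]):
    print(f"{name}: {acc}% in {secs / 60:.1f} min")
print("Not dominated:", pareto_front(results))
```

None of the three methods is both faster and more accurate than another, so the choice among them depends on how much run time an extra fraction of a percentage point of accuracy is worth.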
Benefits and Practical Applications
The developed QSPR‐based classification tool streamlines the decision process for selecting GC/MS or LC/MS for pesticide analysis, reducing time and resources in method development. It can support laboratories in regulatory compliance, quality control, and rapid screening of multiresidue pesticide methods.
Future Trends and Possibilities
Future work may include expanding the pesticide dataset, incorporating additional physicochemical descriptors, and exploring deep learning architectures. Integration with retention time prediction models, user‐friendly software implementation, and extension to other analyte classes could further enhance applicability in food safety and environmental monitoring.
Conclusion
This study demonstrates that a QSPR approach combined with machine learning can classify pesticides as GC/MS-amenable or LC/MS-amenable with about 85% cross-validation accuracy. Among the 119 algorithms evaluated, xgbDART offers the best compromise between accuracy (85.0%) and execution time (<9 minutes). The model provides a practical decision-support tool for analytical chemists.
Reference
- Barganska Z, Konieczka P, Namiesnik J. 2018. Comparison of Two Methods for the Determination of Selected Pesticides in Honey and Honeybee Samples. Molecules. 23:2582.
- Anagnostopoulos C, Miliadis GE. 2013. Development and validation of a multiresidue method for multiclass pesticide residues using GC–MS/MS and LC–MS/MS in olive oil and olives. Talanta. 112:1-10.
- Food and Drug Administration. 1999. Pesticide Analytical Manual Vol. I, Appendix II.
- Chamkasem N, Ollis LW, Harmon T, Lee S, Mercer G. 2013. Analysis of 136 Pesticides in Avocado Using a Modified QuEChERS Method with LC/MS/MS and GC/MS/MS. J Agric Food Chem. 61:2315-2329.
- EU Reference Laboratories for Residues of Pesticides. 2012. Validation Data of 127 Pesticides Using a Multiresidue Method by LC/MS/MS and GC/MS/MS in Olive Oil.
- Kuhn M. 2008. Building Predictive Models in R Using the caret Package. J Stat Softw. 28:1-26.
Content was automatically generated from an original PDF document using AI and may contain inaccuracies.