Author
The Multidimensional Chromatography Workshop
The Multidimensional Chromatography Workshop
The Multidimensional Chromatography (MDC) Workshop draws experts in this exciting field to share and discuss their research.
Tags
Presentation
Video
LinkedIn Logo

Structural elucidation using GC×GC-TOFMS and machine learning (Masaaki Ubukata, MDCW 2026)

We, 25.3.2026
| Original article from: The Multidimensional Chromatography (MDC) Workshop
Discover how AI-driven GC-MS workflows and predictive spectral libraries enable confident identification of unknown metabolites, integrating molecular-formula estimation and high-resolution GC×GC-TOFMS.
Video placeholder
  • Photo: MDCW: Structural elucidation using GCxGC-TOFMS and machine learning for unknown metabolites in HeLa cell (Masaaki Ubukata, MDCW 2026)
  • Video: LabRulez: Masaaki Ubukata: Structural elucidation using GC×GC-TOFMS and machine learning (MDCW 2026)

🎤 Presenter: Masaaki Ubukata (JEOL Ltd.)

💾 PDF presentation

Abstract

Since aroma components in food comprise numerous compounds, comprehensive two-dimensional gas chromatography–mass spectrometry (GCxGC-MS) is an effective technique. In GCxGC-MS, compound identification is commonly performed by searching mass spectral databases using electron ionization (EI). However, many compounds are often not registered in these databases. These require manual structural analysis, which can be difficult. To address the difficulty of manual structural analysis, we developed a structural elucidation method using machine learning-based mass spectral prediction. The predicted mass spectra can be used as a database, enabling structural estimation in the same way as conventional database searching. Furthermore, in this method, molecular formulas obtained from soft ionization (SI) data are used to narrow down candidate structures. However, a limitation was that multiple molecular formula candidates often remained. To overcome this, we also developed a technique to rank candidate formulas using a machine learning model. In this study, we report an application of the above method to the analysis of aroma compounds in spices by SPME-GCxGC-TOFMS.

In total, 518 compounds were detected by GCxGC-TOFMS. Integrated analysis using both EI and SI data enabled reliable identification of key aroma constituents of cardamom, including monoterpenes, sesquiterpenes, and furans. Compound annotation was based on combined evidence from NIST database searching, retention index, molecular ion and isotope information from SI, and accurate fragment masses from EI. When multiple candidate formulas were obtained, our machine learning model ranked them, allowing selection of the most plausible result. Even for database-unregistered compounds, molecular formulas and tentative structures could be estimated.

Video Transcription

Hello everyone, my name is Masaki Ubukata from Tokyo, Japan. Thank you for the opportunity to present our latest developments. In my previous presentation, I introduced our artificial intelligence–based structure analysis functionalities integrated into our software platform. Today, I will provide an updated overview of these AI-driven technologies, including a newly developed model for chemical formula recommendation, followed by a demonstration of metabolomics applications enabled by these tools.

Metabolomics focuses on the comprehensive analysis of small molecules generated through biological processes. In this context, the term “small molecules” represents a key analytical focus. Both GC-MS and LC-MS are widely used in metabolomics; however, GC-MS continues to offer significant advantages for analyzing low-molecular-weight metabolites. Comprehensive separation strategies such as GC×GC are particularly beneficial for resolving complex metabolomic samples. Furthermore, trimethylsilyl (TMS) derivatization remains a gold-standard sample preparation approach in GC-MS-based metabolomics workflows.

Qualitative GC-MS analysis typically involves multiple stages, including sample introduction, chromatographic separation, compound extraction, and final identification. Database searching, particularly using the NIST spectral library, is standard practice. However, a major challenge arises when compounds are not present in available databases. Manual interpretation of mass spectra can address this issue but is time-consuming and requires substantial expertise. This limitation motivated the development of our AI-based identification technologies, which aim to facilitate the characterization of unknown compounds beyond conventional database coverage.

To address this need, we developed a machine-learning-driven predicted electron ionization (EI) mass spectral database. Our high-resolution GC time-of-flight platform provides resolving power exceeding 30,000 and mass accuracy down to 1 ppm, combined with multiple ionization techniques including EI, chemical ionization (CI), photoionization (PI), and field ionization (FI). The latter three approaches provide softer ionization conditions, enabling reliable detection of molecular ions. The accompanying AI software incorporates a database containing over 200 million predicted EI spectra, generated through deep-learning models. Recent software updates also support comprehensive GC×GC data processing.

The workflow integrates both EI and soft-ionization data. Initial steps involve database matching, fragment-ion analysis using accurate-mass spectral information, and molecular-ion determination from soft-ionization measurements. When these results are consistent, compound identification can be achieved with high confidence. If discrepancies occur, AI-based structure elucidation is performed. In this stage, molecular-formula information is used to narrow candidate structures among millions of possible compounds. The system simultaneously evaluates substructural features, predicted spectra, and retention indices, enabling identification even for previously unknown compounds.

Two AI models underpin the spectral library: one predicts EI mass spectra from molecular structures, while the other predicts chromatographic retention indices. By combining predicted spectra and retention information for more than 200 million compounds—including structures sourced from public databases and in-silico-generated chemical spaces—we created a comprehensive AI-assisted identification framework. Validation studies conducted since 2022 demonstrate continuous improvements in model performance, with average cosine similarity between predicted and measured spectra increasing from 0.72 in early versions to 0.86 in the latest release.

Despite these advances, determining the correct molecular formula can still present challenges, particularly when multiple candidate compositions are possible. To address this, we developed an additional AI model that estimates probability distributions for elemental composition directly from mass spectral data. This model evaluates isotopic patterns, fragment masses, and elemental likelihoods, thereby guiding users toward the most probable molecular formula prior to library searching. As a result, analytical workflows can proceed efficiently without manual bottlenecks.

Finally, we demonstrate the application of these technologies in metabolomics. Using GC×GC-TOFMS with combined EI and FI ionization, we analyzed human cancer cell metabolites in collaboration with Professor Tugawa at Tokyo University of Agriculture and Technology. A novel metabolite, not listed in conventional databases, was successfully identified using our AI-assisted workflow. From more than 8,000 detected compounds, the system narrowed candidates to structural isomers with identical molecular formulas, correctly ranking the true compound as the top candidate. This example highlights the capability of AI-driven identification for real-world unknown metabolite analysis.

In conclusion, our platform integrates large-scale predicted spectral libraries, molecular-formula estimation models, and advanced chromatographic separation techniques to enhance confidence in GC-MS qualitative analysis. These combined technologies offer significant benefits for metabolomics and other applications requiring robust unknown-compound identification.

This text has been automatically transcribed from a video presentation using AI technology. It may contain inaccuracies and is not guaranteed to be 100% correct.

The Multidimensional Chromatography Workshop
LinkedIn Logo
 

Related content

Alcohol Determination of Sanitizer Gel Using Nexis GC-2060

Applications
| 2026 | Shimadzu
Instrumentation
GC
Manufacturer
Shimadzu
Industries
Pharma & Biopharma

Sensitive and Reproducible Determination of Nitrosamines by GC/MSD

Applications
| 2026 | Agilent Technologies
Instrumentation
GC/MSD, GC/SQ
Manufacturer
Agilent Technologies
Industries
Environmental

High-Throughput BTEX Analysis in Nail Products by SPME and GC/TQ

Applications
| 2026 | Agilent Technologies
Instrumentation
GC/MSD, GC/MS/MS, GC/QQQ, SPME
Manufacturer
Agilent Technologies
Industries
Materials Testing

Accurate multi-component blast furnace gas analysis maximizes iron production and minimizes coke consumption

Applications
| 2026 | Thermo Fisher Scientific
Instrumentation
GC/MSD
Manufacturer
Thermo Fisher Scientific
Industries
Energy & Chemicals

Gas Chromatograph Nexis GC-2060

Brochures and specifications
| 2026 | Shimadzu
Instrumentation
GC
Manufacturer
Shimadzu
Industries
Other
 

Related articles

Unknown compounds analysis by HRMS, soft ionization and AI for GCxGC (Masaaki Ubukata, MDCW 2025)
Presentation | Video

Unknown compounds analysis by HRMS, soft ionization and AI for GCxGC (Masaaki Ubukata, MDCW 2025)

We developed an AI-based EI mass spectral database with 120 million predicted spectra to improve unknown compound identification in GC×GC-MS beyond commercial library limits.
The Multidimensional Chromatography Workshop
tag
share
more
17th MDCW 2026 (Day 1)
Article | Events

17th MDCW 2026 (Day 1)

Day one of MDCW 2026 showcased GC×GC and 2D-LC advances across energy, pharma, PFAS, aroma, and metabolomics, combining keynote insights, oral sessions, posters, and expert discussion.
The Multidimensional Chromatography Workshop
tag
share
more
Exploring PFAS in consumer goods using GCxGC-high-resolution mass spectrometry (David Alonso, MDCW 2026)
Presentation | Video

Exploring PFAS in consumer goods using GCxGC-high-resolution mass spectrometry (David Alonso, MDCW 2026)

GC×GC-HRTOFMS enables fast and reliable screening of PFAS and emerging contaminants in complex consumer goods, combining EI/PCI/NCI and mass defect workflows for confident identification.
The Multidimensional Chromatography Workshop
tag
share
more
News from LabRulezGCMS Library - Week 10, 2026
Article | Application

News from LabRulezGCMS Library - Week 10, 2026

This week we bring you application notes by Agilent Technologies, BaySpec and Shimadzu and presentation by MDCW / JEOL.
LabRulez
tag
share
more
 

Related content

Alcohol Determination of Sanitizer Gel Using Nexis GC-2060

Applications
| 2026 | Shimadzu
Instrumentation
GC
Manufacturer
Shimadzu
Industries
Pharma & Biopharma

Sensitive and Reproducible Determination of Nitrosamines by GC/MSD

Applications
| 2026 | Agilent Technologies
Instrumentation
GC/MSD, GC/SQ
Manufacturer
Agilent Technologies
Industries
Environmental

High-Throughput BTEX Analysis in Nail Products by SPME and GC/TQ

Applications
| 2026 | Agilent Technologies
Instrumentation
GC/MSD, GC/MS/MS, GC/QQQ, SPME
Manufacturer
Agilent Technologies
Industries
Materials Testing

Accurate multi-component blast furnace gas analysis maximizes iron production and minimizes coke consumption

Applications
| 2026 | Thermo Fisher Scientific
Instrumentation
GC/MSD
Manufacturer
Thermo Fisher Scientific
Industries
Energy & Chemicals

Gas Chromatograph Nexis GC-2060

Brochures and specifications
| 2026 | Shimadzu
Instrumentation
GC
Manufacturer
Shimadzu
Industries
Other
 

Related articles

Unknown compounds analysis by HRMS, soft ionization and AI for GCxGC (Masaaki Ubukata, MDCW 2025)
Presentation | Video

Unknown compounds analysis by HRMS, soft ionization and AI for GCxGC (Masaaki Ubukata, MDCW 2025)

We developed an AI-based EI mass spectral database with 120 million predicted spectra to improve unknown compound identification in GC×GC-MS beyond commercial library limits.
The Multidimensional Chromatography Workshop
tag
share
more
17th MDCW 2026 (Day 1)
Article | Events

17th MDCW 2026 (Day 1)

Day one of MDCW 2026 showcased GC×GC and 2D-LC advances across energy, pharma, PFAS, aroma, and metabolomics, combining keynote insights, oral sessions, posters, and expert discussion.
The Multidimensional Chromatography Workshop
tag
share
more
Exploring PFAS in consumer goods using GCxGC-high-resolution mass spectrometry (David Alonso, MDCW 2026)
Presentation | Video

Exploring PFAS in consumer goods using GCxGC-high-resolution mass spectrometry (David Alonso, MDCW 2026)

GC×GC-HRTOFMS enables fast and reliable screening of PFAS and emerging contaminants in complex consumer goods, combining EI/PCI/NCI and mass defect workflows for confident identification.
The Multidimensional Chromatography Workshop
tag
share
more
News from LabRulezGCMS Library - Week 10, 2026
Article | Application

News from LabRulezGCMS Library - Week 10, 2026

This week we bring you application notes by Agilent Technologies, BaySpec and Shimadzu and presentation by MDCW / JEOL.
LabRulez
tag
share
more
 

Related content

Alcohol Determination of Sanitizer Gel Using Nexis GC-2060

Applications
| 2026 | Shimadzu
Instrumentation
GC
Manufacturer
Shimadzu
Industries
Pharma & Biopharma

Sensitive and Reproducible Determination of Nitrosamines by GC/MSD

Applications
| 2026 | Agilent Technologies
Instrumentation
GC/MSD, GC/SQ
Manufacturer
Agilent Technologies
Industries
Environmental

High-Throughput BTEX Analysis in Nail Products by SPME and GC/TQ

Applications
| 2026 | Agilent Technologies
Instrumentation
GC/MSD, GC/MS/MS, GC/QQQ, SPME
Manufacturer
Agilent Technologies
Industries
Materials Testing

Accurate multi-component blast furnace gas analysis maximizes iron production and minimizes coke consumption

Applications
| 2026 | Thermo Fisher Scientific
Instrumentation
GC/MSD
Manufacturer
Thermo Fisher Scientific
Industries
Energy & Chemicals

Gas Chromatograph Nexis GC-2060

Brochures and specifications
| 2026 | Shimadzu
Instrumentation
GC
Manufacturer
Shimadzu
Industries
Other
 

Related articles

Unknown compounds analysis by HRMS, soft ionization and AI for GCxGC (Masaaki Ubukata, MDCW 2025)
Presentation | Video

Unknown compounds analysis by HRMS, soft ionization and AI for GCxGC (Masaaki Ubukata, MDCW 2025)

We developed an AI-based EI mass spectral database with 120 million predicted spectra to improve unknown compound identification in GC×GC-MS beyond commercial library limits.
The Multidimensional Chromatography Workshop
tag
share
more
17th MDCW 2026 (Day 1)
Article | Events

17th MDCW 2026 (Day 1)

Day one of MDCW 2026 showcased GC×GC and 2D-LC advances across energy, pharma, PFAS, aroma, and metabolomics, combining keynote insights, oral sessions, posters, and expert discussion.
The Multidimensional Chromatography Workshop
tag
share
more
Exploring PFAS in consumer goods using GCxGC-high-resolution mass spectrometry (David Alonso, MDCW 2026)
Presentation | Video

Exploring PFAS in consumer goods using GCxGC-high-resolution mass spectrometry (David Alonso, MDCW 2026)

GC×GC-HRTOFMS enables fast and reliable screening of PFAS and emerging contaminants in complex consumer goods, combining EI/PCI/NCI and mass defect workflows for confident identification.
The Multidimensional Chromatography Workshop
tag
share
more
News from LabRulezGCMS Library - Week 10, 2026
Article | Application

News from LabRulezGCMS Library - Week 10, 2026

This week we bring you application notes by Agilent Technologies, BaySpec and Shimadzu and presentation by MDCW / JEOL.
LabRulez
tag
share
more
 

Related content

Alcohol Determination of Sanitizer Gel Using Nexis GC-2060

Applications
| 2026 | Shimadzu
Instrumentation
GC
Manufacturer
Shimadzu
Industries
Pharma & Biopharma

Sensitive and Reproducible Determination of Nitrosamines by GC/MSD

Applications
| 2026 | Agilent Technologies
Instrumentation
GC/MSD, GC/SQ
Manufacturer
Agilent Technologies
Industries
Environmental

High-Throughput BTEX Analysis in Nail Products by SPME and GC/TQ

Applications
| 2026 | Agilent Technologies
Instrumentation
GC/MSD, GC/MS/MS, GC/QQQ, SPME
Manufacturer
Agilent Technologies
Industries
Materials Testing

Accurate multi-component blast furnace gas analysis maximizes iron production and minimizes coke consumption

Applications
| 2026 | Thermo Fisher Scientific
Instrumentation
GC/MSD
Manufacturer
Thermo Fisher Scientific
Industries
Energy & Chemicals

Gas Chromatograph Nexis GC-2060

Brochures and specifications
| 2026 | Shimadzu
Instrumentation
GC
Manufacturer
Shimadzu
Industries
Other
 

Related articles

Unknown compounds analysis by HRMS, soft ionization and AI for GCxGC (Masaaki Ubukata, MDCW 2025)
Presentation | Video

Unknown compounds analysis by HRMS, soft ionization and AI for GCxGC (Masaaki Ubukata, MDCW 2025)

We developed an AI-based EI mass spectral database with 120 million predicted spectra to improve unknown compound identification in GC×GC-MS beyond commercial library limits.
The Multidimensional Chromatography Workshop
tag
share
more
17th MDCW 2026 (Day 1)
Article | Events

17th MDCW 2026 (Day 1)

Day one of MDCW 2026 showcased GC×GC and 2D-LC advances across energy, pharma, PFAS, aroma, and metabolomics, combining keynote insights, oral sessions, posters, and expert discussion.
The Multidimensional Chromatography Workshop
tag
share
more
Exploring PFAS in consumer goods using GCxGC-high-resolution mass spectrometry (David Alonso, MDCW 2026)
Presentation | Video

Exploring PFAS in consumer goods using GCxGC-high-resolution mass spectrometry (David Alonso, MDCW 2026)

GC×GC-HRTOFMS enables fast and reliable screening of PFAS and emerging contaminants in complex consumer goods, combining EI/PCI/NCI and mass defect workflows for confident identification.
The Multidimensional Chromatography Workshop
tag
share
more
News from LabRulezGCMS Library - Week 10, 2026
Article | Application

News from LabRulezGCMS Library - Week 10, 2026

This week we bring you application notes by Agilent Technologies, BaySpec and Shimadzu and presentation by MDCW / JEOL.
LabRulez
tag
share
more
Other projects
LCMS
ICPMS
Follow us
FacebookX (Twitter)LinkedInYouTube
More information
WebinarsAbout usContact usTerms of use
LabRulez s.r.o. All rights reserved. Content available under a CC BY-SA 4.0 Attribution-ShareAlike