Sex Classification and Identity Verification from GC×GC of Human Scent Samples
Presentations | 2025 | CTU | MDCW
Instrumentation: GC/MSD, GC/TOF, GCxGC
Industries: Clinical Research
Manufacturer: LECO
Summary
Importance of the Topic
Human scent carries a rich signature of volatile organic compounds (VOCs) that reflect physiological and metabolic states. Advanced separation and detection techniques such as comprehensive two-dimensional gas chromatography coupled to time-of-flight mass spectrometry (GC×GC–ToF–MS) enable high-resolution profiling of these complex mixtures. Automated analysis of scent profiles can transform forensic identification, non-invasive diagnostics, security screening and personalized health monitoring.
Objectives and Overview of the Study
This work presents an end-to-end, open-source pipeline for two key biometric tasks using raw GC×GC–ToF–MS data: biological sex classification and individual identity verification. The authors aim to eliminate manual intervention by developing an automated alignment procedure and leveraging deep learning on tensorial spectral data interpreted as images. The approach is validated on a dataset of 2 528 human scent samples from 252 individuals.
Methodology and Instrumentation
- Sample Data: Raw GC×GC–ToF–MS outputs are treated as three-dimensional tensors (mass spectral dimension × first-dimension retention time × second-dimension retention time), approximately 4 GB per sample.
- Data Variants: Two representations are evaluated: FULL (the complete spectral tensor) and SAS (the sum along the mass spectral axis, which reduces each sample to a two-dimensional image).
- Alignment: An unsupervised registration step detects 24 reference compounds and applies a piecewise linear transformation to align all samples to a common chromatographic frame, requiring zero manual annotation.
- Machine Learning: Convolutional neural networks (FCN architecture) process the registered tensors/images. Models are trained with gradient-descent optimization to minimize task-specific loss.
- Computational Resources: All experiments were executed on the LUMI supercomputer, consuming approximately 120 000 GPU hours (estimated cost ∼ 250 000 USD on rented hardware).
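The SAS reduction and the piecewise linear alignment described in the list above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `sum_along_spectrum` and `piecewise_linear_warp`, the toy tensor, and the two anchor positions are all hypothetical, and the sketch warps a single retention-time axis from already-known anchor positions, whereas the source describes detecting 24 reference compounds automatically.

```python
import numpy as np


def sum_along_spectrum(tensor):
    """SAS-style reduction: sum over the mass spectral axis (assumed axis 0)."""
    return tensor.sum(axis=0)


def piecewise_linear_warp(image, detected, reference, axis=0):
    """Resample `image` along `axis` so anchors at `detected` land on `reference`.

    Both anchor arrays hold index positions, sorted ascending and strictly
    inside the axis range. Between anchors the mapping is linear.
    """
    n = image.shape[axis]
    # Pin both ends of the axis so the mapping covers the whole range.
    ref = np.concatenate(([0.0], reference, [n - 1.0]))
    det = np.concatenate(([0.0], detected, [n - 1.0]))
    # For every output index, the fractional input index to read from.
    source = np.interp(np.arange(n), ref, det)
    lo = np.floor(source).astype(int)
    hi = np.minimum(lo + 1, n - 1)
    moved = np.moveaxis(image, axis, 0)
    frac = (source - lo).reshape((n,) + (1,) * (moved.ndim - 1))
    # Linear interpolation between the two neighbouring input rows.
    warped = (1.0 - frac) * moved[lo] + frac * moved[hi]
    return np.moveaxis(warped, 0, axis)


# Toy sample (hypothetical sizes): 5 mass channels x 20 first-dimension
# points x 8 second-dimension points, with one peak at first-dimension index 6.
tensor = np.zeros((5, 20, 8))
tensor[2, 6, 3] = 1.0

sas = sum_along_spectrum(tensor)

# Assumed anchors: compounds seen at indices 6 and 14 in this sample
# should sit at indices 8 and 15 in the common reference frame.
aligned = piecewise_linear_warp(
    sas,
    detected=np.array([6.0, 14.0]),
    reference=np.array([8.0, 15.0]),
    axis=0,
)
print(sas.shape, int(aligned[:, 3].argmax()))
```

In this toy case the summed image has shape (20, 8), and the peak moves from first-dimension index 6 to index 8 after warping. A full registration would apply the same idea to both retention-time axes.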
Main Results and Discussion
- Sex Classification: On an 80:20 training/validation split with 10-fold cross-validation, the FCN achieved high accuracy distinguishing male from female scent profiles using both FULL and SAS representations.
- Identity Verification: Pairwise comparison of samples from the same individual versus different individuals yielded robust performance, with the area under the ROC curve (AUC) indicating strong discrimination.
- Computational Efficiency: Despite the large data volume, the automated alignment and neural models scale effectively without human intervention and maintain reproducibility across runs.
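How a pairwise verification AUC of this kind is computed can be sketched as follows. The source does not state how pairs are scored, so the negative Euclidean distance between per-sample feature vectors is only a hypothetical stand-in for the network's output, and the names `pairwise_scores` and `roc_auc` as well as the toy data are assumptions made for illustration.

```python
from itertools import combinations

import numpy as np


def pairwise_scores(features, person_ids):
    """Score every pair of samples; a higher score means 'more likely same person'."""
    scores, same = [], []
    for i, j in combinations(range(len(person_ids)), 2):
        # Stand-in similarity: negative Euclidean distance between feature vectors.
        scores.append(-np.linalg.norm(features[i] - features[j]))
        same.append(person_ids[i] == person_ids[j])
    return np.array(scores), np.array(same, dtype=bool)


def roc_auc(scores, labels):
    """AUC as the probability that a random positive pair outscores a random
    negative pair, with ties counted as one half."""
    pos = scores[labels]
    neg = scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))


# Toy data (hypothetical): three people, two samples each, well separated.
features = np.array([
    [0.0, 0.0], [0.1, 0.0],    # person A
    [5.0, 5.0], [5.0, 5.1],    # person B
    [-5.0, 5.0], [-5.1, 5.0],  # person C
])
person_ids = ["A", "A", "B", "B", "C", "C"]

scores, same = pairwise_scores(features, person_ids)
auc = roc_auc(scores, same)
print(len(scores), int(same.sum()), auc)
```

The all-pairs comparison in `roc_auc` is quadratic in the number of pairs, which is fine for a sketch. For a dataset of the size reported here, a rank-based formulation or a library routine would be the more practical choice.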
Benefits and Practical Applications
The proposed pipeline offers several practical advantages:
- Zero Human Labor: Fully automated alignment and feature extraction eliminate subjective bias and reduce labor costs.
- Reproducibility: Standardized processing ensures consistent results across laboratories and time points.
- Open-Source Implementation: Python code and pretrained models will be publicly available to accelerate adoption in forensic, clinical and security contexts.
Future Trends and Potential Applications
Emerging directions include integration with portable GC×GC–ToF–MS platforms for field deployment, multimodal data fusion combining scent profiles with other biometric signals, real-time monitoring of metabolic biomarkers, and privacy-preserving neural architectures to protect individual data. Continuous improvements in deep learning and alignment algorithms may further enhance sensitivity and specificity.
Conclusion
This study demonstrates that raw GC×GC–ToF–MS data, when interpreted as images and processed through automated registration and convolutional neural networks, can accurately classify biological sex and verify individual identity from human scent. The open-source, zero-human-intervention pipeline paves the way for scalable, reproducible biometric analysis using complex chromatographic datasets.
Content was automatically generated from an original PDF document using AI and may contain inaccuracies.