Sex and person identity recognition from GC×GC analysis of scent samples (Jan Hlavsa, MDCW 2025)

- Photo: MDCW: Sex Classification and Identity Verification from GC×GC of Human Scent Samples (Jan Hlavsa, MDCW 2025)
- Video: LabRulez: Jan Hlavsa: Sex and person identity recognition from GC×GC analysis of scent samples (MDCW 2025)
🎤 Presenter: Jan Hlavsa (Czech Technical University, Prague, Czech Republic)
💡 Book in your calendar: 17th Multidimensional Chromatography Workshop (MDCW), 13-15 January 2026
Abstract
We present a method for recognizing a person's identity and sex from a GC×GC chromatogram of a scent sample. The method is fully automatic at every stage. First, the chromatogram is aligned by locating the 2D peaks of pre-determined compounds that were found to be present in all samples. Each peak location is the point, within a pre-learned region of the chromatogram, of maximum correlation with a machine-learned prototype of the compound's spectrogram. The GC×GC space is then triangulated and affine-warped to canonical coordinates. In the second step, the aligned chromatogram is flattened to a 1D vector whose values are, depending on the experiment, either the total mass recorded at a given time or the output of a spectrometer. This vector is the input to the identity or sex classifier. Two types of classifiers were trained and evaluated: an SVM (polynomial and RBF kernels) and a shallow convolutional neural network.
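The alignment step can be illustrated with a minimal sketch. It is not the authors' code: the function names, the array layout (m/z, 1st retention time, 2nd retention time), and the use of cosine similarity as a stand-in for the "correlation with a prototype" are all assumptions made for illustration. The second half solves the affine map for a single triangle of landmarks; a piecewise warp would repeat this for every triangle of the triangulation.

```python
import numpy as np

def locate_peak(tensor, prototype, window):
    """Return (t1, t2) of the pixel whose spectrum best matches `prototype`.

    tensor:    array of shape (n_mz, n_t1, n_t2), one mass spectrum per pixel
    prototype: array of shape (n_mz,), reference spectrum of a marker compound
    window:    (t1_lo, t1_hi, t2_lo, t2_hi), half-open search region
    All three are hypothetical inputs, not the format used in the talk.
    """
    t1_lo, t1_hi, t2_lo, t2_hi = window
    region = tensor[:, t1_lo:t1_hi, t2_lo:t2_hi].astype(float)
    # Cosine similarity between the prototype and every pixel spectrum
    # (assumed similarity measure; the abstract only says "correlation").
    dots = np.tensordot(prototype, region, axes=(0, 0))
    norms = np.linalg.norm(region, axis=0) * np.linalg.norm(prototype)
    score = dots / np.maximum(norms, 1e-12)
    i, j = np.unravel_index(np.argmax(score), score.shape)
    return int(t1_lo + i), int(t2_lo + j)

def affine_from_triangle(src, dst):
    """Solve for the 2x3 affine map sending three src points to three dst points."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    A = np.hstack([src, np.ones((3, 1))])  # rows are [x, y, 1]
    M = np.linalg.solve(A, dst)            # shape (3, 2), so that A @ M == dst
    return M.T                             # shape (2, 3): [linear part | translation]

def apply_affine(M, pts):
    """Map an (n, 2) array of points through the 2x3 affine matrix M."""
    pts = np.asarray(pts, dtype=float)
    return pts @ M[:, :2].T + M[:, 2]

# Usage: detected marker positions (src) are mapped onto canonical ones (dst).
M = affine_from_triangle([(0, 0), (1, 0), (0, 1)], [(2, 3), (4, 3), (2, 6)])
warped = apply_affine(M, [(1, 1)])
```

Restricting the search to a window around the expected position, as the abstract describes, keeps the match from latching onto a different compound with a similar spectrum elsewhere in the chromatogram.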
Experiments were carried out on data collected from the palms of 40 volunteers onto glass beads, extracted into ethanol, and analysed by a LECO GC×GC-ToF over a three-year period (2019-2022). Each volunteer provided at least 10 samples, each a week apart. For sex identification, finding an optimal linear projection of the Total Ion Chromatogram data yields approximately 85% validation accuracy under 10-fold cross-validation on a balanced dataset (50% male, 50% female) of 504 measurements from 40 identities. The top-5 validation accuracy for classifying 20 identities is approximately 60%.
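For readers unfamiliar with the top-5 metric, the sketch below shows how it is usually computed: a prediction counts as correct when the true identity is among the five highest-scoring classes. The function name and the toy score matrix are illustrative assumptions, not data from the study. As a point of reference, with 20 identities a uniformly random ranking would reach 5/20 = 25% top-5 accuracy.

```python
import numpy as np

def top_k_accuracy(scores, labels, k=5):
    """Fraction of samples whose true label is among the k highest scores.

    scores: array of shape (n_samples, n_classes), higher means more likely
    labels: array of shape (n_samples,), integer class indices
    Requires k <= n_classes.
    """
    scores = np.asarray(scores)
    labels = np.asarray(labels)
    # Indices of the k largest scores in each row (their order is irrelevant).
    top_k = np.argpartition(scores, -k, axis=1)[:, -k:]
    hits = (top_k == labels[:, None]).any(axis=1)
    return float(hits.mean())

# Toy example with 4 classes and k=2 (made-up numbers).
toy_scores = [
    [0.10, 0.50, 0.30, 0.05],  # true class 2 is second best: hit
    [0.70, 0.05, 0.15, 0.10],  # true class 1 is ranked last: miss
    [0.20, 0.25, 0.50, 0.05],  # true class 2 is best: hit
]
toy_labels = [2, 1, 2]
toy_top2 = top_k_accuracy(toy_scores, toy_labels, k=2)

# Chance level for top-5 with 20 equally likely identities.
chance_top5 = 5 / 20
```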
Video transcription
Jan Hlavsa presented research focused on automatic sex classification and identity verification from human scent chromatograms using GC×GC-ToFMS. The project is a collaboration between the Czech Technical University in Prague and the University of Chemistry and Technology, Prague, with computational support from the LUMI supercomputer (over 120,000 GPU hours).
Data and Methods
- Samples: 2,528 chromatograms collected from 252 volunteers, analyzed by GC×GC-ToFMS.
- Data representation: raw outputs with no pre-processing, stored as 3D tensors (mass spectrum, 1st retention time, 2nd retention time). Each dataset ~4 GB.
- Chromatogram alignment: fully automatic registration using 24 chemical marker compounds and piecewise linear transformation.
- Model: convolutional neural networks (CNNs) applied for both sex classification and identity verification tasks.
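The data representation above can be sketched as follows, assuming the axis order given in the list (mass spectrum, 1st retention time, 2nd retention time). Summing over the spectral axis gives the Total Ion Chromatogram mentioned in the abstract. The talk does not say how the spectral dimension was reduced, so the binning of adjacent m/z channels shown here is only one plausible choice, and the function names are hypothetical.

```python
import numpy as np

def total_ion_chromatogram(tensor):
    """Sum over the m/z axis: (n_mz, n_t1, n_t2) -> (n_t1, n_t2)."""
    return tensor.sum(axis=0)

def bin_spectral_axis(tensor, n_bins):
    """Reduce the m/z axis to n_bins by summing groups of adjacent channels.

    This is an assumed reduction scheme, not the one used in the talk.
    """
    n_mz = tensor.shape[0]
    edges = np.linspace(0, n_mz, n_bins + 1).astype(int)
    return np.stack(
        [tensor[lo:hi].sum(axis=0) for lo, hi in zip(edges[:-1], edges[1:])]
    )

# Tiny stand-in for a chromatogram: 4 m/z channels, 2 x 3 retention-time grid.
cube = np.arange(4 * 2 * 3, dtype=float).reshape(4, 2, 3)
tic = total_ion_chromatogram(cube)   # shape (2, 3)
reduced = bin_spectral_axis(cube, 2) # shape (2, 2, 3)
flat = tic.ravel()                   # 1D vector, e.g. as input to an SVM
```

Because binning only regroups the channels, summing the reduced tensor over its spectral axis recovers the same TIC as the full tensor.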
Sex Classification
- Task: determine the biological sex of a donor from a human scent chromatogram.
- Results:
  - Reduced spectral dimension data: 84.5% accuracy.
  - Full data: 94.5% accuracy.
Identity Verification
- Task: verify whether two scent samples come from the same individual.
- Evaluation metric: AUC (Area Under the ROC Curve).
- Results:
  - Training set: AUC = 0.96.
  - Validation set: AUC = 0.93.
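To make the AUC figures concrete, the sketch below computes AUC for a verification task from pairwise scores. It assumes each pair of samples receives a similarity score (higher means "more likely the same person"); how the CNN in the talk produces such a score is not described, so the inputs here are made up. AUC equals the probability that a randomly chosen same-person pair scores higher than a randomly chosen different-person pair, with ties counted as one half.

```python
def roc_auc(scores, same_person):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores:      similarity score for each sample pair (hypothetical values)
    same_person: True/1 if the pair comes from one individual, else False/0
    """
    pos = [s for s, y in zip(scores, same_person) if y]
    neg = [s for s, y in zip(scores, same_person) if not y]
    if not pos or not neg:
        raise ValueError("need at least one same-person and one different-person pair")
    # Count how often a same-person pair outranks a different-person pair.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: one same-person pair (0.4) scores below a different-person pair (0.8).
example_auc = roc_auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```

An AUC of 0.5 corresponds to scores that carry no information about identity, and 1.0 to perfect separation of same-person and different-person pairs. This quadratic-time version is meant for clarity; for large numbers of pairs a rank-based implementation would be preferable.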
Key Advantages
- Fully automated workflow requiring zero manual intervention during chromatogram registration.
- Reproducible results with open-source Python code soon to be available.
- High accuracy in both sex classification and identity verification tasks.
Conclusion
- This study demonstrates that GC×GC analysis of human scent, combined with convolutional neural networks, can successfully differentiate sex and verify identity with high accuracy. Future work will focus on expanding the datasets and further optimizing the methodology.
This text has been automatically transcribed from a video presentation using AI technology. It may contain inaccuracies and is not guaranteed to be 100% correct.