Harnessing Featurization and Machine Learning to Drive Practical Surface-Enhanced Raman Scattering Sensing Applications

Surface-enhanced Raman scattering (SERS) spectroscopy is an ultrasensitive vibrational spectroscopic technique that enhances molecules’ weak intrinsic Raman fingerprints for rapid qualitative and quantitative detection. However, practical real-world molecular detection is hindered by several challenges, including low-throughput SERS nanosensor production, the presence of structural isomers, and multiplexed sample matrices.

Herein, we overcome these roadblocks by combining domain knowledge-based feature engineering with emerging machine learning (ML) strategies to advance the practical translation of SERS, from expediting SERS nanosensor characterization to detecting unknown chemical homologs and mixtures. First, we leverage plasmonic featurization and ML to enable rapid, accurate, bidirectional characterization of gold and silver nanoparticles, predicting nanoparticle properties from extinction spectra and vice versa, thereby bypassing tedious electron microscopy. Second, we create a forward-predictive chemical taxonomy ML framework for untargeted structural elucidation and quantification of “unknown” epimeric biomolecules that were absent from the models' training data. Third, we demonstrate a generalizable transfer learning model capable of detecting and quantifying four carnitine derivatives in multicomponent binary, ternary, and quaternary mixtures without needing to train a specific model for each mixture.

Overall, we showcase the immense potential of ML-driven SERS in real-life sensing applications where the identity and quantity of analytes are often unknown.