Open Source for Raman Spectroscopy Data Harmonization
Publication category: Research article
Authors: Georgi Georgiev, Dirk Lelinger, Luchesar Iliev, Vedrin Jeliazkov, Miguel Banares, Raquel Portela, Nina Jeliazkova
Publication date: 06 March 2025
DOI: https://doi.org/10.1002/jrs.6789
Language: English
Abstract:
Raman spectra of the same sample can differ between instruments and over time, depending on the spectrometer, the optical path, and the sample environment, among other factors. There is a need to harmonize and standardize characterization by Raman spectroscopy so that end users can share and reuse digital spectroscopic data through FAIR databases. In this context, we present ramanchada2, an open-source, MIT-licensed Python package that collects existing and novel state-of-the-art algorithms and allows users to generate, read, visualize, and process Raman spectra, with special emphasis on instrument calibration. A number of input formats are supported, including those from DFT simulations. The package also offers a tool for generating synthetic spectra from user specifications, for data augmentation and algorithm benchmarking. NeXus is introduced as an input and output format, making it possible to package data and metadata, including data processing information, into a single file using a harmonized structure and terminology, even when they come from multiple experiments of different types, for example, XRD, Raman, and biological assays. To facilitate Raman spectra analysis by end users, we developed Oranchada, an add-on for the open-source data-mining software Orange that serves as a user-friendly wrapper of all ramanchada2 functionalities with predefined harmonization workflows, as well as an online application for calibration.
Keywords: Calibration, Data processing, NeXus, Orange data mining, Python
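
Illustrative sketch: the abstract mentions generating synthetic spectra from user specifications for data augmentation and benchmarking. The snippet below is a minimal, standard-library-only illustration of that general idea, not the ramanchada2 API. The function names (`lorentzian`, `synthetic_spectrum`), the choice of Lorentzian line shapes, the linear baseline, the Gaussian noise model, and all numeric values are assumptions made for this example.

```python
import math
import random


def lorentzian(x, center, fwhm, height):
    """Lorentzian line shape that reaches `height` at `center` (hypothetical helper)."""
    half_width = fwhm / 2.0
    return height * half_width**2 / ((x - center) ** 2 + half_width**2)


def synthetic_spectrum(shifts, peaks, baseline_slope=0.0, noise_sd=0.0, seed=0):
    """Return one intensity per Raman shift in `shifts` (cm^-1).

    `peaks` is a list of (center, fwhm, height) tuples. A linear baseline and
    seeded Gaussian noise are added as simple stand-ins for real distortions.
    """
    rng = random.Random(seed)  # seeded so the same specification is reproducible
    intensities = []
    for x in shifts:
        signal = sum(lorentzian(x, c, w, h) for c, w, h in peaks)
        baseline = baseline_slope * (x - shifts[0])
        noise = rng.gauss(0.0, noise_sd) if noise_sd > 0 else 0.0
        intensities.append(signal + baseline + noise)
    return intensities


# Illustrative specification only: two peaks on a 200-2000 cm^-1 axis, 1 cm^-1 steps.
shifts = [200.0 + i for i in range(1801)]
peaks = [(520.0, 8.0, 1000.0), (1332.0, 6.0, 400.0)]

clean = synthetic_spectrum(shifts, peaks)
noisy = synthetic_spectrum(shifts, peaks, baseline_slope=0.05, noise_sd=5.0, seed=42)
```

Because the clean and noisy versions are produced from the same peak specification, the known peak positions can serve as ground truth when comparing peak-finding or calibration routines.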