Spectral denoising: electronic denoising
The electronic_denoising function removes obvious electronic noise ions from MS/MS spectra. Such noise usually appears as multiple ions with identical intensities ("grass noise").
Empirical testing on the NIST23 database shows that a spectrum is extremely unlikely (< 0.05%) to contain more than 4 ions with identical intensities. The electronic_denoising function therefore removes ions whose intensity is shared by more than 4 ions in the spectrum.
Here’s an example:
import numpy as np
import spectral_denoising as sd
from spectral_denoising.noise import *
peak = np.array([[69.071, 7.917962], [86.066, 1.021589], [86.0969, 100.0]], dtype=np.float32)
pmz = 91
# generate 50 random noise ions and add them to the spectrum
noise = generate_noise(pmz, lamda=10, n=50)
peak_with_noise = add_noise(peak, noise)
# remove the electronic noise from the noisy spectrum
peak_denoised = sd.electronic_denoising(peak_with_noise)
print(f'Entropy similarity of spectra with noise: {sd.entropy_similairty(peak_with_noise, peak, pmz):.2f}.')
print(f'Entropy similarity of spectra after denoising: {sd.entropy_similairty(peak_denoised, peak, pmz):.2f}.')
The output will be similar to the following (the noise is random, so exact values vary):
Entropy similarity of spectra with noise: 0.37.
Entropy similarity of spectra after denoising: 1.00.
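The removal rule described above can also be sketched in plain NumPy. This is only an illustration of the idea, not the library's implementation: the function name remove_repeated_intensity_ions, the max_repeats parameter, and the use of exact equality to decide that two intensities are identical are all assumptions made for this sketch.

```python
import numpy as np

def remove_repeated_intensity_ions(msms, max_repeats=4):
    """Drop ions whose intensity value occurs more than max_repeats times.

    Hypothetical helper: exact-equality matching and the max_repeats cutoff
    are assumptions drawn from the prose description, not the library's code.
    """
    msms = np.asarray(msms, dtype=float)
    if msms.size == 0:
        return np.nan
    # Count how many ions share each distinct intensity value.
    _, inverse, counts = np.unique(
        msms[:, 1], return_inverse=True, return_counts=True
    )
    keep = counts[inverse.reshape(-1)] <= max_repeats
    if not keep.any():
        return np.nan
    cleaned = msms[keep]
    # Return the surviving peaks sorted by m/z.
    return cleaned[np.argsort(cleaned[:, 0])]

spectrum = np.array([
    [69.071, 7.9], [86.066, 1.0], [86.097, 100.0],   # fragment ions
    [20.1, 3.0], [33.4, 3.0], [41.7, 3.0],           # five ions sharing one
    [55.2, 3.0], [77.9, 3.0],                        # intensity: grass noise
])
cleaned = remove_repeated_intensity_ions(spectrum)
print(cleaned)
```

In this toy spectrum, five ions share the intensity 3.0, so all five are dropped and only the three fragment ions remain.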
References
- spectral_denoising.electronic_denoising(msms)
Perform electronic denoising on a given mass spectrometry (MS/MS) spectrum. This function processes the input MS/MS spectrum by sorting the peaks based on their intensity, and then iteratively selects and confirms peaks based on a specified intensity threshold. The confirmed peaks are then packed and sorted before being returned.
- Parameters:
msms (np.ndarray): The spectrum as an array of peaks. In each peak, the first item is always m/z and the second item is intensity.
- Returns:
np.ndarray: The cleaned spectrum with electronic noise removed. Returns np.nan if no ions are present.
- spectral_denoising.noise.generate_noise(pmz, lamda, n=100)
Generate synthetic electronic noise for spectral data.
- Parameters:
pmz (float): The upper bound for the mass range.
lamda (float): The lambda parameter for the Poisson distribution, which serves as both the mean and the variance of the distribution.
n (int, optional): The number of random noise ions to generate. Defaults to 100.
- Returns:
np.array: A synthetic spectrum with electronic noise.
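As a rough illustration of what such a generator might look like, the sketch below draws Poisson-distributed intensities with NumPy. It is not the library's code: the function name sketch_generate_noise, the seed argument, and the choice to draw m/z values uniformly between 0 and pmz are assumptions, and any rescaling of the intensities to a relative range is left out because the reference above does not describe it.

```python
import numpy as np

def sketch_generate_noise(pmz, lamda, n=100, seed=None):
    """Hypothetical stand-in for generate_noise, for illustration only."""
    rng = np.random.default_rng(seed)
    # Assumption: noise m/z values are spread uniformly below pmz.
    mz = rng.uniform(0.0, pmz, size=n)
    # Poisson draws are integer counts with mean lamda.
    intensity = rng.poisson(lam=lamda, size=n).astype(float)
    noise = np.column_stack((mz, intensity))
    # Sort the synthetic peaks by m/z.
    return noise[np.argsort(noise[:, 0])]

noise_sketch = sketch_generate_noise(pmz=91, lamda=10, n=50, seed=0)
print(noise_sketch.shape)
```

Because Poisson draws are integers, many of the generated ions tend to share the same intensity, which resembles the repeated-intensity pattern that electronic_denoising targets.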
- spectral_denoising.noise.generate_chemical_noise(pmz, lamda, polarity, formula_db, n=100)
Generate chemical noise for a given mass-to-charge ratio (m/z) and other parameters. The m/z of the chemical noise is taken from a database of all true possible mass values. Details about this database can be found in the paper: LibGen: Generating High Quality Spectral Libraries of Natural Products for EAD-, UVPD-, and HCD-High Resolution Mass Spectrometers.
- Args:
pmz (float): The target mass-to-charge ratio (m/z) value.
lamda (float): The lambda parameter for the Poisson distribution used to generate intensities, which serves as both the mean and the variance of the distribution.
polarity (str): The polarity of the adduct, either ‘+’ or ‘-’.
formula_db (pandas.DataFrame): A DataFrame containing a column ‘mass’ with possible mass values.
n (int, optional): The number of noise peaks to generate. Default is 100.
- Returns:
np.array: A synthetic spectrum with chemical noise.
- Raises:
ValueError: If the polarity is not ‘+’ or ‘-’.
- spectral_denoising.noise.add_noise(msms, noise)
Add noise to a mass spectrum and process the resulting spectrum. This function takes a mass spectrum and a noise spectrum, standardizes the mass spectrum, adds the noise to it, normalizes the resulting spectrum, and sorts it.
- Args:
msms (np.ndarray): The mass spectrum to which noise will be added.
noise (np.ndarray): The noise spectrum to be added to the mass spectrum.
- Returns:
np.ndarray: The processed mass spectrum after adding noise, normalization, and sorting.
- Notes:
The noise spectrum is generated with intensity as a relative measure (from 0 to 1).
Thus, the mass spectrum is first standardized using the standardize_spectrum function so that both are on the same scale.
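The standardize, add, normalize, and sort steps described for add_noise can be sketched as follows. This is a minimal approximation under stated assumptions, not the library's implementation: sketch_add_noise is a hypothetical name, standardization is assumed to mean scaling the base peak to 1, and normalization is assumed to mean scaling all intensities to sum to 1.

```python
import numpy as np

def sketch_add_noise(msms, noise):
    """Hypothetical stand-in for add_noise, for illustration only."""
    msms = np.asarray(msms, dtype=float)
    noise = np.asarray(noise, dtype=float)
    # Assumption: standardizing scales the most intense peak to 1,
    # putting the spectrum on the same relative scale as the noise.
    standardized = msms.copy()
    standardized[:, 1] /= standardized[:, 1].max()
    # Add the noise ions as extra peaks.
    combined = np.vstack((standardized, noise))
    # Assumption: normalizing makes the intensities sum to 1.
    combined[:, 1] /= combined[:, 1].sum()
    # Sort the combined spectrum by m/z.
    return combined[np.argsort(combined[:, 0])]

clean = np.array([[69.071, 7.917962], [86.066, 1.021589], [86.0969, 100.0]])
toy_noise = np.array([[20.0, 0.1], [50.0, 0.1]])
noisy = sketch_add_noise(clean, toy_noise)
print(noisy)
```

Without the standardization step, a spectrum with a base peak of 100 would dwarf noise intensities expressed on a 0 to 1 scale, and the added noise would have almost no effect.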