Enhancing Spectrum Clarity: A Neural Network Approach for Signal-Noise Discrimination in DIA MS2 Data

Supervision

Viktoria Dorfer
FHOOE, 1st Supervisor
Lukas Käll
KTH, 2nd Supervisor

Objectives

Harness state-of-the-art neural network architectures to improve the distinction between signal and noise peaks in MS2 spectra. Develop a robust training pipeline using DIA data to accurately classify MS2 peaks as either signals or noise. Enhance spectrum cleaning, enabling more efficient peptide identification by reducing noise from diverse sources.

Methodology

Use recurrent neural network components such as RNNs, LSTMs, and GRUs to analyze mass spectrum data. Exploit how the intensities of individual peptides vary during elution to differentiate noise from signal peaks in MS2 spectra. Apply this understanding to improve spectrum cleaning before peptide identification.
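
As a rough illustration of the elution-based idea, the following minimal sketch builds an intensity trace for each fragment m/z across consecutive MS2 scans of one DIA window. Traces of this kind are the sort of sequence a recurrent model could take as input. Everything here is an assumption made for illustration: the function names, the 10 ppm tolerance, the toy scans, and the correlation score, which is only a simple stand-in heuristic and not the neural network proposed in this project.

```python
from bisect import bisect_left
from math import sqrt


def extract_trace(scans, target_mz, tol_ppm=10.0):
    """Follow one fragment m/z across consecutive scans.

    `scans` is a list of (sorted m/z list, intensity list) pairs.
    Returns one intensity per scan, 0.0 where no peak lies within
    the (hypothetical) ppm tolerance.
    """
    tol = target_mz * tol_ppm * 1e-6
    trace = []
    for mzs, intensities in scans:
        i = bisect_left(mzs, target_mz)
        best_intensity, best_delta = 0.0, tol
        # Only the neighbours around the insertion point can be closest.
        for j in (i - 1, i):
            if 0 <= j < len(mzs):
                delta = abs(mzs[j] - target_mz)
                if delta <= best_delta:
                    best_intensity, best_delta = intensities[j], delta
        trace.append(best_intensity)
    return trace


def pearson(a, b):
    """Pearson correlation of two equally long traces (0.0 if one is flat)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


# Toy data: five consecutive MS2 scans of the same isolation window.
# Two fragments rise and fall together; a third appears sporadically.
scans = [
    ([300.15, 450.20, 612.33], [10.0, 5.0, 40.0]),
    ([300.15, 450.20], [40.0, 22.0]),
    ([300.15, 450.20, 612.33], [90.0, 48.0, 3.0]),
    ([300.15, 450.20], [45.0, 20.0]),
    ([300.15, 450.20, 612.33], [12.0, 6.0, 35.0]),
]

reference = extract_trace(scans, 300.15)
coeluting = extract_trace(scans, 450.20)
sporadic = extract_trace(scans, 612.33)

# Co-eluting fragments correlate strongly; the sporadic peak does not.
coeluting_score = pearson(reference, coeluting)
sporadic_score = pearson(reference, sporadic)
```

In this toy case the co-eluting fragment follows the reference elution profile closely while the sporadic peak does not. A trained sequence model would be expected to learn such patterns from the traces directly rather than rely on a fixed correlation threshold.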

Required skills

The candidate should have a solid background in machine learning and deep learning, with experience in time-series data analysis and signal processing. Proficiency in Python is required, along with strong analytical and problem-solving skills.

Expected Results

Integration of a spectrum cleaning model for signal and noise classification. Customized peptide identification algorithms for enhanced accuracy. A machine learning algorithm that supports re-training and fine-tuning with lab-specific data. Tailored identification of measurement artifacts.

Planned Secondments

Host: KTH (L. Käll); Duration: 1 Month; When: Year 1; Goal: Integration of pretrained DIA models for MS2 peak classification.

Host: SANGER (B. Lehner); Duration: 1 Month; When: Year 2; Goal: Learn how to add interpretability to the model.

Host: BIOCEV (P. Novák); Duration: 1 Month; When: Year 3; Goal: Networking & application to HDX/FPOP data.

Enrolment in doctoral programs

UNIVERSITAT LINZ
