Enhancing Spectrum Clarity: A Neural Network Approach for Signal-Noise Discrimination in DIA MS2 Data
Supervision
Viktoria Dorfer
FHOOE, 1st Supervisor
Lukas Käll
KTH, 2nd Supervisor
Objectives
Harness state-of-the-art neural network architectures to improve the distinction between signal and noise peaks in MS2 spectra. Develop a robust training pipeline that uses DIA data to classify MS2 peaks accurately as either signal or noise. Enhance spectrum cleaning by reducing noise from diverse sources, enabling more efficient peptide identification.
Methodology
Use deep neural network components such as RNNs, LSTMs, and GRUs to analyze mass spectrum data. Exploit the variation in individual peptide intensities during elution to differentiate noise peaks from signal peaks in MS2 spectra. Apply this understanding to improve spectrum cleaning before peptide identification.
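The elution-based idea can be illustrated with a small sketch. It is not the neural network model the project proposes: it is a plain correlation heuristic, shown only to make the underlying intuition concrete. The assumption is that fragment peaks belonging to an eluting peptide rise and fall together across consecutive MS2 scans, while noise peaks do not. The function names, the threshold of 0.8, and the intensity values are all hypothetical.

```python
from math import sqrt
from statistics import median


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def label_peaks(traces, threshold=0.8):
    """Label each fragment trace as 'signal' or 'noise'.

    traces maps a fragment m/z to its intensities over consecutive MS2 scans.
    A trace counts as signal if it correlates with the per-scan median trace,
    which serves here as a crude stand-in for the shared elution profile.
    """
    n_scans = len(next(iter(traces.values())))
    consensus = [median(t[i] for t in traces.values()) for i in range(n_scans)]
    return {
        mz: "signal" if pearson(trace, consensus) >= threshold else "noise"
        for mz, trace in traces.items()
    }


# Made-up intensities over seven consecutive scans (illustrative only).
example_traces = {
    300.1: [1, 5, 20, 50, 20, 5, 1],         # rises and falls with the peptide
    450.2: [2, 10, 40, 100, 40, 10, 2],      # same shape, higher intensity
    612.3: [0, 4, 15, 40, 18, 3, 1],         # same shape, slightly distorted
    175.0: [12, 3, 14, 2, 13, 4, 11],        # erratic, unrelated to elution
    529.9: [30, 28, 31, 29, 30, 31, 29],     # roughly constant background
}

labels = label_peaks(example_traces)
for mz, label in labels.items():
    print(f"{mz:7.1f}  {label}")
```

A recurrent model such as a GRU or LSTM would consume the same kind of per-fragment intensity sequence, but would learn the decision rule from labelled DIA data instead of relying on a fixed correlation threshold.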
Required skills
The candidate should have a solid background in machine learning and deep learning, with experience in time-series data analysis and signal processing. Proficiency in Python is required, along with strong analytical and problem-solving skills.
Expected Results
Integration of a spectrum cleaning model for signal and noise classification. Customized peptide identification algorithms for enhanced accuracy. A machine learning algorithm that supports re-training and fine-tuning with lab-specific data. Tailored identification of measurement artifacts.
Planned Secondments
Host: KTH (L. Käll); Duration: 1 Month; When: Year 1; Goal: Integration of pretrained DIA models for MS2 peak classification.
Host: SANGER (B. Lehner); Duration: 1 Month; When: Year 2; Goal: Learn how to add interpretability to the model.
Host: BIOCEV (P. Novák); Duration: 1 Month; When: Year 3; Goal: Networking & application to HDX/FPOP data.
Enrolment in doctoral programs
UNIVERSITAT LINZ
References
1 Buur, L. M., Declercq, A., Strobl, M. T., Bouwmeester, R., Degroeve, S., Martens, L., Dorfer, V., & Gabriels, R. (2024). MS2Rescore 3.0 is a modular, flexible, and user-friendly platform to boost peptide identifications, as showcased with MS Amanda 3.0. Journal of Proteome Research, 23(8), 3200-3207, 10.1021/acs.jproteome.3c00785
2 Birklbauer, M. J., Matzinger, M., Müller, F., Mechtler, K., & Dorfer, V. (2023). MS Annika 2.0 Identifies Cross-Linked Peptides in MS2–MS3-Based Workflows at High Sensitivity and Specificity. Journal of Proteome Research, 22(9), 3009-3021, 10.1021/acs.jproteome.3c00325
3 Dorl, S., Winkler, S., Mechtler, K., & Dorfer, V. (2023). MS Ana: Improving Sensitivity in Peptide Identification with Spectral Library Search. Journal of Proteome Research, 22(2), 462-470, 10.1021/acs.jproteome.2c00658
4 Dorfer, V., Strobl, M., Winkler, S., & Mechtler, K. (2021). MS Amanda 2.0: Advancements in the standalone implementation. Rapid Communications in Mass Spectrometry, 35(11), e9088, 10.1002/rcm.9088
5 Dorfer, V., Pichler, P., Stranzl, T., Stadlmann, J., Taus, T., Winkler, S., & Mechtler, K. (2014). MS Amanda, a Universal Identification Algorithm Optimized for High Accuracy Tandem Mass Spectra. Journal of Proteome Research, 13(8), 3679-3684, 10.1021/pr500202e