AI models for in silico proteome prediction

Supervision

Eduard Sabidó
CRG, 1st Supervisor
Mathias Wilhelm
TUM, 2nd Supervisor

Objectives

Exploration of the multidimensional space of the proteome to evaluate its constraints and relationships. Employment of artificial intelligence to recreate complete proteomes in silico and predict protein detectability. Prioritization of peptides for identification and quantification in mass spectrometry-based proteomics experiments. Identification of unexpected mutations and post-translational modifications in proteomic data.

Methodology

Re-analyse publicly available human datasets of plasma, cell, and tissue proteomes. Assess protein detectability, abundance distributions, constraints, and dependencies. Build a multidimensional map of protein abundance to predict protein presence in samples. Generate in silico complete proteomes using artificial intelligence. Analyse experimental quantitative data to predict and confirm peptide mutations. Anticipate the presence of post-translational modifications.
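As a purely illustrative sketch of the detectability assessment (not the project's actual pipeline), protein detectability could be approximated as the fraction of re-analysed samples in which a protein is identified. The function name `detection_frequency`, the input layout, and the sample names are assumptions made for this example; the accessions are used only as toy identifiers, and the values are made up.

```python
from collections import Counter


def detection_frequency(samples):
    """Return, for each protein, the fraction of samples in which it was detected.

    `samples` maps a sample name to the set of protein identifiers
    detected in that sample (hypothetical input layout).
    """
    if not samples:
        return {}
    counts = Counter()
    for detected in samples.values():
        counts.update(set(detected))  # count each protein once per sample
    n_samples = len(samples)
    return {protein: count / n_samples for protein, count in counts.items()}


# Toy, made-up input: three hypothetical samples.
toy_samples = {
    "plasma_1": {"P02768", "P01024"},
    "cell_1": {"P02768", "P04406"},
    "tissue_1": {"P02768", "P04406", "P01024"},
}
frequencies = detection_frequency(toy_samples)
```

A per-protein frequency of this kind is only a starting point. The multidimensional map described above would additionally need abundance values and sample metadata.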

Required skills

Programming skills. Knowledge of machine learning algorithms, bioinformatics, proteomics and molecular biology.

Expected Results

Artificial intelligence models capable of predicting complete proteomes, simplifying the identification and quantification of proteins, mutations, and post-translational modifications in biomedical samples through mass spectrometry analysis.

Planned secondments

Host: TUM (M. Wilhelm), Duration: 2 Months; When: Year 1, Goal: Learn AI-prediction models for peptide features.

Host: CNRS-IPBS (D. Bouyssié), Duration: 1 Month; When: Year 2, Goal: Application of real-time acquisition methods in proteome analysis.

Host: UCAM (K. Lilley), Duration: 1 Month; When: Year 3, Goal: Learn subcellular proteomics techniques.

Enrolment in doctoral programs

PhD in Biomedicine from Universitat Pompeu Fabra.

References

1 Elhamraoui Z, Borràs E, Wilhelm M, Sabidó E. Theoretical Assessment of Indistinguishable Peptides in Mass Spectrometry-Based Proteomics. Anal Chem. 2024 Oct 8;96(40):15829-15833. doi: 10.1021/acs.analchem.4c02803.

2 Chiva C, Elhamraoui Z, Solé A, Serret M, Wilhelm M, Sabidó E. Assessment and Prediction of Human Proteotypic Peptide Stability for Proteomics Quantification. Anal Chem. 2023 Sep 19;95(37):13746-13749. doi: 10.1021/acs.analchem.3c02269.

3 Borràs E, Pastor O, Sabidó E. Use of Linear Ion Traps in Data-Independent Acquisition Methods Benefits Low-Input Proteomics. Anal Chem. 2021 Aug 31;93(34):11649-11653. doi: 10.1021/acs.analchem.1c01885.

4 Olivella R, Chiva C, Serret M, Mancera D, Cozzuto L, Hermoso A, Borràs E, Espadas G, Morales J, Pastor O, Solé A, Ponomarenko J, Sabidó E. QCloud2: An Improved Cloud-based Quality-Control System for Mass-Spectrometry-based Proteomics Laboratories. J Proteome Res. 2021 Apr 2;20(4):2010-2013. doi: 10.1021/acs.jproteome.0c00853.