Bioinformatics
In order to facilitate standardized storage, management and retrieval of data of mass spectrometry based proteomics experiments and for a better presentation and comprehensive interpretation of such data, we are developing a suite of software tools.
Generally, after analysis by mass spectrometry, the data are processed by specialized software (Mascot, GPM, PROWL, etc.) to determine which peptides (and, from those, which proteins) are present in the sample and what their modification status is. For this step, we use a dedicated server running Mascot, which searches publicly available protein databases. The results of the identification process are stored in a relational database (PostgreSQL) for easy retrieval. This database also contains daily updated reference data from NCBI (Entrez), EBI (IPI), UniProt, FlyBase, and several other sources. In addition, relevant information about the experimental setup is stored.
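As a rough illustration of how identification results and experiment information might sit side by side in a relational database, here is a minimal sketch. It is not our actual schema: the table names (experiment, protein_hit), the column names, the accessions and the scores are all made up for the example, and it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs without a server.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not the layout of the real PostgreSQL database.
SCHEMA = """
CREATE TABLE experiment (
    id          INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    setup_notes TEXT
);
CREATE TABLE protein_hit (
    id            INTEGER PRIMARY KEY,
    experiment_id INTEGER NOT NULL REFERENCES experiment(id),
    accession     TEXT NOT NULL,
    score         REAL NOT NULL,
    peptide_count INTEGER NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database and load a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO experiment (id, name, setup_notes) VALUES (?, ?, ?)",
        (1, "demo pull-down", "made-up example"),
    )
    conn.executemany(
        "INSERT INTO protein_hit (experiment_id, accession, score, peptide_count) "
        "VALUES (?, ?, ?, ?)",
        [(1, "ACC_A", 250.0, 12), (1, "ACC_B", 35.5, 1), (1, "ACC_C", 120.0, 5)],
    )
    conn.commit()
    return conn


def hits_for_experiment(conn, experiment_id, min_peptides=2):
    """Return (accession, score, peptide_count) rows, best score first."""
    rows = conn.execute(
        "SELECT accession, score, peptide_count FROM protein_hit "
        "WHERE experiment_id = ? AND peptide_count >= ? "
        "ORDER BY score DESC",
        (experiment_id, min_peptides),
    )
    return rows.fetchall()
```

Calling hits_for_experiment(build_demo_db(), 1) retrieves the hits of the demo experiment that are supported by at least two peptides. Keeping the experiment description in its own table means every hit can be traced back to the setup that produced it.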
We focus on providing comprehensive biological annotation for the large sets of proteins that result from the mass spectrometry analyses. The output protein lists contain all relevant information provided by the search engine (scoring, numbers of peptides identified, sequences, etc.). GO terms and links to all major protein databases and protein interaction databases are added to these lists automatically, as are homologous proteins from different species.
The software can compare multiple datasets using advanced searching, sorting and filtering options, so you can see which proteins overlap between datasets and which are specific to your sample. It is also possible to compare your results with those of other researchers (subject to mutual agreement) who are doing comparable research or working on similar experimental systems. Contact us if you have specific requirements concerning the database search process or data presentation, such as using customized protein sequence databases or adding links to a database of your interest.
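The overlap and specificity comparison described above comes down to set operations on protein identifiers. The following is a minimal sketch of that idea, not the code of our tools: the function name compare_datasets and the dataset names and accessions in the usage note are hypothetical.

```python
def compare_datasets(datasets):
    """Compare protein lists from several datasets.

    datasets: dict mapping a dataset name to an iterable of protein accessions.
    Returns (shared, specific), where shared is the set of accessions found in
    every dataset and specific maps each dataset name to the accessions found
    in that dataset only.
    """
    sets = {name: set(accessions) for name, accessions in datasets.items()}
    if not sets:
        return set(), {}

    # Proteins identified in all datasets.
    shared = set.intersection(*sets.values())

    # Proteins identified in one dataset and in none of the others.
    specific = {}
    for name, accessions in sets.items():
        others = set().union(*(s for n, s in sets.items() if n != name))
        specific[name] = accessions - others
    return shared, specific
```

For example, compare_datasets({"sample": ["P1", "P2", "P3"], "control": ["P2", "P3", "P4"]}) reports P2 and P3 as shared, P1 as specific to the sample and P4 as specific to the control.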
Output data are currently presented in spreadsheet files, but in the near future we will move to a web-based interface so that the results can be explored more interactively by querying the database directly. We plan to make these software tools available to other researchers as an open-source package.
For quantitative proteomics experiments, we use MaxQuant (Mann Lab), the Matrix Science Quantitation Toolbox and Thermo's Proteome Discoverer 1.2.