Genome annotation

Our core genome informatics project is the PEDANT Genome Database, which has been maintained since 1997 and currently contains over 3000 fully sequenced genomes (joint project with the Institute for Bioinformatics and Systems Biology and Biomax Informatics). PEDANT was used to support several genome sequencing projects (Saccharomyces cerevisiae, Arabidopsis thaliana, Neurospora crassa, Thermoplasma acidophilum, Chlamydia, and Kuenenia stuttgartiensis). We proposed novel algorithms to predict genes in bacterial genomes and to improve annotation quality by both positive and negative rule mining.

Structural bioinformatics

In the past we developed algorithms for secondary structure prediction and assignment as well as for threading. More recent work involved prediction of various experimental properties of proteins, including crystallisation propensity and solubility. A novel predictor of solvent accessibility is under active development. In general we are interested in applying machine learning methods to derive various structural features of proteins from their amino acid sequences.

One of our main directions of work is the investigation of the structural diversity of membrane proteins and the development of new methods to predict and classify their structure. In particular, our group developed the first algorithms to predict residue contacts and helix interaction patterns in membrane proteins based on correlated mutations and neural networks. We maintain a comprehensive database called CAMPS (Computational Analysis of Membrane Protein Space), designed for studies of structure-function relationships in membrane proteins. Current activities focus on clustering membrane protein folds based on helix interaction diagrams and on investigating the impact of genetic variation (alternative splicing, mutations) on membrane protein structure.
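As a rough illustration of the idea behind correlated mutations, the sketch below scores pairs of alignment columns by mutual information, so that positions that change together across homologous sequences rank highest. It is a generic textbook-style stand-in, not the group's published predictor, and it leaves out the neural network step entirely. The function names, the `min_separation` parameter, and the toy alignment are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def column_entropy(column):
    """Shannon entropy (in bits) of the symbols in one alignment column."""
    counts = Counter(column)
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(col_i, col_j):
    """Mutual information between two columns: H(i) + H(j) - H(i, j)."""
    pairs = list(zip(col_i, col_j))
    return column_entropy(col_i) + column_entropy(col_j) - column_entropy(pairs)


def correlated_positions(alignment, min_separation=1):
    """Rank all column pairs of an alignment by mutual information.

    `alignment` is a list of equal-length aligned sequences (assumed input
    format); `min_separation` is a hypothetical filter on sequence distance.
    """
    length = len(alignment[0])
    columns = [[seq[k] for seq in alignment] for k in range(length)]
    scores = []
    for i, j in combinations(range(length), 2):
        if j - i >= min_separation:
            scores.append((mutual_information(columns[i], columns[j]), i, j))
    return sorted(scores, reverse=True)


# Toy alignment (invented): columns 0 and 1 vary together, column 3 varies
# independently of them, and column 2 is fully conserved.
toy_alignment = ["AKLE", "AKLD", "GRLE", "GRLD"]
top_score, pos_a, pos_b = correlated_positions(toy_alignment)[0]
print(f"Strongest covariation: columns {pos_a} and {pos_b} ({top_score:.2f} bits)")
```

In practice such raw scores are biased by phylogeny and by small alignment depth, so a real predictor would need corrections and additional features beyond this sketch.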

Another recent interest is the investigation of sequence-structure relationships in mRNA molecules.

Protein and domain interactions

We are interested in developing better methods to predict protein and domain interactions from genomic data. The group maintains a comprehensive and up-to-date analysis of domain interaction networks. One problem in predicting protein interactions is the absence of reliable negative data. In collaboration with the group of Dr. Ruepp (Helmholtz Zentrum München) we created a database of non-interacting proteins (Negatome) based on three-dimensional structures of protein complexes as well as on careful manual annotation of the scientific literature. We also contributed to the development of the MIPS mammalian protein interaction database and conducted a survey of mammalian protein complex organization. We are currently working on methods to classify protein interactions according to their type and to incorporate the strength and directionality of interactions into interaction networks.

Computational proteomics

We provided computational support for abundance profiling of the Escherichia coli cytosol, identification of GroEL substrates, and proteome-wide analysis of chaperonin-dependent protein folding in Escherichia coli (joint projects with the departments of Prof. Mann and Prof. Hartl at MPI for Biochemistry). We currently analyze the structural environments of phosphorylation sites and their influence on phosphorylation dynamics during the cell cycle (with the Mann department). We also investigate small cell lung cancer based on epigenomic, expression, and phosphoproteomics data, as well as phosphorylation patterns at different developmental stages of swine (joint work with Prof. Küster, TU Munich). There is an active collaboration with Prof. Pevzner (UCSD) and Dr. Payne (PNL) on proteogenomic annotation of cleavable N-terminal targeting signals, and with the group of Dr. Kolker (Seattle Children's) on predicting the detectability of proteins by proteomics.

Viral bioinformatics

This new direction of work involves investigating evolutionary constraints acting on viral RNA structures and determining expression levels of viral genes (joint work with Dr. Shneider, CureLab Inc.), as well as mutation patterns during longitudinal evolution (joint work with Dr. Hoffmann, Institute of Virology, TU Munich).