TRANSPATH® Report 1, 0001 (2003)
 
Application of automatic Pfam annotation to TRANSPATH®

Philip Stegmaier, TRANSPATH_Team

BIOBASE GmbH
Halchtersche Strasse 33
D-38304 Wolfenbuettel, Germany
 
Protein sequences from TRANSPATH® release 3.3 were searched with sequence family models from Pfam release 7.5 [1].

A subset of the Pfam-A database was derived by parsing the swisspfam file of the current Pfam release, available from the Pfam FTP site, and extracting the relevant models from the database. TRANSPATH® proteins were searched with this subset, and sequence family hits are documented using a general E-value cut-off of 0.5. Two overlapping matches are both retained if the overlap covers no more than 10% of the length of each of the affected hits; otherwise, only the match with the lower E-value is kept. HMMER 2.2g [2][3] was used both to extract the models from the Pfam-A database and to search the TRANSPATH® proteins with the derived Pfam subset.
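
The cut-off and overlap rules described above can be illustrated with a minimal Python sketch. It is not the code used to produce the annotation. The `Hit` record, the function names, the inclusive treatment of the 0.5 cut-off, and the greedy processing order (best E-value first) are assumptions made for illustration, and the example hits are invented values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    family: str    # Pfam family name
    start: int     # first matched residue (1-based, inclusive)
    end: int       # last matched residue (inclusive)
    score: float   # raw score
    evalue: float  # E-value

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def overlap_length(a: Hit, b: Hit) -> int:
    """Number of residues shared by two hits on the same protein."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def conflicts(a: Hit, b: Hit, max_fraction: float = 0.10) -> bool:
    """True if the overlap exceeds max_fraction of the length of either hit."""
    shared = overlap_length(a, b)
    return shared > max_fraction * a.length or shared > max_fraction * b.length


def select_hits(hits, evalue_cutoff=0.5, max_fraction=0.10):
    """Apply the E-value cut-off, then resolve conflicting overlaps.

    Assumption: hits are visited from lowest to highest E-value, so that
    of two conflicting hits the one with the lower E-value is kept.
    """
    candidates = sorted(
        (h for h in hits if h.evalue <= evalue_cutoff),
        key=lambda h: h.evalue,
    )
    kept = []
    for hit in candidates:
        if not any(conflicts(hit, other, max_fraction) for other in kept):
            kept.append(hit)
    return sorted(kept, key=lambda h: h.start)


# Invented example hits on one hypothetical protein.
example = [
    Hit("SH2", 10, 109, 85.0, 1e-20),
    Hit("SH3", 105, 204, 40.0, 1e-8),   # overlaps SH2 by 5 residues: tolerated
    Hit("Pkinase", 50, 149, 5.0, 0.3),  # overlaps SH2 by 60 residues: dropped
    Hit("PH", 300, 400, -3.0, 0.9),     # E-value above the cut-off: dropped
]
print([h.family for h in select_hits(example)])  # ['SH2', 'SH3']
```

In this sketch a hit is dropped as soon as it conflicts with any already accepted hit; other orders of resolving chains of overlapping matches are conceivable and may give different results.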

For each Pfam family shown in the match display of a TRANSPATH® molecule, the raw score and the E-value are given. The raw score is the score of the alignment between the matched region and the model; it can be negative and increases with the quality of a match. The E-value is the estimated significance of a hit; it is always greater than or equal to zero and decreases as the quality of a match increases. Empirically, matches whose E-values, as computed by the applied software, are equal to or below 0.1 can be considered significant even if the raw score is negative. Nevertheless, true matches can have much higher E-values.
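
As a reading aid, the two thresholds mentioned in this report (0.1 for empirical significance, 0.5 for documentation) can be combined in a small Python sketch. The function name and the category labels are hypothetical and are not part of the TRANSPATH® display; treating both thresholds as inclusive is an assumption for the 0.5 cut-off.

```python
def classify_hit(evalue: float,
                 significant: float = 0.1,
                 cutoff: float = 0.5) -> str:
    """Sort a hit into a category by its E-value alone.

    The raw score is deliberately ignored: a negative raw score does not
    prevent a hit from being considered significant.
    """
    if evalue < 0:
        raise ValueError("E-values cannot be negative")
    if evalue <= significant:
        return "significant"
    if evalue <= cutoff:
        return "documented, borderline"
    return "not documented"


print(classify_hit(0.02))  # significant
print(classify_hit(0.3))   # documented, borderline
print(classify_hit(2.0))   # not documented
```

A hit in the middle category is still shown in the match display and may well be a true match; the label only indicates that its E-value alone does not establish significance.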

At least one Pfam sequence family, out of a total of 410 families, was documented for 2938 of the 3088 TRANSPATH® sequences (about 95%).
 
[1] Bateman A, Birney E, Cerruti L, Durbin R, Etwiller L, Eddy SR, Griffiths-Jones S, Howe KL, Marshall M, Sonnhammer ELL. The Pfam protein families database. Nucleic Acids Res. 30:276-280 (2002). PMID: 11752314

[2] Eddy SR. HMMER: Profile hidden Markov models for biological sequence analysis (http://hmmer.wustl.edu/).

[3] Eddy SR. Profile hidden Markov models. Bioinformatics 14:755-763 (1998). PMID: 9918945

 
Copyright © BIOBASE GmbH 2003. Production by: TRANSPATH®-Team