Assessing the Precision of High-Throughput Computational and Laboratory Approaches for the Genome-Wide Identification of Protein Subcellular Localization in Bacteria

Peer reviewed: Yes, item is peer reviewed.
Final version published as: BMC Genomics 2005, 6:162 doi:10.1186/1471-2164-6-162

Abstract

Background: Identification of a bacterial protein's subcellular localization (SCL) is important for genome annotation, function prediction and drug or vaccine target identification. Subcellular fractionation techniques combined with recent proteomics technology permit the identification of large numbers of proteins from distinct bacterial compartments. However, the fractionation of a complex structure like the cell into several subcellular compartments is not a trivial task. Contamination from other compartments may occur, and some proteins may reside in multiple localizations. New computational methods have been reported over the past few years that now permit much more accurate, genome-wide analysis of the SCL of protein sequences deduced from genomes. There is a need to compare such computational methods with laboratory proteomics approaches to identify the most effective current approach for genome-wide localization characterization and annotation.

Results: In this study, ten subcellular proteome analyses of bacterial compartments were reviewed. PSORTb version 2.0 was used to computationally predict the localization of proteins reported in these publications, and these computational predictions were then compared to the localizations determined by the proteomics studies. By using a combined approach, we were able to identify a number of contaminants and proteins with dual localizations, and were able to more accurately identify membrane subproteomes. Our results allowed us to estimate the precision level of laboratory subproteome studies, and we show here that, on average, recent high-precision computational methods such as PSORTb now have a lower error rate than laboratory methods.

Conclusion: We have performed the first focused comparison of genome-wide proteomic and computational methods for subcellular localization identification, and show that computational methods have now attained a level of precision that exceeds that of high-throughput laboratory approaches. We note that analysis of all cellular fractions collectively is required to effectively provide localization information from laboratory studies, and we propose an overall approach to genome-wide subcellular localization characterization that capitalizes on the complementary nature of current laboratory and computational methods.
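
The comparison described in the Results can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `compare_localizations`, the protein identifiers, and the localization labels are hypothetical placeholders, and the toy data are invented purely to show the bookkeeping. The sketch assumes each protein reported in a fraction has at most one predicted localization and treats a missing prediction as "Unknown".

```python
from collections import Counter

def compare_localizations(reported, predicted):
    """Compare the fraction each protein was reported in with its predicted localization.

    reported:  dict mapping protein ID -> compartment of the fraction it was found in
    predicted: dict mapping protein ID -> predicted localization (may omit proteins)
    Returns (counts, flagged, agreement), where flagged lists the disagreements.
    """
    counts = Counter()
    flagged = []
    for protein_id, fraction in reported.items():
        # Assumption: a protein with no prediction is treated as "Unknown".
        pred = predicted.get(protein_id, "Unknown")
        if pred == "Unknown":
            counts["no_prediction"] += 1
        elif pred == fraction:
            counts["agree"] += 1
        else:
            counts["disagree"] += 1
            flagged.append((protein_id, fraction, pred))
    compared = counts["agree"] + counts["disagree"]
    # Agreement is computed only over proteins that received a prediction.
    agreement = counts["agree"] / compared if compared else None
    return counts, flagged, agreement

# Hypothetical toy data, not taken from any of the reviewed studies.
reported = {
    "protA": "OuterMembrane",
    "protB": "OuterMembrane",
    "protC": "OuterMembrane",
    "protD": "OuterMembrane",
}
predicted = {
    "protA": "OuterMembrane",
    "protB": "Cytoplasmic",
    "protC": "Unknown",
}

counts, flagged, agreement = compare_localizations(reported, predicted)
print(counts, flagged, agreement)
```

A flagged protein is only a candidate for closer inspection: the disagreement may reflect contamination of the fraction, a genuine dual localization, or a prediction error, so each case would still need to be checked against other evidence.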
