Computational Predictions of Regulatory Genomic Elements Constituting Regulatory Networks

Context and History

Through the years we have used the knowledge in RegulonDB to implement
computational methods that predict the different elements which together
constitute the regulatory network of a cell. Although we have mostly worked
with bacterial genomes, we have also investigated and contributed to methods
that were initially implemented in yeast and are of wider use, such as the
Regulatory Sequence Analysis Tools (RSA tools) implemented by Jacques van Helden.

We participated in the E. coli
K-12 genome sequence annotation, contributing to what was the first
bacterial genome with comprehensive genomic predictions of operons,
regulatory sites, and promoters (Blattner et al., 1997). We recognize
that the interaction with Fred Blattner
motivated several of us to go beyond the previous grammatical modeling
of regulatory sites (Collado-Vides, 1992; Rosenblueth et al., 1996) and
to become interested in predicting promoters and operons.

We contributed the first paper reporting a computational prediction of regulatory binding sites across a whole bacterial genome (Thieffry et al., 1998). Similarly, the first method to predict the operon organization of every gene in a bacterial genome resulted from work performed in this laboratory (Salgado et al., 2000).

Part of Ernesto Pérez-Rueda's Ph.D. thesis was a comprehensive prediction, based on the helix-turn-helix motif, of transcriptional regulatory factors in the complete E. coli genome (Pérez-Rueda and Collado-Vides, 2000).

The detailed analysis of, and computational support for, the complex problem of sigma 70 promoter prediction in E. coli resulted from Araceli Huerta's Ph.D. thesis (Huerta and Collado-Vides, 2003).

Current predictions in E. coli and beyond

Based on the methods mentioned here, the RegulonDB datasets contain the most up-to-date predictions of operons, promoters, regulatory sites, and transcription factors in E. coli K-12.

Expanding on a strategy that combines gene orthology, operon organization, and binding site conservation (see Tan et al., 2001, for the proof-of-concept publication),
and in collaboration with Cuban and Brazilian colleagues, we have
implemented comprehensive predictions of regulons (the binding sites of
specified regulators) in several gamma-proteobacterial genomes
(González et al., 2005). These predictions can be accessed at TRACTORdb, and their biological analysis can be found in Espinosa et al. (in press).

The Transcriptional Factor Database (TRACTORdb) stores computationally
predicted transcription factor binding sites in eight
gamma-proteobacterial genomes: Escherichia coli K-12, Haemophilus influenzae, Salmonella typhi, Salmonella typhimurium, Shewanella oneidensis, Shigella flexneri, Vibrio cholerae, and Yersinia pestis.
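
The strategy mentioned above, which combines gene orthology, operon organization, and binding site conservation, can be illustrated with a minimal sketch. This is not the published method or the TRACTORdb pipeline: the function name `predict_regulon`, the input structures, and the toy gene names are all hypothetical, and the binding site search itself is assumed to have been done beforehand. The sketch only shows how the three kinds of evidence might be combined: a target-genome operon is assigned to a regulon when it contains an ortholog of a known regulon gene and a conserved site was found upstream of it.

```python
from typing import Dict, List, Set


def predict_regulon(
    reference_regulon: Set[str],
    orthologs: Dict[str, str],
    target_operons: List[List[str]],
    operons_with_site: Set[int],
) -> Set[str]:
    """Toy combination of orthology, operon structure, and site conservation.

    All inputs are hypothetical illustrations:
      reference_regulon  genes known to be regulated in the reference genome
      orthologs          reference gene -> orthologous gene in the target genome
      target_operons     operons of the target genome, each a list of genes
      operons_with_site  indices of target operons with a predicted upstream site
    """
    # Index every target gene by the operon that contains it.
    gene_to_operon = {
        gene: idx
        for idx, operon in enumerate(target_operons)
        for gene in operon
    }

    predicted: Set[str] = set()
    for ref_gene in reference_regulon:
        target_gene = orthologs.get(ref_gene)
        if target_gene is None:
            continue  # no ortholog in the target genome
        idx = gene_to_operon.get(target_gene)
        if idx is None:
            continue  # ortholog not assigned to any operon
        if idx in operons_with_site:
            # Site conserved upstream: the whole operon joins the regulon.
            predicted.update(target_operons[idx])
    return predicted


# Toy example with made-up gene names.
example = predict_regulon(
    reference_regulon={"geneA", "geneB", "geneC"},
    orthologs={"geneA": "t1", "geneB": "t4"},
    target_operons=[["t1", "t2"], ["t3"], ["t4", "t5"]],
    operons_with_site={0},
)
print(sorted(example))
```

In this toy case only the first operon is predicted: `geneC` has no ortholog, and the operon containing `t4` lacks a conserved site. Requiring all three lines of evidence is a deliberately conservative choice for the illustration; a real analysis would weigh them according to its own criteria.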

Building on this experience, we have implemented the database and computational predictions for the R. etli genome sequenced in this Center.


References

Blattner F.R.; Plunkett III G.; Bloch C.A.; Perna N.T.; Burland V.; Riley M.;
Collado-Vides J.; Glasner J.D.; Rode C.K.; Mayhew G.; Gregor J.; Davis
N.W.; Kirkpatrick H.A.; Goeden M.A.; Rose D.J.; Mau B.; Shao Y. (1997)
"The complete genome sequence of Escherichia coli K-12" Science 277: 1453-1462

Collado-Vides J. (1992) "Grammatical model of the regulation of gene expression" Proc.Natl.Acad.Sci.USA 89: 9405-9409

Espinosa V., González A., Vasconcelos A.T., Huerta A., and Collado-Vides J. "Comparative studies of transcriptional regulation mechanisms in a group of eight Gamma-Proteobacterial genomes" J.Mol.Biol. In press.

González A.D., dos Santos M.T., Espinosa V., Vasconcelos A.T.,
Pérez-Rueda E., and Collado-Vides J. (2005) "Computationally predicted
regulons in eight gamma-proteobacterial genomes" Nucleic Acids Res. 33 (Database issue): D98-102

Huerta A.M. and Collado-Vides J. (2003) "Sigma 70 Promoters in Escherichia coli: Specific Transcription in Dense Regions of Promoter-like Signals" J.Mol.Biol. 333: 261-278

Pérez-Rueda E. and Collado-Vides J. (2000) "The Repertoire of DNA-Binding Transcriptional Regulators in Escherichia coli" Nucleic Acids Res. 28: 1838-1847

Rosenblueth D.A., Thieffry D., Huerta A.M., Salgado H., and Collado-Vides J. (1996) "Syntactic recognition of regulatory regions in Escherichia coli" CABIOS 12: 415-422

Salgado H., Moreno-Hagelsieb G., Smith T.F., and Collado-Vides J. (2000) "Operons in Escherichia coli: Genomic Analyses and Predictions" Proc.Natl.Acad.Sci.USA 97: 6652-6657

Tan K., Moreno-Hagelsieb G., Collado-Vides J., and Stormo G.D. (2001) "A Comparative Genomics Approach to Prediction of New Members of Regulons" Genome Res. 11: 566-584

Thieffry D., Salgado H., Huerta A.M., and Collado-Vides J. (1998) "Prediction of transcription regulatory sites in the complete genome of Escherichia coli" Bioinformatics 14: 391-400

Computational Genomics Program