Bacterial regulatory networks are extremely flexible in evolution.
Irma Lozada-Chávez, Sarath Chandra Janga & Julio Collado-Vides
Program of Computational Genomics, Center for Genomic Sciences-UNAM, Apdo. Postal 565-A, Cuernavaca, Morelos, 62100 Mexico.
Abstract
Over millions of years, the structure and complexity of the transcriptional regulatory network (TRN) in bacteria have changed and reorganized extensively, enabling bacteria to adapt to almost every environmental niche on earth. To understand the enormous plasticity of bacterial TRNs, we studied the conservation of the currently known TRNs of the two model organisms Escherichia coli K12 and Bacillus subtilis across complete genomes from Bacteria, Archaea and Eukarya at three different levels: individual components of the TRN, pairs of interactions, and regulons. We found that transcription factors (TFs) evolve much faster than their target genes (TGs) across phyla. We show that global regulators are poorly conserved across the phylogenetic spectrum, and hence that transcription factors could be the major players responsible for the plasticity and evolvability of TRNs. We also found that only a small fraction of transcriptional regulatory interactions is significantly conserved among different bacterial phyla, and that there is no constraint on the elements of an interaction to co-evolve. Finally, our results suggest that the majority of regulons in bacteria are lost rapidly with increasing phylogenetic distance, implying a high degree of flexibility in the TRNs. We hypothesize that, during the divergence of bacteria, certain essential cellular processes, such as the synthesis of arginine, biotin and ribose, the transport of amino acids and iron, the availability of phosphate, replication and the SOS response, have remained well conserved. From our comparative analysis, it is possible to infer that the complexity and structure of transcriptional regulatory networks have been an important factor in phenotypic speciation.
Sections
1) Data sets of Regulons used in the entire analysis from Escherichia coli and Bacillus subtilis.
2) Predictions of Regulons in the complete set of 204 genomes used in the entire analysis from the perspective of Escherichia coli and Bacillus subtilis using the regulog approach described in the paper.
3) High-quality versions of all the Figures described in the manuscript.
6) Tables showing the function of different TFs from E. coli and B. subtilis analysed in this work.
1. Data sets of regulons used in the entire analysis from Escherichia coli and Bacillus subtilis. The files are tab-delimited: the first column gives the TF, the second column gives the number of genes it regulates, and the third column lists all genes regulated by this TF, separated by spaces.
2. Predictions of regulons in the complete set of 204 genomes used in the entire analysis can be downloaded from here for those predicted from the perspective of Escherichia coli K12, and from here for those predicted from the perspective of Bacillus subtilis. The format of the files is the same as above.
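As a reading aid for the file format described above, the following is a minimal sketch of a parser, assuming exactly three tab-separated columns per line (TF, gene count, space-separated gene list). The function name `parse_regulon_lines` and the two example lines are illustrative only and are not taken from the downloadable files.

```python
from typing import Dict, Iterable, List


def parse_regulon_lines(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Parse lines of the form: TF <tab> gene_count <tab> gene1 gene2 ...

    Returns a mapping from TF name to the list of genes it regulates.
    """
    regulons: Dict[str, List[str]] = {}
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 3:
            raise ValueError(f"expected 3 tab-separated columns: {line!r}")
        tf = fields[0].strip()
        declared = int(fields[1])
        genes = fields[2].split()  # genes are separated by spaces
        if declared != len(genes):
            # The second column should equal the length of the gene list.
            raise ValueError(f"{tf}: declared {declared} genes, found {len(genes)}")
        regulons[tf] = genes
    return regulons


# Made-up example lines in the described format (not real file content).
example = ["AraC\t3\taraB araA araD\n", "LexA\t2\trecA uvrA\n"]
regulons = parse_regulon_lines(example)
```

To read one of the downloaded files, an open file handle can be passed directly, for example `parse_regulon_lines(open(path))`, where `path` is wherever the file was saved.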
3. All the Figures described in the manuscript can be downloaded in high-quality PDF format from here as a tarball, or viewed individually by clicking on the respective figures below.
Figure 1(a) showing the "Conservation of the components of the TRN (TFs and TGs) across the three domains of life for Escherichia coli K12 along with the distribution of different phyla grouped as shown in the supplementary material of the manuscript". Note that the colors on the X-axis show the grouping of phyla as shown in the supplementary material.
Figure 1(b) showing the "Conservation of the components of the TRN (TFs and TGs) across the three domains of life for Bacillus subtilis along with the distribution of different phyla grouped as shown in the supplementary material of the manuscript"
Figure 2(a) showing the "Conservation of Global Regulators (GR) and their regulons across genomes for Escherichia coli K12"
Figure 2(b) showing the "Conservation of Global Regulators (GR) and their regulons across genomes for Bacillus subtilis"
Figure 3 showing the "Classification of TF-TG pairs into three different categories a) TFs and TGs co-evolve b) TF is evolutionarily more conserved than TG and c) TF is less conserved than TG" along with an example explaining the calculation of Distance.
Figure 4(a) showing the "Conservation of regulons across genomes clustered by the extent of TRN and regulon conservation for Escherichia coli K12"
Figure 4(b) showing the "Conservation of regulons across genomes clustered by the extent of TRN and regulon conservation for Bacillus subtilis"
Transposed versions of Figure 2a, Figure 2b and Figure 3, which show a vertical display of the conservation of global regulators and an example of the criteria for co-occurrence patterns.
4. Figures showing that the distributions of the quantities (TF = TG, TF > TG and TF < TG) in the randomized networks (based on 1000 random networks) are normal can be downloaded from here for both of the genomes analysed (note that the Figures in the tarball are in PostScript format); alternatively, the same figures can be viewed in PDF format for E. coli and B. subtilis. In each case we found that the distribution is well approximated by a normal distribution, and hence tests for normally distributed data were employed. Increasing the number of randomisations does not change the results appreciably. Below is a table showing the results based on a comparison against 10,000 randomly generated networks.
| Escherichia coli K12 | | | Bacillus subtilis | | |
| Category | Interactions | Z-score (P-value) | Category | Interactions | Z-score (P-value) |
| TF = TG | 15 | 4.96 (< 0.0001) | TF = TG | 15 | 2.94 (0.0033) |
| TF > TG | 813 | -5.28 (< 0.0001) | TF > TG | 363 | 3.30 (0.00097) |
| TF < TG | 759 | 3.84 (0.000123) | TF < TG | 349 | -3.61 (0.00031) |
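The comparison against randomized networks can be sketched as follows. This is an illustration under stated assumptions, not the authors' script: it assumes the Z-score is the observed count minus the mean of the random counts, divided by their sample standard deviation, and that the P-value is two-tailed under the normal approximation (the tabulated P-values appear consistent with this reading). The function names and the toy counts are hypothetical.

```python
import math
import statistics
from typing import Sequence


def z_score(observed: float, random_counts: Sequence[float]) -> float:
    """Z-score of an observed count relative to counts from random networks."""
    mean = statistics.mean(random_counts)
    sd = statistics.stdev(random_counts)  # sample standard deviation
    return (observed - mean) / sd


def two_tailed_p(z: float) -> float:
    """Two-tailed P-value of a Z-score under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Toy example: an observed count of 15 against three made-up random counts.
toy_z = z_score(15, [8, 10, 12])

# Converting two Z-scores from the table above into two-tailed P-values.
p_equal = two_tailed_p(2.94)
p_greater = two_tailed_p(3.30)
```

In practice `random_counts` would hold one count per randomized network (1000 or 10,000 values) for a given category such as TF = TG.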
5. A comparison of different distance metrics for the clustering of regulons shown in Figure 4 shows that the clustering is not particularly sensitive to the choice of distance metric in either E. coli or B. subtilis (click to see the figures). Please note that k-means clustering was used for every metric in this comparison.
Clustering of regulons across 110 species. For each TF in Escherichia coli and Bacillus subtilis, we calculated the percentage of total interactions conserved in its regulon across genomes. To represent this distribution, we clustered the data by the extent of TRN and regulon conservation using the Centroid Linkage Clustering method with Uncentered Correlation as the distance metric, as implemented in the Cluster program (de Hoon et al., 2004). The clustered data represent 118 regulons in E. coli and 93 regulons in B. subtilis conserved across genomes [see Figures 4a and 4b].
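The two quantities used in this step can be written out as a short sketch. It assumes that the percentage conserved is the number of conserved interactions in a genome divided by the regulon size in the reference organism, and that the uncentered correlation distance is one minus the uncentered correlation (equivalent to one minus the cosine similarity), which is how the Cluster program documents it. The function names and numbers are illustrative only.

```python
import math
from typing import List, Sequence


def percent_conserved(regulon_size: int, conserved_counts: Sequence[int]) -> List[float]:
    """Percentage of a regulon's interactions conserved in each genome."""
    return [100.0 * c / regulon_size for c in conserved_counts]


def uncentered_correlation_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """1 - uncentered correlation between two conservation profiles."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("profile with all zeros has no defined correlation")
    return 1.0 - dot / (norm_x * norm_y)


# A hypothetical regulon of 4 interactions, with 4, 2 and 0 conserved
# interactions in three genomes.
profile = percent_conserved(4, [4, 2, 0])

# Two regulons whose profiles are proportional are at distance 0.
d_same_shape = uncentered_correlation_distance([50, 100, 0], [25, 50, 0])
```

Because the measure is uncentered, two regulons with the same pattern of conservation across genomes but different overall levels are treated as close.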
Comparison of distance metrics. We used k-means clustering to evaluate other distance metrics, namely the Euclidean and Kendall's Tau distances, and found that they did not differ significantly from Uncentered Correlation in their ability to group lineages and regulons. The comparison starts from a collection of items (regulons and species: 118 regulons in E. coli, 93 regulons in B. subtilis, and 110 genomes in both cases), which we partition into 10 clusters (k1) for regulons and 5 clusters (k2) for species, searching for the best solution over 100 runs. The sum of distances within the clusters is used to compare different clustering solutions, and for each distance metric the solution with the smallest within-cluster sum is saved. We then compare the items in each cluster (k1 clusters for regulons and k2 clusters for species) generated with the Uncentered Correlation distance against the items in each cluster generated with the Euclidean and Kendall's Tau distances. Finally, for each cluster we calculate the proportion of items that the latter two metrics share with the first. Because the results are not sensitive to this parameter, we chose to keep the Uncentered Correlation distance metric, which is widely used for clustering biological data sets.
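The final step, the proportion of items shared per cluster, could be computed along the following lines. The text does not say how clusters from two solutions are paired, so this sketch assumes each reference cluster is matched to the cluster in the other solution with which it shares the most items; that pairing rule, the function name and the toy assignments are all assumptions made for illustration.

```python
from typing import Dict, Hashable, Set


def shared_proportions(
    reference: Dict[str, Hashable], other: Dict[str, Hashable]
) -> Dict[Hashable, float]:
    """For each reference cluster, the fraction of its items that fall
    together in the best-matching cluster of the other solution."""

    def group(assignment: Dict[str, Hashable]) -> Dict[Hashable, Set[str]]:
        clusters: Dict[Hashable, Set[str]] = {}
        for item, label in assignment.items():
            clusters.setdefault(label, set()).add(item)
        return clusters

    ref_clusters = group(reference)
    other_clusters = group(other)
    result: Dict[Hashable, float] = {}
    for label, members in ref_clusters.items():
        # Assumed pairing rule: take the cluster with the largest overlap.
        best_overlap = max(len(members & o) for o in other_clusters.values())
        result[label] = best_overlap / len(members)
    return result


# Toy cluster assignments for four hypothetical regulons under two metrics.
uncentered = {"r1": 0, "r2": 0, "r3": 1, "r4": 1}
euclidean = {"r1": "a", "r2": "a", "r3": "a", "r4": "b"}
proportions = shared_proportions(uncentered, euclidean)
```

A proportion close to 1 for every cluster indicates that the alternative metric reproduces the grouping obtained with the Uncentered Correlation distance.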
a) Escherichia coli Transcription Factors
b) Bacillus subtilis Transcription Factors
For Questions/Comments, please mail: sarath AT ccg.unam.mx or ilozada AT ccg.unam.mx