Ecology and genomics of Bacillus subtilis
Abstract
Bacillus subtilis is a remarkably diverse bacterial species – capable of growth within diverse environments including the gastrointestinal tracts of animals. Microarray-based comparative genomic analyses have revealed that members of this species also exhibit considerable genome diversity. The identification of strain-specific genes might explain how B. subtilis has become so broadly adapted. This goal – to identify ecologically adaptive genes – may soon be realized with the imminent release of several new B. subtilis genome sequences. As we embark upon this exciting new era of B. subtilis comparative genomics we review what is currently known about this species’ ecology and evolution.
Where do we find Bacillus subtilis?
Bacillus subtilis can be isolated from myriad environments – terrestrial and aquatic – making it appear as though this species is ubiquitous and broadly adapted to grow in diverse settings within the biosphere. However, like all members of the genus Bacillus, B. subtilis is capable of forming highly resistant dormant endospores in response to nutrient deprivation and other environmental stresses 1. These spores are easily made airborne and dispersed by wind 2, 3. Thus, spores might migrate long distances, land in a given environment but never germinate there. Considering that the traditional methods for isolating B. subtilis require that the organism be in its spore form, there is no guarantee that when a strain is isolated from a particular environment it was actually growing at that location. Thus, to date, the question of where B. subtilis grows remains largely unanswered.
Does B. subtilis actually grow in soil or is this a place where spores accumulate until they are once again presented with conditions propitious for their germination and proliferation? The use of fluorescent antibodies to distinguish vegetative and spore forms of B. subtilis in diverse soil samples 4 revealed that the organism was most often in its vegetative form when associated with decaying organic material 5. Further support for the idea that B. subtilis can lead a saprophytic lifestyle comes from experiments in which spores were inoculated into artificial soil microcosms saturated with filter-sterilized soluble organic matter extracted from soil 6. Under these conditions the spores not only germinated, but the vegetative cells also proliferated for several days until they once again sporulated, likely in response to nutrient depletion. Soon after germination the cells formed bundled chains that moved on the surface in a flagella-independent fashion 6. Interestingly, a similar transition to growth as bundled chains is observed during the early stages of biofilm development under laboratory conditions 7 (Box 1).
B. subtilis can also grow in close association with plant root surfaces. In the laboratory, when B. subtilis was inoculated onto the roots of Arabidopsis thaliana, biofilm growth was observed 8, 9. In addition, B. subtilis can be isolated, in greater numbers than most other spore-forming bacteria, from the rhizosphere of a variety of plants 10–13. There is evidence that through these associations B. subtilis may promote plant growth 13. Possible explanations for this growth promotion are that: (i) B. subtilis out-competes other microbes that would otherwise adversely affect the plant, (ii) B. subtilis activates the host defense system so that the plant is poised to resist potential pathogens, and (iii) B. subtilis makes certain nutrients more readily available to the plant (e.g. phosphorus and nitrogen) 14.
Considering that B. subtilis is found on and around plants and that many animals consume plants, it is no wonder that this bacterium is often found in feces 15–17. Passage of B. subtilis through animal gastrointestinal (GI) tracts may not be without effects; the idea that B. subtilis plays an active role within the GI tract has had anecdotal support for years. In fact, B. subtilis has been touted as a probiotic that when ingested has “beneficial” effects, likely by helping to maintain or restore “healthy” bacterial communities in the body 18. B. subtilis is also found in several commercially available fermented food products, including soybeans fermented with B. subtilis natto, a food that is popular in Japan and has long been thought to confer health benefits 19. But, as with its role in plant growth promotion, just how B. subtilis imparts its probiotic effects is not clear.
Work from recent years has transformed our view of what B. subtilis can do within the GI tract of animals. In the past, B. subtilis was thought to be an obligate aerobe that simply transited through the mostly anaerobic GI tract as a spore. Therefore, any benefit conferred by its consumption was thought to be due to some intrinsic property of the spore. Recent evidence, however, suggests that B. subtilis can complete its entire lifecycle within the GI tract, going from spore to vegetative cell and sporulating again 16–18, 20, 21. In fact, growth within the GI tract appears to be robust enough that orally administered B. subtilis can out-compete pathogens like E. coli in poultry GI tracts 22.
In summary, current data suggest that B. subtilis’s apparent ubiquity is not solely a consequence of spore persistence in these environments. Instead B. subtilis appears to grow in diverse environments including soils, on plant roots, and within the GI tract of animals.
What can genomics teach us about B. subtilis ecology?
Today we find ourselves in a golden age of genomics thanks to increasingly facile methods for generating, assembling, and analyzing large amounts of sequence information 23. We no longer need to rely solely on isolation geography, behaviors in the laboratory, or anecdotal reports to gather a picture of a species’ ecology. In addition, we can investigate the genes present or absent in any strain of interest. The identity of the proteins predicted to be encoded in an organism’s genome can reveal much about that organism’s lifestyle and the habitats where it resides.
The genome sequence of B. subtilis 168 has provided many insights into the lifestyles of the organism 24. Consistent with the view that the bacterium is not a pathogen, no genes coding for known virulence factors were found. Interestingly, the genome encoded numerous pathways for the utilization of plant-derived molecules, bolstering the idea that this species associates intimately with plants 24. One observation challenged the long-held belief that B. subtilis was an obligate aerobe; genes encoding a putative respiratory nitrate reductase were found 24. This suggested that B. subtilis should be able to grow anaerobically using nitrate instead of oxygen as an electron acceptor. Anaerobic growth of B. subtilis in the presence of nitrate has since been demonstrated experimentally 25. The discovery that B. subtilis can indeed grow anaerobically further supports the idea that vegetative life within the mostly anaerobic GI tract of animals is feasible.
The genome sequence also revealed that B. subtilis has dedicated a relatively large portion of its genome (~4%) to making secondary metabolites. Some of these compounds are potent inhibitors of fungi and bacteria and likely allow B. subtilis to compete in the natural environment 14, 15, 26, promote plant growth, and serve as a probiotic.
The limitations of genome sequence from a single laboratory strain
The genome of B. subtilis 168 was chosen for sequencing because the laboratory strain had been the workhorse for molecular genetic studies for several decades. That very strength meant that it had been grown under artificial settings for many generations. As a consequence, the B. subtilis 168 strain had evolved in ways that improved fitness in the laboratory, a process that is referred to as domestication 7. But this domestication came at a cost. We now recognize that B. subtilis 168 is deficient in a number of traits that are characteristic of wild strains. Among these are surface swarming and the ability to form architecturally complex biofilms (Box 1 Figure 1) 7, 27. Conversely, B. subtilis 168 produces a much higher proportion of cells in the state of genetic competence than do wild strains.
At the same time that investigators began to recognize strain domestication as a common laboratory phenomenon, the genomic era delivered a surprise. In some cases the genomes of different strains of a single species were highly conserved, while in others the genetic variability was enormous. In fact, in the case of Escherichia coli, even though different strains possess identical 16S rRNA gene sequences, strains can harbor more than one thousand strain-specific genes 28. There seems to be a trend that the extent of the differences in gene content observed within a given species correlates with certain features of that species’ ecology. Bacterial species with little genome variability appear to occupy few habitats, while those with more genomic diversity among strains appear to colonize diverse environments.
Where does B. subtilis lie in the spectrum of genomic diversity? Does the genome of B. subtilis 168 tell the full tale of this species’ biology and ecology? Is there genomic variation among members of this species? And, if so, could this variation explain differences in strain ecology?
Foreshadowing B. subtilis genomic diversity
For many years most of the available evidence concerning genotypic variation among different B. subtilis isolates came from the assessment of phenotypic variation, principally strain-to-strain variation in the ability to make various antibiotics 26, 29. It wasn’t until the 1990s that loci other than 16S rRNA genes were examined among multiple strains 30–32. These studies revealed that B. subtilis was not nearly as genetically monomorphic as its pathogenic relative, B. anthracis 33. One such survey used restriction fragment length polymorphisms (RFLP) of three housekeeping genes as markers for genetic diversity among strains isolated from geographically distant locations 30. The results revealed that these strains were clearly phylogenetically separate from other recognized species of the genus Bacillus, but they themselves fell into two distinct phylogenetic groups 30. This robust phylogenetic separation called into question the assignment of B. subtilis as a single species. In other words, did the strains from both phylogenetic clusters belong to the species B. subtilis or was there enough variation to reclassify one of these groups as a distinct species within the genus Bacillus? Using “classical” methods for bacterial species assignments 34, including DNA re-association analysis, it was concluded that the two groups exhibited sufficient “relatedness” to be kept within the same species, but were different enough to warrant subspecies classification 35. Thus, strains of B. subtilis were divided into subspecies B. subtilis subsp. subtilis, containing the sequenced strain B. subtilis 168, and B. subtilis subsp. spizizenii 35.
Analyses involving DNA-reassociation kinetics also gave indications that there was more genetic diversity among members of this species than what was found by nucleotide variation at conserved sites 35. The results suggested that a large percentage of the DNA of each strain’s genome was strain-specific. However, the identities of these strain-specific regions were entirely unknown. Could the identities of genes within these variable regions focus and/or expand our view of B. subtilis’s ecology?
Microarray-based comparative genomic hybridization analyses
Ideally, to begin to answer the foregoing question one would seek to identify and compare all of the genes harbored by each strain. But even though whole genome sequencing has become an increasingly feasible option for such an analysis, it is still not a quick or inexpensive undertaking. However, the available B. subtilis 168 genome sequence did provide an opportunity to explore genome variation among strains at much lower cost. Using an oligonucleotide microarray designed to represent each of B. subtilis 168’s predicted coding sequences, it was possible to query closely related strains for variation in each of B. subtilis 168’s genes. This technique, called microarray-based comparative genomic hybridization (M-CGH), is essentially a DNA re-association method that provides more detailed information about which genes are contributing to lowered re-association values. DNA from strains that either lack or possess a divergent copy of a B. subtilis 168 gene will not hybridize to that gene-specific oligonucleotide as well as DNA from B. subtilis 168 does. The relative hybridization of a strain’s DNA can be assessed by measuring variation in fluorescence intensity at each gene spot when the genomes of B. subtilis 168 and the test strain are differentially labeled with fluorescent nucleotides.
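The per-gene comparison described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the analysis pipeline used in the cited study: the function name call_divergent, the gene names, the intensity values, and the log2 cutoff of -1.0 are all assumptions chosen for the example.

```python
import math

def call_divergent(ref_intensity, test_intensity, log2_cutoff=-1.0):
    """Flag genes whose test/reference log2 signal ratio falls at or below a cutoff.

    ref_intensity and test_intensity map gene name -> fluorescence intensity
    for the reference (B. subtilis 168) and test strain at the same spot.
    The cutoff is a hypothetical value, not one taken from the source.
    """
    divergent = set()
    for gene, ref in ref_intensity.items():
        test = test_intensity[gene]
        # A strongly negative log2 ratio means the test strain's DNA
        # hybridized poorly: the gene is absent or divergent.
        if math.log2(test / ref) <= log2_cutoff:
            divergent.add(gene)
    return divergent

# Toy intensities (invented for illustration only).
ref = {"geneA": 1000.0, "geneB": 800.0, "geneC": 1200.0}
test = {"geneA": 950.0, "geneB": 100.0, "geneC": 300.0}
print(sorted(call_divergent(ref, test)))  # ['geneB', 'geneC']
```

Note that a low ratio alone cannot distinguish a missing gene from a sequence-divergent one, which is why the text groups the two outcomes together.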
Such analyses were performed using a collection of diverse strains from both subspecies groups 36. The results from this study revealed that 30% of B. subtilis 168’s predicted coding sequences were cumulatively absent or divergent in the strains tested 36. Not surprisingly, strains that were more closely related to B. subtilis 168 (within the subtilis subspecies) exhibited less total gene diversity relative to those in the other subspecies, consistent with the RFLP and DNA re-association data.
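The word “cumulatively” matters here: the figure counts a reference gene once if it is absent or divergent in any of the tested strains, so it is the union across strains rather than a per-strain average. A minimal sketch of that arithmetic follows; the function name, strain labels, and gene sets are hypothetical toy values, not data from the study.

```python
def cumulative_fraction(divergent_by_strain, n_reference_genes):
    """Fraction of reference genes absent or divergent in at least one strain."""
    # Union of the per-strain sets: each gene is counted only once.
    union = set().union(*divergent_by_strain.values())
    return len(union) / n_reference_genes

# Toy example: 10 reference genes, two test strains (invented values).
calls = {
    "strain1": {"g1", "g2"},
    "strain2": {"g2", "g3"},
}
print(cumulative_fraction(calls, 10))  # 0.3
```

In this toy case each strain differs from the reference at only 20% of genes, yet the cumulative value is 30%, which shows how the cumulative figure can exceed what any single strain displays.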
Where is genome diversity localized? To answer this question, knowledge of the extent of synteny among strains is needed. While there was only one B. subtilis genome sequence available, a high degree of synteny among B. subtilis strains is to be expected given the observed synteny between the B. subtilis 168 and the recently published B. licheniformis ATCC 14580 genome sequences 37, 38. Assuming that synteny among B. subtilis strains is high, it seems that genomic diversity within this species is not localized to only a few areas of the genome. Rather, it is distributed along the entire genome. In summary, based on the M-CGH analyses there are very few large stretches of genomic DNA that do not have some possibility of variation.
M-CGH analysis reveals regions of variability among wild strains of B. subtilis
Within these distributed regions of diversity were some genes that, given previous phenotypic and biochemical observations, came as no surprise. These included genes involved in the synthesis of secondary metabolites 26, 39 and teichoic acid 40, and in the adaptive response to alkylation DNA damage 41. The M-CGH analysis revealed that there was also variability in nearly all “functional” categories of genes, some of which could prove ecologically relevant by changing (expanding or limiting) the environments in which these strains can live. Divergence was observed in genes involved in the uptake and breakdown of carbohydrates and amino acids (e.g. xylose and glutamine) as well as genes encoding a number of cell surface-associated proteins – including those involved in environmental sensing 36. The observed variability among these loci, and others like them, suggests that certain metabolic and/or environmental-monitoring capabilities may not be required for B. subtilis’ life in all environments.
It is equally informative to determine which genes exhibit limited or no variability. Presumably these highly conserved loci would encode proteins that are selected for in all environments inhabited by the species. As expected, nearly all of the genes that had been previously shown to be essential under laboratory conditions in B. subtilis 168 42 were invariable among the B. subtilis strains examined 36. Also, a very large fraction of the sporulation genes were conserved. This is not surprising given that all of the strains from the M-CGH study were originally isolated as spores. It is interesting to note, however, that many of B. subtilis 168’s germination genes exhibited divergence. This suggests that the cues for reinitiating growth may not be the same in all environments.
Genes involved in biofilm formation were highly conserved 36. Life within matrix-associated multi-cellular communities appears to be a universally important ecological trait for this species. However, there are reports that there is some strain-to-strain variation among conserved loci that can affect the outcome of this developmental process 43, 44. This strain-to-strain variation was not detected in the M-CGH analyses because it involved minor sequence changes in conserved genes and regulatory regions. The noted allelic variation may thus have been the consequence of laboratory domestication alone and not necessarily reflective of variation among wild isolates.
As referred to above, B. subtilis is also noted for its ability to become naturally competent for transformation, i.e. the ability to take up and recombine extra-cellular DNA into its genome 45. The M-CGH analyses revealed that all of the competence machinery identified in B. subtilis 168 was highly conserved except for one operon. The three-gene comPQX operon is involved in the synthesis, processing, and recognition of an extra-cellular signal that is required for the initiation of competence 46. comPQX had been previously recognized as variable among strains of B. subtilis 47–50. It was further demonstrated that the observed genetic variation also resulted in functional variation such that different strains produced and recognized different variants of the extra-cellular signal to the exclusion of others 47–50. Considering that competence signal recognition is a population-density-dependent phenomenon, B. subtilis strains likely become competent only when their own numbers are high. This would suggest that genetic transfer via transformation would occur most often with DNA from “self”.
What are the drivers of diversity/evolution in this species?
How does genome diversity come about? Mutagens as well as DNA replication and repair errors can introduce mutations into a genome. If a mutation is neutral or confers an advantage for life in a given environment, that mutation may become fixed within a population and eventually come to predominate. Though this mechanism for genetic change unquestionably occurs in nature, it is not the primary driver of evolution among bacterial species 51. Instead, horizontal gene transfer (HGT), via transduction, conjugation or transformation, is thought to play the most important role in this process 51. Consistent with this notion, the B. subtilis 168 genome sequence revealed that a large portion of this strain’s genome might have arisen by HGT 52, 53. And, perhaps not surprisingly, many of the divergent genes (~40%) among the strains examined by M-CGH were located in these regions 36.
Among the genes predicted to have been horizontally transferred, many are clustered and bear the hallmarks of phage integration, suggesting that they were gained by transduction. A recent study reported that phage integration could account for as much as 16% of the predicted HGT regions in B. subtilis 168’s genome 53. This suggests that, as in many other bacterial species examined to date 51, 54, phages are playing a role in the evolution of this species. Whether phages are actually shaping the ecology of B. subtilis via the introduction of novel loci that could be used to explore or reside within different environments is yet to be conclusively determined. However, the presence within these phage elements of genes involved in antibiotic synthesis and detoxification strongly suggests that they could serve such a purpose.
Plasmids mediate gene transfer via conjugation and thus play a role in bacterial evolution 51. A survey of plasmid diversity among natural isolates of B. subtilis estimated that only ~10% of strains harbor these extra-chromosomal elements 32. All of the plasmids identified appeared to be highly homologous, probably all sharing the same basic replicon 32. This is different from the account of plasmid diversity in other bacterial species, such as E. coli 55, 56. There is no evidence to suggest that B. subtilis plasmids confer any benefit, perhaps explaining their low occurrence among natural B. subtilis populations as well as their genetic homogeneity 32. Although there is a report of conjugation between B. subtilis strains in soil microcosms 57, it seems unlikely that conjugation is an important driver in the evolution of this species.
Finally, it appears that transformation may indeed help drive the evolution of B. subtilis. Under laboratory conditions, strains can take up and recombine exogenously added genomic DNA from relatives 58. This can occur even between subspecies, although the number of recombinants goes down as relatedness decreases, a phenomenon termed sexual isolation 58. Also, early experiments using sterilized soil microcosms monitored what happened when differentially “marked” variants of strains were mixed 59. Genetic exchange was observed even between different species, e.g. B. subtilis and B. licheniformis 60. However, the results observed were likely biased by the choice of strains, as both laboratory strains used are known to be much more highly transformable than wild strains. The “hybrid species” recombinants were also unstable, suggesting that the results may not be relevant to what is occurring in nature. It does appear, however, that wild populations of B. subtilis do indeed recombine their genes in nature 30. How this exchange is mediated – by transformation, transduction or conjugation – is yet to be determined.
In summary, B. subtilis is a widely adapted bacterial species, capable of growing within myriad environments including soil, plant roots and the GI tracts of animals. The B. subtilis 168 genome sequence has been an important tool in aiding our understanding of how growth within some of these environments is possible. It is now clear, however, that the B. subtilis 168 genome does not tell the entire story. M-CGH analyses have revealed great variability among the genes of different members of the species.
Though intriguing, the results of M-CGH give us an incomplete picture. For instance, M-CGH cannot forecast what, if any, genes are present within regions of divergence that are not already found within the B. subtilis 168 genome. However, we are poised to answer this question. Whole genome sequences from select representatives of both B. subtilis subspecies will soon be available, revealing the identities of genes within these regions of divergence. It will be interesting to see whether some of these genes prove to be ecologically significant and whether they broaden our view of this organism’s habitat and the adaptations it has acquired to propagate in diverse environments. Ultimately, the new genome sequences, in conjunction with the M-CGH data, will likely prove powerful in increasing our understanding of B. subtilis’ ecology and evolution.
Acknowledgments
Work in our laboratories on B. subtilis biofilms and genome diversity is funded by grants from the National Institutes of Health GM18568 to R.L. and GM58213 to R.K. A.M.E. was the recipient of a postdoctoral fellowship from the National Institutes of Health (GM072393).