Journal of Clinical Microbiology, October 2006, p. 3742-3751, Vol. 44, No. 10
0095-1137/06/$08.00+0 doi:10.1128/JCM.00618-06
Copyright © 2006, American Society for Microbiology. All Rights Reserved.
Department of Food Science, Cornell University, Ithaca, New York,1 Department of Animal Sciences, Colorado State University, Fort Collins, Colorado,2 Department of Biostatistics and Computational Biology, Cornell University, Ithaca, New York,3 Center for Bioinformatics and Institute of Biology, University of Copenhagen, Copenhagen, Denmark4
Received 22 March 2006/ Returned for modification 30 April 2006/ Accepted 27 May 2006
Listeria monocytogenes, a facultative intracellular pathogen that may cause severe invasive infections in humans and more than 40 species of animal hosts (25, 32), was chosen as a model organism for the development and implementation of a novel two-step statistical approach for the unbiased identification of phylogenetic clades that are significantly associated with different source populations, i.e., humans, animals, and food. L. monocytogenes was chosen as a model since it not only causes human and animal food-borne infections but also is commonly isolated from different environmental sources (e.g., soil, surface water, vegetation, and manure) and from food (7, 8). Consequently, a number of epidemiological studies have previously been performed to identify the L. monocytogenes subtypes and clonal groups that may differ in their ability to cause disease (9, 19, 20, 33). In general, the results of these previous studies supported the finding that L. monocytogenes represents two major genetic lineages (termed lineages I and II) and a third minor lineage (lineage III) that appear to differ in their abilities to cause human disease (3, 9, 21, 33). L. monocytogenes isolates representing lineage I, which includes serotypes 1/2b, 4b, and 3b (17), are more prevalent among human clinical isolates (9, 11, 16, 31), while lineage II isolates, which belong to serotypes 1/2a, 1/2c, and 3a (17), are overrepresented among L. monocytogenes strains isolated from food (9, 31), even though they are regularly isolated from human clinical cases. Lineage III includes isolates belonging to serotypes 4a, 4b, and 4c, and animal clinical isolates are overrepresented in this rare third lineage (11, 17, 31, 33). Modeling of the dose-response relationships for these L. monocytogenes lineages also supported the finding that lineage I strains show higher levels of virulence for humans than lineage II strains (3).
Previous studies (9, 20, 33) performed by use of a tissue culture plaque assay also showed that lineage I isolates, on average, formed larger plaques than lineage II isolates, indicative of an enhanced ability of lineage I isolates to spread intracellularly between host cells.
While previous studies clearly indicate that the genetic lineages within L. monocytogenes differ in their virulence characteristics and their association with different source populations (e.g., human, animal, and food), each L. monocytogenes lineage contains considerable subtype diversity and subtypes that are associated with isolation from specific source populations (9). Further studies are thus needed to probe the evolution of niche adaptation and to define biologically meaningful clonal groups within the major L. monocytogenes genetic lineages. Specifically, there is a clear need to differentiate L. monocytogenes clonal groups that are adapted to infect humans and that may have virulence enhanced over that of clonal groups that are adapted to colonize the environment and that may have limited virulence. We thus used L. monocytogenes as a model organism to (i) develop and implement a novel statistical approach for the unbiased identification of phylogenetic clades that are significantly associated with a particular source population (i.e., humans, animals, or food) and (ii) validate the biological significance of clades associated with specific source populations using tissue culture pathogenicity data and epidemiological information.
Frozen stocks of L. monocytogenes isolates were maintained at −80°C in brain heart infusion (Difco, Detroit, MI) broth with 15% (wt/vol) glycerol.
Mouse cell plaque assay. The in vitro virulence phenotype of all 120 L. monocytogenes isolates was evaluated by a plaque assay with mouse L cells, which was performed essentially as described previously (9, 26, 27). Briefly, L. monocytogenes isolates were grown in brain heart infusion broth overnight (18 h) at 30°C without shaking. Overnight bacterial cultures (1 ml) were pelleted and resuspended in phosphate-buffered saline (PBS; pH 7.4), and 1:10 serial dilutions were performed in PBS. Mouse L cells were grown to confluence in treated flat-bottom tissue culture six-well plates (Corning; Acton, MA), and approximately 1.5 × 10^5 or 4.0 × 10^6 CFU of a given L. monocytogenes isolate was inoculated into one well containing an L-cell monolayer. The lineage II, EcoRI ribotype DUP-1030A, standard laboratory control strain 10403S (1) was included as an internal control in each plaque assay. Plaques were visualized by staining with neutral red (Sigma Chemical, St. Louis, MO), as described previously (9, 26, 27), and images of infected L cells were captured with a digital scanner (Perfection 1650; Epson, Long Beach, CA). The area of approximately 25 plaques, selected to represent each L. monocytogenes isolate, was measured with SigmaScan Pro software (version 5.0; Statistical Solutions, Saugus, MA). The ability of each L. monocytogenes isolate to spread from cell to cell was expressed as the average plaque size for a given L. monocytogenes isolate relative to the average plaque size for 10403S, which was set equal to 100%. Two independent plaque assays were performed for all 120 L. monocytogenes isolates studied here.
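The normalization described above (isolate plaque size expressed relative to the 10403S control, which is set to 100%) can be illustrated with a minimal Python sketch. The function names are hypothetical, and averaging the two independent assays is an assumption, since the text does not say how the replicate assays were combined.

```python
from statistics import mean


def relative_plaque_size(isolate_areas, control_areas):
    """Mean plaque area of an isolate as a percentage of the control mean.

    isolate_areas: plaque areas measured for one isolate in one assay.
    control_areas: plaque areas measured for the control strain (10403S)
                   in the same assay; the control mean defines 100%.
    """
    if not isolate_areas or not control_areas:
        raise ValueError("both area lists must be non-empty")
    return 100.0 * mean(isolate_areas) / mean(control_areas)


def isolate_plaque_size(assays):
    """Average the relative plaque size over independent assays.

    assays: list of (isolate_areas, control_areas) pairs, one per assay.
    Averaging across assays is an assumption made for this sketch.
    """
    return mean(relative_plaque_size(iso, ctl) for iso, ctl in assays)
```

Normalizing within each assay before averaging keeps day-to-day variation in the control from inflating or deflating an isolate's score.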
Phylogenetic analysis. In a previous study (18), the 120 L. monocytogenes isolates described above were assigned sequence types (STs) on the basis of MLST analyses that included partial gene sequences for two key virulence genes (actA and inlA), a stress response gene (sigB), two hypervariable (15) housekeeping genes (purM and ribC), and two slowly evolving (18) housekeeping genes (gap and prs). While we previously reported phylogenetic trees inferred from a single isolate selected to represent each of the 52 unique STs (18), phylogenetic trees based on DNA sequence data for all 120 L. monocytogenes isolates were constructed here to allow the implementation of the novel statistical approach to detection of the same-source clustering described below.
MODELTEST (22) was used to optimize the input parameters to infer maximum-likelihood phylogenetic trees in PAUP* (28), based on partial sequences of two key virulence genes (i.e., actA and inlA) and a concatenated sequence representing two housekeeping genes (gap and prs) and the stress response gene sigB. Heuristic searches were performed by using equal weights for all sites, and the tree-bisection-reconnection branch-swapping algorithm was used. While unrooted maximum-likelihood phylogenies were used as inputs for the TreeStats test (described below), phylogenies were displayed here as rooted phylograms (see Fig. 1A to C). A homologous concatenated gene sequence from Bacillus subtilis (http://genolist.pasteur.fr/SubtiList/) was used as the outgroup for the concatenated housekeeping and stress response gene phylogeny (Fig. 1C), while the three lineage III sequences (i.e., FSL F2-525, FSL F2-655, and FSL F2-695) were defined as the outgroups for the inlA and actA phylogenies (Fig. 1A and B), as described previously (18), because homologous B. subtilis sequences are not available for these genes.
FIG. 1. Maximum-likelihood phylogenetic trees inferred from (A) actA, (B) inlA, and (C) concatenated gap, prs, and sigB sequences for a sample set containing 60 human clinical (red), 30 animal clinical (blue), and 30 food (green) L. monocytogenes isolates. Taxon labels include the Cornell Food Safety Laboratory Culture Collection isolate name (e.g., F2655 represents isolate FSL F2-655), EcoRI ribotype (e.g., 44A represents ribotype DUP-1044A), and source (e.g., human isolate from New York State Department of Health [HS], human isolate from New York City Department of Health and Mental Hygiene [HC], animal isolate [AN], and food isolate [FD]). Isolates in clades that show significant or marginally significant associations with source populations (i.e., humans, animals, and food), as identified by the TreeStats test, are indicated by italics and are marked with a vertical bar; significance levels are indicated by * and ** (which represent P < 0.05 and P < 0.01, respectively) and # (which indicates marginally significant associations; P < 0.10). Clades that are significantly associated with source populations in at least one phylogeny and that contain the same or overlapping isolates are designated with the same letter (A to J) across all three phylogenies (Table 3); the predominant source population for each clade (i.e., human, animal, or food) is also indicated. The mean plaque size (PS) for isolates in each cluster as well as classification into ECs (as described by Kathariou [12]) is also indicated at each significant source-associated clade.
The SourceCluster test was performed by using uncorrected pairwise distance matrices (raw distance) generated with PAUP* software (28) from (i) actA sequences, (ii) inlA sequences, and (iii) a concatenated sequence of two housekeeping and a stress response gene (i.e., gap, prs, and sigB) for the 120 L. monocytogenes isolates described above. The SourceCluster test used the average genetic distance (raw distance) between sequences from the same source (i.e., human [H], animal [A], and food [F] samples) as a test statistic (T) calculated as follows:
[Equation 1: T, the average raw pairwise distance between sequences from the same source population] (1)
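To make the test statistic concrete, the following Python sketch computes an average raw distance over all same-source pairs and assesses it by shuffling source labels. It is only one reading of the description above: pooling all same-source pairs into a single average, the label-permutation null, the one-sided comparison (smaller T means tighter same-source clustering), and the function names are all assumptions rather than the published SourceCluster implementation.

```python
import random
from itertools import combinations


def same_source_mean_distance(dist, labels):
    """Average pairwise distance over pairs of isolates sharing a source.

    dist:   symmetric matrix (list of lists) of raw pairwise distances.
    labels: source label per isolate, e.g. "H", "A", or "F".
    """
    total, n_pairs = 0.0, 0
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            total += dist[i][j]
            n_pairs += 1
    if n_pairs == 0:
        raise ValueError("no same-source pairs")
    return total / n_pairs


def source_cluster_test(dist, labels, n_perm=10000, seed=0):
    """Return (observed T, one-sided permutation P value).

    Assumption: the null distribution is built by shuffling source labels,
    and small T is taken as evidence for same-source clustering.
    """
    rng = random.Random(seed)
    observed = same_source_mean_distance(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if same_source_mean_distance(dist, shuffled) <= observed:
            hits += 1
    # Add-one correction so the estimated P value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

Shuffling labels keeps the number of isolates per source fixed, so the unequal sample sizes (60 human, 30 animal, 30 food) are preserved under the null.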
The TreeStats test was performed by using unrooted maximum-likelihood phylogenetic trees for the 120 L. monocytogenes isolates inferred from (i) actA sequences, (ii) inlA sequences, and (iii) a concatenated sequence of two housekeeping and a stress response gene (i.e., gap, prs, and sigB). In the TreeStats test, for each clade on a given phylogenetic tree, the expected number of isolates belonging to group A, F, and H, under the null hypothesis of no clustering between isolates from the same source population, was calculated by permuting the labeling of the leaf clades 10,000 times. A chi-square goodness-of-fit test was then used to compare the observed to the expected numbers for each clade on the tree. This procedure is implemented in the program TreeStats, which is publicly available at http://www.foodscience.cornell.edu/wiedmann/TreeStats.htm. Clades with P values <0.05 and <0.10 were considered to show statistically significant and marginally significant evidence for clustering among isolates belonging to the same source population, respectively. No correction for multiple testing was performed, since the SourceCluster test was used to first test the null hypothesis of no clustering between isolates from the same source population. Significant and marginally significant P values from TreeStats analysis were superimposed on the maximum-likelihood trees that were constructed and rooted with an appropriate outgroup as described above.
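The per-clade calculation described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the TreeStats program itself: a clade is represented simply as a list of leaf indices, the function names are hypothetical, and the P value is taken from a chi-square distribution with 2 degrees of freedom (three source categories), for which the survival function has the closed form exp(-x/2).

```python
import math
import random

SOURCES = ("H", "A", "F")  # human, animal, food


def expected_counts(labels, clade, n_perm=10000, seed=0):
    """Estimate expected per-source counts in a clade by permuting labels.

    labels: source label for every leaf on the tree.
    clade:  indices of the leaves that belong to the clade under test.
    """
    rng = random.Random(seed)
    shuffled = list(labels)
    totals = {s: 0 for s in SOURCES}
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        for leaf in clade:
            totals[shuffled[leaf]] += 1
    return {s: totals[s] / n_perm for s in SOURCES}


def clade_chi_square(labels, clade, n_perm=10000, seed=0):
    """Chi-square goodness-of-fit of observed vs. expected source counts."""
    observed = {s: sum(1 for leaf in clade if labels[leaf] == s)
                for s in SOURCES}
    expected = expected_counts(labels, clade, n_perm, seed)
    stat = sum((observed[s] - expected[s]) ** 2 / expected[s]
               for s in SOURCES if expected[s] > 0)
    # Assumption: 2 degrees of freedom, so P = exp(-stat / 2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value
```

In a full analysis this function would be applied to every clade on the tree, and clades with P < 0.05 or P < 0.10 would be flagged as significant or marginally significant, respectively.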
Statistical analysis. Chi-square tests of independence or Fisher's exact tests (if appropriate; i.e., if more than 25% of expected values in a table were less than five) were used to probe associations between L. monocytogenes source populations (i.e., human, animal, or food) and L. monocytogenes genotypes (i.e., genetic lineage and EcoRI ribotype). Human and animal clinical isolates were also pooled to create a categorical variable termed "host" to evaluate associations between this source population and L. monocytogenes genotypes. Categorical analyses were performed only for the EcoRI ribotypes that were observed at least four times among the 120 L. monocytogenes isolates studied here.
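The rule for choosing between the two tests can be expressed as a short Python sketch. The function names and return strings are hypothetical; the analyses in this study were run in SAS, so this only illustrates the decision rule stated above.

```python
def expected_table(table):
    """Expected cell counts under independence for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]


def choose_test(table, threshold=5.0, max_fraction=0.25):
    """Pick Fisher's exact test when too many expected counts are small.

    Returns "fisher_exact" if more than max_fraction of the expected cell
    counts fall below threshold, otherwise "chi_square".
    """
    cells = [e for row in expected_table(table) for e in row]
    small = sum(1 for e in cells if e < threshold)
    return "fisher_exact" if small / len(cells) > max_fraction else "chi_square"
```

For example, a table of lineage (rows) by source (columns) with only a handful of isolates in some ribotypes would typically trip the 25% rule and be routed to Fisher's exact test.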
One-way analysis of variance was used to determine the relationship between the measure "plaque size" and the categorical variables "lineage," "source," and "ribotype." Analyses were performed only for ribotypes that were observed at least four times among the 120 isolates. All analyses were performed with Statistical Analysis Systems software (SAS Institute, Cary, NC). While P values of <0.05 were considered statistically significant, P values of <0.10 are also reported and were considered marginally significant.
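For readers unfamiliar with the calculation, a one-way analysis of variance reduces to comparing between-group and within-group variability. The sketch below computes the F statistic and its degrees of freedom in plain Python; the function name is hypothetical, and it omits the P value, which the study obtained from SAS.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric groups.

    groups: one list of measurements (e.g., relative plaque sizes) per
            level of the categorical variable (e.g., per lineage).
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and n > k")
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

A large F indicates that mean plaque size differs between the levels of the variable more than would be expected from within-group scatter alone.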
The combined SourceCluster and TreeStats approach identifies biologically relevant phylogenetic clades, which provide evidence for niche adaptation within L. monocytogenes lineages. Initial categorical analyses showed that the distribution of the 120 L. monocytogenes isolates studied here among source populations, genetic lineages, and EcoRI ribotypes was consistent with that found in previous studies, which were often based on larger L. monocytogenes isolate sets (9, 11, 16, 31). Specifically, L. monocytogenes lineage I was marginally overrepresented (P = 0.07) among isolates from human listeriosis cases, while lineage II was significantly (P = 0.04) overrepresented among food isolates (Table 1). Determination and analyses of mean plaque sizes, a measure for the ability of a given L. monocytogenes strain to spread intracellularly between mammalian host cells, indicated that isolates representing lineage II produced significantly (P < 0.0001) smaller plaques than isolates belonging to lineages I and III and, similarly, that isolates belonging to lineage I formed significantly (P < 0.01) larger plaques than lineage II and III isolates (Table 2), also consistent with previous studies (9, 20, 33). Interestingly, L. monocytogenes isolated from food samples also formed significantly (P = 0.03) smaller plaques than human and animal clinical isolates (Table 2). Since isolates that form larger plaques have accelerated rates of intracellular spread (26), these data further suggest that lineage I isolates are more virulent than those belonging to lineage II and, similarly, that isolates from clinical cases of listeriosis (i.e., both human and animal cases) are typically more virulent than isolates from food, providing preliminary evidence for niche adaptation within L. monocytogenes. Our analyses also support the finding that the collection of 120 L. monocytogenes isolates studied here is representative of the previously reported genetic and phenotypic diversity for this pathogen and is thus appropriate for use for the development and biological validation of a new statistical approach to the identification of the phylogenetic clades associated with specific source populations.
TABLE 1. Distribution of L. monocytogenes molecular subtypes among isolate sources
TABLE 2. Summary of associations between plaque size and categorical variables
Implementation of SourceCluster and TreeStats tests can identify biologically meaningful same-source clades in phylogenetic trees. Because the SourceCluster and TreeStats tests identified the same, biologically validated, source-associated lineages as previous studies (9, 11, 16, 31), this approach has the potential to identify specific clades representing clonal groups that may have common biologically relevant characteristics, consistent with niche adaptation within the major genetic lineages. Specifically, the TreeStats test identified eight, six, and five clades with significant (P < 0.05) source associations in the actA (Fig. 1A), inlA (Fig. 1B), and concatenated housekeeping and stress response gene (Fig. 1C) phylogenies, respectively. In addition, one clade each in the actA and the concatenated housekeeping and stress response gene phylogeny as well as three clades in the inlA phylogeny showed marginal (P < 0.10) evidence for associations with specific source populations. Overall, a total of 10 clades covering the same or overlapping isolates showed significant associations with specific source populations across the three phylogenies analyzed here, and all 10 of these clades were identified independently in at least two phylogenies (Table 3). For example, clade F (Table 3; Fig. 1) was identified as a significant (P < 0.05) source-associated clade in all three phylogenies. Additionally, each of the 10 clades listed in Table 3 was identified as a significant (P < 0.05) source-associated clade in at least one phylogeny. For example, while clade A was identified only as a marginally significant (P < 0.10) source-associated clade in the inlA phylogeny, this clade was significantly (P < 0.05) overrepresented by isolates obtained from human clinical cases in the actA phylogeny (Fig. 1). 
While we appreciate that the source-associated clades identified by TreeStats analysis of the phylogeny inferred from the concatenated housekeeping and stress response gene sequence should be interpreted carefully, since the overall SourceCluster statistic was not significant, the overlap between significant clades identified in the different trees supports the robustness of the TreeStats approach, particularly when it is applied to pathogens with a highly clonal population structure, such as L. monocytogenes (18).
TABLE 3. Description of significant same-source isolate clusters in L. monocytogenes phylogenies inferred from key virulence genes (actA and inlA) and a concatenated housekeeping gene sequence composed of gap, prs, and sigB
Overall, a total of three clades were significantly associated with isolation from animal clinical cases, including one clade within lineage I (clade B) and two clades within lineage II (clades E and H) (Table 3). Interestingly, isolates in clade B predominantly represent ribotype DUP-1042B, which, on average, shows an enhanced ability to spread intracellularly between host cells (Table 2). Our data suggest that DUP-1042B isolates represent two distinct clonal groups that may have adapted to infect specific host species. This observation is further supported by the fact that DUP-1042B isolates in the human- and animal-associated clades (clades C and B, respectively) represent distinct serotypes, with isolates in clades C and B belonging to serotypes 4b and 1/2b, respectively, consistent with the fact that most human listeriosis epidemics are caused by serotype 4b strains (14). While isolates in clades E and H do not show any apparent phenotypic or epidemiological characteristics that explain their association with animal clinical cases, further characterization of isolates belonging to these clades may provide new insights into the evolution of host specificity in L. monocytogenes.
Overall, a total of four clades were significantly associated with isolation from contaminated food, including two clades within lineage I (clades D and J) and two clades belonging to lineage II (clades F and I) (Table 3). Interestingly, isolates in clade F belong to three L. monocytogenes ribotypes (i.e., DUP-1062A, DUP-1039C, and DUP-1046B) previously shown to carry nonsense mutations that lead to premature stop codons in the virulence gene inlA, express a truncated and secreted form of InlA, and demonstrate attenuated invasiveness in Caco-2 cells (19; R. Orsi and M. Wiedmann, unpublished data). We recently identified two similar additional mutations that lead to premature stop codons in inlA in isolates belonging to ribotypes DUP-1045B, DUP-1062D, and DUP-1048B, which also demonstrated attenuated invasion of Caco-2 cells (Nightingale and Wiedmann, unpublished), supporting the finding that the second lineage II food-associated clade identified here, clade I, may also represent a human virulence-attenuated clonal group. These findings not only provide a clear biological explanation for the association of clades F and I isolates with food but also are consistent with the findings of a previous study (19), which found that isolates with premature stop codons in inlA are significantly underrepresented among human clinical isolates compared to their prevalence in food. On very rare occasions, L. monocytogenes isolates containing premature inlA stop codons have been linked to human disease (<2% of more than 1,000 human listeriosis cases), suggesting that these strains may have the potential to cause disease in extremely immunosuppressed individuals (19). Isolates in clades I and F also showed, on average, a smaller plaque size (Table 3), indicative of a reduced ability to spread intracellularly in mammalian host cells, further supporting the hypothesis that isolates in these clades may represent virulence-attenuated clonal groups.
Interestingly, ribotype DUP-1042C isolates belonging to food-associated clade D also formed smaller (92.9%) plaques (Table 3). The classification of clade D as a food-associated clade is also consistent with the results from a previous study (9), which included a larger number of DUP-1042C isolates (n = 14) and which showed that DUP-1042C isolates were significantly more common among food isolates and were not observed among nearly 500 human clinical isolates. Some DUP-1042C isolates from food were also recently found to be characterized by attenuated invasiveness in Caco-2 cells (Nightingale and Wiedmann, unpublished), providing a possible biological explanation for their classification into a food-associated clade. Interestingly, isolates in food-associated clade J, on average, formed large (120%) plaques (Table 3), further indicating that different clades associated with food may show distinct phenotypic characteristics. Future studies are thus required to fully understand the biological underpinnings for the observed source associations of specific clades.
Conclusions. While a number of previous studies have provided preliminary evidence that the main L. monocytogenes lineages (i.e., lineages I and II) differ in their ability to cause human disease, only a few clonal groups with specific host or niche preferences (e.g., epidemic clones and virulence-attenuated strains with inlA mutations) have been described (10, 12, 19, 23). Our findings not only support the conclusion that previously described epidemic clones and virulence-attenuated strains represent distinct clades within L. monocytogenes but also identify a number of additional clades that are significantly associated with specific source populations and may thus represent host- or niche-adapted ecotypes. Most importantly, our data, obtained by using L. monocytogenes as a model system, show that a novel two-step statistical approach (SourceCluster and TreeStats analyses) for the unbiased identification of phylogenetic clades that are significantly associated with particular source populations can reliably identify biologically relevant clonal groups that may represent niche-adapted L. monocytogenes ecotypes. This approach will thus provide a valuable set of tools for the characterization of other bacterial pathogens as well as nonpathogenic bacteria, allowing the rapid identification of meaningful subtypes and clades that differ in their relevant phenotypic characteristics, without requiring considerable a priori information on the biology of the organism being characterized. As larger sequence data sets for L. monocytogenes and other pathogens become available, this approach may also provide an opportunity to identify associations between clades and more narrowly defined hosts and niches (e.g., specific animal species and environments).
We thank Qi Sun (Computational Biology Service Unit, Cornell University) for his expertise in setting up the computing system to perform evolutionary analyses. We also thank Wan-Lin Su for helpful discussions. We are also indebted to Esther Fortes and Alphina Ho for their technical assistance with DNA sequencing.