Yoav Y. Broza,1
Hanoh Goldshmidt,1
Elinor Malul,1
Lea Valinsky,2
Larisa Lerner,2
Meir Broza,3 and
Yechezkel Kashi1*
Department of Biotechnology and Food Engineering, Technion-Israel Institute of Technology, Haifa 32000, Israel,1 Government Central Laboratories, Ministry of Health, Jerusalem 94467, Israel,2 Department of Biology, Faculty of Science and Science Education, University of Haifa, Oranim, Tivon 36006, Israel3
Received 12 September 2006/ Returned for modification 30 October 2006/ Accepted 14 December 2006
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
Traditionally, V. cholerae classification is serological and requires about 200 antisera based on the somatic O antigen (72). Isolates of V. cholerae are divided into three major subgroups: O1, O139, and non-O1/non-O139, of which only the O1 and O139 serogroups are associated with cholera pandemics and epidemics. Non-O1, non-O139 serogroups are recognized as causative agents of sporadic and localized outbreaks (73). Pathogenic V. cholerae isolates carry virulence genes, such as the toxin genes ctxAB (25, 65, 68). The environmental V. cholerae strains from non-O1, non-O139 serogroups are a possible natural reservoir of potentially new emerging epidemic strains (67, 73). This assumption is supported by the finding that some of these environmental strains harbor virulence genes (23) and thus are likely to evolve into novel pathogenic strains by horizontal gene transfer (24, 25). The emergence of new pathogenic V. cholerae strains requires not only an efficient, rapid, and accurate identification tool but also a means for determining genetic relationships among environmental and clinical isolates.
Genome-based bacterial identification and typing is essential for several disciplines, including taxonomy, epidemiology, determining phylogenetic relationships, and the study of evolutionary mechanisms. It allows distinguishing among strains within a species to monitor epidemics and routes of contamination. Recent advances in biotechnology have resulted in the development of numerous methods for microorganism typing that differ in their sensitivity, rapidity, complexity, discriminatory power, reproducibility, labor-intensity, and cost (7, 60, 82, 85). Several DNA-based methods were used for typing of V. cholerae, such as tests for virulence genes, ribotyping, pulsed-field gel electrophoresis (PFGE), amplified fragment length polymorphism, enterobacterial repetitive intergenic consensus sequence-PCR, random amplification of polymorphic DNA, and multilocus sequence typing (MLST) (2, 17, 34, 44, 46, 67, 73, 87).
Simple sequence repeats (SSR), also termed variable number of tandem repeats (VNTR), are a class of short DNA sequence motifs that are tandemly repeated at a specific locus. The variation in SSR tracts, generally having several alleles, enables their use for strain typing in several bacterial species (30, 38, 41, 59, 80). SSR composed of mononucleotide repeats (MNR) in Escherichia coli were found to be abundant and highly polymorphic but stable at the strain level, making them a valuable tool for bacterial typing and phylogenetic studies (20, 30). Larger SSRs (L-SSR) consisting of 2- to 9-bp motifs were successfully used to analyze variation among strains by size determination of the PCR amplicon (39, 81). Thus, in the last few years multiple-locus SSR, also termed multiple-locus VNTR analysis (MLVA), has been increasingly recognized as the marker of choice for genotyping a number of pathogens such as Bacillus anthracis (39), Borrelia species (22), Salmonella enterica (48), and Enterococcus faecium (78).
MLST, which utilizes differences mainly in housekeeping genes, is a rising DNA sequence-based method for bacterial strain typing (14, 44, 51). DNA sequencing uncovers far more variation per locus than any other method currently used for bacterial strain typing, and it provides a uniform platform for comparing strains analyzed in different laboratories and for database storage. By their basic nature, however, housekeeping genes diversify slowly and exhibit only limited sequence variation. Hence, it has been suggested to apply MLST analysis to additional highly polymorphic genomic regions, such as virulence genes, stress response genes, and intergenic regions (11, 16, 44, 61). Recently, loci harboring MNR at noncoding regions were shown by MLST in E. coli to contain high sequence variation, including single-nucleotide polymorphisms (SNPs) in the flanking area of the MNR (20). This approach was termed MNR-MLST. In that study, the short highly informative MNR sequences proved to be more efficient than MLST of housekeeping genes for strain discrimination and for determining phylogenetic relationships in E. coli.
The present study presents an SSR typing method applied to V. cholerae. The two chromosomes of the V. cholerae genome (32) were screened for the presence of SSR tracts. The selected SSR sites were used for the development of an SSR-based method that combines both analysis of length variation at large SSR (L-SSR) loci and MLST at MNR loci (MNR-MLST). This method combines the variation seen in highly mutable SSR loci with that of shorter, relatively more stable MNR loci for the accurate and rapid typing of V. cholerae isolates.
| MATERIALS AND METHODS |
|---|
DNA preparation. High-quality DNA extraction of pure cultures was used throughout the study (30, 76). Briefly, a loopful of V. cholerae colonies was washed with SSC buffer (0.15 M NaCl, 15 mM sodium citrate [pH 8.0]). The suspended cells were incubated overnight at 50°C with proteinase K solution (20 mg/ml in Tris-EDTA [TE]). Sodium dodecyl sulfate was added to a final concentration of 1%, followed by incubation at 55°C for 1 h, followed in turn by phenol-chloroform extraction and ethanol precipitation. The DNA was stored at -20°C.
For crude and fast DNA extraction, a loopful of cells was suspended in 0.5 ml of TE buffer, heated for 10 min at 80°C, and centrifuged for 10 min at maximum speed. The pellet was resuspended in 100 µl of TE buffer and heated for 5 min at 100°C. The supernatant was stored at -20°C for a few days.
Genome evaluation of SSR distribution.
The complete genomic sequence of V. cholerae (El-Tor N16961 [32]) was obtained from the National Center for Biotechnology Information database (http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi). The two chromosomes VC (chromosome 1, 2.9 Mb) and VCA (chromosome 2, 1.1 Mb) were screened for perfect SSR with minimal number of repeats greater than three, using the "SSR" computer program (30; http://www.technion.ac.il/
anne/choice3.html).
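The screening step can be illustrated with a minimal Python sketch. It is not the authors' "SSR" program; it only assumes that a perfect SSR is a motif of 1 to 9 bp repeated in tandem more than three times, and the function name and parameters are hypothetical.

```python
import re

def find_perfect_ssr(seq, max_motif=9, min_repeats=4):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats.

    min_repeats=4 encodes "number of repeats greater than three" (an assumption
    about how the threshold is applied).
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif + 1):
        # A k-bp motif followed by at least (min_repeats - 1) exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves built from a shorter unit (e.g. "AA"),
            # so that a mononucleotide run is reported only once.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

# Toy sequence: a (A)9 mononucleotide run and a 6-bp motif repeated five times.
toy = "GGC" + "A" * 9 + "CGT" + "ATGCCA" * 5 + "TTG"
print(find_perfect_ssr(toy))
```

On a real chromosome the same function would be applied to the sequence string read from the downloaded GenBank or FASTA record; start positions are 0-based.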
Loci and primer selection. The 17 longest SSR loci identified by the SSR computer program on both chromosomes, in regions without similarity to phage or prophage sequences, were selected for the present study. These 17 sequences include nine L-SSR loci with core motifs of 3 to 9 bp and eight sites with MNR. Unique primers were designed to generate PCR products of 100 to 250 bp for the L-SSR loci and 200 to 350 bp for the MNR loci using the computer software Gene Runner (version 3.05; Hastings Software, Inc.). The locus name includes the chromosomal designation, the downstream gene-assigned number, and the repeat motif and length (see Table SA1 in the supplemental material). Each locus was tested for uniqueness throughout the V. cholerae genome by using NCBI BLAST against microbial genomes (http://www.ncbi.nlm.nih.gov/sutils/genom_table.cgi).
PCR. PCR mixtures contained 0.2 mM deoxynucleoside triphosphates, 10 µM concentrations of each primer, 0.25 U of Taq polymerase (SuperNova; JMR Holding, Kent, England), 1x buffer (1.5 mM MgCl2), and 50 ng of template DNA in a total volume of 25 µl. The reactions were carried out in a PTC-100 Thermocycler (MJ Research) as follows: 95°C for 5 min; followed by five cycles of 45 s at 95°C, 45 s at Tm (see Table SA1 in the supplemental material), and 45 s at 72°C; followed by 20 cycles of 45 s at 95°C, 45 s at Tm - 5°C, and 45 s at 72°C; followed by a final 7 min at 72°C. All PCR amplification products were analyzed by agarose gel (1.5%) electrophoresis.
Multiplex PCR assay of L-SSR loci. Primers of six SSR loci were mixed into two PCR multiplex cocktails. The PCR mixtures were as described above. Primer concentrations for each mix were as follows. Mix 1 included VC0437-(7)7, VC1650-(9)7, and VCA0283-(6)14 at final concentrations of 0.8, 0.6, and 1.4 µM, respectively; mix 2 included VC0147-(6)9, VCA0171-(6)23, and VC1457-(7)4 at final concentrations of 0.4 µM. The annealing temperatures were 60 and 62°C for mix 1 and mix 2, respectively.
Detection of specific genes by PCR. Amplification of ctxA and ompW (terminal) loci (58) was performed, together with the amplification of bacterial-universal 16S-rRNA subunit (76) in a multiplex PCR assay, using a 65°C annealing temperature. nag (stn/sto) gene amplification was performed as described above, with annealing temperature of 60°C. The primers were NAG-F 5'-TCGCATTTAGCCAAACAGTAGAA-3' and 194R 5'-GCTGGATTGCAACATATTTCGC-3' (67), amplifying a predicted 172-bp fragment.
DNA sizing. Sizing of L-SSR amplification products was performed by 2.5% agarose gel electrophoresis and by capillary electrophoresis on an ABI Prism 310 automated DNA sequencer, using fluorophore-labeled primers. The amplified samples were loaded on the ABI analyzer, together with 10 µl of formamide and 0.6 µl of tetramethylrhodamine (TMR) MapMaker LOW (lot 030600; BioVentures, Inc.). The results were analyzed with Genescan 3.1 and Genotyper analysis software (Perkin-Elmer).
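Capillary sizing reports an amplicon length in base pairs, which then has to be translated into a repeat number. The sketch below shows one way to do this, assuming that a reference allele of known size and repeat count is available for each locus (for example, from the sequenced strain). The function name and the numbers in the usage lines are hypothetical and are not taken from the study.

```python
def repeats_from_size(size_bp, ref_size_bp, ref_repeats, motif_bp):
    """Estimate the repeat number of an L-SSR allele from its amplicon size.

    The size difference from the reference allele is divided by the motif
    length and rounded to the nearest whole repeat.
    """
    delta_repeats = (size_bp - ref_size_bp) / motif_bp
    return ref_repeats + round(delta_repeats)

# Hypothetical 6-bp locus whose reference allele is 180 bp with 9 repeats.
print(repeats_from_size(186.0, 180, 9, 6))  # one repeat longer than the reference
print(repeats_from_size(167.6, 180, 9, 6))  # about two repeats shorter
```

Rounding absorbs small sizing errors, but a call that falls far from a whole number of repeats would deserve a second look (for instance, an indel in the flanking sequence).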
DNA sequencing. PCR products were purified using a QIAquick PCR purification kit (QIAGEN). Portions (50 ng) of the purified product were sequenced on both strands using a BigDye terminator cycle sequencing kit (Perkin-Elmer Applied Biosystems) and loaded onto an ABI 310 automated sequencer. Analysis of the results was done by using the sequence analysis software (Perkin-Elmer).
PFGE. Gel electrophoresis was performed according to the PulseNet standardized PFGE protocol for Escherichia coli subtyping (12) with minor modifications. Briefly, genomic DNA from all isolates was prepared in agarose plugs, digested with the restriction enzyme NotI, and separated in a 1% agarose gel in 0.5x Tris-borate-EDTA at 14°C using a CHEF-DRIII apparatus (Bio-Rad). The pulse time ranged from 4 to 8 s for 9 h and from 8 to 25 s for 11 h, both at 6 V/cm. After visualization, the PFGE patterns were analyzed by using the BioNumerics software (Applied Maths). Profiles were compared by using the Dice coefficient, followed by UPGMA (unweighted pair-group method with arithmetic averages) clustering (tolerance, 1.0%).
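The Dice comparison of two banding patterns can be sketched as follows. This is an illustration only and not the BioNumerics implementation: bands are given here as fragment sizes, and the tolerance is applied as a relative size difference, which is an assumption (the software applies its tolerance to band positions on the gel).

```python
def dice_similarity(bands_a, bands_b, tolerance=0.01):
    """Dice coefficient 2 * n_shared / (n_a + n_b) for two lists of band sizes.

    Two bands are treated as shared when their sizes differ by no more than
    `tolerance` (as a fraction of the larger band); each band is matched once.
    """
    unmatched = sorted(bands_b)
    shared = 0
    for size in sorted(bands_a):
        for other in unmatched:
            if abs(size - other) <= tolerance * max(size, other):
                shared += 1
                unmatched.remove(other)  # safe: we leave the inner loop right away
                break
    total = len(bands_a) + len(bands_b)
    return 2 * shared / total if total else 1.0

# Hypothetical patterns (sizes in kb): two of the bands match within 1%.
print(dice_similarity([400, 300, 200, 100], [402, 300, 150]))
```

A matrix of such pairwise values is what a UPGMA clustering step would take as input.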
Data and statistical analysis. Two methods were used to assess the variation at MNR loci: allele analysis of sequence types and sequence comparison.
(i) Allele analysis. A nonparametric analysis of allele variations was used for both L-SSR and MNR loci. Alleles refer to L-SSR size alleles, MNR alleles, or sequence types (ST) that include MNR polymorphism and SNPs. An additional allele (null allele) was counted where there was no amplification product (see Table 4). Two polymorphic MNR tracts were found at the VC0929-(G)8 and VCA0107-(T)8 loci. Thus, MNR microhaplotypes were considered as alleles (see Table 4). In addition, the VC1457-(7)4 locus was excluded from all analyses, since it was found to be duplicated in O1-124 (see Table SA2 in the supplemental material).
The diversity index of each locus was calculated as $1 - \sum_{j} P_{ij}^{2}$, where $P_{ij}$ is the frequency of the jth allele at the ith locus. Genetic relations among strains were inferred from the L-SSR, MNR, and MNR-MLST data. SAS 8.02 software was used to calculate the Nei coefficient of association and to generate the corresponding matrix (SAS system for Windows, version 8.02; SAS Institute, Inc., Cary, NC). This matrix was used to create dendrograms based on the UPGMA method using MEGA 2.1 software (45).
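As a worked illustration of a per-locus diversity index of this kind (one minus the sum of squared allele frequencies), the following Python sketch counts alleles across strains, with a missing PCR product entered as its own "null" allele. The function name and allele labels are hypothetical.

```python
from collections import Counter

def diversity_index(alleles):
    """Return 1 - sum_j P_j^2 for the allele calls observed at one locus."""
    counts = Counter(alleles)
    n = len(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

# 32 strains: 31 share one allele and one strain gives no product (null allele).
calls = ["allele_1"] * 31 + ["null"]
print(round(diversity_index(calls), 2))
```

With 31 identical calls and a single null allele among 32 strains the index is 1 - (961 + 1)/1024, or about 0.06, which is the order of magnitude expected for a nearly monomorphic locus.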
(ii) Sequence comparison. Two phylogenetic analyses were performed for the sequence data: (i) combined sequences of 29 strains at all eight MNR polymorphic loci and (ii) combined sequences of all 32 strains at seven loci. Both analyses were carried out because no products were amplified for three strains (see Table SA3 in the supplemental material) at the VC0929-(G)8 locus. Multiple alignments of the sequences were performed by using Sequence Navigator (version 1.0.1; ABI) or CLUSTALX (33). The alignment files, in PIR format, were exported to the Gapcoder software (86) for analysis of the indels. The output files (NEXUS format), including the indel variations coded as presence or absence, were transformed to FASTA format via SEAVIEW software (28), following transformation of the presence or absence to A/C, respectively (0 = A, 1 = C). These files were converted to MEGA format and used to evaluate genetic relationships by the UPGMA method (MEGA 2.1) (45). Gaps were treated as missing data using the "pairwise deletion" option. Bootstrap confidence values were based on 1,000 simulated dendrograms.
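Two steps of this pipeline lend themselves to a short sketch: recoding presence/absence indel characters as nucleotides (0 = A, 1 = C) and computing a distance in which gapped sites are skipped pair by pair. The code below is a minimal illustration, not the Gapcoder, SEAVIEW, or MEGA implementation; the plain p-distance is used only as an example, since the distance model applied in MEGA is not stated here, and all names and sequences are hypothetical.

```python
def recode_indels(codes):
    """Turn a presence/absence string such as '0101' into 'ACAC' (0 = A, 1 = C)."""
    return codes.translate(str.maketrans("01", "AC"))

def p_distance_pairwise_deletion(seq_a, seq_b, missing="-?N"):
    """Proportion of differing sites, skipping sites missing in either sequence."""
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a in missing or b in missing:
            continue  # pairwise deletion: drop this site for this pair only
        compared += 1
        differences += a != b
    return differences / compared if compared else float("nan")

# Hypothetical aligned sequences with their indel codes appended as A/C characters.
strain_1 = "ACG-T" + recode_indels("01")
strain_2 = "ACGAA" + recode_indels("00")
print(strain_1, strain_2, p_distance_pairwise_deletion(strain_1, strain_2))
```

Appending the recoded characters lets a nucleotide-only program count each indel as a single event instead of ignoring it as missing data.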
Nucleotide accession numbers. The GenBank accession numbers for the nucleotide sequences of the 32 strains at eight MNR loci determined in the present study were as follows: EF176922 to EF176953 for VC0332(A)9, EF176954 to EF176982 for VC0929(G)8, EF176983 to EF177014 for VC1132(A)9, EF177015 to EF177046 for VC1490(A)9, EF177047 to EF177078 for VC1833(T)9, EF177079 to EF177110 for VCA0107(T)8, EF177111 to EF177142 for VCA0196(A)8, and EF177143 to EF177174 for VCA1063(T)8.
| RESULTS |
|---|
8 is 1.5-fold higher in chromosome 1 (46.6 and 30 per Mb, respectively). In addition, the motif size of L-SSR is limited to 9 bp in chromosome 1 versus 6 bp in chromosome 2. However, the L-SSR with the highest repeat numbers are found in chromosome 2. The 17 longest SSR loci were chosen for further analysis. Nine loci had core motifs ranging from three to nine nucleotides (L-SSR), and eight loci had a mononucleotide core repeat (Tables 2 and 3). The L-SSR loci were selected from both coding and noncoding regions, whereas the MNR were selected from noncoding regions only.
Size variation at L-SSR loci. All nine L-SSR loci were polymorphic among the tested V. cholerae strains, with the number of alleles ranging from 2 to 13. The most polymorphic loci were those of 6-bp motifs with 10 alleles or more (Table 3). Rapid and accurate sizing was achieved by using both agarose gel and fluorescence-labeled capillary electrophoresis. Size variation of PCR amplification products differed according to the repeated motif size (see Table SA2 in the supplemental material), with the number of repeats ranging from 2 to 25. No correlation was found between the number of repeats and serological groups. At five of the loci, some strains yielded no amplification product (null allele); this was counted as a single allele. Diversity indexes were high, ranging from 0.45 to 0.91, with one exception of 0.06 at VCA1082-(3)4. At the VCA1082-(3)4 locus, all strains had the same allele, except the O1 Og ET-109-negative [ET-109 (-)] strain, which presented a null allele. All nine L-SSR markers gave consistent amplification products in at least three separate amplifications. Two products were amplified consistently in the V. cholerae O1 In-124 (+) strain at the VC1457-(7)4 L-SSR locus (see Table SA2 in the supplemental material). Efficient L-SSR genotyping was shown by multiplex PCRs, followed by capillary sizing. Two cocktails of three markers were chosen. Mix 1 consisted of VC0437-(7)7, VC1650-(9)7, and VCA0283-(6)14 primers labeled with Hex, Fam, and Hex, respectively, and mix 2 consisted of VC0147-(6)9, VCA0171-(6)23, and VC1457-(7)4 primers labeled with Fam, Hex, and Fam, respectively. The same alleles were observed in both multiplex and separate amplifications, supporting the use of the multiplex assay for high-throughput typing.
The consistency of L-SSR markers was verified in a stability experiment of 4-day colony transfers ("generations") of two representative V. cholerae strains, O1 In-1 (+) and O1 Og ET-109 (-), each tested for the stability of two loci, VCA0283-(6)14 and VC0147-(6)9. To examine the stability of these markers, two single colonies were picked each day from an overnight-grown plate and streaked onto a fresh LB plate. Crude DNA extraction proved to be sufficient for the assessment of variants from the 60 colonies. All colonies from a specific strain showed amplification of the expected alleles [23 and 24 repeats at VCA0283-(6)14 and 10 repeats at VC0147-(6)9 for O1 In-1 (+) and O1 Og ET-109 (-), respectively (see Table SA2 in the supplemental material)], supporting the appropriate consistency of L-SSR loci.
Sequence variation at MNR loci. Sequence analysis of MNR-harboring loci provided information on both MNR tract variations and SNPs in their flanking sequences (Fig. 1). Most of the loci contained a few MNR tracts (with numbers of repeats greater than five), at least one of which was polymorphic (Table 3). MNR tracts showed high polymorphism in seven of the loci, with up to eight alleles (including no amplification product as one of the alleles; see Table 4 and Table SA3 in the supplemental material), despite their short length. These variations were mostly due to insertions and/or deletions in the MNR sequence itself. The corresponding diversity indexes ranged from 0.06 to 0.77. Additional SNPs were found in the flanking sequences of the eight polymorphic MNR loci. These multiple sequence variations that combine both MNR and SNPs presented the highest levels of polymorphism, with 4 to 11 ST per locus (Table 3). Diversity indexes were high as well, ranging from 0.37 to 0.79. These findings of highly variable sequences at noncoding MNR loci indicate their potential for V. cholerae typing.
(i) L-SSR analysis. Phylogenetic analysis of the fragment size variation data at eight L-SSR loci (see Fig. SA2 in the supplemental material) enabled the partitioning of the 32 V. cholerae isolates into 27 different SSR types. We could discriminate among all O1 and O139 ctxA (+) isolates, such that each isolate presented a unique SSR type, whereas the 17 non-O1/non-O139 and O139 ctxA (-) isolates were divided into 12 different SSR types.
(ii) MNR-MLST. Two strategies were used for the phylogenetic analysis of sequence variations (MNR polymorphism and SNP) at the MNR loci: (i) direct analysis of the aligned sequences, i.e., sequence comparison (Fig. 2a; also see Fig. SA3 in the supplemental material), and (ii) nonparametric ST (NST) analysis (see Fig. SA4 in the supplemental material).
In order to include all sequence variation information, an NST analysis, including the use of nonamplified product as one of the alleles, was carried out. The NST-based dendrogram showed similar results to the sequence comparison dendrogram but included all 32 strains (Fig. 2a; also see Fig. SA4 in the supplemental material). The NST analysis separated the 15 O1 and O139 ctxA (+) isolates into three sequence types and allowed discrimination among 11 of the 17 environmental non-O1/O139 and O139 ctxA (-) strains.
(iii) Combined nonparametric analysis. The nonparametric analysis enabled the inclusion of all sources of variation in one comprehensive analysis. Therefore, the data for eight L-SSR and eight MNR-MLST loci (NST) were combined and used for the inclusive phylogenetic analysis of all V. cholerae isolates (Fig. 2b). The resulting dendrogram discriminated among all O1 and O139 ctxA (+) isolates and among 12 of the 17 environmental non-O1/O139 and O139 ctxA (-) isolates.
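The idea of pooling L-SSR size alleles and MNR sequence types into one nonparametric profile, with a missing PCR product treated as an allele of its own, can be sketched in Python as below. The simple mismatch distance stands in for the Nei coefficient computed in SAS and is an assumption made for illustration; the locus names come from the text, but the allele values are hypothetical.

```python
def allele_profile(lssr_sizes, mnr_sequence_types):
    """Merge per-locus calls into one profile; None (no PCR product) becomes 'null'."""
    profile = {}
    for locus, call in {**lssr_sizes, **mnr_sequence_types}.items():
        profile[locus] = "null" if call is None else str(call)
    return profile

def mismatch_distance(profile_a, profile_b):
    """Fraction of shared loci at which two profiles carry different alleles."""
    loci = sorted(set(profile_a) & set(profile_b))
    if not loci:
        raise ValueError("profiles share no loci")
    return sum(profile_a[locus] != profile_b[locus] for locus in loci) / len(loci)

# Hypothetical calls for two strains: repeat numbers for L-SSR, ST labels for MNR.
strain_a = allele_profile({"VC0147-(6)9": 10, "VCA0283-(6)14": 23},
                          {"VC0929-(G)8": "ST1", "VC1833-(T)9": "ST2"})
strain_b = allele_profile({"VC0147-(6)9": 10, "VCA0283-(6)14": 24},
                          {"VC0929-(G)8": None, "VC1833-(T)9": "ST2"})
print(mismatch_distance(strain_a, strain_b))
```

Because every call is reduced to a categorical label, a strain with a null allele still contributes to every locus, which is what allows all strains to be placed in a single dendrogram.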
In general, all analyses showed clear separation of the clinical strains from the environmental strains without relation to the isolation source or location (Fig. 2; also see Fig. SA2 to SA4 in the supplemental material). All O1 isolates [ctxA (+) and ctxA (-)] and O139 ctxA (+) isolates were grouped together. The O139 ctxA (-) strain clustered with strains from the non-O1/non-O139 serogroups. However, the diversity (estimated by the genetic distance) within the clinical group and within the environmental group differed among the various analyses (Table 4). The MNR-MLST analysis showed low diversity among O1 and O139 ctxA (+) isolates compared to the diversity found among environmental isolates. In addition, the O37-129 isolate, which carries virulence potential, clustered closer to the clinical O1 and O139 ctxA (+) strains (Table 4 and Fig. 2a; see also the Discussion). The L-SSR loci, which are known to have a higher mutation rate, showed high genetic distances across clinical O1 and O139 ctxA (+) isolates compared to the low diversity found by the MNR-MLST analysis. Interestingly, the L-SSR analysis showed higher genetic distances among the closely related clinical isolates than among the environmental strains and O139 ctxA (-) (Table 4). The combination of both L-SSR and MNR-MLST variations showed higher genetic distances within the environmental strains. In addition, the combined analysis clearly separated the clinical and environmental groups (mainly by the MNR loci), while discriminating the closely related strains belonging to the O1 groups (by the L-SSR loci), thus yielding better prediction of the genetic relations among the tested strains (Fig. 2b).
PFGE. PFGE analysis was used as a reference method to assess the genetic interrelationship among the 32 V. cholerae strains (Fig. 2c). Chromosomal DNAs from all isolates were digested in agarose plugs with the restriction enzyme NotI, resulting in 17 to 24 bands. A total of 31 V. cholerae strains were partitioned into 28 PFGE types; 1 strain, O49-128, could not be typed by the PFGE methodology used here and consistently appeared as a smear on the gel. Two pairs of isolates, O140-20/O140-21 and O79-28/O79-29, revealed the same PFGE pattern (data not shown). Each of the 15 O1 and O139 ctxA (+) isolates gave a unique PFGE type. O139 ctxA (-) clustered together with environmental strains. No clear correlation was found between the PFGE clusters and the serological groups or the major phylogenetic groups identified by SSR analysis.
| DISCUSSION |
|---|
SSRs were found to be highly abundant and evenly distributed in the genome sequence of V. cholerae (El-Tor N16961), as in other bacterial species (see, e.g., references 26, 27, 30, 41, 43, 80, and 81). MNR are the most abundant SSR in the V. cholerae genome, in agreement with previous reports for E. coli and for Yersinia pestis (30, 43). The frequency of SSR (with repeats >3 and MNR of >5 bp) in the V. cholerae genome is similar to that found in the pathogenic E. coli O157:H7 EDL933 (one SSR every 150 bp). However, there are more and longer L-SSR tracts (longer than 12 bp) in the genome of E. coli O157:H7 (146 SSR tracts compared to 82 in V. cholerae). The hypothesis connecting part of the SSR to pathogenicity is supported by the more frequent appearance (per megabase) of L-SSR and of MNR (>8 bp) in chromosome 1 (Table 2), which carries more genes associated with pathogenicity than does chromosome 2 (32). Polymorphism at SSR tracts could have a functional role affecting both gene regulation and the expressed protein, such that this variation could become available for natural selection and subsequent evolution (36, 37, 42, 57). SSR polymorphism, found in regulatory regions of bacterial species, was found to be associated with variation in gene expression (i.e., on-off switches and levels), providing the dynamic response to environmental changes (4, 52, 56) [see the discussion of the VC1457-(7)4 locus below]. Most of the tested L-SSR are found in coding regions where different alleles could have various biological effects (Table 3) (41). Since most of the tested L-SSR consist of 3-, 6-, or 9-bp motifs, variations do not change the reading frame. In contrast to the L-SSR, MNR are 1.56-fold more abundant at noncoding regions. All tested MNR tracts were found to be located between 90 and 355 bp upstream of an ORF, regions that might harbor regulatory elements.
Polymorphisms at SSR loci were tested in a highly diverse genetic panel, including 32 V. cholerae strains, representing both clinical and environmental isolates. High polymorphism, but stability at the strain level, was found among the V. cholerae isolates at all tested SSR loci with corresponding high diversity indexes. There was a correlation, in general, between the number of alleles and the diversity indexes, mainly across the L-SSR loci (Table 3). High variation was previously found among O139 strains at the VCA0171-(6)23 locus by using polyacrylamide gel electrophoresis (83). In addition, variation of L-SSR tracts increases with the number of repeats, as found in other bacterial species (e.g., references 22 and 41).
The VC1457-(7)4 L-SSR locus is located at the promoter of the ctxA gene. This region originated from prophage islands and is known to exist in two or more copies in pathogenic O1 strains (3, 19, 53, 74). This could explain the presence of two amplification products (alleles) in the V. cholerae O1 In-124 (+) strain. However, the consistency of a single amplification product in the other 31 isolates enables the use of this marker in cases where only one product is observed for a specific isolate. Furthermore, three to six repeats of the heptamer sequence at the VC1457-(7)4 locus were observed in the tested strain panel. This polymorphism is likely to have a direct effect on virulence, since a higher number of these tandem repeats has been shown to be associated with higher binding affinity of ToxR, leading to higher toxin expression (55, 62).
MNR loci were found to be polymorphic and highly informative in V. cholerae. Variations at these loci were due to repeat variability in the MNR tract and to SNPs in the flanking sequences. Similar results were observed for MNR loci in E. coli (20, 30, 54) and in Mycobacterium avium subsp. paratuberculosis (1). We compared the variation found using MLST analyses of housekeeping or virulence genes to that of MNR loci. Both methods revealed the same number of alleles; however, the same degree of nucleotide variation was found on shorter segments (115 to 270 bp) of the MNR loci compared to longer segments (560 to 1,100 bp) of the housekeeping genes, 16S-23S rRNA intergenic spacer regions, or virulence genes (10, 13, 29, 44). The original MLST method is based on sequence variation at housekeeping genes that are conserved and therefore usually present in all strains but that have limited variation (21, 59, 69). On the other hand, the MNR loci that present high levels of polymorphism have the disadvantage of missing data from a portion of the strains due to nonamplified ("null") alleles (20, 75) as a result of sequence mismatch (SNPs) at the priming sites, as well as insertions or deletions and genome rearrangements (32). We have solved the problem of "no data" from part of the tested strains by using a nonparametric sequence type (NST) analysis. In the NST analysis, the absence of a product was assigned as an additional ST allele. The NST analysis provides 100% typeability of any strain at any selected locus by combining both sequence variations and product amplification results, in contrast to the commonly used sequence-based methods that consider only sequence variations (51, 79). In addition, NST analysis enables the combination of data from various diagnostic methods, e.g., MNR-MLST, L-SSR, and AILP (75), as well as the comparison of results from various typing methods.
The data sets were used for phylogenetic analyses in order to compare the results obtained by the different typing strategies. In general, the analyses showed that all O1 isolates, including ctxA (+) and ctxA (-) isolates, and all O139 ctxA (+) isolates were grouped together, indicating, like other studies (73), that O1 and O139 strains have the same clonal origin. The O139 ctxA (-) strain clustered together with strains from the non-O1/non-O139 serogroups by both L-SSR and MNR-MLST methods, as well as by the PFGE method (Fig. 2). Although clustered together with the environmental strains, the O139 ctxA (-) isolate presented a unique allele in one SSR locus [VC0147-(6)9] and unique ST in two MNR loci [VC0929-(G)8 and VC1833-(T)9]. Accordingly, this strain also presented a unique PFGE type, and its serology clearly belonged to the O139 serogroup. These results support previous findings suggesting that serogroup O139 did not evolve from a unique clone (21).
Low diversity was found among pathogenic, clinical [O1 and O139 ctxA (+)] isolates by the MNR-MLST. Most of these strains presented the same ST. These results are an outcome of the low mutation rate found at MNR loci, indicating the power of MNR-MLST analysis for resolving long-term phylogenetic relationships. Such information can also be used for simple differentiation between pathogenic and nonpathogenic isolates. Similarly, the majority of O1 (ctxAB-positive and -negative) and all O139 (ctxAB-positive) strains presented the same PCR-SSCP profile (I) (66).
In contrast to the MNR-MLST results, high variation was found in the present study among pathogenic O1 and O139 ctxA (+) strains by the L-SSR, as well as by the PFGE methods. Each of these strains presented a unique L-SSR type and a unique PFGE type, thus indicating that the L-SSR have higher mutation rates compared to MNR-MLST in V. cholerae. Similar results were found in E. coli, where MNR-MLST could not differentiate among O157 strains (20), as opposed to the L-SSRs (VNTRs), which have a higher mutation rate and therefore higher resolving power for closely related O157 E. coli strains (41). Interestingly, in Bacillus anthracis the mutation rate for L-SSR and MNR was found to be the same (40). The PFGE method is frequently used for epidemiological studies on account of its generally high discriminatory power, specifically in V. cholerae (2, 50). Here, PFGE partitioned the 31 typeable V. cholerae strains from various serological groups into 28 PFGE types, whereas the L-SSR method separated all 32 strains into 27 SSR types. The L-SSR discrimination was shown to be as useful as the PFGE analyses in V. cholerae, resembling previous studies in E. coli and in Enterococcus faecalis (41, 77). On the other hand, the L-SSR method has the benefits of correlation to serological groups and typeability of all strains. Thus, the MNR-MLST and L-SSR typing (also termed MLVA) methods should be more suitable for evolutionary studies by their nature, since they parametrically detect multiple sequence variation that accumulates slowly compared to the highly discriminative PFGE method (51). Thus, multilocus SSR typing is a good tool for epidemiological and phylogenetic studies of V. cholerae, as demonstrated with other bacterial species (22, 39, 41, 48, 78).
An additional example showing that MNR-MLST can better predict a remote phylogenetic relation is provided by the analysis of the O37-129 strain (Fig. 2a; also see Fig. SA3 and SA4 in the supplemental material). Previously, a strain belonging to the O37 serogroup was shown to be responsible for the cholera outbreak in the Sudan in 1968 (5, 6). Interestingly, the O37-129 strain in the present study was isolated in Zanzibar during an unidentified cholera outbreak in February and March 2002. This strain harbors the nag virulence gene (31) but not the ctxA gene. The pandemic potential of cholera strains might be based on genomic features that are not limited to the acquisition of the TCP and CT genes (23, 63, 68). According to MNR-MLST analysis, the O37-129 environmental strain shows genetic similarity to the pathogenic O1 and O139 ctxA (+) strains; therefore, it has the potential to evolve into the next pandemic strain by acquisition of TCP and CT genes.
In conclusion, SSR-based typing that combines L-SSR and MNR-MLST is a practical tool for differentiation between V. cholerae strains. The genotyping results obtained by this method are simply and easily compared between different labs and could be analyzed by high-throughput methods (71). The method combines genome-wide coverage of loci with different mutation rates, which could serve as an efficient tool for phylogeny studies and for rapid bacterial typing. Furthermore, it could meet the challenge of correctly assessing the risk of newly emerging epidemic variants.
| ACKNOWLEDGMENTS |
|---|
This research was supported by the Grand Water Research Institute, Technion; The Israeli Water Commission; Israel Science Foundation grant 1005697; and NATO project CBD.MD.SFP 981456.
| FOOTNOTES |
|---|
Published ahead of print on 20 December 2006.
Supplemental material for this article may be found at http://jcm.asm.org.
Present address: Department of Microbiology and Immunology, Uniformed Services University of the Health Sciences, 4301 Jones Bridge Rd., Bethesda, MD 20814.
| REFERENCES |
|---|