Journal of Clinical Microbiology, December 2005, p. 5983-5991, Vol. 43, No. 12
0095-1137/05/$08.00+0 doi:10.1128/JCM.43.12.5983-5991.2005
Copyright © 2005, American Society for Microbiology. All Rights Reserved.
Måns Ullberg,1 and
Björn Herrmann1*
Department of Clinical Microbiology, Uppsala University Hospital, SE-751 85 Uppsala, Sweden,1 Biotage AB, Kungsgatan 76, SE-753 18 Uppsala, Sweden2
Received 4 May 2005/ Returned for modification 11 July 2005/ Accepted 13 September 2005
Several DNA-based methods have been evaluated for species identification of streptococci. DNA hybridization has been successfully used for detection of a few species (4, 18) but is of limited use in broad-range identification. Streptococcal species can be identified by measuring the tRNA intergenic lengths (8), and up to 31 species were distinguishable when the fragments were separated by capillary electrophoresis (2). Size separation can also be used for species identification when combined with amplification of selected targets (11, 23), arbitrarily primed PCR (33), and restriction of amplified rRNA genes (21, 37). The latter can discriminate a large number of species, but in all of these methods the identification is based on complex band patterns with limitations in resolution of closely related species. Sequence analysis has been applied for Streptococcus species identification, and the target sequences have included genes encoding functional RNA (16S rRNA [5], rnpB [40]), protein-coding genes (sodA [29], tuf [28], groESL [42], rpoB [9]), and noncoding spacer regions such as the ITS (6). Given the right target, sequencing has the discriminatory power to resolve closely related species but has limitations in terms of ease of use and cost.
An alternative DNA sequencing technique, pyrosequencing, has been shown to accurately identify microorganisms on a large scale in a few hours (22, 31). Pyrosequencing is a real-time DNA sequencing method that, on average, analyzes 50 to 60 bases by detecting the release of pyrophosphate. It is faster, less expensive, and easier to perform than conventional sequencing.
In the present study, the target for pyrosequencing analysis was the rnpB gene coding for the RNA subunit of RNase P. This ribonucleoprotein is involved in tRNA maturation and is present in all cells and subcellular compartments that synthesize tRNA but is best characterized in the division Bacteria (reference 25 and references therein). The gene has been shown to be an excellent target for differentiation of bacterial species of the diverse genera Streptococcus (40), Chlamydia (16), and Legionella (32). The rnpB transcript consists of approximately 400 nucleotides and has a structure that enables some regions to vary greatly between species in both nucleotide composition and size, whereas others are conserved for retained enzymatic activity (15). In our previous study, sequence analysis of rnpB from 79 streptococcal strains, including 50 type strains, proved this gene to be suitable for phylogenetic analysis. In the present study, differentiation of 43 streptococcal species was achieved by pyrosequencing analysis of two short, multivariable regions of the rnpB gene. For assay validation, 113 clinical strains were analyzed by using pyrosequencing in parallel with two commercially available biochemical test systems.
In the present study the rnpB gene was sequenced in an additional 30 culture collection strains (Table 1). A local streptococcus rnpB database was created from the sequences of 48 type strains and 59 additional strains subsequently referred to as reference strains. The 107 database strains comprised in total 90 rnpB sequence variants. To validate the pyrosequencing assay 113 clinical blood isolates were used. These strains were isolated by using BacT/ALERT (bioMérieux, Marcy l'Etoile, France) and subculture on Columbia agar supplemented with defibrinated horse blood. Preliminary identification was achieved by using Rapid ID 32 Strep (bioMérieux) and other routinely used tests. The isolates represented all streptococcal species found in blood cultures during 28 months at Uppsala University Hospital and were chosen with a bias toward uncommon species and strains within the mitis group.
TABLE 1. Streptococcal reference strains in the local database of rnpB sequences
PCR amplification. Primers were selected in conserved gene regions for the amplification of a fragment comprising about two thirds of the rnpB gene in streptococci. In silico analysis of available rnpB sequences in GenBank did not indicate primer annealing to nonstreptococcal species. The primers did not yield any PCR product when tested on DNA from the closely related genera Enterococcus and Gemella. The primer A1118FP (5'-GTGCAATTTTTGGATAATCG-3') and the 5' biotinylated primer A1121RPB (5'-TGGGTTGCTAGCTTGAGG-3') generated a DNA fragment between 242 and 254 bp, including the variable regions P3 and P9. The amplification reaction contained 1.5 mM MgCl2, 0.2 µM each primer, 0.2 mM deoxynucleoside triphosphate, and 2 U HotStar Taq polymerase (QIAGEN) in PCR buffer. A total of 5 µl of target DNA (QIAamp DNA Minikit, QIAGEN) was added to a 45-µl reaction mixture. The PCR was performed according to the following program: 15 min of enzyme activation at 94°C, followed by 45 cycles of 94°C for 30 s, 58°C for 30 s, and 72°C for 30 s, and a final step of 72°C for 5 min.
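As a rough illustration of the reaction setup above, the following Python sketch tallies the programmed hold time of the cycling protocol and the GC content of the two amplification primers. The primer sequences and step durations are taken from the text; ramp times between temperatures are ignored, and the function and constant names are hypothetical.

```python
# Amplification primers as given in the text (5' to 3').
PRIMERS = {
    "A1118FP": "GTGCAATTTTTGGATAATCG",
    "A1121RPB": "TGGGTTGCTAGCTTGAGG",  # 5'-biotinylated in the assay
}

# Cycling program from the text, expressed in seconds.
INITIAL_ACTIVATION_S = 15 * 60   # 94 C enzyme activation
CYCLES = 45
CYCLE_STEPS_S = (30, 30, 30)     # 94 C, 58 C, 72 C
FINAL_EXTENSION_S = 5 * 60       # 72 C


def gc_fraction(seq):
    """Return the fraction of G and C bases in a primer sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def programmed_hold_time_s():
    """Total programmed hold time in seconds (assumption: ramping ignored)."""
    return INITIAL_ACTIVATION_S + CYCLES * sum(CYCLE_STEPS_S) + FINAL_EXTENSION_S


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC = {gc_fraction(seq):.1%}")
    print(f"Programmed hold time: {programmed_hold_time_s() / 60:.1f} min")
```

Under these assumptions the program amounts to 5,250 s (87.5 min) of hold time; the actual run time on an instrument would be longer because of temperature ramping.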
Pyrosequencing. The two variable regions P3 and P9 in 43 Streptococcus type strains were sequenced in the sense direction. Single-stranded DNA was prepared from 10 µl of PCR product by using the standard protocol for the Vacuum Prep Tool (Biotage AB, Sweden). The sequencing primer for the P3 region was A1123FS (5'-CAATTTTTGGATAATCG-3') and for the P9 region, A1171FS (5'-AATAAGCCTAGGGA-3'). Pyrosequencing was performed by using the SQA reagent kit with the addition of 1 µg of single-strand binding protein (SSB; Amersham Biosciences, Sweden) on a PSQ 96MA instrument (Biotage). The reproducibility of the pyrosequencing assay was evaluated for seven type strains (S. equi, S. dysgalactiae, S. gordonii, S. pneumoniae, S. pyogenes, S. vestibularis, and S. thermophilus). Each type strain was sequenced 20 times. Validation of the pyrosequencing assay was performed on 113 clinical isolates from blood cultures, and the sequences obtained from the P3 and P9 region were compared to sequences in the local database by using the nucleotide-nucleotide BLAST algorithm. Clinical isolates were identified as the species giving the highest sequence alignment score when the first 30 nucleotides of each region were compared to a database of type and reference strain sequences.
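The identification rule described above can be sketched as follows. This is a minimal illustration, not the BLAST-based procedure used in the study: it scores each candidate by counting identical positions over the first 30 nucleotides of each region, and the short database sequences are made-up placeholders rather than real rnpB data. A tie between species is reported as an inconclusive result.

```python
def count_identities(query, reference):
    """Count positions at which two sequences carry the same base."""
    return sum(q == r for q, r in zip(query, reference))


def identify(p3_read, p9_read, database, n_bases=30):
    """Return (best-scoring species, score) for a pair of region reads.

    `database` maps a species name to its (P3, P9) reference sequences.
    Only the first `n_bases` of each region are compared. More than one
    species in the result means the identification is inconclusive.
    """
    scores = {}
    for species, (ref_p3, ref_p9) in database.items():
        scores[species] = (
            count_identities(p3_read[:n_bases], ref_p3[:n_bases])
            + count_identities(p9_read[:n_bases], ref_p9[:n_bases])
        )
    best = max(scores.values())
    winners = sorted(s for s, score in scores.items() if score == best)
    return winners, best


# Hypothetical placeholder sequences, NOT real rnpB P3/P9 regions.
TOY_DATABASE = {
    "species_A": ("ACGTACGTACGT", "TTGACCAGTTGA"),
    "species_B": ("ACGTACGAACGT", "TTGACCAGTTGA"),
    "species_C": ("ACGTTCGAACGA", "TTCACCAGATGA"),
}

if __name__ == "__main__":
    print(identify("ACGTACGTACGT", "TTGACCAGTTGA", TOY_DATABASE))
    print(identify("ACGTACGCACGT", "TTGACCAGTTGA", TOY_DATABASE))
```

The second query is equally close to two placeholder species and therefore returns both, which mirrors how an isolate can remain inconclusively identified when two type strains share the same sequence in both regions.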
rnpB sequencing. To extend the rnpB database the DNA sequence of a PCR fragment, comprising 97.5% of the rnpB gene, was determined in the reference strains listed in Table 1. A selection of the clinical isolates was also sequenced. Sequencing of the nearly complete rnpB gene was performed by conventional methods as previously described (40), and the sequence obtained is hereafter referred to as the entire gene sequence. Identification of clinical strains based on entire gene sequences was performed by using BLAST in the same manner as with sequences obtained by pyrosequencing.
In silico sequence analysis. Sequences were aligned by using CLUSTAL W and manually edited to obtain homology. To quantify the variability in the gene, the Shannon-Wiener index (39), also expressed as the entropy, was computed for each position in the multiple alignment. Principal coordinate analysis (14) was computed on Kimura two-parameter corrected distances with pairwise deletion gap handling by using a MATLAB function developed in house. Phylogenetic analysis was performed by using the neighbor-joining algorithm and 1,000 bootstrap replicates.
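The per-position entropy and the pairwise distance named above can be sketched in a few lines of Python. This is an illustrative sketch, not the in-house MATLAB function: it assumes log base 2 for the entropy, counts a gap as its own symbol in the entropy, skips any position where either sequence has a gap when computing the distance (pairwise deletion), and uses the standard Kimura two-parameter formula d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions.

```python
import math
from collections import Counter

PURINES = {"A", "G"}


def column_entropy(column):
    """Shannon entropy (bits) of one alignment column; gaps count as a symbol."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def alignment_entropy(sequences):
    """Entropy for each position of an alignment of equal-length sequences."""
    return [column_entropy(col) for col in zip(*sequences)]


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance with pairwise deletion of gapped sites."""
    compared = transitions = transversions = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # pairwise deletion: ignore sites gapped in either sequence
        compared += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no ungapped positions shared by the two sequences")
    p = transitions / compared
    q = transversions / compared
    # Undefined (math domain error) for very divergent pairs where the
    # logarithm argument is not positive.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "ACCC", "AC-G"]  # made-up example
    print(alignment_entropy(toy_alignment))
    print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

A matrix of such pairwise distances is the kind of input that principal coordinate analysis and neighbor joining operate on.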
Biochemical tests. The two commercially available streptococcus identification systems, Rapid ID 32 Strep (RID32) and the automated VITEK 2 (bioMérieux), were used to identify the 113 clinical samples also analyzed by the pyrosequencing assay. RID32 was performed according to the manufacturer's instructions and was visually interpreted. The VITEK 2 ID-GPC test card, which identifies 20 streptococcal subspecies, species, and species groups, was used. Confidence levels of "acceptable identification" and above in VITEK 2 were judged equally, and an identity of >80% in RID32 was considered to be a single species result.
FIG. 1. Shannon-Wiener index for each position in a multiple alignment of the rnpB gene from 43 streptococci type strains. The three variable regions corresponding to the P3, P9, and P10 helixes are indicated in the picture.
TABLE 2. Pairs of type strains with rnpB sequences that have two or fewer differences in total in the two regions
FIG. 2. P9 region pyrograms from S. mitis (A) and S. oralis (B). Generated sequences are shown below the pyrograms. Variable nucleotides, indicated by bold type in the sequence, generate peaks indicated by arrows in the figure.
Reproducibility test. The two variable regions in the rnpB gene were sequenced 20 times for seven Streptococcus species in order to test the reproducibility of the assay. For both regions, the reproducibility was high, with a median read length of 51 for P3 and 54 for P9. Longer read lengths would have been obtained if the number of nucleotide dispensations had been increased, but this was not relevant for species identification. In 267 of the 280 reactions (7 species × 20 replicates × 2 regions; 95%), more than 30 bases were correct. Poor results were obtained only for S. gordonii because of a six-base homopolymer in the P3 region; in 10 of the 20 reactions, only 17 bases were correctly interpreted.
Pyrosequencing analysis of clinical samples. A total of 113 clinical samples were analyzed by pyrosequencing, and all except eight isolates resulted in single species identification. The results were compared to the identifications from the commercial systems VITEK 2 and Rapid ID 32 Strep. For isolates with ambiguous results in the three test systems the rnpB gene was sequenced according to the method of Tapp et al. (40) to confirm the identification result from the pyrosequencing assay. In addition, 10 isolates of each of the common species S. agalactiae, S. pneumoniae, and S. pyogenes were sequenced in order to further investigate the sequence variation and extend the database.
The first 30 nucleotides of the two regions were aligned to sequences in the local rnpB database. Identical sequences in both regions were found for 96 (85%) of the 113 isolates. All 113 isolates are listed in Table 3 according to the species identifications from the pyrosequencing assay. The assay inconclusively identified eight isolates. None of the pyrosequencing identification results were in conflict with the identifications based on the entire rnpB sequence. The intraspecies diversity was small for the strains identified by the pyrosequencing assay as S. pyogenes, S. agalactiae, and S. dysgalactiae, with identical sequences or a single nucleotide difference in the P3 and P9 regions compared to the respective type strain. Entire rnpB sequences of 10 isolates identified as S. pyogenes and 10 identified as S. agalactiae showed only minor (≤2 nucleotides) deviations from the type strain sequences.
TABLE 3. Species identification for the clinical isolates in the pyrosequencing assay presented with their corresponding result in the biochemical test systems VITEK 2 and Rapid ID 32 Strep
Sixty isolates were pneumococci and viridans group streptococci, a group that comprises closely related species. The entire rnpB gene was sequenced in a majority of these isolates. Principal coordinate analysis (PCoA) reveals the relative difference in sequence similarity between strains and was used here to visualize the discriminatory power of the pyrosequencing assay. The combined P3 and P9 region was analyzed for type strains, reference strains, and clinical isolates of three viridans streptococci groups: the anginosus group, the salivarius group, and the mitis group (Fig. 3A, C, and E). Because of the close relatedness of S. mitis, S. oralis, and S. pneumoniae, these species were poorly resolved, and a fourth analysis was carried out on these species alone (Fig. 3G). To compare the clustering performance of the P3/P9 region with that of rnpB, PCoA was also computed on the sequences of the entire gene, shown in Fig. 3B, D, F, and H. The resolution of the clustering in the P3/P9 analysis was slightly lower than that of the entire gene sequences because of the decreased amount of data. However, no clusters were in conflict for any of the four groups when the results from the analysis of the combined P3 and P9 regions were compared to the results from analysis of the entire gene.
FIG. 3. The first two principal axes resulting from principal coordinate analysis of the combined P3 and P9 region sequences (left) and the entire rnpB sequences (right) of species groups within viridans group streptococci. (A and B) Anginosus group; (C and D) salivarius group; (E and F) mitis group; (G and H) S. mitis, S. oralis, and S. pneumoniae. Type strains are denoted in the plots, reference strains are indicated with dots, and clinical isolates are indicated by an "x." The number of clinical isolates can be found in Table 3. Indistinct clusters are circled or separated by a line for clarity. The isolate indicated with a question mark could not be identified either by pyrosequencing or by sequencing of the entire gene.
The type strains and 12 reference strains of the salivarius group form three distinct clusters in the PCoA of the whole rnpB gene (40) (Fig. 3D). The same clusters appear, albeit less distinctly, when the data are reduced to include only the P3 and P9 regions (Fig. 3C). This implies that the few nucleotide differences in the two regions are species specific. The rnpB gene sequence of four of the clinical isolates identified by pyrosequencing as S. salivarius and two identified by pyrosequencing as S. vestibularis was determined. These isolates cluster with their respective reference strain in PCoA of the rnpB gene, as well as the combined P3 and P9 region (Fig. 3C and D).
In the principal coordinate analysis of the mitis group (S. mitis, S. oralis, S. pneumoniae, S. gordonii, S. parasanguinis, S. sanguinis, S. infantis, and S. peroris), distinct clusters were formed for the species S. gordonii, S. parasanguinis, and S. sanguinis in the entire gene plot, as well as in the plot with the two variable regions (Fig. 3E and F). S. infantis could not be distinguished from S. peroris; thus, the two species form a separate cluster together. The resolution was too low to discriminate between the three species S. mitis, S. oralis, and S. pneumoniae but was significantly increased when these three species were reanalyzed separately (Fig. 3G and H). In the separate analysis, S. oralis forms a distinct cluster in the analysis of the entire gene sequences and is also clearly separated from S. mitis and S. pneumoniae in the P3/P9 plot, even though the cluster is less obvious. The S. mitis and S. pneumoniae clusters are distinguishable in the entire gene analysis except for one clinical isolate, which is situated between the clusters. However, since S. mitis is a more divergent species than S. pneumoniae, it is more likely that the isolate belongs to S. mitis. This isolate is one of the two that could not be conclusively identified by pyrosequencing. The other inconclusively identified isolate clusters with S. mitis in the entire gene analysis. No distinct clusters for S. mitis and S. pneumoniae were seen in the analysis of the P3/P9 region, but a separation line derived from the identifications based on the entire gene sequence could be drawn (Fig. 3G).
A phylogenetic analysis of rnpB sequences from type strains, reference strains, and clinical isolates for which the entire gene had been sequenced was performed. Type and reference strains belonging to groups for which no clinical isolate had been found were not included to reduce the size of the analysis. A majority of the strains formed species-specific clades with bootstrap values >90%; thus, the obtained species identity by pyrosequencing was supported by the phylogenetic analysis. The S. pneumoniae and S. salivarius clades had lower values (82 and 65%, respectively), and the strains in the anginosus group and the S. mitis strains did not constitute a single clade because of the large intraspecies variability.
Comparison of the pyrosequencing assay with VITEK 2 and Rapid ID 32 Strep. In 58 (51%) of the 113 clinical isolates, all three assays resulted in the same species identity. The pyrosequencing assay was concordant with VITEK 2 in 85 cases (75%), whereas the corresponding figure for RID32 was 88 (77%). Notably, the concordance between the two biochemical systems was only 71%. The VITEK 2 system resulted in less-discriminatory identifications than pyrosequencing for 11 isolates, whereas the reverse was found in three cases. Furthermore, in the VITEK 2 system, eight isolates were unidentified and seven were identified as species other than those found by the pyrosequencing assay. In RID32, eight isolates were identified with lower and four with higher species discrimination compared to pyrosequencing. Thirteen were identified as species other than those found by the pyrosequencing assay.
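The pairwise and three-way concordance figures above are simple proportions over the same set of isolates. The sketch below shows one way to tabulate them. It treats two results as concordant only when the species labels match exactly, which is a simplification of the study's comparison (where results could also differ in level of discrimination), and the four-isolate input is a made-up example, not data from Table 3.

```python
from itertools import combinations


def concordance(results):
    """Return pairwise and overall agreement between identification systems.

    `results` maps a system name to its list of species calls, one per
    isolate and in the same isolate order for every system. Agreement
    here means an exact label match (an assumption of this sketch).
    """
    systems = list(results)
    n_isolates = len(results[systems[0]])
    if any(len(results[s]) != n_isolates for s in systems):
        raise ValueError("every system needs one call per isolate")

    table = {}
    for a, b in combinations(systems, 2):
        agree = sum(x == y for x, y in zip(results[a], results[b]))
        table[(a, b)] = agree / n_isolates

    calls_per_isolate = zip(*(results[s] for s in systems))
    all_agree = sum(len(set(calls)) == 1 for calls in calls_per_isolate)
    table["all"] = all_agree / n_isolates
    return table


# Hypothetical calls for four isolates, for illustration only.
TOY_RESULTS = {
    "pyrosequencing": ["S. pyogenes", "S. mitis", "S. oralis", "S. salivarius"],
    "VITEK 2": ["S. pyogenes", "S. mitis/S. oralis", "S. mitis/S. oralis", "S. salivarius"],
    "RID32": ["S. pyogenes", "S. mitis", "S. oralis", "S. bovis"],
}

if __name__ == "__main__":
    for key, value in concordance(TOY_RESULTS).items():
        print(key, f"{value:.0%}")
```

Handling of group-level calls such as "S. mitis/S. oralis" is where a real comparison needs explicit rules, since a less-discriminatory result is not necessarily a conflicting one.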
Table 3 shows the species identity obtained by pyrosequencing and the corresponding identification from the biochemical test systems. Full accordance between the three systems was seen for all isolates identified as S. pyogenes, for one isolate identified as S. mutans, and for a majority of the isolates identified as S. agalactiae and S. dysgalactiae. The RID32 system could subtype S. dysgalactiae and 16 of the 21 isolates were determined to the subspecies equisimilis, whereas the remaining were identified to the species level or as ambiguous. VITEK 2, however, cannot distinguish between S. dysgalactiae and other group G streptococci.
For isolates of the viridans and bovis groups, the three test systems were in less agreement. The anginosus group was better resolved with the biochemical systems. Four isolates were inconclusively identified in the pyrosequencing assay. Two had rnpB sequences identical to the S. anginosus type strain and were identified as S. anginosus by both biochemical systems, and the two with sequences identical to S. constellatus were accordingly identified as S. constellatus. The nine salivarius group isolates were identified by the pyrosequencing assay as seven strains of S. salivarius and two strains of S. vestibularis. For a majority of these isolates, the three systems were in agreement. However, one isolate identified as S. salivarius and one as S. vestibularis by both sequencing and VITEK 2 were typed as S. bovis biovar II and S. salivarius, respectively, by RID32.
Isolates identified as species in the mitis group and S. pneumoniae displayed the most ambiguous results. RID32 identified S. oralis and S. parasanguinis in concordance with pyrosequencing. S. parasanguinis is not included in the database of VITEK 2, and these three isolates came out as either unidentified or S. mitis. Furthermore, VITEK 2 is unable to distinguish between S. oralis and S. mitis, resulting in low discrimination for these species. VITEK 2 identified all four isolates of S. gordonii. For a majority of the 17 isolates identified as S. pneumoniae, the three systems were in accordance. However, without supplementary tests, four isolates using VITEK 2 and six isolates using RID32 were identified as S. mitis or S. oralis or were inconclusively identified. S. infantis and S. peroris were not included in the database of either biochemical system and resulted in incorrect identification as another species or as unidentified. The two isolates that were inconclusively identified as S. mitis or S. pneumoniae by pyrosequencing, one of which could not be conclusively identified by entire gene sequencing, were identified as S. mitis/S. oralis by both biochemical test systems.
rnpB has the required characteristics to serve as a target for broad-range pyrosequencing analysis, with highly variable regions close to conserved primer targets. The 16S rRNA gene shares these features, but the discriminatory capacity of the P3 and P9 regions of rnpB is higher than that of the V1 region of the 16S rRNA gene, which has been used for broad-range bacterial identification by pyrosequencing (22). We found V1 region sequences in GenBank from 40 of the 43 Streptococcus species included in the present study. These sequences comprised 33 unique variants, in contrast to the combined P3/P9 region of rnpB, which had 42 variants in 43 species. Protein-coding genes can also exhibit sequence variation in closely related species, but the variability is often spread out in the gene and would therefore require analysis of longer fragments for species identification.
Sequence-based identification provides discrete data that are less susceptible to problems of reproducibility and interpretation than methods based on band patterns or biochemical properties of organisms. Biochemical tests may identify up to 25% of the strains incorrectly or ambiguously (13, 24). Nevertheless, gene sequences in bacterial species may contain strain variation that must be investigated to enable accurate identification. In the present study, the P3 and P9 sequence fragments of rnpB obtained for 113 blood isolates were compared to a database of near-complete rnpB sequences containing 91 sequence variants representing 108 streptococcal strains. Since 85% of the fragments from the clinical isolates were found in the reference database, it is concluded that the intraspecies sequence variation on the whole is low in rnpB. PCoA has previously been shown to be a powerful method for cluster analysis of sequences (17). In our study, PCoA of sequence variation in strains of closely related species showed that the information in rnpB sequences enables species identification without overlapping clusters and that the same clusters appear with a similar resolution when the P3/P9 fragments of the gene are analyzed. The comparison of the pyrosequencing assay with the two biochemical systems for streptococcal identification revealed that pyrosequencing had the highest resolution and the fewest cases of ambiguous results.
Compared to conventional sequencing technology, pyrosequencing is restricted to the analysis of short target sequences, a limitation that does not affect the final result if the sequence is shown to contain sufficient information for reliable identification. Instead, the sequences obtained by pyrosequencing do not have to undergo time-consuming evaluation, and the sequence alignment is easy to overview and handle. A technical limitation, also seen in conventional sequencing, is the imperfect resolution of homopolymers, leading to a decreased reproducibility of the pyrosequencing assay used for streptococci. However, this limitation did not affect the accuracy of species identification in the present study. We have shown that Streptococcus species can be differentiated by sequence analysis of 60 nucleotides in the rnpB gene. Other targets used for streptococcal identification have a similar resolution in species discrimination, although using 5 to 20 times as many nucleotides (5, 6, 28, 42).
The resolution of streptococcal identification provided by the pyrosequencing assay is higher than that of most routine diagnostic identification systems, but high-resolution identification is not crucial in most cases from a clinical point of view. It is, however, generally considered important to differentiate between S. pneumoniae and oral streptococci, notably S. oralis and S. mitis. Species identification is rewarding when the prognosis of infection can be improved. This is the case for the anginosus group, in which the three species differ in their propensity to cause abscess formation (7). Another example is the importance of careful identification of septicemic bovis group streptococci associated with colon cancer (34).
Viridans group streptococci are difficult to differentiate, especially by phenotypic systems (12, 13) but also by molecular methods (2, 5). The inability of our assay to distinguish the type strains of S. anginosus and S. constellatus was not surprising, since there is considerable heterogeneity within S. anginosus (Fig. 3A and B) (20), and its type strain is atypical for the species (40, 44). In addition, S. intermedius forms more than one cluster (Fig. 3A and B) (19). The heterogeneity may be explained by recombination, which has been shown in the 16S rRNA gene for this group (38). The pyrosequencing assay can identify S. intermedius in either of the two clusters and a subset of S. anginosus dissimilar to the type strain. However, if recombination has occurred, then rnpB is not a suitable target for differentiation of the anginosus group. A reclassification of separate strain clusters could improve the taxonomy and facilitate strain identification.
S. pneumoniae has previously been reported to be closely related to S. mitis and S. oralis and genetically inseparable from S. mitis (6, 43). In routine diagnostic tests S. pneumoniae is distinguished from S. mitis/S. oralis by optochin susceptibility and bile solubility. The isolate found between the clusters of S. pneumoniae and S. mitis in the PCoA was bile insoluble and optochin resistant, indicating that it belongs to S. mitis. However, S. pneumoniae can be optochin resistant and bile insoluble, and S. mitis and S. oralis can be optochin susceptible (1, 43). One of our isolates, identified as S. pneumoniae by rnpB sequence analysis, was optochin susceptible in 5% carbon dioxide and bile insoluble. This isolate has a single nucleotide difference compared to the S. pneumoniae type strain in rnpB but was identified as S. mitis/S. oralis by both biochemical systems and has an antibiotic resistance pattern atypical for S. pneumoniae strains. A second isolate identified as S. pneumoniae by sequencing and as S. mitis/S. oralis by the biochemical systems was optochin resistant and bile insoluble. This isolate has 98% rnpB sequence identity to the type strain of S. pneumoniae and 95% to S. mitis. What species this isolate actually belongs to is a matter of species definition.
Two of the clinical isolates were species in the bovis group, a group with confusing nomenclature. One was identified as S. gallolyticus by both pyrosequencing and VITEK 2. In Rapid ID 32 Strep, S. gallolyticus is not included in the database, and the isolate was identified as S. bovis biovar I. However, biovar I correlates well with the genotype of S. gallolyticus (36). The other bovis group isolate was identified as S. bovis biovar II.2 by both biochemical test systems and as S. gallolyticus by genotyping. A recently characterized species, Streptococcus pasteurianus, is closely related to S. gallolyticus and has a unique Rapid ID 32 Strep biotype pattern similar to the pattern of this isolate (29; data not shown), indicating that the isolate might belong to S. pasteurianus. In all, the pyrosequencing assay can identify S. equinus, S. gallolyticus, S. infantarius, and possibly S. pasteurianus, but it cannot differentiate the subspecies of S. gallolyticus or S. infantarius. The clinical isolates in the present study were obtained exclusively from blood cultures. To further evaluate the system, other clinically relevant streptococcal isolates should be included.
In this assay, a sequence fragment obtained by pyrosequencing was assigned to a species in a local database by using BLAST, which is not optimized for short sequences and relatively small databases. Since the present study was completed, specific software, called IdentiFire (Biotage), has been developed to find the best alignment using pyrosequencing data.
In summary, the rnpB-based pyrosequencing assay described here can reliably identify a large range of Streptococcus species and has a resolution similar to that obtained with sequence data from complete genes. It is a simple, fast, and reproducible method to be used as an alternative or complement to conventional systems.
Present address: Biology Education Centre, Uppsala University, Box 592, SE-751 24 Uppsala, Sweden.