Journal of Clinical Microbiology, September 2007, p. 2985-2992, Vol. 45, No. 9
0095-1137/07/$08.00+0 doi:10.1128/JCM.00630-07
Copyright © 2007, American Society for Microbiology. All Rights Reserved.
Division of Molecular Pathology, Department of Pathology, Texas Children's Hospital, Houston, Texas,1 Department of Pathology, Baylor College of Medicine, Houston, Texas2
Received 21 March 2007/ Returned for modification 9 May 2007/ Accepted 18 July 2007
| ABSTRACT |
|---|
|
|
|---|
| INTRODUCTION |
|---|
|
|
|---|
The implementation of DNA sequencing in diverse clinical laboratory settings requires the availability of user-friendly technologies that minimize labor and maximize cost-effectiveness. DNA pyrosequencing, or sequencing by synthesis, was first introduced in 1996 as a rapid and less expensive alternative to traditional Sanger DNA sequencing (27). Since then, DNA pyrosequencing assays have been developed for diverse applications, including genotyping, single nucleotide polymorphism detection, and microorganism identification (21). Pyrosequencing has been used to detect point mutations in antiviral or antimicrobial resistance genes as a strategy for molecular resistance testing (12, 20, 34). It has also been applied to organism identification by combining short-stretch DNA sequencing with signature matching in the well-characterized phylogenetic target, the 16S rRNA gene (14, 30), as well as in a variety of other target genes in bacteria (9, 13, 24, 32). Although pyrosequencing yields limited amounts of DNA sequence information, highly informative target sequences within the 16S rRNA gene have facilitated the identification of pathogens, such as Helicobacter pylori (23) and Mycobacterium species (31). Pyrosequencing of the 16S rRNA gene was also used to develop a "molecular Gram stain" in order to rapidly classify bacteria as gram positive or gram negative by using molecular methods (16). A follow-up study (17) documented that molecular Gram stain results agreed with culture results in 85.7% of cases, whereas conventional Gram stains agreed with culture results in only 35.7% of cases. Pyrosequencing also categorized pathogens associated with cases of neonatal sepsis based on group-specific signature sequences (15).
Microbial DNA sequencing applications often target regions within 16S rRNA genes for broad-range identification of different groups or individual species. Bacterial 16S rRNA genes consist of eight highly conserved and nine variable regions (33). V1 and V3 are two distinct variable regions within the 16S rRNA gene, and they are the targets of the pyrosequencing-based identification assay presented in this study. The assay capitalizes on the highly conserved nature of 16S rRNA genes by positioning amplification and sequencing primers in the conserved regions that flank V1 and V3, so that the primers should, in theory, amplify the target from most bacterial pathogens. The primers targeting the V1 and V3 regions were originally developed for pyrosequencing-based classification of bacteria using 10 nucleotides from each region (14), and the same primers were also used to detect bacterial contamination of water samples used in PCRs (10).
The molecular microbiology laboratory at Texas Children's Hospital implemented routine DNA pyrosequencing based on previously described parameters (14) in order to identify clinical isolates refractory to biochemical identification. DNA pyrosequencing-based identification coupled with culture findings and biochemical results provided a robust approach for more accurate identification of bacterial pathogens. The data presented here included a total of 414 patient isolates evaluated during a 31-month time period from December 2003 to July 2006. This polyphasic strategy facilitated a cost-effective approach for improved diagnostic bacteriology services by integrating DNA sequencing with conventional methods in the clinical laboratory.
| MATERIALS AND METHODS |
|---|
|
|
|---|
DNA extraction. Pure bacterial cultures were submitted for DNA extraction on plated media, and suspected mixed cultures were purified or rejected. Bacterial DNA was extracted with the Mo Bio UltraClean microbial DNA kit (Mo Bio Laboratories, Inc., Carlsbad, CA) according to the manufacturer's instructions. A 10-µl inoculating loop of bacteria served as starting material for the extraction, and chromosomal DNA was eluted in a final volume of 35 µl elution buffer. DNA quantitation was performed by absorbance spectrophotometry of purified DNA in the NanoDrop ND-1000 spectrophotometer (NanoDrop Technologies, Wilmington, DE).
DNA amplification. Separate PCRs were performed for the V1 and V3 regions. Each 50-µl reaction mixture consisted of 0.8 mM deoxynucleoside triphosphates (Applied Biosystems, Foster City, CA), 2.5 mM MgCl2 (Applied Biosystems), GeneAmp 10x PCR Gold buffer (Applied Biosystems), 0.2 µM of each primer, 1.25 U AmpliTaq Gold DNA polymerase LD (Applied Biosystems), and 10 ng of bacterial DNA. Previously described primers were used to amplify the V1 and V3 regions of 16S rRNA genes (14). Nucleotide positions refer to positions in the Escherichia coli 16S rRNA gene. Bio-pBR5 (positions 6 to 27, 5'-biotin-GAAGAGTTTGATCATGGCTCAG-3') and pBR-V1 (positions 120 to 101, 5'-TTACTCACCCGTCCGCCACT-3') were used for V1 amplification, and Bio-B-V3 (positions 1047 to 1027, 5'-biotin-ACGACAGCCATGCAGCACCT-3') and pJBS.V3 (positions 947 to 967, 5'-GCAACGCGAAGAACCTTACC-3') were used for V3 amplification. PCR was performed in the GeneAmp PCR system 9700 (Applied Biosystems) under the following cycling parameters: 10 min at 95°C; 35 cycles of 95°C for 40 s, 55°C for 40 s, and 72°C for 60 s; and a single final cycle of 72°C for 60 s.
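The primer coordinates and cycling parameters above allow a quick arithmetic check. The following is a minimal sketch and not part of the original protocol: it derives the amplicon span implied by the outermost E. coli coordinates of each primer pair and the nominal duration of the cycling program. The dictionary layout and function names are illustrative assumptions, actual amplicon sizes vary between organisms, and ramp times between temperatures are ignored.

```python
# Primer coordinates (E. coli 16S rRNA numbering) as listed in the text.
# The data structure and the function names are illustrative assumptions.
PRIMER_COORDS = {
    "Bio-pBR5": (6, 27),
    "pBR-V1": (120, 101),
    "Bio-B-V3": (1047, 1027),
    "pJBS.V3": (947, 967),
}


def amplicon_span(primer_a, primer_b):
    """Span in bp between the outermost coordinates of a primer pair."""
    coords = PRIMER_COORDS[primer_a] + PRIMER_COORDS[primer_b]
    return max(coords) - min(coords) + 1


def nominal_run_seconds(initial=600, cycles=35, steps=(40, 40, 60), final=60):
    """Programmed hold time only; temperature ramping is not modeled."""
    return initial + cycles * sum(steps) + final


if __name__ == "__main__":
    print("V1 span (bp):", amplicon_span("Bio-pBR5", "pBR-V1"))
    print("V3 span (bp):", amplicon_span("pJBS.V3", "Bio-B-V3"))
    print("Run time (min):", round(nominal_run_seconds() / 60, 1))
```

With the values given in the text, the spans work out to 115 bp for V1 and 101 bp for V3 in E. coli numbering, and the programmed hold time is 5,560 s, or roughly 93 min.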
DNA pyrosequencing. The amplified products for V1 and V3 were prepared for pyrosequencing by using the recommended protocol for the vacuum prep tool (Biotage AB, Uppsala, Sweden). For each reaction, 40 µl of the biotinylated PCR product was used in the preparation. To prepare the sequencing plate, purified PCR products were resuspended in 40 µl of annealing buffer with 0.3 µM sequencing primer. Primers pBR-V1 and pJBS.V3 were used as DNA sequencing primers for the V1 and V3 regions, respectively. Pyrosequencing was originally performed on the PSQMA instrument (Biotage) using the PSQ SQA kit; the laboratory switched to the PSQ Gold SQA reagent kit on March 16, 2005.
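Because pyrosequencing is sequencing by synthesis, each nucleotide dispensation produces a light signal roughly proportional to the number of bases incorporated. The toy model below is a sketch under stated assumptions rather than a description of the PSQ instrument or its software: it assumes a fixed cyclic dispensation order (A, C, G, T) and ideal, noise-free signals. It shows how a read maps to pyrogram peak heights and back, and why a homopolymer appears as a single taller peak.

```python
def simulate_pyrogram(read, dispensation="ACGT", cycles=10):
    """Return (nucleotide, peak_height) pairs for an idealized pyrogram.

    `read` is the sequence being synthesized from the sequencing primer.
    The cyclic dispensation order is an assumption made for illustration.
    """
    peaks = []
    pos = 0
    for _ in range(cycles):
        for nt in dispensation:
            run = 0
            # A homopolymer is incorporated in one dispensation,
            # so the peak height equals the run length.
            while pos < len(read) and read[pos] == nt:
                run += 1
                pos += 1
            peaks.append((nt, run))
    return peaks


def call_bases(peaks):
    """Rebuild the read from ideal integer peak heights."""
    return "".join(nt * height for nt, height in peaks)


if __name__ == "__main__":
    peaks = simulate_pyrogram("GGATTTC", cycles=3)
    print(peaks)
    print(call_bases(peaks))
```

In real data the peak heights are not exact integers, which is one reason long homopolymeric runs are harder to call than isolated bases.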
Organism identification. The resulting DNA pyrograms were automatically analyzed by the PSQMA software (version 2.1). Initially, all pyrograms were manually reviewed and the reviewer determined the number of bases that were used in the subsequent search. In December 2005, the process was revised so that the bases that were assigned "good quality" or "check quality" by the software were automatically used in the search, with a manual review of all bases utilized when no clear match was observed. The selected bases were used to search the Ribosomal Database Project (RDP), version 9.36 (http://rdp.cme.msu.edu/) (3). A minimum of 15 bases in at least one of the two regions (V1 and V3) was required to proceed in searching the RDP database. The sequence match tool of the RDP website was used with search preferences assigned to type strains, sequences from individual isolates, nearly full-length sequences (>1,200 bases), and "good quality" sequences. Organisms with a 100% DNA sequence match were compared with the microbiologic findings, such as morphological and biochemical features, and final bacterial identifications were established using this polyphasic strategy.
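The search criteria described above can be expressed as a small decision rule. The sketch below is illustrative only: the data layout, the function names, and the "failed" label are assumptions, and it further assumes that only the leading contiguous run of acceptable bases is submitted, which the text does not state. The accepted quality labels and the 15-base minimum come from the description above.

```python
ACCEPTED_QUALITIES = {"good", "check"}  # quality labels named in the text
MIN_BASES = 15  # minimum required in at least one region


def usable_bases(calls):
    """Return the leading run of bases whose quality label is accepted.

    `calls` is a list of (base, quality_label) pairs. Stopping at the first
    rejected base is an assumption made for this sketch.
    """
    kept = []
    for base, quality in calls:
        if quality not in ACCEPTED_QUALITIES:
            break
        kept.append(base)
    return "".join(kept)


def can_search(v1_calls, v3_calls, min_bases=MIN_BASES):
    """Return (eligible, v1_sequence, v3_sequence) for a database search."""
    v1 = usable_bases(v1_calls)
    v3 = usable_bases(v3_calls)
    eligible = len(v1) >= min_bases or len(v3) >= min_bases
    return eligible, v1, v3


if __name__ == "__main__":
    # Hypothetical base calls; "failed" is an invented label for rejected bases.
    v1 = [(b, "good") for b in "ACGTACGTACGTACGTAC"] + [("A", "failed")]
    v3 = [(b, "check") for b in "ACGTACGTAC"]
    print(can_search(v1, v3))
```

An isolate with 18 usable V1 bases and 10 usable V3 bases would proceed to the search because one region meets the threshold, whereas an isolate with fewer than 15 usable bases in both regions would not.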
Sanger sequencing of the 16S rRNA gene. A subset of clinical isolates evaluated by DNA pyrosequencing was also studied by conventional Sanger (dideoxy) sequencing. The oligonucleotide primers 16S 8F (5'-AGAGTTTGATCCTGGCTCAG-3') and 16S 1541R (5'-AAGGAGGTGATCCAGCCGCA-3') were used for PCR amplification of a 1,533-bp amplicon. Following PCR amplification, the 16S 8F oligonucleotide primer was employed for DNA sequencing by Seqwright DNA Technology Services (Houston, TX). Resulting sequences were truncated for maximum quality and searched using the RDP and GenBank databases.
| RESULTS |
|---|
|
|
|---|
All 38 isolates not identified by DNA pyrosequencing were submitted for Sanger (dideoxy) DNA sequencing. Eleven isolates either produced no sequence after multiple attempts or provided no significant matches. Eight isolates provided matches to two genera, 5 isolates were identified to the genus level, and 14 isolates were identified to the species level. Table 2 provides a list of organisms identified from this group of 38 isolates not identified by DNA pyrosequencing.
Correlation of DNA pyrosequencing data with data obtained by Sanger (dideoxy) DNA sequencing. A total of 27 isolates previously identified by DNA pyrosequencing were submitted for Sanger (dideoxy) DNA sequencing, and informative sequences were obtained for 26 of the 27 isolates. For this group of isolates, the average number of bases obtained by Sanger sequencing was 601, in contrast to the relatively low average of 63 bases (V1 plus V3) for the final 4-month period of this pyrosequencing study. Comparative DNA sequencing results obtained by pyrosequencing and Sanger sequencing are presented in Table 3. When DNA pyrosequencing yielded an identification and a Sanger sequence was obtained (n = 26), the genus assigned by pyrosequencing matched the genus obtained with Sanger sequencing in all 26 cases (100%) (Table 3).
Performance improvements with advances in pyrosequencing chemistry and informatics. Several improvements to the pyrosequencing methodology have been incorporated since the assay implementation in December 2003. Improvements occurring during the time period of the study included the introduction of enhanced assay reagents (PSQ Gold; Biotage, Inc.), a key sequencing software upgrade (version 2.1 PSQ 96MA; Biotage, Inc.), and improvements in the search engine of the RDP. The average number of cumulative bases per isolate was 61 for the entire study (V1 plus V3 sequence data), and Fig. 1A depicts the changes in the mean numbers of cumulative bases per 4-month interval throughout the course of the study.
Two key developments during the study period improved the bioinformatics tools for analyses of DNA pyrosequencing data. The software for DNA pyrosequencing was upgraded from version 1.0 to version 2.1 in July 2004, and the new software version included improved algorithms for accurate base calling in homopolymeric regions. In September 2004, the RDP released version 9 (version 8 was used previously). A key development in version 9 was the incorporation of an updated, phylogenetically consistent, higher-order bacterial taxonomy as proposed by Garrity et al. (8). Other improved features in the RDP allowed users to select type isolates for particular bacterial species depending on established criteria, facilitating more focused refinement of the sequence matching process. DNA sequences from a total of 113 isolates were queried using RDP, version 8, and species-level identifications were made for 36% of the isolates. Of the 259 isolates sequenced and analyzed with RDP, version 9, species-level identifications were established for 57% of the isolates. Changes in sequencing chemistry as described were occurring in parallel and may be partly responsible for the improved accuracy of pathogen identification.
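To put the two species-level identification rates on a common footing, the reported percentages can be converted back to approximate isolate counts. This is a rough arithmetic sketch: the function name is an assumption, and because the percentages in the text are rounded, the resulting counts are estimates rather than reported values.

```python
def approx_count(total, percent):
    """Approximate number of isolates implied by a rounded percentage."""
    return round(total * percent / 100)


if __name__ == "__main__":
    rdp8 = approx_count(113, 36)  # isolates queried with RDP version 8
    rdp9 = approx_count(259, 57)  # isolates queried with RDP version 9
    print("RDP v8, species level: about", rdp8, "of 113")
    print("RDP v9, species level: about", rdp9, "of 259")
    print("Difference:", 57 - 36, "percentage points")
```

The figures correspond to roughly 41 of 113 isolates with RDP version 8 and roughly 148 of 259 isolates with version 9, a difference of 21 percentage points that, as noted above, cannot be attributed to the database change alone.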
| DISCUSSION |
|---|
|
|
|---|
As shown in the current study, specific phenotypic groups, such as nonfermenting gram-negative bacilli and gram-positive bacilli, may be difficult to identify despite successful culture and attempted biochemical testing. Nonfermenting gram-negative bacilli represent a group of pathogens that are difficult to identify by conventional biochemical methods and may require DNA sequencing for identification (1). In this study, gram-negative bacilli accounted for 91% of respiratory tract isolates and 58% of all isolates by Gram stain morphology. Prior studies have highlighted the identification challenges with unusual gram-negative bacilli. Compared to results of extensive phenotypic methods, partial 16S rRNA gene sequencing identified 97.2% and 89.2% of atypical, aerobic, gram-negative bacilli to the genus and species levels, respectively, in one study (29). DNA sequencing yielded unambiguous results with the highest concordance to established microorganism identifications relative to fatty acid profiling and carbon source utilization patterns. Even gram-negative bacilli that are usually considered straightforward, such as P. aeruginosa, may be challenging to identify by phenotypic methods. Pediatric patients with cystic fibrosis often yield P. aeruginosa isolates that are not identified by phenotypic methods (7), and this study included 33 such isolates that were identified unambiguously only by DNA pyrosequencing.
Gram-positive organisms, such as those of the genera Microbacterium and Rothia, pose challenges for biochemical approaches to bacterial identification. The genus Microbacterium represents a group of gram-positive bacilli that yielded multiple clinical isolates refractory to biochemical identification in this study. A prior study (19) documented the potential clinical significance of Microbacterium isolates obtained from the peripheral blood of patients with myeloid leukemia. This study (19) also highlighted the limitations of phenotypic methods for Microbacterium species identification. Interestingly, all Microbacterium isolates in this study were also obtained from peripheral blood specimens, emphasizing their potential importance in immunocompromised children. Routine identification of gram-positive cocci may benefit from DNA sequencing applications. The performance of DNA pyrosequencing was superior to that of the VITEK 2 biochemical testing panel for streptococcal speciation, and 75% of the group-level identifications were concordant between both methods (11). Rothia species represent a distinct group of gram-positive cocci that were detected in this set of isolates identified by DNA pyrosequencing. Specifically, Rothia mucilaginosa was identified from several pediatric patients, and its identification highlights the potential pathogenicity of this organism in immunocompromised patients. This species was reclassified from the genus Stomatococcus (4) and represents a challenging group of organisms that often are not classified or are misidentified by conventional approaches.
Advances in the chemistry of DNA pyrosequencing improved the performance of this methodology during the time of clinical testing covered in this study. The addition of recombinant E. coli single-stranded DNA binding protein (Ssb) to the pyrosequencing reagents (PSQ Gold) enhanced the performance of pyrosequencing in prior published studies (6, 26) and in the current study. The addition of Ssb represents a key modification in pyrosequencing chemistry that has been associated with improvements in read length, reduction in mispriming events, increased enzymatic efficiency, and greater accuracy of sequencing data (26). The current study suggests that the addition of recombinant Ssb resulted in improved bacterial identification. Additional refinements in DNA pyrosequencing may further increase read lengths and improve diagnostic accuracy. A recent study implicated the diminished efficiency of apyrase as a possible cause of limited read lengths and provided evidence for superior performance of a three-enzyme system (minus apyrase) (22). That study provided evidence that replacement of apyrase with a carefully orchestrated washing step may improve read lengths to greater than 300 bases (22).
The application of bioinformatics strategies to pathogen identification presents several challenges. First, the identification of clinical isolates should rely on phylogenetically validated databases, such as the RDP (http://rdp.cme.msu.edu), instead of open sequence databases, such as GenBank or EMBL, that lack phylogenetic validation of microorganisms. Prior bacterial identification studies have usually queried generic databases, such as GenBank, with success. However, database standards should be carefully considered in the context of phylogeny, especially when considering DNA sequencing strategies for the identification of clinical isolates on a routine basis. Second, local databases may augment comprehensive internet-accessible sequence databases by enabling the storage of specific sequence information of local isolates or pathogenic clones. Patterns of sequence variation, as demonstrated in the study examining alpha-hemolytic streptococci (11), may be examined temporally and in the context of disease and pathogen. The commercial IdentiFire database (Biotage) facilitates the storage of DNA sequences obtained from local clinical isolates. The software offers the ability to immediately search a user-created database using the pyrosequencing data. Such local databases facilitate comparisons of isolates circulating in particular geographical regions or individual hospitals. These local sequence databases also make faster searches possible with greater diagnostic accuracy, since a restricted pathogen database can be easily queried and matched with recurrent phenotypic features of local difficult-to-identify pathogens. An initial evaluation of this approach has shown that the vast majority of identifications can be made using automated searches of local databases as opposed to a manual review of pyrosequencing data with open database queries.
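The idea of a local signature database can be illustrated with a few lines of code. The sketch below is a hypothetical in-memory structure and does not represent the IdentiFire software or its interface. The class name, the method names, the placeholder organism labels, and the example sequences are all invented for illustration, and matching is assumed to mean that the short query read is an exact prefix of a stored signature for the same region.

```python
class LocalSignatureDB:
    """Hypothetical in-memory store of short 16S signatures from local isolates."""

    def __init__(self):
        self._entries = []  # (organism, region, sequence)

    def add(self, organism, region, sequence):
        """Store one signature for a region such as "V1" or "V3"."""
        self._entries.append((organism, region, sequence.upper()))

    def exact_matches(self, region, query):
        """Return organisms whose stored signature starts with the query read."""
        query = query.upper()
        return sorted(
            {
                organism
                for organism, stored_region, sequence in self._entries
                if stored_region == region and sequence.startswith(query)
            }
        )


if __name__ == "__main__":
    db = LocalSignatureDB()
    # Invented sequences and placeholder names, for illustration only.
    db.add("Organism A", "V1", "ACGTTGCAAGTCGAGCGG")
    db.add("Organism B", "V1", "ACGTTGCATGTCGAGCGG")
    print(db.exact_matches("V1", "ACGTTGCA"))
    print(db.exact_matches("V1", "ACGTTGCAA"))
```

A short query that matches more than one organism, as the first query here does, shows why read length matters: one additional base is enough to separate the two placeholder entries.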
As part of a comprehensive polyphasic strategy, pathogen identification could be enhanced in the diagnostic laboratory by the integration of DNA sequencing with biochemical testing and other phenotypic approaches. However, several limitations should be addressed before widespread implementation of such methodologies can occur in the clinical setting. Drawbacks to the assay include the inability to definitively identify approximately 10% of isolates submitted for sequencing in this study, primarily due to limited read lengths. Refinements in this technology resulting in increased amounts of sequencing data by real-time sequencing will strengthen the ability to accurately identify pathogens to the species level. Additional sequencing primers, which target different regions of the 16S rRNA gene, intergenic sequences, or other coding sequences, may facilitate improved diagnostic accuracy despite limited sequencing data yields. The inability to sequence lengthy homopolymeric tracts represents another hurdle for DNA pyrosequencing. Improvements in base calling by software-related algorithm enhancements, manual data interpretation, and refinements in the sequencing chemistry have generated more robust capabilities of handling 3- to 5-base homopolymers. Longer homopolymers should be avoided during the assay design process. Similar to other sequencing strategies, DNA pyrosequencing is limited by the need for pure bacterial isolates. Mixed cultures lead to poor sequence quality and inconclusive data and may limit diagnostic yields with clinical isolates if not effectively purified on plated medium prior to sequencing. Sources of exogenous bacterial DNA contamination that may confound DNA sequence-based identification include reagents used for assay performance. Initial development of DNA pyrosequencing applications in this study emphasized the importance of bacterial DNA contamination in sequencing reagents. 
One source of bacterial DNA contamination was determined to be commercial preparations of thermostable DNA polymerases, and 16S rRNA gene amplification of contaminating DNA precluded the generation of interpretable sequence data. Upon switching to a specialized formulation of a thermostable DNA polymerase that minimized the presence of bacterial DNA (e.g., AmpliTaq Gold LD [Applied Biosystems]), background DNA sequence signals were minimized and the reliability of DNA pyrosequencing improved markedly.
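The recommendation to avoid long homopolymers during assay design can be checked programmatically for a candidate target region. The following sketch is an illustration and not part of the published assay: the function names are assumptions, and the default limit of five bases is taken from the 3- to 5-base range described above as manageable.

```python
from itertools import groupby


def longest_homopolymer(seq):
    """Return (base, length) of the longest single-base run in a sequence."""
    best_base, best_len = "", 0
    for base, run in groupby(seq.upper()):
        length = len(list(run))
        if length > best_len:
            best_base, best_len = base, length
    return best_base, best_len


def exceeds_homopolymer_limit(seq, max_run=5):
    """True if the region contains a run longer than `max_run` bases."""
    return longest_homopolymer(seq)[1] > max_run


if __name__ == "__main__":
    # Invented example sequences.
    for candidate in ("ACGGGGGGTTA", "AACCCGT"):
        print(candidate, longest_homopolymer(candidate),
              exceeds_homopolymer_limit(candidate))
```

A region containing a run of six identical bases would be flagged for redesign, whereas a region whose longest run is three bases would pass.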
The implementation of routine DNA sequencing in the clinical laboratory necessitates multiple considerations, including workflow, process integration, laboratory reporting, and personnel competencies. Multiple advantages of DNA pyrosequencing over conventional dideoxy sequencing strategies include user friendliness, a streamlined labor component, and enhanced overall cost-effectiveness. In clinical laboratory settings today, the molecular diagnostics and microbiology laboratories are separate or molecular microbiology is a distinct section within a larger microbiology operation. Because cultured isolates are handled and tested initially in the microbiology laboratory, guidelines must be established for the submission of isolates for DNA sequencing. Cutoff values, such as probability-based biochemical identifications below 90%, provide discrete boundaries for the determination of decision-making points in the workup. Microbiologists must be intimately involved in the process in order to determine when an isolate requires molecular strategies for identification so that resources are managed effectively. The responsibility of selecting appropriate organisms for DNA pyrosequencing lies with the microbiologist at the medical technologist and managerial levels. Specimens that require further evaluation may necessitate the involvement of the technical or medical director. Submitted samples are sequenced in the molecular pathology laboratory, and resulting database queries are returned to the microbiologist(s) for the compilation of polyphasic (genotypic and phenotypic) data of each isolate prior to making a final determination. The collaborative nature of this process requires a laboratory structure and workflow that facilitates multiple interactions, including sample processing, data interpretation, and laboratory reporting. 
The ongoing, although belated, implementation of molecular methods for pathogen identification will foster enhanced collaborations in clinical laboratories and ultimately will improve the accuracy of infectious disease diagnosis.
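The submission guideline discussed above, in which isolates with probability-based biochemical identifications below a cutoff are referred for sequencing, can be written as a simple triage rule. This is a minimal sketch with assumed names and invented example data. The 90% cutoff is the example value given in the text, and in practice the decision also rests on the microbiologist's judgment rather than on a threshold alone.

```python
def needs_sequencing(id_probability_percent, cutoff=90.0):
    """True if the biochemical identification probability is below the cutoff."""
    return id_probability_percent < cutoff


def triage(isolates, cutoff=90.0):
    """Split (isolate_id, probability) pairs into sequencing and routine lists."""
    to_sequence = [iso for iso, p in isolates if needs_sequencing(p, cutoff)]
    routine = [iso for iso, p in isolates if not needs_sequencing(p, cutoff)]
    return to_sequence, routine


if __name__ == "__main__":
    # Invented isolate identifiers and probabilities.
    batch = [("iso-001", 99.0), ("iso-002", 85.5), ("iso-003", 90.0)]
    print(triage(batch))
```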
| ACKNOWLEDGMENTS |
|---|
J.V. received financial support from the National Institutes of Health (VTEU N01 A1025465), the Defense Advanced Research Projects Agency, and the Department of Pathology at Texas Children's Hospital for the performance of these studies.
| FOOTNOTES |
|---|
Published ahead of print on 25 July 2007.
Supplemental material for this article may be found at http://jcm.asm.org/.
| REFERENCES |
|---|
|
|
|---|