Diagnostics of Neisseriaceae and Moraxellaceae by Ribosomal DNA Sequencing: Ribosomal Differentiation of Medical Microorganisms

ABSTRACT Fast and reliable identification of microbial isolates is a fundamental goal of clinical microbiology. However, in the case of some fastidious gram-negative bacterial species, classical phenotype identification based on either metabolic, enzymatic, or serological methods is difficult, time-consuming, and/or inadequate. 16S or 23S ribosomal DNA (rDNA) bacterial sequencing will most often result in accurate speciation of isolates. Therefore, the objective of this study was to find a hypervariable rDNA stretch, flanked by strongly conserved regions, which is suitable for molecular species identification of members of the Neisseriaceae and Moraxellaceae. The inter- and intrageneric relationships were investigated using comparative sequence analysis of PCR-amplified partial 16S and 23S rDNAs from a total of 94 strains. When compared to the type species of the genera Acinetobacter, Moraxella, and Neisseria, an average of 30 polymorphic positions was observed within the partial 16S rDNA investigated (corresponding to Escherichia coli positions 54 to 510) for each species and an average of 11 polymorphic positions was observed within the 202 nucleotides of the 23S rDNA gene (positions 1400 to 1600). Neisseria macacae and Neisseria mucosa subsp. mucosa (ATCC 19696) had identical 16S and 23S rDNA sequences. Species clusters were heterogeneous in both genes in the case of Acinetobacter lwoffii, Moraxella lacunata, and N. mucosa. Neisseria meningitidis isolates failed to cluster only in the 23S rDNA subset. Our data showed that the 16S rDNA region is more suitable than the partial 23S rDNA for the molecular diagnosis of Neisseriaceae and Moraxellaceae and that a reference database should include more than one strain of each species. All sequence chromatograms and taxonomic and disease-related information are available as part of our ribosomal differentiation of medical microorganisms (RIDOM) web-based service (http://www.ridom.hygiene.uni-wuerzburg.de/). 
Users can submit a sequence and conduct a similarity search against the RIDOM reference database for microbial identification purposes.

Classification and diagnostic systems for microorganisms have historically been based on the existence of observable characteristics. However, because of limitations in the discriminatory power of these characteristics, problems have arisen in identification and diagnosis (29). A more recent approach for classification and identification of microorganisms involves the comparison of genetic characteristics. These molecular methods are becoming increasingly important in microbiological diagnostics (23). They are an expansion of or an alternative to phenotyping techniques if one or more of the following conditions are met: (i) microorganisms cannot be cultivated or are difficult to cultivate, (ii) organisms grow only slowly and are poorly differentiated, (iii) growth of organisms represents a hazard to laboratory staff, (iv) a suitable test method for phenotyping is not available, and (v) the extent of infection is to be quantitated (e.g., the virus load). Guidelines similar to the phenotypic methods of Koch have already been established for molecular techniques used in the identification of microorganisms involved in particular diseases (9). In most cases, one of the following genomic structures is chosen as the target for a molecular diagnostic test: (i) DNA sequences coding for toxins or pathogenicity factors, (ii) DNA sequences of specific antigens, (iii) specific plasmid DNA sequences, (iv) DNA sequences coding for rRNA, and (v) short, mostly species-specific, noncoding sequences. The rRNA genes (rDNA) are particularly suitable for identification purposes since they are ubiquitous in all living organisms. They occur as multicopy genes, making their detection relatively easy, and are composed of conserved, variable, and highly variable regions so that probes may be designed to meet a desired level of specificity. Furthermore, they are essential for survival and may be used as a molecular clock for phylogenetic studies (11,20,25,40,41).
Existing sequence databases and analytical tools (e.g., the National Center for Biotechnology Information [NCBI] GenBank or the Ribosomal Database Project) are not optimal for accurate identification of clinically relevant microorganisms (17). The contents of these databases suffer from many drawbacks, including the presence of ragged sequence ends, faulty sequence entries (due to error-prone sequencing techniques used earlier), absence of quality control of sequence entries, noncharacterized entries, outdated nomenclature, and lack of type strains pertaining to many clinically important microorganisms. Furthermore, the results are not presented in a user-friendly manner. Our ribosomal differentiation of medical microorganisms (RIDOM) project is a new initiative and attempts to overcome these problems (12).
Culture collection strains of Neisseriaceae and Moraxellaceae were studied since these families not only contain established human pathogens such as N. meningitidis and Neisseria gonorrhoeae but also contain other species which are important as emerging causes of opportunistic infections (19). Molecular identification of these particular isolates should be a good challenge for molecular diagnostic systems in general because these organisms belong to a group of bacteria that is naturally competent and frequently exchanges chromosomal genes. This exchange process could considerably complicate molecular diagnosis and identification. According to Bøvre, the family Neisseriaceae previously consisted of the genera Neisseria, Kingella, Acinetobacter, and Moraxella, the latter genus containing the subgenera Moraxella (rod-shaped bacteria) and Branhamella (cocci) (2,3). On the basis of DNA hybridization and phylogenetic rDNA sequence analysis results, it is now suggested that Neisseria, Kingella, and Eikenella species are grouped in the family Neisseriaceae in the β subclass of the Proteobacteria. The genera Acinetobacter, Moraxella, and Psychrobacter are removed from the Neisseriaceae and included in the family Moraxellaceae in the γ subclass of the Proteobacteria (26,27,32). This classification system is still evolving and therefore not complete.
In practice, a defined and limited sequence run must suffice for the identification process in most cases. To decide the target that best meets the requirements for identification, coherent variable regions of the rRNA operon were studied in a total of 94 Neisseriaceae and Moraxellaceae strains. The sequence traces and further taxonomic and disease-related information on these strains have also been deposited on our RIDOM web server for prototypic demonstration purposes. The same partial 16S and 23S rDNA sequences were also used to examine the phylogenetic relationships among species of the Neisseriaceae and Moraxellaceae families.
(This study was presented in part at the 99th General Meeting of the American Society for Microbiology, Chicago, Ill., 31 May to 3 June 1999.)

MATERIALS AND METHODS
Bacterial strains and growth conditions. The strains investigated in this study are listed in Table 1. Culture collection isolates, including the type strains, were used in this analysis when available. Strains were grown on 5% human blood and chocolate agar plates at 22, 28, or 37°C with 5% CO2. All isolates were identified by conventional biochemical methods (19).
In vitro amplification and DNA sequencing of the 16S and 23S ribosomal RNA genes. A loopful of bacterial cells for extraction of DNA was washed with distilled water and incubated in 200 µl of Tris-EDTA buffer for 10 min at 100°C. The suspension was vortexed and centrifuged at 8000 × g for 1 min. Two microliters of the supernatant were used for PCR amplification. PCR was performed in a total volume of 50 µl containing 200 µM deoxynucleoside triphosphates (dATP, dCTP, dGTP, and dTTP), 5 pmol of each primer, 5 µl of 10-fold concentrated polymerase synthesis buffer, and 1 U of AmpliTaq DNA polymerase (PE Biosystems, Weiterstadt, Germany). Thermal cycling reactions consisted of an initial denaturation (80°C, 5 min) followed by 28 cycles of denaturation (94°C, 0.45 min), annealing (53°C or 60°C for 16S or 23S rDNA PCR, respectively, 1 min), and extension (72°C, 1.5 min), with a single final extension (72°C, 10 min). Reactions took place in a dedicated automated DNA thermal cycler (GeneAmp 2400, PE Biosystems). Negative controls containing water in place of template DNA were run in parallel in each run. The amplicons were sequenced with the PCR primers using the Taq-cycle (Big)-DyeDeoxy Terminator kit and the protocol recommended by the manufacturer (PE Biosystems). Centri-Sep columns (Princeton Separations, Adelphia, N.J.) were used for purifying the sequencing products. Sequences were determined by electrophoresis with the ABI Prism 377 or 310 semiautomated DNA sequencers (PE Biosystems). The nucleotide sequences from both DNA strands were determined in this manner. The broad-range primers SSU-bact-27f (5′-AGA GTT TGA TCM TGG CTC AG-3′) and SSU-bact-519r (5′-GWA TTA CCG CGG CKG CTG-3′) reported by Lane (15) were applied for 16S ribosomal DNA (rDNA) PCR and sequencing, whereas the universal primers LSU-bact-1399f (5′-GAT GGG AAR CWG GTT AAT ATT CC-3′) and LSU-bact-1602r (5′-CAC CTG TGT CGG TTT SGG TA-3′) were used for amplification and sequencing of 23S rDNA. 
Identical or near-identical 23S rDNA primer binding sites have already been described by Van Camp et al. (36) and by Ludwig et al. (16). Ambiguities were resequenced, and at least 98% of the complete double-stranded sequences of the 16S and 23S rDNA targets were obtained.
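The four primers above contain IUPAC ambiguity codes (M, W, K, R, and S), so each primer name stands for a small pool of oligonucleotides. The following Python sketch, which is an illustration added here and not part of the original protocol, expands each primer into its unambiguous variants. The function name `expand_primer` and the dictionary `PRIMERS` are hypothetical names chosen for this example; the ambiguity table follows the standard IUPAC definitions.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_primer(primer: str) -> list[str]:
    """Return every unambiguous oligonucleotide encoded by a degenerate primer."""
    bases = primer.replace(" ", "").upper()
    return ["".join(combo) for combo in product(*(IUPAC[b] for b in bases))]

# Primer sequences as given in the text (5' to 3'); the dictionary itself is illustrative.
PRIMERS = {
    "SSU-bact-27f": "AGA GTT TGA TCM TGG CTC AG",
    "SSU-bact-519r": "GWA TTA CCG CGG CKG CTG",
    "LSU-bact-1399f": "GAT GGG AAR CWG GTT AAT ATT CC",
    "LSU-bact-1602r": "CAC CTG TGT CGG TTT SGG TA",
}

if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        variants = expand_primer(seq)
        length = len(seq.replace(" ", ""))
        print(f"{name}: {length} nt, {len(variants)} variant(s)")
```

Primers with one ambiguity code expand to two variants and those with two codes expand to four, which is a quick way to check a primer order against the published sequence.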
Analysis of the rDNA sequences. The region from base positions 54 to 510 for the 16S rDNA and the region from positions 1400 to 1600 for the 23S rDNA were analyzed (corresponding to Escherichia coli 16S and 23S rDNA positions, respectively). Sequences from primer regions were therefore not included in this analysis. Sequences were aligned using the CLUSTAL W program (34). This program was also used to construct phylogenetic trees from distance matrices using the neighbor-joining method of Saitou and Nei (28) with the correction for multiple substitutions option turned off. The mean sequence divergence level within each taxon and between each pair of related taxa and genera was calculated as the mean of all pairwise comparisons.
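The divergence calculation described above can be illustrated with a short Python sketch. It is not the CLUSTAL W implementation; it only shows an uncorrected distance (no correction for multiple substitutions) and the mean over all pairwise comparisons. The function names are hypothetical, the input sequences are assumed to be already aligned to equal length, and skipping alignment columns that contain a gap is an assumption made for this example.

```python
from itertools import combinations

def p_distance(a: str, b: str) -> float:
    """Uncorrected proportion of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are skipped (an assumption).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared

def mean_within(seqs: list[str]) -> float:
    """Mean of all pairwise distances within one taxon."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)

def mean_between(group_a: list[str], group_b: list[str]) -> float:
    """Mean of all pairwise distances between two taxa."""
    if not group_a or not group_b:
        raise ValueError("both groups must contain sequences")
    total = sum(p_distance(a, b) for a in group_a for b in group_b)
    return total / (len(group_a) * len(group_b))

if __name__ == "__main__":
    # Toy aligned sequences, not data from this study.
    taxon = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
    print(f"mean within-taxon divergence: {100 * mean_within(taxon):.2f}%")
```

Multiplying the result by 100 gives a percentage comparable in form to the intraspecies and interspecies variation values reported in Results.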
RIDOM implementation. The software selected for the RIDOM project had to be freely available for academic use as well as efficient and applicable without the need of a specific platform. FASTA version 2.0 is used for similarity searches, whereas CLUSTAL W version 1.7 is utilized for constructing multiple alignments and neighbor-joining trees (22,34). C source code is available for both programs. Phylogenetic trees are drawn with the aid of DRAWTREE version 3.5 of the PHYLIP software package (7). A MySQL (MySQL AB, Stockholm, Sweden) database holds all taxonomic, disease-related, and species information. Depending on the user query results, dynamically generated HTML pages are published with the aid of an APACHE HTTP server. The database, FASTA, and CLUSTAL W are linked to the common gateway interface of the Web server by Perl scripts, which also interpret the similarity search results. The client's World Wide Web browser should support at least HTML version 3.2. Some HTML extensions such as JavaScript, frames, and cascading style sheets are used occasionally for greater clarity of presentation. These extensions, however, are not essential and therefore older browsers can also be used. Due to the frequent use of tables, however, text-only browsers such as Lynx are not well suited to the task of viewing the contents of RIDOM.
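The idea of submitting a sequence, ranking reference entries by similarity, and interpreting the result can be sketched in a few lines of Python. This is an illustration only and does not reproduce the RIDOM scripts or FASTA: the similarity measure here is the ratio from Python's standard `difflib` module, used as a simple stand-in, and the function names, the reference dictionary, and the `min_score` cutoff of 0.97 are all hypothetical values chosen for the example.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Stand-in similarity score between 0 and 1 (not a FASTA score)."""
    return SequenceMatcher(None, a.upper(), b.upper(), autojunk=False).ratio()

def identify(query: str, references: dict[str, str], min_score: float = 0.97) -> dict:
    """Rank reference entries by similarity to the query and interpret the best hit.

    min_score is a hypothetical cutoff; below it no identification is reported.
    """
    if not references:
        raise ValueError("reference database is empty")
    ranked = sorted(
        ((similarity(query, seq), name) for name, seq in references.items()),
        reverse=True,
    )
    best_score, best_name = ranked[0]
    return {
        "best_match": best_name if best_score >= min_score else None,
        "score": best_score,
        "ranking": [(name, score) for score, name in ranked],
    }

if __name__ == "__main__":
    # Toy reference entries, not sequences from the RIDOM database.
    refs = {
        "species_A": "ACGTACGTACGTTGCA",
        "species_B": "ACGTTTGTACGAAGCA",
    }
    result = identify("ACGTACGTACGTTGCA", refs)
    print(result["best_match"], round(result["score"], 3))
```

When no entry reaches the cutoff, the sketch returns no match rather than the nearest name, which mirrors the idea that a low score should prompt a search elsewhere instead of a forced identification.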

RESULTS
A total of 202 nucleotides of the 23S rRNA gene from 94 isolates and 457 bp of the 16S rDNA from 90 Neisseriaceae and Moraxellaceae strains were determined by direct DNA sequencing (Table 1). When compared to the type species of the genera Acinetobacter, Moraxella, and Neisseria, an average of 30 polymorphic positions was observed within the partial 16S rDNA for each species and an average of 11 polymorphic positions was observed within the 202 nucleotides of the 23S gene (Table 2). Some of the polymorphic positions were found to be species specific, and a subset of these was present in every isolate of the species concerned (conserved polymorphism). It is interesting that differences in diversity within a species group were observed, depending on which gene was being considered. For example, in the case of N. mucosa, four and three alleles were observed for the 16S and 23S genes, respectively, whereas M. lacunata exhibited two and three alleles, respectively. On the other hand, the complexity of N. meningitidis and that of N. weaveri were similar, as exemplified by the number of alleles observed for the 16S and 23S genes. In this context, alleles are defined as observed rDNA sequence differences among different isolates of the same species, as determined by direct sequencing of PCR products (10).
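The terms used in this paragraph (polymorphic positions relative to a type species, alleles, and conserved polymorphisms) can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not the procedure used in this study: sequences are assumed to be aligned to equal length, positions are numbered from 1 within the analyzed region, and all function names are hypothetical. Species specificity is not checked here; the sketch only finds differences shared by every isolate.

```python
def polymorphic_positions(reference: str, sequence: str) -> list[int]:
    """1-based positions at which an aligned sequence differs from the reference."""
    if len(reference) != len(sequence):
        raise ValueError("sequences must be aligned to the same length")
    return [
        i
        for i, (r, s) in enumerate(zip(reference.upper(), sequence.upper()), start=1)
        if r != s
    ]

def count_alleles(isolates: list[str]) -> int:
    """Number of distinct sequence variants among isolates of one species."""
    return len({seq.upper() for seq in isolates})

def conserved_polymorphisms(reference: str, isolates: list[str]) -> list[int]:
    """Positions where every isolate carries the same difference from the reference."""
    shared = None
    for iso in isolates:
        diffs = {
            (pos, iso.upper()[pos - 1])
            for pos in polymorphic_positions(reference, iso)
        }
        shared = diffs if shared is None else shared & diffs
    return sorted(pos for pos, _ in (shared or set()))

if __name__ == "__main__":
    # Toy aligned sequences, not data from this study.
    type_species = "ACGTACGT"
    isolates = ["ACGTACCT", "ATGTACCT", "ACGTACCT"]
    for iso in isolates:
        print(iso, polymorphic_positions(type_species, iso))
    print("alleles:", count_alleles(isolates))
    print("conserved polymorphic positions:", conserved_polymorphisms(type_species, isolates))
```

In the toy data, two of the three isolates are identical, so two alleles are counted, and only the difference shared by all three isolates is reported as conserved.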
Intraspecies variation for the 16S and 23S genes was highest in A. lwoffii (4.35 and 3.4%), M. lacunata (1.66 and 2.49%), M. catarrhalis (0.55 and 2.25%) and Neisseria mucosa (1.64 and 1.05%), respectively. Acinetobacter isolates showed the highest average intraspecies variation (2.10 and 1.76%), followed by Moraxella (0.90 and 1.50%) and Neisseria (0.36 and 0.42%), respectively. Interspecies variation within a genus (Fig. 1) ranged from 6.2 to 6.1% for Moraxella spp. to 4.7 to 4.9% for Acinetobacter spp. Interspecies variation between the genera (Fig. 1)
To demonstrate the relationships between species, the aligned DNA sequences were used to produce a distance matrix for pairs of sequences, from which trees were constructed by the neighbor-joining method of Saitou and Nei (28). All species used in this study fell into one of two large groups on the tree. One group included all isolates of the genera Chromobacterium, Eikenella, Iodobacter, Kingella, Neisseria, and Oligella and comprises the β subclass of the Proteobacteria. The other group contained all isolates of the genera Acinetobacter, Moraxella, Psychrobacter, and Suttonella and constitutes the γ subclass of the Proteobacteria. It is interesting that Oligella spp. clustered with the β subclass of Proteobacteria according to 16S rDNA sequence analysis but formed a separate group when analyzed by partial 23S rDNA sequences.
N. macacae and N. mucosa subsp. mucosa ATCC 19696 had identical 16S and 23S rDNA sequences. The 23S genes of Moraxella equi, M. lacunata ATCC 17967, and of Neisseria elongata subsp. glycolytica and N. elongata subsp. nitroreducens also showed sequence identity. The species clusters were heterogeneous in both genes for A. lwoffii, M. lacunata, and N. mucosa. N. meningitidis isolates failed to cluster in one group only in the case of the 23S rDNA subset. These N. meningitidis isolates grouped neither according to serotypes nor according to multilocus enzyme electrophoretic types. No clustering according to the Moraxella subgenera Branhamella or Moraxella could be observed.
The EMBL database contained 19 sequences from the same culture collection strains that covered the 16S rDNA region examined in this study. These sequences were compared against our newly generated sequence chromatograms. The EMBL sequences contained 56 ambiguities, whereas our own sequences showed only 3 ambiguities. Furthermore, we detected eight substitutions, two insertions, and one deletion that could be attributed to possible sequencing errors in these published sequences (i.e., an error rate of 0.78%).

DISCUSSION
Our objective was to find a hypervariable rDNA stretch, flanked by strongly conserved regions, which can be used for molecular identification of species within the Neisseriaceae and Moraxellaceae families. The longest coherent variable region in the 16S rDNA fulfilling these criteria spans the region from E. coli positions 54 to 510 and in the 23S rDNA spans the region from positions 1400 to 1600 (16). This is well illustrated by the quantitative map of nucleotide substitution rates in bacterial rDNAs published by Van de Peer et al. (37). The inter- and intrageneric relationships of members of the Neisseriaceae and Moraxellaceae were therefore investigated by carrying out comparative sequence analysis of PCR-amplified partial 16S and 23S rDNAs of these regions in a total of 94 strains. An ideal region should show a low intraspecies and a high interspecies variability. When the DNA sequences of the 16S and 23S rRNA genes were used as a measure of the intraspecies and interspecies distances within a genus, no great differences between the two regions could be observed (Fig. 1). Only in the case of the 23S rDNA in Neisseria was the interspecies variation significantly lower than that of the 16S rDNA. The interspecies distances between genera for the 23S rDNA sequences, on the other hand, were consistently higher than those for the 16S rDNA. A comparison of an absolute measure (i.e., polymorphic positions), however, showed that 16S rDNA always had significantly more variable positions, and this was because the relevant region in the 16S rDNA was more than twice as long as that in the 23S rDNA (Table 2). We therefore concluded that the selected region of the 16S rDNA is more suitable than that of the 23S rDNA for identification purposes because of its greater length.
The intraspecies variation within A. lwoffii alone was nearly as high as the interspecies differences in the genus Acinetobacter in general, with the result that molecular identification of this heterogeneous species will be relatively difficult. This finding pinpoints a potential problem in the interpretation of rDNA sequence analysis and indicates that a reliable classification system will require a complete genetic database. The taxonomy of Acinetobacter is, however, still incomplete and evolving, with only seven species currently named (nomenspecies) out of a total of 18 known genomic species, which indicates that a further subdivision of Acinetobacter species may be needed (5,13,38). Another potential limitation of the 16S and 23S rDNA method is the insufficient discriminatory power for recently diverged species, as seen in this study in the case of N. macacae and N. mucosa subsp. mucosa ATCC 19696 (8, 21, 39). On the other hand, even by phenotypic means these two entities are not clearly distinct species. In general, sequential (e.g., second-line) sequencing of the more variable 16S/23S rDNA spacer region (1,14) and a polyphasic approach (i.e., a combination with other pheno- or genotypic techniques) (31) should solve this problem. A further problem in using the rDNA sequencing approach for identification purposes is the possible intercistronic heterogeneity between different rRNA operons (18). Despite these limitations, DNA sequence-based microbial identification is expected to play a major role in clinical microbiology laboratories in the future because of its speed, reproducibility, and potential for automation. High-quality DNA sequence data, in combination with DNA microarray techniques, may revolutionize diagnostic laboratory procedures (35).
The lack of adequate quality control procedures for public database entries, and the associated difficulties when these data are used in medical diagnostics, is documented by the high error rates of sequences detected in this study. Therefore, a diagnostic library preferably should rely on culture collection strain sequences and make their primary sequence data (i.e., the sequence chromatograms) available to users for purposes of intersubjective control of their data.
According to 16S rDNA sequence analysis, the genera Chromobacterium, Eikenella, Iodobacter, Kingella, Neisseria, and Oligella form one cluster, which is a part of the β subclass of the Proteobacteria. Acinetobacter, Moraxella, Psychrobacter, and Suttonella are separated into another cluster belonging to the γ subclass of the Proteobacteria (6,13,24). Signature sequence positions for these subclasses determined by Woese (41), most notably the E. coli position 485, also support this grouping. In contrast, the relationships among subgroups in our study differ slightly depending on the gene analyzed and also differ from previous results. This might be due to interspecies recombination events, since these species belong to a group of bacteria that frequently exchange chromosomal genes (30).
New RIDOM entries of bacterial genera are compared with the American Society for Microbiology's Manual of Clinical Microbiology to guarantee that all medically relevant pathogens are included (19). For quality control reasons, only sequences from strains held in culture collections are included in the database. In addition, the electropherograms of the sequences are deposited on our World Wide Web server, thus allowing detailed comparison of the sequences generated. The classification of species entries is made in accord with the NCBI's "Guidelines and Conventions for the purpose of Biological classification." Therefore, we implemented the phylogeny tree of the Ribosomal Database Project (17) to reflect current phylogenetic knowledge on prokaryotes. In the case of nomenclatural terms, we adhered as closely as possible to the recommendations of the draft BioCode as found at the Royal Ontario Museum Website (http://www.rom.on.ca/biodiversity/biocode/biocode1997.html). To ensure updated taxonomic entries and avoid outdated nomenclature, bacterial names are checked with the aid of the Deutsche Sammlung von Mikroorganismen und Zellkulturen's Bacterial Nomenclature up-to-date database (http://www.dsmz.de/bactnom/bactname.htm), which is based on valid names published in the International Journal of Systematic and Evolutionary Microbiology. The hierarchic ordering and naming of different levels within the "organ-browser" is adapted from the "NLM-MeSH Tree Structures-Category C. Diseases" as found at the National Library of Medicine Website (http://www.nlm.nih.gov/mesh/). The Council for International Organizations of Medical Sciences' International Nomenclature of Diseases (4) and the World Health Organization's International Classification of Diseases (42) were used to relate standardized disease terms to specific microorganisms and to facilitate low background noise links to Internet databases.
The logic incorporated in the recently published and now commercially available MicroSeq system (PE Biosystems, Foster City, Calif.) is comparable to that in the RIDOM system (33). It also uses nonragged 16S rDNA sequences for microbial identification purposes. The database of MicroSeq currently contains over 1200 full-length, high-quality ATCC culture collection bacterial strain sequences. Feature-rich Macintosh analysis software enables the comparison of any unknown sequence with the sequences in the database. However, some fundamental differences exist between the two systems. Most notable of these is the fact that the RIDOM system, due to its open hypertext structure, allows the incorporation of other useful Internet sources. Another important difference is the inclusion of phenotypic methods and non-rDNA targets for species identification purposes (a polyphasic approach). Furthermore, the RIDOM approach is far-reaching in that it not only tries to include sequences and species names in its database but also includes additional information related to taxonomy and disease. Finally, RIDOM is specifically designed for medically important organisms for both humans and animals, whereas MicroSeq in its current form concentrates on environmental isolates for food and pharmaceutical industry quality control needs.
The RIDOM system currently offers the Neisseriaceae and Moraxellaceae dataset for general use and demonstration purposes, and an expansion of the entries to other classes of bacteria and fungi is now under development. If the similarity search score is too low because the species in question is not yet incorporated in our database, the user may directly conduct a search of the NCBI GenBank using NCBI's sequence similarity search tool BLAST. Even if the user has no sequence for comparison, our database can still be searched or an Internet metasearch for species information related to nomenclature, phylogeny, or disease can be carried out. The RIDOM persistent uniform resource locator is http://purl.oclc.org/net/ridom, which is currently associated with the following URL: http://www.ridom.hygiene.uni-wuerzburg.de/. E-mail contact is possible using the address webmaster@ridom.hygiene.uni-wuerzburg.de.
In conclusion, our data show that it is possible to identify most Neisseriaceae and Moraxellaceae species by partial rDNA sequencing and that the 16S rDNA region examined in this study is more suitable for molecular diagnosis than the partial 23S rDNA. A genetic database should be exhaustive, that is, it should include more than just one representative strain of each species. A molecular diagnosis system should involve both different molecular targets and additional analytical procedures, since not all species can be differentiated by partial 16S rDNA sequencing alone.