Department of Population Medicine and Diagnostic Sciences, College of Veterinary Medicine, Cornell University, Ithaca, New York 14853,¹ Gastroenteric Disease Center, Department of Veterinary Science, The Pennsylvania State University, University Park, Pennsylvania 16802,² Microbial Evolution Laboratory, National Food Safety and Toxicology Center, Michigan State University, East Lansing, Michigan 48824³
Received 27 July 2005/ Returned for modification 28 September 2005/ Accepted 11 January 2006
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
In order to ensure public health and to monitor biological threats, a rapid, sensitive, and specific diagnostic assay is necessary for the identification of E. coli pathotypes. However, the bacterial genomes are extremely dynamic, and the ability of the organisms to acquire genetic elements, such as pathogenicity islands and virulence factors, from one another in the environment makes it difficult to identify the pathogens (16). Currently employed diagnostic assays, such as biochemical and immunological marker assays, PCR, reverse transcription-PCR, nucleic acid hybridization assays, and other bioassays are not comprehensive because they focus on the specific detection of a single target rather than multiple indicators of the pathogen. DNA microarrays provide the obvious method for exploring the genome at the molecular level. Screening of multiple markers makes it possible to determine the genetic and virulence profiles of a single strain or to distinguish one strain from others. Increasing the number of genetic regions examined will increase the confidence of correct identification and is especially important for E. coli, in which virulence and genetic profiles are pertinent since they may change due to lateral gene transfer.
The recently emerging DNA microarray or gene chip technology allows us to comprehensively screen thousands of genes arrayed on a single glass microscope slide, making microarrays potentially useful for the typing of bacterial pathogens. Microarrays have been used for the differentiation of bacterial and viral pathogens and the identification of virulence factors (2-4, 10, 18, 20-23, 25, 26). However, a drawback of the current research is that the typing of bacterial species by the use of DNA microarrays is based purely on a few virulence genes, some of which have been shown in many studies to be shared among pathotypes and therefore cannot be conclusive determinants for the differentiation of pathotypes.
In this study, we used a new approach by taking advantage of the genomic sequences of E. coli and have developed an oligonucleotide spotted array (70-mers) representing the known E. coli pathotype virulence genes (those of EHEC, EPEC, UPEC, ETEC, EAEC, and EIEC), specific genes (those of E. coli O157 EDL933, E. coli K-12 MG1655, and E. coli CFT073), common genes (those of E. coli O157 EDL933, E. coli K-12 MG1655, and E. coli CFT073), and negative controls (core genes of Salmonella and dimethyl sulfoxide buffer without oligonucleotides). Standardization of the DNA microarray was done with reference strains of E. coli, and then the validity of the array was assessed with known clinical pathotypes of E. coli. The pathotype category, the virulence profile, and its relation with other categories were determined; the results indicated that the oligonucleotide DNA microarray can be widely applicable for clinical diagnosis and epidemiological surveys.
| MATERIALS AND METHODS |
|---|
Microarray printing. The probes (70-mers) were synthesized (Illumina Inc.), suspended in 50% dimethyl sulfoxide, and spotted in triplicate onto Ultra-GAPS glass slides (Corning Inc., Corning, N.Y.) at the Cornell Microarray Core Facility (www.bigredspots.cornell.edu). Autoblank was also used as a negative control.
DNA preparation and labeling. The genomic DNA of the E. coli pathotypes was prepared according to the manufacturer's instructions by using DNeasy kits (QIAGEN). The harvested genomic DNA was digested with Sau3AI (New England Biolabs, Beverly, Mass.) and was purified by using a QIAquick PCR purification kit (QIAGEN). The purified fragments were labeled according to the protocol of P. Brown (http://cmgm.stanford.edu/pbrown/protocols/4_genomic.html). The purified fragments were mixed with 15 µg of random hexamers (Amersham, Piscataway, N.J.), boiled for 5 min, and immediately cooled on ice. Deoxynucleoside triphosphates (6 nmol each of dATP, dGTP, and dTTP and 3 nmol of dCTP [Amersham]), 10 U of Klenow enzyme (New England Biolabs), and 3 nmol of Cy3-dCTP (Amersham) were added, and the mixture was incubated for 2 h at 37°C. The labeled probes were purified and concentrated with Microcon YM-30 (Millipore) to a volume of 12 µl or less. To the concentrated probe, 1 µl of 10 mg/ml salmon sperm DNA, 1 µl of 4 mg/ml yeast tRNA, 3.5 µl of 20x SSC (1x SSC is 0.15 M NaCl plus 0.015 M sodium citrate), and 1.2 µl of 5% sodium dodecyl sulfate (SDS) were added, and the mixture was made up to 20 µl with water. The hybridization mixture was boiled for 2 min, and the denatured probe was kept at 37°C for 30 min prior to hybridization.
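The final composition of the 20-µl hybridization mixture follows from the volumes listed above by the usual dilution rule (C1 x V1 = C2 x V2). The short Python sketch below works through that arithmetic; it assumes the probe was concentrated to exactly the stated maximum of 12 µl, and the variable and function names are illustrative only.

```python
# Volumes (in microliters) and stock concentrations as given in the text.
# PROBE_UL assumes the probe sits at the stated upper limit of 12 ul.
PROBE_UL = 12.0
FINAL_UL = 20.0

additive_volumes_ul = {
    "salmon sperm DNA": 1.0,  # 10 mg/ml stock
    "yeast tRNA": 1.0,        # 4 mg/ml stock
    "SSC": 3.5,               # 20x stock
    "SDS": 1.2,               # 5% stock
}


def final_concentration(stock, volume_ul, final_ul=FINAL_UL):
    """Dilution rule C1 * V1 = C2 * V2, solved for C2."""
    return stock * volume_ul / final_ul


# Water needed to bring the mixture to the final volume.
water_ul = FINAL_UL - PROBE_UL - sum(additive_volumes_ul.values())

ssc_x = final_concentration(20.0, additive_volumes_ul["SSC"])
sds_percent = final_concentration(5.0, additive_volumes_ul["SDS"])
salmon_mg_per_ml = final_concentration(10.0, additive_volumes_ul["salmon sperm DNA"])
trna_mg_per_ml = final_concentration(4.0, additive_volumes_ul["yeast tRNA"])

print(f"water to add: {water_ul:.1f} ul")
print(f"final SSC: {ssc_x:.1f}x, final SDS: {sds_percent:.1f}%")
print(f"salmon sperm DNA: {salmon_mg_per_ml:.1f} mg/ml, tRNA: {trna_mg_per_ml:.1f} mg/ml")
```

With a smaller probe volume the amount of water increases accordingly, while the final concentrations stay the same because the final volume is fixed at 20 µl.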
Microarray experiments and data analysis. The microarray chip contained oligonucleotides (70-mers) representing open reading frames (ORFs) of the virulence genes from each E. coli pathotype (EHEC, EPEC, UPEC, ETEC, EAEC, and EIEC), specific genes (those of E. coli O157 EDL933, E. coli K-12 MG1655, and E. coli CFT073), common genes (those of E. coli O157 EDL933, E. coli K-12 MG1655, and E. coli CFT073), and negative controls (core genes of Salmonella and autoblanks). Each slide had triplicate spots of each ORF. Immediately before use, the slide was incubated in prehybridization solution (5x SSC, 0.1% SDS) at 55°C for 1 h and was then rinsed with 1x SSC, 0.2x SSC, and finally, Milli-Q water. The slides were dried either by blowing nitrogen gas on them or by centrifugation (600 rpm for 5 min). Denatured probe was slowly applied to the slide through the edges of a LifterSlip coverslip (Erie Scientific, Portsmouth, N.H.), the slide was immediately placed in a hybridization chamber (Corning, Elmira, N.Y.), and hybridization was allowed to occur overnight by submerging the slide in a 55°C water bath. The slides were successively washed with the following four solutions: solution I, 2x SSC-0.1% SDS at 60°C for 30 min; solution II, 1x SSC at 37°C for 10 min; solution III, 0.2x SSC at 37°C for 10 min; and solution IV, Milli-Q water at 37°C for 5 s. Finally, the slides were dried by centrifugation or with nitrogen gas, as described above, and scanned on a GenePix 4000A scanner (Axon Instruments, Union City, Calif.).
Each microarray experiment with reference strains was repeated five times, and the experiments with the clinical isolates of the E. coli pathotypes were repeated twice. Since the array was spotted in triplicate, a single hybridization yielded three sets of data for each gene. Fluorescence data for triplicate spots on one slide were collected separately by the use of GenePix Pro 6.0 software. The data were then analyzed by using Avadis 3.3 prophetic software (http://avadis.strandgenomics.com/). The pixel intensity value of each spot was corrected for the background and averaged for triplicate spots. The log value of the average pixel intensity was taken and was subjected to descriptive statistical analysis. The data were filtered so that genes with a mean value greater than two times the standard deviation (P < 0.05) were considered positive. Hierarchical clustering was performed by using the Pearson absolute method as the distance metric.
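The spot-calling steps described above (background correction, averaging of triplicate spots, log transformation, and a two-standard-deviation filter) can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure implemented in the analysis software: the text does not give the log base (log2 is assumed), does not say which distribution the standard deviation is taken from (the negative-control spots are assumed here), and the flooring of corrected intensities at 1 is added only to keep the logarithm defined. All intensity values are invented.

```python
import math
import statistics


def log_mean_signal(foreground, background):
    """Background-correct replicate spots, average them, and take log2.

    Corrected intensities are floored at 1.0 (an assumption) so that the
    logarithm is always defined.
    """
    corrected = [max(f - b, 1.0) for f, b in zip(foreground, background)]
    return math.log2(statistics.mean(corrected))


def call_positive(probe_signals, negative_signals, n_sd=2.0):
    """Call a probe positive when its log signal exceeds the mean of the
    negative-control signals plus n_sd sample standard deviations.

    Using the negative controls as the reference distribution is an
    assumed reading of the filter described in the text.
    """
    cutoff = statistics.mean(negative_signals) + n_sd * statistics.stdev(negative_signals)
    return {gene: value > cutoff for gene, value in probe_signals.items()}


# Invented triplicate readings: (foreground, local background) per probe.
raw = {
    "stx1": ([5200, 4800, 5000], [200, 200, 200]),
    "bfpA": ([260, 240, 250], [200, 200, 200]),
}
signals = {gene: log_mean_signal(fg, bg) for gene, (fg, bg) in raw.items()}

# Invented log2 signals for negative-control spots.
negative_controls = [5.0, 5.5, 6.0, 5.5]

calls = call_positive(signals, negative_controls)
print(calls)
```

Keeping the cutoff multiplier as a parameter (`n_sd`) makes it easy to see how sensitive the positive calls are to the choice of threshold.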
| RESULTS |
|---|
EHEC EDL933 showed 100% reactivity to all of its virulence and specific genes. Although the EPEC E2348/69 genome has not been completely sequenced, some of its virulence phenotypes are well characterized. Typical EPEC strains are known to have a locus of enterocyte effacement (LEE) region and an EAF plasmid. The presence of tir and genes encoding the type III secretion system suggests the presence of LEE, whereas the presence of bfp and per indicates the presence of the EAF plasmid. EPEC also shares a common set of virulence factors with EHEC, which includes the genes of the LEE region; these genes were designated EHEC-EPEC virulence genes, although the probe sequences were derived from EPEC E2348/69. On the basis of these criteria, EPEC E2348/69 showed 100% reactivity to its specific genes and the EHEC-EPEC virulence gene probes, whereas EHEC EDL933 showed only 74% reactivity to the EHEC-EPEC virulence gene probes. The nonreactive probes of EDL933 include orf19, tir, eae, espA, espB, and espD. These genes are already known to be variable on the basis of their sequences in EDL933. Since these gene probes were based on the sequence of E2348/69, EDL933 was not reactive to them. However, the probes for the tir-2, espA2, and espB2 genes, which were designed on the basis of the EDL933 genome sequence, were reactive only to EDL933 and not E2348/69, indicating the uniqueness of the probes used.
CFT073, the prototype strain of UPEC, showed 100% reactivity to CFT073 virulence and specific genes.
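The percent reactivity values reported throughout this section are the fraction of probes in a category that gave a positive call. A minimal Python sketch of that calculation is shown below; the probe identifiers, category names, and calls are placeholders and do not reproduce the array layout or any result from this study.

```python
def percent_reactivity(calls, categories):
    """Return the percentage of positive probes in each probe category.

    calls: dict mapping probe id -> True/False (positive call)
    categories: dict mapping category name -> list of probe ids
    Probes missing from `calls` are treated as nonreactive.
    """
    result = {}
    for category, probes in categories.items():
        positives = sum(1 for probe in probes if calls.get(probe, False))
        result[category] = round(100.0 * positives / len(probes), 1)
    return result


# Placeholder probe sets and calls, for illustration only.
categories = {
    "virulence": ["v1", "v2", "v3", "v4"],
    "specific": ["s1", "s2"],
}
calls = {"v1": True, "v2": True, "v3": False, "v4": True, "s1": True, "s2": True}

reactivity = percent_reactivity(calls, categories)
print(reactivity)
```

Because the denominator is the number of probes in the category, a category that pools probes for many alternative factors will give a low percentage even when every factor actually carried by the strain is detected.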
EAEC is characterized by the following virulence genes: aaf, agg, astA, and pet (12). Our array detected astA, aafA, aafB, aggR, and pet in the reference strain of EAEC (O42). The strain showed reactivity to all the EAEC virulence genes except aggA (91%). The aggA-specific gene probe was derived from an EAEC isolate (O3:H2) and not from EAEC O42, which might be the reason for the lack of reactivity.
Salmonella core genes were considered the negative control. These genes were derived by comparison of the genomes of different isolates of Salmonella and were absent from E. coli K-12 and E. coli O157:H7. As expected, 100% reactivity of the Salmonella core gene probes with isolates of Salmonella enterica serovar Typhimurium was observed (data not shown). E. coli pathotypes showed reactivity to 4 of 81 oligonucleotide probes specific for Salmonella. This might be because the Salmonella core genes established by McClelland et al. were compared with only two E. coli genomes (14).
The autoblank, which consisted of spots of dimethyl sulfoxide without any probes, was created as another negative control to make sure that nonspecific hybridization to the slide was not occurring. None of the autoblanks showed a fluorescent signal.
The specificity of the array for the differentiation of reference E. coli pathotypes was 98%. This specificity was estimated on the basis of the numbers of true-negative spots (autoblanks) and false-positive spots (afaD, ibeB).
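For reference, specificity is conventionally computed as TN / (TN + FP), where a spot that gives a signal although the gene is absent counts as a false positive. The Python sketch below applies that standard definition; the counts are placeholders chosen for illustration and are not the spot counts from this study.

```python
def specificity(true_negatives, false_positives):
    """Standard definition: TN / (TN + FP), returned as a percentage."""
    total = true_negatives + false_positives
    if total == 0:
        raise ValueError("at least one negative spot is required")
    return 100.0 * true_negatives / total


# Placeholder counts, not taken from the study.
example = specificity(true_negatives=49, false_positives=1)
print(f"specificity: {example:.1f}%")
```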
Validation of virulence, pathotype-specific, and common gene probes with clinical isolates of E. coli pathotypes. All three clinical isolates of EHEC showed 100% reactivity to EHEC-specific virulence gene probes. With regard to the EDL933-specific gene probes, two clinical isolates (C2856 and C2860) showed reactivities of 96.5 and 98%, respectively, and one clinical isolate (C2860) showed only 77% reactivity. Similar to reference strain EDL933, clinical strains C2856, C2858, and C2860 showed lower reactivity to EHEC-EPEC-specific gene probes (69.5, 52, and 65%, respectively), indicating that the genomes of clinical strains are more closely related to the EDL933 genome. All the clinical strains of EHEC showed lower reactivities to K-12-specific genes and CFT073-specific genes (Table 2).
EPEC clinical isolate C2816 was 82 and 96% positive for the EPEC and EPEC-EHEC virulence gene probes, respectively, while EPEC clinical isolate C2814 was 88 and 100% positive, respectively, indicating that these two strains belong to the EPEC pathotype.
J96, another prototype strain of UPEC, showed 81 and 90% reactivities to CFT073 virulence genes and specific genes, respectively. Probes for the papA, sfaA, fsoE, papGII, sat, and iucD genes failed to react with the J96 strain. Probes for CFT073-specific genes kpsE, kpsD, and kpsM (K15 capsule) and probes for two hypothetical genes also showed no reactivity to strain J96. The K15 capsule locus of strain J96, however, has not been characterized so far.
UPEC clinical strains C2824, C2828, and C2832 showed 86, 86, and 68% reactivities to reference strain CFT073-specific virulence gene probes, respectively, and 98, 97, and 89% reactivities to CFT073-specific genes, respectively. It is well known that UPEC strains exhibit greater diversity in their virulence genes, but screening with typical UPEC CFT073-specific genes indicates that these clinical isolates belong to the UPEC pathotype with a divergence in their virulence genes.
EAEC contains heterogeneous virulence markers. In comparison with the reference strain (O42), both EAEC clinical strains showed reactivity to aggR, but only one of them was positive for astA and one of them showed reactivity to aggD, aggB, and aggA. The clinical isolates of EAEC showed 32 to 40% and 12 to 16% reactivities to the K-12- and EDL933-specific genes, respectively.
The clinical isolates with a known phenotype of EIEC and ETEC showed 100% and 28% reactivities to the virulence gene probes for these pathotypes, respectively, in the microarray analysis. Since we used gene probes for all the toxins and various colonization factors synthesized by different ETEC isolates, the overall percentage of reactivity to ETEC virulence genes is low. However, ETEC isolates showed reactivity to their respective genotypes (toxins) in the array. ETEC clinical isolates showed marked differences in reactivity with ETEC virulence genes, like the EAEC isolates did (Table 2), indicating that these categories contain heterogeneous virulence markers.
The reactivities of the individual gene probes for each pathotype are represented in the supplemental material (Appendix S2). All the clinical isolates clustered with their respective reference E. coli pathotypes, indicating the potential for the use of microarray analysis for differentiation of the pathotypes. The diversity of pathotypes and their genetic relatedness are illustrated in Fig. 1.
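The grouping of isolates with reference strains rests on a correlation-based distance between hybridization profiles. The Python sketch below illustrates the idea with a simplified nearest-reference assignment rather than full hierarchical clustering. It assumes that the "Pearson absolute" distance is 1 - |r|, which is a common definition but is not confirmed by the text, and all profile values are invented.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def pearson_absolute_distance(x, y):
    """Assumed definition: 1 - |r|, so 0 means perfectly (anti)correlated."""
    return 1.0 - abs(pearson(x, y))


def nearest_reference(profile, references):
    """Return the name of the reference profile closest to `profile`."""
    return min(
        references,
        key=lambda name: pearson_absolute_distance(profile, references[name]),
    )


# Invented log-intensity profiles over five probes.
references = {
    "EDL933": [12.1, 11.8, 5.2, 5.0, 11.5],
    "CFT073": [5.1, 5.3, 12.0, 11.7, 11.6],
}
isolate = [11.9, 11.2, 5.5, 6.0, 11.0]

closest = nearest_reference(isolate, references)
print(closest)
```

Note that taking the absolute value treats strongly anticorrelated profiles as close, so with this distance the result should be checked against the sign of the correlation when profiles are expected to be mirror images of one another.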
| DISCUSSION |
|---|
Recently, microarrays targeting multiple virulence factors have been developed for the detection of pathogenic E. coli (1). However, the microarray analysis was evaluated as positive or negative by the color of the fluorescence intensity. Each spot in the array is subjected to variability in intensity, and the evaluation of the array by the color of the fluorescence intensity is difficult, especially when the fluorescence intensity is compared with the background, which may eventually lead to artifacts. We have calculated the background locally for each spot rather than globally for the entire image, which might have enhanced the quality of each spot. The microarray data were analyzed by taking the log value of the median without the background, and the data were further filtered statistically by taking two times the standard deviation (P < 0.05) of the median value. Pathogenic E. coli strains associated with human and animal diseases are remarkably diverse and show variability in the genes encoding surface antigens or virulence factors (5). These variable genes may not bind or may bind poorly to the probes in the array. Seventy-mer oligonucleotide probes rather than cDNA PCR probes were used in this multiple-target assay, since oligonucleotide probes are a cost-effective alternative to cDNA PCR probes. Furthermore, the oligonucleotide-based arrays provide a reduction in cross-hybridization and an increase in the differentiation of highly homologous regions (6, 11, 23).
The microarray's ability to detect virulence, pathotype-specific, and common genes was primarily analyzed with reference strains of E. coli pathotypes, such as nonpathogenic K-12 MG1655, EHEC O157 EDL933, EAEC O42, EPEC E2348/69, and UPEC CFT073. To validate the array further, clinical isolates representing each E. coli pathotype were used. The results indicate that microarray analysis could potentially differentiate E. coli pathotypes not only on the basis of virulence genes but also on the basis of specific and common genes. Even the potentially powerful microarray showed false-positive results for afaD, ibeB, and traT. Bekal et al. developed a microarray targeting virulence factors for the detection of pathogenic E. coli, and that microarray also showed false-positive results for aagA and cdt (1).
Comparison of the LEE regions of EPEC E2348/69 and EHEC E. coli O157 showed that tir, eae, espA, espB, and espD are more diverse (7, 13). This diversity is evident in our study with EHEC reference and clinical isolates. UPEC isolates also contained heterogeneous virulence genes. UPEC prototype strain J96 showed reactivity to papC, papE, papG1, papGII, papX, papK, papJ, papH, and papI but not to papA or papA2. None of the clinical isolates showed reactivity to papA or papA2, supporting the papA subunit diversity in UPEC isolates (8, 19). EIEC isolates also contained different virulence genes (1, 17), but the clinical isolates of EIEC evaluated in this study had similar virulence profiles.
E. coli pathotypes are categorized by the presence of virulence factors. The virulence factors are acquired from numerous sources, including bacteriophages, plasmids, and the genomes of other bacteria (5). It is apparent that the virulence genes in pathogenicity islands are transferable and could contribute to the heterogeneity of E. coli strains (5). Therefore, it is ideal to track the pathotypes on the basis of the virulence genes. However, it would be better to also include nontransferable and specific genes of E. coli pathotypes, as the number of pathotype categories continues to increase. Screening with multiple markers, such as virulence, specific, and core genes of E. coli, helps us to identify the emergence of new pathotypes and would also allow us to assess the relative genetic and virulence profiles of a single strain in comparison with particular pathotypes.
An unknown pathotype can be identified primarily by its percentage of reactivity to the virulence genes of EHEC, UPEC, EPEC, EAEC, EIEC, and ETEC. The pathotype-specific genes constructed on the basis of the available genome sequences of strains EDL933, CFT073, and K-12 enhance the accuracy of identification of EHEC and UPEC. However, ETEC and EAEC are not well characterized compared to the other pathotypes, and their virulence genes are heterogeneous among isolates. Therefore, the reactivities of certain genes, especially those for toxins and fimbrial types (ETEC sta, stb, and lt; EAEC astA and aggR), can be considered important criteria for denoting the pathotype of an isolate. Although the clinical isolates of the E. coli pathotypes showed gene contents similar to those of reference strains, variations in gene content between pathotypes and between individual strains were also observed by DNA microarray analysis. This indicates that the oligonucleotide array not only allows us to differentiate between pathotypes but also is useful for rapid strain characterization. The ability to characterize genome deletions of clinical isolates relative to the sequences of reference strains may allow us to reconstruct the phylogeny. The microarray that was developed could differentiate the major pathotypes, but the analysis of reference and clinical isolates of EHEC 2, EPEC 2, and DAEC strains may enhance its validity for wider application for the diagnosis of E. coli infections. Since the virulence genes are heterogeneous within each category, especially for ETEC, EAEC, and UPEC, the establishment of core genes of each pathotype would help us to identify the pathotype. With the completion of more E. coli genomic sequences, the inclusion of more specific genes from each pathotype may enhance the identification of pathogenic E. coli strains and their pathotypes.
| ACKNOWLEDGMENTS |
|---|
| FOOTNOTES |
|---|
Supplemental material for this article may be found at http://jcm.asm.org/.
| REFERENCES |
|---|