Journal of Clinical Microbiology, October 2006, p. 3752-3759, Vol. 44, No. 10
0095-1137/06/$08.00+0 doi:10.1128/JCM.00998-06
Copyright © 2006, American Society for Microbiology. All Rights Reserved.
Center for Biologics Evaluation and Research, Food and Drug Administration, Rockville,1 W. Harry Feinstone Department of Molecular Microbiology and Immunology, Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland,2 Measles, Mumps, Rubella, and Herpesvirus Branch, Centers for Disease Control and Prevention/WHO Global Measles Reference Laboratory, Atlanta, Georgia,3 Victorian Infectious Diseases Reference Laboratory/WHO Western Pacific Region Measles Reference Laboratory, Melbourne, Victoria, Australia,4 Vaccine-Preventable Virus Infections Unit, National Institute for Communicable Diseases, Johannesburg, Republic of South Africa,5 Institute of Molecular Biology, State Research Center of Virology and Biotechnology "Vector," Koltsovo, Novosibirsk Region, Russian Federation6
Received 12 May 2006/ Returned for modification 6 June 2006/ Accepted 20 July 2006
The classification system for wild-type MV is based on the sequences of the 450 nucleotides coding for the 150 amino acids at the C terminus of the N protein (24, 32, 34, 35, 37). The intergenotype diversity within this region of the N gene is greater than 2.5%, and the most divergent MV genotypes differ by as much as 12%. Measles viruses are classified into eight clades (A to H) that are currently subdivided into 23 recognized genotypes (A, B1 to B3, C1, C2, D1 to D10, E, F, G1 to G3, H1, and H2) (24, 37). Sequence analysis of PCR products is currently the most practical, cost-effective, and accurate method for MV genotyping. Other genotyping methods, such as restriction fragment length polymorphism (RFLP) (22, 29), the heteroduplex mobility assay (HMA) (18), refractory mutation analysis (27), genotyping by nucleotide-specific multiplex PCR (19), and real-time PCR (31), have been proposed. Most of these methods have inherent shortcomings and can differentiate only a limited number of MV genotypes. RFLP-based methods depend on the availability of restriction sites suitable for analysis, and the results of HMA are often difficult to interpret and reproducibility is low. Some of these techniques are technically challenging, expensive, and difficult to standardize. Furthermore, these methods are not amenable to high-throughput screening and lack the sensitivity of sequence analysis. Data from these various alternative approaches cannot be easily reported to a central database, so results obtained in different laboratories cannot be easily compared.
As the WHO measles laboratory network expands, additional methods for the genetic characterization of MV will be desirable. For example, high-throughput screening techniques may be necessary if the number of specimens increases significantly. Characterization of larger regions of the genome or complete genomes may be required in order to increase the sensitivity of the molecular epidemiologic analysis and to efficiently monitor multiple genetic characteristics of the virus.
DNA microarray technology is an efficient tool for rapid genetic analysis of microorganisms including transcription profiling, resequencing, single-nucleotide polymorphism (SNP) analysis, and genotyping of bacterial and viral pathogens (4-6, 10, 21). Short oligonucleotide probes (oligoprobes) enable discrimination of samples with minor genetic differences. MV genotypes may differ by only a few nucleotides, and these differences are not always conserved even within a specific genotype. Unique signature nucleotide patterns capable of distinguishing all genotypes may not be readily identifiable, and accurate microarray discrimination of closely related genotypes presents a challenge. We propose a novel approach for microarray design and analysis that relies on the recognition of patterns of hybridization signals from a large number of genotype-specific and control oligoprobes. This method has allowed us to correctly identify the genotypes of most tested samples, including a previously unidentified genotype that was not included in the initial microarray design.
Sample amplification, transcription, and hybridization.
Samples of cDNA/DNA for the assay validation were initially amplified using previously described primers (MV 60 and MV 63) and cycling conditions (26). Second-round amplification of DNA samples was conducted using two newly designed PCR primers, Measles_F (GCTATGCCATGGGAGTAGGAGTGGAACTTG) and Measles_R_T7 (TAATACGACTCACTATAGGGCGGCCTCTCGCACCTAGTCTAGAAG). The latter contained the bacteriophage T7 promoter (the 5′-terminal TAATACGACTCACTATAGGG portion of the sequence) to facilitate later in vitro transcription. The 50-µl reaction mixture contained 2.5 U of HotStarTaq DNA polymerase, 1x reaction buffer supplemented with 2.5 mM MgCl2 (QIAGEN, Chatsworth, CA), 0.4 µM each primer, 0.2 mM each deoxynucleoside triphosphate, and 2 µl (100 ng) of template DNA. PCR was performed using GeneAmp PCR system 9700 or 2720 (Applied Biosystems, Foster City, CA) with an initial 15-min activation at 95°C, followed by 40 cycles of 40 s at 94°C, 40 s at 57°C, and 1 min at 72°C, and then a final 10-min extension at 72°C. Further in vitro transcription, RNA labeling, and hybridization were performed as previously described (30), except that second-round PCR products from the validation study were not purified prior to single-stranded RNA (ssRNA) synthesis.
Design of oligonucleotide probes. The genetic variability within the 450 nucleotides coding for the 150 amino acids at the C terminus of the N protein (nucleotides 1233 to 1682) was analyzed by aligning all of the sequences available in GenBank (more than 700). Preliminary analysis of these sequences allowed us to reduce the data set to 405 unique sequences. The final data set contained 15 sequences for genotype A, 2 for B1, 2 for B2, 18 for B3.1 (similar to New York.USA/94), 46 for B3.2 (similar to Ibadan.NIE/97/1), 29 for C1, 36 for C2, 12 for D1, 6 for D2, 22 for D3, 41 for D4, 29 for D5.1 (similar to Palau.BLA/93), 24 for D5.2 (similar to Bangkok.THA/93/1), 38 for D6, 7 for D7.1 (similar to Victoria.AUS/16.85), 6 for D7.2 (similar to Illinois.USA/50.99), 11 for D8, 4 for D9, 6 for E, 2 for F, 2 for G1, 3 for G2, 4 for G3, 33 for H1, and 7 for H2. Phylogenetic analysis of samples was conducted using Mega (version 3.1) software (20).
A set of oligoprobes specific to different MV genotypes was designed using custom Oligoscan (version 2.12) software. While most genotypes could be unambiguously identified by unique genotype-specific oligoprobes, some did not contain unique sequences suitable for probe design. The identification of these genotypes was based on recognition of unique combinations of signals from probes common to two or more genotypes. Due to the low level of genetic divergence between MV genotypes, the largest fraction (46.5%) of the genotype-specific microarray probes differed from the sequences of other MV genotypes by only one nucleotide. Among the other probes, 42.5% contained two mismatches, 9.6% contained three, and 1.4% contained four. To increase the reliability of MV genotyping, all genotype-specific oligoprobes were complemented with control oligoprobes in which the unique bases (located near the center of each probe) were replaced with the nucleotide most commonly observed in other genotypes. Therefore, the ratios of signals from genotype-specific oligoprobes to signals from the respective control oligoprobes were used for analysis rather than the hybridization intensities from genotype-specific oligoprobes alone. The sequences and characteristics of all oligoprobes used in the study can be found in Table S2 in the supplemental material. The oligonucleotides had melting temperatures of 45°C and were synthesized by Operon Biotechnologies (Huntsville, AL).
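The pairing of a genotype-specific probe with its control probe can be illustrated with a minimal Python sketch. It is not the Oligoscan implementation: the function names `make_control_probe` and `mismatches` and the toy sequences are hypothetical, and the sketch assumes that the probe window has already been aligned across genotypes without gaps.

```python
from collections import Counter


def make_control_probe(specific, others):
    """Derive a control probe from a genotype-specific probe.

    specific: probe sequence for the target genotype.
    others:   gap-free sequences of the same window from other genotypes
              (assumed to be pre-aligned and of equal length).
    A base is treated as unique when no other genotype carries it at that
    position; unique bases are replaced by the base most common among the
    other genotypes.
    """
    if any(len(s) != len(specific) for s in others):
        raise ValueError("all sequences must have the same length")
    control = []
    for i, base in enumerate(specific):
        column = [s[i] for s in others]
        if base in column:
            control.append(base)  # shared base: keep as is
        else:
            control.append(Counter(column).most_common(1)[0][0])
    return "".join(control)


def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


# Toy example (hypothetical sequences, not taken from Table S2).
specific = "ACGTTGCAAGCTTAGGC"
others = ["ACGTTGCAGGCTTAGGC", "ACGTTGCAGGCTTAGGC", "ACGTTGCATGCTTAGGC"]
control = make_control_probe(specific, others)
print(control, mismatches(specific, control))
```

In this toy case the specific and control probes differ at a single central position, which corresponds to the one-mismatch class of probes described above.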
Microarray design, manufacture, and hybridization conditions. The microarray consisted of 145 oligoprobes (71 pairs and 3 controls) spotted three times each onto the surfaces of CodeLink activated slides according to the protocols described previously (30). During microarray fabrication, each spotting mixture contained, besides the MV oligonucleotide, a quality control (QC) oligonucleotide with an arbitrary sequence unrelated to measles virus RNA (molar ratio, 20:1, respectively). Before hybridization, a Cy3-labeled ssRNA MV sample was mixed with a Cy5-labeled anti-QC oligonucleotide (with a sequence complementary to that of the QC oligonucleotide probe present in each spot). The final hybridization mixture contained 1 to 2 µM Cy3-labeled ssRNA sample and 0.2 µM Cy5-labeled anti-QC oligonucleotide in 1x MICROMAX hybridization buffer III (Perkin-Elmer, Boston, MA). Hybridization was conducted at 50°C for 60 min, followed by the standard washing procedure described previously (30).
Scanning and data analysis. The ScanArray 5000 microarray analysis system (Perkin-Elmer, Boston, MA) with 632-nm (for Cy5) and 543-nm (for Cy3) lasers was used. The Cy3 fluorescent signal and local background were measured for each microarray element and analyzed by ScanArray Express software (Perkin-Elmer, Boston, MA). Cy5 images were used for assessing the appearance of each spot, the spot morphology, and the uniformity of hybridization conditions among all surfaces of the microarray. The validation study was performed using an Axon 4200AL scanner and GenePix Pro software (version 6.0; Molecular Devices, Sunnyvale, CA).
The fluorescence intensities obtained from genotype-specific oligoprobes were divided by signals obtained from the respective control oligoprobes. If both the specific and control oligoprobes yielded weak hybridization signals (less than four times the local background), the ratios were ignored and considered to be zero.
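The ratio rule above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `probe_ratio`, its parameters, and the example intensities are hypothetical, and it assumes raw (not background-subtracted) positive intensities with one local background value per spot.

```python
def probe_ratio(specific, control, bg_specific, bg_control, min_fold=4.0):
    """Return the specific/control signal ratio for one probe pair.

    The pair is scored as 0.0 when BOTH signals are weaker than
    min_fold times their local background (four times, per the text).
    """
    if control <= 0:
        raise ValueError("control signal must be positive")
    weak_specific = specific < min_fold * bg_specific
    weak_control = control < min_fold * bg_control
    if weak_specific and weak_control:
        return 0.0
    return specific / control


# Hypothetical intensities with a local background of 100 for every spot.
print(probe_ratio(5200, 650, 100, 100))  # strong specific signal
print(probe_ratio(300, 250, 100, 100))   # both weak, scored as zero
print(probe_ratio(350, 1600, 100, 100))  # only the control is strong
```

Note that a pair is kept whenever either member exceeds the threshold, so a strong control signal with a weak specific signal yields a small but nonzero ratio.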
The distance between two hybridization patterns was estimated based on the Pearson correlation coefficient (16) between two sets of hybridization data expressed as ratios (see "Design of oligonucleotide probes" above). The Pearson correlation coefficient was calculated using the formula

$$P_{XY} = \frac{\sum_{i}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i}(X_i - \bar{X})^2 \, \sum_{i}(Y_i - \bar{Y})^2}}$$

where $X_i$ and $Y_i$ are the ratios obtained for probe $i$ in the two samples, and $\bar{X}$ and $\bar{Y}$ are the respective averages for all probes in hybridization patterns for two samples. The distance between two hybridization patterns was postulated to be $\log(P_{XY})$, and the complete matrix of all pairwise distances was used to construct the dendrogram showing relationships between the hybridization patterns. The dendrogram was built by the topological optimization method previously described by Chumakov and Iushmanov (11, 15). All distance calculations and tree construction were done automatically by using custom Oligoscan software. For illustration purposes, after completion of the study, all hybridization pattern data were organized in the form of a rectangular table and analyzed by Cluster (version 3.0) software (12). Independent clustering of columns and rows of a data set was performed to classify different samples according to their hybridization profiles (column clustering) and to outline pairs of oligonucleotides associated with each group of MV strains (row clustering). The same agglomerative algorithm was used to perform both hierarchical clusterings. The distance matrix was calculated using a metric based on Pearson's correlation coefficient (see "Scanning and data analysis" above). The average linkage model for measuring distances between items was implemented in the Cluster program. Clustering results were visualized using Java TreeView (version 1.0.12) software (http://sourseforge.net/project/showfiles.php?group_id=84593). Exact binomial 95% confidence intervals (95% CI) for sensitivity and specificity were calculated using STATA software (version 7.0; StataCorp, College Station, TX).
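As a rough illustration of the Pearson-based comparison described above, the following Python sketch computes the correlation between ratio patterns and assembles a pairwise distance matrix. The names `pearson` and `distance_matrix` and the toy patterns are hypothetical. The sketch uses 1 minus the correlation as the distance, a conventional stand-in that is an assumption here and not the log-based transform stated in the text, and it does not reproduce the topological optimization used for tree building.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length patterns."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("patterns must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant pattern")
    return sxy / math.sqrt(sxx * syy)


def distance_matrix(patterns):
    """Map every ordered pair of sample names to 1 - Pearson correlation.

    Assumption: 1 - r is used as the distance for illustration only.
    """
    names = list(patterns)
    return {
        (a, b): 1.0 - pearson(patterns[a], patterns[b])
        for a in names
        for b in names
    }


# Hypothetical specific/control ratios for three samples over five probe pairs.
patterns = {
    "sample_1": [3.1, 0.2, 0.4, 2.7, 0.0],
    "sample_2": [2.8, 0.0, 0.5, 3.0, 0.3],
    "sample_3": [0.1, 2.9, 3.3, 0.2, 0.4],
}
dist = distance_matrix(patterns)
for pair, d in sorted(dist.items()):
    print(pair, round(d, 3))
```

A matrix of this form is the input expected by an agglomerative (for example, average linkage) clustering step.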
Preliminary evaluation of the microarray was performed using 63 DNA samples, which included 20 previously tested reference strains (REF samples) (see Table S1 in the supplemental material); 41 coded MV DNA samples, also obtained from the Centers for Disease Control and Prevention (CDC samples); and 2 samples of F and E genotypes, provided by R. Fernández-Muñoz (Hospital Ramón y Cajal, Instituto Nacional de la Salud, Madrid, Spain) and G. Tipples (Public Health Agency of Canada, Winnipeg, Manitoba). The results of microarray analysis are shown in Fig. 1.
FIG. 1. Hierarchical cluster analysis of hybridization patterns of viral samples. The color-coded data matrix shows ratios of fluorescence (specific to the respective control oligoprobes) arranged in both dimensions by hierarchical clustering. The last two rows show hybridization data from oligoprobes common for all MV strains. Individual hybridization profiles for each sample are shown in columns (labeled at the top of each column), and hybridization data from each oligoprobe are presented in rows, labeled on the right. Upper and lower color palettes show data matrix scales for specific/control oligoprobe ratios and normalized signals from broadly specific oligoprobes, respectively. The tree on the left shows clustering based on the similarity of hybridization oligoprobe specificities. The tree at the top shows clusters obtained by comparison of hybridization patterns of samples with different MVs. Genotypes are given at the top. Genotype B2 samples SA1, SA2, SA3, and SA15 were erroneously identified as belonging to genotype B3 (asterisked). VIDRL samples are from the Victorian Infectious Diseases Reference Laboratory.
FIG. 2. Results of phylogenetic analysis of sequences of viral samples. A tree was built by the neighbor-joining method, using Mega (version 3.1) software (20), on the basis of the number of nucleotide substitutions. MV genotypes (boldfaced) are arranged outside the circle, and the strains belonging to each genotype are given inside the circle. Recently isolated genotype B2 strains (28) are boldfaced and italicized.
To assess the reproducibility of the microarray assay, several MV samples were independently analyzed using different lots of microarray slides prepared at different times (Fig. 1, samples B2 REF and B2 REF R, CDC 13 and CDC 13 R, C1 REF and C1 REF R, CDC 19 and CDC 19 R, CDC 20 and CDC 20 R, CDC 21 and CDC 21 R, and G1 REF and G1 REF R). In all cases, the hybridization patterns were almost identical. Samples belonging to the same MV genotype, with identical N gene sequences, always produced identical hybridization patterns (Fig. 1, samples CDC 11 and CDC 12 of genotype D4, samples CDC 19, CDC 20, and CDC 21 of genotype D8, samples G3 REF and CDC 30 of genotype G3, and samples CDC 33, CDC 34, and CDC 35 of genotype H1).
To demonstrate that the microarray assay was specific, we tested samples of other Paramyxovirinae, mumps virus and Nipah virus. The presence of PCR products for the Nipah virus samples was unexpected and was probably caused by nonspecific primer binding at the high template concentration. However, the RNA transcribed from those amplicons did not hybridize with any MV-specific oligoprobes of the microarray.
Validation of MV microarray genotyping. In addition, two different blinded panels of samples were evaluated at the Johns Hopkins Bloomberg School of Public Health using microarray slides made at the FDA laboratory. Of the 55 samples in panel 1, 39 represented 14 different MV genotypes, including genotypes D1 and D2, which were not previously tested during assay evaluation (see Materials and Methods). The remaining 16 samples of the panel represented other viruses that also cause fever and rash. MV was detected and subsequently genotyped for 35 of the 39 MV samples (sensitivity, 89.7% [95% CI, 75.8 to 97.1%]). The results of genotyping were 100% concordant with the genotypes previously identified by sequence analysis. All other samples from panel 1 were negative by both PCR and microarray hybridization (100% specificity). Panel 2 contained 15 samples of different MV genotypes. The microarray detected MV in 14 samples (sensitivity, 93.3% [95% CI, 68.1 to 99.8%]). It is not clear why the remaining sample failed to be amplified by PCR. The specificity for this panel could not be assessed because all samples were positive for MV. The genotypes of 10 of 14 samples were correctly identified by microarray analysis (71.4% genotype agreement). The four samples that were not correctly identified belonged to genotype B2; the failure reflected the greater degree of genetic drift between these recently isolated MV strains (28) and the reference strain for genotype B2, isolated in 1983. Due to the 1.7 to 2.0% nucleotide difference between the recent wild-type strains and the reference sequence, the microarray mistakenly identified these samples as genotype B3.2. Sequence analysis of these genotype B2 isolates showed that correct genotype assignment by microarray was indeed possible but would require designing new oligoprobes. We intend to include these oligoprobes in newer versions of the MV microarray. Thus, the combined sensitivity of the microarray based on these two panels was 90.7% (95% CI, 79.7 to 96.9%).
Because the reduction in sensitivity was caused exclusively by the failure to amplify MV RNA from samples that may have been compromised by inadequate international shipping conditions, we expect that sensitivity would be increased by testing samples that are shipped and stored under appropriate conditions.
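The sensitivity figures and their exact binomial intervals can be recomputed with a short Python sketch. It implements the Clopper-Pearson (exact) interval by bisection on the binomial distribution using only the standard library. It is an independent illustration and not the STATA routine used in the study; the function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve(f, target, increasing, tol=1e-10):
    """Find p in [0, 1] with f(p) == target for a monotonic f, by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes out of n trials."""
    # Lower bound: P(X >= k | p) = alpha/2, which increases with p.
    lower = 0.0 if k == 0 else _solve(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2, True)
    # Upper bound: P(X <= k | p) = alpha/2, which decreases with p.
    upper = 1.0 if k == n else _solve(
        lambda p: binom_cdf(k, n, p), alpha / 2, False)
    return lower, upper


# Detected / total MV samples as reported for the two validation panels.
for label, k, n in [("panel 1", 35, 39), ("panel 2", 14, 15), ("combined", 49, 54)]:
    lo, hi = exact_ci(k, n)
    print(f"{label}: sensitivity {100 * k / n:.1f}% "
          f"(95% CI, {100 * lo:.1f} to {100 * hi:.1f}%)")
```

The printed intervals can be compared directly with the values reported above for each panel.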
The microarray technique described here is amenable to high-throughput implementation and enables simultaneous multilocus analysis of the pathogen genotype as well as accurate analysis of mixed pathogen populations. It should be noted that microarray technology has emerged recently and is still in the developmental stage. It certainly has the potential to be significantly improved and simplified and to become more robust and efficient, particularly for the purposes of pathogen detection and identification. For example, the replacement of fluorescent dyes (e.g., Cy5 and Cy3) with nanogold particles (1) will allow users to significantly reduce the cost of analysis and to use inexpensive imagers with digital cameras instead of sophisticated microarray scanners. Therefore, microarray technology has unique features that can make it very attractive and a valuable instrument for rapid detection and genotyping of different viral and bacterial pathogens.
Regardless of the platform, all microarray methods for genotyping of microorganisms are based on the use of relatively short (15- to 40-mer) oligoprobes that hybridize only with sequences from a particular subset of species. Although several computer programs are currently available for the automated design of unique microarray oligoprobes, the design of oligoprobes continues to be a challenge, particularly for closely related species that differ by one or a few point mutations. Possible cross-hybridization with unrelated oligoprobes also decreases the utility of this approach. The use of routine methods for the design of oligoprobes for phylogenetically informative parts of the N gene did not produce a sufficient number of unique genotype-specific oligoprobes. To overcome this obstacle, we used multiple group-specific oligonucleotides in addition to genotype-specific oligonucleotides. Thus, MV genotyping relied not only on the comparison of hybridization data from genotype-specific oligoprobes but also on analysis of complex hybridization patterns, which included signals from multiple oligoprobes specific to overlapping groups of MVs. Pairs of oligonucleotides that included specific and control probes were used to detect single-base substitutions.
To identify the genotype of an unknown MV sample, we compared its hybridization pattern with patterns obtained from reference strains. The Pearson correlation coefficient was used to evaluate the difference between hybridization patterns. Computing all possible pairwise comparisons between microarray results for clinical and reference samples resulted in a distance matrix used to construct a dendrogram showing the relatedness of samples. This technique, routinely used in microarray-based methods of transcription analysis, has not previously been used for genotyping. We found that MV genotypes identified by cluster analysis of microarray hybridization patterns were consistent with genotypes determined by sequencing. Therefore, microarray hybridization could be an alternative method for rapid, high-throughput genotyping of new clinical isolates, including those with novel genotypes. The procedures used in microarray studies are not yet fully optimized for routine use and are still relatively expensive. Despite these disadvantages, recent breakthroughs in microfluidics, silicon chips, electronics, and nanotechnology promise the creation of small, inexpensive, and user-friendly devices, combining different laboratory instruments into single, self-contained units that will dramatically reduce individual assay costs.
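The comparison of an unknown sample with reference patterns can be sketched as follows. This Python example assigns the genotype of the reference pattern with the highest Pearson correlation, which is a deliberate simplification of the dendrogram-based cluster analysis described in the text. The function names, genotype labels attached to the toy data, and ratio values are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length patterns."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("patterns must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant pattern")
    return sxy / math.sqrt(sxx * syy)


def assign_genotype(sample, references):
    """Return (genotype, r) for the reference best correlated with sample.

    Assumption: nearest-reference assignment stands in for the
    clustering step; it is not the procedure used in the study.
    """
    scores = {g: pearson(sample, ref) for g, ref in references.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical reference patterns (specific/control ratios for five probe pairs).
references = {
    "D4": [3.2, 0.0, 0.4, 2.8, 0.3],
    "H1": [0.2, 2.9, 3.1, 0.0, 0.5],
    "B3": [0.3, 0.4, 0.0, 0.5, 3.5],
}
sample = [2.9, 0.1, 0.5, 3.0, 0.2]
genotype, r = assign_genotype(sample, references)
print(genotype, round(r, 3))
```

A practical implementation would also need a minimum-correlation cutoff so that samples resembling none of the references, such as drifted or novel strains, are flagged instead of being forced into the nearest genotype.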
The main goal of our study was to evaluate the feasibility and usefulness of a pattern recognition approach, widely used in transcription profiling studies, for viral genotyping, particularly for discrimination among viruses with closely related sequences. This approach eliminates the need to design strictly genotype specific oligoprobes for all known genotypes and allows the use of oligoprobes that bind more than one genotype to produce interpretable genotyping information.
This work was partially supported by grants from the DHHS Biotechnology Engagement program (BTEP 16, to K.M.C.), the National Health and Medical Research Council of Australia (grant 282418, to M.A.R.), and the Bill and Melinda Gates Foundation (grant 3522, to D.E.G.) and by grants from the Elizabeth Glaser Pediatric AIDS Foundation (51331-28-PG) and the Thrasher Research Fund (02818-9) to W.J.M.
Supplemental material for this article may be found at http://jcm.asm.org/.