Journal of Clinical Microbiology, November 2006, p. 3940-3946, Vol. 44, No. 11
0095-1137/06/$08.00+0 doi:10.1128/JCM.01146-06
Copyright © 2006, American Society for Microbiology. All Rights Reserved.
Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, Michigan,1 Central Arkansas Veterans Healthcare Center,2 Department of Neurobiology and Developmental Sciences, College of Medicine, University of Arkansas for Medical Sciences,3 Department of Epidemiology, College of Public Health, University of Arkansas for Medical Sciences,4 Arkansas Department of Health, Little Rock, Arkansas5
Received 2 June 2006/ Returned for modification 4 July 2006/ Accepted 27 August 2006
In addition, LSPs among M. tuberculosis clinical isolates have been reported. Among 100 clinical isolates, 68 regions of difference (RDs) of M. tuberculosis have been identified by DNA microarray analysis and sequencing (21). Some of the RDs had frequencies of deletion of more than 20% (21). Among these RDs, the RD105 deletion was exclusively found in all strains in a genetically related group, named the Beijing/W lineage (20); thus, the RD105 deletion is suggested to be a marker of Beijing/W lineage strains (20). In addition, RD142, RD150, and RD181 have been found to subdivide the Beijing/W lineage strains into four subgroups, a group with concurrent deletions of RD105, RD181, and RD142; a group with concurrent deletions of RD105, RD181, and RD150; a group with concurrent deletions of RD105 and RD181; and a group with only the RD105 deletion (20). The Beijing/W lineage includes a large number of M. tuberculosis strains circulating around the world. Strains from the Beijing/W lineage have been associated with global transmission and drug resistance (3). Three principal genetic groups, based on the single-nucleotide polymorphisms of the katG gene codon 463 and the gyrA gene codon 95, have been used to describe the divergence of M. tuberculosis complex strains (9, 10, 19). The Beijing/W lineage strains of M. tuberculosis were classified into group 1 (9, 10). Given the reported association between the deletions of the four RDs and the Beijing/W lineage, exploration of the relationship between the three genetic groups and the RD deletions with a population-based sample may provide new insight into the evolution of M. tuberculosis.
Despite the report of the relationship between a given set of RD deletions and a family of M. tuberculosis clinical strains, whether these RDs have epidemiological and clinical significance beyond their associations with the Beijing/W strains remains to be investigated. Current tuberculosis (TB) control strategies assume that all clinical strains of M. tuberculosis are equally transmissible and virulent in humans, but if different RD genotypes account for different biological attributes of M. tuberculosis strains regarding virulence and transmissibility, alternative TB control strategies may be in order. To gain a better understanding of the clinical and epidemiological relevance of the presence or absence of the RDs and to explore the usefulness of these RDs as markers for a particular lineage of subpopulations, we screened a clinically and epidemiologically well characterized population-based collection of clinical isolates for the presence or absence of the four RDs associated with the Beijing/W lineage and RD239, which was included because of its high frequency of deletion among clinical isolates reported previously (21); we also analyzed the association between the RD genotypes and the clinical and epidemiological characteristics of the patients.
Patient data. Patient information was obtained from the surveillance records of the Arkansas Department of Health and Human Services. This database included demographics, social and behavioral characteristics, and clinical features.
The study protocols and procedures for the protection of human subjects were approved by the Health Sciences Institutional Review Boards of the University of Michigan and the University of Arkansas for Medical Sciences.
Detection of RD deletions. A two-step experiment, that is, microarray-based hybridization followed by PCR, was conducted to determine the presence or absence of RD105, RD142, RD150, RD181, and RD239 in our isolate collection. As described below, the microarray hybridization was conducted by using the Library on a Slide platform (29). As the first step of the screening, our microarray experiment conditions were set up so that we would minimize the chance of falsely detecting the presence of the RDs under investigation, thereby maximizing the chance for catching all the existing deletions. The identification of the true RD deletions among those found by the microarray-based hybridization was done by PCR. When the size of a PCR product was different from that of positive control strain H37Rv, automated DNA sequencing was conducted to identify insertions or deletions in the RDs.
Microarray-based hybridization. The genomic DNA of the study isolates, at a concentration of 1 µg/µl, was printed on Vivid gene array slides (Pall Life Sciences, West Chester, PA) by using a VersArray ChipWriter Pro system (Bio-Rad Laboratories, Hercules, CA). Each sample was printed twice on the same slide. A sequence within the 16S rRNA gene was used as a quantification probe to check the quantity of genomic DNA on the slides. All the hybridization probes were made by PCR with primers flanking a unique sequence within the RDs studied (Table 1). The PCR-amplified probes were purified and labeled with fluorescein-12-dCTP (Perkin-Elmer, Wellesley, MA) by using a BioPrime labeling kit (Invitrogen, Carlsbad, CA). The genomic DNA on the slides was hybridized with the RD probes by using a Super HYB kit with 50% formamide (Molecular Research Center, Inc., Cincinnati, OH) at 45°C. The slides were subsequently washed twice with a low-stringency buffer (2x SSC [1x SSC is 0.15 M NaCl plus 0.015 M sodium citrate], 0.1% sodium dodecyl sulfate [SDS]) for 5 min at room temperature and then twice with a high-stringency buffer (0.2x SSC, 0.1% SDS) for 20 min at 45°C. The slides were then incubated with a blocking solution containing 0.1 M Tris buffer, 0.3 M NaCl, and 10% blocking reagent (Amersham Biosciences, Piscataway, NJ) at room temperature for 1 h and then further incubated with a 5,000-fold-diluted conjugated antibody, anti-fluorescein-alkaline phosphatase Fab fragment (Roche, Basel, Switzerland) with blocking solution. The postdetection wash was done three times with a low-pH washing solution containing 0.1 M Tris (pH 7.5), 0.3 M NaCl, and 0.1% Tween 20 for 10 min and then with a high-pH washing solution containing 0.1 M Tris (pH 9.5), 0.1 M NaCl, and 0.01 M MgCl2 three times for 5 min each time. The colors of the spots on the slides were developed with an ArrayIt alkaline phosphatase kit (TeleChem, Sunnyvale, CA). 
The slides were scanned with an ArrayIt Microarray SpotWare colorimetric scanner (TeleChem).
TABLE 1. RDs evaluated and primers of probes for microarray hybridization
PCR and sequencing. PCR was conducted for all isolates showing RD deletions identified by microarray analysis. PCR was also done to investigate the deletion of the five RDs for isolates that did not have a sufficient amount of DNA for microarray-based hybridization (n = 102). In addition, to evaluate the sensitivity and the specificity of the microarray-based hybridization, we conducted PCR of RD105 for all 648 study isolates. The primers used for PCRs of different RDs were the same as those described previously (20), except for the primers for PCR of RD142 (5'-TCC GCG ACG ACG AAC AAC GAC GAC-3' and 5'-GGC GGC GGA GAC GAC AGC AGG ATT-3'). The BD Advantage 2 PCR kit (BD-Biosciences Clontech, Palo Alto, CA) was used for the PCRs for RD105 and RD239. The BD Advantage GC-2 kit (BD-Biosciences Clontech) was used for the PCRs for RD142, RD150, and RD181. The thermocycling parameters for PCR assays were as follows: for RD105, 1 cycle at 94°C for 1 min, followed by 26 cycles of 94°C for 30 s, 68°C for 30 s, and 72°C for 4 min 45 s, with completion with a final cycle at 72°C for 10 min; for RD142, 1 cycle at 94°C for 2.5 min, followed by 26 cycles of 94°C for 45 s, 68°C for 45 s, and 72°C for 4.5 min, with completion with a final cycle at 72°C for 10 min; for RD150, 1 cycle at 94°C for 2.5 min, followed by 26 cycles of 94°C for 45 s, 68°C for 45 s, and 72°C for 3 min 45 s, with completion with a final cycle at 72°C for 10 min; and for RD181 and RD239, 1 cycle at 94°C for 1 min, followed by 26 cycles of 94°C for 30 s, 68°C for 30 s, and 72°C for 2 min 13 s, with completion with a final cycle at 72°C for 10 min.
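The evaluation of the sensitivity and the specificity of the microarray-based hybridization against PCR can be sketched as follows. This is only an illustration, assuming that PCR is treated as the reference result and that a deletion call is the positive outcome; the function name and the calls used in the usage lines are hypothetical and are not data from this study.

```python
def sensitivity_specificity(array_calls, pcr_calls):
    """Compare microarray deletion calls with PCR calls (the reference).

    Each argument is a sequence of booleans, one entry per isolate:
    True means "deletion called", False means "region present".
    """
    if len(array_calls) != len(pcr_calls):
        raise ValueError("both call lists must cover the same isolates")
    pairs = list(zip(array_calls, pcr_calls))
    tp = sum(1 for a, p in pairs if a and p)          # deletion by both methods
    fn = sum(1 for a, p in pairs if not a and p)      # deletion missed by the array
    tn = sum(1 for a, p in pairs if not a and not p)  # region present by both methods
    fp = sum(1 for a, p in pairs if a and not p)      # deletion called by the array only
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical calls for five isolates (not study data).
array_calls = [True, True, False, False, True]
pcr_calls = [True, False, False, False, True]
sens, spec = sensitivity_specificity(array_calls, pcr_calls)
```

Under this convention, a screening step tuned to catch every deletion would be expected to show high sensitivity at the cost of some false-positive deletion calls, which is why PCR confirmation follows.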
When a PCR failed to generate the expected product, PCR of the 16S rRNA gene was performed to confirm the quantity and the quality of the DNA templates (4). For isolates generating a PCR product with a size different from that of positive control strain H37Rv, as visualized on a 1.0% agarose gel in 1x TBE (Tris-borate-EDTA) buffer, automated DNA sequencing was conducted to detect insertions or deletions in the RDs by using the corresponding PCR forward and reverse primers. By consideration of cost-efficiency, a two-step strategy was applied for DNA sequencing. First, we randomly selected one-third of the PCR products from each size group for DNA sequencing. Second, if the sequencing results showed differences among the PCR products within the same group, all the remaining PCR products in this group were sequenced; if the sampled PCR products showed identical sequences, the remaining products from the same group were not sequenced. Sequence comparison was performed by using the software Edit Seq 5.02 and MegAlign 5.01 (DNAStar Inc., Madison, WI).
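The two-step sequencing strategy described above can be sketched as follows. This is a minimal illustration, not the procedure as actually run: the function and argument names are hypothetical, `sequence_fn` stands in for the sequencing step, and rounding the one-third sample up to at least one product per size group is an assumption.

```python
import math
import random


def two_step_sequencing(products_by_size, sequence_fn, rng=None):
    """Sequence a random third of each size group, then the rest only if needed.

    products_by_size: dict mapping a size-group label to a list of product IDs.
    sequence_fn: callable returning the sequence (a string) for a product ID.
    Returns a dict mapping each group label to {product ID: sequence} for
    every product that was actually sequenced.
    """
    rng = rng or random.Random()
    results = {}
    for group, products in products_by_size.items():
        if not products:
            results[group] = {}
            continue
        # Step 1: randomly sample one-third of the group (assumed: round up).
        n_sample = max(1, math.ceil(len(products) / 3))
        sampled = rng.sample(products, n_sample)
        sequenced = {p: sequence_fn(p) for p in sampled}
        # Step 2: if the sampled sequences differ, sequence all remaining products.
        if len(set(sequenced.values())) > 1:
            for p in products:
                if p not in sequenced:
                    sequenced[p] = sequence_fn(p)
        results[group] = sequenced
    return results
```

A group whose sampled products all give the same sequence is left at the one-third sample; any disagreement within the sample triggers sequencing of the whole group.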
Statistical analysis. The distributions of epidemiological and clinical characteristics among the different RD genotypes were compared by the χ² test or Fisher's exact test, as appropriate. When the association between clustering and the regions studied was analyzed, one isolate was randomly selected from each cluster. Considering the likelihood that cases in a cluster would not be independent, we used generalized estimating equations (GEEs) to control for potential intracluster dependence when we assessed the associations between the different RD genotypes and the clinical characteristics of the disease (15, 28). The magnitude of the associations was estimated by using the odds ratio (OR) and 95% confidence intervals (CIs). The disease sites were classified into thoracic and extrathoracic by using the definitions described previously (26). Briefly, thoracic TB was defined as disease sites confined to the lung, pleura, and intrathoracic lymph nodes, while extrathoracic TB was defined as cases of extrathoracic disease with or without concurrent disease within the thoracic cavity. Adjustment for the potential confounding by the four previously identified risk factors for extrathoracic TB was performed when the association between the RD genotype and the site of disease was analyzed by the use of logistic regression models. These factors included human immunodeficiency virus (HIV) serostatus, gender, race/ethnicity (26), and the genotype of the plcD gene (13, 27). To assess the individual effect of each of these four potential confounding factors, we first fit a base model that included only the RD genotype as an essential variable; we then added the four previously identified risk factors into the base model, one at a time, to fit an additional four models, designated models 2, 3, 4, and 5, respectively. A final model, designated model 6, was fit by adding all four risk factors for extrathoracic TB into the base model to adjust for the potential confounding of all four previously known risk factors for extrapulmonary TB. All statistical analyses were done with SAS, version 9.0 (SAS Institute).
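For a single 2-by-2 table, an odds ratio and its 95% confidence interval can be computed as in the sketch below, which uses the standard Wald interval on the log odds ratio. This is a simplified illustration only: it does not reproduce the GEE-based logistic regression models fit in SAS, and the function name and the counts in the usage line are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: genotype present, outcome present    b: genotype present, outcome absent
    c: genotype absent, outcome present     d: genotype absent, outcome absent
    z: normal quantile for the interval (1.96 gives roughly 95%).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (not study data): OR = (10 * 40) / (5 * 20) = 4.0
or_value, ci_low, ci_high = odds_ratio_ci(10, 5, 20, 40)
```

With sparse cells, as is likely for the rarer RD genotypes, the Wald interval is unreliable and an exact method would be preferable.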
Nucleotide sequence accession numbers. The GenBank accession numbers for RD105 accompanied by a 9-bp insertion, an 18-bp insertion, and a deletion at the 5' end of Rv0071 are DQ872637, DQ872638, and DQ872636, respectively. The sequence of the isolate that showed a partial deletion of RD105 can be found under GenBank accession number DQ872639. The sequences of isolates confirmed to have deletions of RD181, RD142, RD150, and RD239 can be found in GenBank under accession numbers DQ872640, DQ872641, DQ872642, and DQ872643, respectively.
Frequency distribution of genotypes of RDs. The results of microarray hybridization, PCR, and DNA sequencing were combined to determine the genotypes of the five RDs. Of the 648 isolates analyzed, 39 (6.0%) showed the deletion of RD105 and 1 (0.1%) had a partial deletion of RD105 (2.5 kb). The deletions of other RDs were found to be at a lower frequency compared with the frequency of the RD105 deletion. Thirteen (2.0%), 5 (0.8%), 31 (4.8%), and 18 (2.8%) isolates were found to have deletions of RD142, RD150, RD181, and RD239, respectively.
Relationship between the three principal genetic groups and RD genotypes. The relationship between the three principal genetic groups and the RD genotypes and the range of IS6110 copy numbers in each subgroup defined by RD genotypes were analyzed. Both principal genetic groups 1 and 2 were divided into subgroups by different combinations of deletions of the five RDs. Principal genetic group 3 remained undivided, as no deletion of the five RDs was found in this group (Fig. 1). Principal genetic group 1 was divided into three subgroups on the basis of the deletion of RD105 and RD239. None of the isolates with the RD105 deletion had the RD239 deletion, and all isolates with the RD105 or the RD239 deletion belonged to principal genetic group 1. The RD181 deletion divided the group with the RD105 deletion into two groups: one with the RD181 deletion and one without the RD181 deletion. The deletions of RD142 and RD150 further divided the group with the RD181 deletion into three groups: a group with the RD142 deletion, a group with the RD150 deletion, and a group without the RD142 and RD150 deletions. Principal genetic group 2 was divided into two groups on the basis of the RD105 deletion. In contrast to principal genetic group 1, the majority of the principal genetic group 2 isolates had no deletions in the five RDs investigated. None of the isolates in principal genetic group 3 showed any deletion in these RDs. Distinct IS6110 copy number ranges were observed for several subgroups defined by RD genotypes (Fig. 1).
FIG. 1. Subgrouping of the three principal genetic groups of M. tuberculosis clinical isolates by deletions of RD105, RD142, RD150, RD181, and RD239. In the parentheses, the number before the slash is the number of isolates; the number after the slash is the number of strains. A strain is defined as a unique combination of IS6110 RFLP pattern and pTBN12 genotype. The range of IS6110-hybridizing band numbers for each subgroup of M. tuberculosis isolates is indicated by italicized numbers.
χ² test, P < 0.0001).
Association between RD genotypes and clinical and epidemiological characteristics. On the basis of all the combinations of RD deletions found in this study, the 648 study isolates were classified into one of seven subgroups: subgroup 1, no deletion detected; subgroup 2, the RD239 deletion; subgroup 3, concurrent deletions of RD105, RD181, and RD142; subgroup 4, concurrent deletions of RD105, RD181, and RD150; subgroup 5, concurrent deletions of RD105 and RD181; subgroup 6, deletion only of RD105; and subgroup 7, a partial deletion of RD105. The distribution of the clinical and epidemiological characteristics of the patients together with the genotypes of the plcD gene among these seven groups was analyzed. Statistically significant differences (P < 0.05) in the distributions of these deletions were found by race/ethnicity, age, plcD genotypes, and site of disease (Table 2).
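The assignment of an isolate to one of the seven subgroups can be expressed as a simple lookup, sketched below. The names `SUBGROUPS` and `classify_isolate` and the representation of a genotype as a set of deleted RDs plus a flag for the partial RD105 deletion are assumptions made for illustration; combinations not listed among the seven subgroups are rejected rather than guessed.

```python
# Subgroup numbers follow the seven subgroups defined in the text.
SUBGROUPS = {
    frozenset(): 1,                              # no deletion detected
    frozenset({"RD239"}): 2,                     # RD239 deletion
    frozenset({"RD105", "RD181", "RD142"}): 3,   # concurrent RD105, RD181, RD142
    frozenset({"RD105", "RD181", "RD150"}): 4,   # concurrent RD105, RD181, RD150
    frozenset({"RD105", "RD181"}): 5,            # concurrent RD105 and RD181
    frozenset({"RD105"}): 6,                     # RD105 only
}


def classify_isolate(deleted_rds, partial_rd105=False):
    """Return the subgroup number (1 to 7) for one isolate.

    deleted_rds: iterable of fully deleted RD names, e.g. ["RD105", "RD181"].
    partial_rd105: True for the partial RD105 deletion (subgroup 7); assumed
    here to occur without any other deletion.
    """
    key = frozenset(deleted_rds)
    if partial_rd105:
        if key:
            raise ValueError("partial RD105 deletion combined with other deletions")
        return 7
    if key not in SUBGROUPS:
        raise ValueError("combination not among the seven subgroups: %s" % sorted(key))
    return SUBGROUPS[key]
```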
TABLE 2. Frequency distribution of clinical and epidemiological characteristics and plcD genotypes among the seven RD genotypes, by Fisher's exact test
Multivariate logistic regression analysis of association between RD genotypes and site of disease. The association between the site of disease and RD genotypes found by Fisher's exact test held true in all the logistic models described above. All the models showed that patients infected by an isolate with concurrent deletions of RD105, RD181, and RD142 and by an isolate with concurrent deletions of RD105, RD181, and RD150 were more likely to develop extrathoracic TB than patients infected by an isolate without any deletion of the five RDs investigated. The ORs of the RD genotypes in models 2 through 5 were similar to those obtained in the base model (data not shown). The adjustment for the potential confounding of the four previously known risk factors for extrapulmonary TB by model 6 strengthened the association between the concurrent deletions of RD105, RD181, and RD142 and the site of disease, while the association between the concurrent deletions of RD105, RD181, and RD150 and the site of disease was slightly reduced by this adjustment (Table 3).
TABLE 3. Association between RD genotype and extrathoracic TB assessed by logistic regression models by use of the GEE method without and with adjustment for HIV serostatus, gender, race/ethnicity, and plcD genotype
To our knowledge, this is the first report that concurrent deletions of RD105, RD181, and RD142 and concurrent deletions of RD105, RD181, and RD150 are associated with extrathoracic TB. RD105 includes Rv0071 through Rv0074, RD181 includes Rv2262c and Rv2263c, RD142 includes Rv1189 through Rv1192, and RD150 includes Rv1671 through Rv1674c. Among these genes, Rv1189 is known to encode sigma factor I, which has been reported to be present only in M. tuberculosis and M. bovis (17) and to be involved in the survival of M. tuberculosis during the host-free aerosol particle stage (17). As bacterial sigma factors combine with RNA polymerase to regulate the transcription of other genes, the truncation of sigI might inhibit the expression of other genes. Further exploration of the genes regulated by sigI will help provide an understanding of these associations. The majority of the genes (8/14) in these RDs have no known function. Our findings might provide a clue for future functional studies of these genes.
It should be noted that the observed associations of the two combinations of the RD deletions with extrathoracic TB do not imply causality. The genomes of the study isolates are likely to carry a significant number of mutations in other genes that were not detected by the present investigation, and any of these could be causing the observed phenotype. Furthermore, the numbers of isolates with concurrent deletions of RD105, RD181, and RD142 and of isolates with concurrent deletions of RD105, RD181, and RD150 were relatively small; their associations with extrathoracic TB remain to be confirmed in a sample containing a larger number of isolates carrying these two combinations of RD deletions. If the association is confirmed, these RD deletions would be useful for predicting the clinical outcome of an M. tuberculosis infection in the human host.
Beijing/W lineage strains have been found to be associated with nosocomial and community outbreaks, global transmission, and drug resistance (3). Consistent with previous reports (11, 20), our study also found that all the Beijing/W lineage strains had the RD105 deletion and that the RD105 deletion was not found in non-Beijing/W lineage strains. This suggests that the RD105 deletion can be a useful marker for Beijing/W lineage strains. In contrast to the previous studies, which used selected samples of isolates (11, 20), our study used a 5-year population-based collection of isolates from Arkansas, providing convincing confirmation of the relationship between the RD105 deletion and the Beijing/W lineage.
The three principal genetic groups have been considered markers for the divergence of M. tuberculosis (19), and a recent phylogenetic study further divided these three principal genetic groups into seven single-nucleotide-polymorphism cluster groups (8). We found that two of the three principal genetic groups (groups 1 and 2) can be further subgrouped on the basis of the deletions of the five RDs investigated. Further investigation of the relationship between these RD deletions and the reported single-nucleotide-polymorphism cluster groups would broaden our knowledge of the evolution of M. tuberculosis. Interestingly, most of the subgroups of the principal group 1 isolates in our study have a distinct range of IS6110 copy numbers. The IS6110 copy number ranges in the subgroups based on the deletions of RD181, RD142, and RD150 are almost nonoverlapping (Fig. 1). We previously found that IS6110 insertions are often associated with deletion events in the plcD gene region of the genome (13, 27); however, we did not observe any IS6110 element adjacent to the RD deletions, suggesting that these deletions are not due to the insertion of IS6110. The causes for the observed distribution of the IS6110 copy number range among the RD genotype-defined subgroups remain to be investigated.
In this study, we observed that the isolates in RD genotype-defined subgroups 3 and 4 had the highest IS6110 copy numbers among all the subgroups with different RD genotypes and that these two RD genotypes were associated with extrathoracic TB. We therefore hypothesize that the isolates with higher numbers of IS6110 copies might have experienced more frequent gene interruptions by the insertion of IS6110 in their genomes compared with the number of interruptions in those with lower IS6110 copy numbers. RD genotypes 3 and 4 may be markers for a high frequency of IS6110 insertion, and genes interrupted by the IS6110 insertion in these isolates may play an important role in the pathogenesis of TB.
This study also found that a larger proportion of patients infected by the isolates with the RD105 deletion are of Asian origin and younger than age 65 years compared with the proportion of patients infected by isolates without the RD105 deletion. The association with Asian origin may be explained by the fact that all the isolates with the RD105 deletion are in the Beijing/W lineage, and these isolates are prevalent in Asia (1, 7, 24). The association with younger age might suggest that the cases infected by isolates with the RD105 deletion are more likely to have resulted from recent transmissions, because it was previously found that the proportions of cases of TB disease attributable to recent transmission are higher for young individuals than for elderly individuals (25). The other observation from the study supporting this explanation is that the majority of the isolates with the RD105 deletion belong to the Beijing/W lineage, and isolates of this lineage are known to have caused TB transmission worldwide (3).
However, we did not find a significant association between the RD genotypes and clustering. This could be due to the relatively small sample size of the Beijing/W lineage strains in our study (16 of 424 strains). In addition, as Arkansas is a state with a stable and dispersed population (5), it is possible that the clusters defined by the genotyping results in our study do not always reflect recent transmission; instead, they may represent some transmissions that occurred in the remote past.
The comparison of the results of PCR with those obtained by the microarray-based hybridization showed that the microarray technique can accurately detect those isolates that do not have large deletions of the RDs studied. However, it overestimates the number of isolates with deletions. This detection error might have been caused by printing errors that caused missing spots on the microarray slides or by the printing of a suboptimal amount of DNA on some of the spots due to an overestimation of the DNA concentration for some of the samples. Therefore, it is necessary to perform PCR and DNA sequencing to confirm any deletions found by the microarray-based method. In addition, microarray analysis can detect only the deletion of the sequence complementary to the probe sequences.
In conclusion, this study has advanced our knowledge of the potential clinical and epidemiological relevance of the RD deletions. Functional studies of the genes within these RDs will allow us to gain a better understanding of the observed associations.
We acknowledge Kashef Ijaz's contribution to the establishment of the Arkansas Department of Health's surveillance database that was used for the study. We thank Dong Yang for excellent technical assistance in the culture of M. tuberculosis isolates and the preparation of the genomic DNA used for this study. We are grateful to the Tuberculosis Genotyping Laboratory of the Michigan Department of Community Health for their assistance in verifying the spoligotype of one study isolate.
Published ahead of print on 6 September 2006.