Journal of Clinical Microbiology, April 2009, p. 1119-1128, Vol. 47, No. 4
0095-1137/09/$08.00+0 doi:10.1128/JCM.02142-08
Copyright © 2009, American Society for Microbiology. All Rights Reserved.
Department of Medicine, Research Institute of the McGill University Health Centre, Montreal, Quebec, Canada,1 Respiratory Epidemiology and Clinical Research Unit, Montreal Chest Institute, McGill University, Montreal, Quebec, Canada,2 Laboratoire de Santé Publique du Québec, Sainte-Anne-de-Bellevue, Quebec, Canada3
Received 8 November 2008/ Returned for modification 25 November 2008/ Accepted 30 January 2009
≥6 IS6110 copies) or RFLP in combination with spoligotyping (for isolates with <6 IS6110 copies) do not stray across the LSP-defined lineage boundaries. However, our data also demonstrate the poor discriminatory power of either RFLP or spoligotyping alone for these low-IS6110-copy-number isolates. We believe that this independent validation of the LSP method should encourage researchers to adopt this system in investigations aimed at elucidating the role of strain variation in TB.
While the fact that TB phenotypic diversity exists is no longer in dispute, there is heightened interest in epidemiological circles in establishing the relevance of this diversity to human disease. Of particular note are the recent publications suggesting that strains belonging to the M. tuberculosis W/Beijing lineage possess unique attributes that confer an increased ability to cause disease and to be transmitted within certain geographic settings (10, 12, 19, 20) or to cause extrapulmonary TB in certain patient ethnicities (7, 23). There is also the intriguing suggestion of an interaction between host and bacterial genotypes (6). To increase the power of these types of studies, there is a requirement for a robust system of classifying clinical isolates into a genetically defined set of clades or lineages, each member of which would be expected to share a unique set of phenotypic attributes. However, M. tuberculosis genotypes have traditionally been reported in the literature on the basis of rapidly evolving epidemiologic markers, such as IS6110 restriction fragment length polymorphism (RFLP) fingerprints, spoligotypes, or mycobacterial interspersed repeat units (3, 27). The use of relatively nondescript classifications afforded by these techniques often precludes direct comparisons being made across independent studies. Specifically, the boundaries that define frequently cited strain families such as the Haarlem, Latin-American/Mediterranean, or W/Beijing lineages are vague and inconsistent, and moreover, the nature of the evolutionary relationship between these lineages is unclear.
To understand the epidemiological consequences of M. tuberculosis variability, a less ambiguous system of nomenclature is required to demonstrate clearly the genetic and evolutionary relationships among independent clinical isolates. To date, the most effective means of assigning strains into a small number of unambiguous lineages is the method based on the detection of large sequence polymorphisms (LSPs) or regions of difference (RDs) that represent a series of well-characterized unique-event polymorphisms (deletions) (1, 14, 15, 21, 40). Importantly, these LSP-defined lineages are phylogenetically robust and, unlike IS6110 RFLP, spoligotyping, and mycobacterial interspersed repeat unit analysis, they do not suffer from problems associated with convergent evolution (15, 27). Using this approach, M. tuberculosis isolates are currently classified into six major global lineages, the majority of which tend to show a high degree of geographic restriction. Each of the six lineages is defined by a single ancestral LSP common to all isolates within that particular lineage. In some instances, sublineages have also been identified, each possessing its own unique LSP deletion event. Additional laboratories are also reporting on their own LSPs, such as RD(Rio) (RD174), which appear to define sublineages unique to specific locales (16, 29, 30). However, as Alland et al. have recently pointed out, it is important when building phylogenies based on LSPs to distinguish between those that are unique-event LSPs and those that occur repeatedly across multiple lineages because of their association with repetitive genetic elements (1).
Single nucleotide polymorphism (SNP)-based population analyses are also phylogenetically informative due to an inherent paucity of either synonymous or nonsynonymous substitutions within the M. tuberculosis genome (37). This lack of sequence heterogeneity means that recurrent synonymous SNPs are extremely unlikely to arise within independent lineages. To date, combinations of synonymous SNPs have been used to classify many of the same major lineages as those defined through LSP analysis (13, 15, 17). The major advantage that unique-event LSPs have over SNPs lies in the efficiency and technical simplicity with which isolates can be genotyped and assigned to a particular lineage. Using a multiplex PCR approach, large numbers of isolates can be screened rapidly for the lineage-defining LSPs in a high-throughput manner. In contrast, a minimum of six SNP loci must be screened in order to make the same lineage assignments; furthermore, because some of the lineages differ from each other by only a single SNP, extreme care must be taken in interpreting the results of these assays. In the future, it will be useful to update the broader global classification continually so that it incorporates newly identified LSPs and SNPs as they are uncovered.
The description of the six major global lineages was for the most part based on clinical isolates obtained in San Francisco, CA (14, 15). Because the majority of samples analyzed were collected at one site and represent only a tiny fraction of all TB cases diagnosed each year, the possibility exists that the model proposed by Gagneux et al. (14) suffers from sampling bias and, if so, would not be truly representative of the global situation. In the present study, we have carried out an independent replicate using a data bank of 798 isolates collected over a 6-year period from mostly foreign-born TB patients resident on the island of Montreal, Quebec, Canada. While San Francisco and Montreal are both cosmopolitan cities reflecting a large immigrant community, Montreal, unlike San Francisco, has very low rates of ongoing TB transmission (24). In this manner, our strain collection provides a "snapshot" of TB lineages resident in the more than 80 countries represented. With these isolates, we set out to determine (i) whether the framework outlined by Gagneux and colleagues was robust with the use of an independent sample set, (ii) whether the described association between lineages and geography still held true, and (iii) the association between the LSP/RD-defined lineages and RFLP analysis on the same isolates.
Molecular typing methods. The RD239, RD750, and RD105 deletions (14) are used to specifically define isolates belonging to the "Indo-Oceanic," "East African/Indian," and "East Asian" (or "W/Beijing") strain lineages, respectively. These regional names refer to the fact that, in their original description, each of these lineages appeared to be restricted to patients originating from certain geographic locations. RD9 is absent from all members of the M. tuberculosis complex (MTC) other than M. tuberculosis and M. canettii (5) and was used in the current study to identify potential atypical MTC infections, including those caused by M. africanum (annotated as M. tuberculosis West African-1 and -2 by Gagneux et al. [14]), M. microti, M. caprae, or M. bovis.
DNA was extracted from Lowenstein-Jensen agar slant-grown bacteria using cetyltrimethylammonium bromide-NaCl according to established protocols (2, 36). Lineage-defining LSPs were detected by multiplex PCR using the oligonucleotide primers specified in Table A1 in the appendix. In addition to a "forward" primer specific for the upstream region of each LSP, each multiplex reaction also included two "reverse" primers—one internal to the deleted region and one located immediately downstream of the LSP (14). The two reverse primers were designed so as to allow an unambiguous visual assignment of strain lineage based on the size of the PCR products obtained. Reactions were carried out using a 96-well format in 25-µl volumes that included 10 ng genomic DNA, 0.4 µM of each oligonucleotide, 200 µM deoxynucleoside triphosphates, 1.5 mM MgCl2, 1x Fermentas Taq buffer (+KCl), and 0.5 U Taq polymerase (Fermentas). Where necessary, reactions were optimized through the addition of 5% to 10% dimethyl sulfoxide (Sigma). A denaturation step was carried out at 94°C for 2 min, followed by 35 cycles of 94°C for 10 s, 58°C for 10 s, and 68°C for 30 s. PCR products were electrophoresed on 1% agarose gels and visualized under UV light following ethidium bromide staining.
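The reaction composition above lends itself to a small master-mix calculation. The following is a minimal sketch, assuming hypothetical stock concentrations (10x buffer, 25 mM MgCl2, 10 mM deoxynucleoside triphosphates, 10 µM primers) that are not given in the text; only the final concentrations and the 25-µl volume are taken from the methods. Template DNA, polymerase, and water are left out for brevity.

```python
# Final concentrations come from the methods text; every stock concentration
# below is an assumption chosen only for illustration.
REACTION_VOLUME_UL = 25.0

# name -> (assumed stock concentration, final concentration), same unit per row
COMPONENTS = {
    "Taq buffer (x)": (10.0, 1.0),
    "MgCl2 (mM)": (25.0, 1.5),
    "dNTPs (uM)": (10000.0, 200.0),
    "forward primer (uM)": (10.0, 0.4),
    "internal reverse primer (uM)": (10.0, 0.4),
    "downstream reverse primer (uM)": (10.0, 0.4),
}


def volume_per_reaction(stock, final, total_ul=REACTION_VOLUME_UL):
    """Stock volume (ul) that yields `final` in `total_ul`, from C1*V1 = C2*V2."""
    return final * total_ul / stock


def master_mix(n_reactions, overage=0.10):
    """Per-component volumes (ul) for n reactions plus a pipetting overage."""
    scale = n_reactions * (1.0 + overage)
    return {
        name: round(volume_per_reaction(stock, final) * scale, 2)
        for name, (stock, final) in COMPONENTS.items()
    }


if __name__ == "__main__":
    # One 96-well plate for a single LSP (one forward and two reverse primers).
    for name, volume in master_mix(96).items():
        print(f"{name}: {volume} ul")
```

With different stocks, only the first number in each `COMPONENTS` entry needs to change.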
TABLE A1. Primers used in LSP and DNA sequence analysis
DNA sequence analysis. Isolates belonging to the "Euro-American/African" lineage were identified through sequence analysis of a portion of the polyketide synthase 1-15 gene (pks1-15) that is known to contain a 7-bp deletion within all isolates of this lineage (14, 26). Oligonucleotides used for both PCR amplification and sequencing of this region are listed in Table A1 in the appendix. PCR conditions were identical to those described above, and sequence analysis of the reaction products was carried out at the McGill University and Genome Québec Innovation Centre. Statistical analysis for age, sex, and site of disease revealed no evident bias in those samples subject to sequencing.
Using a multiplex PCR-based approach, all 798 isolates were typed for the presence of the RD239, RD750, RD105, and RD9 LSP markers. The proportion of isolates within our Montreal TB data bank belonging to the Indo-Oceanic lineage (RD239 deleted) was determined to be 17.5%. Similarly, 9.0% of isolates were classified as East Asian (RD105 deleted), 5.9% as East African/Indian (RD750), and 0.8% as atypical MTC (RD9 deleted) (see Table S1 in the supplemental material).
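The marker-to-lineage rules used for this typing can be written down compactly. The sketch below is illustrative only: the function name, the input format (a set of deleted regions per isolate), and the "presumed" label are assumptions, while the marker-to-lineage mapping follows the definitions given in the typing methods.

```python
# Lineage-defining deletions as described in the text.
LINEAGE_BY_DELETION = {
    "RD239": "Indo-Oceanic",
    "RD750": "East African/Indian",
    "RD105": "East Asian",
    "RD9": "atypical MTC",
}


def assign_lineage(deleted_regions, pks1_15_7bp_deleted=None):
    """Assign a major lineage from the set of deleted LSP regions.

    pks1_15_7bp_deleted: True/False if the pks1-15 region was sequenced,
    None if it was not (hypothetical convention for this sketch).
    """
    hits = [rd for rd in LINEAGE_BY_DELETION if rd in deleted_regions]
    if len(hits) > 1:
        # More than one lineage-defining marker suggests a mixed or
        # contaminated sample rather than a real genotype.
        raise ValueError(f"multiple lineage markers deleted: {hits}")
    if hits:
        return LINEAGE_BY_DELETION[hits[0]]
    if pks1_15_7bp_deleted is True:
        return "Euro-American"
    if pks1_15_7bp_deleted is None:
        return "Euro-American (presumed, pks1-15 not sequenced)"
    return "unclassified"


if __name__ == "__main__":
    print(assign_lineage({"RD105"}))
    print(assign_lineage(set(), pks1_15_7bp_deleted=True))
```

Raising an error on multiple markers, rather than picking one, keeps possible cross-contamination visible instead of silently resolving it.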
Preliminary attempts to design a reliable multiplex PCR typing strategy for identification of strains that fall within the last of the six major strain families described thus far, the Euro-American lineage, were largely unsuccessful. This particular lineage is defined by a short (7-bp) deletion in the pks1-15 gene (14) and consists of strains previously classified as belonging to principal genetic groups II and III (37). In order to confirm that the remainder of the Montreal isolates did, in fact, belong to this large lineage, a representative sample was chosen from among the 533 isolates found to be intact for each of the RD239, RD750, RD105, and RD9 LSPs. The region of pks1-15 containing the deleted segment was PCR amplified and sequenced for 270 (51%) randomly selected samples, and 100% of these were found to contain the expected 7-bp deletion characteristic of the Euro-American lineage (9, 14). Although this does not exclude the possibility that an additional, previously undescribed M. tuberculosis lineage exists among the 263 isolates that were not tested for the pks1-15 deletion, the frequency of any such isolates is expected to be extremely low in our sample population, and certainly not higher than that observed for the atypical MTC lineages. As such, we have made the assumption that all of the isolates that were not classified as Indo-Oceanic, East Asian, East African/Indian, or atypical MTC belong within the Euro-American lineage. Therefore, this lineage accounts for a clear majority (67%) of isolates currently present on the island of Montreal (see Table S1 in the supplemental material).
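One way to make the "extremely low" frequency argument concrete is a hypergeometric bound: if 270 of 533 isolates are drawn at random and none lacks the pks1-15 deletion, how many such isolates could plausibly remain among the 533? The sketch below is not part of the original analysis; the function names and the 5% threshold are assumptions made for illustration.

```python
from math import comb


def prob_all_missed(population, sampled, k_other):
    """P(a random sample contains none of k_other 'other-lineage' isolates)."""
    if k_other > population - sampled:
        return 0.0
    return comb(population - k_other, sampled) / comb(population, sampled)


def max_undetected(population, sampled, alpha=0.05):
    """Largest k for which missing all k isolates still has probability >= alpha."""
    k = 0
    while prob_all_missed(population, sampled, k + 1) >= alpha:
        k += 1
    return k


if __name__ == "__main__":
    k = max_undetected(533, 270)
    print(k, f"{k / 798:.2%} of all 798 isolates")
```

Under these assumptions the bound is 4 isolates, roughly 0.5% of the 798 isolates typed, which sits below the 0.8% reported for the atypical MTC group.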
Importantly, none of the DNA samples analyzed throughout the course of this study were found to possess more than one of the markers examined, highlighting both the specificity of the lineage-defining LSP markers and the fact that cross-contamination of bacterial or DNA samples has not occurred to any discernible extent within our isolate data bank.
"Phylogeographic" associations of major M. tuberculosis lineages. Country-of-origin information was available for 730 patient isolates, and 606 (83%) of these were collected from patients who were self-identified as being foreign born. These foreign-born patients represented 80 different countries that were well distributed over 16 diverse geographic regions. Only the Pacific region was poorly represented in our patient cohort. Aside from Canada (17% of patients in the cohort for whom the country of origin was recorded), 19.3% of patients were from the Americas or the Caribbean, 8.6% from Europe, 21.8% from Africa and the Middle East, 12.5% from the Indian subcontinent, and 20.7% from Asia (Table 1).
TABLE 1. LSP deletion analysis of Montreal M. tuberculosis isolates organized by geographic region
(χ² = 5.22, P = 0.27).
FIG. 1. Global distribution of countries and lineages represented by Montreal TB patients. Each color represents a distinct LSP-defined TB lineage. Large circles indicate where ≥10 cases were detected in patients originating from a single country. Small circles indicate <10 cases. Symbols incorporating multiple colors indicate that more than one lineage accounts for ≥25% of all cases originating from that particular country. Only those cases identified as being "unique" through IS6110 RFLP or spoligotype analysis (i.e., isolates not involved in forming a cluster of recent transmission) have been included. (The world outline map used in the preparation of the figure was obtained from WorldAtlas.com and is used with permission.)
FIG. 2. Major countries and lineages represented among Montreal TB cases. The "top 10" countries of origin with the highest numbers of TB cases are shown and are stratified according to LSP-defined TB lineage. The upper panel consists of only the isolates identified as being IS6110 RFLP or spoligotype "unique." The lower panel consists of isolates involved in chains of recent transmission ("clustered"). Note that the relative scales of the two panels differ.
Comparative assessment of LSP, RFLP, and spoligotyping methods.
One would predict that isolates bearing identical IS6110 fingerprints would be classified within the same major lineage through LSP analysis. To confirm this, IS6110 patterns were compared for all isolates within the Montreal cohort. For the isolates with ≥6 IS6110 copies, 114 samples were identified within 44 clusters (data not shown). An additional 11 clusters involving 41 low-copy-number isolates were subsequently identified through spoligotyping. In both instances, the individual clusters ranged in size from two to nine patients/cluster (median = 2). The fact that the great majority of clusters involve ≤3 individuals (87%) is again indicative of the low rates of ongoing TB transmission in Montreal.
Where the country of origin is known, the total number of isolates associated with recent transmission was 134 (18.4% of all cases), and aside from East Africa, all geographic regions represented in the Montreal patient database were involved in one or more clusters (Table 1; see also Table S1 in the supplemental material). More importantly, all 55 clusters were composed of isolates within the same LSP lineage, thereby adding to the validity of using LSPs as lineage-defining markers. Of clustered cases, 81.3% involved the Euro-American/African lineage, 14.2% involved the Indo-Oceanic lineage, and 4.5% involved the East Asian lineage. Neither the RD750-deleted East African/Indian nor the RD9-deleted atypical MTC families were involved in clustered TB cases in Montreal (Fig. 2; see also Table S1 in the supplemental material).
For the low-IS6110-copy-number samples, it is interesting that when spoligotyping is not included as part of the analysis, a total of 92 isolates appear to form 20 clusters of cases. Of these, five are composed of isolates from different LSP lineages (Euro-American/African and Indo-Oceanic; data not shown). This result serves as a direct demonstration of the poor specificity of using IS6110 RFLP for epidemiological investigations where low-copy-number isolates are involved (27, 33). A similar phenomenon arises when spoligotyping is carried out independently of the IS6110 fingerprint for these same low-copy-number samples. In this case, 78 isolates appear to form clusters, of which three are composed of mixed lineages.
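The concordance check described here, whether every fingerprint-defined cluster falls within a single LSP lineage, can be sketched as follows. The record format, the function name, and the example cluster identifiers are hypothetical; they are not taken from the study data.

```python
from collections import defaultdict


def mixed_lineage_clusters(records):
    """Return clusters whose isolates span more than one LSP lineage.

    records: iterable of (cluster_id, lineage) pairs, one per isolate.
    """
    lineages = defaultdict(set)
    for cluster_id, lineage in records:
        lineages[cluster_id].add(lineage)
    return {
        cluster_id: sorted(found)
        for cluster_id, found in lineages.items()
        if len(found) > 1
    }


# Hypothetical example: cluster C02 mixes two lineages, as can happen when
# low-copy-number isolates are matched on IS6110 RFLP alone.
EXAMPLE = [
    ("C01", "Euro-American/African"),
    ("C01", "Euro-American/African"),
    ("C02", "Indo-Oceanic"),
    ("C02", "Euro-American/African"),
    ("C03", "East Asian"),
    ("C03", "East Asian"),
]

if __name__ == "__main__":
    print(mixed_lineage_clusters(EXAMPLE))
```

An empty result corresponds to the situation reported for the 55 clusters defined by RFLP, or by RFLP plus spoligotyping; a non-empty result flags clusters that merit a second typing method.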
Epidemiological features of major M. tuberculosis lineages.
In Montreal, where transmission and incidence rates of TB are low, it is difficult to undertake a comparative assessment of virulence or transmissibility for each of the major lineages, particularly when almost 70% of active cases in Montreal are due to a single lineage. However, for the cohort of patients originating from countries where a mixture of lineages was found to cause infection, we assessed whether there were any observable lineage-specific trends with regard to causing early (i.e., patients diagnosed with active TB ≤5 years following immigration to Canada) or late (>5 years following immigration) disease. As TB transmission in Montreal occurs relatively infrequently, the "early" group most likely represents those patients originally infected outside Canada. The "late" group could also include patients infected while living in Montreal, where exposure in the general community is most likely to be due to the Euro-American/African lineage. We would predict that among ethnically related individuals infected with a range of strains from distinct lineages, if a particular lineage is more virulent and therefore more prone to cause active disease, then it should appear more frequently among the cases occurring during the early time frame. Similarly, if a particular lineage is more transmissible, then we would predict an increase in the frequency of cases due to this lineage over time. However, from Fig. 3A, it can be seen that the Indo-Oceanic, East Asian, and Euro-American/African lineages appear equally likely to cause disease during both the early and late time frames. There was also no indication that the lineage that predominates in Montreal, the Euro-American/African lineage, was becoming established over time within these ethnic communities (Fig. 3B). Only the East African/Indian lineage (RD750) demonstrated a trend toward causing active disease more often among newly arrived (68.8%) than among established immigrants.
FIG. 3. Lineage-related trends over time for patients from "mixed-lineage" countries. (A) For foreign-born patients with either "early" (≤5 years following arrival in Canada) or "late" (>5 years after arrival in Canada) TB, the sum of the lineage-specific trends for each of the countries listed in panel B is presented. The former are most likely to represent patients infected outside Canada; the latter could also include patients infected within Montreal, where general exposure is most likely to be due to the Euro-American/African lineage. (B) The Euro-American/African lineage data presented in panel A, arranged by country. Only those countries of origin where multiple lineages are present (each of which represents a significant proportion of cases within the countries involved) have been included. For both panels, all TB cases are presented (unique plus clustered).
(χ² = 2.39, P = 0.66).
FIG. 4. Major M. tuberculosis lineages and primary site of disease among foreign-born TB patients residing in Montreal. The proportions of foreign-born patients infected with each of the major LSP-defined lineages and displaying either pulmonary or extrapulmonary disease are presented. Both unique and clustered TB cases have been included in the analysis.
For both the Montreal and San Francisco patient cohorts, the majority of isolates were determined to be part of the highly ubiquitous Euro-American/African lineage that comprises 67% and 48% of Montreal and San Francisco isolates, respectively. In both cases, this particular lineage was isolated from TB patients originating from each of the geographic regions represented by the studies. In this respect, the Euro-American/African lineage is unique, and its peculiar distribution pattern is presumably a function of multiple migration routes involving the individuals infected with these strains. For both cohorts, this lineage is clearly predominant throughout the Americas, Europe, and Africa, and together with Canadian-born patients, patients from Haiti and the Democratic Republic of Congo comprise more than half of all Montreal cases involving this lineage. Interestingly, the Haitian community continues to be a "high-risk" group for developing active TB within Montreal, as has been noted previously in two retrospective studies carried out on patient samples collected between 1992 and 1998 (24, 34). Although not attempted in the present study, it is possible to classify isolates of the Euro-American/African lineage into a number of distinct sublineages that show a degree of geographic restriction through the use of additional LSP or SNP markers (13, 14, 17).
Although the Indo-Oceanic clade appeared to be largely restricted to Southeast Asia within the San Francisco patient population, a significant proportion of these isolates (27%) were also found in Montreal patients originating from the Indian subcontinent. As this lineage retains the TbD1 region, it may be tempting to speculate that this lineage arose in the Indian subcontinent or Southeast Asia. However, a small number of these isolates also occur throughout much of Africa. It is therefore possible that this lineage has its origins in Africa and was transported east along human migration routes into regions where, for reasons of host-pathogen adaptation, it has continued to flourish. A recent publication by Wirth et al. has also suggested an African origin for this particular lineage (42). As in the Gagneux et al. analysis, the RD105-deleted East Asian or W/Beijing lineage was largely restricted to patients of Asian (East or Southeast) origin. The last of the major lineages, namely, the East African/Indian lineage, occurs most frequently among Montreal patients originating from countries within the Indian subcontinent, once again reflecting the San Francisco data. However, unlike the latter study, we identified very few East African patients harboring this lineage. In Montreal, one-quarter of all Middle Eastern patients were infected with East African/Indian isolates.
While the level of ongoing TB transmission in Montreal is quite low in comparison to other major urban centers (when identical RFLP matching criteria are used, transmission is estimated at 4 to 12% [18, 24, 35]), 100% of the 44 high-IS6110-copy-number clusters identified using identical matching criteria were found to be composed of isolates belonging to the same LSP-defined lineage. Similarly, each of the 11 low-copy-number clusters identified through a combination of RFLP and spoligotyping contained isolates within the same strain lineage. However, in the case of the low-copy-number isolates, the analysis of IS6110 fingerprints or spoligotypes alone is not sufficient to correctly identify isolates implicated in chains of TB transmission (27, 33). Although not directly tested as part of this study, it is likely that the poor correlation of the spoligotyping technique is not a unique property of the low-IS6110-copy-number isolates. Hence, while LSP typing is not a substitute for RFLP in contact or transmission investigations, we believe that it may serve as a supplement for ensuring that clustered or matched isolates have been correctly identified, particularly when dealing with low-IS6110-copy-number isolates or where spoligotyping is used in the absence of IS6110 RFLP. In support of this, Gutacker et al. have also recently observed that M. tuberculosis isolates bearing <6 IS6110 copies can occur across very different SNP-defined phylogenetic lineages (17).
In Montreal, the proportion of foreign-born patients displaying extrapulmonary sites of TB disease is significantly higher than that of the Canadian-born cohort (31% versus 17.5%; χ² = 7.16, P = 0.0075). Although these observations may, in part, reflect the greater strain diversity that exists among immigrant patients, the fact that we observed no significant association of any of the major lineages with extrapulmonary disease would tend to support the idea that one or more sociological, host, or clinical/diagnostic factors are involved. Human immunodeficiency virus infection is not expected to play a significant role here as the overall rate of TB-human immunodeficiency virus coinfection in Montreal is estimated to be below 10% (4, 34).
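The P values quoted alongside the chi-square statistics in this article can be checked from the statistic alone once the degrees of freedom are fixed. The sketch below uses closed-form chi-square tail probabilities for 1 and 4 degrees of freedom. The degrees of freedom are assumptions, since the text does not state them: 1 for the two-group pulmonary versus extrapulmonary comparison, and 4 for the lineage comparisons reported as 5.22 and 2.39.

```python
from math import erfc, exp, sqrt


def chi2_p_df1(statistic):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(statistic / 2.0))


def chi2_p_df4(statistic):
    """Upper-tail P value of a chi-square statistic with 4 degrees of freedom."""
    half = statistic / 2.0
    return exp(-half) * (1.0 + half)


if __name__ == "__main__":
    print(f"7.16, df=1 (assumed): P = {chi2_p_df1(7.16):.4f}")
    print(f"5.22, df=4 (assumed): P = {chi2_p_df4(5.22):.2f}")
    print(f"2.39, df=4 (assumed): P = {chi2_p_df4(2.39):.2f}")
```

Under these assumed degrees of freedom, the computed values agree with the reported P values to the precision given in the text.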
Gagneux et al. have recently proposed that the major M. tuberculosis lineages have evolved so as to become adapted to specific host genetic backgrounds and are much more likely to transmit and cause disease among patients of the same ethnicity (14). Although our data regarding the lack of assimilation of the predominant Euro-American/African lineage into immigrant communities in Montreal tend to support this interesting hypothesis (Fig. 3B), we also feel that at this stage we cannot discount the possibility that a lack of social mixing among different ethnic groups may also have contributed to this phenomenon.
Within the unique setting of Montreal, where we have a low overall incidence of TB and low rates of transmission, there is little evidence to support the suggestion that the East Asian lineage (RD105) is more highly virulent, more highly transmissible, or more likely to cause extrapulmonary disease as has been previously suggested (7, 10, 12, 19, 20, 23). However, our findings clearly do not preclude the possibility that this lineage may be associated with these phenotypes in other epidemiological settings where ongoing transmission of these strains is taking place and where clinical and public health interventions are less efficient than those in Montreal. Similarly, while it may be tempting to speculate that the East African/Indian lineage (RD750) is less transmissible (due to the fact that it does not appear among any of the clustered cases) and yet more likely to result in active disease (Fig. 3A), it would appear equally likely that this effect is attributable to any number of sociological factors associated with the Indian subcontinent communities that almost exclusively harbor this strain.
In conclusion, our data strongly support the global population structure of M. tuberculosis as originally proposed by Gagneux and colleagues. While the ranges of countries represented in the two independent studies are quite distinct, the geographical associations of the major LSP-defined M. tuberculosis lineages remain very clear. Finally, the need for a consensus among researchers as to how to define and describe families of genetically related isolates that presumably share common phenotypes or disease-related traits has never been more urgent for the types of strain-linked association studies that are becoming commonplace these days (6, 7, 10, 20, 23, 29). At present, the LSP/RD framework would appear to be the most robust and universally applicable model that will hopefully become the standard among epidemiological researchers.
We also decided to compare the sublineage distributions for one of the major strain families between the San Francisco and Montreal TB patient cohorts. For this purpose we chose the East Asian (W/Beijing) lineage, which has five well-defined sublineages based on the presence of specific LSPs (14, 39). Again, the overall distribution of the five sublineages was largely conserved between the two patient cohorts, with group 3 strains (RD105, RD207, and RD181 deleted) accounting for the vast majority of all East Asian isolates. This particular sublineage comprises 67% and 82% of East Asian isolates within the San Francisco (39) and Montreal databases, respectively. The next most common sublineage was the group 4 sublineage (RD105, RD207, RD181, and RD150 deleted), which accounts for 20.5% (San Francisco) and 11.3% (Montreal) of East Asian strains. Group 2 strains (RD105 and RD207 deleted) make up 8% (San Francisco) and 5% (Montreal) of isolates. Finally, the remainder of Montreal isolates fell within the group 1 subcategory (1.6%; RD105 deleted only), while the remainder of San Francisco isolates belong in group 5 (4.5%; RD105, RD207, RD181, and RD142 deleted). Whether these shared trends reflect the actual sublineage distribution extant within the East Asian and Southeast Asian regions or are reflective of certain patient characteristics common to both the San Francisco and Montreal immigrant populations is unknown at present. There were no obvious associations between particular countries of origin and any of the individual sublineages (data not shown).
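The five East Asian sublineages are defined by nested sets of deletions, which makes the assignment a simple lookup. The following is a minimal sketch using the deletion sets listed above; the function name and the "unassigned" fallback for unexpected combinations are assumptions for illustration.

```python
# Deletion profiles of the East Asian (W/Beijing) sublineages, as listed in the text.
SUBLINEAGE_BY_DELETIONS = {
    frozenset({"RD105"}): "group 1",
    frozenset({"RD105", "RD207"}): "group 2",
    frozenset({"RD105", "RD207", "RD181"}): "group 3",
    frozenset({"RD105", "RD207", "RD181", "RD150"}): "group 4",
    frozenset({"RD105", "RD207", "RD181", "RD142"}): "group 5",
}

# All regions that matter for sublineage assignment.
MARKERS = frozenset().union(*SUBLINEAGE_BY_DELETIONS)


def east_asian_sublineage(deleted_regions):
    """Map the set of deleted regions for one isolate to a sublineage group."""
    observed = frozenset(deleted_regions) & MARKERS
    if "RD105" not in observed:
        raise ValueError("RD105 is intact: not an East Asian lineage isolate")
    # Profiles outside the five described groups are reported, not guessed.
    return SUBLINEAGE_BY_DELETIONS.get(observed, "unassigned")


if __name__ == "__main__":
    print(east_asian_sublineage({"RD105", "RD207", "RD181"}))
```

Deleted regions that are not sublineage markers are ignored, so the same function can be applied to a full deletion profile for an isolate.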
For our set of isolates, the sequence of the RD207 deletion was determined not to be exactly as reported previously (14). A putative IS6110-related transposase sequence included in the original description of RD207 was found to be present within the RD207-deleted strains, albeit in an inverted orientation with respect to H37Rv. Immediately upstream of this sequence, we also identified an additional five unique spacer regions that form part of the direct repeat region of the genome (data not shown).
Primers used in LSP and DNA sequence analysis are shown in Table A1.
Published ahead of print on 11 February 2009.
Supplemental material for this article may be found at http://jcm.asm.org/.