Journal of Clinical Microbiology, January 2002, p. 172-181, Vol. 40, No. 1
0095-1137/02/$04.00+0 DOI: 10.1128/JCM.40.1.172-181.2002
Copyright © 2002, American Society for Microbiology. All Rights Reserved.
Department of Microbiology, University of Sydney, NSW 2006, Australia
Received 11 May 2001/ Returned for modification 9 August 2001/ Accepted 15 October 2001
In this paper, we look at the relationships among the seventh pandemic isolates of Vibrio cholerae, the etiological agent of cholera, from different countries and dates. Multilocus enzyme electrophoresis (MLEE) is not useful for this purpose, because all isolates except the South American variant belong to a single electrophoretic type (31, 46). Sequencing of housekeeping genes, which was used to resolve the relationships among pandemic clones and between those clones and environmental strains, is also not suitable for studying variation within the seventh pandemic clone, because no variation was found in any of several housekeeping genes sequenced (4, 20).
Other molecular methods, including pulsed-field gel electrophoresis (PFGE) and ribotyping, have been employed to study the diversity and epidemiology of the seventh pandemic (6, 35, 46). Ribotyping revealed much variation within the clone and provided some interesting insight into its evolution (19). However, subsequent analysis of the molecular basis of ribotype variation revealed that it was generated by recombination between rrn operons (27). The localized nature of such recombination makes ribotyping less valuable for evolutionary studies, because any such change is readily reversed, making the true relationships of isolates difficult to reveal. PFGE is expensive and technically difficult to standardize. Furthermore, the variation detected by PFGE is often a result of genome instability (genome rearrangement, etc.) (29, 42), so dissimilarity in PFGE patterns is not a good measure of evolutionary divergence. An alternative method is necessary.
Amplified fragment length polymorphism (AFLP) is a novel PCR-based DNA fingerprinting method (44) that reveals variation around the whole genome by selectively amplifying a subset of restriction fragments for comparison. The AFLP procedure involves digestion of DNA with EcoRI and MseI or another combination of enzymes with 6- and 4-base recognition sites. Double digestion with EcoRI and MseI will produce around 2,000 EcoRI-MseI fragments, 16,000 MseI-MseI fragments, and very few EcoRI-EcoRI fragments for a genome of 4,000 kb. This is followed by ligation of adapters to both ends of the fragments and amplification with primers designed so as to amplify EcoRI-MseI fragments. This preamplification step is followed by a second "selective" amplification with primers that include 1 or 2 bases in addition to the segment based on the restriction site. Each additional base reduces the number of effective substrates fourfold. With one selective base on each primer, the number of fragments is reduced 16-fold, to a number that can be resolved to 1 base by using polyacrylamide sequencing gels. The ability to resolve to 1 bp enables precise comparison of bands across samples.
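The fragment numbers quoted above follow from simple expected-frequency arithmetic. The Python sketch below reproduces them under the simplifying assumption of a random sequence with equal base frequencies; the function names are illustrative and are not part of any published tool.

```python
def expected_sites(genome_bp: int, recognition_len: int) -> float:
    """Expected number of restriction sites in a genome.

    Assumption: random sequence with equal base frequencies, so a site of
    length k occurs on average once every 4**k bases.
    """
    return genome_bp / 4 ** recognition_len


def fragments_after_selection(n_fragments: float, selective_bases: int) -> float:
    """Each selective base on a primer keeps about 1 in 4 fragments."""
    return n_fragments / 4 ** selective_bases


genome_bp = 4_000_000  # the 4,000-kb genome used in the text

ecori_sites = expected_sites(genome_bp, 6)  # 6-base cutter
msei_sites = expected_sites(genome_bp, 4)   # 4-base cutter

# MseI cuts about 16 times more often than EcoRI, so nearly every EcoRI site
# is flanked on both sides by an MseI site: two EcoRI-MseI fragments per site.
ecori_msei_fragments = 2 * ecori_sites

# One selective base on each of the two primers (+1/+1): 4 * 4 = 16-fold.
per_primer_pair = fragments_after_selection(ecori_msei_fragments, 2)

print(f"EcoRI sites:               {ecori_sites:.0f}")
print(f"MseI sites:                {msei_sites:.0f}")
print(f"EcoRI-MseI fragments:      {ecori_msei_fragments:.0f}")
print(f"Fragments per primer pair: {per_primer_pair:.0f}")
```

Under these assumptions, each +1/+1 primer pair is expected to amplify on the order of 120 fragments, which is a number that a sequencing gel can resolve.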
Many studies have shown the value of AFLP in typing of microorganisms (1, 2, 11, 13, 14, 21, 23-25, 30, 34, 37). Recently AFLP was applied to the study of genetic diversity of clinical and environmental isolates of V. cholerae (15, 16), but again there was no variation observed within the seventh pandemic. In this study, we used AFLP with a range of primer pairs and found sufficient variation to draw conclusions about the genetic relationships of the isolates, revealing, for example, independent introductions of epidemic cholera into Africa in both the 1970s and the 1990s. The AFLP data in this study give new insights into the spread of cholera and open up the possibility of greatly enhancing our understanding of its epidemiology.
TABLE 1. Strains used in this study
TABLE 2. Adapters and primers used
Analysis of fingerprinting patterns.
Autoradiographs were analyzed manually and scored for the presence or absence of a band. Bands with a weak signal or blurred appearance were excluded, as were bands from the two ends of the gel. Samples for each primer pair were run on the same gel; manual scoring of the presence or absence of polymorphic bands is fast and accurate. It was considerably more difficult to achieve correct scoring by using commercial software. The band patterns from the manual scoring were converted to distances between isolates according to the Dice coefficient of similarity (33). The Dice coefficient was used to construct phylogenetic trees by using programs in the PHYLIP package (version 3.5, written by J. Felsenstein, Department of Genetics, University of Washington, Seattle). Both the unweighted pair group method with arithmetic mean (UPGMA) (38) and the neighbor-joining method (36) were used to obtain an indication of validity of the relationships by alternative methods. UPGMA has been widely used for clustering based on genetic distance. The major difference between the two algorithms is that UPGMA assumes equal rates of evolution along all branches. The Simpson index of diversity (D) was calculated by the formula $D = 1 - \frac{\sum_{j} n_j (n_j - 1)}{N(N - 1)}$, where $N$ is the number of strains tested and $n_j$ is the number of strains belonging to the jth type (12, 39). An in-house program, MLEECOMP (available upon request), was used to calculate the Dice coefficient and Simpson index of diversity. Bootstrap sampling (9) was done with MLEECOMP. In this analysis, the 966 fragments were sampled repeatedly to give a new set of the same size (966 fragments) in which by chance some of the original fragments will be omitted and others will be present more than once. This derived set of fragments was then used to generate a new tree that was compared with the original tree at each node. The above resampling and tree generation process was repeated 1,000 times. Each node divides the strains into two groups, and the proportion of the 1,000 samples that gives the same division as in the original set is the bootstrap value for that node. A value of 100% means that each of the 1,000 resampled data sets gave the same distribution of strains at that node.
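As an illustration of the two statistics defined above, the following Python sketch computes the Dice coefficient for a pair of band-presence vectors and the Simpson index of diversity for a list of type assignments. It is not the MLEECOMP program; the function names and the toy data are hypothetical and chosen only to show the arithmetic.

```python
from collections import Counter


def dice_similarity(a, b):
    """Dice coefficient 2*n_ab / (n_a + n_b) for two 0/1 band vectors.

    n_ab is the number of bands present in both isolates; n_a and n_b are
    the numbers of bands present in each isolate.
    """
    shared = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    # Two profiles with no bands at all are treated as identical.
    return 2 * shared / total if total else 1.0


def dice_distance(a, b):
    """Distance used for tree building: 1 minus the Dice similarity."""
    return 1.0 - dice_similarity(a, b)


def simpson_diversity(type_labels):
    """Simpson index D = 1 - sum_j n_j(n_j - 1) / (N(N - 1))."""
    n = len(type_labels)
    counts = Counter(type_labels).values()
    return 1.0 - sum(c * (c - 1) for c in counts) / (n * (n - 1))


# Hypothetical band scores (1 = band present, 0 = absent) for two isolates.
isolate_a = [1, 1, 0, 1, 0]
isolate_b = [1, 0, 0, 1, 1]
similarity = dice_similarity(isolate_a, isolate_b)

# Hypothetical type assignments for six strains.
types = ["G", "G", "H", "H", "H", "N"]
diversity = simpson_diversity(types)

print(f"Dice similarity: {similarity:.3f}")
print(f"Dice distance:   {dice_distance(isolate_a, isolate_b):.3f}")
print(f"Simpson D:       {diversity:.3f}")
```

In the toy data the two isolates share 2 bands out of 3 each, giving a similarity of 4/6; the six strains fall into types of size 2, 3, and 1, giving D = 1 - 8/30.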
TABLE 3. Numbers and types of polymorphisms observed in the 45 seventh pandemic isolates
FIG. 1. Relationships among the three toxigenic clones, U.S. gulf isolate, sixth pandemic isolate, and isolates of the seventh pandemic clone constructed by using the UPGMA algorithm based on the Dice coefficient obtained after pairwise comparison of AFLP variation.
FIG. 2. Dendrogram of the 45 isolates of the seventh pandemic clone constructed by using the UPGMA algorithm based on the Dice coefficient obtained after pairwise comparison of AFLP variation. Bootstrap values are percentages of 1,000 replications and are indicated at the nodes if greater than 50%. The strain information given is name, year and place of isolation, and ribotype. Ribotype data were from our previous study (19).
How robust are the branches? The robustness of the branching order is commonly assessed by bootstrap analysis. A value of 50% or more is usually taken as support for a branch. Only two branches met this standard: one for the placement of prepandemic isolates and the other for placement of the earliest seventh pandemic isolate, M793. Bootstrap analysis is, however, a very stringent test. In a data set with a large number of strains, a small proportion of atypical strains can lead to greatly reduced bootstrap values. This will occur if, for example, some strains are frequently partitioned differently from the original during resampling at a given node due to recombination, or parallel or reverse mutation, even if the majority of strains give a consistent distribution. The good correlation of major branches with locality indicates that the groupings are phylogenetically meaningful. We therefore looked at band distribution (Table 4) for evidence of support of major branches with low bootstrap values. For the division of clusters I and II, there is good but not absolute support by band patterns 2, 3, 7, 8, 12, 16, and 20. Within cluster I, patterns 10, 11, and 18 distinguish Ia from Ib and Ic, and patterns 22 and 23 distinguish Ib and Ic. Patterns 13, 14, and 15 also gave support to cluster Ic.
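To make the bootstrap values discussed above concrete, the sketch below resamples band columns with replacement, rebuilds a UPGMA tree from Dice distances each time, and reports how often each group of strains in the original tree reappears. It is a minimal stand-in written for illustration, not the PHYLIP or MLEECOMP code used in the study; the strain names and band scores are hypothetical, and a node is identified here simply by the set of strains below it.

```python
import random
from itertools import combinations


def dice_distance(a, b):
    """1 - Dice similarity for two 0/1 band vectors."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    return 1.0 - (2 * shared / total if total else 1.0)


def upgma_clades(profiles):
    """Return the groups (frozensets of strain names) formed by UPGMA."""
    names = sorted(profiles)
    dist = {frozenset(pair): dice_distance(profiles[pair[0]], profiles[pair[1]])
            for pair in combinations(names, 2)}

    def average(c1, c2):
        # UPGMA: mean of all pairwise distances between the two clusters.
        return sum(dist[frozenset((x, y))] for x in c1 for y in c2) / (len(c1) * len(c2))

    clusters = [frozenset([n]) for n in names]
    clades = set()
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: average(*p))
        merged = c1 | c2
        clusters = [c for c in clusters if c not in (c1, c2)] + [merged]
        clades.add(merged)
    return clades


def bootstrap_support(profiles, replicates=1000, seed=0):
    """Percentage of resampled trees that recover each original group."""
    rng = random.Random(seed)
    n_bands = len(next(iter(profiles.values())))
    original = upgma_clades(profiles)
    hits = dict.fromkeys(original, 0)
    for _ in range(replicates):
        # Sample band columns with replacement, keeping the set size fixed.
        cols = [rng.randrange(n_bands) for _ in range(n_bands)]
        resampled = {s: [p[i] for i in cols] for s, p in profiles.items()}
        found = upgma_clades(resampled)
        for clade in original:
            hits[clade] += clade in found
    return {clade: 100.0 * k / replicates for clade, k in hits.items()}


# Hypothetical band scores for four strains (not data from this study).
profiles = {
    "A": [1, 1, 1, 0, 0, 1, 0, 1],
    "B": [1, 1, 1, 0, 0, 1, 1, 1],
    "C": [0, 1, 0, 1, 1, 1, 0, 0],
    "D": [0, 1, 0, 1, 1, 0, 0, 0],
}
support = bootstrap_support(profiles, replicates=200, seed=1)
for clade, pct in sorted(support.items(), key=lambda kv: len(kv[0])):
    print(sorted(clade), f"{pct:.1f}%")
```

The group containing every strain is recovered in every replicate by construction, so only the values for the smaller groups are informative.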
TABLE 4. Distribution of 50 informative fragments among the 45 isolates of the seventh pandemic clone
Ribotyping has been the most useful method for revealing variation within the seventh pandemic clone (6-8, 19, 22, 35, 45). The strains used in this study had been typed by ribotyping previously (19), allowing a direct comparison of the discriminatory power of the two techniques. The D values for ribotyping are 0.714 for the BglI digest, 0.664 for the SalI digest, and 0.715 for the data combined. Four of the 16 AFLP primer pairs have a higher D value than ribotyping. Therefore, AFLP, even with a single primer pair, easily surpasses the power of ribotyping.
It is known that different enzyme and/or primer combinations produce AFLP patterns of different complexity (13). A recent study by Jiang et al. (16) showed that AFLP distinguished closely related toxigenic V. cholerae strains, but failed to differentiate the 26 seventh pandemic isolates by using a single pair of ApaI-TaqI primers of +1 selection. We used EcoRI and MseI and obtained very useful subdivision by use of all primer pairs, also of +1 selection. The differences between the two studies may lie in the number of primer pairs used and the use of different enzymes.
Pandemic spread and AFLP relationships. The seventh pandemic started in Indonesia in 1961 and spread through Asia and the Middle East (3). There was a lull in the late 1960s, but it spread to Africa in 1970 during a major surge in the pandemic. The African outbreak started in the West (Guinea, Sierra Leone, Liberia, Nigeria, and other countries) and spread inland along rivers and trade routes. At the same time, there were outbreaks of cholera in North Africa (Libya, Tunisia, Algeria, and Morocco) and East Africa (Djibouti, Ethiopia, and Somalia), which were thought to have originated in the Middle East (40, 47) (Fig. 3). Although the majority of African isolates are in subcluster Ib, the relationship among the strains supports the proposal that there were at least two introductions of cholera in the 1970s. Isolates from West Africa (Sierra Leone [M809] and Senegal [M816]) are identical, but different from a strain (M810) from East Africa (Ethiopia) in 5 of the 50 informative bands (Table 4). The isolate M814 from North Africa (Morocco) was expected to be closer to M810, but is actually closer to the West African isolates, with only two differences in informative bands. This suggests that the North Africa outbreak was from a third source. However, there is no epidemiological evidence for the North and East epidemics originating from different sources. More isolates would be helpful, because we have used only a single isolate from each of the regions.
FIG. 3. Epidemics of cholera in Africa. The relevant country or region is annotated on the map to assist interpretation of strain relationships and epidemiological data. Adapted from reference 47 with permission from the World Health Organization (the map in reference 47 shows the route of transmission and date of emergence in the 1970s).
In the 1991 resurgence of cholera in Africa, there were two main epidemic foci: in South and East Africa (Zambia, Mozambique, Malawi, and Angola) and in West Africa (40, 47). AFLP data from isolates from the 1990s show close correlation with epidemiological data (Fig. 3). The two Malawi isolates (M826 and M829) are within cluster I, but distantly related to subcluster Ib, representing strains from the 1991 Southern and Eastern epidemic focus. Strains M827 and M828 in cluster II, from Guinea and Morocco respectively, are clearly from the 1990s West epidemic focus and not remnants of the outbreak in the 1970s, based on comparison of M828 with the 1970s Morocco isolate (M814), which is in subcluster Ib of African isolates from the 1970s. The strain causing outbreaks in the West is clearly from Asia and may have been introduced before 1991. An isolate (M824) identical to the West epidemic focus isolates M827 and M828 was isolated from Algeria in 1987. Thus, the strain causing the 1991 outbreak in the West appears to have been present in the continent at least 3 years before that.
Cholera remained at very low levels in Africa from 1972 to 1977 and only came back to high levels in 1991 (40). Interestingly, during this period, strains that do not fit into the groups or clusters representing the major outbreaks were also isolated, suggesting yet more introductions of cholera. M823, a 1984 Algerian isolate, is in the subcluster Ic of generally Asian isolates, and M825, isolated in 1988 from Zaire, stands on its own. M825 was previously observed to be very different (19, 27). A 1975 isolate from Comoros off the coast of Africa, which was the only newly affected country that year (3), is not related to isolates from the start of the African epidemic, but to cluster II of Asian isolates.
The AFLP data presented above give a very detailed picture of the spread of cholera in Africa. Asia is regarded as the source of the seventh pandemic spread to Africa in the 1970s, and the AFLP relationships confirm that. The surprising finding is that the diversity in Africa parallels that found in Asia, suggesting that there were many introductions of cholera after the 1970s outbreak, indicating a continuous flow of cholera from regions of endemicity. Furthermore, the strain causing an outbreak could have been present in the local environment many years before. For example, the 1987 Algeria strain is identical in AFLP fingerprint to strains from the West epidemic focus of the 1991 upsurge of cholera. This is a very significant finding for the monitoring of cholera.
There are two other possible explanations to be considered for the observed pattern of variation in V. cholerae from Africa. One is that the variation arose independently in Africa. Jiang et al. (16) recently used AFLP to study genetic diversity of clinical and environmental isolates and showed that some clinical O1 isolates from Mexico are very different from the seventh pandemic clone isolates, suggesting an independent origin of these isolates. However, all African strains clearly fall within the seventh pandemic in the tree; they are unlikely to have originated in situ. We must also consider the possibility that the strains all evolved from the 1970s outbreak strain. The fact that 1991 West African epidemic isolates cluster with the 1980s Asian isolates argues strongly against this hypothesis.
The relationships among the strains isolated in the 1960s in Asia are less useful for inferring details of spread of the disease, perhaps because less variation had developed in the first few years of the pandemic. The pandemic traveled from one country to another in a well-documented fashion: first to Hong Kong and the Philippines in 1961, to Cambodia and Thailand in 1963, and then to India and Vietnam in 1964. (For a full list of the countries affected each year, see the review by Barua [3].) That the M807 isolated in 1966 in Vietnam is very similar to M803 isolated in 1961 from Hong Kong is consistent with the spread. However, M805 isolated from Cambodia in 1963 is more distantly related to M803 and M807.
The seventh pandemic spread to Latin America in 1991. One isolate (M830) from the 1991 outbreak was included in this study. Latin American isolates differ from other seventh pandemic isolates in having a new MLEE allele of leucine aminopeptidase (46). Although variation within Latin American isolates has been reported (5), all are generally very similar and differ from other seventh pandemic isolates (45). M830 represents the Latin American epidemic in our study and seems to be a divergent member of cluster II, generally comprising 1990s isolates. The origin of the 1991 Latin American epidemic is unclear, with speculation that it was imported from an area in which cholera is endemic but not represented in the strain collections usually used (41). AFLP analysis has demonstrated discriminatory power that, with more isolates, should make it possible to trace the source of this variant of the seventh pandemic clone.
Comparison of ribotype and AFLP variation. We previously identified 11 BglI ribotypes in the same 45 strains (19) used in this study. Ribotypes G and H are most frequent, with 13 and 21 isolates, respectively (Fig. 2). The difference between the two ribotypes is the presence of a tandem operon in ribotype G and its absence in ribotype H (27). All but one 1960s isolate are ribotype G. The parsimonious assumption is that the progenitor of the seventh pandemic was of ribotype G. The earliest ribotype H isolate is M807, isolated in 1966. The distribution of ribotype H isolates among ribotype G isolates (clusters Ib, Ic, and II) indicates that ribotype H has arisen from ribotype G several times. Most ribotype G isolates are in clusters Ia and Ic. However, two ribotype G isolates in cluster II are grouped with ribotype H isolates, suggesting that this may be a reversion of ribotype H to ribotype G by regenerating the tandem operon deleted in the ribotype H form.
The two ribotype N isolates, M663 and M799, also show independent derivations. M663 is very similar to M662 (ribotype Q) in cluster II, while M799 is far away in cluster I. It is also interesting to note that M817 and M812, both from Chad, have identical ribotypes (J), but different AFLP patterns.
We have previously shown that ribotype variation is mainly due to changes within the rrn operons that result from recombination between the operons (27). This comparison between AFLP and ribotype relationships confirms our previous suggestion that ribotyping should not be used for evolutionary studies, because changes of this nature could easily revert (26).
AFLP has very high discriminatory power, as shown in this and many other studies (17, 32, 43). It is well suited for epidemiological investigations of a homogeneous clone, such as the seventh pandemic clone. However, it becomes impractical for AFLP to be used for routine typing if the proportion of polymorphic fragments is so low that multiple primer pairs have to be used to differentiate isolates. As discussed below, a major advantage of AFLP is that polymorphic fragments found to be useful can be isolated and used to develop a PCR-based typing scheme using only the most useful AFLP fragments.
Concluding comments. Evolutionary and epidemiological analyses of a clone are often hampered by the lack of methods to find variation, because few mutations will have accumulated if it is of recent origin. AFLP, which scans genomewide for variation, offers almost unlimited power by use of combinations of enzymes and selective primer extensions. We applied AFLP to 45 isolates of the V. cholerae seventh pandemic clone sampled over a 33-year period. Using all 16 pairs of EcoRI-MseI selective primers with a 1-base extension, AFLP was able to distinguish all but 4 pairs (M805 and M806, M809 and M816, M817 and M819, and M818 and M821) and one set of 3 (M824, M827, and M828) of the 45 isolates (Fig. 1 and Table 4). This demonstrates the sensitivity and value of AFLP for finding variation in a clone that arose in 1961 and has developed relatively little diversity.
The AFLP analysis has provided by far the best data yet on the evolution of the seventh pandemic clone, allowing us to track outbreaks in detail. This study confirmed the epidemiological evidence for two independent introductions of cholera into Africa in the 1970s. More significantly, the two epidemic foci (Southern/Eastern and Western) in the 1991 upsurge of cholera in Africa were shown to be due to yet another two separate introductions. Several other introductions of V. cholerae during the 21-year period (1970 to 1991) were also indicated. Our data have further shown that a strain identical in AFLP to the strain causing the 1991 epidemic in West Africa had been isolated 3 years before, indicating that an introduced strain could be present in the continent some years before causing large outbreaks. These findings have significant implications for cholera epidemiology.
AFLP offers great potential for further study of the seventh pandemic clone. The set of strains we used are broadly representative of the seventh pandemic, but the details of the relationships will be much more informative if AFLP analysis is extended to a much larger set of strains. There is also great potential to make use of the AFLP variation observed. It is possible to clone and sequence the bands, which will give insight into the nature of the polymorphic markers. Recent completion of the genome sequence of a seventh pandemic isolate (10) will facilitate such an analysis, giving us a better understanding of the evolution of the seventh pandemic clone. It would then be possible to use the knowledge gained to design PCR primers that allow amplification of only isolates that have a particular AFLP band. This would allow a single multiplex PCR assay to replace the multiple AFLP gels used in this study, with the obvious potential to develop PCR-based typing for epidemiological analysis.