Journal of Clinical Microbiology, March 2006, p. 777-782, Vol. 44, No. 3
0095-1137/06/$08.00+0 doi:10.1128/JCM.44.3.777-782.2006
Copyright © 2006, American Society for Microbiology. All Rights Reserved.
Chemical and Biological Defence Section, Defence R&D Canada-Suffield, Medicine Hat, AB,1 Department of Computer Science, University of Saskatchewan, Saskatoon, SK,2 Public Health Agency of Canada, Winnipeg, MB, Canada3
Received 2 November 2005/ Returned for modification 12 December 2005/ Accepted 20 December 2005
B. anthracis belongs to the B. cereus group and is most closely related to B. cereus and B. thuringiensis. Multilocus enzyme electrophoresis and fluorescent amplified fragment length polymorphism (AFLP) analysis of the B. cereus group revealed a high degree of genetic variability but failed to identify distinct groups (5, 19). Although B. cereus and B. thuringiensis are broadly interspersed across all branches of the AFLP phylogenetic tree, B. anthracis shows very low genetic diversity and clusters to a subbranch of the phylogenetic tree that is distinct from branches where other members of the B. cereus group cluster (6).
Due to recent bioterrorism events, there has been an increased interest in B. anthracis, especially in its identification, detection, and molecular subtyping. B. anthracis is considered to be evolutionarily "young," lacking character homoplasy and containing few single-nucleotide polymorphisms (SNPs) (15). This lack of homoplasy may be due to its life history, which includes long periods of time as dormant endospores. B. anthracis is among the most monomorphic pathogenic bacteria described. Molecular typing techniques commonly used to differentiate between strains of other species, including AFLP (6, 7), multilocus sequence typing (14), and pulsed-field gel electrophoresis (4), generally fail to discriminate between B. anthracis strains.
Several molecular typing methods, including SNP analysis and multilocus variable-number tandem repeat analysis (MLVA), have been more successful in discriminating between B. anthracis strains and have allowed the exploration of its phylogenetics. SNPs are rare in B. anthracis, but molecular typing by the use of these polymorphisms is possible due to the availability of multiple whole-genome sequences. SNP phylogenetic markers are evolutionarily stable, with mutation rates of approximately $10^{-10}$ changes per nucleotide per generation (21). A set of canonical SNPs that distinguish the major clades of B. anthracis has been developed (9).
An MLVA method that exploits the copy number differences of nucleotide repeat sequences at six chromosomal loci and one locus for each of the two plasmids has been developed (8). MLVA loci have an increased mutation rate and a greatly increased number of allelic states compared to SNPs.
B. anthracis isolates obtained during a natural outbreak or a bioterrorism event would have an extremely low level of genetic diversity. During such an event, canonical SNP analysis and MLVA may not distinguish isolates or closely related strains. To identify polymorphisms in populations with extremely low levels of genetic diversity, one could examine "hot spots," which are areas within the genome that have very high mutation rates. Single-nucleotide repeats (SNRs), also referred to as mononucleotide repeats, are a type of variable-number tandem repeat (VNTR) that displays very high mutation rates (as high as $6.0 \times 10^{-4}$ mutations per generation) (9). Unlike some VNTR loci that have complicated repeat structures, SNRs are stretches of one kind of nucleotide that may vary in length between different bacterial isolates due to slip-strand mispairing (12). SNRs are more likely than other types of simple sequence repeats (SSRs) to undergo strand separation and base pair slippage, increasing the chance of slip-strand mispairing and causing a mutation at the SNR locus (1, 3). SSR analysis of Escherichia coli revealed that 93% of all mononucleotide repeats were A or T (3). The lower melting temperature, characteristic of A and T, increases the instability of the DNA helix, theoretically increasing the possibility of slip-strand mispairing, which may explain the A-T bias of SNRs (13, 20). SNRs have been identified in a number of bacterial species and have been used for multilocus sequence typing (2, 3, 16, 20). SNR markers have been suggested for use as part of a hierarchical typing scheme for B. anthracis, but their actual use, including target sequences or primer sequences, has not been described (9). This paper describes the discovery and analysis of SNR loci of B. anthracis, the comparison of these loci between B. anthracis strains with sequenced genomes, and the use of the most polymorphic loci as a way to differentiate isolates that are indistinguishable when they are analyzed by MLVA.
TABLE 1. B. anthracis strains used
Some of the MLVA amplicons were sequenced to establish the size of the amplicon and the VNTR. These MLVA PCR products were purified by using Montage PCR96 plates (Millipore, Nepean, Ontario, Canada) and were sequenced by using the MLVA primers as sequencing primers. All sequencing reactions were carried out in 20-µl reaction mixtures with Big Dye 3.1 Terminator chemistry (Applied Biosystems Inc.), and the sequences were analyzed on an ABI 3100 automated sequencer (Applied Biosystems Inc.). Contig assembly of the B. anthracis MLVA locus sequences was performed with Sequencher (Gene Codes Corp., Ann Arbor, MI).
Bioinformatics and primer design.
The B. anthracis sequences used in the in silico SNR analysis are summarized in Table 2. The FASTA format sequences were processed by use of a Perl script to identify all A-T polymorphisms with lengths of >6 nucleotides; the position of each polymorphism was then used to create a primer3 input file. The primer3 input file contained 250 bases of sequence flanking each side of the putative polymorphism, which was specified as the target for primer design (18). Polymorphisms with less than 250 bases of flanking sequence on either side were not considered in this analysis. Primer3 was run with the default options, producing five primer pairs (primer pairs 0 to 4) for each amplicon. The primer3 output was processed by use of a Perl script to extract the amplicon sequences and write them to a FASTA file. The FASTA file was then used as input for tgicl (http://www.tigr.org/tdb/tgi/publications/TGICL.pdf) to cluster similar amplicons. The tgicl software generated a cluster file (which assigns each sequence to a unique cluster without assembling the sequences). Any cluster that did not contain at least one polymorphism longer than 9 bases (the equivalent of three codons) was then removed from the analysis, which kept the data set manageable. Nei's marker diversity index (D) was calculated as $D = 1 - \sum (\text{allele frequency})^2$ for each cluster by using predicted amplicon sizes for the first primer set (primer set 0).
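The two computational steps described above (locating A or T runs with enough flanking sequence, and computing Nei's diversity index from allele sizes) can be sketched roughly as follows. This is a minimal Python illustration, not the authors' Perl scripts; the function names `find_at_runs` and `nei_diversity` are hypothetical, and the allele sizes in the usage lines are made-up values chosen only to show the arithmetic.

```python
import re
from collections import Counter

FLANK = 250   # flanking bases required on each side (value taken from the text)
MIN_RUN = 7   # "lengths of >6 nucleotides" means runs of at least 7

def find_at_runs(seq, min_run=MIN_RUN, flank=FLANK):
    """Return (start, length, base) for each A or T homopolymer run.

    Runs with fewer than `flank` bases on either side are skipped,
    mirroring the exclusion rule described in the text.
    """
    seq = seq.upper()
    pattern = re.compile(r"A{%d,}|T{%d,}" % (min_run, min_run))
    hits = []
    for match in pattern.finditer(seq):
        start, end = match.start(), match.end()
        if start >= flank and len(seq) - end >= flank:
            hits.append((start, end - start, seq[start]))
    return hits

def nei_diversity(alleles):
    """Nei's diversity index: D = 1 - sum(p_i ** 2) over allele frequencies."""
    counts = Counter(alleles)
    total = len(alleles)
    return 1.0 - sum((count / total) ** 2 for count in counts.values())

# Illustrative input only: one run with full flanks, one too close to the end.
example = "C" * 250 + "A" * 8 + "G" * 250 + "T" * 10 + "C" * 100
runs = find_at_runs(example)

# Hypothetical predicted amplicon sizes for one cluster across four genomes.
diversity = nei_diversity([150, 150, 151, 152])
```

In this sketch the second run is discarded because only 100 bases follow it, and the diversity value depends only on how the predicted sizes are distributed, not on the sizes themselves.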
TABLE 2. B. anthracis sequences used for in silico SNR analysis
TABLE 3. SNR primer sequences used to screen B. anthracis strains
From this preliminary screen of 39 candidate loci, 29 loci exhibiting polymorphisms (Table 4) were screened against a larger panel of B. anthracis strains (Table 1). Seven SNR primer sets (for loci CL50, CL47, CL32, CL63, CL55, CL77, and CL35) inconsistently or poorly amplified the SNR loci and were not used further in this study. The remaining 22 SNR primer sets were able to reproducibly amplify the SNR loci and thus were used for the subsequent analysis of B. anthracis strains of established MLVA types (Table 5). Amplicon sizes were as expected by fragment size analysis and/or direct sequencing, as were the differences in amplicon lengths between strains compared to the lengths determined by in silico analysis, with some minor differences (Table 6). There were small differences between the expected and the observed amplicon lengths of ±1 to ±3 bp. This small difference in amplicon size was consistent within amplicons produced from the same primer pairs and is likely due to the 5' modification of the primer. Many SNR amplicons produced multiple split peaks when they were analyzed on the ABI 3100 genetic analyzer. This may be due in part to incomplete 3' adenylation of amplicons by Taq polymerase. Phusion polymerase was used with some success to reduce the occurrence of split peaks. Phusion polymerase produces blunt-end products and thus eliminates variation of the PCR products due to incomplete 3' adenylation. Analysis on the ABI 3100 genetic analyzer usually produced a cluster of two or four peaks within 3 bp of each other. The first and fourth peaks, when they were present, had substantially less fluorescence than the second and third peaks. The peak with the highest fluorescence in the cluster was used to size the amplicons. The polymorphic loci were sequenced directly to confirm the polymorphisms. 
Sequence analysis allowed the comparison of the nucleotide sequences of the loci; however, some loci (loci CL10, CL33, and CL12) were difficult to sequence directly, possibly due to the length of the repeat region. With these three loci, the data from in silico analysis and successful direct sequencing of several strains allowed the size of the SNR to be established to ±2 bp. Sequence analysis was able to distinguish between alleles of amplicons that had multiple SNR regions that masked the polymorphisms present at each SNR region (CL1 locus), since fragment analysis at this locus was not informative. Many of the SNR loci showed limited polymorphisms when they were used to screen the B. anthracis strains selected (Table 6).
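The peak-calling rule described above (take the peak with the highest fluorescence within a cluster of split peaks) can be sketched as follows. This is only an illustration under stated assumptions: the function name `call_amplicon_sizes`, the input format of (size in bp, fluorescence) pairs, and the example peak values are all hypothetical, and the 3-bp window is taken from the description of clusters "within 3 bp of each other."

```python
def call_amplicon_sizes(peaks, window_bp=3.0):
    """Group peaks lying within `window_bp` of a cluster's first peak,
    then report the size of the tallest peak in each cluster.

    `peaks` is a list of (size_bp, fluorescence) tuples.
    """
    clusters = []
    for size, height in sorted(peaks):
        # Extend the current cluster if this peak is close to its first peak.
        if clusters and size - clusters[-1][0][0] <= window_bp:
            clusters[-1].append((size, height))
        else:
            clusters.append([(size, height)])
    # The tallest peak in each cluster is used to size the amplicon.
    return [max(cluster, key=lambda peak: peak[1])[0] for cluster in clusters]

# Made-up electropherogram: a four-peak cluster and a two-peak cluster.
example_peaks = [
    (151.0, 300), (152.0, 2100), (153.0, 1800), (154.0, 250),
    (170.1, 900), (171.0, 1500),
]
called = call_amplicon_sizes(example_peaks)
```

Anchoring each cluster to its first peak keeps a long ladder of stutter peaks from being merged into a single call, which seemed the more conservative choice for a sketch.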
TABLE 4. In silico analysis of SNR loci from B. anthracis sequenced genomes
TABLE 5. MLVA typing results at eight loci for 19 B. anthracis strains
TABLE 6. Lengths of repeats at polymorphic loci for 19 B. anthracis strains
Molecular typing of B. anthracis has been possible due to the exploitation of VNTRs by MLVA. VNTR mutation rates are low enough to maintain their sizing through 100,000 generations with only one change in allele size (8). MLVA mutation rates are locus dependent but have been reported to be between $10^{-5}$ and $10^{-4}$ per generation in B. anthracis and greater than $10^{-3}$ mutations per generation in other bacterial species (9). The use of additional MLVA markers beyond the eight markers used in this study may differentiate between the strains used in this experiment. However, some SNR markers have higher mutation rates and perhaps higher diversity index values than most MLVA markers and therefore offer the best chance of discriminating between isolates with low levels of genetic diversity. It may not be appropriate to use these markers to establish phylogenetic relationships among diverse isolates due to homoplasy because of the high mutation rates of these markers (9). These SNR markers are best used as a molecular epidemiological tool for examination of very closely related isolates that are indistinguishable by MLVA, thereby allowing one to distinguish closely related strains more accurately at the terminal branches of the phylogenetic tree.
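To make the effect of these rate differences concrete, the following sketch compares how likely a single locus is to change over a short outbreak-scale history. It assumes a constant, independent per-generation mutation probability at one locus, which is a simplification; the 1,000-generation horizon and the function names are illustrative assumptions, while the rates come from the values quoted in this article.

```python
def expected_changes(rate_per_generation, generations):
    """Expected number of allele changes at one locus (rate x generations)."""
    return rate_per_generation * generations

def prob_at_least_one_change(rate_per_generation, generations):
    """Probability that a locus changes at least once, assuming each
    generation is an independent trial with the same mutation probability."""
    return 1.0 - (1.0 - rate_per_generation) ** generations

MLVA_RATE_LOW = 1e-5   # lower end of the MLVA range quoted in the text
SNR_RATE_HIGH = 6.0e-4  # upper SNR rate quoted in the text
GENERATIONS = 1000      # hypothetical horizon, chosen for illustration only

p_mlva = prob_at_least_one_change(MLVA_RATE_LOW, GENERATIONS)
p_snr = prob_at_least_one_change(SNR_RATE_HIGH, GENERATIONS)
changes_over_100k = expected_changes(MLVA_RATE_LOW, 100_000)
```

Under these assumptions a fast SNR locus is far more likely than a slow MLVA locus to have changed within the same number of generations, which is the property that makes SNRs useful for separating very closely related isolates.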
Alternative molecular typing methods that could provide isolate discrimination include whole-genome sequencing and microarray-based resequencing. Whole-genome sequencing reveals polymorphisms for typing purposes by allowing comparisons of entire genomes from isolates of interest; however, this technique remains cost prohibitive and is not feasible for large sample sizes (17). Microarray-based resequencing of B. anthracis has been carried out with 56 strains of B. anthracis (22). Resequencing allows one to survey large areas of the genome for strain-specific variations; however, the proper selection of the regions to be represented on the chip is crucial, since only a portion of the genome is examined. Although this technique is well suited to large sample sizes, it is cost prohibitive and is dependent on which portions of the genome have been exploited.
As expected, there is a positive correlation between diversity (D) and the length of the repeat, since larger mononucleotide SSRs are more likely to undergo slip-strand mispairing, resulting in greater variability in repeat length (1). A highly significant correlation between total repeat length and the number of alleles has been described for B. anthracis with larger repeat units (11). In our study, some plasmid SNRs were among the most polymorphic markers evaluated. Unlike chromosomal loci, plasmid loci are present in multiple copies and the detection of transient states of SNRs may be possible.
Differences in the number of polymorphic SNR loci for strains with identical MLVA genotypes were observed. There were two locus differences for 9609/9614/93-189C; two locus differences between Vollum and Vollum 1B; and seven locus differences between 17T5 and SK31, although the pXO1 MLVA locus was polymorphic between the two strains. When nine other B. anthracis strains with the same MLVA genotype were compared (strains 9604, 9807, 9911, 03-0139, 03-0191, 9937, 9946, 94-188C, and 200077), 22 different alleles were identified at a combined seven loci. Seven of the nine strains were distinguished from each other by the use of four SNR loci (the CL10, CL12, CL33, and CL76 loci). While our study demonstrates that SNR analysis does allow strains with the same MLVA genotype to be further distinguished from each other, two sets of isolates were not readily distinguished from each other. One set included strains 9609 and 9614, which were from the same outbreak. It is interesting that strains 9937 and 9946 were isolated in the same year at the same location (Alhambra, Alberta, Canada) and had distinctive SNR genotypes. The other set of SNR identical isolates were 03-0191 and 03-0139, which were both isolated from bovines in Manitoba, Canada, in 2003; but the nature of their isolation is not clear (they may have been from the same outbreak or even the same animal). Although the use of the most polymorphic SNR markers may be a prudent first step in attempting to distinguish between several isolates with identical MLVA genotypes (the CL10, CL12, CL33, CL76, and CL60 loci), any one of the SNR markers (Table 6) may prove to be discriminatory between isolates. This technique is laborious and may not be suited to high-throughput automation; it uses specialized molecular typing equipment but can be easily adopted by laboratories that perform MLVA. This technique allows isolates of B. anthracis to be distinguished from each other when other typing methods fail to discriminate them; therefore, in epidemiological studies or in forensic investigations, this may be the only technique that offers the discriminatory power required.
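The comparison described above amounts to grouping isolates by their combined SNR profile and checking which groups still contain more than one isolate. A minimal sketch of that bookkeeping follows. The locus names are taken from the text, but the isolate labels and repeat lengths are invented placeholders, not values from Table 6, and `group_by_profile` is a hypothetical helper.

```python
from collections import defaultdict

def group_by_profile(profiles, loci):
    """Group isolates whose repeat lengths match at every listed locus.

    `profiles` maps isolate name -> {locus name: repeat length}.
    Returns a list of groups; a group with more than one member holds
    isolates that these loci cannot tell apart.
    """
    groups = defaultdict(list)
    for isolate, alleles in profiles.items():
        key = tuple(alleles[locus] for locus in loci)
        groups[key].append(isolate)
    return [sorted(members) for members in groups.values()]

LOCI = ["CL10", "CL12", "CL33", "CL76"]  # locus names from the text

# Placeholder data for illustration only; not measured values.
example_profiles = {
    "isolate_A": {"CL10": 21, "CL12": 18, "CL33": 25, "CL76": 12},
    "isolate_B": {"CL10": 21, "CL12": 18, "CL33": 25, "CL76": 12},
    "isolate_C": {"CL10": 22, "CL12": 18, "CL33": 24, "CL76": 12},
}
groups = group_by_profile(example_profiles, LOCI)
unresolved = [group for group in groups if len(group) > 1]
```

Adding a further locus to `LOCI` and regrouping shows directly whether that marker splits any of the unresolved sets.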
Screening of the SNR markers (Table 3) against a more genetically and geographically diverse group of B. anthracis isolates to determine the full breadth of the SNR polymorphisms would be an important next step. Also, testing of these markers against a larger group of isolates with the same MLVA genotype is crucial in order to determine the utility of these markers for the differentiation of B. anthracis isolates and for epidemiological analysis of B. anthracis outbreaks.
The excellent technical assistance of D. Johnstone, T. MacMillan, and M. Russell is noted with appreciation. Thanks go to Barry Ford and John Cherwonogrodzky for reviewing this work.