Journal of Clinical Microbiology, August 2004, p. 3538-3548, Vol. 42, No. 8
0095-1137/04/$08.00+0 DOI: 10.1128/JCM.42.8.3538-3548.2004
Copyright © 2004, American Society for Microbiology. All Rights Reserved.
Characterization of a Trinucleotide Repeat Sequence (CGG)5 and Potential Use in Restriction Fragment Length Polymorphism Typing of Mycobacterium tuberculosis
Yayoi Otsuka,1 Pawel Parniewski,2,3 Zofia Zwolska,4 Masatake Kai,1,
Tomoko Fujino,1 Fumiko Kirikae,1 Emiko Toyota,1 Koichiro Kudo,1 Tadatoshi Kuratsuji,1 and Teruo Kirikae1*
International Medical Center of Japan, Shinjuku, Tokyo 162-8655, Japan,1
Centre for Medical Biology and Microbiology, Polish Academy of Science, Lodz 93-232,2
Centre for Medical Biology and Microbiology, Swietokrzyska Academy, Kielce 25-369,3
National Research Institute of Tuberculosis and Lung Diseases, Warsaw 01-138, Poland4
Received 16 November 2003/
Returned for modification 10 February 2004/
Accepted 13 April 2004

ABSTRACT
The genomes of 28 bacterial strains, including mycobacterial species Mycobacterium tuberculosis and Mycobacterium bovis, were analyzed for the presence of a special class of microsatellite, that of trinucleotide repeat sequences (TRS). Results of a search of all 10 possible TRS motifs (i.e., CCT, CGG, CTG, GAA, GAT, GTA, GTC, GTG, GTT, and TAT) with five or more repeating units showed that (CGG)5 was highly represented within the genomic DNA of M. tuberculosis and M. bovis. Most of the (CGG)5 repeats in the genome were within the open reading frames of two large gene families encoding PE_PGRS and PPE proteins that have the motifs Pro-Glu (PE) and Pro-Pro-Glu (PPE). (CGG)5-probed Southern hybridization showed that some mycobacterial species, such as Mycobacterium marinum, Mycobacterium kansasii, and Mycobacterium szulgai, possess many copies of (CGG)5 in their genomes. Analysis of clinical isolates obtained from Tokyo and Warsaw with both IS6110 and (CGG)5 probes showed that there is an association between the fingerprinting patterns and the geographic origin of the isolates and that (CGG)5 fingerprinting patterns were relatively more stable than IS6110 patterns. The (CGG)5 repeat is a unique sequence for some mycobacterial species, and (CGG)5 fingerprinting can be used as an epidemiologic method for these species as well as IS6110 fingerprinting can. If these two fingerprinting methods are used together, the precise analysis of M. tuberculosis isolates will be accomplished. (CGG)5-based fingerprinting is particularly useful for M. tuberculosis isolates with few or no insertion elements and for the identification of other mycobacterial species when informative probes are lacking.

INTRODUCTION
DNA fingerprinting of the inserted IS6110 element specific for the Mycobacterium tuberculosis complex is a powerful epidemiological tool for visualizing DNA restriction fragment length polymorphisms (RFLP) of M. tuberculosis (26). The major limitation of IS6110-based RFLP typing is the difficulty of discriminating genetic polymorphisms of M. tuberculosis isolates with only a few copies of the element. In addition, there are two reports (1, 29) that described IS6110-based RFLP as unstable, although other studies have confirmed a high degree of stability (5, 15). Yeh and colleagues (29) indicated that genotypes with IS6110 were relatively unstable because they changed rapidly compared with those based on another marker. Alito et al. (1) reported that a multidrug-resistant outbreak strain changed rapidly, according to IS6110 RFLP, over a period of a few years.
A number of alternative typing methods for M. tuberculosis isolates that use genetic markers, such as polymorphic GC-rich repetitive sequences (PGRS) (19), tandem repeat sequences of 10 bp found in PPE family proteins (10), the direct repeat (9), a (GTG)5 repeat (28), IS1547 (6), katG (30), and tandem repeats of 40 to 100 bp (14, 24), have been reported.
Trinucleotide repeat sequences (TRS) comprise a class of microsatellites that are involved in human neurodegenerative diseases (27). Studies in Escherichia coli showed that these TRS, such as (CTG)n and (CGG)n, may effect genetic instability during DNA replication, transcription, and repair processes (17). (GAA)12 has been found in a plasmid of Mycoplasma gallisepticum (12, 13), and it positively regulates gene expression in this plasmid. It is not as well known whether bacterial genomes possess tandem repeat sequences. The types, lengths, and distribution of such sequences may serve as valuable markers for phylogenetic or epidemiologic studies of various bacteria.
In the present study, we searched for all possible TRS in various bacterial strains and found that M. tuberculosis and Mycobacterium bovis possess many (CGG)5 repeats. We also analyzed M. tuberculosis clinical isolates obtained from Japan and Poland with (CGG)5-based DNA fingerprinting and show that this method is useful for the genetic analysis of clinical isolates of M. tuberculosis.

MATERIALS AND METHODS
Bacterial strains.
The sources of mycobacterial strains used in this study are listed in Table 1. Clinical isolates were obtained from the International Medical Center of Japan (IMCJ) in Tokyo, Japan, in 2001 and from the National Research Institute of Tuberculosis and Lung Diseases in Warsaw, Poland, in 2000. These clinical isolates were obtained from different patients. Drug susceptibility testing was performed by conventional culture on solid media with a proportion method (Wellpack; Japan BCG Laboratory, Tokyo, Japan) or by a microdilution method with Vit spectrum SR (Kyokuto Pharmaceutical Co., Ltd., Tokyo, Japan). The antituberculosis drugs tested and the concentrations used were as follows: isoniazid, 0.2 and 1.0 µg/ml; rifampin, 40 µg/ml; ethambutol, 2.5 µg/ml; streptomycin, 10 µg/ml; para-aminosalicylic acid, 0.5 µg/ml; cycloserine, 30 µg/ml; ethionamide, 20 µg/ml; kanamycin, 20 µg/ml; enviomycin, 20 µg/ml; and levofloxacin, 1.0 µg/ml. Drug resistance is defined as resistance to at least one drug. Serial cultures were made from M. tuberculosis strain H37Rv and a clinical isolate from Japan (IMCJ 541) and were passaged weekly over 9 weeks.
Genome sequence.
The genome sequences of 28 bacterial strains were downloaded from the National Center for Biotechnology Information GenBank database (http://www.ncbi.nlm.nih.gov/genomes/MICROBES/Complete.html), The Institute for Genomic Research website (http://www.tigr.org/CMR), the Sanger Center (http://www.sanger.ac.uk), and the DNA Data Bank of Japan (http://www.ddbj.nig.ac.jp).
Isolation and restriction enzyme digestion of mycobacterial DNA.
Chromosomal DNA of the mycobacterial strains and M. tuberculosis clinical isolates was prepared as described previously (16, 26) with slight modifications. Briefly, for isolation of genomic DNA, M. tuberculosis strains were grown on egg-based Ogawa solid medium (Kyokuto Pharmaceutical Co., Ltd.) for 3 to 5 weeks. All bacterial cells from one slant were transferred to 400 µl of TE buffer (0.01 M Tris-HCl, 0.001 M EDTA [pH 8.0]), and the solution was heated at 80°C for 20 min to kill the bacteria. Fifty microliters of lysozyme (10 mg/ml) was added, and the tube was incubated overnight at 37°C. Seventy microliters of sodium dodecyl sulfate (10%) and 5 µl of proteinase K (10 mg/ml) were added, and the mixture was incubated for 10 min at 65°C. A 100-µl volume of 5 M NaCl and the same volume of an N-cetyl-N,N,N-trimethylammonium bromide (CTAB)-NaCl solution (4.1 g of NaCl and 10 g of CTAB per 100 ml) were added together. The tubes were vortexed and incubated for 10 min at 65°C. An equal volume of chloroform-isoamyl alcohol (24:1) was added, the mixture was centrifuged for 5 min at 12,000 x g, and the aqueous supernatant was carefully transferred to a fresh tube. The total DNA was precipitated in isopropanol and was redissolved in 20 µl of 0.1x TE buffer. All restriction enzymes used in this study, AatII, AfaI, AluI, EcoRI, HinfI, MluI, NruI, NsbI, PstI, PvuII, SacI, Sau3AI, SalI, SmaI, XhoI, and XspI, were purchased from Takara Bio Inc. (Shiga, Japan). Chromosomal DNA was digested overnight with each restriction enzyme (1 U/µg of DNA) under the conditions specified by the manufacturer. The digested fragments were separated by electrophoresis on a horizontal 1% agarose gel at 15 V for 20 h (14-cm gel) in 1x TAE buffer (0.04 M Tris-acetate, 0.001 M EDTA). A 1-kb DNA ladder and λ DNA restricted with HindIII (Promega Corp., Madison, Wis.) were used as size markers. The gels were then stained with ethidium bromide, and the results were recorded photographically.
Southern blotting.
Gels were depurinated in 0.25 M HCl for 30 min and then denatured in 0.5 M NaOH and 1.5 M NaCl for 30 min. DNA fragments were transferred to an N+ Hybond membrane (Amersham Biosciences, Little Chalfont, Buckinghamshire, United Kingdom) overnight, and the DNA was fixed to the membrane by UV irradiation.
The IS6110 probe used in this study was a 245-bp DNA fragment amplified by PCR as described previously (26). Briefly, oligonucleotides INS1 (5'-CGTGAGGGCATCGAGGTGGC-3') and INS2 (5'-GCGTAGGCGTCGGTGACAAA-3') were used to amplify a 245-bp fragment from purified chromosomal M. bovis BCG DNA by PCR. The 15-mer oligonucleotide (CGG)5, 5'-CGGCGGCGGCGGCGG-3', was synthesized (Nippn TechnoCluster, Inc., Tokyo, Japan). These probes were labeled with horseradish peroxidase by the ECL direct system (Amersham Biosciences). Hybridization and detection were performed according to the recommendations of the manufacturer. Autoradiographs were obtained by exposing the membrane to X-ray film.
Analysis.
IS6110- and (CGG)5-based fingerprinting patterns were analyzed with Molecular Analyst Fingerprinting Plus software, version 1.6 (Bio-Rad Laboratories, Inc., Hercules, Calif.). To facilitate the comparison of the fingerprinting patterns, normalization was carried out with the use of molecular weight standards and the IS6110- or (CGG)5-fingerprinting patterns of two clinical isolates, IMCJ 541 and a Poland-derived isolate, no. 28 (P 28), on each gel. Each dendrogram was calculated with the unweighted pair group method with average linkage according to the supplier's instructions.
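As an illustration of the clustering step, the following minimal Python sketch scores pairs of lanes by shared bands and then joins them by average linkage. It is not the Molecular Analyst software used in the study. The Dice coefficient, the 5% band-matching tolerance, and the three lanes of band sizes are assumptions made only for this example; the text does not state which similarity coefficient was used.

```python
from itertools import combinations


def dice_similarity(bands_a, bands_b, tolerance=0.05):
    """Dice coefficient 2*shared/(n_a + n_b); two bands match when their
    sizes differ by at most `tolerance` (a fraction of the larger band)."""
    unmatched = list(bands_b)
    shared = 0
    for size in bands_a:
        for other in unmatched:
            if abs(size - other) <= tolerance * max(size, other):
                shared += 1
                unmatched.remove(other)  # each band may be matched only once
                break
    total = len(bands_a) + len(bands_b)
    return 2 * shared / total if total else 0.0  # convention for two empty lanes


def upgma(names, similarity):
    """Average-linkage clustering; returns [(cluster_a, cluster_b, similarity)]."""
    def average(c1, c2):
        pair_sum = sum(similarity[frozenset((a, b))] for a in c1 for b in c2)
        return pair_sum / (len(c1) * len(c2))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        # Join the two clusters with the highest mean pairwise similarity.
        c1, c2 = max(combinations(clusters, 2), key=lambda pair: average(*pair))
        merges.append((c1, c2, average(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical band sizes in kb, not data from the study.
lanes = {"A": [8.0, 5.1, 3.0, 1.1], "B": [8.0, 5.1, 3.0, 2.3], "C": [6.0, 2.3, 0.9]}
similarity = {frozenset((a, b)): dice_similarity(lanes[a], lanes[b])
              for a, b in combinations(lanes, 2)}
for left, right, level in upgma(list(lanes), similarity):
    print(left, right, round(level, 2))
```

Each merge level corresponds to a branch point of the dendrogram, so a similarity cutoff applied to these levels defines clusters in the same way as the cutoffs quoted in Results.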

RESULTS
Presence of TRS in mycobacterial strains and other bacterial species.
To detect TRS among bacterial genomes and to determine the types of TRS and their repeat sizes, we searched for all 10 possible TRS motifs (i.e., CCT, CGG, CTG, GAA, GAT, GTA, GTC, GTG, GTT, and TAT) of five or more repeating units with the BLASTN algorithm (2). Among 28 bacterial strains, the numbers of TRS displayed large variation, with values ranging from zero to 38 (shown in the extreme right column in Table 2). M. tuberculosis strains H37Rv and CDC1551 and M. bovis possessed markedly more TRS copies than other species examined. The majority of the other species possessed fewer than 10 copies. Five strains, Listeria innocua, Listeria monocytogenes, Staphylococcus aureus N315, Thermoplasma acidophilum, and Thermoplasma volcanium, did not possess any TRS. The types of TRS varied (Table 2). (CCT)5 did not exist in any of the bacteria examined in this study. CGG repeats, predominantly (CGG)5, existed with high frequency in the genomes of M. tuberculosis strains H37Rv and CDC1551 and M. bovis; the frequencies of the appearance of CGG with five or more repeats were one per 150 to 200 kb. Neisseria meningitidis MC58 and Pseudomonas aeruginosa possessed six copies of (CGG)5 with a frequency of one copy per 380 kb and five copies with a frequency of one copy per 1,250 kb, respectively. Few (CGG)5 repeats were found in E. coli K12-MG1655, E. coli O157:H7 EDL933, E. coli O157:H7 VT2-Sakai, N. meningitidis serogroup A Z2491, Salmonella enterica, and S. enterica serovar Typhimurium. There were no (CGG)5 repeats in Clostridium acetobutylicum, Clostridium perfringens, Helicobacter pylori 26695, H. pylori J99, L. innocua, L. monocytogenes, Mycobacterium leprae, Mycoplasma genitalium, Mycoplasma pneumoniae, Mycoplasma pulmonis, Rickettsia conorii, Rickettsia prowazekii, S. aureus Mu50, S. aureus N315, T. acidophilum, T. volcanium, and Yersinia pestis. Other possible repeats of CTG, GAA, GAT, GTA, GTC, GTG, GTT, and TAT were found sporadically among various bacterial strains. However, only a few copies of these TRS were found. For example, one copy of (CTG)5 was found in C. acetobutylicum, one (CTG)10 was found in S. enterica serovar Typhi, two (CTG)5 repeats were found in S. enterica serovar Typhimurium, and one (CTG)5 and one (CTG)6 repeat were found in Y. pestis. Relatively large TRS with 21 or 16 repeats were detected in M. leprae and Mycoplasma genitalium, respectively. M. genitalium possessed three types of TRS repeats (GAA, GTA, and GTT) and different numbers of repeats [(GAA)5, (GAA)6, and (GAA)16; (GTA)5, (GTA)7, (GTA)8, (GTA)9, (GTA)10, (GTA)11, and (GTA)16; and (GTT)11].
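The search described above can be illustrated with a short Python sketch. It uses a regular-expression scan in place of BLASTN, so it is a stand-in technique and not the method used in the study. The first part checks why there are exactly 10 motifs: the 60 non-homopolymeric trinucleotides fall into 10 classes once rotations and the complementary strand are treated as equivalent. The function names and the toy sequence are assumptions introduced for this example.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def motif_class(motif):
    """All trinucleotides equivalent to `motif` by rotation or strand."""
    forms = set()
    for m in (motif, reverse_complement(motif)):
        forms.update(m[i:] + m[:i] for i in range(3))
    return frozenset(forms)


PAPER_MOTIFS = ["CCT", "CGG", "CTG", "GAA", "GAT", "GTA", "GTC", "GTG", "GTT", "TAT"]

# Every trinucleotide except AAA, CCC, GGG, and TTT, grouped into classes.
all_classes = {motif_class(a + b + c)
               for a in "ACGT" for b in "ACGT" for c in "ACGT"
               if not a == b == c}


def find_trs(sequence, motif, min_units=5):
    """Yield (start, units, strand) for tandem runs of `motif` with at least
    `min_units` copies; '-' means the run lies on the complementary strand."""
    for strand, m in (("+", motif), ("-", reverse_complement(motif))):
        pattern = re.compile("(?:%s){%d,}" % (m, min_units))
        for hit in pattern.finditer(sequence):
            yield hit.start(), len(hit.group()) // 3, strand


# Toy sequence (hypothetical): (CGG)5 on the top strand, (CGG)6 on the other.
toy = "ATG" + "CGG" * 5 + "TTAA" + "CCG" * 6 + "GATC"
print(len(all_classes))
print(list(find_trs(toy, "CGG")))
```

The scan matches one fixed reading register of the motif per strand, so a run written as (GGC)5 is reported as four CGG units, not five; a production search would need to decide how to treat such shifted registers.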
Positions of (CGG)5, (CGG)6, and (CGG)7 in the genome.
The M. tuberculosis and M. bovis genomes consist of 4.4 and 4.3 Mb, respectively. All (CGG)5, (CGG)6, and (CGG)7 repeats in both M. tuberculosis strains H37Rv and CDC1551 were located between 0.05 and 4.0 Mb (Table 3). These repeats appeared to be distributed randomly. In strain H37Rv, one (CGG)7 was located at 0.05 Mb, and one (CGG)6 was located at 2.4 Mb. Five (CGG)5 repeats were between 0.1 and 1.0 Mb, six were between 1.0 and 2.0 Mb, eight were between 2.0 and 3.0 Mb, and eight were between 3.0 and 4.4 Mb. In strain CDC1551, one (CGG)6 repeat was located at 0.05 Mb. Six (CGG)5 repeats were between 0.1 and 1.0 Mb, 6 were between 1.0 and 2.0 Mb, 11 were between 2.0 and 3.0 Mb, and 9 were between 3.0 and 4.4 Mb. In M. bovis, four (CGG)5 repeats were located between 0.26 and 1.0 Mb, five were between 1.0 and 2.0 Mb, seven were between 2.0 and 3.0 Mb, and six were between 3.0 and 4.3 Mb (Table 3). Almost all of the (CGG)5, (CGG)6, and (CGG)7 repeats in M. tuberculosis and M. bovis were located within the open reading frame (ORF), with the exception of six (CGG)5 repeats that were located between 1.1 and 3.96 Mb in strain CDC1551. Among these, the four (CGG)5 repeats at 1.09, 3.74, 3.76, and 3.96 Mb were in the putative ORF with authentic frameshift or point mutation (Table 3).
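The average spacing of CGG repeats can be rechecked from the interval counts given in this paragraph. The sketch below is only a worked calculation: the tallies are transcribed from the text (not from Table 3), the genome sizes are the rounded 4.4 and 4.3 Mb values quoted above, and the 4.4-Mb size is assumed to apply to both M. tuberculosis strains.

```python
# Genome sizes in kb, rounded as in the text.
GENOME_KB = {"H37Rv": 4400, "CDC1551": 4400, "M. bovis": 4300}

# Repeat counts per genome interval, transcribed from the paragraph above.
REPEATS = {
    "H37Rv": {"(CGG)7": [1], "(CGG)6": [1], "(CGG)5": [5, 6, 8, 8]},
    "CDC1551": {"(CGG)6": [1], "(CGG)5": [6, 6, 11, 9]},
    "M. bovis": {"(CGG)5": [4, 5, 7, 6]},
}


def total_repeats(strain):
    return sum(sum(counts) for counts in REPEATS[strain].values())


def kb_per_repeat(strain):
    """Average genome length per CGG run of five or more units."""
    return GENOME_KB[strain] / total_repeats(strain)


for strain in REPEATS:
    print(strain, total_repeats(strain), round(kb_per_repeat(strain)))
```

With these tallies the spacing comes to roughly 152 kb for H37Rv, 133 kb for CDC1551, and 195 kb for M. bovis; Table 3 remains the authoritative count.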
In strain H37Rv, the genes containing (CGG)5 and (CGG)6 encoded the PPE and PE_PGRS families of proteins. A gene containing (CGG)7, PonA, encoded a penicillin-binding protein (Table 3). In strain CDC1551, the genes containing (CGG)5 encoded the PPE, PE_PGRS, and PE families of proteins. A gene containing (CGG)6 encoded a penicillin-binding protein (Table 3). In M. bovis, all genes containing (CGG)5 encoded PPE and PE_PGRS family proteins, with the exception of two genes that encoded probable conserved membrane proteins (Table 3). In all three strains, the (CGG)5 in the PPE genes translated to poly(Ala), and the (CGG)5 and (CGG)6 in the PE_PGRS and PE genes translated to poly(Gly). In both M. tuberculosis strains, the (CGG)6 and (CGG)7 in genes encoding penicillin-binding proteins translated to poly(Pro) (Table 3). In M. bovis, the two (CGG)5 repeats in genes encoding probable conserved membrane proteins translated to poly(Ala) and poly(Pro) (Table 3). Most of the (CGG)5 repeats within the PPE genes were located in the N-terminal PPE domain of the genes (data not shown). All (CGG)5 and (CGG)6 repeats within the PE_PGRS genes consisting of PE and PGRS domains were located in the PGRS domain (data not shown). Two (CGG)5 repeats within the PE family-related gene (MT2159) in strain CDC1551 were located in the C-terminal domain of the genes (data not shown).
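Why a single repeat can encode poly(Ala), poly(Gly), or poly(Pro) follows from the reading frame and the strand on which the gene lies. The short Python sketch below reads a CGG tract in all six frames with the standard genetic code; only the six codons that can occur in such a tract are listed. The function name is an assumption introduced for this example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Standard genetic code, restricted to the codons that occur in a CGG/CCG tract.
CODONS = {"CGG": "Arg", "GGC": "Gly", "GCG": "Ala",
          "CCG": "Pro", "CGC": "Arg", "GCC": "Ala"}


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def homopolymer_by_frame(unit="CGG"):
    """Amino acid encoded by a tandem repeat of `unit` in each of the
    three frames on each strand."""
    result = {}
    for strand, u in (("+", unit), ("-", reverse_complement(unit))):
        for frame in range(3):
            codon = (u * 2)[frame:frame + 3]  # the codon repeated in this frame
            result[(strand, frame)] = CODONS[codon]
    return result


for key, amino_acid in homopolymer_by_frame().items():
    print(key, "poly(%s)" % amino_acid)
```

The six frames give Arg, Gly, and Ala on the CGG strand and Pro, Arg, and Ala on the complementary strand, so the three homopolymers reported in Table 3 correspond to particular frame and strand combinations.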
Genomic stability.
To examine whether (CGG)5 repeats in the genome are stable, two M. tuberculosis strains (H37Rv and IMCJ 541) were analyzed for (CGG)5- and IS6110-probed fingerprints. The fingerprint patterns among culture periods were identical for strain H37Rv (Fig. 1A). These findings were confirmed with strain IMCJ 541 (Fig. 1B). The data indicate that (CGG)5 repeats are stable in the genome for at least a few months. In the IS6110-probed fingerprints, the patterns did not change during the 9 weeks of culture of strain H37Rv or strain IMCJ 541 (data not shown), indicating that IS6110 inserts are also stable over a few months.
Comparison of fingerprints between M. tuberculosis strains H37Rv and H37Ra.
The virulent M. tuberculosis strain H37Rv and its avirulent derivative strain H37Ra were originally derived from the same strain, H37 (22, 23). It was reported that there are distinct differences between these strains with respect to IS6110-probed fingerprint patterns (3, 11). We investigated whether differences exist between these strains with respect to (CGG)5-probed fingerprint patterns. DNA derived from the H37Rv and H37Ra strains was digested with 16 restriction enzymes as described in Materials and Methods. Unexpectedly, the patterns of (CGG)5-based hybridization showed no differences between the H37Rv and H37Ra strains (Fig. 2A). For example, the (CGG)5-based RFLP patterns of PvuII-digested fragments of H37Rv were identical to those of H37Ra (Fig. 2A, PvuII). However, the IS6110-based RFLP patterns of H37Rv were markedly different from those of H37Ra, which were analyzed with the use of the same blot of PvuII-digested fragments used in the (CGG)5-based RFLP analysis (Fig. 2B). In the IS6110-based RFLP patterns, H37Rv showed 9 bands, and H37Ra showed 11 bands. Strain H37Rv but not H37Ra showed one band of 5.1 kb. Strain H37Ra but not H37Rv showed three bands of 1.1, 2.3, and 3.0 kb.
IS6110- and (CGG)5-probed DNA fingerprinting of M. tuberculosis clinical isolates.
To assess the potential usefulness of (CGG)5 as an epidemiologic marker for M. tuberculosis, 109 clinical isolates obtained from Tokyo (76 isolates) and Warsaw (33 isolates) and the H37Rv and H37Ra strains were analyzed by the IS6110- and (CGG)5-probed fingerprint methods. For IS6110-probed hybridization, DNA of these isolates was digested with PvuII according to a standardized protocol (26). For (CGG)5-probed hybridization, DNA of the isolates was digested with AluI. When DNA of the H37Rv and H37Ra strains was digested with AatII, EcoRI, MluI, NruI, NsbI, PstI, PvuII, SacI, SalI, or XhoI, relatively higher-molecular-weight DNA fragments were visualized by the probe with a minimum size of 1 to 3.5 kb and a maximum size of more than 10 kb (Fig. 2A). When digested with AfaI, AluI, HinfI, Sau3AI, SmaI, or XspI, DNA fragments of sizes of 0.5 to 8 kb were visualized. When DNA of five clinical isolates selected at random was digested with AluI, clear (CGG)5 fingerprint patterns with 10 to 14 copies of DNA fragments of 0.75 to 8 kb were detected (data not shown). Although we used AluI for this fingerprinting method, other enzymes may also be used.
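The effect of enzyme choice on the probed fragment sizes can be illustrated with a small in-silico digest. The sketch below cuts a linear toy sequence at AluI (AG^CT) or PvuII (CAG^CTG) sites and reports the sizes of fragments that would hybridize with the (CGG)5 probe on either strand. The toy sequence and the function names are assumptions made for this example, and the sketch ignores partial digestion and gel resolution.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Recognition site and cut offset within the site (both enzymes leave blunt ends).
ENZYMES = {"AluI": ("AGCT", 2), "PvuII": ("CAGCTG", 3)}


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def digest(sequence, enzyme):
    """Cut a linear sequence at every recognition site; return the fragments."""
    site, offset = ENZYMES[enzyme]
    cuts, pos = [], sequence.find(site)
    while pos != -1:
        cuts.append(pos + offset)
        pos = sequence.find(site, pos + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


def probed_fragment_sizes(sequence, enzyme, probe="CGG" * 5):
    """Sizes (bp, largest first) of fragments carrying the probe target."""
    target = reverse_complement(probe)
    return sorted((len(f) for f in digest(sequence, enzyme)
                   if probe in f or target in f), reverse=True)


# Hypothetical 80-bp sequence with one repeat on each strand.
toy = ("A" * 10 + "AGCT" + "T" * 5 + "CGG" * 5 + "T" * 6 + "AGCT"
       + "A" * 8 + "CCG" * 5 + "A" * 3 + "CAGCTG" + "T" * 4)
print(probed_fragment_sizes(toy, "AluI"))
print(probed_fragment_sizes(toy, "PvuII"))
```

In this toy case AluI separates the two repeats onto fragments of 31 and 30 bp, whereas PvuII leaves both on a single 73-bp fragment, so the enzyme with the shorter recognition site yields more and smaller probed bands here.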
IS6110 fingerprint patterns obtained from clinical isolates and the corresponding dendrogram are shown in Fig. 3A. IS6110 copies were detected in 110 of 111 isolates. One isolate from Japan had no copy. As indicated in Fig. 3A, 10 of 111 isolates (9.0% of tested isolates), including 8 isolates from Japan and 2 from Poland, possessed fewer than 6 copies of IS6110, which was insufficient to distinguish polymorphisms. Except for these 10 isolates with fewer than 6 copies of IS6110, the IS6110 fingerprint patterns of 101 isolates showed 28% similarity; 98 patterns were found (Fig. 3A). Five clusters with 44% similarity, including clusters Ia, IIa, IIIa, IVa, and Va, were detected (Fig. 3A). Cluster Ia was composed of seven Poland-derived isolates. Cluster IIa was composed of two H37 variants and 11 Japan- and 6 Poland-derived isolates. Cluster IIIa was composed of three Japan- and seven Poland-derived isolates. Cluster IVa was composed of four Japan- and five Poland-derived isolates. Cluster Va was composed predominantly of Japan-derived isolates (46 isolates from Japan and 2 from Poland). The majority of Japan-derived isolates (61%) and Poland-derived isolates (76%) belonged to cluster Va and to clusters Ia to IVa, respectively.
(CGG)5 fingerprint patterns and the corresponding dendrogram are shown in Fig. 3B. (CGG)5 copies were detected in all clinical isolates tested. The copy number ranged from 8 to 16, with a mean of 13.0 ± 1.5 per isolate. The number of (CGG)5 copies of Japan- and Poland-derived isolates ranged from 8 to 16, with a mean of 12.9 ± 1.5 per isolate and from 11 to 15, with a mean of 13.2 ± 1.3 per isolate, respectively. A total of 104 (CGG)5 fingerprint patterns were found with 50% similarity (Fig. 3B). Four clusters with 70% similarity, including clusters Ib to IVb, were detected (Fig. 3B). Cluster Ib was composed of two H37 variants and 15 Japan- and 29 Poland-derived isolates. Clusters IIb, IIIb, and IVb were composed of 9, 24, and 10 Japan-derived isolates, respectively. Over half of the Japan-derived isolates (57%) and the majority of the Poland-derived isolates (88%) belonged to clusters IIb to IVb and to cluster Ib, respectively (Fig. 3B).
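The percentages quoted for the IS6110 and (CGG)5 clusters follow directly from the cluster compositions given in the text and can be rechecked with a few lines of Python. The dictionary layout and helper name below are assumptions introduced for this example; the counts are those stated above, with 76 isolates from Japan and 33 from Poland.

```python
JAPAN_TOTAL, POLAND_TOTAL = 76, 33

# Cluster composition as (Japan-derived, Poland-derived) isolate counts.
IS6110_CLUSTERS = {"Ia": (0, 7), "IIa": (11, 6), "IIIa": (3, 7),
                   "IVa": (4, 5), "Va": (46, 2)}
CGG5_CLUSTERS = {"Ib": (15, 29), "IIb": (9, 0), "IIIb": (24, 0), "IVb": (10, 0)}


def percent(count, total):
    return round(100 * count / total)


# IS6110: Japan-derived isolates in Va, Poland-derived isolates in Ia to IVa.
japan_va = percent(IS6110_CLUSTERS["Va"][0], JAPAN_TOTAL)
poland_ia_iva = percent(
    sum(IS6110_CLUSTERS[c][1] for c in ("Ia", "IIa", "IIIa", "IVa")), POLAND_TOTAL)

# (CGG)5: Japan-derived isolates in IIb to IVb, Poland-derived isolates in Ib.
japan_iib_ivb = percent(
    sum(CGG5_CLUSTERS[c][0] for c in ("IIb", "IIIb", "IVb")), JAPAN_TOTAL)
poland_ib = percent(CGG5_CLUSTERS["Ib"][1], POLAND_TOTAL)

# Share of all 111 tested isolates with fewer than 6 IS6110 copies.
low_copy = round(100 * 10 / 111, 1)

print(japan_va, poland_ia_iva, japan_iib_ivb, poland_ib, low_copy)
```

The computed values (61, 76, 57, 88, and 9.0%) agree with those reported in the text.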
Both the IS6110 and (CGG)5 fingerprint analyses showed an association between fingerprint pattern and geographic origin. Ten isolates that were indistinguishable by IS6110 RFLP because of the presence of few copies of the marker could be analyzed with the (CGG)5 marker. Three and seven pairs of isolates were identical to each other in the IS6110 and (CGG)5 fingerprint patterns, respectively (Fig. 4). The three pairs P 1 and P 2, P 7 and P 10, and P 9 and P 13 were identical to each other in both the IS6110 and (CGG)5 fingerprint patterns (Fig. 4A to C, respectively). The four pairs H37Rv and H37Ra, IMCJ 427 and IMCJ 432, P 3 and P 5, and P 17 and P 19 were identical to each other in the (CGG)5 fingerprint pattern but different in the IS6110 fingerprint pattern (Fig. 4D, E, F, and G, respectively). The data suggest that the (CGG)5 fingerprint patterns are more stable than the IS6110 patterns.
Occurrence of (CGG)5 among various mycobacterial strains.
We investigated the presence of (CGG)5 repeat sequences in mycobacterial species. (CGG)5 hybridization patterns from various mycobacterial species are shown in Fig. 5. Bands ranging from 0 to 20 in number were seen. Mycobacterium szulgai possessed 20 bands. M. bovis BCG, Mycobacterium marinum, and Mycobacterium kansasii possessed 16 bands. Mycobacterium nonchromogenicum, Mycobacterium terrae, Mycobacterium gastri, Mycobacterium simiae, Mycobacterium smegmatis, and Mycobacterium intracellulare possessed 14, 12, 8, 5, 5, and 3 bands, respectively. Mycobacterium peregrinum possessed two bands. Mycobacterium fortuitum and Mycobacterium chelonae possessed one band. Mycobacterium scrofulaceum, Mycobacterium avium, Mycobacterium xenopi, and Mycobacterium abscessus showed no bands.

DISCUSSION
In this study, we found that various bacterial strains contain TRS in their genomes. In humans, TRS are associated with hereditary neurologic and neuromuscular disorders, including myotonic dystrophy, Huntington's disease, Fragile X syndrome, and Friedreich's ataxia (27). These diseases result from TRS expansion such as (CTG)n, (CGG)n, and (GAA)n (27). The TRS sizes associated with these diseases are usually quite large. For example, 80 to 3,000 repeats of CTG have been found in myotonic dystrophy, 230 to 2,000 repeats of CGG have been found in Fragile X syndrome, and 200 to 900 repeats of GAA have been found in Friedreich's ataxia (21). These expanded TRS can form hairpin structures or intramolecular triplex structures that result in genetic instability (21). The TRS sizes found in bacteria were relatively small. The largest size TRS identified was 21 repeats of GAA in M. leprae. The most frequently identified TRS was five repeats of CGG in M. tuberculosis and M. bovis. TRS found in bacteria are not likely to be linked to genetic instability because of the lower repeat number.
The (CGG)5 TRS found in two strains of M. tuberculosis (H37Rv and CDC1551) and in one strain of M. bovis existed in genes encoding PE protein families, including a PE_PGRS subfamily and PPE protein families comprising 88 to 101 and 61 to 69 kinds of proteins, respectively, which occupy approximately 8% of the genome (4, 7, 8). The functional properties of (CGG)5 in these genes are unknown, but (CGG)5 should not play an important role in the development of the variations among different strains. (CGG)5 in the PPE genes was located in the conserved N-terminal domain PPE but not in the C-terminal variable domain containing the major polymorphic tandem repeats with the consensus sequence of GCCGGTGTTG (10, 18). (CGG)5 in the PE_PGRS genes was within the C-terminal variable domain containing the PGRS with the consensus sequence of CGGCGGCAA (18, 19). (CGG)5 in the PE_PGRS genes did not comprise part of the consensus sequence of PGRS. (CGG)5 was contained in 13 and 12 PE_PGRS genes in H37Rv and CDC1551, respectively. Among these genes, deletion or insertion was detected at one site of Rv1068c, two sites of Rv1087, and two sites of Rv1450c compared with their orthologs, MT1097, MT1118.1, and MT1497.1, respectively (data not shown). However, (CGG)5 was not near these sites, indicating that it did not directly affect the deletion and insertion of PE_PGRS genes. (CGG)5 in PPE, PE, and PE_PGRS genes translated to neutral-charged amino acids of poly(Ala) and poly(Gly), respectively, with no special substitution, indicating that these regions do not participate in the formation of unique structures within these proteins. Thus, the (CGG)5 sequences in these genes will likely not have characteristic properties regarding function.
It is unclear whether TRS in bacteria, particularly (CGG)5 in M. tuberculosis and M. bovis, participate in their pathogenesis. There was no difference between virulent strain H37Rv and the derived avirulent strain H37Ra in (CGG)5-probed fingerprinting (Fig. 2). No correlation was found between the virulence of mycobacterial species and the numbers of bands in (CGG)5-probed fingerprinting or copies of (CGG)5 (Table 2 and Fig. 5). For example, M. leprae had no (CGG)5 repeats (Table 2). Some rare etiologic agents of nontuberculous mycobacteria, such as M. smegmatis and M. szulgai (20), did possess several copies of (CGG)5 in their genomes (Fig. 5), whereas some common etiologic agents, such as M. avium, M. xenopi, and M. abscessus (20), possessed no (CGG)5 repeats (Fig. 5). These results indicate that (CGG)5 repeats do not participate directly in the virulence of mycobacterial species.
Whereas fingerprinting analysis showed that both (CGG)5 and IS6110 were sufficiently stable epidemiologic markers, (CGG)5 appeared to be more stable than IS6110 (Fig. 1). We were unable to find any differences between strains H37Rv and H37Ra in (CGG)5-probed fingerprinting by extensive studies with various restriction enzymes. However, four different bands were detected between these strains with PvuII-IS6110 fingerprinting (Fig. 2B). Lari et al. (11) compared H37Rv and H37Ra strains maintained at their institution by IS6110 fingerprinting with EcoNI, PstI, and PvuII and found different patterns between these strains. Bifani et al. (3) compared the PvuII-IS6110 fingerprints of 15 and 3 different catalogued variants of H37Rv and H37Ra, respectively. Ten distinct fingerprint patterns, making up nine H37Rv variants and one H37Ra variant, were identified. A discrepancy between IS6110- and (CGG)5-probed fingerprints like that seen in the laboratory strains was also observed in three pairs of clinical isolates (Fig. 4). In these cases, the two isolates of each pair were identical in (CGG)5 fingerprinting pattern but differed in IS6110 fingerprinting pattern. Our recent epidemiological case report of intrafamilial tuberculosis transmission showed that two clinical isolates from a father and son were identical in (CGG)5-probed fingerprinting patterns, whereas one different band was detected between them by IS6110-probed fingerprinting (25). Collectively, IS6110-probed fingerprint patterns changed more rapidly than did (CGG)5-probed patterns, suggesting that there are different mechanisms by which these patterns change. In other words, although (CGG)5-probed fingerprinting will hardly detect a few mutations in a clone of M. tuberculosis, it will readily identify a common origin among the clones. The (CGG)5-probed fingerprinting combined with IS6110-probed fingerprinting will provide more powerful information about tuberculosis epidemiology.
We collected and analyzed the isolates in this study in Japan and Poland. If isolates could be collected worldwide, it would provide more exact epidemiological data. In conclusion, the (CGG)5 repeat is a useful probe for DNA fingerprinting of M. tuberculosis, because all strains tested here possessed more than eight copies. In addition, (CGG)5-probed fingerprinting will be a useful tool for the investigation of M. bovis, M. marinum, M. kansasii, and M. szulgai.

ACKNOWLEDGMENTS
We thank M. Nakano (Jichi Medical School, Japan) for comments
on the manuscript and A. S. Swierzko (Centre for Microbiology
and Virology, Polish Academy of Sciences, Poland) for coordinating
the international collaborative study.
The study was supported by the Health Sciences Research grants from the Ministry of Health, Labour and Welfare and by the Research on Health Sciences focusing on Drug Innovation (KH11008) from the Japan Health Sciences Foundation.

FOOTNOTES
* Corresponding author. Mailing address: Department of Infectious Diseases and Tropical Medicine, International Medical Center of Japan, Toyama 1-21-1, Shinjuku, Tokyo 162-8655, Japan. Phone: 81 3 3202 7181, ext. 2838. Fax: 81 3 3202 7364. E-mail: tkirikae{at}ri.imcj.go.jp.

Present address: Department of Anatomy and Developmental Biology, University College London, London WC1E 6BT, United Kingdom. 

REFERENCES
1 - Alito, A., N. Morcillo, S. Scipioni, A. Dolmann, M. I. Romano, A. Cataldi, and D. van Soolingen. 1999. The IS6110 restriction fragment length polymorphism in particular multidrug-resistant Mycobacterium tuberculosis strains may evolve too fast for reliable use in outbreak investigation. J. Clin. Microbiol. 37:788-791.
2 - Altschul, S. F., W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403-410.
3 - Bifani, P., S. Moghazeh, B. Shopsin, J. Driscoll, A. Ravikovitch, and B. N. Kreiswirth. 2000. Molecular characterization of Mycobacterium tuberculosis H37Rv/Ra variants: distinguishing the mycobacterial laboratory strain. J. Clin. Microbiol. 38:3200-3204.
4 - Cole, S. T., R. Brosch, J. Parkhill, T. Garnier, C. Churcher, D. Harris, S. V. Gordon, K. Eiglmeier, S. Gas, C. E. Barry III, F. Tekaia, K. Badcock, D. Basham, D. Brown, T. Chillingworth, R. Connor, R. Davies, K. Devlin, T. Feltwell, S. Gentles, N. Hamlin, S. Holroyd, T. Hornsby, K. Jagels, B. G. Barrell, et al. 1998. Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature 393:537-544.
5 - de Boer, A. S., M. W. Borgdorff, P. E. de Haas, N. J. Nagelkerke, J. D. van Embden, and D. van Soolingen. 1999. Analysis of rate of change of IS6110 RFLP patterns of Mycobacterium tuberculosis based on serial patient isolates. J. Infect. Dis. 180:1238-1244.
6 - Fang, Z., C. Doig, N. Morrison, B. Watt, and K. J. Forbes. 1999. Characterization of IS1547, a new member of the IS900 family in the Mycobacterium tuberculosis complex, and its association with IS6110. J. Bacteriol. 181:1021-1024.
7 - Fleischmann, R. D., D. Alland, J. A. Eisen, L. Carpenter, O. White, J. Peterson, R. DeBoy, R. Dodson, M. Gwinn, D. Haft, E. Hickey, J. F. Kolonay, W. C. Nelson, L. A. Umayam, M. Ermolaeva, S. L. Salzberg, A. Delcher, T. Utterback, J. Weidman, H. Khouri, J. Gill, A. Mikula, W. Bishai, W. R. Jacobs, Jr., J. C. Venter, and C. M. Fraser. 2002. Whole-genome comparison of Mycobacterium tuberculosis clinical and laboratory strains. J. Bacteriol. 184:5479-5490.
8 - Garnier, T., K. Eiglmeier, J. C. Camus, N. Medina, H. Mansoor, M. Pryor, S. Duthoy, S. Grondin, C. Lacroix, C. Monsempe, S. Simon, B. Harris, R. Atkin, J. Doggett, R. Mayes, L. Keating, P. R. Wheeler, J. Parkhill, B. G. Barrell, S. T. Cole, S. V. Gordon, and R. G. Hewinson. 2003. The complete genome sequence of Mycobacterium bovis. Proc. Natl. Acad. Sci. USA 100:7877-7882.
9 - Hermans, P. W., D. van Soolingen, E. M. Bik, P. E. de Haas, J. W. Dale, and J. D. van Embden. 1991. Insertion element IS987 from Mycobacterium bovis BCG is located in a hot-spot integration region for insertion elements in Mycobacterium tuberculosis complex strains. Infect. Immun. 59:2695-2705.
10 - Hermans, P. W., D. van Soolingen, and J. D. van Embden. 1992. Characterization of a major polymorphic tandem repeat in Mycobacterium tuberculosis and its potential use in the epidemiology of Mycobacterium kansasii and Mycobacterium gordonae. J. Bacteriol. 174:4157-4165.
11 - Lari, N., L. Rindi, C. Lami, and C. Garzelli. 1999. IS6110-based restriction fragment length polymorphism (RFLP) analysis of Mycobacterium tuberculosis H37Rv and H37Ra. Microb. Pathog. 26:281-286.
12 - Liu, L., K. Dybvig, V. S. Panangala, V. L. van Santen, and C. T. French. 2000. GAA trinucleotide repeat region regulates M9/pMGA gene expression in Mycoplasma gallisepticum. Infect. Immun. 68:871-876.
13 - Liu, L., V. S. Panangala, and K. Dybvig. 2002. Trinucleotide GAA repeats dictate pMGA gene expression in Mycoplasma gallisepticum by affecting spacing between flanking regions. J. Bacteriol. 184:1335-1339.
14 - Mazars, E., S. Lesjean, A. L. Banuls, M. Gilbert, V. Vincent, B. Gicquel, M. Tibayrenc, C. Locht, and P. Supply. 2001. High-resolution minisatellite-based typing as a portable approach to global analysis of Mycobacterium tuberculosis molecular epidemiology. Proc. Natl. Acad. Sci. USA 98:1901-1906.
15 - Niemann, S., E. Richter, and S. Rüsch-Gerdes. 1999. Stability of Mycobacterium tuberculosis IS6110 restriction fragment length polymorphism patterns and spoligotypes determined by analyzing serial isolates from patients with drug-resistant tuberculosis. J. Clin. Microbiol. 37:409-412.[Abstract/Free Full Text]
16 - Niemann, S., S. Rüsch-Gerdes, and E. Richter. 1997. IS6110 fingerprinting of drug-resistant Mycobacterium tuberculosis strains isolated in Germany during 1995. J. Clin. Microbiol. 35:3015-3020.[Abstract]
17 - Parniewski, P., A. Bacolla, A. Jaworski, and R. D. Wells. 1999. Nucleotide excision repair affects the stability of long transcribed (CTG*CAG) tracts in an orientation-dependent manner in Escherichia coli. Nucleic Acids Res. 27:616-623.[Abstract/Free Full Text]
18 - Poulet, S., and S. T. Cole. 1995. Repeated DNA sequences in mycobacteria. Arch. Microbiol. 163:79-86.[Medline]
19 - Ross, B. C., K. Raios, K. Jackson, and B. Dwyer. 1992. Molecular cloning of a highly repeated DNA element from Mycobacterium tuberculosis and its use as an epidemiological tool. J. Clin. Microbiol. 30:942-946.[Abstract/Free Full Text]
20 - Salfinger, M. 1996. Characteristics of the various species of mycobacteria, p. 161-170. In N. R. William and M. G. Stuart (ed.), Tuberculosis. Little, Brown and Company, New York, N.Y.
21 - Sinden, R. R. 1999. Biological implications of the DNA structures associated with disease-causing triplet repeats. Am. J. Hum. Genet. 64:346-353.
22 - Steenken, W., W. H. Oatway, and S. A. Petroff. 1934. Biological studies of the tubercle bacillus. J. Exp. Med. 60:515-543.
23 - Steenken, W. J., and L. U. Garner. 1946. History of H37 strain of tubercle bacillus. Am. Rev. Tuberc. 79:62-66.
24 - Supply, P., S. Lesjean, E. Savine, K. Kremer, D. van Soolingen, and C. Locht. 2001. Automated high-throughput genotyping for study of global epidemiology of Mycobacterium tuberculosis based on mycobacterial interspersed repetitive units. J. Clin. Microbiol. 39:3563-3571.
25 - Takahara, M., Y. Yajima, S. Miyazaki, M. Aiyoshi, T. Fujino, Y. Otsuka, J. Sekiguchi, K. Saruta, T. Kuratsuji, and T. Kirikae. 2003. Molecular epidemiology of intra-familial tuberculosis transmission. Jpn. J. Infect. Dis. 56:132-133.
26 - van Embden, J. D., M. D. Cave, J. T. Crawford, J. W. Dale, K. D. Eisenach, B. Gicquel, P. Hermans, C. Martin, R. McAdam, and T. M. Shinnick. 1993. Strain identification of Mycobacterium tuberculosis by DNA fingerprinting: recommendations for a standardized methodology. J. Clin. Microbiol. 31:406-409.
27 - Wells, R. D., M. Sarmiento, and S. T. Warren. 1998. Genetic instabilities and hereditary neurological diseases. Academic Press, New York, N.Y.
28 - Wiid, I. J., C. Werely, N. Beyers, P. Donald, and P. D. van Helden. 1994. Oligonucleotide (GTG)5 as a marker for Mycobacterium tuberculosis strain identification. J. Clin. Microbiol. 32:1318-1321.
29 - Yeh, R. W., A. Ponce de Leon, C. B. Agasino, J. A. Hahn, C. L. Daley, P. C. Hopewell, and P. M. Small. 1998. Stability of Mycobacterium tuberculosis DNA genotypes. J. Infect. Dis. 177:1107-1111.
30 - Zhang, Y., B. Heym, B. Allen, D. Young, and S. Cole. 1992. The catalase-peroxidase gene and isoniazid resistance of Mycobacterium tuberculosis. Nature 358:591-593.