ABSTRACT
The study of surface protein antigens of group B streptococci (GBS) is important for understanding of the pathogenesis and epidemiology of infection, and several of these antigens have been proposed as components of GBS conjugate vaccines. In a previous study, we developed a novel PCR-and-sequencing system for identification of GBS serotypes and serosubtypes based on the capsular polysaccharide synthesis (cps) gene cluster. In this study, we used published sequences to develop PCR assays for identification of genes encoding GBS surface proteins including C alpha (bca), C alpha-like proteins 2 and 3 (alp2 and alp3), Rib (rib), and C beta (bac). We showed that the prototype R reference strain, Prague 25/60, contained a novel alpha-like protein antigen gene (the proposed alp4), which presumably encodes an atypical, but antigenically similar, R-like protein. Initial evaluation of these gene-specific assays showed excellent specificity. By combining cps serotypes, serosubtypes, and surface protein gene profiles, we were able to divide 224 GBS isolates into 31 serovariants. GBS bac-positive strains could be further subtyped into 11 groups and 20 subgroups. Our results confirmed and extended reported associations between some cps serotypes and serosubtypes, on the one hand, and surface protein genes, on the other: serosubtypes III-1 and III-2 were associated with rib, serosubtype III-3 with alp2, serotype Ib with bca and bac, and serotype V with alp3. The associations between serotype Ia and bca, bca repetitive unit, and bca repetitive unit-like sequence-containing genes need to be studied further. These PCR-based methods will provide an alternative and objective tool for subtyping of GBS based on surface protein antigen genes.
Group B streptococci (GBS)—Streptococcus agalactiae—are the commonest cause of neonatal and obstetric sepsis and an increasingly important cause of septicemia in the elderly and in immunocompromised patients (25). There are nine GBS capsular polysaccharide serotypes (based on the capsular polysaccharide synthesis [cps] gene cluster), which vary in their distribution among geographic areas, disease types, and patient age groups (5, 8). Capsular polysaccharides are important virulence factors and epidemiological markers and are the main components of conjugate vaccines. For studies of epidemiology and pathogenesis, it is important to identify as many phenotypic or molecular markers as possible to increase the discriminatory power of typing systems (6). In addition to capsular polysaccharide antigens, GBS surface protein antigens, which also contribute to the pathogenesis of GBS disease and induce protective immunity, are potentially useful markers (16). Their use in polysaccharide conjugate vaccines is under investigation (4, 17). Identification of surface protein antigens, combined with cps serotyping, allows subdivision of GBS strains into a large number of serovariants, which can facilitate epidemiological, pathogenetic, and other related studies of GBS infection (15).
The genes encoding the C alpha protein (bca), C alpha-like proteins 2 and 3 (alp2 and alp3), and the Rib protein (rib) have been well studied, and their gene sequences have been published in GenBank (16, 23, 29). They are members of a family of surface proteins containing repetitive elements, which produce variations in protein size and antigenicity (16, 29). The gene encoding the C beta or IgA binding protein (bac) has also been well described (7, 9). C alpha, C beta, and Rib proteins have all been proposed as potential vaccine components (17, 18, 22).
Numerous methods have been used to identify GBS surface antigens by using monoclonal (24) or polyclonal antibodies (3) or genes by hybridization with probes (27), PCR (19, 20, 21), and/or sequencing (16, 20). PCR-based methods are attractive because of their high discriminatory power and reproducibility (20). PCR methods to detect C alpha and C beta protein genes have been published (19, 20, 21), but the specificity, clinical application, and interpretation of these methods require further study. Specific PCR methods to identify genes encoding Rib and C alpha-like proteins 2 and 3, which are present in the more-virulent serotypes, III and V, have not yet been described (16, 29). Associations between cps serotypes and some protein antigens have been described (15, 27). They are likely to vary over time and in different populations and geographic locations (5, 8) and should be useful for studies of the epidemiology and pathogenesis of GBS infection.
In this study, we used published sequences of surface protein antigen genes, including bca, bac, alp2, alp3, and rib (7, 9, 16, 23, 29), to improve and/or develop protein gene-specific PCR assays. We used these assays to examine the distribution and variation of surface protein genes, and their associations with cps genes, in a large collection of GBS isolates collected over the past decade in Australia and New Zealand.
MATERIALS AND METHODS
GBS isolates, serotyping, and serosubtyping. The isolates used in this study and the serotype and serosubtype identification methods have been described in detail elsewhere (14). Isolates included well-characterized reference panels kindly provided by Lawrence Paoletti, Channing Laboratory, Boston, Mass. (serotypes Ia to VIII; reference panel 1), and Diana Martin, Streptococcus Reference Laboratory, Institute of Environmental Science and Research, Porirua, Wellington, New Zealand (serotypes Ia to VI; reference panel 2), and 206 clinical isolates. All isolates were serotyped by conventional and molecular methods, and some were serosubtyped by PCR and sequencing. Antisera used for serotyping were prepared against serotypes Ia, Ib, Ic, and II through VIII and against the R protein antigen. The prototype R reference strain Prague 25/60 was used to raise the R antiserum.
Oligonucleotide primers. Oligonucleotide primers used in this study, their target sites in the gene sequences, and their melting temperatures (Tm) are shown in Table 1. The primers were synthesized according to our specifications by Sigma-Aldrich (Castle Hill, New South Wales, Australia). Six previously published oligonucleotide primers (19, 20, 21) and a series of new primers designed by us were used to sequence parts of genes encoding GBS surface proteins and/or to specifically amplify these genes. With the exception of two new primers used only for sequencing rib and the six previously published primers (used unmodified), all primers were designed with a high Tm (>70°C) for use in rapid-cycle PCR (Table 1).
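The text does not state how the Tm values in Table 1 were calculated. As a rough illustration only, the following Python sketch applies a widely used GC-content approximation, Tm ≈ 64.9 + 41 × (G + C − 16.4)/N, and checks the result against the >70°C design target. The choice of formula, the function names, and the example sequence are all assumptions made for illustration; the sequence is invented and is not one of the primers in Table 1.

```python
def estimate_tm(primer: str) -> float:
    """Rough Tm (deg C) from GC content; an approximation, not the authors' method."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def suits_rapid_cycle(primer: str, min_tm: float = 70.0) -> bool:
    """True if the estimated Tm exceeds the >70 deg C design target in the text."""
    return estimate_tm(primer) > min_tm


# Hypothetical 30-mer, invented for illustration (not a primer from Table 1).
example_primer = "GCGGCTGCAGCCGTTGACGGCACTGCGGCA"
print(f"{estimate_tm(example_primer):.1f}", suits_rapid_cycle(example_primer))
```

Nearest-neighbor thermodynamic models generally give better estimates for primers of this length; the simple formula is used here only because it is easy to follow.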
TABLE 1. Oligonucleotide primers used in this study
DNA preparations and PCR. DNA was prepared from GBS cultures (21), and PCR was performed as previously described (11, 12, 14). Denaturation, annealing, and elongation temperatures and times used were 96°C for 1 s, 45 to 72°C (according to the primer Tm or as previously described) for 1 s, and 74°C for 1 to 30 s (according to the length of amplicons), respectively, for 30 to 35 cycles, using a Perkin-Elmer Thermal Cycler 9600. Ten microliters of PCR products was analyzed by electrophoresis on 1.5% agarose gels, which were stained with 0.5 μg of ethidium bromide ml−1. For detection and/or subtyping, the presence of PCR amplicons of the expected lengths, shown by UV transillumination, was accepted as positive. For sequencing, 40-μl volumes of PCR products were further purified by the polyethylene glycol precipitation method (1).
Sequencing. To confirm the specificity of newly designed or modified primer pairs, we sequenced 10, 13, and 10 selected amplicons produced by bcaS1-bcaA (targeting the 5′ end of bca), ribS1-ribA3 (targeting rib), and GBS1360S-GBS1937A (targeting bac), respectively, from the two panels of reference strains and 31 randomly selected clinical isolates. All amplicons of primer pairs bcaS1-balA (targeting alp2 and alp3), bal23S1-bal2A2 (targeting alp2), and IgAagGBS-RIgAagGBS (targeting bac) from any of the 224 isolates were sequenced.
PCR products were sequenced using Applied Biosystems Taq DyeDeoxy terminator cycle-sequencing kits according to standard protocols. The corresponding amplification primers or inner primers were used as the sequencing primers.
Database similarity searching and sequence comparison. Databases were searched for sequence similarity by using the FastA program in the SeqSearch program group. Sequences were compared using the Bestfit and Gap programs in the Comparison program group. The Translate program in the Translation program group was used to translate from DNA sequences to amino acid sequences. All programs are provided in WebANGIS (Australian National Genomic Information Service), version 3.
Surface protein gene profile codes. Each isolate was given a protein gene profile code according to positive PCR results using various primer pairs, as shown in Table 2.
TABLE 2. Specificity and expected lengths of amplicons produced using different primer pairs
Nucleotide sequence accession numbers. The sequences generated during this study have appeared in GenBank with the following accession numbers: AF367974 (partial bac sequence, with an insertion sequence, IS1381, from one isolate), AF362685 to AF362704 (partial bac sequences for all bac-positive isolates), and AF373214 (partial proposed new alpha-like protein 4 gene [alp4] for reference strain Prague 25/60, an R protein standard strain).
Previously published gene sequences used in this study and their GenBank accession numbers are as follows: M97256 (bca), X58470 and X59771 (bac), U58333 (rib), AF208158 (alp2), AF291065 to AF291072 (alp3), and AF064785 (IS1381).
RESULTS
PCR results. With few exceptions, all primer pairs produced amplicons of the predicted lengths from isolates giving positive results (Table 2). Figure 1 shows a representative gel. The exceptions included one isolate that was positive by PCR using primer pairs GBS1360S-GBS1937A and GBS1717S-GBS1937A (both targeting bac) but produced amplicons significantly longer than those of other bac-positive isolates. Sequencing showed that the amplicon contained the insertion sequence IS1381 with minor variations from the published sequences (28). The amplicons produced using primers IgAagGBS-RIgAagGBS and IgAS1-IgAA1 (also targeting bac) varied in length (2) and were sequenced for further subtyping (see below and Table 3).
FIG. 1. Results of PCR amplification using the primer pair bcaS2-bcaA (targeting and specific for the 5′ end of bca). Lanes M, molecular weight marker φX174 DNA/HinfI; lanes 1 to 7, seven serotype Ia isolates, one of which (lane 4) was positive for the 5′ end of bca; lanes 8 to 14, seven serotype II isolates, three of which (lanes 10, 12, and 13) were positive for the 5′ end of bca.
TABLE 3. Genetic groups and subgroups of bac (C beta protein gene) based on amplicon length (using primers IgAagGBS and RIgAagGBS) and sequence heterogeneity
Evaluation of the protein gene-specific primer pairs by direct sequencing of PCR amplicons. All 10 amplicons of primer pair bcaS1-bcaA and 12 of 13 amplicons (except that from strain Prague 25/60 [see below]) of primer pair ribS1-ribA3 were identical with the corresponding portions of the gene sequences in GenBank (M97256 [bca] and U58333 [rib], respectively). Four of 10 amplicons of primer pair GBS1360S-GBS1937A (targeting bac) were identical with the corresponding gene sequence in GenBank (X58470, X59771). A single point mutation (A to G at position 1441 of X59771) was found in the remaining six bac-positive amplicons, including the one that contained the insertion sequence IS1381 (see above and GenBank accession number AF367974).
Fifty isolates produced amplicons with primer pair bcaS1-balA. The sequences of 9 were identical with the corresponding portion of the published sequence of alp2 (AF208158), and those of 41 were identical with that of alp3 (AF291065). There are two consistent heterogeneity sites between alp2 and alp3 in the sequences of bcaS1-balA amplicons, which can be used to distinguish them, in addition to alp2- and alp3-specific PCR. All nine amplicons obtained with primer pair bal23S1-bal2A2 were identical with the corresponding portion of the alp2 sequence in GenBank (AF208158). Primer pair IgAagGBS-RIgAagGBS identified bac in 52 isolates. There was considerable sequence variation, which allowed separation of bac-positive isolates into 11 groups and 20 subgroups based on amplicon length and sequence heterogeneity, respectively (Table 3). The groups contained small numbers (1 to 5) of isolates except for B1 (20 isolates, 2 subgroups) and B4 (11 isolates, 3 subgroups). In general, the presence or absence of short repetitive sequences was responsible for differences in amplicon length (2, 9).
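The two-level bac subtyping described above (groups by amplicon length, subgroups by sequence heterogeneity) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure actually used: the function name is hypothetical, the toy sequences are invented and far shorter than real amplicons, and subgroups are defined here simply as distinct exact sequences within a length group.

```python
from collections import defaultdict


def group_amplicons(amplicons):
    """Group isolates by amplicon length, then subgroup by exact sequence."""
    groups = defaultdict(lambda: defaultdict(list))
    for isolate, seq in amplicons.items():
        groups[len(seq)][seq.upper()].append(isolate)
    # Convert to plain dicts so the result prints and compares cleanly.
    return {length: dict(subgroups) for length, subgroups in groups.items()}


# Invented toy sequences; real amplicons are hundreds of bases long.
toy = {
    "iso1": "ACGTACGT",
    "iso2": "ACGTACGT",
    "iso3": "ACGTACGA",      # same length as iso1, one base differs: new subgroup
    "iso4": "ACGTACGTACGT",  # longer amplicon: new group
}
result = group_amplicons(toy)
n_groups = len(result)
n_subgroups = sum(len(sub) for sub in result.values())
print(n_groups, n_subgroups)
```

Applied to the 52 bac-positive amplicons, the same two counts would correspond to the 11 groups and 20 subgroups reported in Table 3, provided that group membership is determined by length alone.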
Further confirmation of specificity of surface protein gene-specific primer pairs. To confirm primer pair specificity, we compared the results of PCR using the primer pairs we had designed or modified for bac PCR with those of PCR using previously published primer pairs (19, 21) and found 100% correlation.
The previously reported nonspecificity of the published primer pair bcaRUS-bcaRUA (targeting the bca repetitive unit) was confirmed (20). When these primers were used, all nine alp2-positive (bcaS1-bcaA-negative) isolates and 53 isolates that were PCR negative with primer pairs bcaS1-bcaA and bcaS2-bcaA (targeting the 5′ end of bca) and with primer pairs bal23S1-bal2A2 and bal23S2-bal2A1 (targeting the 5′ end of alp2) produced amplicons. Our sequencing showed that bca and alp2 have significant homology in the regions targeted by bcaRUS-bcaRUA, allowing amplicon formation from alp2-positive strains (16, 20). These false-positive results could be due to the presence of other C alpha-like protein genes containing regions homologous with the bca repetitive unit (bca repetitive unit-like sequence).
We also showed that the results of PCR using two or more primer pairs that we had designed for individual genes (rib, alp2, and alp3) correlated well, supporting the specificity of each set. The only exception, as mentioned above, was ribS1-ribA3, which produced a nonspecific amplicon from 1 of 224 isolates tested.
Prague 25/60 contains another new alpha-like surface protein antigen gene, alp4 (proposed). Strain Prague 25/60 (which is used to raise the R antiserum), in reference panel 2, produced an amplicon with primer pair ribS1-ribA3 but not with ribS2-ribA1, ribS2-ribA2, or ribS2-ribA3. It was therefore assumed not to contain rib, although the amplicon sequence showed considerable homology with rib and other members of the family of surface proteins (see below). This isolate was the only one, of 224 tested, for which PCRs were negative using ribS2-ribA1 and ribS2-ribA2 but positive using ribS1-ribA3. The latter primer pair is, then, not entirely specific for rib and was therefore used only for sequencing.
Sequencing of the Prague 25/60 ribS1-ribA3 amplicon showed considerable homology with other members of the surface protein gene family defined by bca-rib. The homologous regions were located at the 5′ ends of the genes (for bca, the positions are in the region between nucleotides 251 and 559, and for the corresponding amino acid sequence, they are between positions 58 and 160 of the sequence identified by GenBank accession number M97256). The ratios of similarity (excluding the primer sequences) to DNA sequences of bca, rib, alp2, and alp3 were 66.7, 64.4, 62.5, and 62.5%, respectively. For the corresponding amino acid sequences, similarity ratios were 62.1, 60.2, 57.3, and 57.3%, respectively. Since this amplicon sequence is most similar to that of bca, which encodes C alpha, the prototype of the surface protein family, we propose that the gene be named alp4 (C alpha-like protein antigen 4 gene). Cloning and sequencing of the whole gene were beyond the scope of this study.
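The similarity ratios above were obtained with the sequence comparison programs named in Materials and Methods. To make the quantity concrete, the Python sketch below computes a simple percent identity over a pair of sequences that are already aligned. It rests on assumptions: identity is counted per alignment column, a gap never counts as a match, and the denominator is the full alignment length. The programs used in the study may define similarity differently, and the toy sequences are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical residues (gaps never match)."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("inputs must be non-empty aligned sequences of equal length")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Invented 10-column alignment: one mismatch and one gap.
toy_a = "ATGGCT-AAC"
toy_b = "ATGACTTAAC"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}%")
```

Computing the same measure for the amplicon against each of bca, rib, alp2, and alp3 and taking the highest value mirrors the reasoning used to relate the proposed alp4 to bca.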
Surface protein gene profiles. For each GBS surface protein gene (except the bca repetitive unit and bca repetitive unit-like region), we selected two primer pairs to identify and characterize it by PCR. Four common profiles accounted for 203 of 224 (90.6%) isolates: R (62 isolates), AaB (51 isolates), a (49 isolates), and alp3 (41 isolates) (see Table 4). Only two isolates contained no surface protein gene markers. All but one isolate with the bac gene (B) also had bca with its repetitive unit (Aa); one had rib (R). All alp2 isolates contained single bca repetitive unit-like sequences (as). A, R, alp2, alp3, and the proposed new protein type alp4 were all mutually exclusive. Sixty-two of 63 isolates with rib (R) and 41 of 41 isolates with alp3 had no other protein antigen gene markers.
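The way marker results combine into profile codes such as AaB or alp2as can be sketched in Python as follows. The marker symbols (A, a, as, B, R, alp2, alp3, alp4) are those used in the text, but the concatenation order and the function name are assumptions chosen so that the codes quoted in the text come out as written; the authoritative mapping from primer pairs to markers is the one in Table 2. The final lines recompute, from the counts given above, the share of isolates covered by the four common profiles.

```python
# Assumed concatenation order, chosen so that codes quoted in the text
# (e.g. "AaB", "alp2as") come out as written; Table 2 is the real reference.
MARKER_ORDER = ["A", "R", "alp2", "alp3", "alp4", "a", "as", "B"]


def profile_code(positive_markers: set) -> str:
    """Join the markers an isolate is PCR positive for into one profile code."""
    unknown = positive_markers - set(MARKER_ORDER)
    if unknown:
        raise ValueError(f"unknown markers: {sorted(unknown)}")
    return "".join(m for m in MARKER_ORDER if m in positive_markers)


# Counts of the four common profiles, taken from the text.
common_profiles = {"R": 62, "AaB": 51, "a": 49, "alp3": 41}
total_isolates = 224
common_total = sum(common_profiles.values())
common_share = 100.0 * common_total / total_isolates

print(profile_code({"A", "a", "B"}), profile_code({"alp2", "as"}))
print(common_total, f"{common_share:.1f}%")
```

An isolate with no positive markers yields an empty string in this sketch; how such isolates are labeled in Table 4 is not stated in the text.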
TABLE 4. Relationships between GBS protein gene profiles and capsular polysaccharide synthesis (cps) gene molecular serotypes and serosubtypes
Relationship between surface protein antigen gene profiles and cps serotypes and serosubtypes. Development of the molecular serotype (MS) identification method and comparison with conventional serotyping (CS) have been described elsewhere (14). A cps MS was assigned to all isolates, and the results correlated with CS results except for 19 of 224 isolates that were nontypeable by use of antisera. Relationships between surface protein gene profiles and cps molecular serotypes are summarized in Table 4.
The following strong associations were confirmed or demonstrated: MS Ia with bca repetitive unit or bca repetitive unit-like sequence (most with profile a), MS III-1 and III-2 with rib, MS III-3 with alp2, MS Ib with bca and bac, and MS V with alp3. MS II showed the most varied surface protein gene profiles. However, the relationships were not absolute, and different combinations of polysaccharide cps serotypes and protein gene profiles produced 31 serovariants, or 51 when bac (B) subgroups were considered.
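Counting serovariants amounts to counting distinct combinations of molecular serotype and protein gene profile, optionally extended by the bac subgroup. The Python sketch below illustrates this with a handful of invented isolates. The serotype and profile labels follow the text, but the records themselves and the subgroup labels ("s1", "s2") are hypothetical and are not data from Table 4.

```python
def count_serovariants(isolates) -> int:
    """Count distinct combinations of the typing markers recorded per isolate."""
    return len({tuple(record) for record in isolates})


# Invented records: (molecular serotype, protein gene profile).
toy_isolates = [
    ("III-1", "R"),
    ("III-2", "R"),
    ("III-3", "alp2as"),
    ("Ib", "AaB"),
    ("Ib", "AaB"),
    ("V", "alp3"),
]

# Same isolates with a hypothetical bac subgroup label as a third marker
# (None for bac-negative isolates); the two Ib/AaB isolates now differ.
toy_with_subgroups = [
    ("III-1", "R", None),
    ("III-2", "R", None),
    ("III-3", "alp2as", None),
    ("Ib", "AaB", "s1"),
    ("Ib", "AaB", "s2"),
    ("V", "alp3", None),
]

without_subgroups = count_serovariants(toy_isolates)
with_subgroups = count_serovariants(toy_with_subgroups)
print(without_subgroups, with_subgroups)
```

Adding a further marker can only keep the count the same or raise it, which is why the number of serovariants in this study increases from 31 to 51 when bac subgroups are taken into account.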
Relationship between surface protein antigens and protein gene profiles. Based on CS, 33 isolates (belonging to CS Ia/c, Ib/c, IIc, IIb, IIIc, or IIIb) reacted with the C antiserum. The surface protein gene profiles of all of these isolates included bca (A) and/or bca repetitive unit-related (a or as) markers as follows: Aa (3 isolates), AaB (18 isolates), a (11 isolates), and alp2as (1 isolate). Twenty-nine isolates reacted with the R antiserum; of these, 22 contained rib and 6 contained alp3. The remaining isolate was Prague 25/60 (the reference strain used to raise the R protein antiserum), which contained the new presumed alpha-like protein 4 gene, alp4 (see above).
DISCUSSION
In our previous study, all the isolates used in the present study were serotyped by conventional and molecular methods that identified their cps serotypes and, in some cases, serosubtypes (14). In this study we developed PCR-based methods to identify GBS surface protein genes and further characterize these isolates. Using the published bac sequence, we modified bac-specific primers (19, 21) and designed new primers, with high Tm (>70°C), suitable for rapid-cycle PCR (10, 13) and targeting all major surface protein genes.
As previously reported, a published PCR primer pair targeting the bca repetitive unit (at the 3′ end of bca) was not entirely specific for bca (20). We designed two new primer pairs targeting the 5′ end of bca in order to improve the specificity. However, very few serotype Ia strains gave positive results with these two primer pairs, whereas all were PCR positive using a primer pair targeting the bca repetitive unit (20). These results were consistent with a previous report (6) that a probe targeting the 5′ end of bca hybridized with only one of nine serotype Ia strains, whereas a large bca probe, including the tandem repeat region, hybridized with all nine. Further study is required to define the sequences and specificities of different portions of bca and their effects, if any, on the structure and functions of C alpha and related proteins.
PCR primers specific for rib, alp2, and alp3 have not been described previously. The primer pairs we designed mainly targeted the 5′ ends of the genes and were chosen after comparison of their heterogeneity with related gene sequences (16, 29). We designed two or more primer pairs for each gene in order to check primer specificity by comparing the results of different PCRs targeting the same genes. Protein gene profiles alp2 and alp3 were distinguished on the basis of the alp2- and alp3-specific PCR and/or two sequence heterogeneity sites in the amplicons of bcaS1-balA or bcaS2-balA.
To confirm the specificities of our primers, we used them to examine two reference panels and selected GBS isolates. The longest amplicons produced by PCR for each gene were sequenced in order to provide maximal sequence information and ensure that the inner primers were not located at strain heterogeneity sites. Our sequencing results confirmed the specificities of the primers. Two pairs of primers for each gene were compared, with similar results. Finally, six gene- or region-specific primer pairs (including the one targeting the bca repetitive unit) were used to define protein antigen gene profiles for all 224 isolates.
The study showed that only one member of the surface protein gene family containing repetitive sequences—rib, bca, alp2, or alp3 (or, presumably, alp4)—was present in any single isolate (15, 16). However, all isolates containing bac, which is not a member of the surface protein gene family containing repetitive sequences, also contained either bca (51 of 52 isolates) or rib (1 of 52 isolates) (15).
The C beta protein gene, bac, was present in 23% of isolates, a proportion similar to those (19 to 22%) previously reported (2). In common with others, we found variations in the bac (2) amplicons due to variable small internal repetitive sequences (2, 9) that, unlike those of the bca-rib family, were irregular. Their role is not clear, but they are potentially useful molecular markers for epidemiological studies (2, 7).
Our study confirmed previously reported relationships between cps serotypes and surface protein gene profiles (16). For example, some serotype III isolates (our MS III-1 and III-2) were closely associated with rib (26), and others (our MS III-3) were closely associated with alp2 (16). Serotype Ib was associated with bca and bac (15), and serotype V was associated with alp3 (16). However, as the relationship was not absolute, different combinations of cps serotypes and protein gene profiles identified many serovariants, which will be useful in epidemiological studies and in the formulation of conjugate vaccines (15, 16). Based on PCR only, we were able to divide our 224 isolates into 31 serovariants based on bac (B) groups or into 51 serovariants, based on subgroups. Theoretically, there are likely to be additional serovariants.
Comparison of protein antigen (C and R proteins) serotyping results with protein gene profiles showed that the presence of the gene does not necessarily indicate the expression of the corresponding protein. This is one reason for discrepancies between genetic and serotyping results; another is that C and R protein antisera are not entirely specific (16). Our analysis showed that reaction with the C antiserum generally correlated with the presence of genes encoding C alpha (bca) or alpha-like protein 2 (alp2). Reaction with the R antiserum correlated with the presence of genes encoding Rib protein (rib), alpha-like protein 3 (alp3), or the presumed new, rare alpha-like protein 4 (alp4) (found in this study). Apparently, antigenic cross-reactivity does not necessarily reflect genetic similarity, since the alp4 sequence studied was more similar to that of bca than to that of rib or alp3. More-extensive analysis of these genes and the relationships between the proteins they encode is required.
These methods will be useful in further studies of the effects of various antigen profiles on virulence and to further define the genealogy of GBS serotypes and various subtypes.
ACKNOWLEDGMENTS
We thank Ansuiya Sharma, Rebecca Hoile, Leanne Montgomery, and David Smith for their help in culturing GBS strains, Moana Ngatai and Julie Morgan for serotyping of some isolates, and Mark Wheeler for valuable help in sequencing.
FOOTNOTES
- Received 8 August 2001.
- Returned for modification 23 September 2001.
- Accepted 18 November 2001.
- Copyright © 2002 American Society for Microbiology