Serotype and Genotype (Multilocus Sequence Type) of Streptococcus suis Isolates from the United States Serve as Predictors of Pathotype

Streptococcus suis is a significant cause of mortality in piglets and growing pigs worldwide. The species contains pathogenic and commensal strains, with pathogenic strains causing meningitis, arthritis, endocarditis, polyserositis, and septicemia. Serotyping and multilocus sequence typing (MLST) are primary methods to differentiate strains, but the information is limited for strains found in the United States.

Disease caused by Streptococcus suis is a significant economic and welfare concern in the swine industry. S. suis is a Gram-positive bacterium, and the species contains pathogenic and commensal strains. Pathogenic S. suis strains are associated with meningitis, arthritis, endocarditis, polyserositis, and septicemia in piglets and growing pigs (1,2), and S. suis strains isolated from neurological or systemic tissues (brain/meninges, joints, and heart) are commonly considered the primary pathogens (2)(3)(4). Commensal strains normally reside in the upper respiratory tract of pigs, with pigs commonly serving as carriers (1,5,6). S. suis can be an opportunistic pathogen associated with coinfections with other bacterial and viral pathogens (2,3). In addition, some S. suis strains have zoonotic potential, causing meningitis in humans (7).
A 1992 United States study investigated the serotype distribution of S. suis in porcine samples from Minnesota and reported the prevalence of serotypes 2 to 9 and 11, of which serotype 2 was the predominant serotype associated with neurological disease (3). A 1993 U.S. study identified serotypes 1 to 8 and 1/2 in naturally infected pigs primarily from a single state, with serotype 2 being the predominant serotype, followed by serotypes 3, 4, 7, 8, 1, 5, 1/2, and 6 (13). A large U.S. study in 2009 investigated the serotype distribution of S. suis strains collected from 2003 to 2005 from 17 states, illustrating that the distribution of strains was similar to that in Canada (14). In both countries, serotypes 1/2, 2, 3, 7, and 8 were most prevalent in diseased pigs (14,15), which differs from the distribution in Europe, where serotype 2 accounts for a considerably higher percentage of isolates than in North America (16).
MLST is a nucleotide sequence-based technique for subtyping bacteria, and a standard MLST scheme has been developed for S. suis, with 1,161 registered sequence type (ST) profiles as of 28 February 2019 (17) (pubmlst.org). Global MLST studies of S. suis identified ST1, ST25, and ST28 as the most prevalent STs in swine (18)(19)(20)(21). In North America, ST25 and ST28 are more common among strains recovered from diseased animals, while ST1 strains are more prevalent in Europe and Asia (18,20,22). However, these studies address MLST for serotype 2 strains and may not apply to the remaining serotypes.
Previously, studies have classified isolates into pathotypes based on clinical information and site of isolation (3,4). Our objective was to combine information on pathotype with serotype and ST to address the limited information on current S. suis strains circulating within the United States. In total, 208 porcine S. suis isolates from North America were characterized by serotyping and MLST to determine the population and distribution of S. suis in the United States. Furthermore, the serotype and MLST data were used to investigate associations with the pathogenic and commensal pathotypes, with the goal of identifying pathogenic- and commensal-specific serotype and MLST patterns. Identifying the major disease-causing strains can promote the development of treatment and control plans. Our research seeks to identify pathogenic strains to track isolates in an outbreak, select strains for a vaccine, and develop effective treatment and control plans.

Selection of S. suis isolates.
A total of 208 S. suis isolates were selected for the project. Most of the S. suis isolates were obtained from routine diagnostic cases submitted between April 2014 and July 2017 to the University of Minnesota Veterinary Diagnostic Laboratory (UMNVDL) or the Kansas State Veterinary Diagnostic Lab (KSVDL). Further commensal isolates were collected from 9 different farms with a lack of systemic S. suis clinical disease. Isolates that met our pathotype criteria (defined below) were selected from as many states as possible (n = 20) to minimize sample bias and increase geographic diversity to represent the major regions of the U.S. swine industry. S. suis isolates were verified to the species level by matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) (Microflex device, Bruker Daltonics GmbH, Germany) (23).
Multiple isolates may be recovered from healthy pigs due to the native microflora of the upper respiratory tract, while a single isolate is generally responsible for systemic infections (24). To limit the bias in isolating and selecting strains associated with clinical signs, a pathotype category system was developed for the S. suis isolates similar to previously published methods (4,25). "Pathogenic" isolates were obtained from the brain/meninges, joint, heart, or liver and reported as the primary cause of meningitis, arthritis, epicarditis, or septicemia in diagnostic reports by pathologists. "Possibly opportunistic" isolates were from lung samples submitted to the diagnostic lab from pigs without signs of neurological or systemic disease and included two isolates from nasal samples from farms with a clinical outbreak of S. suis disease. "Commensal" isolates were from laryngeal, tonsil, or nasal samples retrieved from farms with no known history or current control methods for S. suis disease.
Serotyping, MLST via whole-genome sequencing. Isolates were recultured for 24 to 48 h at 37°C on blood agar plates (tryptic soy agar [TSA] with 5% sheep blood) (Thermo Fisher Scientific, Waltham, MA, USA) and sent for serotyping to the bacterial serology laboratory at the Diagnostic Service of the Faculty of Veterinary Medicine of the Université de Montréal, Canada. The serotyping was done through the coagglutination test with reference antisera (26)(27)(28)(29). Nontypeable samples (samples which failed to react with the serum panel, autoagglutinated, or reacted to several sera) were further serotyped by PCR (30), a technique that cannot differentiate serotype 2 from 1/2 and serotype 1 from 14.
The S. suis DNA was extracted using the protocol for cultured cells from the QIAamp DNA kit (Qiagen Inc., Germantown, MD, USA) and submitted to the University of Minnesota Genomic Center (UMGC, St. Paul, MN, USA) for library preparation using Nextera XT (Illumina, San Diego, CA), and next-generation sequencing was performed on a HiSeq 2500 instrument (Illumina) with 250-bp paired-end reads. Illumina sequencing reads for each isolate were processed using Trimmomatic (31) with an average quality cutoff of 20 (2.3 million average reads per sample). Strains were again confirmed as S. suis by having a 96.6% to 100% nucleotide identity to the 1,662-bp S. suis-specific recombination/repair protein (recN) sequence (Streptococcus suis 05HAS68, GenBank accession number CP002007) using the S. suis serotyping pipeline (32).
In silico MLST analysis was performed using the Short Read Sequence Typing for Bacterial Pathogens (SRST2) program (http://katholt.github.io/srst2), which maps reads to MLST references (33). The ST allele sequences and profiles were obtained from the S. suis MLST database (https://pubmlst.org/ssuis/) (34). Novel ST allele sequences were confirmed by PCR amplification and Sanger sequencing of the aroA, cpn60, dpr, gki, mutS, recA, or thrA genes (17). The primers used for the amplification and sequencing of the mutS gene were mutS forward (5'-AAGCAGGCAGTCGGCGTGGT-3') and mutS reverse (5'-AGTACAAACTACCATGCTTC-3') as described (35). STs were grouped into major clonal complexes (CCs) using the entire MLST database and the eBURST software (36). Groups were defined with the strict parameters for determining single-locus variants (match of 6 or more loci). The entire S. suis MLST database was displayed as a single eBURST diagram by setting the group definition to zero of seven shared alleles.
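The ST assignment step described above can be illustrated with a minimal sketch. It assumes each isolate has already been reduced to one allele number per locus (as a read-mapping tool would report) and looks the seven-allele profile up in a table of known profiles. The function name `assign_st`, the `toy_db` table, and its allele numbers are hypothetical and are not real PubMLST entries; this is not the SRST2 implementation.

```python
from typing import Dict, Optional, Tuple

# The seven loci of the S. suis MLST scheme, in the order listed in the text.
LOCI = ("aroA", "cpn60", "dpr", "gki", "mutS", "recA", "thrA")

Profile = Tuple[int, ...]


def assign_st(alleles: Dict[str, Optional[int]],
              known_profiles: Dict[Profile, int]) -> str:
    """Return 'ST<n>' for a known profile, 'novel' for an unseen one,
    or 'NF' when at least one locus was not found."""
    # A missing locus (None or absent key) makes the ST undeterminable.
    if any(alleles.get(locus) is None for locus in LOCI):
        return "NF"
    profile = tuple(alleles[locus] for locus in LOCI)
    st = known_profiles.get(profile)
    return f"ST{st}" if st is not None else "novel"


# Hypothetical profile table, for illustration only.
toy_db = {
    (1, 1, 1, 1, 1, 1, 1): 101,
    (1, 2, 1, 1, 1, 1, 1): 102,
}

isolate = dict(zip(LOCI, (1, 2, 1, 1, 1, 1, 1)))
print(assign_st(isolate, toy_db))
```

A profile absent from the table is reported as "novel", which corresponds to the cases that the text says were confirmed by PCR and Sanger sequencing before submission to the database.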
MLST clustering analysis. Alignments, sequence identity calculations, and construction of the MLST sequence identity heatmap for basic clustering analysis were performed with R software (v.3.4.3) (37) and R packages (38)(39)(40)(41). The concatenated sequences of the seven MLST alleles were aligned with MUSCLE (v.3.8.31) (42), and sequence identities were calculated. The sequence identity scores were used to generate a heatmap based on Euclidean distances and neighbor-joining clustering.
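As a rough illustration of the sequence identity calculation, the sketch below computes pairwise percent identity between already-aligned concatenated sequences. The authors performed this step in R; the Python function names here are hypothetical, and the choice to skip alignment columns containing a gap is an assumption, since the text does not state how gaps were handled. The heatmap and clustering steps are not reproduced.

```python
from itertools import combinations
from typing import Dict, Tuple


def percent_identity(a: str, b: str) -> float:
    """Percent of compared alignment columns in which two sequences agree.

    Assumption: columns with a gap ('-') in either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def identity_matrix(aligned: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Pairwise identities for every unordered pair of named sequences."""
    return {
        (p, q): percent_identity(aligned[p], aligned[q])
        for p, q in combinations(sorted(aligned), 2)
    }


# Toy aligned sequences, for illustration only.
toy = {"iso1": "ACGTACGT", "iso2": "ACGTACGA", "iso3": "AC-TACGT"}
for pair, value in identity_matrix(toy).items():
    print(pair, round(value, 1))
```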
Statistical analysis. Basic data transformation and plotting for statistical analyses were performed using R software and R packages (43)(44)(45). Ternary plots of subtypes and pathotypes were generated using the R package Ternary (v.1.0.2) (46). The pathotype boundaries were assigned and color-coded using 50% as a cutoff. Odds ratio (OR) analysis was used to test all pathotype-subtype combinations containing more than a single isolate, and 95% confidence intervals (CIs) were generated using Fisher's exact test. For each combination, a 2-by-2 table was created comparing that pathotype and subtype against all others. Similar 2-by-2 tables were generated for testing pathotype and serotype-ST combinations by chi-square and Fisher's exact tests. ORs greater than 1 with a 0.3 minimum lower limit were considered biologically significant. The minimum lower limit of 0.3 was calculated as the average lower limit among the combinations, is specific to our data set, and was selected for the identification of biologically meaningful relationships. An infinite (Inf) OR for a pathotype-subtype combination refers to a subtype that occurred in only one pathotype. The associations within and between types were investigated using multiple correspondence analysis (MCA), with the FactoMineR (v.1.41) and factoextra (v.1.0.5) packages (47,48), by setting the serotype, ST, and pathotype as the three variables.
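The one-versus-all 2-by-2 construction can be sketched as follows. This is a simplified illustration, not the authors' R workflow: it computes the sample odds ratio (ad/bc) and a two-sided Fisher's exact P value from the hypergeometric distribution, and it does not reproduce the 95% confidence intervals. The function names are hypothetical. The worked numbers use counts given in the text (ST28: 52 isolates, 42 of them pathogenic; 139 pathogenic isolates among 208), so the resulting value need not match the estimate plotted in the figures.

```python
from math import comb, inf
from typing import Tuple


def two_by_two(n_both: int, n_subtype: int, n_pathotype: int,
               n_total: int) -> Tuple[int, int, int, int]:
    """Build (a, b, c, d) for one subtype and one pathotype against all others."""
    a = n_both                                       # subtype, pathotype
    b = n_subtype - n_both                           # subtype, other pathotypes
    c = n_pathotype - n_both                         # other subtypes, pathotype
    d = n_total - n_subtype - n_pathotype + n_both   # neither
    return a, b, c, d


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Sample odds ratio; infinite when the subtype occurs in only one pathotype."""
    if b * c == 0:
        return inf if a * d > 0 else float("nan")
    return (a * d) / (b * c)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value: sum of table probabilities
    no larger than that of the observed table, margins held fixed."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x: int) -> float:
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    lo, hi = max(0, row1 - (n - col1)), min(row1, col1)
    p_obs = prob(a)
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


table = two_by_two(n_both=42, n_subtype=52, n_pathotype=139, n_total=208)
print(table, round(odds_ratio(*table), 2), fisher_exact_two_sided(*table))
```

Returning an infinite value when a zero cell appears in the denominator mirrors the "Inf" convention described in the text for subtypes that occur in a single pathotype.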
Data availability. The reads associated with the samples were deposited in the NCBI Sequence Read Archive under accession numbers SRR9123061 to SRR9123268 (see Table S1).

Serotype and ST distributions of S. suis in the United States. Characterization of S. suis isolates by serotyping and MLST.
A total of 208 S. suis isolates were characterized, of which 203 were from the United States, 4 from Canada, and 1 from Mexico (Fig. 1). The clinical history and tissue of origin of the isolates were used to determine the pathotype, and the 208 isolates were classified as pathogenic (n = 139), possibly opportunistic (n = 47), and commensal (n = 22) (Table 1). The recN segment from S. suis was identified in the whole-genome sequences of all the 208 strains (>99% coverage of the gene and 40× to 314× depth), indicating that the isolates were S. suis.
In silico MLST analyses were performed on the WGS data, and the samples had an average depth of 155× across the seven loci. STs could not be determined for four isolates because one housekeeping gene necessary for MLST classification was not identified in these isolates (referred to as NF, see Table S1 in the supplemental material). Fifty-eight different STs were identified for the remaining 204 isolates, indicating high diversity among the isolates (see Table S3 in the supplemental material). Twenty of these STs were previously defined, while 38 were newly identified (961 to 969, 971 to 998, and 1001; n = 56). The predominant ST was ST28 (n = 52), followed by ST94 (n = 18), ST1 and ST108 (n = 17 each).
Relationship between serotypes and STs. The distribution of STs by serotype illustrated the diversity of the S. suis strains (Fig. 2). Fifteen of the 20 serotypes identified contained multiple STs, with the number of different STs within a single serotype ranging from 2 to 8. The predominant serotype 1/2 contained three STs (ST28 [n = 44], ST961 [n = 8], and ST982 [n = 1]). Serotypes 8, 14, 24, 28, and 29 contained a single ST each, namely, ST87, ST1, ST94, ST968, and ST972, respectively. However, serotypes 24, 28, 29, and 1or14 contained only a single isolate.
Associations among pathotypes, serotypes, and STs by analysis of proportions and OR. Associations between pathotype and serotype. Proportions and OR analyses were used to investigate pathotype associations with serotype for serotypes (proportions) or serotype-pathotype combinations (OR analysis) that contained more than one isolate. Between 80% and 100% of serotypes 1, 1/2, 2, 7, 14, and 23 were classified as the pathogenic pathotype (Fig. 5A), and these associations were supported by OR analysis (Fig. 5B). In the ternary plot, serotypes 3, 5, and 9 demonstrated a moderate association with the pathogenic pathotype, with 56% to 63% of isolates classified as pathogenic. However, the association between pathotype and serotype was not supported by OR analysis. OR analysis supported associations of serotypes 10 and 12 with the possibly opportunistic pathotype, with 67% of isolates classified as possibly opportunistic in the ternary plot. Serotypes 21 and 31, with 67% to 80% of isolates classified as commensal in the ternary plot, were supported as commensal pathotypes by OR analysis.
FIG 3 Distribution of S. suis pathotypes by serotype. The stacked histogram illustrates the serotypes identified in this study, which were subdivided by pathotype (pathogenic, possibly opportunistic, and commensal). The x axis represents each serotype while the y axis represents the frequency of each pathotype. Bar sections are labeled with their respective pathotypes. The category 1or14 and NT (nontypeable) represents isolates with serotypes that could not be differentiated by coagglutination, PCR, or WGS.
Associations between pathotype and ST. Proportions and OR analysis were used to investigate pathotype associations with ST for STs (proportions) or ST-pathotype combinations (OR analysis) that contained more than one isolate. The ternary plot of the 58 STs (and the NF category) illustrated a clear differentiation by pathotype for all STs except ST87 and ST119 (approximately 50% pathogenic) (Fig. 6A). Twelve STs and the NF category contained over 75% of isolates classified as pathogenic, including ST1, ST13, ST25, ST28, ST29, ST94, ST108, ST117, ST225, ST373, ST961, and ST977, which demonstrated the same associations by OR (Fig. 6B). ST969 had an association with the possibly opportunistic pathotype, which was supported by OR. The commensal pathotype demonstrated a strong association with ST750 and ST821, which was supported by OR analysis.
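The proportion-based assignment used for the ternary plots can be sketched as below: pathotype counts for one subtype are converted to proportions, and the subtype is assigned to a pathotype only when one proportion exceeds the 50% cutoff. The function names are hypothetical, and treating the cutoff as strictly greater than 50% is an assumption. The example counts are those reported in this article for CC750 (5 commensal and 1 possibly opportunistic isolate).

```python
from typing import Dict, Optional

PATHOTYPES = ("pathogenic", "possibly opportunistic", "commensal")


def pathotype_proportions(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert per-pathotype isolate counts for one subtype into proportions."""
    total = sum(counts.get(p, 0) for p in PATHOTYPES)
    if total == 0:
        raise ValueError("subtype has no isolates")
    return {p: counts.get(p, 0) / total for p in PATHOTYPES}


def dominant_pathotype(counts: Dict[str, int],
                       cutoff: float = 0.5) -> Optional[str]:
    """Return the pathotype whose share exceeds the cutoff, else None.

    Assumption: the cutoff is strict, so an exact 50/50 split is unassigned.
    """
    props = pathotype_proportions(counts)
    best = max(props, key=props.get)
    return best if props[best] > cutoff else None


cc750 = {"commensal": 5, "possibly opportunistic": 1}
print(pathotype_proportions(cc750))
print(dominant_pathotype(cc750))
```

With these counts the commensal share is 5/6, which rounds to the 83% reported for CC750.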
Odds ratio and MCA of pathotypes, serotypes, and STs. Initially, OR was used to investigate the relationships between pathotype and serotype-ST combinations, but significant relationships were lacking for the combinations (OR data not shown). Then, MCA was performed to analyze the possible relationships among all serotypes, STs, and pathotypes (Fig. 7). The first and second dimensions of the analysis represent only 6% of the variance in the data. The ellipses represent 95% of isolates in each pathotype. All the subtypes demonstrating a strong association with the pathogenic pathotype by OR analysis (Fig. 5 and 6) fell within the overlapping 95% ellipses for multiple pathotypes by MCA (Fig. 7). Five serotypes and 13 STs in the commensal pathotype lacked overlapping ellipses. Serotypes 21 and 31 lacked any isolates with the pathogenic pathotype (Fig. 3), while ST750 and ST821 contained only isolates with the commensal pathotype (Fig. 4). The limited representations of the MCA data (6% variance) and the overlapping ellipses indicate a lack of relationship between serotype, ST, and pathotype, highlighting potential confounding factors for predicting pathogenic isolates based on both serotyping and MLST together. Thus, the relationship between pathotype, serotype, and ST is lacking for the pathogenic and possibly opportunistic pathotypes.
... 3) and typical threshold (OR, 1) for identifying significant ORs. Error bars represent the 95% confidence intervals. Inf, infinite. Nontypeable (NT) represents isolates which could not be serotyped using coagglutination, PCR, or WGS.

Associations between pathotype and MLST CC by analysis of proportions and OR. Identification of S. suis CCs.
To investigate the population structure of our S. suis isolates by MLST, the STs were assigned into CCs defined by eBURST, using the entire S. suis MLST database and our 58 STs (Fig. 8 and Table S3). Using the stringent definition (six of seven shared alleles) for defining a CC, five CCs (CC1, CC28, CC94, CC104, and CC750) with a primary founder were identified from our set of STs. However, multiple STs (n = 30) did not form a CC or formed a CC without a primary founder (Table 2). The most diverse CC (CC94) contained isolates from 13 of the 28 STs assigned into CC, compared with CC1, CC28, CC104, and CC750, which contained isolates from 4, 7, 1, and 3 STs, respectively.
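The grouping rule behind the stringent CC definition can be sketched with a small union-find routine: two STs are linked when they share at least six of seven alleles, and linked STs are merged into one group. This is only an approximation of the grouping step, not the eBURST software itself, and it does not predict primary founders. The function names and the toy allele profiles are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Profile = Tuple[int, ...]


def shared_alleles(p: Profile, q: Profile) -> int:
    """Number of loci at which two allele profiles carry the same allele."""
    return sum(x == y for x, y in zip(p, q))


def group_by_slv(profiles: Dict[str, Profile],
                 min_shared: int = 6) -> List[List[str]]:
    """Merge STs linked by at least `min_shared` identical alleles."""
    parent = {st: st for st in profiles}

    def find(st: str) -> str:
        # Follow parent links to the group root, compressing the path.
        while parent[st] != st:
            parent[st] = parent[parent[st]]
            st = parent[st]
        return st

    for s, t in combinations(profiles, 2):
        if shared_alleles(profiles[s], profiles[t]) >= min_shared:
            parent[find(s)] = find(t)

    groups: Dict[str, List[str]] = {}
    for st in profiles:
        groups.setdefault(find(st), []).append(st)
    # Largest groups first, names sorted within each group.
    return sorted((sorted(g) for g in groups.values()),
                  key=lambda g: (-len(g), g))


# Hypothetical profiles, for illustration only.
toy_profiles = {
    "ST_A": (1, 1, 1, 1, 1, 1, 1),
    "ST_B": (1, 2, 1, 1, 1, 1, 1),  # single-locus variant of ST_A
    "ST_C": (1, 2, 3, 1, 1, 1, 1),  # single-locus variant of ST_B
    "ST_D": (9, 9, 9, 9, 9, 9, 9),  # singleton
}
print(group_by_slv(toy_profiles))
```

Setting `min_shared=0` links every pair and yields a single group, which is analogous to displaying the entire database as one diagram with the group definition set to zero of seven shared alleles.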
Associations between pathotype and CC. Patterns between CC and pathotype were investigated by proportions and OR analysis. CC1, CC28, CC94, and CC104 were associated with the pathogenic pathotype, and the association was supported by OR analysis (Fig. 9). CC750 was associated with the commensal pathotype and was supported by OR analysis, with 83% of isolates classified as the commensal pathotype. The STs among the group of isolates lacking a CC did not associate with any pathotype.
CC1 was divided into two groups and clustered with CC750 and isolates without a CC. The first cluster of CC1 contained a concentration of isolates in the pathogenic pathotype (n = 17/28), while the second cluster contained 4 pathogenic isolates, 6 possibly opportunistic isolates, and a single isolate with the commensal pathotype. Lacking a CC, the ST13 isolates (n = 5; serotype 1 [n = 4] and serotype 1or14 [n = 1]) clustered with CC1 isolates, demonstrating a possible genetic relatedness to isolates of CC1 and the pathogenic pathotype. Serotypes 1, 2, and 14 and ST1 and ST13 were also associated with isolates of the pathogenic pathotype by proportions and OR. Inversely, CC750 (n = 6) consisted of isolates with the commensal (n = 5) and possibly opportunistic (n = 1) pathotypes and was predominantly composed of isolates characterized as nontypeable (n = 5/6) and ST750 (n = 4/6). Interestingly, CC750 was closely related to the group of isolates lacking a CC (n = 31), which consisted of isolates with the commensal pathotype (n = 12/31, multiple serotypes and novel STs), providing further evidence for the association between CC750 and the commensal pathotype.

DISCUSSION
S. suis is an important swine pathogen, often resulting in neurological and systemic disease caused by pathogenic strains. However, much is still unknown about the population structure of S. suis in the United States. In this study, we utilized serological and molecular typing techniques to investigate the serotype and ST distributions of U.S. isolates. Fourteen of the 20 S. suis serotypes identified in this study were recovered from pigs with clinical disease (n = 139). The predominant pathogenic serotypes identified in this study were 1/2 (n = 45), 7 (n = 19), and 2 (n = 14), which have been previously identified as the predominant serotypes from diseased pigs in North America (14,15,49,50). While serotypes 2 and 3 are considered predominant pathogenic serotypes in North America, only 10.6% of the strains in our study were recovered from diseased pigs. Furthermore, the serotype distribution from our study differed from European studies, in which serotypes 2 and 9 are predominant (50, 51). The higher prevalence of serotype 1/2 in North America could be due to a common evolutionary lineage with serotype 2. Genetic analysis by PCR-based serotyping of the cps loci demonstrated serotypes 1/2 and 2 share the same genetic profile and cannot be differentiated by serotype-specific cps loci (11,12). Sequencing of the cpsK gene reveals a missense mutation permitting the differentiation of serotypes 2 and 1/2 (12), but a PCR protocol has not been implemented yet to differentiate these serotypes.
In our study, the S. suis isolates originated from 20 different states (Table S1), which represent the major swine-producing states in the United States. Variability in the serotype distribution of S. suis has been reported within the same country, which is likely due to natural differences in geographic distribution (13). Geographic distribution of the S. suis serotypes in our study identified serotype 1/2 in 13 of the 20 states, with a concentration in 5 of the 20 states, possibly displaying a geographic distribution pattern of serotype 1/2 in the United States. Serotype 1/2 is also a frequent serotype found in Canada, although at lower levels than serotypes 2 and 3 (52). This prevalence of serotype 1/2 in Canada may contribute to the U.S. serotype distribution through the transport of pigs between the two countries (50). Transport of livestock has been associated with geographic invasion or the emergence of a pathogen in a novel geographic area (53)(54)(55). While most pigs transported to the United States go to harvest facilities, new breeding stock could be colonized with new S. suis strains, which could result in the spread of new strains to downstream swine farms. Whole-genome analysis of the U.S. and Canadian serotype 1/2 strains would further clarify the relationship between U.S. and Canadian 1/2 strains.
... (3). The five CCs are indicated by black brackets, with the number of isolates in the CC. Blue brackets represent clusters of isolates without a CC. Nontypeable (NT) represents isolates which could not be serotyped using coagglutination, PCR, or WGS. #, group of isolates lacking a CC; +, ST13 not within a CC but closest to CC1; ~, ST979 not within a CC but closest to CC94.
We anticipated identifying a large number of novel ST profiles due to the inclusion of commensal and possibly opportunistic samples, which are not generally subjected to subtyping by MLST. As a result of this study, 38 novel ST profiles were submitted to the S. suis MLST database. Of the 58 STs identified here, 24 STs were isolated from pigs with clinical disease, and the predominant STs were ST28 (n = 42), followed by ST1 (n = 17), ST94 (n = 14), and ST108 (n = 14). In a previous Canadian study in 2011, ST25 was the predominant ST found in Canada, while ST28 was the predominant ST found in the United States (22). Our results confirm ST28 as a predominant ST within the pathogenic pathotype, while ST25 represents only 1% of the strains recovered from diseased pigs (n = 2). The reason for this low percentage of ST25 isolates in the United States is unclear, and updated ST analysis of S. suis strains from Canada is needed to confirm ST25 as the predominant ST in that country. Our ST distribution also differs from that of European and Asian countries in which ST1 strains, largely characterized as serotype 2, are predominant in diseased pigs (50,56).
Proportions, OR, and clustering analysis illustrated potential relationships among pathotypes, serotypes, and STs. While multiple pathogenic serotypes and STs were identified in our study, this discussion focuses on serotypes and STs with more than four isolates in the pathogenic pathotype. Serotypes 1, 1/2, 2, 7, 14, and 23 as well as ST1, ST13, ST28, ST94, ST108, ST961, and ST977 were frequently identified as pathogenic strains. Based on our pathotype classifications, isolates characterized as pathogenic were linked to neurological or systemic disease, and our analyses provide evidence that these subtypes are potential indicators of virulence. As discussed previously, serotypes 2 and 1/2 are predominant serotypes identified from diseased pigs in North America, supporting our observations of these serotypes as pathogenic strains by proportions, OR, and clustering analysis (14,15,49,50,52).
Serotypes 1 and 7 are more prevalent in diseased pigs in some European countries than in North America, and pathogenic serotype 1 strains have been linked to the production of muramidase-released protein (MRP), extracellular-factor protein (EF), and suilysin (SLY). Pathogenic serotype 1 strains have been characterized as producing both MRP and EF, with variable production of SLY (16,18). In one study (18), four of the six serotype 1 strains were MRP+ EF+ SLY+ and five of the six were either ST1 or ST13, indicating a correlation between serotype 1, ST1, ST13, and virulence. Interestingly, the serotype 1 isolates in the current study were either ST1 (n = 7/11) or ST13 (n = 4/11) and were associated with the pathogenic pathotype, supporting the previous study. Serotype 7 was the second-most common serotype identified in this study, and 19/23 isolates were characterized as the pathogenic pathotype. Virulence studies on serotype 7 strains demonstrating clinical disease in pigs are limited, but a previous in vivo study associated serotype 7 with septicemia and arthritis, with rare cases of meningitis (57). These findings support the classification of serotype 7 as pathogenic.
This study demonstrates that ST appears to be a stronger predictor of pathotype than serotype. While experimental mouse models have demonstrated the virulence of serotype 2 ST1, ST25, and ST28 (22,56), our analyses also illustrated ST1, ST13, ST28, ST94, ST108, ST961, and ST977 (of various serotypes) as pathogenic. As mentioned previously, we hypothesize that Canadian and U.S. serotype 2 and serotype 1/2 strains share an evolutionary lineage. If so, the observed virulence of serotype 2 ST28 in previous studies may support the virulence of serotype 1/2 ST28, as predicted in our study. Whole-genome single nucleotide polymorphism (SNP)-based phylogenetic analysis of S. suis serotype 2 ST28 strains revealed a unique clade composed of virulent strains capable of inducing severe disease in a murine infection model (58). These strains differed in virulence from reference serotype 2 ST28 strains of low virulence. Recently, a study characterized pathogenic Australian serotype 1/2 ST1 strains by core genome single nucleotide polymorphisms and linked the genetic similarity to pathogenic serotype 1/2 ST1 strains from the United Kingdom and Vietnam (59). Our clustering analysis indicates that ST1, ST13, ST94, ST108, ST961, and ST977 may also be pathogenic. It would be of interest to further investigate the virulence properties of serotype 1/2 ST28, as well as ST1, ST13, ST94, ST108, ST961, and ST977 strains isolated in the United States.
In addition to strains in CC1, CC28, and CC104, serotype 9 strains belonging to CC16 (previously CC87) have been isolated from pigs with invasive disease (20). However, the low percentage of serotype 9 strains in our study is reasonable because serotype 9 is predominant in diseased pigs from the Netherlands (16). The serotype 9 strains in this study belong to multiple CCs or occur as singletons and did not demonstrate associations with pathotype. Serotype 9 isolates from diseased and healthy pigs in China were characterized into multiple STs and demonstrated high diversity among the isolates (60). The majority of these serotype 9 isolate STs occurred as singletons and did not form major clonal complexes.
Inversely, commensal S. suis serotypes 21 and 31 and ST750 and ST821 were identified by proportions, OR, and cluster analysis. Studies on S. suis from North America have observed a prevalence of serotype 21 from healthy pigs (26,27). However, previous studies have identified a limited number of serotype 31 strains from pigs with typical clinical signs of S. suis disease (49,52,61,62). The association between serotype 31 and pathotype remains unclear and requires further investigation.
Associations among serotypes, STs, and pathotypes, although identified by individual analyses, were not evident in the MCA, indicating that serotype and ST together could not indicate pathotype. We investigated additional approaches, such as chi-square and Fisher's exact tests, but these tests failed to generate significant relationships between both serotype and ST. In addition, we investigated associations between serotype-ST combinations and pathotype by chi-square and Fisher's exact tests and did not identify any significant associations. One possible explanation for this is the lack of discrimination due to the limitations of sample size within each subtype. Traditional chi-square and Fisher's exact tests work best on nonsparse data (few zero values) (63,64). These tests have been used to identify associations between S. suis subtypes and characteristics of pathogenicity. However, most studies involved a limited number of subtypes of interest, while our study focused on all serotypes and STs identified in our sample set. Due to the diversity of the S. suis strains in this study and the large number of subtypes evaluated, the division of our data by pathotype resulted in sparse data. Thus, sparse data limit our ability to conduct certain analyses using common approaches for S. suis. An OR formula was used to evaluate statistical significance of subtype with pathotype, as well as the size of the possible effect, to limit the misidentification of associations due to sample size. For this reason, proportions were used for basic identification of relationships and OR analysis was used for further discrimination of strains.
In summary, our study increases the knowledge on S. suis strains circulating in the United States between 2014 and 2017 by investigating serotype and ST distributions. We identified a diverse set of strains, predominantly serotypes 1/2, 3, and 7 and ST1, ST28, and ST94. Further investigation by pathotype classification (defined in this study) identified STs that could be differentiated as pathogenic or commensal pathotypes. The predominance of serotype 1/2 strains from clinically affected pigs in our study stresses the importance of expanding studies of virulence traits to other serotypes and STs of S. suis. These findings can be applied to improve the prevention and control of S. suis by selecting strains for diagnostics and vaccine development.

SUPPLEMENTAL MATERIAL
Supplemental material for this article may be found at https://doi.