Sequence Type 631 Vibrio parahaemolyticus, an Emerging Foodborne Pathogen in North America

Vibrio parahaemolyticus is the leading seafood-transmitted bacterial pathogen worldwide. It causes gastroenteritis and, rarely, lethal septicemia. The estimated 45,000 annual cases of foodborne V. parahaemolyticus infections in the United States are concerning because incidence is rising despite control measures, in part due to the impact of changing climate on pathogen abundance and distribution (1; https://www.cdc.gov/vibrio/). Although the pandemic complex of strains of sequence type 3 (ST3) (serotype O3:K6) has dominated infections worldwide (2), in the United States and Canada, the most prevalent clinical strains are of ST36 (O4:K12), which recently spread from the Pacific into the Atlantic (3-8).
Here we report that a new lineage of V. parahaemolyticus, identified as ST631, is rapidly emerging as the predominant pathogenic clade endemic to the Atlantic coast of North America (3, 4, 8). The first reported ST631 genome came from a clinical case that occurred in Louisiana in 2007 and was traced to oysters from Florida (8). In 2009, a second ST631 clinical isolate was reported in Prince Edward Island, Canada (O11:KUT) (4). From 2010 to 2015, the incidence of infections by strains of ST631 increased, with 35 confirmed cases reported in four Atlantic coastal U.S. states (Table 1), where they are second only to ST36 strains in prevalence. Because infections are self-limiting and underreported (9), ST631 infections may be more widespread.
Genome comparisons were used to understand the potential relationships of ST631 strains, which share no recent ancestry with and differ substantially from ST36 and ST3 strains (>3,600 out of 3,909 shared genes contained variation). ST631 has a virulence gene profile similar to that of ST36 in that it harbors tdh, trh, and a type 3 secretion system (T3SS2) and is urease positive. We applied a core genome multilocus typing (cgMLST) scheme to draft genomes of 37 clinical isolates and 1 environmental isolate (Table 1) representing the geographic and time spans of infections. This analysis identified 132 single nucleotide polymorphisms (SNPs) in the population and confirmed that clinical ST631 isolates are clonal, with limited diversification (Fig. 1). Within the ST631 population, 97% of the core genes are identical, whereas less than 8% of the core genes are identical between ST631, ST36, and ST3 strains. Both maximum-likelihood phylogeny and minimum spanning tree analysis indicated a mixed population (Fig. 1A and B). Most isolates grouped within one clonal complex, with only a few divergent isolates (Fig. 1B). This population structure suggests that this pathogenic lineage recently evolved and that its distribution may have expanded along the North American Atlantic Coast (10).
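The "percentage of identical core genes" statistic quoted above can be illustrated with a minimal sketch. It assumes each isolate is represented as a list of cgMLST allele numbers in a fixed locus order; the function name `fraction_identical_loci` and the toy profiles are hypothetical and are not data or code from this study.

```python
from typing import Dict, List


def fraction_identical_loci(profiles: Dict[str, List[int]]) -> float:
    """Return the fraction of loci whose allele number is the same in every isolate."""
    rows = list(profiles.values())
    if not rows or not rows[0]:
        raise ValueError("at least one isolate with at least one locus is required")
    n_loci = len(rows[0])
    if any(len(row) != n_loci for row in rows):
        raise ValueError("all profiles must cover the same loci in the same order")
    # zip(*rows) walks the profiles locus by locus (column-wise).
    identical = sum(1 for alleles in zip(*rows) if len(set(alleles)) == 1)
    return identical / n_loci


# Toy allele profiles (hypothetical): five loci, one of which varies.
toy_profiles = {
    "isolate_A": [1, 1, 1, 1, 1],
    "isolate_B": [1, 1, 2, 1, 1],
    "isolate_C": [1, 1, 1, 1, 1],
}
print(fraction_identical_loci(toy_profiles))
```

In this toy set, four of the five loci are shared by all isolates, so the function returns 0.8. Applied to real cgMLST profiles, the same calculation would give a figure comparable to the 97% and 8% values reported here, provided the loci and missing-data handling match the scheme actually used.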
The fact that an increasing number of cases trace to sources in the northwestern Atlantic suggests that ST631 poses a mounting public health threat and calls for surveillance of this lineage to reduce illnesses. That its emergence coincided with warming ocean trends in some areas of the northwestern Atlantic (2) and invasion by a nonresident pathogen indicates that a changing climate may be driving pathogen dynamics (1, 2, 3, 7). However, this does not eliminate the potential of anthropogenic influences on the dissemination of ST631 strains, whose continued population expansion may increase human health risk beyond North America.
Accession number(s). Sequences were deposited in the Sequence Read Archive under accession numbers SRR1952988, SRR4016797, SRR4016801, SRR4018053, SRR4032168 to SRR4032182, SRR4032354 to SRR4032363, SRR4035056, and SRR4090622 to SRR4090626.
Fig. 1. (A) [...] Table 1) demonstrates the highly clonal nature of pathogenic ST631 isolates, which are colored by year and marked by geographic distribution. The scale bar represents the average number of nucleotide substitutions per site, and branches with greater than 60% bootstrap support are labeled. (B) A minimum spanning tree analysis reflecting the relationships among ST631 isolates based on core gene SNP differences further demonstrates the clonal population structure. The numbers above the connected lines (not to scale) represent SNP differences. The isolates are colored by year of isolation using the same color scheme as in panel A.
Cluster analysis of ST631 was performed with a custom cgMLST analysis in Ridom SeqSphere+ software v3.2.1 (Ridom GmbH, Münster, Germany). Briefly, the cgMLST software first defines a cgMLST scheme using the cgMLST target definer tool with default settings. MAVP-Q was used as the reference genome (4,568 genes). Then, five other V. parahaemolyticus genomes (strains BB22OP, CDC_K4557, FDA_R31, RIMD 2210633, and UCM-V493) were used for comparison with the reference genome to establish the core and accessory genome genes. Genes that are repeated in more than one copy in any of the six genomes were removed from the analysis. Subsequently, a task template that contains both core and accessory genes was created. Each individual gene locus from MAVP-Q was assigned allele number 1. Then each individual ST631 V. parahaemolyticus genome assembly was queried against the task template, during which any locus that differed from the reference genome or any other queried genome was assigned a new allele number. For the cgMLST, a gene-by-gene analysis of all core genes (excluding accessory genes) was performed and SNPs were identified within different alleles to establish genetic distance calculations.
PEI, Prince Edward Island; FL, Florida.
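The allele-numbering and minimum spanning tree steps described above can be sketched as follows. This is an illustration under stated assumptions, not the SeqSphere+ implementation: loci are compared as exact sequence strings, a missing locus is coded as 0, and distance is the number of loci with different alleles rather than the SNP counts used in the study. All names and sequences are hypothetical.

```python
from typing import Dict, List, Tuple

Profile = Dict[str, int]


def assign_alleles(reference: Dict[str, str],
                   queries: Dict[str, Dict[str, str]]) -> Dict[str, Profile]:
    """Give reference sequences allele 1; each previously unseen sequence gets the next number."""
    known = {locus: {seq: 1} for locus, seq in reference.items()}
    profiles: Dict[str, Profile] = {}
    for name, genome in queries.items():
        profile: Profile = {}
        for locus in reference:
            seq = genome.get(locus)
            if seq is None:
                profile[locus] = 0  # assumption: 0 marks a locus missing from the assembly
                continue
            table = known[locus]
            if seq not in table:
                table[seq] = len(table) + 1  # new allele number for a new sequence
            profile[locus] = table[seq]
        profiles[name] = profile
    return profiles


def allele_distance(a: Profile, b: Profile) -> int:
    """Count loci present in both profiles that carry different alleles."""
    return sum(1 for locus in a if a[locus] and b[locus] and a[locus] != b[locus])


def minimum_spanning_tree(profiles: Dict[str, Profile]) -> List[Tuple[str, str, int]]:
    """Prim's algorithm over pairwise allele distances; returns (from, to, distance) edges."""
    names = sorted(profiles)
    if not names:
        return []
    in_tree = {names[0]}
    edges: List[Tuple[str, str, int]] = []
    while len(in_tree) < len(names):
        # Pick the cheapest edge linking the current tree to an isolate outside it.
        best = min(
            ((u, v, allele_distance(profiles[u], profiles[v]))
             for u in sorted(in_tree) for v in names if v not in in_tree),
            key=lambda edge: edge[2],
        )
        edges.append(best)
        in_tree.add(best[1])
    return edges


# Toy input (hypothetical sequences, three loci, three isolates).
reference = {"geneA": "ATGAAA", "geneB": "ATGCCC", "geneC": "ATGGGG"}
queries = {
    "iso1": {"geneA": "ATGAAA", "geneB": "ATGCCC", "geneC": "ATGGGG"},
    "iso2": {"geneA": "ATGAAT", "geneB": "ATGCCC", "geneC": "ATGGGG"},
    "iso3": {"geneA": "ATGAAT", "geneB": "ATGCCA", "geneC": "ATGGGG"},
}
profiles = assign_alleles(reference, queries)
tree = minimum_spanning_tree(profiles)
for u, v, d in tree:
    print(u, v, d)
```

For the toy input, iso2 and iso3 share a second allele of geneA, and iso3 alone carries a second allele of geneB, so the tree links iso1 to iso2 and iso2 to iso3 with one differing locus on each edge. Replacing `allele_distance` with a per-locus SNP count would bring the sketch closer to the distance described in the text.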

ACKNOWLEDGMENTS
Letter to the Editor Journal of Clinical Microbiology