TABLE 2

Strains used in the validation of the hicap method, with predicted serotype and fragmentation status of the cap locus as determined by hicap

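| Strain | Described serotype | Serotype predicted by hicap | Fragmented locus | Assembly identifier^a | Read identifier |
|---|---|---|---|---|---|
| Hi75 | a | a | No | | ERX1834398 |
| Hi76 | a | a | No | | ERX1834399 |
| Hi77 | a | a | No | | ERX1834400 |
| Hi78 | a | a | No | | ERX1834401 |
| Hi79 | a | a | No | | ERX1834402 |
| Hi609 | a | a | No | GCA_003363335.1 | |
| Hi642 | a | a | No | GCA_003363355.1 | |
| NML-Hia-1 | a | a | No | GCA_001856725.1 | |
| 10810 | b | b | No | GCA_000210875.1 | |
| ATCC 10211 | b | b | Yes | GCA_001997355.1 | |
| Hi80 | b | b | No | | ERX1834403 |
| Hi81 | b | b | Yes | | ERX1834404 |
| Hi82 | b | b | Yes | | ERX1834405 |
| Hi83 | b | b | Yes | | ERX1834406 |
| Hi84 | b | b | No | | ERX1834407 |
| NCTC 13377 | b | b | No | GCA_900478275.1 | |
| NCTC 8468 | b | b | No | NCTC 8468 (Sanger FTP) | |
| Hi85 | c | c | No | | ERX1834408 |
| Hi86 | c | c | No | | ERX1834409 |
| Hi87 | c | c | No | | ERX1834410 |
| Hi88 | c | c | No | | ERX1834411 |
| M12125 | c | c | No | GCA_003351605.1 | |
| M17648 | c | c | No | GCA_003351465.1 | |
| Hi89 | d | d | No | | ERX1834412 |
| Hi90 | d | d | No | | ERX1834413 |
| hi467 | e | e | No | GCA_001975845.1 | |
| Hi91 | e | e | Yes | | ERX1834414 |
| Hi92 | e | e | No | | ERX1834415 |
| Hi93 | e | e | No | | ERX1834416 |
| Hi94 | e | e | No | | ERX1834417 |
| Hi95 | e | e | No | | ERX1834418 |
| NCTC 8455 | e | e | No | GCA_900478735.1 | |
| Hi100 | f | f | No | | ERX1834168 |
| Hi96 | f | f | No | | ERX1834419 |
| Hi97 | f | f | No | | ERX1834420 |
| Hi98 | f | f | Yes | | ERX1834421 |
| Hi99 | f | f | No | | ERX1834422 |
| KR494 | f | f | No | GCA_000465255.1 | |
| NCTC 11394 | f | b | No | NCTC 11394 (Sanger FTP) | |
| NCTC 11426 | f | f | No | GCA_900475755.1 | |
| WAPHL1 | f | f | Yes | GCA_002237715.1 | |
| 86-028NP | NTHi | No cap locus | | GCA_000012185.1 | |
| PittEE | NTHi | No cap locus | | GCA_000016465.1 | |
| Rd KW20 | NTHi | No cap locus | | GCA_000027305.1 | |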
^a Assemblies that were available for each isolate were downloaded and screened. Assemblies obtained from the Sanger FTP (https://sanger.ac.uk/resources/downloads/bacteria/nctc) were additionally converted from GFF3 to FASTA format. Where an assembly was not available for an isolate, read sets were downloaded and assembled using SPAdes (as described in Materials and Methods) before screening. All assemblies used for testing are available through FigShare (https://doi.org/10.26180/5c352c5110712).
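
The GFF3-to-FASTA conversion mentioned in the footnote could be done along the following lines. This is a minimal sketch only: the source does not say which tool was used, and it assumes that the GFF3 files embed their sequences after a `##FASTA` directive, which the GFF3 format permits but does not require. The function name `gff3_embedded_fasta` and the example record are hypothetical.

```python
def gff3_embedded_fasta(gff3_text: str) -> str:
    """Return the FASTA records embedded after the ##FASTA directive.

    Assumption: the GFF3 text carries its sequences inline. If it does not,
    the sequences must come from a separate file and this raises ValueError.
    """
    lines = gff3_text.splitlines()
    for index, line in enumerate(lines):
        if line.strip() == "##FASTA":
            # Everything after the directive is FASTA; drop blank lines.
            records = [l for l in lines[index + 1:] if l.strip()]
            if not records or not records[0].startswith(">"):
                raise ValueError("##FASTA section contains no sequence records")
            return "\n".join(records) + "\n"
    raise ValueError("no ##FASTA directive found in GFF3 text")


# Hypothetical one-contig GFF3 record, for illustration only.
example = """##gff-version 3
##sequence-region contig1 1 12
contig1\tProdigal\tCDS\t1\t12\t.\t+\t0\tID=gene1
##FASTA
>contig1
ATGAAACCCTAA
"""

fasta = gff3_embedded_fasta(example)
print(fasta, end="")
```

The resulting FASTA text can then be written to a file and screened in the same way as the assemblies downloaded directly in FASTA format.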