ABSTRACT
Broad testing for respiratory viruses among persons under investigation (PUIs) for SARS-CoV-2 has been performed inconsistently, limiting our understanding of alternative viral infections and coinfections in these patients. RNA metagenomic next-generation sequencing (mNGS) offers an agnostic tool for the detection of both SARS-CoV-2 and other RNA respiratory viruses in PUIs. Here, we used RNA mNGS to assess the frequencies of alternative viral infections in SARS-CoV-2 RT-PCR-negative PUIs (n = 30) and viral coinfections in SARS-CoV-2 RT-PCR-positive PUIs (n = 45). mNGS identified all viruses detected by routine clinical testing (influenza A [n = 3], human metapneumovirus [n = 2], human coronavirus OC43 [n = 2], and human coronavirus HKU1 [n = 1]). mNGS also identified both coinfections (1, 2.2%) and alternative viral infections (4, 13.3%) that were not detected by routine clinical workup (respiratory syncytial virus [n = 3], human metapneumovirus [n = 1], and human coronavirus NL63 [n = 1]). Among SARS-CoV-2 RT-PCR-positive PUIs, lower cycle threshold (CT) values correlated with greater SARS-CoV-2 read recovery by mNGS (R², 0.65; P < 0.001). Our results suggest that current broad-spectrum molecular testing algorithms identify most respiratory viral infections among SARS-CoV-2 PUIs, when available and implemented consistently.
INTRODUCTION
The coronavirus disease 2019 (COVID-19) pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has precipitated a massive global health and economic crisis, claiming over a million lives since its emergence in Wuhan, China, in December 2019 (1). Broad testing for respiratory viruses among persons under investigation (PUIs) for SARS-CoV-2 is performed inconsistently, limiting our understanding of alternative viral infections and coinfections in these patients. We evaluated the performance of RNA metagenomic next-generation sequencing (mNGS) to detect SARS-CoV-2, coinfections, and alternative viral infections in PUIs for SARS-CoV-2 during the first 2 months of the pandemic in the state of Georgia.
MATERIALS AND METHODS
Samples and clinical testing. A convenience sample set was selected from 75 PUIs who were tested for SARS-CoV-2 in the Emory Healthcare system between 26 February and 23 April 2020 (spanning the first detection of SARS-CoV-2 infection in Georgia on 2 March and in the Emory Healthcare system on 3 March). Clinical data including laboratory results were extracted by chart review. Flu/RSV PCR (Cepheid, Sunnyvale, CA) and multiplex respiratory panels, including the eSensor respiratory viral panel (RVP; GenMark Diagnostics, Inc., Carlsbad, CA), the BioFire FilmArray respiratory pathogen panel, and the BioFire FilmArray pneumonia panel (BioFire Diagnostics, LLC, Salt Lake City, UT), had been performed at the discretion of treating physicians. The eSensor RVP includes the following RNA respiratory viruses: influenza, respiratory syncytial virus (RSV), parainfluenza, human metapneumovirus (hMPV), and rhinovirus. The BioFire FilmArray panels include the same RNA respiratory viruses as described above with the addition of human coronaviruses (HCoV). Residual upper respiratory tract (nasopharyngeal) swab samples were retrieved from 4°C storage in the clinical laboratory within 72 h of collection and stored at −80°C until sequencing. The study was approved by the institutional review board at Emory University.
Library preparation and sequencing. Since most respiratory pathogens that present similarly to SARS-CoV-2 are RNA viruses, we focused this analysis on detection of RNA respiratory viruses. RNA was extracted from the primary sample, and repeat SARS-CoV-2 testing was performed by triplex RT-PCR (2). Samples underwent RNA mNGS as previously described (3, 4). Briefly, methods included DNase treatment, random primer cDNA synthesis, and Nextera XT tagmentation. A water sample was included as a negative control with each library construction batch, and stringent laboratory protocols were followed to minimize the risk of contamination. Libraries were made with unique dual indexes, and the indexes used for samples from SARS-CoV-2-negative PUIs had never been used in the laboratory previously. Samples underwent Illumina sequencing to a median (interquartile range [IQR]) depth of 42 (26 to 60) million reads per sample. Of the 75 samples, 74 (99%) had more than 1 million reads, and 66 (88%) had more than 10 million reads.
Data analysis. Reads underwent taxonomic classification by KrakenUniq, as implemented in viral-ngs (5). For computational confirmation, reads that were taxonomically assigned to a human respiratory virus species by KrakenUniq were mapped to the corresponding RefSeq reference sequence in Geneious Prime V2020.1.1 and manually inspected (see Fig. S1 in the supplemental material). Reads that were taxonomically assigned to a potential human respiratory virus family by KrakenUniq, but not classified to the species level, were first classified to the species level by BLASTn and then mapped to the corresponding RefSeq reference sequence, as described above. Some reads were assigned to a virus by KrakenUniq but failed to map to the reference sequence in Geneious; these reads underwent final classification by BLASTn, and in all cases were nonviral.
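The confirmation workflow described above can be summarized as a small decision routine. The following is only a schematic sketch: the function name, the `Step` labels, and the string values for the taxonomic rank are hypothetical and do not correspond to any KrakenUniq, Geneious, or BLAST interface. It records which manual steps a read set would pass through, and nothing more.

```python
from enum import Enum
from typing import List


class Step(Enum):
    """Hypothetical labels for the confirmation steps named in the text."""
    MAP_TO_REFSEQ = "map to RefSeq reference and inspect manually"
    BLASTN_THEN_MAP = "classify to species by BLASTn, then map to RefSeq"
    BLASTN_FINAL = "final classification by BLASTn"


def confirmation_steps(kraken_rank: str, maps_to_reference: bool) -> List[Step]:
    """Return the ordered confirmation steps for one set of classified reads.

    kraken_rank: "species" or "family" (assumed labels for the lowest rank
                 assigned by the classifier).
    maps_to_reference: whether the reads mapped to the reference sequence.
    """
    if kraken_rank == "species":
        steps = [Step.MAP_TO_REFSEQ]
    elif kraken_rank == "family":
        steps = [Step.BLASTN_THEN_MAP]
    else:
        raise ValueError("kraken_rank must be 'species' or 'family'")
    # Reads that fail to map are sent to BLASTn for a final call.
    if not maps_to_reference:
        steps.append(Step.BLASTN_FINAL)
    return steps
```

In this sketch, reads assigned only to the family level that then fail to map would pass through both BLASTn steps.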
Detection of a virus was reported if nonoverlapping reads from ≥3 distinct genomic regions were identified (6). Values of reads per million (RPM) were calculated by dividing the number of mapped reads by the total number of reads and multiplying by one million. Reference-based viral genome assembly was performed using viral-ngs, and percent genome coverage was reported if at least ∼50% of the genome was assembled. The following references were used for alignment and assembly: NC_001803.1 (RSV), NC_039199.1 (hMPV), NC_006577.2 (HCoV HKU1), NC_005831 (HCoV NL63), NC_006213.1 (HCoV OC43), and NC_026431.1 to NC_026438.1 (influenza virus H1N1).
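As an illustration of the two quantities defined above, the following minimal sketch computes RPM and applies one possible reading of the detection criterion. It assumes that each mapped read is represented by a half-open (start, end) interval on the reference genome and that "nonoverlapping reads from ≥3 distinct genomic regions" can be approximated by counting the largest set of mutually nonoverlapping alignments. This interpretation and all function names are assumptions for illustration, not the published implementation.

```python
from typing import Iterable, Tuple

Interval = Tuple[int, int]  # (start, end) on the reference, end exclusive


def reads_per_million(mapped_reads: int, total_reads: int) -> float:
    """RPM = mapped reads / total reads * 1,000,000."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mapped_reads / total_reads * 1_000_000


def count_nonoverlapping_regions(alignments: Iterable[Interval]) -> int:
    """Count the largest set of mutually nonoverlapping alignments.

    Uses the standard greedy interval-scheduling approach: sort by end
    coordinate and keep each alignment that starts at or after the end
    of the last one kept.
    """
    count = 0
    last_end = None
    for start, end in sorted(alignments, key=lambda iv: iv[1]):
        if last_end is None or start >= last_end:
            count += 1
            last_end = end
    return count


def virus_detected(alignments: Iterable[Interval], min_regions: int = 3) -> bool:
    """Apply the >=3 distinct-region rule (threshold taken from the text)."""
    return count_nonoverlapping_regions(alignments) >= min_regions
```

Under these assumptions, two overlapping reads count as a single region, so a handful of reads piled on one locus would not by itself satisfy the criterion.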
To evaluate the possibility of contamination between samples in this study, we compared assembled genomes and reads across all samples that were identified to have influenza (GA-EHC-006F, GA-EHC-063K, GA-EHC-095Q, GA-EHC-103Y), hMPV (GA-EHC-079A, GA-EHC-092N, and GA-EHC-094P), and RSV (GA-EHC-034H, GA-EHC-072T, and GA-EHC-103Y). We found no evidence of contamination based on sequence similarity, though in some cases this analysis was limited by low genome coverage.
Data availability. All reads (cleaned of human reads) are available in the Sequence Read Archive under BioProject PRJNA634356, and complete genomes are available in GenBank under accession numbers MW008562 to MW008571, MW008573 to MW008590, and MW047016 to MW047018. Statistical analysis was performed using R (v4.0.2).
RESULTS
Seventy-five PUIs were included; 30 were negative and 45 were positive for SARS-CoV-2 as determined by RT-PCR. Median (IQR) times to sample collection from the onset of symptoms were 4 (2 to 6) and 5 (4 to 7) days among PUIs who were determined to be SARS-CoV-2 negative and positive by RT-PCR, respectively (see Fig. S2A in the supplemental material).
Among the 30 SARS-CoV-2-negative PUIs, 93% (28/30) underwent clinical testing for alternative infections, 73% (22/30) by Flu/RSV PCR testing and 80% (24/30) by multiplex respiratory viral panel testing. Of the 30 SARS-CoV-2-negative PUIs, 8 (27%) tested positive for another respiratory virus by routine clinical testing. In all cases, mNGS identified the same virus (Table 1).
TABLE 1 Molecular and metagenomic testing of persons under investigation
Importantly, mNGS also identified viruses in four samples that were not detected by routine clinical testing. In the first sample (GA-EHC-001A), HCoV NL63 was identified by mNGS, and no clinical testing for HCoV NL63 had been performed. We performed multiplex PCR on a research basis (BioFire FilmArray respiratory pathogen panel) and confirmed the finding of HCoV NL63. In the second sample (GA-EHC-072T), RSV was identified by mNGS but not by clinical multiplex PCR testing (BioFire FilmArray respiratory pathogen panel). We performed multiplex PCR on a research basis (BioFire FilmArray respiratory pathogen panel) and again did not detect RSV. RSV was detected by repeat sequencing of an independent RNA mNGS library. In each library, RSV was present at a very low level (0.4 virus reads per million total sequencing reads); this may be below the limit of detection of multiplex PCR, or the sequence may have contained regions of primer mismatch. In the third sample (GA-EHC-079A), hMPV was identified by mNGS and no clinical testing for hMPV had been performed. There was no residual primary sample available for multiplex PCR testing, but RNA mNGS of an independent library confirmed the presence of hMPV. Finally, in the fourth sample (GA-EHC-103Y), RSV was identified by mNGS but not by clinical multiplex PCR testing (BioFire FilmArray respiratory pathogen panel). We performed multiplex PCR on a research basis (BioFire FilmArray respiratory pathogen panel) and confirmed the presence of RSV. In addition, a residual oropharyngeal (OP) sample was available from the same patient (GA-EHC-102X), and this was positive for RSV by both mNGS and research multiplex PCR (BioFire FilmArray respiratory pathogen panel), suggesting that the initial clinical test from the NP sample was erroneous. Influenza was also detected by mNGS in sample GA-EHC-103Y but was not detected by confirmatory multiplex PCR. Influenza was also not detected in the same patient’s OP sample (GA-EHC-102X) by mNGS or multiplex PCR and thus represents an unconfirmed finding.
Among the 30 RT-PCR-negative PUIs, no SARS-CoV-2 was detected by mNGS (see Table S1 in the supplemental material). Eighteen (60%) patients had no respiratory viruses detected by routine clinical testing or mNGS.
Among the 45 SARS-CoV-2 RT-PCR-positive PUIs, 31 (69%) underwent clinical testing for coinfections, and none were detected (Table 1). mNGS identified one viral coinfection (RSV in sample GA-EHC-034H) that was not tested for by clinical PCR but was confirmed by research multiplex PCR (BioFire FilmArray respiratory pathogen panel).
SARS-CoV-2 was present in all 45 RT-PCR-positive PUIs by mNGS (Table S2). For the sample with the lowest concentration of SARS-CoV-2 in our study (GA-EHC-084F; cycle threshold [CT] = 34), only one SARS-CoV-2 genome region was identified, which did not meet our criteria for detection by mNGS (three or more genome regions). For the remaining 44 samples, mNGS robustly detected SARS-CoV-2, at CT values up to 32, as well as in one sample that was positive by initial clinical testing but negative by our laboratory RT-PCR (GA-EHC-019S; see Table S2 in the supplemental material). Across all samples, the median (IQR) SARS-CoV-2 CT value was 25 (21 to 28). SARS-CoV-2 CT values were correlated with time since symptom onset (R², 0.067 [P = 0.05]; with outlier [n = 1] omitted, R², 0.17 [P = 0.04]) and inversely correlated with SARS-CoV-2 read recovery by mNGS (R², 0.65; P < 0.001 [Fig. 1]).
FIG 1 Correlation of SARS-CoV-2 RT-PCR CT and log-transformed reads per million for PUI samples testing positive (n = 44). CT, cycle threshold; RPM, reads per million.
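For readers who wish to reproduce this type of analysis on their own data, the sketch below fits an ordinary least-squares line of log10(RPM) on CT and reports the coefficient of determination. It is a plain-Python illustration under stated assumptions: the study's statistics were performed in R, the function name is hypothetical, and the example values at the end are invented placeholders that are not taken from the study.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain some variation")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For simple linear regression, R^2 equals the squared Pearson correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Placeholder values for illustration only (NOT study data).
example_ct = [18.0, 22.0, 25.0, 28.0, 32.0]
example_log10_rpm = [4.1, 3.0, 2.4, 1.2, 0.3]
slope, intercept, r2 = fit_line(example_ct, example_log10_rpm)
```

A negative slope in such a fit corresponds to the inverse relationship described above, in which higher CT values (lower viral load) accompany lower read recovery.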
DISCUSSION
Overall, using RNA mNGS, we readily identified SARS-CoV-2, RNA viral coinfections, and alternative infections in PUIs, including viruses not identified by routine clinical testing. We did not detect SARS-CoV-2 by mNGS in samples that were determined to be negative by RT-PCR, supporting the sensitivity of the RT-PCR assay. Interestingly, 60% of PUIs who were negative for SARS-CoV-2 had no alternative infection identified on either routine testing or mNGS, which may reflect inadequate sampling, rapid clearance of a potential viral infection, infection with a pathogen other than an RNA virus, or a noninfectious process.
Limitations of our study include the testing of banked samples, which may degrade over time with freeze-thaw cycles, the use of clinical tests (such as SARS-CoV-2 RT-PCR) as an imperfect gold standard, a focus on RNA viruses, and low read depth for a small number of samples. Nevertheless, our results offer several valuable insights regarding molecular testing for respiratory viruses among SARS-CoV-2 PUIs.
First, we found a low rate of viral coinfection in patients with SARS-CoV-2, with only one coinfection (with RSV) among 45 patients (2%). Our results are concordant with a systematic review of 16 studies and 1,014 patients, which found a 3% rate of viral coinfection, with RSV and influenza A being the most common (7). Thus, viral coinfections may be less common than initially reported (8, 16); nevertheless, it is imperative to continue to monitor for them in the setting of the upcoming respiratory virus season and ongoing waves of COVID-19. With many commercial platforms now offering multiplex panels that contain SARS-CoV-2, this will be feasible for many centers (9).
Second, among patients with SARS-CoV-2, we found that viral genome recovery by mNGS was correlated with viral load (inversely correlated with CT), as has been shown for other RNA respiratory viruses (10). Furthermore, we were able to reliably detect SARS-CoV-2 by mNGS in samples with CT values up to 32, although no formal test of limit of detection was attempted. This high viral load subset likely represents patients of highest clinical and epidemiological importance, as the CT value on initial presentation has been found to be an independent predictor of clinical outcomes, including respiratory failure and inpatient mortality (11). Similar to other studies (12), we also noted that the duration of time between symptom onset and SARS-CoV-2 testing was correlated with CT (inversely correlated with viral load). However, SARS-CoV-2-negative PUIs did not have a longer symptom interval than SARS-CoV-2-positive PUIs, suggesting that these were not individuals outside the testing window.
Finally, our results illustrate key considerations about the potential use of mNGS as a diagnostic tool. Notably, mNGS detected all viruses that were identified by routine clinical workup, as well as four viruses that were not identified by clinical workup. Two of these had not been tested for clinically, underscoring the importance of routine broad-spectrum molecular testing for respiratory viruses among PUIs. However, two of the viruses were tested for and not identified by multiplex PCR. In one case, the viral level may have been below the limit of detection of multiplex PCR. In the other case, a repeat multiplex PCR analysis was positive, suggesting a potential error in the initial clinical testing. Overall, our results are similar to those of a prior study that found high concordance between RNA mNGS and a commercial RVP (GenMark Diagnostics, Inc., Carlsbad, CA) and also detected several pathogenic respiratory viruses by mNGS that were either not targeted by the RVP or missed due to low virus levels (13). Further advantages of mNGS have been reviewed elsewhere (14) and include the opportunities to characterize viral genomics, the microbiome, and the host transcriptome.
Current challenges with implementing mNGS into routine clinical microbiology workflows include its relatively high cost and low throughput, as well as the need for expertise in interpreting results and clinical correlation. One barrier in particular is the lack of established thresholds for identifying pathogens. Here, we required detection of nonoverlapping reads from ≥3 distinct genome regions based upon previously described criteria (6). Recently, Peddu et al. used a cutoff based on reads per million (RPM) (<10) (15); in our study, the use of this cutoff would have excluded five viruses (hMPV, HCoV OC43, RSV [n = 2], and influenza A) that were found by both mNGS and routine clinical testing by PCR, which we regard as true positives (see Tables S1 and S2). Thus, there is ongoing need to harmonize practices around clinical mNGS testing.
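To make the contrast between these two thresholding approaches concrete, the following sketch flags detections that would be called by a genome-region rule but excluded by an RPM cutoff. It assumes that the RPM rule excludes any detection below 10 RPM, which is our reading of the cutoff discussed above. The class and function names are hypothetical. The RPM value of 0.4 corresponds to the low-level RSV finding reported in Results, whereas the region counts and the second entry are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    label: str
    rpm: float       # virus reads per million total reads
    n_regions: int   # nonoverlapping genome regions with reads


def passes_region_rule(d: Detection, min_regions: int = 3) -> bool:
    """Region-based rule: reads from at least min_regions distinct regions."""
    return d.n_regions >= min_regions


def passes_rpm_rule(d: Detection, min_rpm: float = 10.0) -> bool:
    """Abundance-based rule: assumed to exclude detections below min_rpm."""
    return d.rpm >= min_rpm


def discordant(detections: List[Detection]) -> List[str]:
    """Labels called by the region rule but excluded by the RPM rule."""
    return [
        d.label
        for d in detections
        if passes_region_rule(d) and not passes_rpm_rule(d)
    ]


examples = [
    # RPM from the text; region count is an assumed minimum for a detection.
    Detection("low-level RSV", rpm=0.4, n_regions=3),
    # Entirely hypothetical high-abundance detection.
    Detection("hypothetical high-level virus", rpm=250.0, n_regions=12),
]
flagged = discordant(examples)
```

Listing discordant calls in this way is one simple means of auditing how a proposed threshold would change reported results before it is adopted.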
In conclusion, our results suggest that current broad-spectrum molecular testing algorithms identify most respiratory viral infections among SARS-CoV-2 PUIs, when available and implemented consistently. In addition, these results illustrate the potential of mNGS to streamline and expand clinical testing for respiratory viruses, which may augment strategies to surveil for unexpected viral coinfections or the emergence of divergent strains during periods of high transmission.
ACKNOWLEDGMENTS
We acknowledge our laboratory colleagues at the Emory University Healthcare Microbiology and Molecular laboratories who have worked tirelessly to provide necessary care to our patients during this time. We thank the staffs of the Emory Integrated Genomics Core, the Yerkes NHP Genomics Core, and the Emory Clinical Virology Research Laboratory for sequencing support, and we thank Daniel J. Park for valuable advice on the use of viral-ngs.
This study was supported by the Pediatric Research Alliance Center for Childhood Infections and Vaccines and Children’s Healthcare of Atlanta. The Yerkes NHP Genomics Core is supported in part by NIH P51 OD011132 and sequencing data were acquired on an Illumina NovaSeq6000 funded by NIH S10 OD026799. Salary support was provided by the Doris Duke Charitable Foundation, a Clinical Scientist Development Award 2019089 (J.J.W.) and NIH K08 AI13948 (A.P.). This study was supported in part by the Center for AIDS Research (P30 AI050409).
FOOTNOTES
- Received 15 August 2020.
- Returned for modification 5 September 2020.
- Accepted 13 October 2020.
- Accepted manuscript posted online 16 October 2020.
Supplemental material is available online only.
- Copyright © 2020 American Society for Microbiology.