Chagas disease serological test performance in United States blood donor specimens

Background Chagas disease affects an estimated 300,000 individuals in the US. Diagnosis in the chronic phase requires positive results by two different IgG serological tests. Three ELISAs (Hemagen, Ortho, Wiener) and one rapid test (InBios) are FDA-cleared, but comparative data in US populations are sparse. Methods We evaluated 500 seropositive and 300 seronegative blood donor plasma samples. Country of birth was known for 255 seropositive specimens and grouped into regions: Mexico (n=94), Central America (n=88) and South America (n=73). Specimens were tested by the four FDA-cleared IgG serological assays. Test performance was evaluated by two comparators and latent class analysis. Results InBios had the highest sensitivity (97.4-99.3%), but lowest specificity (87.5-92.3%). Hemagen had the lowest sensitivity (88.0-92.0%), but high specificity (99.0-100.0%). Sensitivity was intermediate for Ortho (92.4-96.5%) and Wiener (94.0-97.1%); both had high specificity (98.8-100.0% and 96.7-99.3%, respectively). Antibody reactivity and clinical sensitivity were lowest in donors from Mexico, intermediate in those from Central America and highest in those from South America. Conclusions Our findings provide an initial evidence base to improve laboratory diagnosis of Chagas disease in the US. The best current testing algorithm would employ a high-sensitivity screening test followed by a high-specificity confirmatory test.


Introduction
For the current analysis, the Ortho ELISA was re-run on all 800 specimens in 2019. Aliquots used for current Ortho testing were thawed and refrozen twice.
FDA approval for blood donation screening and clearance for diagnostic purposes, but is not yet marketed for the latter use. For the Ortho ELISA, signal-to-cutoff (S/CO) ratios of 1.00 or greater are considered reactive; in the blood donation screening algorithm, all reactive units are retested two more times. A blood donation is considered repeat reactive if at least 2 of 3 sample results have an S/CO greater than 1.00.
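The repeat-reactive rule described above can be sketched in a few lines of Python. This is only an illustration, not the blood bank's software: the function names are hypothetical, and because the text gives the cutoff both as "1.00 or greater" and as "greater than 1.00", the sketch assumes a single threshold of S/CO >= 1.00 throughout.

```python
from typing import Optional, Sequence

REACTIVE_CUTOFF = 1.00  # S/CO threshold quoted in the text


def is_reactive(s_co: float) -> bool:
    """Assumption: a single result is reactive at S/CO of 1.00 or greater."""
    return s_co >= REACTIVE_CUTOFF


def classify_donation(initial: float,
                      retests: Optional[Sequence[float]] = None) -> str:
    """Classify a donation from its initial S/CO and, if reactive, two retests."""
    if not is_reactive(initial):
        return "nonreactive"            # no retesting is triggered
    if retests is None:
        return "initially reactive, retest pending"
    results = [initial, *retests]       # three results in total
    reactive_count = sum(is_reactive(v) for v in results)
    # Repeat reactive if at least 2 of the 3 results are reactive
    return "repeat reactive" if reactive_count >= 2 else "not repeat reactive"


# Hypothetical S/CO values, for illustration only
print(classify_donation(0.45))
print(classify_donation(1.80, (0.70, 1.20)))
print(classify_donation(1.05, (0.60, 0.80)))
```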

Testing by Hemagen ELISA, Wiener ELISA, and InBios rapid tests was conducted at the BD was defined by the blood donation testing algorithms described above (8). For InBios testing, reader 1 scores were used for performance calculations; reader 2 scores were used to calculate inter-reader agreement statistics. The Hemagen and Wiener kits both include an indeterminate zone; results that fell in this zone were included as positive in the performance analyses, because they would necessitate confirmatory testing in real-world scenarios. This definition may overestimate the sensitivity and/or specificity of these two tests (depending on whether the grey zone results predominantly correspond to seropositive or seronegative specimens). Exact binomial 95% confidence intervals (CI) were calculated for each of the performance parameters. Analyses were conducted in SAS 9.4 and R version 3.5.2.
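To make the indeterminate-zone convention and the exact binomial interval concrete, the following is a minimal Python sketch (the study itself used SAS and R). It assumes the exact interval is the Clopper-Pearson interval, which is the usual meaning of "exact binomial" but is not stated in the text; the function names and the example counts are hypothetical.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with terms computed in log space."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    total = 0.0
    for i in range(k + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1)
                    - math.lgamma(n - i + 1)
                    + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_term)
    return min(total, 1.0)


def _solve_decreasing(f, target, iters=60):
    """Bisection on (0, 1) for f(p) = target, where f decreases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x, n, alpha=0.05):
    """Clopper-Pearson interval for x successes out of n (assumed method)."""
    lower = 0.0 if x == 0 else _solve_decreasing(
        lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if x == n else _solve_decreasing(
        lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper


def sensitivity(results, count_indeterminate_as_positive=True):
    """results: calls ('pos', 'ind', 'neg') on comparator-positive specimens."""
    positive_calls = {"pos", "ind"} if count_indeterminate_as_positive else {"pos"}
    x = sum(r in positive_calls for r in results)
    n = len(results)
    return x / n, exact_ci(x, n)


# Hypothetical counts, not study data: 90 positive, 4 indeterminate, 6 negative
calls = ["pos"] * 90 + ["ind"] * 4 + ["neg"] * 6
print(sensitivity(calls))                                         # indeterminate as positive
print(sensitivity(calls, count_indeterminate_as_positive=False))  # stricter definition
```

Comparing the two calls shows how much of a sensitivity estimate depends on the handling of grey-zone results.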

The third performance assessment consisted of a latent class analysis (LCA). LCA comprises a group of mathematical modeling techniques developed to evaluate diagnostic tests in the absence of a true gold standard (20-23). We assumed two latent classes and conditional independence of test outcomes. We used bootstrapping to generate multiple samples from the dataset and then applied an expectation-maximization (EM) algorithm to estimate sensitivity and specificity for each test. The distributions of the bootstrapped samples were used to generate 95% CIs. We tested the robustness of the two-class assumption by comparing fit between models assuming two versus three latent classes, using the Akaike information criterion (AIC) and Bayesian information criterion (BIC). The latent class analysis was conducted in R version
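The approach described above (two latent classes, conditional independence, EM estimation, bootstrap percentile intervals) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' R code: the starting values, the clipping constant, the percentile-interval rule, and the simulated data are all hypothetical choices, and the two- versus three-class AIC/BIC comparison is not covered.

```python
import random
from collections import Counter

EPS = 1e-6  # assumption: keep estimates away from exactly 0 or 1


def _clip(p):
    return min(max(p, EPS), 1.0 - EPS)


def fit_lca(patterns, n_iter=500, tol=1e-8):
    """Two-class LCA by EM. patterns: one tuple of 0/1 results per specimen."""
    counts = Counter(patterns)
    n_tests, total = len(patterns[0]), len(patterns)
    prev, sens, spec = 0.5, [0.9] * n_tests, [0.9] * n_tests  # arbitrary starts
    for _ in range(n_iter):
        # E-step: posterior probability of the infected class for each pattern
        post = {}
        for y in counts:
            p_pos, p_neg = prev, 1.0 - prev
            for j, r in enumerate(y):
                p_pos *= sens[j] if r else 1.0 - sens[j]
                p_neg *= 1.0 - spec[j] if r else spec[j]
            post[y] = p_pos / (p_pos + p_neg)
        # M-step: posterior-weighted proportions
        w_pos = sum(c * post[y] for y, c in counts.items())
        w_neg = total - w_pos
        new_sens = [_clip(sum(c * post[y] * y[j] for y, c in counts.items()) / w_pos)
                    for j in range(n_tests)]
        new_spec = [_clip(sum(c * (1 - post[y]) * (1 - y[j])
                              for y, c in counts.items()) / w_neg)
                    for j in range(n_tests)]
        new_prev = _clip(w_pos / total)
        change = max(abs(a - b) for a, b in
                     zip(new_sens + new_spec + [new_prev], sens + spec + [prev]))
        prev, sens, spec = new_prev, new_sens, new_spec
        if change < tol:
            break
    return prev, sens, spec


def _percentile_interval(draws, alpha):
    ordered = sorted(draws)
    lo = ordered[int(alpha / 2 * (len(ordered) - 1))]
    hi = ordered[int(round((1 - alpha / 2) * (len(ordered) - 1)))]
    return lo, hi


def bootstrap_ci(patterns, n_boot=100, alpha=0.05, seed=0):
    """Percentile CIs from refitting the model on resampled specimens."""
    rng = random.Random(seed)
    n_tests = len(patterns[0])
    sens_draws = [[] for _ in range(n_tests)]
    spec_draws = [[] for _ in range(n_tests)]
    for _ in range(n_boot):
        sample = rng.choices(patterns, k=len(patterns))  # with replacement
        _, se, sp = fit_lca(sample)
        for j in range(n_tests):
            sens_draws[j].append(se[j])
            spec_draws[j].append(sp[j])
    return ([_percentile_interval(d, alpha) for d in sens_draws],
            [_percentile_interval(d, alpha) for d in spec_draws])


def simulate(n, prev, sens, spec, rng):
    """Simulated results for conditionally independent tests (not study data)."""
    data = []
    for _ in range(n):
        infected = rng.random() < prev
        data.append(tuple(int(rng.random() < (se if infected else 1 - sp))
                          for se, sp in zip(sens, spec)))
    return data


demo = simulate(2000, 0.6, [0.98, 0.90, 0.94, 0.95],
                [0.90, 0.99, 0.99, 0.98], random.Random(1))
print(fit_lca(demo))
print(bootstrap_ci(demo, n_boot=50))
```

Fitting on the tabulated response patterns rather than on individual specimens keeps each EM pass cheap, which matters because the model is refitted once per bootstrap sample.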

California and the southeastern states accounted for nearly three-quarters of the blood donations included in the study (Table 1). BD-positive specimens were significantly more likely than BD-negative ones to be from donors who identified themselves as Hispanic. Among 282 were born in the US, but the source of their infections was likely a mixture (congenital, travel or locally acquired); this group of donations was not included in analyses by birth country.

The three analyses (BD-status, consensus, and LCA) yielded similar results, with the highest sensitivity estimates resulting from the LCA and the lowest from the BD comparison; the

In all three analyses, InBios CDP had the highest sensitivity (97 to 99%), but the lowest specificity (88 to 92%). Reader agreement on InBios scores was high (weighted kappa=0.9315 negative results. Current Ortho S/CO values were a median 15.9% lower than in BD testing regression analysis of percent decline in S/CO vs specimen age in months).

Finally, we stratified results by region of birth to explore geographic variation in test sensitivity (Table 3). Compared to BD or consensus status, sensitivity for Ortho, Wiener and Hemagen tended to be lowest in specimens from those born in Mexico and highest in those

Simultaneous use of two tests optimizes both parameters and may be cost-effective in high prevalence settings. However, when low prevalence is anticipated, universal testing by two assays is impractical. Most programs will use one test as a screen and run only the screen-positives by the second assay. In these circumstances, the order is crucial; a high sensitivity screening test is essential to minimize the risk of missing true infections (Figure 2). At the same confidence in testing. For example, in a setting of 1.5% prevalence (27), any specificity lower than 98.5% will result in more false than true positive results.
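The arithmetic behind the prevalence example, and the reason the screening test's sensitivity matters, can be checked with a short Python sketch. The function names are hypothetical, and the sensitivity and specificity values passed in are illustrative inputs, not estimates from this study.

```python
def ppv(prevalence, sensitivity, specificity):
    """Positive predictive value: share of positive results that are true."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


def screen_step(n_tested, prevalence, sensitivity, specificity):
    """Expected outcomes of the screening step alone, per n_tested people."""
    infected = n_tested * prevalence
    uninfected = n_tested - infected
    true_pos = infected * sensitivity
    false_pos = uninfected * (1 - specificity)
    return {
        "missed_infections": infected - true_pos,  # never reach the second assay
        "false_positives": false_pos,              # resolved by the second assay
        "sent_to_second_assay": true_pos + false_pos,
    }


# At 1.5% prevalence, even a perfectly sensitive test drops below 50% PPV
# once specificity falls under roughly 98.5%.
for spec in (0.99, 0.985, 0.98):
    print(spec, round(ppv(0.015, 1.0, spec), 3))

# Illustrative comparison of two hypothetical screening tests per 10,000 tested
print(screen_step(10_000, 0.015, 0.98, 0.90))  # higher sensitivity, lower specificity
print(screen_step(10_000, 0.015, 0.90, 0.99))  # lower sensitivity, higher specificity
```

In this sketch the more sensitive screen misses fewer infections but sends many more specimens to the second assay, which is the trade-off the two-step algorithm has to balance.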

No single test had optimal performance characteristics in our data, despite the high sensitivity and specificity figures reported in their FDA 510(k) clearance applications and package inserts (28-31). In part, this may be attributable to the difference in performance in a setting closer to 'real world' diagnostic testing versus the more controlled setting of a clinical trial. However, a major issue in the available data is that few specimens from Mexico and Central America appear to have been included in preclinical testing (28,29,31 Central America (33). Thus, the low reactivity in Mexican specimens is not a result of TcI predominance per se. Poorly understood strain differences within the TcI DTU may be responsible for the observed geographic variability in immune response (15,17).

Based on the performance in our data, the Wiener Recombinante 3.0 and Ortho ELISAs showed the best balance of sensitivity and specificity, but both had suboptimal sensitivity in Mexican specimens. The InBios rapid test had the best sensitivity, with high sensitivity even in Mexican specimens, but its low specificity will result in a substantial number of false positives requiring confirmatory testing. The low sensitivity of Hemagen, especially in Mexican specimens, raises the risk of false negatives and concerns for its use as a screening test. In all cases, a discordant result between screening and confirmatory testing should prompt a third test as a tie-breaker, such as the IgG TESA-blot or the Abbott ESA, the latter having received FDA licensure for confirmatory use in the blood donor screening algorithm.

The use of surplus blood donation specimens has both limitations and advantages.

Table footnotes: 1 Seronegative specimens frequency-matched to seropositive specimens by donation region. 2 Positive blood donors significantly more likely to report Hispanic ethnicity (p<0.0001). 3 Data available for 282 blood donors identified as seropositive in blood donation testing; no data for 218 seropositive and 300 seronegative specimens.

Figure legend: Effect of variation in clinical sensitivity of initial test in a two-step diagnostic algorithm. Two-step diagnostic algorithms allow for an acceptable number of false positives to ensure positive cases are detected. A) Illustrates higher sensitivity initial test, with a high specificity confirmatory test to rule out false positives. B) Illustrates a missed case of Chagas disease due to a lower sensitivity initial test and false negative result.