Diagnostic Accuracy of Stool Xpert MTB/RIF for Detection of Pulmonary Tuberculosis in Children: a Systematic Review and Meta-analysis

Invasive collection methods are often required to obtain samples for the microbiological evaluation of children with presumptive pulmonary tuberculosis (PTB). Nucleic acid amplification testing of easier-to-collect stool samples could be a noninvasive method of diagnosing PTB.

Minimal sample preparation is required, and test results are produced within 2 h. In a meta-analysis that pooled data from sputum smear-positive and -negative subjects, the performance of Xpert on respiratory samples had a sensitivity of 62% (95% credible interval, 51 to 73%) and a specificity of 98% (95% credible interval, 97 to 99%). The use of Xpert on sputum is thus more sensitive than smear microscopy. Moreover, Xpert has several operational advantages over mycobacterial culture, the gold standard for TB diagnosis (4). However, in children under 5 years old, and particularly in those under 2 years old, the collection of sputum specimens is difficult and often requires invasive methods that are challenging to implement in resource-limited settings (e.g., nasopharyngeal/nasogastric aspiration or bronchoscopy) and not widely available (2). Furthermore, as pediatric TB is typically paucibacillary, the sensitivity of currently deployed tests is diminished in children versus adults (5).
Mycobacterium tuberculosis-containing sputum may be swallowed, particularly during sleep, and acid-fast bacilli have been shown to survive digestion and are detectable in stool (6,7). As such, stool may represent a more acceptable and feasible alternative to conventional specimens for the evaluation of suspected childhood PTB. The use of Xpert on stool has not been included in recommendations by the WHO, nor has any claim been made by the manufacturer regarding stool. However, several groups have now developed preprocessing methods in order to use Xpert on stool for the diagnosis of childhood TB.
We performed a systematic review and meta-analysis of the diagnostic performance of Xpert using stool samples for PTB in children.

MATERIALS AND METHODS
Protocol and registration. The protocol for this systematic review and meta-analysis was registered at the International Prospective Register of Systematic Reviews (PROSPERO) (identifier CRD42017079836).
Search strategy and information sources. PubMed, EMBASE, Scopus, and the Cochrane Library were systematically searched from 1 January 2008 until 15 June 2018. The search strategy was developed with a medical librarian and based on key validated terms for "children" and "Xpert," as well as "tuberculosis," with no filters applied. The full search strategies for each database are presented in Text S1 in the supplemental material. Experts in TB diagnostics were consulted to identify relevant papers that may have been missed by the search strategy. Citations of reviews and included publications were also searched.
Eligibility criteria. Publications in English, French, Italian, Mandarin, Spanish, and Portuguese; of any design and sampling strategy; and of any enrollment timing (prospective, retrospective, or cross-sectional) were eligible for inclusion. Conference proceedings and abstracts, commentaries, editorials, and reviews were excluded, as were studies with a sample size of less than 10. To be included, eligible studies must have reported the diagnostic performance of stool Xpert in patients under 16 years old, compared to a microbiological reference standard for the diagnosis of PTB. Studies that did not explicitly state that their focus was PTB were eligible if the types of specimens used for the reference standard were those that are typically used for PTB diagnosis (e.g., gastric aspirate). Studies that used banked sputum and stool specimens originally collected from children were also eligible.
Study screening and selection. Search results were imported into a citation manager, and duplicates were removed. Two authors (E. MacLean and G. Sulis) independently screened citations by title and abstract per predefined eligibility criteria, followed by full-text review for all selected studies. Disagreements were discussed, and a third reviewer (F. Ahmad Khan) was consulted if necessary.
Data extraction. A data extraction form was piloted by two reviewers (E. MacLean and G. Sulis) with critical input from a third (C. M. Denkinger). Two reviewers (E. MacLean and G. Sulis) independently extracted results from all included studies using a standardized form (Text S2). After data extraction, results were compared, and disagreements were discussed until a consensus was reached. Study authors were contacted for missing performance data, clarification regarding reference standard definitions, and sample preparation techniques. Using these data and figures indicated in the publications, we reconstructed two-by-two tables for stool Xpert performance compared to the microbiological reference standard and, where applicable, the clinical reference standard.
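As an illustration of the table reconstruction step, the sketch below back-calculates a two-by-two table from group sizes and reported accuracy. It is not the authors' extraction procedure, which relied on the standardized form and author correspondence; the function name, the dataclass, and the example counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    tp: int  # stool Xpert positive, reference standard positive
    fn: int  # stool Xpert negative, reference standard positive
    fp: int  # stool Xpert positive, reference standard negative
    tn: int  # stool Xpert negative, reference standard negative


def reconstruct_two_by_two(n_ref_pos, n_ref_neg, sensitivity, specificity):
    """Back-calculate cell counts from group sizes and reported accuracy.

    Assumes the reported sensitivity and specificity were computed on
    exactly n_ref_pos and n_ref_neg children. Rounding in a publication
    can shift a cell by one, so the result should be checked against
    any counts the paper reports directly.
    """
    tp = round(sensitivity * n_ref_pos)
    tn = round(specificity * n_ref_neg)
    return TwoByTwo(tp=tp, fn=n_ref_pos - tp, fp=n_ref_neg - tn, tn=tn)


# Hypothetical study: 20 reference-positive and 80 reference-negative children.
table = reconstruct_two_by_two(20, 80, sensitivity=0.65, specificity=0.9875)
print(table)
```

When a publication reports the raw counts, those should be used directly; back-calculation of this kind is only a fallback for cross-checking.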
Risk-of-bias assessment. The Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool (8) was used to assess each included study's risk of bias. No formal assessment of publication bias was made, as traditional methods such as funnel plots and regression tests are not helpful for diagnostic studies (9).
Reference standards. Acceptable microbiological reference standards were mycobacterial culture or Xpert MTB/RIF, performed on specimens that are conventionally used to diagnose childhood PTB (nasogastric aspirates, gastric lavage fluid, nasopharyngeal aspirates, and expectorated sputum). No studies included stool mycobacterial culture in their diagnostic workup. Stool Xpert was not included in the reference standard.
Childhood PTB is often clinically diagnosed (i.e., without microbiological confirmation). As such, we also examined the performance of stool Xpert compared to clinical reference standards that are compatible with updated international guidelines (5). Studies that followed these guidelines used a combination of signs and symptoms, chest radiography, epidemiological history, and tuberculin skin test (TST) results to classify children as "likely TB," "unconfirmed TB," and "unlikely TB" (Table S1). For our purposes, we dichotomized these outcomes into "likely/possible TB" and "unlikely TB."
Statistical analysis. Data from reconstructed two-by-two tables were used to calculate sensitivity and specificity and the associated 95% confidence intervals (CIs). In cases of empty cells in two-by-two tables, a zero correction was made by replacing the cell with 0.5. Aggregate-data meta-analyses were performed with bivariate random-effect hierarchical models (10) to estimate pooled sensitivity and specificity for stool Xpert compared to the microbiological reference standard and, separately, compared to the clinical reference standard. We also estimated pooled sensitivity and specificity stratified by HIV status. Results from individual studies and pooled estimates are presented on forest plots. To assess between-study heterogeneity, we used the I² statistic (11). In a sensitivity analysis, we estimated pooled sensitivity and specificity after excluding studies that used Xpert MTB/RIF but not mycobacterial culture of conventional specimens as the microbiological reference standard. All analyses were conducted using the Midas package in STATA (STATA 15; Stata Corp., USA) (12). The study is reported according to PRISMA guidelines (Table S2) (13).
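The per-study calculations described under statistical analysis can be sketched as follows. This is a minimal illustration, not a reproduction of the Stata analysis: the text does not state which confidence interval method was used, so the Wilson score interval here is an assumption, the bivariate random-effects pooling is not implemented, and all counts and the Cochran's Q value are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def zero_correct(*cells):
    """Replace empty (zero) cells with 0.5, as described in the text."""
    return tuple(0.5 if c == 0 else float(c) for c in cells)


def wilson_interval(successes, total, z=Z_95):
    """Wilson score interval (assumed method; not stated in the source)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def sensitivity_specificity(tp, fn, fp, tn):
    """Return point estimates and 95% CIs from one two-by-two table."""
    tp, fn, fp, tn = zero_correct(tp, fn, fp, tn)
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": (sens, wilson_interval(tp, tp + fn)),
        "specificity": (spec, wilson_interval(tn, tn + fp)),
    }


def i_squared(q, df):
    """Higgins I² in percent, from Cochran's Q and its degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Hypothetical counts for a single study.
result = sensitivity_specificity(tp=13, fn=7, fp=1, tn=79)
print(result)

# Hypothetical Cochran's Q for 9 studies (8 degrees of freedom).
print(i_squared(q=20.0, df=8))
```

Replacing only the empty cell, rather than adding 0.5 to every cell, follows the wording of the text; other continuity corrections would give slightly different estimates.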

RESULTS
Search results. Our search identified 1,589 unique citations from which 34 studies were selected for full-text review, and 9 studies met inclusion criteria (Fig. 1).
Study and participant characteristics. Study and patient characteristics are presented in Table 1. Among the 9 studies that we included, African countries were best represented (7/9), whereas 2 studies recruited participants from Asia. One study had multiple sites across two continents, whereas the others were single-country studies. In total, 1,681 children from 9 studies were included in our meta-analysis of stool Xpert's diagnostic performance compared to a microbiological reference standard, and 869 children from 5 studies were included in the comparison against a clinical reference standard. The prevalence of microbiologically confirmed cases per study ranged widely, from 2.6% (14) to 54% (15). The prevalence of clinically confirmed or unconfirmed cases was much higher, ranging from 35% (16) to 100% (17). Table S1 in the supplemental material provides details on clinical reference standard definitions of the included studies. Studies enrolled children from 0 to 16 years of age. The ratio of females to males was generally balanced. The percentage of participants with a documented history of TB disease contact, when reported (5/9 studies), ranged from 12% (18) to 56% (19). Most studies did not include information about tuberculin skin test (TST) results. Two studies included only children with HIV (18,20), and two restricted enrollment to HIV-negative children (16,21); the remainder had a mixed population.
Stool Xpert for Diagnosing Childhood TB. Journal of Clinical Microbiology.
Sample processing. Table 2 shows the sample preparation steps utilized in each study. In one study (19), two sample preparation methods were attempted, with results ultimately being pooled. Most studies (6/9) obtained one stool sample from enrolled children, typically within 24 h of obtaining respiratory samples. Samples were either used immediately or stored for later use, except for one study (20) which used some samples immediately and some after freezing and a second study (19) which stored samples collected at the child's home and immediately used those collected at the health care center. As information on sample storage was not available for all studies, subgroup analysis could not be performed per sample storage method.
The mass of stool utilized, and its collection method, varied: 0.15 g of bulk stool (16), 0.15 g using a sterile loop (17), a flocked rectal swab (22), 0.5 g (21), 0.6 g (15), 2 g (20), and 5 g (19). A diluent solution, such as phosphate-buffered saline (PBS), distilled water, or a sucrose solution, was added to the stool before homogenization, in various quantities, typically followed by vortexing. Most studies (6/9) reported a period of sample settling before further workup. Final sample preparation methods were quite varied but included either centrifugation or filtering through a syringe filter or gauze, primarily to remove large particles, before final addition of the sample to the Xpert cartridge (Table 2).
Quality assessment. Figure 2 displays the overall risk of bias and applicability concerns of the 9 studies included in our meta-analysis. Figure S1 presents the individual studies' quality assessment results. In the patient selection domain (Fig. 2), five studies were at low risk of bias, and one study (15) was at high risk of bias due to its use of a case-control design, whereas the remaining eight were either cross-sectional or cohort studies. Risk of bias was high for one study because of convenience sampling (16) and unclear in two studies because of an unclear sampling strategy and inappropriate exclusions of certain children (17,21). With respect to applicability, the majority of studies (Table 1) included children who presented with symptoms suggestive of TB. Two studies (18,20) included only children with HIV, and because it is known that Xpert performs differentially for those who are HIV infected (23), these studies were scored for applicability concerns as high. One study (15) tested only samples from confirmed TB cases and noncases, which does not represent a typical clinical scenario, so we also rated applicability concerns as high.
The conduct of the index test generally was at low risk of bias, as Xpert is an automated assay with a predefined cutoff of detection that produces a binary response. However, since there is no standardized operating protocol for stool samples and no internationally recommended procedure for sample storage and processing, applicability concerns regarding the index test's conduct are unclear (Fig. 2).
In light of the inherent limitations of microbiological tests for diagnosing childhood PTB, we classified 8/9 studies as having an unclear risk of bias with respect to correctly classifying the target condition despite having used culture as the reference test. The exception was one study that was scored as having a high risk of bias as its microbiological reference standard did not include culture. Both culture and Xpert are automated assays, so we scored the risk of bias as low regarding test result interpretation. Additionally, all studies' reference standards were performed in regional or central reference laboratories, so we expect bias from operator error to be of low concern. Applicability concerns were uniformly unclear.
We scored the risk of bias as low for all studies with respect to the appropriateness of the time interval between the index test and the reference standard, as all studies reported running stool Xpert within 7 days of specimen collection (Fig. 2).
Results of the sensitivity analysis in which we excluded the study that did not use mycobacterial culture as part of the reference standard (15) are presented in Fig. S2. Pooled sensitivity and specificity estimates combining data from all studies and data stratified by HIV status were all similar to those estimated in our main analyses, as was between-study heterogeneity. Pooled estimates from our main analysis and from this sensitivity analysis are summarized in Table 3.
We undertook two post hoc sensitivity analyses. In the first, we sought to determine whether the quantity of stool used for testing was associated with diagnostic accuracy (assuming that a higher mass might increase sensitivity). There were too few studies to estimate pooled accuracy stratified by stool mass used; however, visual inspection of forest plots showed no obvious trend to support a minimum quantity (Fig. S3). In the second sensitivity analysis, we evaluated whether the burden of TB in the country where a study was conducted was associated with the accuracy of stool Xpert. As shown in Fig. S4, there was no clear trend to suggest such an association.

DISCUSSION
In this systematic review and meta-analysis, we found that the sensitivity and specificity of stool Xpert (67% [95% CI, 52 to 79%] and 99% [95% CI, 98 to 99%], respectively) for the diagnosis of microbiologically confirmed childhood PTB were comparable to what has been reported for the performance of Xpert on respiratory specimens (62% [95% credible interval, 51 to 73%] and 98% [95% credible interval, 97 to 99%], respectively) (4). Sensitivity and specificity varied by HIV status. As stool collection is noninvasive, this is of substantial interest for the medical evaluation of children with suspected PTB, but a number of limitations of the existing evidence highlight the need for more research, and greater standardization of testing, before policy formulation.
Among the most important limitations of the evidence base is the lack of data on performance in the subpopulation of children for whom stool Xpert is of greatest potential clinical utility, those under the age of 5 years, and especially the subgroup under the age of 2 years. Only one study compared accuracy between age categories, and a cutoff of 10 years of age was used (17).
We observed substantial between-study heterogeneity in diagnostic accuracy, mostly for sensitivity. Different approaches to participant selection likely contributed to this, in particular the use of a case-control design (15) and nonconsecutive sampling (16,21), which are at a higher risk of introducing bias into a study. Data also suggested that heterogeneity was partly explained by differences in the prevalence of HIV infection. The higher sensitivity of stool Xpert among children with HIV has also been observed for other specimen types in this population (4,24), perhaps as a result of more severe TB disease in HIV-TB-coinfected children.
We found substantial variability in protocols for performing stool Xpert, with each study taking a unique approach. Differences were seen at all steps: (i) at stool collection, different methods of sampling, numbers of specimens, and volumes of stool were used; (ii) different reagents were added to stool samples before homogenization, and all studies utilized different additional reagents; and (iii) dissimilar filtration methods and decontamination steps were adopted. Future studies should ensure, at minimum, complete reporting of protocols for stool collection, processing, and testing. A standardized protocol would be of value, as would a standardized stool collection-and-processing kit.
Our systematic review and meta-analysis has a number of strengths. First, all included studies reported using a microbiological reference standard for comparison to stool Xpert, and 8 out of 9 studies used liquid or solid culture. While the imperfect nature of any reference standard for diagnosing pediatric TB means that the true number of affected children is always unknown, the accuracy of stool Xpert against microbiological confirmation is likely a closer estimation of its true accuracy than its performance compared to the clinical reference standard (as symptoms of PTB are nonspecific). Second, by systematically assessing each study's sample preparation and processing techniques, we found substantial variability in methods of performing stool Xpert and were also able to identify obstacles to implementation. For example, most protocols required at least one centrifugation step, which is inauspicious in terms of translating this assay to a lower health care system level. Finally, we utilized a sensitive and validated search strategy that covered six languages.
The present work also has some limitations. First, data were insufficient, and there were too few studies for us to perform stratified or metaregression analyses to assess most demographic-related potential causes of observed heterogeneity. Hence, we suggest that in addition to HIV-stratified results, future studies of stool Xpert should also ensure that reporting is stratified by age, gender, and extent of radiographic disease. Second, while we identified wide variability in sampling and stool processing, we could not explore these as sources of heterogeneity or determine if any processing workflows were potentially superior. Third, one study concerning the performance of stool Xpert on samples from children (25) was reported after our systematic search was completed and therefore was not included in our meta-analysis. However, including it in our pooled analyses did not significantly alter sensitivity or specificity estimates (see Fig. S5 in the supplemental material). Finally, our pooled estimates came from study populations with a high prevalence of TB; hence, these estimates may not be generalizable to settings with lower TB burdens.
[Figure caption, partial] ... results from "intention-to-treat" (ITT) analyses, where any child who produced any sample was included, as well as "per-protocol" analyses, where only children who produced all requested samples were included. In these instances, we meta-analyzed the ITT results to avoid selection bias. (B) Forest plots of stool Xpert's diagnostic performance compared to a clinical reference standard of "likely/possible TB" or "unlikely TB." (C) Forest plots of diagnostic performance of stool Xpert in children with HIV compared to a microbiological reference standard. (D) Forest plots of diagnostic performance of stool Xpert in HIV-negative children compared to a microbiological reference standard.
Given that these preliminary studies of stool Xpert suggest high specificity and moderate sensitivity, its potential role in the diagnostic pathway would be as a first-line rule-in test rather than as a triage test to rule out PTB. Studies assessing whether stool Xpert has value as an add-on test in combination with currently deployed assays will be useful, as will studies assessing the effect of repeat testing on sensitivity.
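The rule-in versus rule-out reasoning can be made concrete with likelihood ratios. The sketch below uses the pooled sensitivity and specificity reported against the microbiological reference standard (67% and 99%); the 10% pretest probability is a hypothetical value chosen only for illustration, and the function names are not taken from any package.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios of a binary test."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pretest, likelihood_ratio):
    """Update a pretest probability through odds times likelihood ratio."""
    pretest_odds = pretest / (1 - pretest)
    post_odds = pretest_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Pooled estimates against the microbiological reference standard (from the text).
SENS, SPEC = 0.67, 0.99
lr_pos, lr_neg = likelihood_ratios(SENS, SPEC)

# Hypothetical pretest probability of PTB in a child being evaluated.
PRETEST = 0.10
after_positive = post_test_probability(PRETEST, lr_pos)
after_negative = post_test_probability(PRETEST, lr_neg)

print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")
print(f"post-test probability after a positive result: {after_positive:.3f}")
print(f"post-test probability after a negative result: {after_negative:.3f}")
```

With these inputs, the positive likelihood ratio is about 67 and the negative likelihood ratio about 0.33. A positive result therefore raises a 10% pretest probability to roughly 88%, whereas a negative result lowers it only to about 3.6%, which is why a negative stool Xpert result alone does little to exclude PTB.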
Conclusion. Preliminary data suggest that the use of Xpert on stool specimens may be useful as a rule-in test, but a standardized stool sample preparation protocol is lacking, and the accuracy of stool Xpert in children under 5 years old, the subgroup for whom the test could bring the most added value, remains largely unknown.