Journal of Clinical Microbiology, July 2008, p. 2200-2205, Vol. 46, No. 7
0095-1137/08/$08.00+0 doi:10.1128/JCM.01666-07
Copyright © 2008, American Society for Microbiology. All Rights Reserved.

McGill University Centre for Tropical Diseases, Montreal General Hospital,¹ Department of Epidemiology, Biostatistics and Occupational Health, McGill University, and Division of Clinical Epidemiology, McGill University Health Centre, Montreal, Quebec, Canada²
Received 21 August 2007/ Returned for modification 21 October 2007/ Accepted 23 April 2008
To date, no studies have been published on the value of a QA program undertaken to directly evaluate the quality of microscopy in the detection of intestinal protozoa, although several have reviewed this for other types of procedures and microorganisms (e.g., bacterial counts [2]; human immunodeficiency virus [4]). QA programs, such as that undertaken by the College of American Pathologists, have typically relied on results from proficiency tests (17). In some areas of clinical microbiology, proficiency testing has been used to evaluate multiple steps in a complex process and reveal widespread systematic problems (e.g., for Streptococcus pneumoniae antimicrobial susceptibility testing [5]). In the parasitology laboratory, proficiency tests are based on sending "unknowns" to each participating laboratory, and the evaluation is based on the number of parasite species correctly identified. A potential weakness of proficiency testing is the possibility for the "unknown" test specimens to be identified as such, and thus, it is difficult to ensure that they are handled in the same manner as routinely submitted specimens. Significant interlaboratory variations in specimen handling and processing practices make it particularly difficult to blind technologists to the nature of the specimen in large proficiency testing programs. This may introduce a bias into the assessment of the diagnostic process, particularly for the subjective interpretation of results.
Some published data exist on the value of pooling of specimens. This involves combining fecal material from two or more stool specimens and comparing the results of microscopy on the pooled sample with those for the original specimens (1, 10, 14, 20). To our knowledge, no studies have examined result reproducibility when the same individual stool specimens are examined more than once.
Previous research has already compared the sensitivity of single versus multiple stool examinations, with the majority concluding that sensitivity increases with the number of specimens examined (3, 7, 9, 12, 13, 15, 16, 18). Our study looks instead at reproducibility from the perspective of repeat examinations of the same specimen, defined as concordance. This allows us to look at the "sensitivity" of a single examination in the unconventional sense of comparing this examination to a "gold standard" of the combined results from two or three examinations of different preparations of the same specimen.
A full QA program in parasite microscopy, as in clinical microbiology in general, involves the management of factors affecting the reliability, efficiency, and utilization of laboratory services. Many of the same types of activities and indicators apply, such as developing procedure manuals, developing quality control processes for tests and reagents, and monitoring the various aspects of the analytic process. Our intent was to focus on the limited set of activities related to microscopy, including technologist competence. We therefore designed an intralaboratory tool, based on the blinded resubmission of pairs of sodium acetate-acetic acid-formalin (SAF)-preserved clinical stool specimens in order to analyze the reproducibility of the results, for use within a QA program. Our evaluation of this tool focused on several specimen characteristics that could hypothetically influence reproducibility, such as the protozoal concentration, the number of protozoal species present, and the duration of storage. In addition, we used this tool to evaluate the effect of pooling stool specimens.
Our study objectives were the following: (i) through the resubmission to the laboratory of selected pairs of "blinded" stool specimens, to determine concordance rates and to identify modulating factors (parasite load, resubmission time interval, number of competing parasites) and (ii) through the resubmission to the laboratory of a blinded pooled stool specimen, to determine concordance with (a) the initial pair of submitted stool specimens and (b) the blinded resubmitted pair of stool specimens.
Two stool specimens, collected on alternate days, were submitted in SAF. Only specimens received in pairs were used for this study. We concentrated specimens by using a standard formal ethyl acetate method, and permanent stains were prepared using an iron-hematoxylin technique (6). Quantitation of the parasite concentration was based on examination of approximately 1 µl of fecal suspension under a 22- by 30-mm coverslip for 1+ to 4+ designations, and on examination of the permanent stain for 3+ to 5+ designations, as follows: 1+ represents 1 to 5 protozoal organisms/coverslip; 2+, 6 to 20 organisms/coverslip; 3+, 1 organism per low-power field (magnification, ×100); 4+, 1 organism per high-power field (×400); 5+, 1 organism per oil immersion field (×1,000). When there was disagreement between the results of the wet mount and those of the stained specimens, quantitation was based on the higher designation. Only one final report was issued for each pair of specimens in a set, including all parasites found in either specimen and using the highest quantity of each species found in either specimen for the report.
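The reporting rule above amounts to a maximum-merge of quantitation grades, applied first within a specimen (wet mount versus permanent stain) and then across the pair. A minimal Python sketch, assuming grades are stored as integers 1 to 5 (standing for 1+ to 5+) keyed by species name; the function name `merge_highest` and the example values are hypothetical and not part of the study's procedures:

```python
from typing import Dict

# Quantitation grades as integers 1..5, standing for 1+ to 5+ (assumed encoding).
Grades = Dict[str, int]


def merge_highest(first: Grades, second: Grades) -> Grades:
    """Keep every species seen in either input, at the higher of its two grades."""
    merged = dict(first)
    for species, grade in second.items():
        merged[species] = max(grade, merged.get(species, 0))
    return merged


# Illustrative (invented) readings for one pair of specimens.
# Within a specimen: wet mount and permanent stain disagree, so keep the higher grade.
day1 = merge_highest({"G. lamblia": 2}, {"G. lamblia": 3})
day2 = merge_highest({}, {"D. fragilis": 4})

# Across the pair: one report listing all parasites at their highest grade.
report = merge_highest(day1, day2)
```

The same helper serves both steps because both rules reduce to taking the maximum grade per species.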
In this study, we evaluated 231 sets of stool specimens originally reported between June 2002 and January 2005. Each set consisted of three subsets: the original patient-submitted pair of stool specimens, a blinded resubmission of the same pair, and a blinded specimen consisting of equal parts of the original pair (the "pooled" specimen). Selection of the original patient-submitted pairs was not random. Specimens were selected soon after the original reports were issued with the aim of creating a study collection which had balance between parasite-positive and parasite-negative specimens and between low and high parasite concentrations, and with a skew toward the three protozoan pathogens of interest: Giardia lamblia, Entamoeba histolytica/Entamoeba dispar, and Dientamoeba fragilis. Selection was limited by the availability of a sufficient quantity of stool to create the resubmitted and pooled subsets.
Clinical stool specimens are normally submitted as a 3:1 mixture of SAF to stool with a total volume of approximately 75 ml. A portion of the specimen is selected for homogenization, straining, and saline washes; aliquots are removed for fecal parasite concentration and permanent stains; and the remaining processed specimen is replaced in the original specimen container with the residual unprocessed stool. After reading and reporting, for our study, selected specimens were retrieved from their original containers by a technologist (E.K.) not involved in specimen preparation or analysis. Those selected for resubmission were diluted with a variable amount of SAF in order to create the two study subsets: a new specimen pair for resubmission and a single specimen created by pooling equal amounts of the originally submitted pair. Proper homogenization could require dilution of the original specimen by adding a volume of SAF equal to as much as 25% of the specimen volume, depending on the consistency of the stool. Each study specimen was relabeled with a new accession number, and a new requisition was created with fictional patient information. These specimens were mixed in with the routine workflow. Due to technical problems, for 50 of the 231 original sets, the resubmission subset consisted of only one specimen derived from one sample of the originally submitted pair. In these cases, we verified that the results of the two original specimens were similar.
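The dilution described above can be put in rough numbers. A minimal sketch, assuming the residual specimen is still at the nominal 3:1 SAF-to-stool ratio when it is retrieved (the saline washes and returned processed material make this only approximate); the function name is hypothetical:

```python
def stool_fraction_after_dilution(saf_to_stool: float, added_saf: float) -> float:
    """Volume fraction of stool after adding SAF equal to `added_saf` times the specimen volume."""
    if saf_to_stool < 0 or added_saf < 0:
        raise ValueError("ratios must be non-negative")
    initial_fraction = 1.0 / (1.0 + saf_to_stool)  # a 3:1 mixture is one quarter stool
    return initial_fraction / (1.0 + added_saf)


# Nominal 3:1 specimen, undiluted and with the maximum 25% of added SAF.
before = stool_fraction_after_dilution(3.0, 0.0)
after = stool_fraction_after_dilution(3.0, 0.25)

# Fraction of the original organism density retained after dilution.
relative_concentration = after / before
```

Under these assumptions the stool fraction falls from one quarter to one fifth, so the organism density per unit volume drops by about a fifth at the maximum dilution.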
A set of specimens (original pair, resubmitted pair, and pooled subsets) was considered positive if any of its component specimens were positive. When two subsets were compared, if the results of both subsets from a single set were the same (positive or negative), the set was called concordant. The concordance rate was defined as the percentage of concordant sets out of all sets positive for a particular parasite. Concordance rates are presented as descriptive data, without other summary statistics, because sample sizes were small and specimens were not randomly selected.
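The concordance definition above can be sketched in a few lines of Python. This is an illustration only: the data layout (one dictionary per set, mapping subset name to a positive/negative result for a single parasite), the function name, and the example data are assumptions, and the denominator follows one reading of the text, namely that a set counts as positive if any of its three subsets is positive.

```python
from typing import Dict, List

# One set = result for a single parasite in each subset (True = parasite reported).
SetResult = Dict[str, bool]


def concordance_rate(sets: List[SetResult], subset_a: str, subset_b: str) -> float:
    """Percentage of positive sets whose two compared subsets gave the same result."""
    # A set is positive if any of its component subsets is positive.
    positive_sets = [s for s in sets if any(s.values())]
    if not positive_sets:
        raise ValueError("no positive sets for this parasite")
    concordant = sum(1 for s in positive_sets if s[subset_a] == s[subset_b])
    return 100.0 * concordant / len(positive_sets)


# Invented example data, not taken from the study.
example_sets = [
    {"original": True, "resubmitted": True, "pooled": True},
    {"original": True, "resubmitted": False, "pooled": True},
    {"original": True, "resubmitted": True, "pooled": False},
    {"original": False, "resubmitted": False, "pooled": False},  # negative set, excluded
    {"original": True, "resubmitted": False, "pooled": False},
]

rate = concordance_rate(example_sets, "original", "resubmitted")
```

Here four of the five sets are positive, and two of those four agree between the original and resubmitted subsets, so `rate` is 50.0.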
To examine whether specimens might be deteriorating over time, we used linear regression to evaluate the relationship between the change in concentration, on the one hand, and the time delay between examinations of specimens from the same set, on the other hand (EpiInfo, version 3.3.2; Centers for Disease Control and Prevention, Atlanta, GA).
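The analysis was run in EpiInfo; the following is only a plain-Python sketch of the same kind of ordinary least-squares fit, with the change in quantitation grade as the outcome and the delay between examinations as the predictor. The function name and the example values are invented for illustration.

```python
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = intercept + slope * x; returns (slope, intercept)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples with at least two points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented data: delay in days between examinations, and change in grade
# (later grade minus earlier grade) for the same parasite in the same set.
delay_days = [14.0, 45.0, 90.0, 150.0, 180.0]
grade_change = [0.0, -1.0, 0.0, 1.0, 0.0]

slope, intercept = fit_line(delay_days, grade_change)
```

A slope near zero would be consistent with no deterioration over the storage interval; a clearly negative slope would suggest that organisms become harder to detect with time.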
In order to evaluate whether the presence of multiple protozoa in one specimen affected concordance, we created a "competition index." This is a simple but unvalidated measure for assessing the average burden of protozoa other than the targeted species. We are not aware of any similar measure that has been described previously. This measure was defined for any group of specimens as the sum of the quantitation values (1+ to 5+) for all parasites other than the targeted species (pathogens and nonpathogens) divided by the number of specimens in the group. For each parasite, we compared the competition index for the group of concordant specimens with that for the group of nonconcordant specimens using the resubmitted specimen subsets (since the initial specimen needed to be slightly diluted as described above in order to reconstitute the study specimens).
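As a concrete reading of this definition, the competition index for a group is the total grade of all non-target organisms divided by the number of specimens. A minimal sketch, assuming each specimen is a dictionary of species name to integer grade (1 to 5 for 1+ to 5+); the function name and the example group are hypothetical:

```python
from typing import Dict, List

Grades = Dict[str, int]  # species -> quantitation grade, 1..5 for 1+ to 5+ (assumed encoding)


def competition_index(group: List[Grades], target: str) -> float:
    """Sum of grades for all species other than `target`, divided by the number of specimens."""
    if not group:
        raise ValueError("group is empty")
    other_burden = sum(
        grade
        for specimen in group
        for species, grade in specimen.items()
        if species != target
    )
    return other_burden / len(group)


# Invented group of three specimens, evaluated with G. lamblia as the target.
example_group = [
    {"G. lamblia": 2, "D. fragilis": 3},
    {"G. lamblia": 1},
    {"G. lamblia": 4, "E. histolytica/E. dispar": 2, "D. fragilis": 1},
]

index = competition_index(example_group, "G. lamblia")
```

For this group the non-target grades sum to 6 across 3 specimens, giving an index of 2.0. Computing the index separately for concordant and nonconcordant groups gives the comparison described above.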
TABLE 1. Frequency of parasites found in the routine parasitology laboratory workload over 24 months
FIG. 1. Distribution of protozoal concentrations among study specimens compared to that among all specimens received in the lab over a similar time period. Results for each protozoan are shown separately. (a) E. histolytica/E. dispar; (b) Giardia; (c) D. fragilis.
TABLE 2. Percent concordance of results in pairwise comparisons
FIG. 2. Effect of protozoal concentration on concordance between the resubmitted paired subset and the pooled subset. EH, E. histolytica; GL, G. lamblia; DF, D. fragilis.
Table 3 shows the concordance of negative specimens. Rarely, a parasite was found in a resubmitted or pooled specimen that was not found in the original specimen. These protozoa were generally nonpathogenic (i.e., not E. histolytica/E. dispar, G. lamblia, or D. fragilis).
TABLE 3. Results of initially negative specimens upon resubmission and pooling
Finally, we evaluated the effect of pooling the two specimens in the resubmitted subset into one. Concordance was high, both for pathogenic and for nonpathogenic protozoa (Fig. 3). When specimens were not concordant, the pooled specimen was positive slightly more often than the paired specimen.
FIG. 3. Percentage of submissions positive in both the resubmitted paired subset and the pooled subset, or positive only in one or the other, for each parasite.
Our program does allow internal validation of specimens for laboratories with more than one trained microscopist. Discrepant results can also be analyzed according to individual technologists. This can allow an assessment of competence for individual technologists.
The process of preparing blinded specimens requires some dilution of the original specimens. This appeared to have an effect on the reproducibility of G. lamblia and D. fragilis specimens at the lowest concentrations. This problem was particularly marked for D. fragilis: specimens with quantitations of 2+ tended to be poorly reproducible after dilution. There were fewer specimens with low concentrations of E. histolytica/E. dispar, making it difficult to confirm this effect for this pathogen. However, concordance for all specimens was considerably higher when only the diluted specimens (resubmitted and pooled) were compared with each other. Laboratories should carefully monitor the parasite concentration in selected specimens, especially the proportion with a concentration of 2+ for G. lamblia or 3+ for D. fragilis, in this type of reproducibility assessment. Ideally, resubmitting specimens twice, as we have done, allows only resubmitted specimens to be compared with each other, eliminating any effect of dilution. Specimens that otherwise reflect the distribution of parasite concentrations in routine specimens would allow the most accurate evaluation of the reproducibility of laboratory results.
Test specimens prepared in SAF appear to be stable over time. Specimens can be resubmitted into the routine workflow as long as 6 months after the original submission without noticeable deterioration. This facilitates the organization of a QA program and theoretically would allow the same specimens to be resubmitted more than once.
The blinding of specimens in this study was imperfect. Due to technical accessioning issues, technologists were sometimes able to guess which specimens were resubmissions. Nevertheless, there was no way for technologists to know how a resubmitted specimen was initially reported. It is possible that these specimens received greater than average attention. However, technologists could not distinguish individual specimens belonging to the resubmitted subset pair from pooled specimens, so these comparisons were well blinded. Removal of a portion of the specimen prior to any processing would allow more effective blinding but would greatly increase the workload in comparison to selecting specimens for resubmission only after initial results are available.
We believe these data, though derived from a modest number of specimens in a single laboratory, provide a useful starting benchmark for other parasitology labs. Given that our lab is based in a large academic center and employs highly trained and experienced microscopists, we suggest that reproducibility in the range of 80%, and perhaps a bit higher for G. lamblia, is a reasonable threshold for satisfactory performance in laboratories with a specimen profile and testing procedures comparable to ours. With time and repeated evaluations, a laboratory will be able to generate summary statistics to refine the assessment of reproducibility.
Compared to most other procedures in the diagnostic microbiology lab, stool microscopy is heavily labor intensive. Pooling two homogenized stool suspensions could save nearly half this time. In agreement with the findings of other groups, we found that examination of a single pooled specimen was as sensitive for detecting the presence of protozoa (both pathogenic and nonpathogenic) as independent examination of the same specimens.
This project focused on the most common protozoal pathogens. A broader goal would be to ensure that technologists can detect and differentiate any fecal protozoan, several of which may be present in clinical specimens more frequently than the pathogens described here. The techniques described here can be used to evaluate result reproducibility for other protozoa. A preliminary analysis of our own data suggests that results for the most common nonpathogenic protozoa resemble those we describe for the major pathogens.
In summary, current laboratory standards require accurate assessments of the quality of analytic procedures. Stool microscopy for protozoa is one of the most technically demanding procedures in the diagnostic microbiology lab, with a necessarily subjective interpretation. Training, review of competency, and proficiency testing all help reduce the degree of subjectivity. The high complexity and inherent variability of this process further underline the need for quality audits. Nevertheless, evidence-based guidance on QA in this area is lacking. We propose a supplementary tool for use within a QA program, and we provide examples of potential performance benchmarks. We also provide data confirming that pooling pairs of stool specimens for microscopy is likely to be cost-effective. Given the potential for wide variations in lab performance, these procedures should be validated before implementation in other laboratories.
Published ahead of print on 30 April 2008.