Differential Detection of Human Papillomavirus Genotypes and Cervical Intraepithelial Neoplasia by Four Commercial Assays

Laboratories now can choose from >100 human papillomavirus (HPV) assays for cervical screening. Our previous analysis based on the data from the Danish Horizon study, however, showed that four widely used assays, Hybrid Capture 2 (HC2), cobas, CLART, and Aptima, frequently do not detect the same HPV infections. Here, we determined the characteristics of the concordant samples (all four assays returning a positive HPV test result) and discordant samples (all other HPV-positive samples) in primary cervical screening at 30 to 65 years of age (n = 2,859) and in a concurrent referral population from the same catchment area (n = 885). HPV testing followed the manufacturers' protocols. Women with abnormal cytology were managed according to the routine recommendations. Cytology-normal/HPV-positive women were invited for repeated testing in 18 months. Screening history and histologically confirmed cervical intraepithelial neoplasia (CIN) in 2.5 years after the baseline testing were determined from the national pathology register. HPV-positive women undergoing primary screening having concordant samples were more likely to harbor high-risk infections and less likely to harbor only low-risk infections than women with discordant samples. Additionally, assay signal strengths were substantially higher in concordant samples. More than 80% of ≥CIN2 results were found for women with concordant samples, and no ≥CIN2 results were found when the infection was detected by only one assay. These patterns were similar in the referral population despite the younger age and higher number of HPV infections. HPV test result discordance identified a cluster of low-risk HPV infections that were hardly ever associated with high-grade CIN and, almost exclusively, represented false-positive screening findings.

Human papillomavirus (HPV) assays will be used for cervical cancer screening (to select women with a high risk of high-grade cervical intraepithelial neoplasia [CIN], termed ≥CIN2) and for monitoring of the effectiveness of HPV vaccination (to determine the changes in infection epidemiology). Currently, laboratories can choose from >100 commercially available HPV assays (1). These assays utilize different technologies to detect HPV infections (2). The randomized controlled trials of primary screening studied only two assays, Hybrid Capture 2 (HC2) and GP5+/6+ PCR (3). Therefore, it has been recommended that other assays should be compared to them in terms of the detection of HPV infections and CIN (4).
In the Danish split-sample Horizon study comparing HC2 (Qiagen, Gaithersburg, MD), cobas (Roche, Pleasanton, CA), CLART (Genomica, Madrid, Spain), and Aptima (Hologic, San Diego, CA) HPV assays, the detection of HPV infections depended on the assay. This was particularly pronounced in women undergoing primary cervical screening at ≥30 years, where only 29% of all positive test results were concordant on all four compared assays (5). The remaining 71% (partially) discordant positive test results reflected infections in 16% of all screened women tested with the four assays. For comparison, only 4% of all screened women had abnormal cytology, indicating that HPV assay discordance is not of trivial magnitude. The same pattern was seen in studies undertaken in populations with substantially different HPV prevalence rates and using other sample storage media (5).
This unexpected finding warrants an additional in-depth analysis, as it may have implications for the two intended uses of HPV assays. Hence, our aim here was to study an infected woman's likelihood of harboring high-risk HPV and high-grade CIN depending on the degree of the discordance between the HPV test results. To gain insight into the patterns across various populations, we studied two groups of women: those undergoing primary screening and those with (recent) abnormalities (i.e., a referral population).

MATERIALS AND METHODS
Study design. The study design was described in detail previously (5–11). In short, consecutive routine SurePath samples from 5,034 women arriving at the Department of Pathology of Copenhagen University Hospital Hvidovre in June to August 2011 were collected and tested with liquid-based cytology (LBC; reported using the Bethesda 2001 system) and the four HPV assays (this constituted the baseline testing). By linkage to the Danish National Pathology Register (Patobank) (12) from 1 January 2000 onwards, the samples were categorized as primary (screening) samples or as samples taken for follow-up of recent abnormalities. Screening samples were defined as those without a previous histological diagnosis of cervical cancer, histologically confirmed CIN in ≤3 years, atypical squamous cells of undetermined significance (ASCUS) or non-CIN cervical histology in ≤15 months, or a more severe cytological abnormality, inadequate cytology, or positive HPV test in ≤12 months. Reflecting routine practice, these samples included a small proportion of samples taken for investigation of symptoms. All other samples were follow-up samples.
Women with abnormal cytology were monitored according to the routine guidelines in use for the laboratory's catchment area (colposcopy was performed if the women had high-grade squamous intraepithelial neoplasia [HSIL] or worse, atypical glandular cells [AGC], atypical squamous cells not excluding HSIL [ASC-H], adenocarcinoma in situ [AIS], HC2-positive ASCUS at ≥30 years, abnormal repeated testing following ASCUS at <30 years, low-grade squamous intraepithelial lesions [LSIL], or HC2-negative ASCUS at ≥30 years). Women with normal cytology and a positive test result on at least one HPV assay were invited, according to the study protocol, for repeated cytology and HPV testing in 18 months after the baseline. Follow-up samples with abnormal cytology or a positive HC2 test result elicited a referral for colposcopy. All colposcopies were undertaken under routine conditions either by a hospital or privately practicing gynecologists. In Denmark, it is recommended that directed biopsy specimens be taken from all suspicious areas after application of acetic acid and random biopsy specimens from four quadrants if lesions are not visible. The Hvidovre laboratory evaluated almost 90% of the high-grade CIN biopsy specimens included in the study. During the study period, between four and seven pathologists evaluated cervical cytology and gynecological histology in this laboratory. Most samples were read by one pathologist. p16 staining was not used systematically. The remaining approximately 10% of high-grade CIN biopsy specimens were evaluated in other Danish hospitals or by private pathologists. Data on the most severe histological diagnosis during follow-up were retrieved from the national Patobank in December 2013, covering the period of about 2.5 years after baseline. Follow-up was highly complete for women with abnormal baseline cytology (ca. 95%) and moderately complete (ca. 60%) for women with cytology-normal/HPV-positive test results (6).
HPV testing. All assay testing and sample storage were undertaken in strict accordance with the protocols agreed upon by all manufacturers prior to the study (see the supplemental material). The instrumentation and software were used as supplied and maintained by the manufacturers. HC2 testing was undertaken on the postquot LBC material. cobas, CLART, and Aptima testing was undertaken on the original residual material, diluted approximately 1:1 in SurePath. HC2 detects 13 high-risk HPV genotypes collectively. The assay is based on hybridization of HPV DNA to a high-risk HPV RNA probe cocktail. cobas is a real-time PCR analysis detecting the 13 high-risk HPV genotypes and HPV 66. The assay separately identifies HPV 16 and HPV 18, while the remaining 12 genotypes are detected collectively. The CLART assay is a PCR-based low-density microarray detecting 35 defined genotypes, including the 13 high-risk genotypes. All genotypes are reported individually. Aptima detects E6/E7 mRNA expression of the 13 high-risk HPV types and HPV 66 collectively. HC2, cobas, and Aptima are FDA-approved assays and have also been validated according to the protocol defined by the international assay validation guidelines (2,4).
Statistical analysis. The primary screening population was defined as women with screening samples at 30 to 65 years of age without invalid HPV test results (n = 2,859, 56.8% of all 5,034 included women; 10 were excluded because of invalid HPV test results). The referral population included women with follow-up samples and women with screening samples showing abnormal cytology at any age, excluding women with invalid HPV test results (n = 885, 17.6%).
HC2 was positive when the relative light units per cutoff (RLU/CO) value was ≥1. cobas was positive when channels 16, 18, and/or other high-risk genotypes had critical threshold (CT) values of ≤40.5, ≤40.0, and ≤40.0, respectively (for samples where >1 channel returned a positive test result, we considered the channel with the strongest signal, i.e., the lowest CT value). Aptima was considered positive when the signal-to-cutoff (S/CO) value was ≥0.5. CLART was considered positive when ≥1 high-risk HPV genotype was detected. In line with the classification by the International Agency for Research on Cancer, genotypes 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, and 68 were considered high risk (13).
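The positivity rules above can be written out as a short Python sketch. This is an illustration only, not the study's analysis code (the analyses were run in SPSS); the `Sample` structure, the channel names, and the function names are hypothetical and introduced here for clarity. The cutoffs and the high-risk genotype list are those stated in the text.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional

# The 13 genotypes treated as high risk in the text (IARC classification).
HIGH_RISK: FrozenSet[int] = frozenset(
    {16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 68}
)

# cobas CT cutoffs per channel as stated in the text; the channel names are hypothetical.
COBAS_CT_CUTOFFS: Dict[str, float] = {"hpv16": 40.5, "hpv18": 40.0, "other_hr": 40.0}


@dataclass
class Sample:
    """Hypothetical container for one sample's raw assay readouts."""
    hc2_rlu_co: float                      # HC2 relative light units per cutoff
    cobas_ct: Dict[str, Optional[float]]   # CT per channel, None if no signal
    aptima_s_co: float                     # Aptima signal-to-cutoff
    clart_genotypes: FrozenSet[int]        # genotypes reported by CLART


def cobas_positive_cts(sample: Sample) -> Dict[str, float]:
    """Return the CT values of the cobas channels at or below their cutoff."""
    return {
        ch: ct
        for ch, ct in sample.cobas_ct.items()
        if ct is not None and ct <= COBAS_CT_CUTOFFS[ch]
    }


def assay_calls(sample: Sample) -> Dict[str, bool]:
    """Apply the four positivity rules described in the text."""
    return {
        "HC2": sample.hc2_rlu_co >= 1.0,
        "cobas": bool(cobas_positive_cts(sample)),
        "Aptima": sample.aptima_s_co >= 0.5,
        "CLART": bool(sample.clart_genotypes & HIGH_RISK),
    }


def n_positive(sample: Sample) -> int:
    """Number of assays (0 to 4) returning a positive result."""
    return sum(assay_calls(sample).values())


def strongest_cobas_ct(sample: Sample) -> Optional[float]:
    """Lowest CT among positive cobas channels (the strongest signal), or None."""
    cts = cobas_positive_cts(sample)
    return min(cts.values()) if cts else None
```

In this sketch, a sample counts as concordant when `n_positive` returns 4 and as discordant when it returns 1, 2, or 3.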
We categorized samples by the number of positive HPV test results (one through four). For each category, we determined the proportions of samples with HPV 16 infections, with infections with the other 12 high-risk genotypes (excluding HPV 16), with only low-risk HPV infections, and without a single detected HPV genotype (from among 35 detectable by CLART); these proportions were mutually exclusive. Separately, we determined the proportions of high-risk samples with multiple infections (defined as detection of ≥2 genotypes with or without low-risk genotypes) and with abnormal cytology (≥ASCUS). Genotypes were reported as detected by CLART. Trends in the proportions of genotype detection by category were tested with the Mantel-Haenszel χ² test for trend. We determined the median signal strengths and the associated interquartile ranges (IQR) on the HC2, cobas, and Aptima assays for samples that tested positive on only one assay compared to samples that tested positive on all four assays. The 95% confidence intervals (CI) for relative proportions (RP) comparing screening and referral populations were calculated assuming lognormal distribution. Analyses were undertaken using IBM SPSS Statistics, version 19.

… (Fig. 1), and combined they accounted for 258 (39.6%) of all positive test results in the study. In 558 women with positive HPV test results and normal cytology, 125 women (22.4%) tested positive on all four assays. In 93 women with positive HPV test results and abnormal cytology, all four assays tested positive in 63 women (67.7%).
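The mutually exclusive genotype categories defined in the statistical analysis can be illustrated with a minimal Python sketch. The function names are hypothetical, and two points are assumptions rather than statements from the text: that HPV 16 takes precedence over the other high-risk genotypes when both are present, and that any CLART genotype outside the 13 listed high-risk genotypes is counted as low risk.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Set, Tuple

# The 13 genotypes treated as high risk in the text.
HIGH_RISK = frozenset({16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 68})


def genotype_category(clart_genotypes: Iterable[int]) -> str:
    """Assign one of four mutually exclusive categories to a sample.

    Assumed hierarchy: HPV 16, then other high-risk, then low-risk only.
    """
    genotypes = set(clart_genotypes)
    if 16 in genotypes:
        return "HPV 16"
    if genotypes & HIGH_RISK:
        return "other high-risk"
    if genotypes:
        return "low-risk only"
    return "no genotype detected"


def proportions_by_n_positive(
    samples: Iterable[Tuple[int, Set[int]]]
) -> Dict[int, Dict[str, float]]:
    """Tabulate category proportions per number of positive assays.

    `samples` holds (number of positive assays, CLART genotypes) pairs.
    """
    counts: Dict[int, Counter] = defaultdict(Counter)
    for n_pos, genotypes in samples:
        counts[n_pos][genotype_category(genotypes)] += 1
    return {
        n_pos: {cat: c / sum(ctr.values()) for cat, c in ctr.items()}
        for n_pos, ctr in counts.items()
    }
```

Because each sample falls into exactly one category, the proportions within each number-of-positive-assays group sum to 1.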

Screening
Women with a positive test result on only one HPV assay were less likely to have HPV 16 infections, infections with other high-risk genotypes, multiple HPV infections, and abnormal cytology than women with a positive test result on >1 HPV assay (P < 0.001). Their samples were also more likely to contain only low-risk or no HPV genotype (P < 0.001). As this comparison could have been biased by the double role of the CLART assay (i.e., being one of the compared assays at the same time as having the role of adjudicating HPV genotypes), we additionally studied the concordance comparing only the HC2, cobas, and Aptima assays, with CLART used exclusively for an adjudication of the genotypes present in the sample. This comparison revealed essentially the same patterns (see Table S1 in the supplemental material).
Signal strength values indicated that samples with only one positive HPV test result were relatively weakly positive compared to the samples from women with four positive test results, and their medians were close to the manufacturer-recommended threshold cutoffs for positive test results (Table 2). The median signals in samples with all four assays returning a positive test result were, on the other hand, substantially different from the manufacturers' threshold values. The IQRs for the weakly and strongly positive categories of samples did not overlap.
Referral population. The mean age of the 885 women was 34.0 years (range, 18 to 89 years; SD, 10.8 years), and 519 (58.6%) had HPV infections detected on one or more assays. Of the 519 samples with at least one positive HPV test result, 89 (17.1%) were positive on only one, and 290 (55.9%) were positive on all four assays.
These HPV-positive women differed from those in the screening population. They were more likely than the screening population to be infected with HPV 16 (RP, 1.66; 95% CI, 1.31 to 2.10; calculated as 131/519 versus 99/651) (Table 1) and have multiple HPV infections (RP, 1.43; 95% CI, 1.27 to 1.63; calculated as 286/519 versus 250/651). These women were less likely to have no genotypes detected in their samples (RP, 0.37; 95% CI, 0.26 to 0.52; calculated as 37/519 versus 126/651). In this comparatively young and high-risk population, the median assay signal strengths were stronger and the concordance in detecting infections between the four HPV assays was better than that in the screening population. Nevertheless, also here the median assay signals were stronger for women with concordant HPV test results than for women with discordant HPV test results (Table 2).
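The relative proportions and their confidence intervals can be reproduced from the counts given above. The following Python sketch assumes the standard log-normal (Katz) approximation for the ratio of two independent proportions, which is one common reading of "assuming lognormal distribution"; the function name is hypothetical, and the study itself used SPSS.

```python
import math
from typing import Tuple


def relative_proportion_ci(
    a: int, n1: int, b: int, n2: int, z: float = 1.959964
) -> Tuple[float, float, float]:
    """Ratio of proportions (a/n1)/(b/n2) with a log-normal 95% CI.

    The standard error is computed on the log scale:
    SE = sqrt(1/a - 1/n1 + 1/b - 1/n2).
    """
    rp = (a / n1) / (b / n2)
    se_log = math.sqrt(1 / a - 1 / n1 + 1 / b - 1 / n2)
    lower = math.exp(math.log(rp) - z * se_log)
    upper = math.exp(math.log(rp) + z * se_log)
    return rp, lower, upper


# HPV 16: referral (131/519) versus screening (99/651)
rp, lower, upper = relative_proportion_ci(131, 519, 99, 651)
print(f"RP {rp:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Under this assumption, the three comparisons in the text (131/519 versus 99/651, 286/519 versus 250/651, and 37/519 versus 126/651) give values that round to the reported RPs and confidence limits.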

Main findings. Screening samples in which the HC2, cobas, CLART, and Aptima assays showed HPV status discordance represented a cluster with a lower risk of ≥CIN2 than samples in which all four assays detected HPV infections. Discordant samples had weaker assay signals, were less likely to contain high-risk genotypes, multiple infections, and high-grade CIN, and more likely to harbor only low-risk or no HPV genotype. These differences between weakly and strongly positive samples were also found in the referral population with a lower average age and a higher prevalence of high-risk infections, suggesting that our findings can be generalized across various populations.
Strengths and weaknesses of the study. The study was undertaken on fresh routine samples. All samples were handled in one laboratory by the same staff. By having access to the women's screening histories, we could determine which samples were taken for screening purposes, even though this information was not registered routinely. This is one of a few studies that compared several HPV assays on primary screening samples with active follow-up of women with positive HPV test results and normal cytology. Although the follow-up with repeated testing was incomplete, it was very similar to that observed in the randomized controlled trials comparing HPV-based to cytology-based primary cervical screening (14). Nevertheless, a somewhat higher number of high-grade CIN would plausibly be detected with a more complete follow-up (6).
There is no internationally agreed standard reference HPV genotyping assay. The research-only Linear Array (LA) was among the assays most frequently used for genotyping in previous studies. However, LA test results are read manually and may be prone to interobserver variability. We genotyped the infections using the CE-IVD-marked CLART assay, which interprets the test results using a built-in software algorithm. In some studies (15–18) but not all (19), CLART and LA showed relatively good concordance. In our study of women undergoing primary screening, CLART detected a number of high-grade lesions similar to that of the HC2 and cobas assays. However, like cobas, it missed one of the three cases of cervical cancer (8).
Comparison with other studies. Taken together, the Horizon data are in good agreement with prior studies. Our summary of the literature comparing HPV detection by various assays found similarities between our SurePath-based findings and those from studies using other liquid media (5). More recently, Cook et al. compared HC2 and cobas on ThinPrep samples from the Canadian FOCAL randomized controlled trial of primary screening and observed a large difference in signal strength between samples where both assays tested positive (median RLU/CO, 54.9; median CT, 30.2) and samples where the assays returned different test results (HC2+/cobas− median RLU/CO, 5.9; HC2−/cobas+ median CT, 38.3) (20). Previous screening studies showed high levels of concordance between assays in terms of the detection of high-grade CIN, which corresponds to a substantially higher likelihood of detecting these lesions when multiple assays return a positive test result (Table 3). Nevertheless, it should be noted that only two studies reported CIN lesions detected jointly by cytology and two HPV assays, whereas the remaining studies reported CIN lesions detected only by cytology and one HPV assay. This potentially means that some of the variability in detecting CIN that was observed in our study, where all CIN were detected by cytology and/or up to four HPV assays, could not have been observed in previous studies. Several studies, including ours, however, reported relatively small numbers of high-grade CIN.
Clinical implications. The concordance between the assays increased with the severity of the underlying biology. The percentage of HPV-positive women testing positive on all four HPV assays increased from 22% in women with normal cytology to 68% in women with abnormal cytology and to 84% in women with ≥CIN2. The high level of concordance in detecting high-grade CIN indicates that the choice of an HPV assay for screening will have little effect on the high-risk women who should be recommended for treatment to avoid progression to cervical cancer. The likelihood that they will be detected through screening is high with any of the assays evaluated here, which is consistent with a relatively high clinical sensitivity for each assay.
On the other hand, false-positive screening tests, i.e., positive HPV test results with harmless infections that do not lead to high-grade CIN, represent clinically inconsequential noise. They represent a challenge for primary HPV-based screening even in women above 30 years of age (25), and this calls for further optimization of HPV assays. False positivity appears to be a hallmark of the weakly positive and discordant screening samples. Our data suggested that women with a single positive HPV test result are unlikely to harbor ≥CIN2, but they represented 40% of all HPV-positive women in our study, corresponding to ca. 10 to 20% of positive test results on each assay.
As their infections will be detected by some but not other assays, healthy women with false-positive test results will be affected by the choice of an HPV assay for primary screening. These women will be recommended for (unnecessary) follow-up, possibly including colposcopy. This brings into focus the question of whether the management recommendations for HPV-positive women should be the same irrespective of which assay detected the infection. Studies agree that, regardless of the assay used for primary screening, HPV-positive women should not be directly referred for colposcopy (26–28). For HC2-positive women from the Dutch VUSA-screen study, Rijkaart and colleagues proposed using cytological triage and repeating cytology testing at 12 months postbaseline for triage-negative women (26). This triage strategy had a negative predictive value for ≥CIN3 cases of >99% and required the lowest number of women referred for colposcopy. In this setup, the addition of HPV genotyping would lead to a higher cumulative number of colposcopies but would not significantly decrease the risk of missing ≥CIN3 cases. On the other hand, Iftner and colleagues, using data from the German Aptima- and HC2-based screening study (27), and Wright and colleagues, using data from the U.S.-based ATHENA study evaluating the cobas assay (28), found that optimal triage strategies seemed to involve HPV 16/18 genotyping at baseline. In these two studies, follow-up testing of triage-negative women could not be further evaluated, as immediate colposcopy was recommended to all HPV-positive women (for study purposes). Differences in study populations and designs may, to some extent, help explain the differences in the conclusions on the optimal management strategy for HPV-positive women. Another explanation is that the different selections of HPV-positive women chosen for follow-up by the different assays require adaptations in the clinical management. Given its relevance for policy-making in screening, this remains to be studied in more detail.
One of the indicators for monitoring the implementation of HPV vaccination is a change in the epidemiology of HPV genotypes (29). Our study suggested that none of the assays detects all HPV infections. The discordance between the assays in detecting the virus was observed also at the genotype level. For example, only 59% of all HPV 16 infections detected by either cobas or CLART were concordant on the two assays (data not reported). For HPV 18, this was 69%. Thus, to reliably attribute changes in HPV epidemiology to the use of the vaccine, it will be necessary to maintain consistency in the choice of the HPV assay.
In conclusion, discordance between multiple HPV assays in detecting HPV infections identified a cluster of weakly positive infections that are less frequently associated with HPV 16 and 12 other high-risk genotypes. As these samples also hardly ever harbored high-grade CIN, they were typically associated with false positivity in screening.