**DOI:** 10.1128/JCM.00624-11

## ABSTRACT

We propose a new coefficient, the adjusted Wallace coefficient (*AW*), and corresponding confidence intervals (CI) as quantitative measures of congruence between typing methods. The performance of the derived CI was evaluated using simulated data. Published microbial typing data were used to demonstrate the advantages of *AW* over the Wallace coefficient.

## TEXT

Several molecular epidemiology studies of clinically relevant microorganisms provide a characterization of isolates based on different typing methods (3, 5, 7). The informed choice of which typing method is more appropriate in a given clinical or microbiological research setting lies in the ability of the method to identify isolates of interest, the execution time, the cost-effectiveness, and the ease of interpretation of the results (16). Nevertheless, to support the decision, a quantitative comparison of the results of the typing methods should also be performed (3).

Carriço et al. (3) proposed the use of the adjusted Rand coefficient (*AR*) and the Wallace coefficient (*W*) as measures to assess the congruence of typing methods. These have been applied in several studies comparing or proposing new typing methods (2, 3, 7, 9, 12, 17). *AR* provides a measure of the overall agreement between two typing methods and corrects the previously used coefficient of typing concordance (18) for chance agreement, avoiding the overestimation of concordance between typing methods (8). *W* provides information about the directional agreement between typing methods. $W_{A \to B}$ is the probability that, for a given data set, two individuals are classified together using method *B* if they have been classified together using method *A*. In spite of its simple interpretation, one can obtain high values of *W* due to chance alone. For instance, if method *A* creates a high number of partitions (such as pulsed-field gel electrophoresis [PFGE] subtypes) and method *B* creates only two (such as the presence or absence of a given gene), $W_{A \to B}$ will be high but may not be different from the value expected by chance alone.

The expected Wallace coefficient under independence ($W_i$) was previously proposed to evaluate whether the results of two typing methods could agree by chance alone (11). To assess whether the estimated *W* value is significantly different from the $W_i$ value, one can use the proposed Wallace 95% confidence interval (CI) (11, 13). If the value of $W_i$ is within the CI of *W*, the null hypothesis of independence between classifications cannot be rejected with the respective confidence level (11). One way to directly take into account $W_i$ would be to calculate an adjusted version of *W*.

Albatineh et al. (1) had previously discussed the correction for chance agreement for several similarity indices, including *W*. Although this correction was never applied in the context of microbial typing studies, others have previously acknowledged the importance and usefulness of such a correction (15).

Derivation of *AW*. The adjusted Wallace coefficient (*AW*) is derived by following an approach similar to that used for *AR* (8):

$$AW_{A \to B} = \frac{W_{A \to B} - W_{i(A \to B)}}{1 - W_{i(A \to B)}}$$

where $W_{i(A \to B)} = 1 - \mathrm{SID}_B$ (11) and $\mathrm{SID}_B$ is Simpson's index of diversity of the *B* classification (14).
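As an illustration of the definition $AW_{A \to B} = (W_{A \to B} - W_{i(A \to B)}) / (1 - W_{i(A \to B)})$, the following minimal Python sketch computes *W*, $W_i$, and *AW* from two lists of type assignments. It assumes the usual pair-counting form of *W* (pairs grouped together by both methods divided by pairs grouped together by *A*) and the sample form of Simpson's index of diversity. The function name `wallace_coefficients` and the eight toy isolates are hypothetical and are not taken from the data sets analyzed here.

```python
from collections import Counter
from math import comb


def wallace_coefficients(labels_a, labels_b):
    """Return (W, W_i, AW) for the direction A -> B.

    labels_a[k] and labels_b[k] are the types assigned to isolate k
    by methods A and B, respectively.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both methods must classify the same isolates")
    n = len(labels_a)

    # Number of isolate pairs grouped together by A, by B, and by both.
    pairs_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    pairs_ab = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())

    if pairs_a == 0:
        raise ValueError("method A groups no pair of isolates; W is undefined")

    w = pairs_ab / pairs_a
    # Expected W under independence: 1 - SID_B, with
    # SID_B = 1 - sum_j n_j (n_j - 1) / (N (N - 1)).
    w_i = pairs_b / comb(n, 2)
    if w_i == 1:
        raise ValueError("method B has a single type; AW is undefined")

    aw = (w - w_i) / (1 - w_i)
    return w, w_i, aw


# Hypothetical toy data: A is highly discriminatory, B has only two types.
method_a = ["a1", "a1", "a2", "a2", "a3", "a3", "a4", "a4"]
method_b = ["x", "x", "x", "x", "y", "y", "x", "y"]

w, w_i, aw = wallace_coefficients(method_a, method_b)
print(f"W = {w:.3f}, W_i = {w_i:.3f}, AW = {aw:.3f}")
# prints "W = 0.750, W_i = 0.464, AW = 0.533"
```

In this toy case, almost half of the apparent agreement is expected by chance because *B* has only two types, so *AW* is noticeably lower than *W*.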

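For a 95% CI, assuming a Gaussian distribution, the limits are given by

$$AW_{A \to B} \pm 1.96 \, \frac{\sqrt{\mathrm{Var}(W_{A \to B})}}{1 - W_{i(A \to B)}}$$

where $\mathrm{Var}(W_{A \to B})$ is the variance of $W_{A \to B}$. A detailed description of the derivation and evaluation of *AW* and the 95% CI formula is given in the Appendix.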
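The interval can be sketched in a few lines of Python, under the assumption stated in the Appendix that $\mathrm{Var}(AW) = \mathrm{Var}(W) / (1 - W_i)^2$. The variance of *W* is taken as an input because its formula is given in reference 11 and is not reproduced here. The function name `aw_confidence_interval` and the numbers in the usage lines are hypothetical.

```python
from math import sqrt


def aw_confidence_interval(w, w_i, var_w, z=1.96):
    """Return (AW, lower, upper) for a Gaussian CI around AW.

    w     : Wallace coefficient W for A -> B
    w_i   : expected W under independence (1 - SID_B), treated as constant
    var_w : variance of W, computed elsewhere (see reference 11)
    z     : Gaussian quantile; 1.96 gives a 95% CI
    """
    if not 0 <= w_i < 1:
        raise ValueError("W_i must lie in [0, 1)")
    if var_w < 0:
        raise ValueError("a variance cannot be negative")

    aw = (w - w_i) / (1 - w_i)
    # Var(AW) = Var(W) / (1 - W_i)^2, so the standard error scales by 1 / (1 - W_i).
    half_width = z * sqrt(var_w) / (1 - w_i)
    return aw, aw - half_width, aw + half_width


# Hypothetical inputs, chosen only to show the call.
aw, lower, upper = aw_confidence_interval(w=0.75, w_i=0.5, var_w=0.0025)
print(f"AW = {aw:.3f} (95% CI, {lower:.3f} to {upper:.3f})")
```

Because the standard error of *W* is divided by $1 - W_i$, the CI of *AW* is always at least as wide as the CI of *W*, and it widens as $W_i$ increases.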

Evaluation of *AW*. In order to illustrate the importance of the correction for chance agreement, we analyzed representative results from previously published data sets. The data sets used were results for 325 macrolide-resistant *Streptococcus pyogenes* isolates (group A streptococci [GAS]) (3) and 116 methicillin-resistant *Staphylococcus aureus* (MRSA) isolates (5) characterized by several typing methods. The data are summarized in Table 1.

For both data sets, only two of the calculated 95% CIs for *W* included the respective $W_i$. In all the other comparisons, the congruence between typing methods could not be attributed to chance alone. However, some higher values of $W_i$ are observed for the methods with less discriminatory power (lower $\mathrm{SID}_B$ values), such as the macrolide resistance phenotype (MRP) method for the GAS data set. For high $W_i$ values, a large part of the agreement that is being measured by *W* is due to chance. This could lead to more distinct differences between *W* and *AW*, as illustrated in the following examples. The value of $AW_{\text{T typing} \to \text{MRP method}}$ of 0.46 (95% CI, 0.33–0.58) is considerably lower than the agreement measured by $W_{\text{T typing} \to \text{MRP method}}$ of 0.72 (95% CI, 0.66–0.79) (Table 1). The reverse relationship, the ability of MRP to predict T types, was similarly affected: $W_{\text{MRP method} \to \text{T typing}}$ is 0.41 and $AW_{\text{MRP method} \to \text{T typing}}$ is 0.18, with CIs that do not overlap. A similar decrease was also observed in comparing the MRP method to PFGE(SfiI68), in which clusters were defined as groups of isolates sharing at least 68% similarity in the unweighted-pair group method using average linkages (UPGMA)/Dice dendrogram of PFGE profiles upon SfiI digestion. In the MRSA data set, even more distinct differences were observed for $W_{\text{eBURST} \to \text{SCC}\mathit{mec}\text{ typing}}$. In the extreme case in which the 95% CI of *W* includes the $W_i$ value (for instance, in the case of $W_{\text{SCC}\mathit{mec}\text{ typing} \to \text{eBURST}}$), *AW* will be very close to zero, providing a clear indication that the agreement between typing methods is due to chance. However, such marked differences between *W* and *AW* are not universal. For several typing methods, only small decreases of *AW* relative to *W* were noted, for instance, in the case of $W_{\text{T typing} \to \text{emm typing}}$ for GAS or $W_{\text{BURP} \to \text{eBURST}}$ for MRSA isolates (Table 1). More predictably (see the Appendix), with *W* values close to 1, the *AW* value did not differ much from *W*. This can be observed for $W_{\text{PFGE(SfiI68)} \to \text{MRP method}}$ in the GAS study and $W_{\text{spa typing} \to \text{eBURST}}$ in the MRSA study.

In order to facilitate the use of *AW* and respective CIs for the comparison of typing methodologies, these can be calculated with a freely accessible online tool at www.comparingpartitions.info. Bionumerics scripts for the calculation of *AW* and the respective CI are also available at this website.

Conclusion. It is important to distinguish the actual ability to predict the classification produced by a given typing method from the results obtained with another method from congruence of classification that could arise by chance. *AW* provides such a correction to the now widely used *W*. A drawback of this approach is that, unlike *W*, the *AW* value can no longer be interpreted directly as a probability, since the value of *W* is transformed by the correction for chance agreement. Nevertheless, we recommend the use of *AW* over *W* since it avoids the overestimation of unidirectional concordance between typing methods, just as *AR* does for bidirectional concordance. These coefficients and their respective CIs are tools for an effective comparison of the results of different molecular typing studies, providing a better evaluation of the strengths and weaknesses of each study and of each typing method.

## APPENDIX

### Derivation of *AW*.

The derivation of *AW* is analogous to that of *AR* (8). The general form of an index corrected for chance agreement (8) is as follows:

$$\frac{\text{Index} - \text{Expected index}}{\text{Maximum index} - \text{Expected index}}$$

Given a maximum *W* of 1,

$$AW_{A \to B} = \frac{W_{A \to B} - W_{i(A \to B)}}{1 - W_{i(A \to B)}}$$

As *W* approaches 1, the correction for chance agreement, measured as the difference between *W* and *AW*, approaches 0. For smaller values of *W*, $W_i$ may approach the value of *W*, resulting in stronger corrections and lower values of *AW*.
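The size of the correction can be made explicit with a short worked step, shown here only as an aid to the reader. Writing the adjusted coefficient as $AW = (W - W_i)/(1 - W_i)$ and dropping the $A \to B$ subscripts for brevity:

$$W - AW = W - \frac{W - W_i}{1 - W_i} = \frac{W(1 - W_i) - (W - W_i)}{1 - W_i} = \frac{W_i\,(1 - W)}{1 - W_i}$$

For $0 \le W_i < 1$, the difference therefore vanishes as $W \to 1$, grows with $W_i$ for a fixed $W < 1$, and equals *W* itself when $W = W_i$, in which case $AW = 0$.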

According to the expression

$$W_{i(A \to B)} = 1 - \mathrm{SID}_B$$

where $\mathrm{SID}_B$ is Simpson's index of diversity of the *B* classification (11), methods with lower diversity result in higher $W_i$ values and, therefore, the difference between *W* and *AW* increases.

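Considering the following properties for the variance of a variable *X*, where *c* is a constant,

$$\text{(a)}\quad \mathrm{Var}(X + c) = \mathrm{Var}(X) \qquad\qquad \text{(b)}\quad \mathrm{Var}(cX) = c^{2}\,\mathrm{Var}(X)$$

the variance of *AW* can be deduced:

$$\mathrm{Var}(AW_{A \to B}) = \mathrm{Var}\!\left(\frac{W_{A \to B} - W_{i(A \to B)}}{1 - W_{i(A \to B)}}\right) = \mathrm{Var}\!\left(\frac{W_{A \to B}}{1 - W_{i(A \to B)}} - \frac{W_{i(A \to B)}}{1 - W_{i(A \to B)}}\right)$$

Assuming that $W_{i(A \to B)}/(1 - W_{i(A \to B)})$ is constant, by property *a*,

$$\mathrm{Var}(AW_{A \to B}) = \mathrm{Var}\!\left(\frac{W_{A \to B}}{1 - W_{i(A \to B)}}\right)$$

Assuming that $W_{i(A \to B)}$ is constant, by property *b*,

$$\mathrm{Var}(AW_{A \to B}) = \frac{\mathrm{Var}(W_{A \to B})}{\left(1 - W_{i(A \to B)}\right)^{2}}$$

where $\mathrm{Var}(W_{A \to B})$ is the variance of $W_{A \to B}$, calculated as described in reference 11.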

For a 95% CI, assuming a Gaussian distribution, the limits are given by

$$AW_{A \to B} \pm 1.96 \, \frac{\sqrt{\mathrm{Var}(W_{A \to B})}}{1 - W_{i(A \to B)}}$$

This derivation considers $W_{i(A \to B)} = 1 - \mathrm{SID}_B$ to be constant and assumes a Gaussian distribution for *AW*. Both assumptions were already used in the derivation of the Wallace CI (11), and their validity was assessed by simulation of the sampling process as described previously (13) (Fig. A1). Briefly, population frequency tables (PFTs) were generated according to the parameters *R* (representing the number of rows), *C* (representing the number of columns), alpha (determining the distribution of cluster sizes in the rows), and beta (determining the distribution of the elements in each row across columns). By following a multinomial distribution for the absolute frequencies of each PFT, 1,000 contingency tables representing samples of *N* elements from the infinite population were randomly generated.
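The sampling step can be sketched as follows. This is a minimal illustration and not the simulation code used for Fig. A1: the PFT below is a hypothetical 3 × 2 table written by hand rather than generated from *R*, *C*, alpha, and beta, and the sketch only records the spread of *AW* across samples instead of computing CI coverage. The function names are assumptions made for this example.

```python
import random
import statistics
from collections import Counter
from math import comb


def adjusted_wallace_from_table(table):
    """AW for rows -> columns from a contingency table {(row, col): count}.

    Returns None when W or AW is undefined for the table.
    """
    n = sum(table.values())
    row_totals, col_totals = Counter(), Counter()
    for (row, col), count in table.items():
        row_totals[row] += count
        col_totals[col] += count

    pairs_rows = sum(comb(v, 2) for v in row_totals.values())
    pairs_cols = sum(comb(v, 2) for v in col_totals.values())
    pairs_both = sum(comb(v, 2) for v in table.values())
    if pairs_rows == 0 or pairs_cols == comb(n, 2):
        return None

    w = pairs_both / pairs_rows
    w_i = pairs_cols / comb(n, 2)  # 1 - SID of the column classification
    return (w - w_i) / (1 - w_i)


def simulate_aw(pft, n, replicates=1000, seed=0):
    """Draw multinomial samples of n elements from a PFT; return their AW values."""
    rng = random.Random(seed)
    cells = list(pft)
    weights = [pft[cell] for cell in cells]
    values = []
    for _ in range(replicates):
        # n independent draws over the cells give multinomial cell counts.
        sample = Counter(rng.choices(cells, weights=weights, k=n))
        aw = adjusted_wallace_from_table(sample)
        if aw is not None:
            values.append(aw)
    return values


# Hypothetical population frequency table (relative frequencies sum to 1).
pft = {
    ("r1", "c1"): 0.30, ("r1", "c2"): 0.05,
    ("r2", "c1"): 0.05, ("r2", "c2"): 0.30,
    ("r3", "c1"): 0.15, ("r3", "c2"): 0.15,
}

aw_n20 = simulate_aw(pft, n=20)
aw_n200 = simulate_aw(pft, n=200)
print("SD of AW, N = 20: ", round(statistics.pstdev(aw_n20), 3))
print("SD of AW, N = 200:", round(statistics.pstdev(aw_n200), 3))
```

Comparing the two standard deviations gives a rough feel for how strongly the sample size affects the uncertainty of the point estimate.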

The *AW* CI coverage is quite robust for changes in the number of clusters and the cluster size distribution (Fig. A1, first row). However, CI coverage is very sensitive to sample size (*N*), decreasing steeply for *N* of 20 and *AW* values higher than 0.2.

The amplitude of the CI also reflects the importance of the sample size for assessing the congruence between typing methods (Fig. A1, second row). As expected, smaller samples (*N* = 20) resulted in higher amplitudes and, therefore, greater uncertainty in the point estimate.

## ACKNOWLEDGMENTS

This work was partially funded by the Fundação para a Ciência e a Tecnologia (PTDC/SAU-ESA/71499/2006) and an unrestricted grant from GlaxoSmithKline.

We thank D. Ashley Robinson for insightful discussions about the need for an adjusted Wallace coefficient.

## FOOTNOTES

- Received 29 March 2011.
- Returned for modification 26 August 2011.
- Accepted 2 September 2011.
- Accepted manuscript posted online 14 September 2011.

- Copyright © 2011, American Society for Microbiology. All Rights Reserved.