Journal of Clinical Microbiology, August 2005, p. 3963-3970, Vol. 43, No. 8
0095-1137/05/$08.00+0 doi:10.1128/JCM.43.8.3963-3970.2005
Copyright © 2005, American Society for Microbiology. All Rights Reserved.
Department of Immunology/Microbiology, Rush Medical College, Chicago, Illinois,1 New England Research Institute, Inc. Watertown, Massachusetts,2 Pediatrics, University of Medicine and Dentistry of New Jersey, New Jersey Medical School, Newark, New Jersey3
Received 16 September 2004/ Returned for modification 7 November 2004/ Accepted 7 May 2005
98% sequence homology of a sample tested to a group consensus sequence for that sample. A second was concordant identification of codons at sites identified with resistance mutations in the sample, although scoring of these criteria is still undetermined from this study. These criteria are applicable to all sequence-based genotyping platforms and have been used as a baseline for assessing the performance of genotyping for the determination of antiretroviral resistance in our ongoing proficiency program.
FIG. 1. General protocol for performance of sequence-based genotypic assays of plasma samples. The flow of individual steps that comprise sequence-based genotypic assays is shown. The assay may be separated into components that are accomplished as a result of specified protocol instructions and machinery and those that require subjective input from the person performing the assay.
Plasma samples and panels. Two panels each containing 1 ml of plasma from each of three HIV-1-infected individuals were distributed to 10 laboratories of the PACTG Sequencing Working Group. These samples were collected, characterized, and sequenced by the Virology Quality Assurance (VQA) Laboratory (5) prior to distribution. Panel 1 samples (Table 1) were from donors untreated with antiretroviral drugs for at least 1 year and were used to determine whether the HGS laboratories were performing the sequencing portion of the assay consistently. Panel 2 plasma donors demonstrated increased viral loads while on therapy, which may indicate therapeutic failure. Their plasma samples were used to see whether identification of mutations and editing occurred consistently across laboratories (6). The panels were distributed approximately 6 months apart to those laboratories in which the HGS kit was used. The laboratories using the TRUGENE assay (total, n = 7; PACTG, n = 5; Visible Genetics, Inc.; University of Colorado, n = 2) received both panels at the same time 6 months after panel 2 was sent to the HGS group. Distribution of the panels to the respective groups was made according to the availability of (i) the RUO research and clinical configurations of the respective kits for use and (ii) a critical mass of trained and certified laboratories able to participate to generate enough data to analyze. The HGS laboratories were given the same version of the kit configuration and the then-current software to use for each panel. The TRUGENE groups used the same version of reagents for their panels but were provided either the research or clinical version of the software by the manufacturer. For this study, both groups of laboratories were instructed to use their respective kits and software as they were trained to do by the companies. Plasma was collected by protocols approved by the Institutional Review Board at Rush University Medical Center.
TABLE 1. Characteristics of plasma samples in panels 01rg and 02rg
Both HGS genotyping software versions 1.1 and 2.0 and TRUGENE software v. 3.0.1 allow editing to be tracked. Bases identified by each kit's software, which aligns bidirectional overlapping data from sequencing primers to form a consensus sequence, are automatically recorded in uppercase characters. Individual nucleotides in the consensus sequence that are changed manually are recorded in lowercase characters. The case designation is preserved in the sequence text file generated by the software, thus documenting which bases were manually edited.
For each platform, the group consensus sequence (GCS) was generated by alignment with Align Plus (Scientific Educational Software, Durham, North Carolina) at a 60% threshold. Homologies to the GCS were determined in a second alignment using the GCS as the reference strain. Sequence data from each platform were analyzed separately partly due to differences in length and portion of the protease (PR) and reverse transcriptase (RT) gene sequence generated by the kits, software design and editing guidelines associated with each assay, and overall protocol design used to produce sequence data. HGS data were analyzed for the entire PR gene (99 codons) and the RT gene for codons 1 to 320; TRUGENE data covered codons 4 to 99 of the PR gene and codons 37 to 247 of the RT gene. Any codon for any laboratory that contained a nucleotide in lowercase characters was classified as edited. The number of edited codons for each sample for each laboratory was tabulated. Data were examined for (i) concordance of submitted nucleotides to the GCS for that sample; (ii) frequency of editing; and (iii) sites on the entire template that were edited.
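The two per-sequence measures described above (percentage of edited codons and percent homology to the GCS) can be illustrated with a minimal sketch. It assumes sequences are already aligned to the GCS, in frame, and of equal length. The function names and the 12-base toy sequences are hypothetical and are not taken from the study data or from either kit's software.

```python
def codons(seq):
    """Split a nucleotide string into complete 3-base codons (any trailing bases are dropped)."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def edited_codon_percent(seq):
    """Percentage of codons containing at least one lowercase (manually edited) base."""
    cs = codons(seq)
    if not cs:
        raise ValueError("sequence contains no complete codon")
    edited = sum(1 for codon in cs if any(base.islower() for base in codon))
    return 100.0 * edited / len(cs)


def homology_percent(seq, gcs):
    """Percentage of aligned positions at which seq matches the GCS, ignoring case."""
    if len(seq) != len(gcs) or not gcs:
        raise ValueError("sequences must be non-empty and pre-aligned to the same length")
    matches = sum(1 for a, b in zip(seq.upper(), gcs.upper()) if a == b)
    return 100.0 * matches / len(gcs)


# Hypothetical toy data: one lowercase base marks one edited codon out of four,
# and the final base differs from the consensus.
gcs = "CCTCAGATCACT"
lab = "CCTCAgATCACC"
print(edited_codon_percent(lab))  # 25.0
print(homology_percent(lab, gcs))
```

Case is ignored when computing homology because, as noted for Fig. 4, the alignment program does not distinguish between upper- and lowercase letters. Case is used only to flag editing.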
Mutations associated with antiretroviral resistance were identified for each laboratory for the samples in panel 2 only. These mutations were recognized by algorithms incorporated into the software version that was available when the assay was performed.
Homology to group consensus sequence versus editing. Each group's data were evaluated to determine how closely their individually edited consensus sequences matched the total group consensus sequence for each gene across the entire backbone. Agreement among the HGS laboratories was high for both the PR and the RT genes. Of 114 individual consensus sequences submitted, only one displayed < 98% homology to the GCS. Concordance to the group consensus sequence was 97.3 to 100.0% for PR (297 bases) (Fig. 2A) and 98.0 to 100.0% for RT (960 bases) (Fig. 2B). The percentage of codons in each sequence edited by laboratory personnel to form a consensus sequence ranged between 2.0 and 87.9% and between 4.7 and 63.6% for the protease and RT genes, respectively. Homology was negatively correlated with edit rate for PR (Spearman rank correlation, r = 0.36; P = 0.0061) but not for RT (Spearman rank correlation, r = 0.21; P = 0.13), although the trends in the two plots look similar. However, even at the highest editing rates, homology to the GCS remained high. Only 11 (5 PR, 6 RT) of 114 sequences had <99% homology to the GCS and only 1 of the 11 (97.3%) had <98% homology to the GCS. Also, no association between experience in performance of the assay and amount of editing was observed (data not shown).
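For readers unfamiliar with the statistic, the Spearman rank correlation between edit rate and homology can be sketched as the Pearson correlation of the ranked values, with tied values sharing their mean rank. The function names and the five data pairs below are hypothetical illustrations, not values from this study, and the sketch omits the P value calculation.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation computed on average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical (edit rate %, homology %) pairs, for illustration only.
edit_rate = [2.0, 10.1, 25.3, 40.4, 87.9]
homology = [100.0, 99.7, 99.7, 99.0, 97.3]
print(spearman_rho(edit_rate, homology))
```

Because the statistic depends only on ranks, it is insensitive to the narrow numerical range of the homology values, which is one reason a rank-based measure suits data of this kind.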
FIG. 2. Comparison of the homology of consensus sequences generated by the HGS platform to the frequency of editing performed to generate the consensus sequence. (A) PR (99 codons); (B) RT (320 codons). The percentages of the homology of the consensus sequences generated by individual laboratories are compared to the percentage of codons edited by the laboratory to obtain that sequence. Homologies to a group consensus sequence for each sample were determined as described in Materials and Methods. Edited codons containing lowercase letters were tabulated for each of the 10 laboratories. Letters in gray represent the laboratories that submitted data for panel 1 (n = 10). Letters in black represent the laboratories that submitted data for panel 2 (n = 9). Panels were distributed 6 months apart.
FIG. 3. Comparison of the homology of consensus sequences generated by the TRUGENE platform to the frequency of editing performed to generate the consensus sequence. (A) PR (codons 4 to 97); (B) RT (codons 37 to 247). The percentages of the homology of the consensus sequences generated by individual laboratories are compared to the percentage of codons edited by the laboratory to obtain that sequence. Homologies to a group consensus sequence for each sample were determined as described in Materials and Methods. Edited codons containing lowercase letters were tabulated for each of the seven laboratories. Both panels were processed by the participating laboratories at the same time.
TABLE 2. Discordant results at positions associated with antiretroviral resistance
Discordant nucleotide identification along the template. The ability to accurately report the identity of each nucleotide at each position along the template could be considered in the assessment and evaluation of genotyping proficiency. This issue was also described by Sayer et al. (8) in their analysis of, predominantly, in-house assays and by Shafer et al., who examined replicate testing by two laboratories also using in-house assays (9). In our testing, all laboratories used the same sequencing platform and the group consensus sequence for each sample was generated by alignment using a 60% threshold. This means that the nucleotide recorded at each position in the GCS was identified by at least 60% of the laboratories. Figure 4 shows three examples of discordant identification of a nucleotide by individual groups by use of the HGS platform at positions 532, 534, and 846. In 8 of 10 sequences, the base at position 532 was identified as "r." In the other two sequences, from laboratories 3 and 4, the nucleotide was identified as "g." The "r" is interpreted by International Union of Biochemistry (IUB) code to be a mixture composed of bases "a" and "g." The converse situation is shown for position 534. Here, 60% of the laboratories identified the base to be a pure base "a," which is recorded in the group consensus sequence, and 40% of the groups designated the base to be "r." In the final example at position 846, 9 of 10 laboratories identified the base as a "c," but 1 group reported a "t." In this example, no mixed bases were identified for that position.
FIG. 4. Examples of nucleotide identifications discordant from the group consensus sequence. Alignments of individual sequences were made using the group consensus sequence as a reference. The group consensus sequence is in the top line. Mixed bases are identified by letters according to the International Union of Biochemistry (IUB) code; "r" is composed of "a" plus "g"; "k" is composed of "g" plus "t." Nucleotides submitted that are identical to the group consensus sequence are designated by dots. Bases discrepant from the reference sequence are designated by letters. The alignment program does not distinguish between upper- and lowercase letters.
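The threshold rule and the dot notation used in Fig. 4 can be sketched as follows. This is an illustration only and is not the Align Plus algorithm. It assumes pre-aligned sequences of equal length, and the `n` placeholder for columns where no symbol reaches the threshold is a hypothetical choice, since the source does not describe how the software handles that case. The ten three-base strings reproduce only the vote counts reported for positions 532, 534, and 846. Apart from laboratories 3 and 4 at position 532, the assignment of calls to laboratories is invented.

```python
from collections import Counter


def group_consensus(aligned, threshold_percent=60, fallback="n"):
    """Column-wise consensus: keep the most common symbol if it reaches the threshold."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("all sequences must be pre-aligned to the same length")
    consensus = []
    for column in zip(*(s.lower() for s in aligned)):
        symbol, count = Counter(column).most_common(1)[0]
        # Integer arithmetic avoids floating-point edge cases at exactly 60%.
        if count * 100 >= threshold_percent * len(column):
            consensus.append(symbol)
        else:
            consensus.append(fallback)
    return "".join(consensus)


def dot_view(seq, reference):
    """Show bases identical to the reference as dots and discrepant bases as letters."""
    return "".join("." if a == b else a for a, b in zip(seq.lower(), reference.lower()))


# Columns stand for positions 532, 534, and 846:
# 8 "r" vs 2 "g"; 6 "a" vs 4 "r"; 9 "c" vs 1 "t".
labs = ["rac", "rac", "gac", "gac", "rac", "rac", "rrc", "rrc", "rrc", "rrt"]
gcs = group_consensus(labs)
print(gcs)  # rac
for seq in labs:
    print(dot_view(seq, gcs))
```

At position 534 the pure base "a" is retained with exactly 60% support, which shows how close to the threshold a consensus call can sit when laboratories disagree about the presence of a mixture.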
98%. Since only 1 of 114 HGS and 12 of 80 TRUGENE sequence data sets were <98% homologous to the respective GCS, the data suggest that a cutoff value for assessment of proficiency should be a minimum of 98% homology and could possibly be set higher. Until more data are obtained and analyzed to refine the recommendation, the suggestion of 98% homology would incorporate the more stringent interpretation of mixtures discussed above for Fig. 4. This would mean that if a mixed base is represented in the GCS, then data returned by a laboratory should show a mixed base in the same position to be considered homologous. Sequence homology to a group consensus sequence was shown to be slightly lower by Sayer et al. (8), but these results may be influenced by the high proportion of in-house assays in the group evaluated (six of nine laboratories). Although some samples in our study displayed <98% concordance to the GCS, it is possible that the discrepancies were generated in part by the use of RUO kits where less-stringent quality control of kit reagents was in place. It is anticipated that the number of samples reaching at least 98% concordance to the GCS would increase with the use of FDA-approved kits and software for analysis. Another indicator that might contribute to assessing proficiency is the generation of discordant data along the template sequence (8, 9). By examining the patterns of editing reflected by the case designation of each base reported, potential difficulty in technical performance may be identified. For instance, large regions of sequence requiring editing may reflect generation of poor sequence data. Areas of discordance in sequence reporting may point out differences in editing strategy (6) that can affect the identification of mutations associated with antiretroviral resistance.
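The effect of the stringent interpretation of mixtures on a 98% cutoff can be illustrated with a short sketch. Under the stringent rule, a position is homologous only if the laboratory reports the same symbol as the GCS. The lenient rule shown for contrast, in which a call counts if it shares at least one base with the GCS symbol under the IUB code, is a hypothetical alternative and is not a criterion proposed in this study. Function names and the 100-base sequences are likewise invented for illustration.

```python
# IUB ambiguity codes mapped to the bases they represent.
IUB = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "k": "gt", "m": "ac", "s": "cg", "w": "at",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg", "n": "acgt",
}


def strict_match(x, y):
    """Stringent rule: symbols must be identical (a mixture must be reported as that mixture)."""
    return x.lower() == y.lower()


def compatible_match(x, y):
    """Hypothetical lenient rule: symbols match if they share at least one base."""
    return bool(set(IUB[x.lower()]) & set(IUB[y.lower()]))


def percent_homology(seq, gcs, match=strict_match):
    """Percentage of aligned positions judged homologous under the given rule."""
    if len(seq) != len(gcs) or not gcs:
        raise ValueError("sequences must be non-empty and pre-aligned to the same length")
    hits = sum(1 for x, y in zip(seq, gcs) if match(x, y))
    return 100.0 * hits / len(gcs)


def meets_cutoff(seq, gcs, cutoff=98.0, match=strict_match):
    """True if homology to the GCS is at or above the cutoff."""
    return percent_homology(seq, gcs, match) >= cutoff


# Hypothetical 100-base example: the GCS carries one mixture ("r"); the laboratory
# reports "g" at that site and differs at two further positions.
gcs = "a" * 49 + "r" + "c" * 50
lab = "a" * 49 + "g" + "c" * 48 + "tt"
print(percent_homology(lab, gcs))                    # 97.0
print(percent_homology(lab, gcs, compatible_match))  # 98.0
print(meets_cutoff(lab, gcs), meets_cutoff(lab, gcs, match=compatible_match))  # False True
```

In this constructed case a single missed mixture decides whether the sequence passes, which is why the choice of interpretation needs to be fixed before a cutoff is applied.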
It is more difficult to set forth criteria to evaluate the identification of mutations associated with antiretroviral resistance. Our data suggest that concordant identification of the presence or absence of codons representing antiretroviral resistance at a site of interest was high but was not always 100%. Of the two examples described in Table 2, in one instance a single laboratory reported the presence of a resistance mutation when the rest of the group identified a WT codon; in the other, a single laboratory reported a WT codon when the rest identified a resistance mutation. More data, perhaps using replicate samples in panels, need to be analyzed to determine how best to evaluate proficiency in identification of antiretroviral resistance mutations.
Editing performance was generally not found to be useful in proficiency testing evaluation. The data in Table 2 showed that different editing strategies might be used while still employing the guidelines provided during company-sponsored training. The variation in strategy might be reflective of the quality of sequence data generated from the technical components of the assay. However, the data shown in Fig. 2 and 3 indicate that the laboratories submitted data that were highly concordant with the rest of the group despite editing practices. The data suggest that generation of good sequence data can occur although the editing process may not be absolutely uniform among laboratories.
As stated earlier, both FDA-approved genotyping systems provide positive and negative controls and instruction to use them with each batch of samples to be sequenced. The kit controls are useful for determining whether the protocol, reagents, and instrumentation are in good working order and whether individuals performing the assays are able to generate good-quality sequences. Depending on the control used, different steps of the protocol can be monitored routinely during assay performance, which contributes to the laboratory's success in proficiency testing. However, as also mentioned, these defined controls do not allow the complexities of the assay to be monitored in a way similar to proficiency testing.
Our experience with these panels and pre-FDA-approved kits suggests that analysis of proficiency of individual laboratories should be grouped by platform to allow for differences in the configurations of the assays (2). In this way the performance by the laboratories can be evaluated within platform fairly and will not be influenced by the differences in kit configuration and software interpretations or by the length of sequence data generated. Our data also suggest that the use of clinical samples to test proficiency may provide an accurate evaluation of the complexities involved in performance of genotyping assays.
A proficiency program for approximately 40 participant laboratories using primarily the TRUGENE and ViroSeq platforms was initiated using several of the criteria described in this study. Seven panels of clinical samples have been distributed over the past 3 years. Cumulative data from these panels, especially those of replicate samples, have reinforced and strengthened the use of these criteria as an effective means of monitoring the performance of genotyping assays and learning more about the factors that can contribute to the variability of the assay.
We thank Applied Biosystems for providing the HIV-1 Genotyping System version 1 and software versions 1.1 and 2.0.
Members of the Pediatric AIDS Clinical Trials Group Sequencing Working Group include Grace Aldrovandi, University of Alabama at Birmingham; Don Brambilla, New England Research Institute; Clark Brown, Applied Biosystems; Susan Eshleman, Johns Hopkins Medical Institutions; Susan Fiscus, University of North Carolina; Lisa Frenkel, University of Washington; Hasnah Hamdan, Nichols Institute; Stephen Hart, Frontier Science and Technology Research Foundation; Diana Huang, Rush Medical College; Andrea Kovacs, University of Southern California; Paul Krogstad, University of California at Los Angeles; Phillip LaRussa, Columbia University; Paul Palumbo, University of Medicine and Dentistry of New Jersey; Walter Scott, University of Miami; Stephen Spector, University of California at San Diego; John Sullivan, University of Massachusetts; Adriana Weinberg, University of Colorado Health Sciences Center; and Yu Qi Zhao, Northwestern University. Participants outside of the PACTG group who contributed sequence data include Daniel Kuritzkes (University of Colorado; now at Harvard University) and Lynne Hough (Applied Sciences, Norcross, GA).