Journal of Clinical Microbiology, April 2008, p. 1426-1434, Vol. 46, No. 4
0095-1137/08/$08.00+0 doi:10.1128/JCM.01560-07
Copyright © 2008, American Society for Microbiology. All Rights Reserved.
Guang-Wu Chen,2,3,
Cheng-Tsung Lai,1
Li-Ching Hsu,1
Pei-Jer Chen,4
Steve Hsu-Sung Kuo,1
Ho-Sheng Wu,1,5* and
Shin-Ru Shih3,6*
Research & Diagnostic Center, Centers for Disease Control, Taipei, Taiwan, Republic of China,1 Department of Computer Science & Information Engineering,2 Research Center for Emerging Viral Infections, Chang Gung University, Taoyuan, Taiwan, Republic of China,3 Graduate Institute of Clinical Medicine, College of Medicine, National Taiwan University, Taipei, Taiwan, Republic of China,4 School of Medical Laboratory Science and Biotechnology, Taipei Medical University, Taipei, Taiwan, Republic of China,5 Department of Medical Biotechnology & Laboratory Science, Chang Gung University, Taoyuan, Taiwan, Republic of China6
Received 5 August 2007/ Returned for modification 19 December 2007/ Accepted 24 January 2008
Since 2003, the Centers for Disease Control (CDC) of Taiwan has received influenza virus isolates from its 12 contract virology laboratories around the island and has sequenced the HA1 region of many of these isolates. By July 2006, more than 3,000 HA1 sequences had been obtained from influenza A viruses H1 and H3 and influenza B virus. In this study we used these sequences to determine the evolutionary properties of these Taiwanese influenza viruses by integrating their genetic features with local epidemiological information. Distance-based sequence clustering and phylogenetic analysis were both used to reveal the evolutionary pattern and the important amino acid variations between Taiwanese isolates and the corresponding vaccine strains or global strains found in public-domain databases.
TABLE 1. Amino acid sequence clustering of influenza A and B viruses in Taiwan during 2003 to 2006
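The average numbers of amino acid differences for the antigenic sites and the nonantigenic sites for H1 and H3 were calculated as

$$\bar{d} = \frac{2}{n(n-1)} \sum_{i<j} d_{ij}$$

where n is the number of sequences with disease onset in the same month and d_{ij} is the number of aligned positions, within the antigenic or the nonantigenic sites, at which sequences i and j carry different amino acid residues.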
We subsequently performed protein sequence clustering for the Taiwanese influenza A virus subtypes H1 and H3 and influenza B viruses collected. Two sequences were assigned to a different cluster as long as there was one amino acid difference in the HA1 region analyzed. We then classified a cluster as dominant if it contained five or more sequences.
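The clustering rule above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: because any single amino acid difference separates two sequences, a cluster is simply a set of identical aligned sequences. The function names and the toy sequences are hypothetical, and the handling of gaps or ambiguous residues is not specified in the text, so none is attempted here.

```python
from collections import Counter

DOMINANT_MIN_SIZE = 5  # "five or more sequences", as stated in the text


def cluster_identical(sequences):
    """Map each distinct aligned sequence to the number of isolates carrying it.

    Two sequences share a cluster only if they are identical over the
    analyzed region, which is equivalent to the one-difference rule.
    """
    return Counter(sequences)


def dominant_clusters(clusters, min_size=DOMINANT_MIN_SIZE):
    """Keep only the clusters that reach the dominant-cluster threshold."""
    return {seq: n for seq, n in clusters.items() if n >= min_size}


# Hypothetical toy data: three variants, only one of which is dominant.
toy_sequences = ["MKTII"] * 6 + ["MKTIV"] * 2 + ["MRTII"]
clusters = cluster_identical(toy_sequences)
dominant = dominant_clusters(clusters)
```

With the toy data, `clusters` holds three variants and `dominant` keeps only the one carried by six isolates.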
Statistical analysis. The Student t test (two-tailed, two-sample test with unequal variance) was used to determine the significance in the average number of amino acid differences at antigenic sites and nonantigenic sites for influenza A virus subtypes H1 and H3 by use of SPSS software (version 13.0; SPSS Inc., Chicago, IL).
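A two-sample test with unequal variance is commonly known as Welch's t test. The sketch below computes only the test statistic and the Welch-Satterthwaite degrees of freedom using the Python standard library; the two-tailed P value would then be read from a t distribution with those degrees of freedom. It is an illustration of the technique, not a reproduction of the SPSS analysis, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, degrees of freedom) for Welch's unequal-variance t test."""
    if len(a) < 2 or len(b) < 2:
        raise ValueError("each sample needs at least two observations")
    # Squared standard errors of the two sample means.
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    if se2_a + se2_b == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


# Hypothetical toy samples.
t_stat, dof = welch_t([1, 2, 3], [2, 3, 4])
```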
Phylogenetic construction. Phylogenetic analysis of a partial region of HA1 (as shown in Table 1) of influenza A virus subtypes H1 and H3 and influenza B virus was performed on the basis of the nucleotide sequences. The MEGA program (11) was used for tree building by use of the neighbor-joining method and the Kimura two-parameter distance matrix. The number of bootstrap replications was set to 1,000, and bootstrap values were labeled on major tree branches for reference. Note that clustering based on nucleotides was first performed for the Taiwanese strains, and only dominant clusters were used to infer phylogenetic relationships. Similar to the clustering scheme used for the amino acid sequences, two strains with any observed nucleotide difference were assigned to different clusters, and a dominant cluster contained at least five sequences. The cluster counts and the lifetimes of the dominant clusters for nucleotide-based and amino acid-based clustering can be different due to synonymous substitutions at the nucleotide level. Aside from the definition of a dominant cluster for the purpose of selecting representative strains for phylogenetic analysis, we further defined a strongly dominant cluster as one that contained 20 or more sequences, for the purpose of the later discussion.
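For readers unfamiliar with the Kimura two-parameter model, the distance between two aligned nucleotide sequences is d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitions and transversions, respectively. The sketch below is a standalone illustration of that formula, not MEGA's implementation; skipping sites with gaps or ambiguous bases is an assumption made here for simplicity.

```python
from math import log, sqrt

PURINES = {"A", "G"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned nucleotide sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: ignore gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    inner = (1 - 2 * p - q) * sqrt(1 - 2 * q) if 1 - 2 * q > 0 else 0.0
    if inner <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * log(inner)


# Hypothetical toy pair: one transition (A->G) in 10 sites.
d_example = k2p_distance("AAAAACCCCC", "GAAAACCCCC")
```

The resulting pairwise distances would form the matrix passed to a neighbor-joining routine.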
Nucleotide sequence accession numbers. All sequences newly reported in this study have been deposited in the GenBank database under accession numbers EU068114 to EU068198.
FIG. 1. Monthly distribution of positive isolates of influenza A virus subtypes H3 and H1 and influenza B viruses in Taiwan from January 2003 to August 2006 (A). (B to D) Individual distributions incorporating the amino acid differences for subtypes H3, H1, and B, respectively. The amino acid differences for the antigenic and nonantigenic sites of subtypes H3 and H1 are graphed separately in panels B and C, respectively. While no antigenic site information was available for influenza B viruses, the amino acid differences for the Victoria and Yamagata lineages are graphed separately in panel D. The amino acid difference for a pair of sequences was computed by counting the aligned positions that showed different amino acid residues. The average amino acid difference per month was then computed on the basis of all pairs of aligned sequences having their disease onset time within the same month. Months in which fewer than two viruses were isolated, so that no pairwise amino acid difference could be computed, are labeled with asterisks. a.a., amino acid.
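The per-month averaging described in the legend can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout (month label, aligned sequence), the zero-based site indices, and the toy data are all hypothetical, and months without any pair return `None` to mirror the asterisks in the figure.

```python
from collections import defaultdict
from itertools import combinations


def pairwise_difference(seq1, seq2, positions):
    """Count the listed aligned positions at which two sequences differ."""
    return sum(1 for i in positions if seq1[i] != seq2[i])


def monthly_average_difference(records, positions):
    """Average pairwise difference per onset month over the given positions.

    records: iterable of (month_label, aligned_sequence) tuples.
    Returns {month_label: average or None when no pair exists}.
    """
    by_month = defaultdict(list)
    for month, seq in records:
        by_month[month].append(seq)
    averages = {}
    for month, seqs in by_month.items():
        pairs = list(combinations(seqs, 2))
        if not pairs:
            averages[month] = None  # fewer than two sequences in this month
            continue
        total = sum(pairwise_difference(a, b, positions) for a, b in pairs)
        averages[month] = total / len(pairs)
    return averages


# Hypothetical toy data: positions 0-1 play the role of antigenic sites.
toy_records = [
    ("2004-01", "ABCDE"),
    ("2004-01", "ABCDF"),
    ("2004-01", "XBCDF"),
    ("2004-02", "ABCDE"),
]
antigenic_avg = monthly_average_difference(toy_records, positions=[0, 1])
nonantigenic_avg = monthly_average_difference(toy_records, positions=[2, 3, 4])
```

Running the same function once with the antigenic positions and once with the remaining positions yields the two series plotted in panels B and C.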
It is apparent that from October 2003 to June 2004 the sequence diversity at the antigenic sites was higher than that during any other period. This observation is in line with laboratory hemagglutination inhibition (HI) test results (data not shown), which showed that a major antigenic drift from A/Panama/2007/99-like to A/Fujian/411/2002-like strains was first seen in the winter of 2003-2004, followed by detection of the first batch of local strains that were antigenically distinguishable from A/Fujian/411/2002 in the summer of 2004. Three subsequent smaller peaks in antigenic-site diversity occurred in March 2005, September 2005, and May 2006, with only the one in March 2005 being followed by a major peak of H3 isolates.
We also calculated the average amino acid differences for five antigenic sites of H3 separately (data not shown) and found that antigenic sites B and D had higher degrees of diversity than the other three sites from 2003 to mid-2006, suggesting that these two sites were important hot spots when antigenic drift was seen in 2004 and 2005, when the Fujian-like strains were transformed into California-like strains. Finally, fewer amino acid differences were seen at the nonantigenic sites for H3 in 2003 and 2004 than in 2005 and 2006. The average numbers of differences at antigenic and nonantigenic sites in 2003 and 2004 were 2.76 and 0.43, respectively, while the average numbers of differences at antigenic and nonantigenic sites in 2005 and 2006 were 1.88 and 0.91, respectively. The less apparent gap between them in the two most recent years analyzed might suggest stabilization of the Fujian-like strains in the population, without further antigenic drift in the near future.
The corresponding activity of influenza A virus H1 in terms of the amino acid diversity at the antigenic and nonantigenic sites is shown in Fig. 1C. We analyzed only the sequences available in 2005 and 2006 because no apparent activity was detected in 2003 and 2004. Note that the HA1 region analyzed is shorter for the H1 virus (169 amino acids) than for the H3 virus (262 amino acids). The antigenic sites are also less abundant (25 of 169 for H1 and 125 of 262 for H3). In contrast to H3, the nonantigenic sites of H1 showed larger numbers of amino acid differences, on average, than the antigenic sites of viruses recovered during this time period. The average numbers of differences at antigenic and nonantigenic sites of H1 in 2005 and 2006 were 0.90 and 2.34, respectively, while the average numbers of differences at antigenic and nonantigenic sites of H3 from 2003 to 2006 were 2.63 and 0.68, respectively. This phenomenon corresponds well to the findings of a previous study that the HA1 gene of H1 undergoes neutral evolution rather than the positive selection observed in H3 (24). Moreover, we observed that the overall average amino acid difference has steadily increased since February 2006. The average differences at antigenic and nonantigenic sites in 2005 were 0.39 and 0.67, respectively, while in the first 7 months of 2006 they were 0.98 and 2.93, respectively, which illustrates the trend of increasing H1 activity and of increasing antigenic and nonantigenic substitutions from 2005 to 2006.
It was reported that both the Victoria and the Yamagata lineages of influenza B viruses have cocirculated in Taiwan in recent years. Reassortants from these two lineages were detected as early as 2002 and became dominant in the winter of 2004-2005 (5, 12, 19). Classification of isolates in either the Yamagata or the Victoria lineage was based on a BLAST search of their HA1 gene regions and comparison with the sequences in the nucleotide database of the National Center for Biotechnology Information. The Yamagata-lineage strains consistently showed a greater average amino acid difference (3.93) prior to May 2005, after which they became obsolete (Fig. 1D). In contrast, the Victoria-lineage strains showed a much lower level of amino acid diversity (0.94) among sequences collected after December 2004, when they became prevalent, even though they were apparently the dominant lineage during their cocirculation with Yamagata viruses. B/Shanghai/361/2002 (a Yamagata-like strain) was selected as the influenza B virus vaccine strain for the Northern Hemisphere in the winters of 2004-2005 and 2005-2006. We found that the circulating Victoria viruses had lower levels of amino acid diversity, yet they appeared to be fitter in the population, probably because the vaccine strain was mismatched to them. On the other hand, the greater diversity found among the Yamagata viruses might have been driven by the use of this vaccine strain in the general population.
General features of sequence clusters. The statistics for HA1 sequence clustering are summarized in Table 1. Each cluster, according to our definition that sequences are placed in separate clusters whenever a single amino acid difference exists between an aligned pair, has an HA1 region amino acid composition that is unique. The numbers of clusters (i.e., genetic variants) in influenza A viruses H1 and H3, the influenza B virus Victoria lineage, and the influenza B virus Yamagata lineage were 201, 497, 87, and 96, respectively. After normalization for their time spans, it is clear that the cluster counts per month for influenza A viruses H1 and H3 are comparable. They were, however, higher than those for influenza B viruses, suggesting that influenza A viruses were more likely to evolve over the time period investigated. The influenza B virus Victoria lineage was found to have the largest cluster of all four subtypes, which comprised 71.5% (534 of 747) of the sequences of that lineage. On the other hand, the largest cluster of influenza A virus H1 sequences contained only 53 (9.9%) of 535 sequences, suggesting the presence of a dominant strain (from a genetic diversity point of view) among the influenza B virus Victoria-like strains, while such dominance was the least apparent for influenza A virus H1. This was also reflected by the cluster density, which was computed by dividing the total number of sequences by the total number of clusters, from which it was clear that influenza A virus isolates are far less dense (2.66 and 2.88 for H1 and H3, respectively) than influenza B viruses (8.59 and 5.57 for the Victoria and Yamagata lineages, respectively). In other words, more genetic variants of influenza A virus than genetic variants of influenza B virus circulated during 2003 to 2006 in Taiwan.
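The two summary statistics used in this paragraph, the cluster density and the share of the largest cluster, are simple ratios. The sketch below recomputes them for influenza A virus H1 from the counts quoted in the text (535 sequences, 201 clusters, largest cluster of 53); the function names are hypothetical.

```python
def cluster_density(total_sequences, total_clusters):
    """Average number of sequences per cluster."""
    return total_sequences / total_clusters


def largest_cluster_share(largest_cluster_size, total_sequences):
    """Percentage of all sequences that fall in the largest cluster."""
    return 100.0 * largest_cluster_size / total_sequences


# Counts for influenza A virus H1 as quoted in the text.
h1_density = cluster_density(total_sequences=535, total_clusters=201)
h1_share = largest_cluster_share(largest_cluster_size=53, total_sequences=535)
```

Rounded, these give the 2.66 sequences per cluster and the 9.9% share reported for H1.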
The small percentage of dominant clusters (6.9 to 8.4%), in which each cluster contained five or more sequences, indicates that the prevalence of many of those clusters (over 90%) was sparse and the isolates failed to prevail in the general population. Among those clusters that dominated, influenza B virus Victoria strains seem to have aggregated as a limited number of genetic variants.
We also defined the duration or the lifetime of a dominant cluster as the time that elapsed (measured in months) from the earliest to the latest time of onset of sequences within that cluster. The lifetimes were found to be similar for the two influenza A virus subtypes according to either the average or the longest duration. However, for influenza B viruses, the lifetimes were double those for the influenza A viruses. This observation is in line with the findings of cluster analysis, mentioned above, that the influenza B viruses circulating in Taiwan over the past few years showed better genetic coherence into a number of major strains than the influenza A viruses did and thus were able to survive and cause disease for longer durations.
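A cluster lifetime as defined above can be computed from the onset months of its member sequences. The sketch assumes that both the first and the last month are counted, an assumption inferred from the 28-month life span reported for a cluster spanning March 2004 to June 2006; the function name and input layout are hypothetical.

```python
def cluster_lifetime_months(onset_months):
    """Months elapsed from the earliest to the latest onset, counted inclusively.

    onset_months: iterable of (year, month) tuples for one cluster.
    """
    onset_months = list(onset_months)
    if not onset_months:
        raise ValueError("cluster has no sequences")
    first_year, first_month = min(onset_months)
    last_year, last_month = max(onset_months)
    # Assumption: both endpoint months count toward the lifetime.
    return (last_year - first_year) * 12 + (last_month - first_month) + 1


# Onsets spanning March 2004 to June 2006.
lifetime = cluster_lifetime_months([(2004, 3), (2005, 1), (2006, 6)])
```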
The monthly compositions of the cluster counts for each subtype were also computed and are shown in Fig. 2. Regardless of how many subtypes of influenza viruses were present or which subtype dominated in a specific month, there appears to be a good correlation between the sum of the cluster counts and the total virus counts. In other words, more circulating genetic variants appeared to contribute directly to sizable epidemics, as represented by the peaks in Fig. 2.
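One way to quantify such a relationship is a Pearson correlation coefficient between the monthly cluster counts and the monthly virus counts. The text does not state which measure, if any, was computed, so the choice of Pearson's r is an assumption, and the monthly counts below are hypothetical placeholders rather than data from Fig. 2.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return sxy / sqrt(sxx * syy)


# Hypothetical monthly counts (not taken from the article).
monthly_cluster_counts = [3, 10, 25, 8, 2]
monthly_virus_counts = [5, 22, 60, 15, 3]
r = pearson_r(monthly_cluster_counts, monthly_virus_counts)
```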
FIG. 2. Monthly compositions of clusters and total numbers of influenza A virus subtypes H1 and H3 and influenza B virus from January 2003 to July 2006.
FIG. 3. Phylogenetic trees of the HA1 gene in influenza A virus H3N2 and H1N1 strains and influenza B virus strains. The nucleotide sequences of the H3N2 and H1N1 and the influenza B virus strains (positions 178 to 963, 147 to 653, and 232 to 891 of the HA gene, respectively, as described in Materials and Methods) were first aligned by use of the Clustal W program, and phylogenetic analysis was performed by use of the MEGA2 program and the neighbor-joining method. The percentages of bootstrap frequencies at the major branches are indicated. Strains A/Moscow/10/1999, A/Brevig Mission/1/1918, and B/Lee/40 were used as the outgroups for the H3N2 and H1N1 viruses and influenza B virus, respectively. The cluster identifier on each leaf node contains three sections: a unique cluster name, the earliest and latest onset month (month/year), and the size of the cluster. Solid triangles, vaccine strains selected by the World Health Organization; solid circles, strongly dominant clusters, each of which contains 20 or more sequences. The nucleotide sequences of the reference strains indicated in parentheses completely match the nucleotide sequences within that cluster. The amino acid changes at major branches are also labeled, with additional letters shown in parentheses if that position is a reported antigenic site.
Figure 3C shows the phylogenetic tree of the HA genes of the influenza B virus isolates, which is clearly divided into a Yamagata lineage and a Victoria lineage. The Yamagata-like strains showed a more diverse evolutionary pattern and could be further grouped into three clades, namely, clades Ya, Yb, and Yc. The sequences of the isolates in clade Ya were similar to the sequence of B/Yamagata/1246/2003, while the sequences of the isolates in clades Yb and Yc were similar to the sequences of B/Georgia/9/2005 and B/New York/12/2005, respectively. For the Victoria-like lineage, the largest cluster (V-2, with 376 sequences) completely matched the sequence of B/Nepal/1137/2005 and had the longest life span, from March 2004 to June 2006 (28 months). Although Yamagata-like strain B/Shanghai/361/2002 was chosen as the vaccine strain for the 2004-2005 and 2005-2006 seasons, the Taiwanese Victoria-like strains actually prevailed and outnumbered the Yamagata strains, and they were apparently less varied in terms of their HA1 sequences (fewer clusters, each with a relatively short tree branch). Although both influenza B virus lineages were dominant, in particular during the outbreak in the winter of 2004-2005, a Yamagata lineage containing Victoria clusters (and vice versa) was not observed, suggesting that no recombination took place in the HA1 domain analyzed.
Another interesting observation for H3 is the relative amino acid sequence variation between antigenic sites and nonantigenic sites. From April 2003 to June 2004, the ratio of the number of antigenic site changes to the total number of variations was higher than that during any other period. This period corresponds to a time when certain antigenic changes and strain transitions from Moscow (or Panama) to Fujian strains and, following that, to California (or Wellington) strains were observed. It seems that Fujian strains were converted to California strains in a short period of time, and these two were closer in terms of their evolutionary distance than the Moscow and Fujian strains. In contrast, from April 2005 to February 2006, the ratio of the number of antigenic changes to the total number of variations was the lowest observed during the period under investigation. Indeed, A/California/7/2004 was used as the vaccine strain in the winter of 2005 and had apparently relieved some positive selection pressure on the antigenic sites for H3 cases from the fall of 2005 onward.
In contrast to the circulating H3 viruses, for which several peaks were found during the 4 years of surveillance and the antigenic changes were overwhelmingly more numerous than the nonantigenic ones, only one H1 peak, which occurred in the winter of 2005-2006, was found, and the amino acid changes were mostly nonantigenic. Although the average number of amino acid differences at the H3 antigenic sites was greater than the number at the H1 antigenic sites in our studies (2.63 and 0.90, respectively), it should be noted that the average number of amino acid differences per residue at H1 antigenic sites (0.04) was larger than the average number at H3 antigenic sites (0.02). Nevertheless, it is our belief that the number of amino acid variations at all antigenic sites determined the overall antigenic variations in the HA1 region. Evidence for this is that most H1 strains isolated in Taiwan, based on their HI test data, still showed high titers against A/New Caledonia/20/99. Furthermore, the lower level of fitness of H3 during the winter mentioned above might have given H1 a chance to cause an epidemic, as we have described here. It seems that the H3 and H1 viruses were competing somewhat and were holding each other in check over these 4 years. That is, there were almost no cases of H1 infection in Taiwan prior to the winter of 2005-2006, a long period during which H3 was dominant and underwent major strain transitions from Moscow to Fujian to California. On the other hand, H1 viruses took over in the winter of 2005-2006, when less H3 activity was found.
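The per-residue figures quoted above follow from dividing the average number of differences at the antigenic sites by the number of antigenic positions analyzed (25 for H1 and 125 for H3, as given earlier in the text). A minimal sketch of that normalization, with a hypothetical function name:

```python
def differences_per_residue(average_differences, number_of_sites):
    """Normalize an average difference count by the number of sites compared."""
    if number_of_sites <= 0:
        raise ValueError("number_of_sites must be positive")
    return average_differences / number_of_sites


# Values quoted in the text: average differences and antigenic-site counts.
h1_per_residue = differences_per_residue(0.90, 25)
h3_per_residue = differences_per_residue(2.63, 125)
```

Rounded to two decimals, these give 0.04 for H1 and 0.02 for H3, which is why the ordering reverses once the site counts are taken into account.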
Although no antigenic site has been reported for influenza B viruses, higher levels of variation in the HA gene of the Yamagata lineage might indicate that this lineage had been under more evolutionary pressure than the Victoria lineage. The choice of vaccine strain could be a reason for this observation. In the winter of 2004-2005, the Shanghai strain (which is of the Yamagata lineage) was used as the vaccine strain and offered only limited protection against viruses of the Victoria lineage. In addition, the reassortment of these Victoria-like strains could be another means by which they gained a better chance of survival and the competitive ability to cause epidemics.
The positive correlation found between the isolates and the number of clusters indicates that although one or some limited number of dominant strains contributed to one epidemic, other related strains also appeared during the epidemic, and the scenario seems to be similar to that for a "swarm" defined in a previous study (17). Different patterns of cluster distributions in these three subtypes also illustrated the disparity of the strategy of evolution for HA. For H1 and H3 viruses that had higher mutation rates, a shorter life span of the clusters is expected. Actually, no apparent difference between the cluster count per month, the average cluster size, the average lifetime, or the longest lifetime was found between the H1 and the H3 viruses (Table 1). Although the H1 viruses showed notable activity only in 2005-2006, the H1 and H3 subtypes revealed common characteristics by cluster analysis. For the influenza B viruses, on the other hand, very different cluster statistics were observed. In addition, the two lineages of influenza B viruses showed similar patterns in cluster statistics, although the Yamagata viruses revealed more sequence variations than the Victoria viruses.
The summer outbreaks of H3 were mainly caused by cluster H3-285 (which contained 91 sequences) in 2004 and by clusters H3-507 (which contained 162 sequences) and H3-567 (which contained 33 sequences) in 2005 (Fig. 1B and 3A). During the two winter seasons in 2002-2003 and 2003-2004 in which H3 caused the outbreaks, on the other hand, the virus strains were generally dispersed among different clusters, and no single cluster could be considered strongly dominant (that is, no cluster contained 20 or more sequences). In other words, these winter outbreaks were caused by sporadic strains, while the outbreaks in the following summers were caused by the strains that were strongly dominant during this period. Furthermore, the ladder-like shape of the phylogenetic tree also indicates that only one major strain and other related minor strains of influenza A virus H3 circulated during a specific epidemic season, which agrees well with the scenario proposed (10, 20) by the epochal model of the evolution of influenza A virus.
Unlike the situation in which only one dominant strain of Taiwanese H3 viruses circulated in one season, two genetically distinct lineages of influenza A virus H1 cocirculated in Taiwan in the winter of 2005-2006. There was no apparent antigenic change in these H1 viruses, according to the results of the HI test (data not shown). From the analysis of the amino acid variations, we also saw a closer antigenic similarity of H1 to A/New Caledonia/20/99, which was used as the vaccine strain from 1999 to 2006. As A/Solomon Islands/3/2006 was selected as the 2007-2008 vaccine strain, it was indeed found to have a higher degree of sequence identity with the clade H1a viruses, which represented the dominant lineage in 2006.
Among the influenza B viruses, the dominant clusters in the Yamagata lineage were separated into three subclades, while in the Victoria lineage there was a major dominant cluster (cluster V-2) that contained 376 sequences and that was prevalent from March 2004 to June 2006. The observation that the dominant strains in the Yamagata lineage had shorter prevalence times than those in the Victoria lineage might have resulted from recently reported reassortants and might have involved their HA and neuraminidase genes (5, 12, 19). During the winter of 2004-2005, these reassortants caused one serious epidemic that was even larger than the other epidemics that occurred after 2000 (9). In contrast, the introduction of vaccine strain B/Shanghai/361/2002 during this time period, which put greater evolutionary pressure on the Yamagata viruses, apparently drove them to evolve into more subclades. Despite these genetic transitions, however, they seemed to have less of a competitive advantage than the Victoria strains in the winter of 2006-2007 that followed (data not shown).
In this work we performed the genetic characterization of Taiwanese influenza viruses on the basis of pairwise analysis of amino acid variations (at antigenic and nonantigenic sites), genetic clustering, and phylogenetic analyses. Although these analyses have provided a good description of the evolutionary story for HA, some questions remain to be answered. One is the evolutionary relationship between these clusters. For example, we are interested in knowing whether the dominant clusters were located in the center of the sequence space for all clusters found in the same epidemic. Another important task is to find the most likely ancestor of these dominant clusters so that the transition between epidemics may be better described. In addition to clustering analysis, measuring the relationship of these clusters with the vaccine strain and with other local strains that cocirculated may also provide more insight into the evolution of the HA gene in influenza A and B viruses.
We thank the investigators of the CDC-Taiwan Contracted Virology Reference Laboratory Network for collecting and providing clinical samples: Chuan-Liang Kao, National Taiwan University Hospital, Taipei; Jang-Jih Lu, Tri-Service General Hospital, Taipei; Yu-Jiun Chan, Veterans General Hospital, Taipei; Kuo-Chien Tsao, Chang Gung Memorial Hospital, Linkou; Ming-Jer Ding, Veterans General Hospital, Taichung; Mu-Chin Shih, Chinese Medical University Hospital, Taichung; Jen-Shiou Lin, Changhua Christian Hospital, Changhua; Jen-Ren Wang, National Cheng Kung University Hospital, Tainan; Kuei-Hsiang Lin, Kaohsiung Medical University Hospital, Kaohsiung; Yung-Ching Liu, Veterans General Hospital, Kaohsiung; Hock-Liew Eng, Chang Gung Memorial Hospital, Kaohsiung; and Li-Kuang Chen, Tzuchi Hospital, Hualien.
Published ahead of print on 6 February 2008.
Supplemental material for this article may be found at http://jcm.asm.org/.
These authors contributed equally to this work.