The analysis of sequence variation in mitochondrial and ribosomal regions has been shown to provide an efficient method for the identification of species in a wide range of animal taxa. To assess its effectiveness in discriminating the genetic groups of the Bemisia tabaci (Gennadius) species complex, populations from ten cotton cultivars were analyzed. Sequences of mtCOI and ITS1 were evaluated for genetic diversity through phylogenetic methods, viz., maximum likelihood and neighbor-net network analysis. Analysis of the mtCOI-I or barcoding region did not reveal significant variation. In contrast, the mtCOI-II region revealed the presence of three B. tabaci genetic groups or subgroups that corresponded with the global data. However, the ITS1 region could not discriminate these groups of the species complex. Our results indicate that the mtCOI-II region, or 3′ end of the gene, is more appropriate than the barcoding and ITS1 regions for identifying variation among populations of the B. tabaci species complex.
Among the genetic markers, mtCOI has gained widespread prominence during the past eight years (Dai et al. 2012). The increasing number of species complexes or cryptic species is a major concern. The selection of genes and methods of analysis has been the central problem. The 5′ segment of the mtCOI gene, selected as the universal barcoding region, has been found to be less effective in some taxonomic groups (Hebert et al. 2003; Meier et al. 2006; Elias et al. 2007). Some studies have suggested the Internal Transcribed Spacer 1 region (ITS1, part of the ribosomal DNA repeating unit) as a candidate DNA barcode (Gao 2010; Chen et al. 2010).
Phylogenetic methods that assume bifurcating trees, such as neighbor joining, maximum likelihood, maximum parsimony and minimum evolution, are often used to explore relationships within taxa, although gene evolution cannot always be represented in this manner. Rather, genealogies of closely related taxa may be multifurcating; descendant genes coexist with persistent ancestors, producing reticulate relationships. To address this, network approaches have been developed; these include the pyramid technique, statistical geometry, split decomposition, median networks, median-joining networks, molecular variance parsimony, netting, likelihood networks, reticulograms and reticulophylogeny (De Barro & Ahmed 2011). Statistical parsimony emphasizes what is shared among haplotypes that differ minimally rather than the differences among the haplotypes. Thus it provides an empirical assessment of deviations from parsimony (Gentile et al. 2002).
The whitefly Bemisia tabaci (Gennadius) (Hemiptera: Aleyrodidae) is one of the most harmful pests globally. De Barro et al. (2011) explored the history of B. tabaci as biotypes and as a species complex composed of multiple species on the basis of the partial mitochondrial cytochrome oxidase 1 (mtCOI) region. Boykin et al. (2007) and Dinsdale et al. (2010) provided evidence to support the conclusion that it is a species complex. These studies indicated that B. tabaci is composed of at least 28 putative species. Caution is warranted here because analysis of a portion of a single gene, mtCOI, informs only about the maternal lineage, and much less is known about the nuclear DNA and overall genetic relatedness.
Successful species identification involves DNA isolation, PCR, sequencing, and species assignment. However, the sequencing success rate varies markedly among markers. The selection of an appropriate phylogenetic analysis is also important for understanding genetic relationships. Therefore, in the present study, mitochondrial and ribosomal sequences were analyzed and compared to explore the genetic diversity within the B. tabaci species complex using phylogenetic approaches, viz., tree- and network-based methods.
Materials and Methods
Sampling, DNA Extraction, PCR and Sequencing
Ninety specimens of B. tabaci were sampled from 10 cultivars of cotton on the farms of the Indian Agricultural Research Institute (IARI), New Delhi, India. DNA samples were prepared from individual insects by extraction of total DNA from fresh or 100% ethanol-preserved specimens. Voucher specimens are deposited in the National Pusa Collection (NPC), Division of Entomology, IARI, New Delhi, India. Genomic DNA was extracted using the DNeasy Blood and Tissue Kit (Qiagen GmbH, Germany). The mtCOI-I region was amplified via PCR using Taq DNA polymerase with the primers LCO1490 (GGTCAACAAATCATAAAGATATTGG) and HCO2198 (TAAACTTCAGGGTGACCAAAAAATCA) (Folmer et al. 1994), and the mtCOI-II region with the primers C1-J-2195 (TTGATTTTTTGGTCATCCAGAAGT) and TL2-N-3014 (TCCAATGCACTAATCTGCCATATTA) (Simon et al. 1994). For the ITS1 region of rDNA, the primers TW81 (GTTTCCGTAGGTGAACCTGC) and 5.8R (ATCCGCGAGCCGAGTGATCC) (De Barro et al. 2000) were used. The amplification reaction was performed in a total volume of 25 μL, including 2.5 μL of 10X PCR buffer, 2 μL of 25 mM MgCl2, 0.5 μL of 10 mM dNTPs, 0.5 μL each of forward and reverse primer, 1 U of Taq, and 17 μL of UltraPure water (Invitrogen). Thermocycler conditions were as follows: initial denaturation for 5 min at 94 °C, followed by 35 cycles of denaturation for 30 s at 94 °C, annealing for 40 s (46 °C for mtCOI-I, 54 °C for mtCOI-II, and 54 °C for ITS1) and extension for 40 s at 72 °C, with a final extension of 5 min at 72 °C. PCR products were visualized on agarose gel after electrophoresis. Single bands were purified using a QIAquick PCR purification kit (Qiagen GmbH, Germany). Purified PCR products were sequenced directly in both directions on an automated sequencer (ABI PRISM® 3730XL DNA Analyzer; Applied Biosystems, USA) at Scigenomics Lab, Cochin, India. All sequences were aligned with ClustalW 1.8 (Thompson et al. 1994) as implemented in BioEdit 4.0.
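The reaction mix described above can be checked with simple dilution arithmetic. The following Python sketch is illustrative only: it sums the itemized volumes and applies C1·V1 = C2·V2 to the stock concentrations given in the text. It assumes that the buffer contributes no additional MgCl2 and that the volume not itemized in the text is taken up by the Taq polymerase and template DNA; neither point is stated in the protocol.

```python
# Reagent volumes (in microlitres) as itemized in the protocol text.
REACTION_VOLUME_UL = 25.0

listed_volumes_ul = {
    "10X PCR buffer": 2.5,
    "25 mM MgCl2": 2.0,
    "10 mM dNTPs": 0.5,
    "forward primer": 0.5,
    "reverse primer": 0.5,
    "UltraPure water": 17.0,
}


def final_concentration(stock, volume_ul, total_ul=REACTION_VOLUME_UL):
    """Dilution formula C1*V1 = C2*V2, solved for the final concentration C2."""
    return stock * volume_ul / total_ul


listed_total_ul = sum(listed_volumes_ul.values())

# Assumption: the remainder is Taq polymerase plus template DNA,
# whose volumes are not itemized in the text.
remaining_ul = REACTION_VOLUME_UL - listed_total_ul

buffer_x = final_concentration(10.0, listed_volumes_ul["10X PCR buffer"])  # in X
mgcl2_mm = final_concentration(25.0, listed_volumes_ul["25 mM MgCl2"])     # in mM
dntp_mm = final_concentration(10.0, listed_volumes_ul["10 mM dNTPs"])      # in mM

print(f"Listed reagents: {listed_total_ul:.1f} uL; remaining: {remaining_ul:.1f} uL")
print(f"Final: {buffer_x:.1f}X buffer, {mgcl2_mm:.1f} mM MgCl2, {dntp_mm:.1f} mM dNTPs")
```

Under these assumptions the itemized reagents account for 23 μL of the 25 μL reaction, and the added MgCl2 and dNTPs are diluted to 2 mM and 0.2 mM, respectively.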
The sequences were used in a BLAST search to confirm sequence identity. All raw DNA sequences were checked manually. After the ends of the raw sequences were trimmed, they were aligned using MUSCLE (MEGA 5.0) under default parameters.
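The trimming criterion is not specified in the text. As a minimal sketch of one common convention, the hypothetical helper `trim_ends` below strips leading and trailing characters that are not unambiguous base calls (A, C, G, T) and leaves internal ambiguities untouched. It is an illustration under that assumption, not the procedure actually applied to these reads.

```python
def trim_ends(seq, unambiguous="ACGT"):
    """Strip leading and trailing characters that are not unambiguous bases.

    Internal ambiguous calls (e.g. N) are kept; only the read ends are trimmed.
    """
    seq = seq.upper()
    start, end = 0, len(seq)
    # Advance past ambiguous calls at the 5' end.
    while start < end and seq[start] not in unambiguous:
        start += 1
    # Step back past ambiguous calls at the 3' end.
    while end > start and seq[end - 1] not in unambiguous:
        end -= 1
    return seq[start:end]


# Hypothetical raw read, for illustration only.
raw_read = "NNNacgtNNacgtNN"
trimmed = trim_ends(raw_read)
print(trimmed)
```

For the toy read above, the three leading and two trailing N calls are removed, while the two internal N calls remain.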
Maximum Likelihood Tree
All sequences in FASTA format were imported into the alignment application of the MEGA 5.0 software package (Tamura et al. 2011), and multiple sequence alignments were performed with the ClustalW algorithm (Jeanmougin et al. 1998) using default parameters.
Phylogenetic trees were reconstructed using the general time-reversible model of DNA substitution and displayed as maximum likelihood (ML) trees in MEGA 5.0 (Tamura et al. 2011). To assess the support for groupings on the tree, we performed a bootstrap resampling analysis (1,000 replications). ML is a character-based method of phylogenetic tree reconstruction.
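The resampling step of a bootstrap analysis can be illustrated as follows. This Python sketch is not the MEGA implementation; it only shows how each pseudo-replicate is built by sampling alignment columns with replacement until the original alignment length is reached. The toy alignment and the names `pop_A`, `pop_B` and `pop_C` are hypothetical.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement, keeping the alignment length."""
    n_sites = len(next(iter(alignment.values())))
    # The same column indices are used for every sequence, so columns stay intact.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


# Hypothetical toy alignment; names and sequences are illustrative only.
toy_alignment = {
    "pop_A": "ACGTACGTAC",
    "pop_B": "ACGTACGAAC",
    "pop_C": "ACCTACGAAT",
}

rng = random.Random(42)  # fixed seed so the example is reproducible
replicates = [bootstrap_replicate(toy_alignment, rng) for _ in range(1000)]
```

In a full analysis, a tree is inferred from each replicate, and the support for a grouping is the proportion of replicate trees in which that grouping appears.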
The neighbor-net method is fast and informative. When events such as gene transfer or recombination occur, tree-based methods cannot explain complex evolutionary scenarios well, and network methods are more realistic (Bryant & Moulton 2004). In the present study, the data were analyzed using a neighbor-net approach, which constructs split networks from inferred distance matrices, in the program SplitsTree 4.10 (Huson & Bryant 2006).
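Neighbor-net takes a pairwise distance matrix as its input. The text does not state which distance was used, so the Python sketch below assumes the uncorrected p-distance (the proportion of differing sites) purely for illustration; SplitsTree computes its distances internally, and this is not its code. The toy alignment and the function names are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping sites with a gap or N in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def distance_matrix(alignment):
    """Symmetric matrix of pairwise p-distances, stored as nested dictionaries."""
    names = list(alignment)
    matrix = {name: {name: 0.0} for name in names}
    for a, b in combinations(names, 2):
        d = p_distance(alignment[a], alignment[b])
        matrix[a][b] = d
        matrix[b][a] = d
    return matrix


# Hypothetical toy alignment; names and sequences are illustrative only.
toy_alignment = {
    "pop_A": "ACGTACGTAC",
    "pop_B": "ACGTACGAAC",
    "pop_C": "ACCTACG-AT",
}

matrix = distance_matrix(toy_alignment)
```

Here `pop_A` and `pop_B` differ at 1 of 10 sites, while each differs from `pop_C` at 2 of the 9 sites that remain once the gapped column is skipped.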
Ninety specimens were obtained from the 10 selected cultivars. Three replicates of the B. tabaci population from each cultivar were sequenced. The mtCOI gene achieved the highest sequencing success rate (100%) of the 2 genes examined, while the ITS1 sequencing success rate was 70%. All successfully sequenced samples were used in the subsequent analysis. The resultant mtCOI-I and mtCOI-II sequences had lengths of 658 bp and 816 bp, respectively, while ITS1 had an alignment length of 508 bp. All these sequences are deposited in NCBI GenBank with accession numbers JN703437-56 for mtCOI and for ITS1. We obtained 3 ML trees based on mtCOI and ITS1.
Of the sequences analyzed, the most parsimonious tree generated from mtCOI-I did not show any major clades or differences (Fig. 1). The same trend was observed when these data were analyzed by neighbor-net analysis (Fig. 2). In contrast, mtCOI-II showed 3 major clades: populations from the cultivars ‘P86’ and ‘F2036’ formed Clade I; those from ‘LRA5166’ formed Clade II; and the remaining 7 populations formed Clade III, in which those from ‘P1752’, ‘LD327’, ‘HS1300’ and ‘P59’ formed one subclade and those from ‘RS810’, ‘LRK516’ and ‘Abadita’ formed a second subclade (Fig. 3). The results of ML and neighbor-net analysis were consistent in the case of mtCOI-II (Fig. 4). The phylogram generated from ITS1 showed 3 major clades: Clade I consists of populations from the cultivars ‘P1752’, ‘HS1300’, ‘LRK516’, ‘RS810’, ‘LD327’ and ‘P59’; Clade II consists of those from the cultivars ‘LRA5166’, ‘Abadita’ and ‘P86’; and Clade III consists of only one population, from the cultivar ‘F2036’ (Fig. 5). ITS1 revealed the same trend when analyzed using neighbor-net (Fig. 6). Thus, these results indicate that only mtCOI-II, analyzed with different strategies, is appropriate for genetic analysis and identification of members of the B. tabaci species complex.
Sequences of B. tabaci from different regions of the world (courtesy Laura Boykin), analyzed using the maximum parsimony method, revealed the presence of 3 genetic groups (Thomas et al. 2014). The same trend was observed when these sequences were analyzed with the neighbor-net method (Fig. 7). When genetic groups were assigned on the basis of the mtCOI-II region, the populations from the cultivars ‘HS1300’, ‘P59’, ‘LRK516’, ‘P1752’, ‘RS810’, ‘Abadita’ and ‘LD327’ fell under the Asia II group; those from the cultivar ‘LRA5166’ under Asia II-7; and those from ‘F2036’ and ‘P86’ under Asia I. The mtCOI-II sequences of the populations from the 10 cotton cultivars revealed the occurrence of 5 out of the total 16 groups or subgroups from Asia, falling under the Asia I, Asia II-1 and Asia II-7 genetic groups. The results obtained from the mtCOI-I and ITS1 regions were thus at variance with those obtained with mtCOI-II.
The statistical analysis showed that the 10 populations of the B. tabaci species complex from the selected cotton cultivars fall under 3 haplotypes, all belonging to the network that corresponded to one of the 28 putative species identified in Dinsdale et al. (2010) and Tay et al. (2012) based on mtCOI-II sequences. On the other hand, analyses of the barcoding marker, mtCOI-I, failed to yield similarly effective conclusions. Thus the B. tabaci species complex could be differentiated on the basis of mtCOI-II, as it better resolves nucleotide variation. The results obtained with the nuclear marker ITS1 provide a complementary data set in which the genetic groups obtained through mtCOI-II were found mixed together. The different methods of analysis used herein, viz., ML and neighbor-net, gave the same inference for mtCOI-II and ITS1. However, the barcoding region, mtCOI-I, showed different results with these 2 methods, which made interpretation difficult. Since 2007, the relationships between members of the B. tabaci species complex have been evaluated in a more structured and systematic way (De Barro 2011). However, many of these relationships have been inferred from a portion of the mtCOI gene. While this is quite a limited approach, which would benefit from the consideration of a much greater diversity of genetic material, it is the only publicly available data set that spans the diversity of the species complex. De Barro et al. (2000) and Abdullahi et al. (2003) considered the ribosomal ITS1, while De Barro et al. (2005) showed a good correlation between ITS1 and mtCOI. It has often been shown that phylogenetic conclusions might reflect bias in the methodology used (Anisimova et al. 2013). Hart & Sunday (2007) and Chen et al. (2010) observed that statistical parsimony network analysis identified a strong association between breaks in network connectivity and species-level separation.
This multiple approach also identified the need for more work to resolve the relationships. Our results further support the view that, when attempting to estimate species diversity where morphological data are unhelpful, the use of multiple markers and phylogenetic methods is needed to resolve the relationships. The data and analysis presented herein provide some information on the importance of selecting appropriate multiple loci and suitable phylogenetic methods.
In conclusion, it is evident that there is no support for genetic analysis of the B. tabaci species complex with ITS1 and mtCOI-I data. Although the whitefly populations were genetically different, the ITS1 and mtCOI-I regions could not separate these populations, even with different phylogenetic methods. Pending further studies and the generation of more data, the mtCOI-II region appears appropriate for genetic analysis and identification of the groups and subgroups of the B. tabaci species complex.
The authors gratefully acknowledge the financial support received from the Indian Council of Agricultural Research (ICAR), New Delhi, through the XIth Plan Network Project on Insect Biosystematics (NPIB).