Monitoring water quality with aquatic insects as sentinels requires taxonomic knowledge of adult and immature life stages that is not available in many parts of the world. We used deoxyribonucleic acid (DNA) barcoding to expedite identification of larval caddisflies from 20 sites in the headwaters of the Tigris River in northern Iraq by comparing their mitochondrial cytochrome c oxidase subunit I (COI) sequences to a global reference library (the Trichoptera Barcode of Life). We obtained full-length DNA barcodes for 16 COI haplogroups from 11 genera in 9 Trichoptera families. The most haplogroups and genera were recorded from Sulaimani Province. Two distinct COI haplogroups were found for the genus Psychomyia, and 5 haplogroups were found for Hydropsyche. The Hydropsyche COI haplogroups do not form a monophyletic clade with reference to the world fauna, but 4 out of 5 haplogroups are related to other Palearctic species. Three larval Rhyacophila specimens in a single COI haplogroup are closely related to specimens of Rhyacophila nubila Zetterstedt and Rhyacophila dorsalis (Curtis) from Europe, but adults from Iraq are needed to confirm their species identity.
Aquatic macroinvertebrates often are used as bioindicators of water quality. The difficulty of identifying immature aquatic insects to reliably named genera and species is a major impediment to this practice. Freshwater biomonitoring programs are used effectively in nations where the juvenile stages of the resident aquatic insect fauna are well known to taxonomists. However, even in these nations, larvae of most macroinvertebrates cannot be identified at the species level, and inconsistencies in sample sorting and identification have affected the accuracy of functional metrics (Lenat and Resh 2001, Haase et al. 2006, Stribling et al. 2008a, b).
Deoxyribonucleic acid (DNA) barcoding is the use of a short, standardized fragment of the mitochondrial cytochrome c oxidase subunit I (COI) gene for species recognition (Hebert et al. 2004). DNA barcoding is an effective method for associating life-history stages of Trichoptera, one of the most diverse aquatic insect groups (Zhou et al. 2007, Zhou 2009). The effectiveness of DNA barcoding of Trichoptera has been examined in subarctic Canada (Zhou et al. 2009, 2010), the Great Smoky Mountains National Park (a well-documented and biodiversity-rich locale, Zhou et al. 2011), and much of eastern North America (XZ, unpublished data). However, problems remain when a focal fauna is largely unknown. If adult samples cannot be identified reliably because of a lack of historical research and appropriate taxonomic keys, then DNA barcodes cannot be used to associate immature caddisflies with a nominal species. For North American caddisflies, only ∼25% of larvae are associated at the species level with adults, and 5 genera remain unknown in the larval stage (J. C. Morse, Clemson University, personal communication). In countries where studies on larval taxonomy are lacking, the capacity for identifying caddisflies in benthic samples is minimal. Unfortunately, this scenario is typical for countries, such as Iraq, where the taxonomic knowledge necessary as a foundation for developing biomonitoring programs is lacking (Morse et al. 2007, Morse 2009).
The confluence of the Tigris and Euphrates Rivers once formed the largest freshwater wetlands in southwestern Asia, but from 1991 to 2003 these wetlands were drained via government-sponsored channelization projects. Monitoring water quality in the Tigris River and its tributaries is essential to the ongoing restoration of the Southern Marshes because the quality and quantity of water flowing from northern Iraq directly affects marsh recovery rates (Richardson and Hussain 2006). Despite the historical and ecological importance of the Tigris River Basin, its aquatic insect fauna remains poorly studied. Only 6 previously described Trichoptera species and 7 genera have been reported from Iraq (Al-Zubaidi and Al-Kayatt 1987, Malicky 1987). Mosely collected adults of Hydropsyche consanguinea McLachlan, Hydropsyche pellucidula Curtis, and Hydropsyche bulbifera McLachlan from near Mosul in 1934, and adults of Limnephilus turanus Martynov and Triaenodes (Ylodes) zarudnyi (Martynov) from near Basra in 1919 and 1926 (Mosely 1934, Malicky 1987). Over 50 y later, Al-Zubaidi and Al-Kayatt (1987) collected Hydropsyche consanguinea and Rhyacophila nubila Zetterstedt from Erbil Province in Kurdistan-Iraq. They also found larvae of the genera Agapetus, Hydropsyche, Hydroptila, Rhyacophila, and Setodes but could not identify these specimens to species.
The limited knowledge of the Iraqi caddisfly fauna will seriously delay establishment of a fully functional biomonitoring program in this country unless DNA barcoding can be used to expedite biodiversity surveys, species descriptions, larvae–adult associations, and taxonomic keys. In conventional studies, reliable taxonomic information linked to regular measurements of physical and chemical variables at reference sites is used to determine tolerance values for individual genera and species (Lenat 1993, Bonada et al. 2004). Another approach would be to link measurements of physical and chemical variables to COI haplogroups (i.e., clusters of specimens defined by sequence similarity in a neighbor-joining tree) instead of to formally named species. Functional bioassessment metrics also could be developed based on COI haplogroups before aquatic insect larvae are formally described or associated with known adults via classical methods. Our purpose was to lay a foundation for developing this link between DNA barcoding and bioassessment by documenting both morphological and COI haplogroup diversity of caddisfly larvae in the Tigris River Basin and establishing a DNA-barcode reference library for Iraqi caddisflies in the Barcode of Life Data Systems (BOLD) (Ratnasingham and Hebert 2007).
DNA barcoding already has been applied in model locales to document all eukaryotic life through the “DNA barcoding biota” initiative (Zhou et al. 2009). Close correspondence of DNA barcode clusters and morphological species in Ephemeroptera, Plecoptera, and Trichoptera at a subarctic site in Canada indicates that DNA barcodes can be used to gain rapid understanding of an unknown fauna. Thus, our objective was to construct a DNA-barcode library for Iraqi Trichoptera using all available materials, including male or female adults and immatures. We expected that many taxa registered in the DNA-barcode library would have provisional identifications but that barcoding would provide rapid insight into the morphological and molecular diversity of caddisflies for Iraq despite this early lag in formal identifications. The ultimate goal of our study was to develop a protocol that could be applied in other regions of the world to expedite our understanding of unknown aquatic insect faunas.
Materials and Methods
The Tigris River originates in southeastern Turkey, and many of its headwater streams are in Kurdistan-Iraq (Fig. 1). Twenty sites were sampled in 3 watersheds (Big and Little Zap Rivers and Diyala River) of the Tigris River Basin in Dohuk Province (4 sites), Erbil Province (5 sites), and Sulaimani Province (11 sites) (Fig. 1). These sites were sampled as part of a larger “Key Biodiversity Areas” survey conducted from 2007 to 2009 by Nature Iraq Organization.
Sampling and preservation
Benthic macroinvertebrates were sampled from the edges of the stream at each collection locality with a Hess stream sampler (0.09-m² sampling area). Sites in Dohuk and Erbil Provinces were sampled in May and June 2008, and sites in Sulaimani were sampled in January 2009. At each site, 4 to 6 replicates were collected to obtain a representative sample for the area. The samples were washed immediately with 70% ethanol in the field and sieved through a 0.5-mm mesh. In the laboratory, each sample was washed again through the same sieve. Samples were examined with a dissecting microscope, and caddisfly larvae were removed from the sample and preserved in 70% ethanol. Adults were collected by hand at 2 sites and placed in separate vials containing 70% ethanol.
Larval specimens were examined with a dissecting microscope and further sorted into morphospecies based on gross external morphological characters that have been applied in larval caddisfly taxonomy, e.g., body and case size and shape, color patterns, and gill shape and placement. Adults were identified to genus based on keys in Malicky (2004), and larvae were identified based on keys and larval descriptions for the Nearctic and Palearctic faunas (Wallace et al. 1990, Morse et al. 1994, Wiggins 1996, Waringer and Graf 1997, Morse and Holzenthal 2008). Representatives of some morphospecies were examined by taxonomic experts (Pseudoneureclipsis: M. L. Chamorro, NMNH, Smithsonian Institution; Leptoceridae: J. C. Morse, Clemson University).
Tissues (generally the right hind leg) were removed at the National Museum of Natural History (Smithsonian Institution, Washington, DC) and shipped to the Canadian Centre for DNA Barcoding, University of Guelph, for DNA analyses. Individuals of all minor taxa were included in the DNA analyses, but individuals in the family Hydropsychidae were subsampled because they were collected in large numbers. Hydropsychid specimens were included from as many sites as possible and were chosen to show the widest range of morphological variations detectable with a dissecting microscope. A total of 150 Trichoptera specimens (144 larvae and 6 adults) were analyzed. Voucher information, DNA sequences, and trace files can be accessed in the project “Caddisflies of Iraq” (IQCAD) in BOLD (Ratnasingham and Hebert 2007). All COI sequences have been deposited in GenBank under accessions GU667663–GU667780.
Standard polymerase chain reaction (PCR) and DNA sequencing protocols (Ivanova et al. 2006, deWaard et al. 2008) were followed at the Canadian Centre for DNA Barcoding. The full-length barcode region (658 base pairs [bp]) of the COI gene was amplified with primers LepF1 (5′-ATTCAACCAATCATAAAGATATTGG-3′)/LepR1 (5′-TAAACTTCTGGATGTCCAAAAAATCA-3′) (Hebert et al. 2004). PCR products were visualized, cycle sequenced, purified, and bidirectionally sequenced on ABI 3730XL sequencers (Applied BioSystems, Carlsbad, California).
DNA-barcoding analysis and interactive identification confirmation
All COI sequences from Iraqi samples were included in the construction of a neighbor-joining (NJ) tree using analytical tools in BOLD with Kimura 2-Parameter distance methods. Because morphological sorting of many larval samples was provisional at both generic and species levels for this unknown fauna, the results from the DNA-barcode analysis were used to flag any inconsistencies at the species (haplogroup) level that suggested possible misidentifications. When inconsistencies arose, the relevant specimens were re-examined and identifications were updated accordingly. Furthermore, exemplars of each distinct haplogroup were examined against the global caddisfly barcode records in BOLD (XZ, unpublished data).
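The Kimura 2-Parameter distance underlying the NJ tree can be illustrated with a minimal Python sketch. This is not the BOLD implementation; the function name `k2p_distance` and the handling of gaps and ambiguity codes (sites are simply skipped) are assumptions made for illustration. The distance is computed as d = −½ ln[(1 − 2P − Q)√(1 − 2Q)], where P and Q are the proportions of transitional and transversional differences between 2 aligned sequences.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Sites with a gap or an ambiguity code in either sequence are skipped
    (an assumption of this sketch, not a documented BOLD behavior).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")
    if transitions == 0 and transversions == 0:
        return 0.0

    p = transitions / compared    # proportion of transitions
    q = transversions / compared  # proportion of transversions
    arg1 = 1 - 2 * p - q
    arg2 = 1 - 2 * q
    if arg1 <= 0 or arg2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(arg1 * math.sqrt(arg2))
```

For example, 2 aligned sequences that differ by a single transition across 10 comparable sites give a distance of about 0.112, slightly larger than the uncorrected proportion of 0.1.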
The phylogenetic relevance of the Iraqi Hydropsyche spp. was explored in a broad context that included a global sample of Hydropsyche species that were available through the ongoing barcoding campaign on caddisflies (Trichoptera Barcode of Life [TrichopteraBOL]; www.trichopterabol.org). A Bayesian analysis was performed on the haplogroups recovered from the genus Hydropsyche. Exemplars were chosen from each haplogroup, and their COI sequences were compiled into a Nexus file along with 100 exemplars from other Hydropsyche species available in the TrichopteraBOL database from reliably identified and permanently vouchered specimens in 3 principal collections: the National Museum of Natural History, the University of Minnesota Insect Collection, and the Hans Malicky Collection. Potamyia and Cheumatopsyche were used as outgroups. The monophyly of the genus Hydropsyche is supported with both morphology (Schefter 2005) and DNA evidence from multiple genes (Geraci et al. 2010). The Bayesian analysis was performed in MrBayes v3.1.2 (Ronquist and Huelsenbeck 2003) using a GTR+I+G (General Time Reversible + Invariant + Γ) model with 6 Γ categories and default parameters. Six Metropolis-coupled Markov Chain Monte Carlo (MCMC) chains were run for 6 million generations. The 50% majority-rule consensus tree (20% burn-in) was examined to determine whether the relevant Iraqi haplogroups formed a monophyletic clade within Hydropsyche. Detailed voucher information on all specimens used in the phylogenetic analysis is available in BOLD (http://www.boldsystems.org) in the projects IQSHY, SMCAD, and HYPSL (Appendix; available online from: http://dx.doi.org/10.1899/10-011.1.s1).
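The logic behind a 50% majority-rule consensus with a 20% burn-in can be sketched in Python as follows. This is a toy illustration of the principle, not the procedure MrBayes runs internally. Each sampled tree is represented simply as a set of clades, each clade being a frozenset of taxon labels; the function name `clade_support` and the labels in the usage example are hypothetical.

```python
from collections import Counter


def clade_support(sampled_trees, burnin_fraction=0.2, threshold=0.5):
    """Return clades whose post-burn-in frequency exceeds `threshold`.

    sampled_trees: trees in sampling order, each given as a set of clades,
                   where a clade is a frozenset of taxon labels.
    The frequency of a clade among the retained trees approximates its
    posterior probability.
    """
    start = int(len(sampled_trees) * burnin_fraction)
    kept = sampled_trees[start:]  # discard the burn-in samples
    if not kept:
        raise ValueError("no trees left after burn-in")

    counts = Counter(clade for tree in kept for clade in set(tree))
    support = {clade: n / len(kept) for clade, n in counts.items()}
    # A strict majority (> 50%) guarantees the retained clades are compatible.
    return {clade: pp for clade, pp in support.items() if pp > threshold}


# Toy data: 10 sampled trees, of which the first 2 fall in the burn-in.
ab = frozenset({"IQ1", "IQ2"})
cd = frozenset({"IQ3", "H_modesta"})
ef = frozenset({"IQ4", "IQ5"})
trees = [{ef}, {ef}] + [{ab, cd}] * 6 + [{ab, ef}] * 2
consensus = clade_support(trees)
```

In this toy data set the clade `ab` is retained with support 1.0 and `cd` with 0.75, whereas `ef` (0.25 after burn-in) is excluded from the consensus.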
Trichoptera diversity confirmed by morphology and DNA barcoding
The sequencing success was 81.3% after the first pass of PCR amplification using LepF1/LepR1 primers, and all but 1 specimen had sequence length >500 bp. Of these DNA sequences, 99.2% were of high quality and none were of low quality or unreliable. Thus, even benthic samples >2 y old and initially stored in 70% ethanol were amenable to DNA analyses. All discrepancies revealed by barcode analyses proved to be misidentifications after careful re-examination of morphological characters. Sixteen larval haplogroups, including 11 genera from 9 families, were confirmed by both morphology and DNA barcodes (Fig. 2). However, specimens in the family Limnephilidae could be assigned only a provisional identification (nr. Halesus). The genus Halesus has not yet been reported from the Middle East, but the taxonomy of Limnephilidae larvae is not well known there and larvae of several genera described from the Middle East (Psilopteryx, Rizeiella, and Kelgena) are unknown. DNA barcode haplogroups of the genus Halesus were paraphyletic when the world fauna was considered (data not shown). Despite the uncertain identification for the nr. Halesus specimens, the barcoded specimens formed a cluster in the COI NJ tree (Fig. 2) with mean and maximum intraspecific distances of 0.6% and 1.7%, respectively (Table 1).
Table 1. Summary of DNA-barcode genetic distances for Iraqi caddisflies and distribution of specimens (distribution codes refer to Fig. 1). COI = cytochrome c oxidase subunit I, BOLD = Barcode of Life Data Systems, ID = identification number, NN = nearest neighbor.
All COI haplogroups showed large (>2%) interspecific divergence from nearest neighboring taxa (Table 1) except for 1 Hydropsyche species pair (Hydropsyche CJG sp. IQ1 and H. CJG sp. IQ2) that showed only 1.9% minimum distance from each other. Members within each of these 2 COI haplogroups showed almost no intraspecific variation in COI (with 0.2% and 0% maximum intraspecific divergences, respectively). Morphological examination of these partially sympatric samples revealed differences in body and head colorations (Fig. 3a, b), a result suggesting these 2 taxa are closely related but distinct. Hydropsyche CJG sp. IQ3 and H. CJG sp. IQ5 (Fig. 3c, e) were each collected only in January from a single site in Sulaimani Province (Table 1). Hydropsyche CJG sp. IQ1, H. CJG sp. IQ2, and H. CJG sp. IQ4 (Fig. 3d) were more widespread, but only H. CJG sp. IQ1 was found in all 3 provinces.
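The comparison of maximum intraspecific divergence with nearest-neighbor distance can be sketched in Python. This is a minimal illustration, not the BOLD distance-summary tool; the function names, specimen labels, and most distance values are hypothetical, with only the 0.2%, 0%, and 1.9% figures echoing the Hydropsyche pair described above.

```python
def summarize_haplogroups(pairwise, group_of):
    """Return {haplogroup: (max_intra, min_nearest_neighbor)} in percent.

    pairwise: iterable of (specimen_a, specimen_b, distance_percent)
    group_of: dict mapping specimen id -> haplogroup label
    max_intra is None for haplogroups represented by a single specimen.
    """
    max_intra = {}
    min_nn = {}
    for a, b, d in pairwise:
        ga, gb = group_of[a], group_of[b]
        if ga == gb:
            max_intra[ga] = max(max_intra.get(ga, 0.0), d)
        else:
            for g in (ga, gb):
                min_nn[g] = min(min_nn.get(g, float("inf")), d)
    return {g: (max_intra.get(g), min_nn.get(g)) for g in set(group_of.values())}


def flag_close_pairs(summary, threshold=2.0):
    """Haplogroups whose nearest neighbor lies below the divergence threshold."""
    return sorted(g for g, (_, nn) in summary.items()
                  if nn is not None and nn < threshold)


# Hypothetical specimens: 2 per haplogroup.
group_of = {"a1": "IQ1", "a2": "IQ1", "b1": "IQ2", "b2": "IQ2"}
pairwise = [
    ("a1", "a2", 0.2), ("b1", "b2", 0.0),
    ("a1", "b1", 1.9), ("a1", "b2", 2.0),
    ("a2", "b1", 2.1), ("a2", "b2", 2.0),
]
summary = summarize_haplogroups(pairwise, group_of)
flagged = flag_close_pairs(summary)
```

With these toy values both haplogroups are flagged because their nearest-neighbor distance (1.9%) falls below the 2% threshold, which is the kind of case that calls for morphological re-examination.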
Two distinct COI haplogroups were found in Psychomyia larvae (Fig. 3f, g). The 2 haplogroups were separated by a minimum distance of 13.0% between their nearest neighboring members, and each had a mean intraspecific distance of 0.3% (Table 1). Both Psychomyia species were found at the Bekhma locality in Erbil Province (Site G, Fig. 1, Table 1). One adult male and 1 larva (Fig. 3f) of Psychomyia CJG sp. IQ1 were collected, but only larvae were found for Psychomyia CJG sp. IQ2 (Fig. 3g). The COI sequence of the larval Psychomyia CJG sp. IQ1 specimen differed from that of the adult male by only 1 bp, so we consider these specimens putatively associated. However, additional adults of both haplogroups are needed to confirm whether our specimens represent new species. The adult male of Psychomyia CJG sp. IQ1 collected in Erbil Province was similar to that of Psychomyia dadayensis Sipahiler from Turkey, but with variations in the shape of the apex of each inferior appendage. More specimens are needed to determine whether this specimen represents a variation of P. dadayensis or an undescribed species.
Another COI haplogroup, Agapetus CJG sp. IQ1, also was associated with adult females that were collected from the same locale (Table 1). Barcodes for 3 larval specimens of Rhyacophila closely matched barcodes of Rhyacophila dorsalis (Curtis) from Austria, Italy, and Germany, and Rhyacophila nubila Zetterstedt from Sweden and Norway, with minimum distances of ∼1.2% and ∼1.4%, respectively, to the nearest neighboring specimen of those nominal species (Table 2). An interim species name was maintained for the Iraqi larval specimens because the association of these Rhyacophila larvae is ambiguous. Morphological and biogeographical patterns in the adult males of these 2 species were studied extensively (Malicky 2002), and no intermediate form was found between these 2 species (despite the erection of several subspecies of R. dorsalis), a result suggesting that an identification will be possible when a male from the Iraqi population can be matched to the larvae via DNA.
Table 2. Summary of DNA-barcode distances for Iraqi Rhyacophila and related species. COI = cytochrome c oxidase subunit I, BOLD = Barcode of Life Data Systems, ID = identification number, NN = nearest neighbor.
Geographic distributions of Iraqi caddisfly genera examined in our study
DNA barcodes were generated from specimens in 11 caddisfly genera in 9 families during our study (Table 3). Some overlap in generic distributions was found among Dohuk, Erbil, and Sulaimani Provinces, and Sulaimani Province had the highest richness (8 genera). Three genera were collected only in Dohuk Province, and 5 genera were found only in Sulaimani Province. Two genera, Psychomyia and Hydropsyche, were found in all 3 provinces, but not all COI haplogroups were found in all provinces (Tables 1, 2). All of the taxa found in Iraq during our study had been recorded previously from Turkey except nr. Halesus, but Lepidostoma has not yet been reported from Iran (Table 3) (Malicky and Sipahiler 1984, Malicky 1986, Mirmoayedi and Malicky 2002). The genus Halesus has not been reported from the Middle East (Morse 2010), but because the larvae of Psilopteryx, Rizeiella, and Kelgena have not yet been described, we cannot confirm the identity of our specimens until they can be associated with named adults. Similar richness was found in the winter collections in Sulaimani Province (11 sites, 12 COI haplogroups in 8 genera) and summer collections in Erbil and Dohuk Provinces (9 sites, 9 COI haplogroups in 5 genera). The Chami Razan Area in Sulaimani Province had the highest generic and haplogroup richness of all 20 sites (6 haplogroups in 5 genera).
Phylogenetic relevance of Iraqi Hydropsyche
The Bayesian consensus topology based on COI sequence data recovered Hydropsyche CJG sp. IQ1 and H. CJG sp. IQ2 as sister taxa with 100% posterior probability (pp) support (Fig. 4). Hydropsyche CJG sp. IQ3 was most closely related to Hydropsyche modesta Navas from Turkey, and this species pair formed part of a larger clade (with 100% pp support) that included Hydropsyche contubernalis McLachlan (East Palearctic = EP, West Palearctic = WP), Hydropsyche hedini Forsslund (EP), Hydropsyche maderensis Hagen (EP), Hydropsyche guttata Pictet (WP), Hydropsyche bulgaromanorum Malicky (EP, WP), and Hydropsyche ornatula McLachlan (EP, WP). Hydropsyche CJG sp. IQ4 was nested inside a clade (with 93% pp support) with Hydropsyche dinarica Marinkovic-Gospodnetic (WP), Hydropsyche iberomaroccana Gonzalez & Malicky (WP), Hydropsyche pellucidula (Curtis) (EP, WP), Hydropsyche botosaneanui Marinkovic-Gospodnetic (WP), and Hydropsyche incognita Pitsch (WP). Hydropsyche pellucidula, H. instabilis, and H. contubernalis all have been reported from Iran (Mirmoayedi and Malicky 2002) and Turkey (Malicky and Sipahiler 1984, Sipahiler and Malicky 1987), whereas H. ornatula has been reported only from Iran and H. modesta only from Turkey (Sipahiler and Malicky 1987). The COI barcode data suggest that H. CJG sp. IQ5 belongs to a species group for which representatives are not yet included in BOLD because it was not nested within a clade containing other species in the Bayesian consensus topology (Fig. 4).
Our study demonstrated a way to increase the speed at which a virtually unknown caddisfly fauna can be identified and associated by using DNA barcoding tools in an ecologically forensic way. This method has been used successfully in China (Zhou et al. 2007, Zhou 2009) and in a poorly studied locale in North America (Zhou et al. 2009). Only specimens of the genus Rhyacophila matched any existing barcodes available in BOLD, which contains nearly 2500 named caddisfly species (∼19% of the world fauna) and many undescribed species. This result could reflect: 1) the fact that the Middle Eastern caddisfly fauna is poorly represented in the DNA-barcode reference library, or 2) the possibility that many Iraqi species are new to science. Although we reported only COI haplogroup richness, we recognize that these haplogroups are putative species awaiting formal description and association with adults. Given that we found 11 genera in 9 families with just 1 type of sampling method in 1 habitat (Hess sampling of stream edges), we expect that with more-targeted multihabitat sampling, we will discover species richness levels in Iraq that are similar to (or higher than) those reported for Turkey and Iran (Malicky and Sipahiler 1984, Botosaneanu 1992, Mirmoayedi and Malicky 2002).
By characterizing generic and COI haplogroup diversity for caddisflies, we have begun to lay a foundation for the development of a biomonitoring program for the Tigris River Basin. Taxonomic specialists who have access to regionally appropriate keys can identify most late-instar caddisfly larvae to genus, but these specialists cannot be depended upon to identify large numbers of specimens collected during biodiversity surveys or biomonitoring projects in all areas of the world. Taxonomists and taxonomy positions are dwindling worldwide (Morse 2009), and with them the knowledge of the Trichoptera world fauna. In countries like Iraq, which have developing infrastructure, limited taxonomic resources, and no insect-rearing facilities, DNA barcoding is the only efficient way to associate immatures and adults to support the development of biomonitoring protocols. Formal descriptions may require years to complete, but DNA barcoding can serve as a tool now to enable biologists to quantify changes in COI-haplogroup-level distributions across space and time within weeks or months of a field-collecting event. Those data, in turn, can serve as a foundation for correlating water-chemistry data and species (or COI haplogroup) occurrences, which is necessary to derive tolerance values and biotic indices like those used in the US (Lenat 1993) and Europe (Bonada et al. 2004).
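How tolerance values keyed to COI haplogroups could feed a biotic index can be sketched in Python. The index below is a generic abundance-weighted mean of tolerance values; it is not claimed to reproduce the exact formulations of Lenat (1993) or Bonada et al. (2004). The function name `biotic_index`, the counts, and the tolerance values are all hypothetical, because no tolerance values have yet been derived for Iraqi haplogroups.

```python
def biotic_index(counts, tolerance):
    """Abundance-weighted mean tolerance value for one sample.

    counts:    dict mapping haplogroup label -> number of individuals
    tolerance: dict mapping haplogroup label -> tolerance value
    Haplogroups without a tolerance value are left out of the index.
    """
    scored = {taxon: n for taxon, n in counts.items()
              if taxon in tolerance and n > 0}
    total = sum(scored.values())
    if total == 0:
        raise ValueError("no individuals with a known tolerance value")
    return sum(n * tolerance[taxon] for taxon, n in scored.items()) / total


# Hypothetical tolerance values (higher = more tolerant of pollution).
tolerance = {
    "Hydropsyche CJG sp. IQ1": 6.0,
    "Psychomyia CJG sp. IQ2": 2.0,
}
# Hypothetical counts from a single benthic sample.
counts = {
    "Hydropsyche CJG sp. IQ1": 30,
    "Psychomyia CJG sp. IQ2": 10,
    "nr. Halesus": 5,  # no tolerance value yet, so it is excluded
}
index = biotic_index(counts, tolerance)
```

With these invented numbers the index is (30 × 6.0 + 10 × 2.0) / 40 = 5.0. Keying the dictionaries to haplogroup labels rather than to formal species names means the same calculation can be rerun unchanged once the haplogroups are formally described.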
Construction of a DNA-barcode reference library for the Tigris River Basin aquatic insect fauna will have to be done based on a protocol that is different from the one used for well-known areas in North America and Europe. Instead of starting with reliably-identified museum voucher specimens to build a reference library for Iraq, many unknown adult and larval specimens will be included. Future collecting and taxonomic effort can then be directed toward putting barcoded COI haplogroups into phylogenetic context by comparing the Iraqi specimens to the TrichopteraBOL reference library, as was done in this study for Hydropsyche. Ultimately, varied life stages of the same COI haplogroup will be associated in the DNA-barcode library and all specimen-related data will be publicly available, making the process of classifying haplogroups within the Linnaean hierarchy transparent and repeatable. Additional markers (e.g., nuclear genes) also can be used to confirm the monophyly of COI haplogroups (Zhou et al. 2007). Formal scientific names will be linked to identifiable COI haplogroups when trained taxonomists are able to describe new species or to redescribe known species in more detail. This protocol will expedite the formal description and cataloguing of Iraqi caddisfly biodiversity.
We have demonstrated that DNA can be used to assist genus- and haplogroup-level identifications of Trichoptera by comparing larval COI sequences to the global DNA barcode library maintained in the TrichopteraBOL campaign. DNA barcoding of benthic macroinvertebrates will be crucial in developing countries that are trying to overcome a lack of knowledge of aquatic-insect taxonomy and trained taxonomists. DNA barcoding will help aquatic scientists in these countries generate the empirical data needed to implement sound bioassessment and monitoring protocols to protect and manage their water resources.
Field work was conducted by the benthic macroinvertebrates team in Nature Iraq (Ali Sadeq Al-Zubaidi and Ghazwan Al-Waili) during the Key Biodiversity Areas of Iraq Project, funded by the Italian Ministry for the Environment, Land and Sea (IMELS). Sequencing cost was supported by a grant from the Natural Sciences and Engineering Research Council of Canada (NSERC) and by grants from Genome Canada through the Ontario Genomics Institute to Paul Hebert, University of Guelph. The following institutions and individuals contributed specimens to our study or are currently housing voucher specimens: National Museum of Natural History, University of Minnesota Insect Collection, Nanjing Agricultural University, Rutgers University, and Hans Malicky. CJG thanks John C. Morse, M. Lourdes Chamorro, and Oliver S. Flint for taxonomic consultations; Terry L. Erwin for valuable editorial comments on the manuscript; and Maggie Barrett and Mackenzie Flight for imaging voucher specimens. MAA thanks Azzam Alwash, Adel Hillawi, Anna Bachmann, Giorgio Galli, Andrea Cattarossi, and Mauro Randone. Last, we thank colleagues at the Canadian Centre for DNA Barcoding for their assistance with laboratory work.