High-throughput sequencing technologies, such as RNA sequencing (RNA-Seq), have greatly enhanced our ability to sequence and characterize the transcriptome of nonmodel organisms. The ability to study expression of thousands of genes in highly threatened yet understudied organisms holds great potential for advancing the field of conservation biology. Despite rapid gains in our analytical abilities and understanding of the physiological underpinnings of the organism, genomic resources remain limited for nonmodel organisms such as freshwater mussels, one of the most imperiled groups of animals worldwide. Here we provide the first characterization of the transcriptome of the North American freshwater mussel Amblema plicata (threeridge) using an RNA-Seq approach. Gill tissue samples were collected from mussels in the Muskingum River in Washington County, Ohio, USA. RNA was extracted and sequenced on the Illumina HiSeq 2500 sequencer with output as 100-base-pair paired-end reads. De novo assembly of sequenced reads was performed using Trinity. Assembled transcripts were used as BLASTx queries against the National Center for Biotechnology Information nonredundant database, and functional annotation using gene ontology (GO) terms was performed using Blast2GO. Transcriptome assembly produced 264,027 transcripts. Of these transcripts, 54,331 (20.58%) received BLAST hits and 22,223 were annotated with GO terms. We provide examples of identified candidate genes that may be useful for studying physiological responses of freshwater mussels to various environmental stressors, such as temperature, hypoxia, and pollutants. The A. plicata transcriptome improves the genomic resources available for freshwater mussels, and may aid in the development of molecular tools, with the ultimate goal of increasing our understanding of freshwater mussel physiology and improving conservation techniques.
The advent of high-throughput, next-generation sequencing technologies has decreased the cost and time involved in genomic and transcriptomic data acquisition and has greatly facilitated genetic studies in nonmodel organisms (Ekblom and Galindo 2011). RNA sequencing (RNA-Seq) enables researchers to identify a species' transcriptome (the expressed portion of the genome) and characterize changes in that transcriptome through development or in response to various environmental conditions (Wang et al. 2009). Although the genome of most nonmodel organisms has not been fully sequenced and annotated, advances in methodologies, such as de novo assembly of RNA-Seq data, allow us to characterize the transcriptome of understudied species of interest through comparison with better-studied model species to infer possible gene function. Several studies have now reported comparative transcriptomic characterizations of previously understudied taxa (e.g., Riesgo et al. 2012; Francis et al. 2013), including the copepod Tigriopus californicus (Schoville et al. 2012), green spotted puffer fish (Tetraodon nigroviridis; Pinto et al. 2010), and Pacific white shrimp (Litopenaeus vannamei; Zeng et al. 2013). Our ability to sequence, functionally annotate, and study the expression of thousands of genes in almost any organism holds great potential for advancing the field of conservation biology (Allendorf et al. 2010; Garner et al. 2016; Corlett 2017). For example, transcriptomics can be used to identify markers for pathogen resistance (Harper et al. 2016) or to select source populations for reintroduction by predicting differences in stress responses to environmental change (He et al. 2016). Unfortunately, freshwater mussels (Bivalvia: Unionidae), one of the most endangered faunal groups worldwide, have few available transcriptomic resources (Wang et al. 2012; Bai et al. 2013; Cornman et al. 2014; Luo et al. 2014; Patnaik et al. 2016).
Freshwater mussels are relatively sessile filter feeders that rely almost solely on physiological adaptations to mitigate environmental stressors. The center of freshwater mussel biodiversity is found in North America, but more than half of native species are considered threatened, endangered, or extinct and their numbers are decreasing rapidly (Lydeard et al. 2004; Strayer et al. 2004; Haag and Williams 2014). These animals continuously and simultaneously face persistent and widespread anthropogenic forces, including exposure to toxic contaminants, excessive nutrient inputs, sediment loading from agricultural activities, competition from zebra mussels and other invasive species, hydrologic regime alterations caused by impoundments, and global climate change (Richter et al. 1997; Watters 2000; Strayer et al. 2004). Captive propagation, reintroduction, and ecosystem restoration are a few of the many conservation efforts used in the attempt to conserve freshwater mussels (Strayer and Dudgeon 2010; Haag and Williams 2014). However, there is a pressing need to develop and implement additional health assessment and monitoring methods. Studying gene expression may prove an effective strategy for understanding how these animals respond to a multitude of environmental stressors and conservation efforts (such as animal translocation) at the biomolecular level.
We have sequenced and characterized the first transcriptome of the North American freshwater mussel Amblema plicata (threeridge) using gill tissue from individuals collected in the wild, and we provide it as a publicly available resource. Amblema plicata is a species with stable populations but it is also congeneric with Amblema neislerii (fat threeridge), which is federally endangered in the USA and listed as critically endangered by the International Union for Conservation of Nature (Bogan 1996). Widespread in the Mississippi and Laurentian drainages, as well as parts of the Mobile River, A. plicata has one of the largest distributions of any unionid (Williams et al. 1993). In contrast, A. neislerii is endemic to the Apalachicola system, a much smaller range, and has suffered from impoundments and water drawdown (Box and Williams 2000; USFWS 2003). The transcriptome provided here can be used to improve conservation of the internationally monitored A. neislerii, as well as other closely related and threatened freshwater mussels in the family Unionidae. In addition to our transcriptomic characterization, we discuss genes that are likely to be of interest to investigators studying the physiological responses of freshwater mussels to various environmental stressors, such as temperature, hypoxia, and pollutants. The A. plicata transcriptome expands the genetic tool kit available for monitoring and managing freshwater mussels, one of the most endangered, yet understudied, groups of animals.
We collected three adult A. plicata from the Muskingum River in Washington County, Ohio, USA below Devola Lock and Dam #2 (39.468703 N, 81.489303 W) on August 7, 2015. Upstream of this location is mostly valley with limited agriculture in the floodplain and a few small towns. The river is impounded by a series of low-head dams and associated locks. This species was chosen because it is common, not listed by state or federal agencies, and found in a wide variety of habitats. We gently pried open their shells with reverse pliers and collected 11–21 mg of gill tissue from each individual. Each tissue sample was placed in a 2-mL RNase-free cryotube, snap frozen in liquid N2, and stored at –80°C.
RNA Extraction and Sequencing
Tissue samples were mechanically disrupted and homogenized using a Mini-BeadBeater-8 (BioSpec Products Inc., Bartlesville, OK, USA). Using an RNeasy Mini Kit (Qiagen, Valencia, CA, USA), RNA was extracted and its concentration and integrity were measured using an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA) at The Ohio State University Comprehensive Cancer Center (Columbus, Ohio, USA). All samples had an RNA integrity number value >8.9. RNA-Seq library preparation and sequencing were performed by the Molecular and Cellular Imaging Center at the Ohio Agricultural Research and Development Center (Wooster, Ohio, USA). RNA-Seq libraries were prepared using the Illumina TruSeq Stranded mRNA Library Prep Kit (Illumina, Inc., San Diego, CA, USA). Libraries were sequenced on the Illumina HiSeq 2500 Sequencer with output as 100-base-pair (bp) paired-end reads.
Transcriptome Assembly and Annotation
Quality of sequencing data was assessed with FastQC (version 0.11.5; http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Trimmomatic (version 0.36; Bolger et al. 2014) was used to scan raw reads with a sliding window of four bases and trim read ends when the average Phred quality score dropped below 15, which corresponds to a probability of an incorrect nucleotide call equal to 10^-1.5 (approximately 0.03). For downstream analyses, we used only those reads with a minimum of 70 bp remaining after quality trimming. De novo assembly of trimmed reads was performed with Trinity (version 2.3.2; Grabherr et al. 2011) using default parameters.
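The relationship between a Phred score and the error probability, and the idea behind sliding-window trimming, can be illustrated with a short Python sketch. This is a simplified illustration, not Trimmomatic's actual implementation: the function names are hypothetical, and the trimming rule (cut at the start of the first window whose mean quality falls below the threshold) is an assumption made for clarity.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score Q to the probability of a wrong base call."""
    return 10 ** (-q / 10)


def sliding_window_keep_length(quals, window=4, min_mean_q=15):
    """Return how many leading bases to keep from a read.

    Scans from the 5' end and cuts at the start of the first window whose
    mean quality is below min_mean_q (simplified; hypothetical helper).
    """
    for start in range(0, len(quals) - window + 1):
        win = quals[start:start + window]
        if sum(win) / window < min_mean_q:
            return start
    return len(quals)


def passes_length_filter(kept_length, min_length=70):
    """Keep only reads with at least min_length bases after trimming."""
    return kept_length >= min_length


if __name__ == "__main__":
    # Hypothetical per-base qualities: good bases followed by a poor 3' tail.
    example_quals = [30] * 10 + [10] * 4
    print(phred_to_error_prob(15))
    print(sliding_window_keep_length(example_quals))
```

A threshold of Q = 15 therefore tolerates roughly a 3% chance of an incorrect call averaged over each four-base window.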
Table 1. Summary statistics for sequencing and transcriptome assembly.
To assess the quality of the transcriptome assembly, we estimated the percentage of raw reads represented in the Trinity assembly by mapping with Bowtie 2 (version 2.1.0; Langmead and Salzberg 2012) using default parameters; we assessed assembly completeness according to conserved metazoan ortholog content using benchmarking universal single-copy orthologs (BUSCO; version 2.0; Simão et al. 2015).
Transcripts assembled by Trinity were used as BLASTx queries against the National Center for Biotechnology Information nonredundant database (downloaded July 13, 2017) with a word size of six (the length of the seed match used by the algorithm to detect regions of similarity between sequences), an expect value (E-value; the number of matches expected to occur by chance alone) cutoff of 1E-5, and a hit threshold of 20 (the maximum number of matches retained per query). Functional annotation of transcripts using gene ontology (GO) terms and InterProScan was performed with Blast2GO (version 4.1.9; Conesa et al. 2005; Götz et al. 2008) using default parameters.
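The effect of the E-value cutoff and the hit limit can be sketched in Python as a filter over search results. This is an illustration only and not part of the BLAST or Blast2GO software: the function name, the tuple layout (query, subject, E-value), and the example records are all hypothetical.

```python
from collections import defaultdict


def filter_hits(hits, max_evalue=1e-5, max_hits=20):
    """Keep, for each query, at most max_hits hits with E-value <= max_evalue.

    hits is an iterable of (query_id, subject_id, evalue) tuples
    (a hypothetical layout chosen for this sketch). Hits are considered
    from the lowest (best) E-value to the highest.
    """
    kept = defaultdict(list)
    for query, subject, evalue in sorted(hits, key=lambda h: h[2]):
        if evalue <= max_evalue and len(kept[query]) < max_hits:
            kept[query].append((subject, evalue))
    return dict(kept)


if __name__ == "__main__":
    # Made-up records for illustration; not results from this study.
    example = [
        ("transcript_1", "protein_A", 1e-10),
        ("transcript_1", "protein_B", 1e-3),
        ("transcript_2", "protein_C", 0.5),
    ]
    print(filter_hits(example))
```

Under this filter, a transcript whose best match has an E-value above the cutoff receives no hits at all, which is why only a fraction of assembled transcripts end up annotated.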
Illumina sequencing produced 69,737,622 raw reads. After trimming, high-quality reads were assembled into 264,027 transcripts with a mean length of 720 bp, N50 of 1,255 bp (transcripts of this length or longer contain at least 50% of the assembled bases), and guanine-cytosine content of 35.75% (Table 1). Bowtie 2 calculated a 96.65% read alignment to the transcriptome assembly. BUSCO analysis indicated that the assembly produced 811 (82.9%) complete, 116 (11.9%) fragmented, and 51 (5.2%) missing BUSCOs. The Illumina sequence data were archived in GenBank under accession number SRP133691. Both the raw data and transcriptome assemblies can be found under BioProject PRJNA436349.
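Two of the assembly statistics reported here, N50 and guanine-cytosine content, can be computed with a few lines of Python. The sketch below uses the usual definition of N50 (the length L such that transcripts of length L or longer contain at least half of all assembled bases); the function names and toy sequences are hypothetical and are not taken from the study's pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest and their lengths are
    accumulated; N50 is the length at which the running total first
    reaches half of the total assembled bases.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty input


def gc_percent(sequences):
    """Return the percentage of G and C bases across all sequences."""
    gc = sum(s.upper().count("G") + s.upper().count("C") for s in sequences)
    total = sum(len(s) for s in sequences)
    return 100 * gc / total if total else 0.0


if __name__ == "__main__":
    # Toy sequences for illustration only.
    toy = ["GGCC", "AATT", "ATGCATGC"]
    print(n50([len(s) for s in toy]))
    print(gc_percent(toy))
```

Because N50 weights transcripts by length, it is typically larger than the mean transcript length in assemblies that contain many short transcripts.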
Of the 264,027 transcripts assembled by Trinity, 54,331 (20.58%) received BLAST hits, and 22,223 of these were annotated with GO terms. The taxonomic distribution of annotation data revealed that five of the top six species were also mollusks (Fig. 1). These species included the Pacific oyster (Crassostrea gigas), Japanese scallop (Mizuhopecten yessoensis), California two-spot octopus (Octopus bimaculoides), California sea slug (Aplysia californica), and owl limpet (Lottia gigantea). Transcripts were assigned by Blast2GO to one or more of the GO domains: “biological process” (14,869), “molecular function” (17,096), and “cellular component” (13,370). More than half of the transcripts within biological process were assigned to the second-level categories “cellular process,” “metabolic process,” and “biological regulation” (Fig. 2). The most common categories within molecular function were “binding” and “catalytic activity” (Fig. 2), and those in cellular component included “cell,” “cell part,” “membrane,” and “organelle” (Fig. 2).
We identified numerous genes in the transcriptome that may be useful in gene expression-based studies of freshwater mussels' responses to myriad environmental stressors and discuss examples in the next section (Table 2). Although the genes discussed are likely to be useful in studies of freshwater mussels, they are provided only as examples. Researchers interested in selecting genes for further study are advised to consult the Supplementary Data for an exhaustive list of transcriptome annotations.
Linking Environmental Stressors and Gene Expression
We provide a publicly available transcriptome resource for the freshwater mussel A. plicata and give examples of identified genes whose expression can be studied in response to environmental stressors. Such stressors damage nucleic acids, proteins, carbohydrates, and lipids, as well as the larger cellular structures consisting of these macromolecules, resulting in adverse health effects (Kültz 2005). In response, organisms have evolved various cellular stress response pathways to combat and mitigate damage and restore homeostasis. These pathways are initiated when a signal activates a specific transcription factor that relocates to the nucleus and upregulates expression of target genes (Simmons et al. 2009). The roles of these target genes vary, ranging from neutralizing reactive oxygen species to assisting in the refolding of denatured proteins, but all have cytoprotective functions. In the following sections, we discuss responses to temperature stress, hypoxia, and pollutants, three common stressors to freshwater mussels. We specifically discuss genes that have been confidently assembled and identified in the A. plicata transcriptome and that could be used for differential expression studies. The sequences of these and the many other genes accessible in the Supplementary Data can be used to design custom oligonucleotide primers for targeted gene expression studies using quantitative PCR (qPCR) techniques, or to design DNA probes for microarrays that simultaneously assay the expression levels of multiple genes. For further information about the design of qPCR and microarray techniques for conservation applications, see Gibson (2002) and Tymchuk et al. (2010). Studying changes in gene expression can increase our knowledge of how an organism reacts to a certain stressor and its level of sensitivity, which can then be used to improve management and conservation efforts.
Temperature.—Temperature stress is a universal threat among metazoans and is one of the most important factors influencing the behavior and physiology of ectotherms (Angilletta et al. 2002). Bivalves are especially vulnerable to extreme temperature changes since they are sessile organisms and have limited ability to seek out different microhabitats.
Table 2. Examples of protein homology identified in the transcriptome of Amblema plicata that may be useful in studies of responses to environmental stressors. Descriptions of function are adapted from UniProt. The reported E-value is the lowest value for transcripts matching that homologue. See Supplementary Data for transcripts and associated annotations.
Heat shock proteins (HSPs) are molecular chaperones that were first discovered to be induced in Drosophila melanogaster in response to heat stress (Ritossa 1962; Tissières et al. 1974). Further studies confirmed the role of HSPs in thermotolerance in other organisms (Snutch et al. 1988; Airaksinen et al. 1998; Clark et al. 2008; Waagner et al. 2010). For example, individuals of the marine mussel Mytilus californianus living in warmer areas have higher levels of HSP70 than those living in cooler areas (Helmuth and Hofmann 2001). HSPs and other molecular chaperones are essential for survival at elevated temperatures since they ensure the correct folding of newly synthesized proteins and assist in refolding or degrading misfolded proteins accumulated during stress (Feder and Hofmann 1999; Kregel 2002). Other genes whose expression has been studied in response to heat stress include heat shock factor (e.g., in zebrafish; Råbergh et al. 2000) and those involved in combating reactive oxygen species, such as Cu/Zn–superoxide dismutase (Cu/Zn-SOD) (e.g., in the bumblebee Bombus ignitus; Choi et al. 2006) and catalase (e.g., in the snakehead Channa punctata; Kaur et al. 2005). All of these genes have been identified in the A. plicata transcriptome and can be incorporated into gene expression-based studies of freshwater mussel responses to temperature stress. Anthropogenic disturbances such as dam construction, clearing of riparian vegetation, irrigation, channelization, and industrial activities can influence lake and stream temperatures (Poole and Berman 2001; Hester and Doyle 2011). Furthermore, freshwater mussels will become increasingly threatened as climate change continues (Strayer and Dudgeon 2010). The incorporation of gene expression-based studies will provide insight into organismal response to temporary and chronic exposure to temperature stress and, consequently, guide management decisions such as species translocation and habitat restoration.
Hypoxia.—Hypoxia, a decreased level of oxygen availability, is a common stressor of aerobic animals that rely on oxygen for energy production and metabolic function (Giaccia et al. 2004). Freshwater mussels may experience oxygen depletion during certain management practices, such as translocation between habitats (Waller et al. 1995); during eutrophication, when excessive amounts of nutrients cause oxygen depletion (Mallin et al. 2006); or during aerial exposure, as a result of drought-induced decline in water levels (Golladay et al. 2004). We identified several genes in the A. plicata transcriptome that can be used to study freshwater mussel responses to hypoxic conditions. Hypoxia-inducible factor (HIF), which consists of the hypoxia-inducible α subunit and the constitutively expressed β subunit, acts as the master regulator of oxygen homeostasis, and the expression of this transcription factor during hypoxic conditions results in upregulation of target genes (Semenza 2002; Greijer et al. 2005). Hypoxia has been shown to induce expression of HIF-1α in the Pacific oyster Crassostrea gigas (Kawabe and Yokoyama 2012) and the Mediterranean mussel Mytilus galloprovincialis (Giannetto et al. 2015). Because an important aspect of the hypoxia response is regulation of glycolysis, the expression of genes coding for such proteins as glyceraldehyde 3-phosphate dehydrogenase and triosephosphate isomerase has also been found to increase in response to low oxygen (Fields et al. 2014). Climate change has increased the frequency and severity of droughts in some regions, such as the southeastern USA (Mazdiyasni and AghaKouchak 2015), a hot spot for freshwater mussel diversity (Haag 2012) and home to the federally endangered A. neislerii. Already, water drawdown for upstream municipalities has resulted in the stranding of this listed species in some areas (unpublished work).
Drought conditions have caused dramatic declines in mussel abundance (Haag and Warren 2008) and shifts in species composition (Galbraith et al. 2010). Gene expression-based studies can increase our understanding of how freshwater mussel physiological responses vary in response to droughts of varying intensity and duration, in combination with heat stress, and in different habitats.
Pollutants.—Water-quality degradation is among the most important causes of freshwater mussel declines (Strayer et al. 2004). Pollutants may include nutrients from agricultural runoff, pathogens, organic compounds such as sewage and pesticides, and inorganic compounds such as heavy metals (Schwarzenbach et al. 2010). Metallothioneins have received interest in toxicology since they bind heavy metals and may protect the organism against metal toxicity (Amiard et al. 2006). Metal exposure has been shown to increase metallothionein concentrations in numerous invertebrates, including annelids (e.g., a freshwater oligochaete; Deeds and Klerks 1999), mollusks (e.g., the freshwater mussel Pyganodon grandis; Giguère et al. 2003), and crustaceans (e.g., a copepod; Barka et al. 2001). Glutathione S-transferase is also important in detoxification processes and was found to be significantly higher in blue mussels (Mytilus edulis) living in polluted waters next to a thermoelectric power plant (Manduzio et al. 2004). The upregulation of many proteins can be induced by more than one specific stressor. For example, HSPs, which play a protective role during heat stress (as discussed above), are activated in response to other stressors as well, including heavy metals and pesticides (Lee et al. 2006), and they are often used as indicators of stress levels in toxicological studies (Gupta et al. 2010). Similarly, gene expression levels of Cu/Zn-SOD increased in the bumblebee B. ignitus in response to low and high temperatures and bacterial infection (Choi et al. 2006). Bacterial infection also altered gene expression levels of Cu/Zn-SOD in the scallop Chlamys farreri (Ni et al. 2007). Because freshwater mussels are filter feeders, they are constantly exposed to a wide range of pollutants. The long-lived, sessile nature of freshwater mussels also makes them useful indicators of water quality (Naimo 1995).
With the growing interest in effects of contaminant mixtures and contaminants of emerging concern (de Solla et al. 2016; Montes-Grajales et al. 2017), gene expression-based studies can increase our understanding of the mode of action of various chemical and biological pollutants and their interactions.
We have provided a publicly available transcriptome of A. plicata and discussed how this resource can be used for gene expression-based studies in response to common stressors such as temperature, hypoxia, and pollutants. Because transcriptome profiling is relatively expensive, this transcriptome provides researchers with a resource from which to select candidate genes for designing microarrays or real-time quantitative PCR assays for in situ or lab-based conservation work. Such targeted approaches will allow relatively inexpensive studies of gene expression among multiple individuals and across a wide variety of environmental conditions in both natural and experimental settings. Furthermore, the corresponding increase in genomic information for closely related species will enable a more in-depth functional characterization of the genes in the A. plicata transcriptome that currently lack annotations. Transcriptomes of nonmodel organisms also can be used for many other applications, such as detection of alternative splicing, development of molecular markers (e.g., single-nucleotide polymorphisms), gene discovery, and identification of conservation units, making them effective tools in evolutionary and population genetics analyses (Ekblom and Galindo 2011).
This work was supported by the Ohio Division of Natural Resources Division of Wildlife Grant through the Ohio Biodiversity and Conservation Partnership and the Columbus Zoo and Aquarium. We thank Marymegan Daly for assistance with project design and Jason Macrander for help with laboratory work.