Modern phylogeographic methods have confirmed that species with broad ranges often exhibit fine-scale patterns of genetic variation that are not reflected in their morphology. Recent genetic analyses of the straw-colored fruit bat (Eidolon helvum) deviate from this trend in identifying this species as broadly panmictic across its range in Sub-Saharan Africa. However, the limitations of sampling, along with potential for modern anthropogenic impacts to distort observed patterns, suggest that additional work is needed to assess true historical patterns of geographic variation in this species. We used Next Generation Sequencing (NGS) methods to assess patterns of variation found in historical samples of E. helvum and its sister species, E. dupreanum (a Malagasy endemic). Patterns of genomic variation observed among specimens collected between 1909 and 1983 were compared with those from more recently collected tissue samples from across much of the range of the genus. Our genetic analyses confirm that E. helvum and E. dupreanum are distinct species as traditionally recognized. Congruent with results from prior analyses of modern samples, no patterns of spatial genomic structuring were identified in E. helvum across continental Africa in either recent times or earlier in the 20th century. These results suggest that the currently observed pattern of panmixia in E. helvum is not a recent phenomenon; significant gene flow is apparently ongoing in this species across an exceptionally large area. This suggests that potentially zoonotic pathogens previously associated with populations of E. helvum may be similarly distributed or episodically transmitted across broad areas by this species. 
Our study additionally demonstrates that analyses utilizing ‘archival’ DNA from older specimens in museum collections have the potential to illuminate patterns of both past and contemporary biodiversity, and to help assess the impacts of habitat loss and climate change on species at the genomic level.
Quantifying and mapping the distribution of genetic variation within and among species in a comparative framework allows us to better understand both the drivers of speciation and the factors influencing the distribution of biodiversity across the landscape. In an applied context, these data may be used to prioritize conservation and management of both species and areas because patterns of genetic variation may be correlated with local adaptation to distinct habitat types (Ravigné et al., 2009), contemporary structural barriers (Manel et al., 2003), and/or pre-historic periods of isolation (Hayward, 2009). For example, evidence for local adaptation may suggest that species translocation would be ill-advised; genetically mixing populations that have adapted to disparate local conditions could cause outbreeding depression, which is the reduction of fitness resulting from the breakdown of coadapted gene complexes (Shields, 1982; Templeton, 1986). Evidence for restricted gene flow related to structural barriers can provide data important for reserve design and/or maintenance of migratory corridors, as well as information important for predictions of future effects of climate change on populations. Evidence for ancient isolation may additionally suggest a need for re-evaluation of taxonomic status or increased protection for areas of local endemism.
Observations of population substructuring or evidence for cryptic diversity within widely distributed species may offer valuable insights regarding the maintenance of genetic diversity and long-term viability of species, as well as providing critical information on how species respond to isolation (Shifman and Darvasi, 2001). Populations that have been isolated from the core distribution of a species for extended periods may be threatened by limited genetic variation as a result of genetic drift (Pardo et al., 2005) or inbreeding (Nei, 1972). Co-occurring patterns of isolation and local endemism across diverse taxonomic groups may suggest that an entire fauna is responding to a shared biogeographic history. In contrast, discordant patterns of geographically patterned variation across taxonomic groups may suggest that each lineage represents a unique evolutionary trajectory, and such patterns might limit the applicability of conservation policies designed around ‘umbrella species’ (Grady and Quattro, 1999). Species exhibiting discordant patterns of variation may respond differently to risk factors or exhibit increased resilience in the face of anthropogenic change.
Over the past two decades, modern molecular approaches to mapping fine-scale geographic patterns of genetic variation have uncovered previously unrecognized diversity in many widespread species. For several widespread African vertebrate species, such methods have revealed extensive cryptic diversity and evidence for shared biogeographic histories of unrelated taxa (e.g., bushbuck: Moodley and Bruford, 2007; crocodiles: Hekkala et al., 2011; forest geckos: Leaché et al., 2014; monitor lizards: Dowell et al., 2015; giraffes: Fennessy et al., 2016; leopards: Anco et al., 2018). In many cases, fine-scale patterns of genetic variation do not correspond to overt morphological distinctiveness (e.g., crocodiles: Hekkala et al., 2011; harbor porpoises: Lah et al., 2016; forest elephants: Bourgeois et al., 2018). These previously unrecognized patterns of diversification can help us to map and maintain global biodiversity.
Conversely, some species may show little genetic variation and generally lack evidence of geographically structured genetic partitioning. In such panmictic species, gene flow is as likely between distant populations (> 4,500 km apart) as between neighboring populations (Wallace, 1894; Beerli and Palczewski, 2010; Peel et al., 2013). For terrestrial species, barriers to gene flow may arise through geological processes such as mountain uplift or the incursion of water bodies. However, it is expected that volant, or flying, animals such as birds would have the ability to disperse more easily and hence be less subject to geographic barriers than their non-flying relatives (Bohonak, 1999; Burns and Broders, 2014). Accordingly, panmixia across broad geographic ranges may be more common in volant species than in non-volant lineages (Reudink et al., 2011; Peel et al., 2013).
Among mammals, bats (order Chiroptera) represent the only truly volant group and may provide an opportunity to test hypotheses about the role of dispersal in maintaining panmixia. Several species of bats have been reported to show patterns of panmixia including Tadarida brasiliensis in North America (Russell et al., 2005; Speer et al., 2017) as well as Pteropus sp., Nyctalus noctula, Rousettus leschenaultii, Cynopterus sphinx, Eptesicus serotinus, and Epomophorus gambianus in the Old World (Webb and Tidemann, 1996; Petit and Mayer, 1999; Chen et al., 2010; Peel et al., 2013; Moussy et al., 2015; Riesle-Sbarbaro et al., 2018). Despite these examples, panmixia is thought to be relatively rare in bats, with most species exhibiting patterning of genetic structure across their range (Juste et al., 2009; Clare et al., 2013; Stoffberg et al., 2012).
The straw-colored fruit bat, Eidolon helvum (Kerr, 1792), is a frugivorous bat that is one of the most conspicuous migratory megachiropteran species in Africa. Aspects of the natural history and distribution of this species make it an ideal candidate to test hypotheses regarding what factors, if any, may drive population structuring in a volant mammal. This species has a wide distribution across equatorial and sub-Saharan Africa from Senegal in the west to Ethiopia in the east, and south to South Africa (Lang and Chapin, 1917; Bergmans, 1990; see Fig. 1). Until recently, most of the distributional and ecological information available for E. helvum was based on hypotheses and observations made by Herbert Lang and James Chapin over 100 years ago during their expeditions in the African Congo (Lang and Chapin, 1917). Bergmans (1990) described a geographical distribution of E. helvum with some isolated populations scattered outside the main range of the species, but the full distribution of this taxon remains to be confirmed. Eidolon dupreanum, the sister species and only other member of the genus, lives solely on the island of Madagascar (Bergmans, 1990; Peterson et al., 1995; Shi et al., 2014; Andriafidison et al., 2020).
Eidolon helvum are moderately large pteropodids with a body mass of 230–350 g and forearm lengths ranging from 117 to 132 mm (O'Toole, 2019). Synchronized seasonal roosting patterns exist across the range of E. helvum, with large colonies (some estimated at 5–10 million individuals) reported in the Democratic Republic of the Congo, Uganda, Ivory Coast, Malawi, Nigeria, Angola, Zambia, and Mauritania (Mutere, 1967; Ansell, 1981; Thomas, 1983; DeFrees and Wilson, 1988; Bergmans, 1990; Cosson et al., 1996; Sorensen and Halberg, 2001; Hranac et al., 2019; van Toor et al., 2019; Hassanin et al., 2020). This species has been described as an opportunistic feeder, migrating to regional food supplies which primarily include fruiting trees (DeFrees and Wilson, 1988). The need to forage results in patterns of mass movement each day when feeding begins and bats leave their communal roost trees to find fruiting trees (Webala et al., 2014). Roost sites selected during the day are in tall trees, lofts in caves, and rocks (O'Toole, 2019). Trees used as day roosts are large with spreading branches, commonly found in dense groves with thick undercover (O'Toole, 2019). In its natural habitat, E. helvum remains alert and active during the day with eyes open, ears erect, and in constant motion (Jones, 1972). At night, roosts are apparently chosen according to food availability. Roost trees are variable in terms of height, size, and spatial distribution (Marshall, 1985; Taylor and Kankam, 1999). Roosting clusters are located 6–20 m above ground on sturdy branches (O'Toole, 2019).
During periods of migration (between October and December in East Africa) colonies disperse into small groups and form temporary roosts from which they eventually form ‘regular’ roosts (Thomas, 1983; Richter and Cumming, 2006). While some populations of E. helvum in Sub-Saharan Africa are non-migratory, individuals have been observed to travel as far as several thousand kilometers (Kingdon, 1984; Ossa et al., 2012). There is currently no evidence for sex-specific migratory behavior.
Breeding is seasonal, with most copulation occurring from April to June (O'Toole, 2019). Eidolon helvum exhibits gestational diapause, wherein the fertilized egg develops to the blastocyst stage but does not continue development until implantation later in the year, usually in October. Births take place from February to May, prior to the onset of the higher of the two rainfall peaks. Females produce one offspring per pregnancy, and birth occurs in maternity colonies that are clusters of females (Mutere, 1967; Funmilayo, 1979).
Genetic analyses by Peel et al. (2013) described E. helvum as the largest panmictic unit of any known mammal species. Based on microsatellite data and some mitochondrial/nuclear markers from contemporary samples, Peel et al. (2013) found evidence of genetic connectivity across much of the range, with some differentiation exhibited by an island population off the coast of Equatorial Guinea (São Tomé). They suggested high connectivity of populations across the continent through the central equatorial breeding and migratory zone. However, sampling in the Peel et al. (2013) study was limited in geographic scope (Fig. 1), and few genetic analyses have been conducted beyond the equatorial region of the continent. Additional sampling is required to determine if the panmixia extends over the entire range of the species. Unfortunately, sampling of extant E. helvum populations is limited by both political boundaries and ongoing forest fragmentation across the range of the species. Challenges presented by terrain, threats of politically unstable regions, disease outbreaks, and government bureaucracy surrounding permitting inhibit opportunities to collect samples from mammals including bats in these critically threatened ecosystems. Because of these limitations on sampling, we focused our research on available specimens archived in museum collections.
Archival samples offer a potentially rich source of data for genetic studies given modern genomics techniques (Bi et al., 2013). Museum specimens in many cases can fill in sampling gaps in more recent collections, and in some cases can allow direct temporal comparisons between past and current genetic diversity (Wandeler et al., 2007; Bi et al., 2013). Archival specimens may sample habitats no longer accessible or existent, such as in areas affected by natural disasters or converted by humans to agriculture or urban spaces (Davis, 1996; Ponder et al., 2001; Suarez and Tsutsui, 2004). It is unclear to what degree contemporary changes to African forests, such as fragmentation, hunting, and the wildlife trade, may have influenced observed patterns of genetic variation in vertebrates, particularly those tightly tied to forest ecosystems by virtue of their ecological traits (e.g., dietary and roosting habits) or subject to focused hunting. Studies of archival genetic materials provide a means of testing a priori assumptions regarding patterns of population structure and dynamics of living species within a temporal context (Bi et al., 2013). The ability to evaluate data derived from archival samples relative to those based on contemporary samples provides the opportunity to increase sampling range and address questions with new context, comparing groups of samples from different areas and time periods. By using archival museum specimens of E. helvum to expand on work done with contemporary samples by Peel et al. (2013), a greater depth of coverage can be achieved in threatened but less accessible areas (e.g., the Congo River Basin) to better understand the historical diversity of this widespread species.
The goal of the current study was to use genomic analyses of historical samples collected in the early part of the 20th century to expand sampling and to assess whether there is evidence that population substructuring may have previously existed across the range of E. helvum. Historical samples can provide baseline evidence for pre-deforestation patterns in bats like E. helvum, which rely on forest trees for roosting and feeding. We sought to test several nonexclusive hypotheses of how populations might be substructured (Fig. 2). These bats, like members of some avian communities (De Klerk et al., 2002; Huntley et al., 2019), may respond to terrestrial barriers. For example, the Mambila Mountains, the Ethiopian highlands, and the Katanga Plateau could act as barriers to gene flow (Fig. 2A). This trend has not been explicitly demonstrated in terrestrial African taxa but was suggested by Moreau (1972) for Palaearctic-African bird migration systems. An alternative isolating mechanism might result from river basins acting as refugia for different populations during periods of continental drying. The Niger, Congo, Nile, and Zambezi Rivers may have harbored E. helvum populations (Fig. 2B). Prior studies have shown that some avian and mammalian taxa are confined to river basins where all resources are readily available, as suggested by Huntley et al. (2018) regarding the Guineo-Congolian Forests. Several taxonomic groups, including terrestrial reptiles and mammals, exhibit patterns consistent with refugia formed during the expansion and contraction of the Sahara (Dowell et al., 2015; Fennessy et al., 2016; Anco et al., 2018; Bertola et al., 2019; Leaché et al., 2020). Alternatively, E. helvum may truly represent a panmictic species as suggested by the data presented in Peel et al. (2013), extending as a single genetic population across the tropical rain forests of Africa (Fig. 2C).
This study was designed to evaluate these alternatives with a geographically and temporally broader data set than employed by Peel et al. (2013).
Materials and Methods
Sample Collection and Processing
A total of 41 archival specimens of E. helvum were sampled from the collections of the Department of Mammalogy at the American Museum of Natural History (AMNH) (Table 1). Sample localities (Fig. 1) were chosen to maximize the breadth of geographic range but were limited by availability in the collections. Where possible, one male and one female were sampled from each locality in order to ensure that sex-specific variation in genetic connectivity would not skew results. Of the samples collected, 29 were dry study skins. Study skins are specimens that are prepared by removing the internal organs and much of the skeleton, after which the skin is stuffed with cotton and dried in a position that facilitates measurement by future researchers. For each study skin that we sampled, a small fragment of tissue was taken from the forearm and lip margin. Ethanol-preserved specimens account for the remaining 12 samples. These samples are whole-body specimens that have been preserved and/or stored in ethanol (Simmons and Voss, 2009), and will hereafter be referred to as ‘wet’ specimens. Specimens collected after the late 1920s were routinely treated with formalin to fix the tissues and ensure the integrity of the specimen, but formalin treatment was not recorded for all samples. For wet specimens, we took muscle tissue from the abdominal region, where incisions had typically been made during preparation in order to facilitate preservation.
Table 1. Samples collected from the AMNH. Catalog numbers in bold indicate samples retained after quality filtering.
All molecular laboratory work was conducted in the Sackler Institute of Comparative Genomics at the AMNH. To minimize contamination, extractions and amplifications were conducted in a clean room facility separate from contemporary samples and post-PCR products, and negative experimental controls and contamination prevention protocols were used throughout (Cooper and Poinar, 2000; Pääbo et al., 2004; Gilbert et al., 2005; Willerslev and Cooper, 2005; Hekkala et al., 2011). Archival DNA (aDNA) was isolated using a modified version of the MinElute Reaction Cleanup kit (Qiagen) (Dowell et al., 2015; Anco et al., 2018). Tissue samples were rinsed with molecular grade water prior to extractions. Samples were then digested in a 55°C heat block using 20 µl proteinase K and 180 µl ATL lysis buffer. After three days, an additional 20 µl of proteinase K was added and digestion continued for two more days. Once digested, the DNA was extracted and cleaned using the Qiagen MinElute kit and eluted to 150 µl per sample. Genomic DNA was quantified using a Qubit fluorometer (Invitrogen).
Genomic Libraries and High-throughput Sequencing
Genomic libraries were built using 50 µl of each sample and Illumina platform-specific oligonucleotide adapters unique to each library, using the NEBNext Ultra DNA Library Prep Kit for Illumina and following the TruSeq DNA Sample Preparation V2 protocol combined with the protocol from Yao et al. (2017). The DNA was processed without shearing because it was already fragmented by age. Quantification indicated that the amount of DNA per extract was low; therefore, following Cui et al. (2013), adapters were diluted 1:20. We conducted AMPure XP bead clean-ups to remove adapter dimers. Sample libraries were dual-indexed using NEBNext Multiplex primers to allow pooling for sequencing on a single HiSeq lane (Kircher et al., 2012). NEBNext High Fidelity 2X PCR Master Mix was used to amplify the libraries because its proofreading properties limit nucleotide misincorporations that may arise from cytosine deamination (Ginolhac et al., 2011). After amplification, genomic libraries were cleaned using the Qiagen MinElute Purification Kit. The libraries then underwent quality control assessment using the Agilent 2100 Bioanalyzer to determine fragment size. Concentration was determined using a Qubit 2.0 Fluorometer. All 41 genomic libraries were pooled and sent for sequencing at Novogene (Davis, CA). Size selection using SPRI beads was performed to remove adapter dimers (ca. 120–140 bp). Sequencing consisted of 150 bp paired-end reads on the Illumina HiSeq 4000 platform.
Alignment and Assessment
AdapterRemoval2 was used to trim Illumina adapters from the reads and to remove any sequences that contaminated the samples after library build (Lindgreen, 2012; Schubert et al., 2016). Trimming also reduced false variant discovery by filtering out low-quality, damaged reads (Kircher et al., 2012). BWA-MEM was used to align the reads from each sample to an E. helvum reference genome (GenBank ID: ASM46528v1; Parker et al., 2013) and to a Pteropus alecto mitogenome (GenBank ID: ASM32557v1; Zhang et al., 2013), since an E. helvum mitogenome is not currently available. FastQC was run to determine the number and quality of the reads (Andrews, 2016). FastQScreen was run to determine the approximate amount of Eidolon DNA present compared to exogenous DNA, and overall contamination levels (Wingett and Andrews, 2018). SAMtools was used to sort and index the alignments and to quantify contamination by examining informative sites (Li et al., 2009). Picard was used to further process the alignments and remove duplicate reads.
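The mapping and sorting steps described above can be sketched as a short command plan. This is a minimal sketch and not the authors' actual invocation: the file-naming scheme, the thread count, and the helper `build_alignment_commands` are hypothetical, and only basic options of BWA and SAMtools are shown. Adapter trimming and duplicate removal are omitted because the text does not report their settings.

```python
import shlex
from typing import List


def build_alignment_commands(sample: str, reference: str,
                             threads: int = 4) -> List[str]:
    """Return shell commands that map, sort, and index one sample.

    Assumes trimmed paired-end reads named <sample>_R1.fastq.gz and
    <sample>_R2.fastq.gz (hypothetical naming) and a reference FASTA
    that has already been indexed with `bwa index`.
    """
    r1 = shlex.quote(f"{sample}_R1.fastq.gz")
    r2 = shlex.quote(f"{sample}_R2.fastq.gz")
    bam = shlex.quote(f"{sample}.sorted.bam")
    ref = shlex.quote(reference)
    return [
        # Map read pairs and stream the alignments straight into a
        # coordinate sort, writing a sorted BAM file.
        f"bwa mem -t {threads} {ref} {r1} {r2} | samtools sort -o {bam} -",
        # Index the sorted BAM so downstream tools can query by region.
        f"samtools index {bam}",
    ]


if __name__ == "__main__":
    # "AMNH_0001" and "ref.fa" are placeholder names for illustration.
    for command in build_alignment_commands("AMNH_0001", "ref.fa"):
        print(command)
```

Building the commands as data rather than running them directly makes it easy to log exactly what was executed for each of the 41 libraries, which matters when archival samples are later excluded by quality filters.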
The Genome Analysis Toolkit (GATK) was used to call single nucleotide polymorphisms (SNPs) (McKenna et al., 2010). The toolkit was also used to filter the variant call format (VCF) file by removing insertions, deletions, and multiallelic sites (Van der Auwera et al., 2013). Plink was used for additional filtering of the VCF to exclude individual samples that were missing too much genotype data and to exclude SNPs on the basis of missing genotype rate (Purcell et al., 2007). The Admixture program was run on the filtered VCF for 1–9 populations (K = 1–9) to obtain maximum likelihood estimates of ancestry and to determine the amount of DNA in an individual from distantly related species or populations that resulted from interbreeding between previously reproductively isolated populations (Alexander et al., 2009). Admixture requires genotype data from the proposed admixed and ancestral populations and can be used efficiently for whole genome sequencing SNP data. As the number of generations since the beginning of admixture increases, more markers are required to detect all ancestry switches because recombination events accumulate linearly with the number of generations (Shriner, 2017). Admixture was also used to estimate FST values and to calculate the cross-validation error for each value of K.
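The two missingness filters described above (per sample and per SNP) can be illustrated on a toy genotype matrix. This is a sketch of the general idea only: the thresholds, the order in which the two filters are applied, and the function names are assumptions, since the text does not report the settings used in Plink.

```python
from typing import List, Optional, Tuple

# 0, 1, or 2 copies of the alternate allele; None marks a missing call.
Genotype = Optional[int]


def missing_rate(calls: List[Genotype]) -> float:
    """Fraction of missing calls in a list of genotypes."""
    return sum(g is None for g in calls) / len(calls)


def filter_by_missingness(
    matrix: List[List[Genotype]],
    max_sample_missing: float,
    max_snp_missing: float,
) -> Tuple[List[int], List[int]]:
    """Return the indices of the samples and SNPs that pass both filters.

    matrix[i][j] is the genotype of sample i at SNP j. Samples are
    filtered first (an assumed order); SNP missingness is then
    recomputed using only the retained samples.
    """
    kept_samples = [i for i, row in enumerate(matrix)
                    if missing_rate(row) <= max_sample_missing]
    n_snps = len(matrix[0]) if matrix else 0
    kept_snps = []
    for j in range(n_snps):
        column = [matrix[i][j] for i in kept_samples]
        if column and missing_rate(column) <= max_snp_missing:
            kept_snps.append(j)
    return kept_samples, kept_snps


if __name__ == "__main__":
    # Toy data: three samples by four SNPs, with hypothetical thresholds.
    toy = [
        [0, 1, 2, None],
        [None, None, None, 1],
        [1, 1, None, None],
    ]
    print(filter_by_missingness(toy, 0.5, 0.5))
```

With low-coverage archival data the order of the two filters can change which samples survive, so it is worth recording the order alongside the thresholds.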
Of the 41 samples analyzed, 28 yielded DNA extracts with measurable concentrations of nucleic acids (> 0.5 ng/µL; Table 1). For the remaining 13 samples, the concentration of DNA in the extracts was too low for the Qubit to detect (< 0.5 ng/µL). The pre-library preparation concentrations are shown in Table 1. All negative extraction controls contained no traces of genomic DNA. Libraries were prepared from all 41 samples using only 1–10 ng of DNA input. Subsequent Bioanalyzer runs indicated a peak of DNA fragments ∼170–350 bp long after libraries were prepared and, in some cases, a peak around 150 bp indicating the presence of adapter dimers. Technicians at the sequencing facility (Novogene, Davis, California) used a size selection protocol to remove adapter dimers prior to sequencing. Sequencing on the HiSeq 4000 yielded an average of 16,851,030 reads per sample (range = 802,224–154,910,621). The quality control programs FastQC and FastQScreen reported the number of reads and highlighted any contamination that needed to be filtered out. All steps taken to check for contamination gave negative results.
Mitogenomic Sequencing Data Recovery
Alignment to the mitochondrial genome was attempted first using the Pteropus alecto mitogenome (GenBank ID: ASM32557v1; Zhang et al., 2013). On average, 0.05% of the reads mapped to the mitogenome and no reads mapped to the cytochrome b gene used by Peel et al. (2013). This result suggests that the majority of mitochondrial sequences (> 350 bp) that may have been present in the post-library preparation pool were removed during the SPRI bead size selection used to remove adapter dimers. Accordingly, no mitochondrial regions were used in subsequent analyses.
Genomic Sequencing Data Recovery
For the nuclear genome, all (n = 41) pooled samples had reads that were successfully aligned to the E. helvum reference genome (GenBank ID: ASM46528v1; Parker et al., 2013), indicating at least partial recovery of nuclear data from the archival samples. All twelve ethanol-preserved (‘wet’) specimens from the collection at the AMNH produced genomic data. After trimming and initial quality filtering, 78.7% of reads mapped to the nuclear reference genome (Table 1). However, after conservative filtering for missing genotypes and missing SNPs per individual and more stringent alignment to the nuclear genome, 30 of 41 samples were retained (catalog numbers in bold in Table 1). Of these 30 samples, six were wet specimens, indicating that 50% of the original wet specimens passed filtering and yielded viable results (Table 1).
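A mapping rate such as the one reported above can be derived from per-reference read counts. The sketch below assumes counts in the four-column, tab-separated layout written by `samtools idxstats` (reference name, length, mapped read-segments, unmapped read-segments); the scaffold names and counts are toy values for illustration, not data from this study.

```python
from typing import Dict, Tuple


def parse_idxstats(text: str) -> Dict[str, Tuple[int, int]]:
    """Map each reference name to its (mapped, unmapped) counts."""
    counts: Dict[str, Tuple[int, int]] = {}
    for line in text.strip().splitlines():
        name, _length, mapped, unmapped = line.split("\t")
        counts[name] = (int(mapped), int(unmapped))
    return counts


def mapped_fraction(counts: Dict[str, Tuple[int, int]]) -> float:
    """Fraction of all read-segments that aligned to any reference."""
    mapped = sum(m for m, _ in counts.values())
    unmapped = sum(u for _, u in counts.values())
    total = mapped + unmapped
    return mapped / total if total else 0.0


if __name__ == "__main__":
    # Toy input; the final "*" row holds reads with no assigned reference.
    toy = "scaffold_1\t1000\t600\t0\nscaffold_2\t500\t150\t0\n*\t0\t0\t250\n"
    print(mapped_fraction(parse_idxstats(toy)))
```

The same parsed counts can be restricted to a single reference name to obtain the share of reads mapping to that sequence, for example a mitogenome.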
Population Genomic Analyses
Using the Admixture software with the number of subpopulations (K) ranging from one to nine, the best-supported K value, with the lowest cross-validation error, was determined to be nine (Fig. 3). Cross-validation estimates the prediction error of a model and is commonly used to choose among candidate models. Limitations of computing power did not allow estimation above K = 9. Admixture plots for all K values between one and nine are shown in Fig. 4. In the Admixture plots, the vertical bars indicate individual samples and the haplotypes are indicated by different colors at each K value. Corresponding haplotypes that were detected within each sampling area for K = 9 are color-coded over the sample's locality (Fig. 5). In that plot, our samples of E. dupreanum (Madagascar; shown in red) were separated from the majority of the E. helvum samples, indicating that the analysis was capable of detecting species limits within Eidolon.
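Selecting K by cross-validation error can be sketched as follows. The sketch assumes log lines of the form `CV error (K=3): 0.52341`, as printed by Admixture when cross-validation is enabled; the error values in the example are invented for illustration and are not results from this study.

```python
import re
from typing import Dict

# Pattern for the cross-validation line in an Admixture log.
CV_LINE = re.compile(r"CV error \(K=(\d+)\): ([0-9.]+)")


def parse_cv_errors(log_text: str) -> Dict[int, float]:
    """Collect the cross-validation error reported for each K."""
    return {int(k): float(err) for k, err in CV_LINE.findall(log_text)}


def best_k(cv_errors: Dict[int, float]) -> int:
    """Return the K with the lowest cross-validation error."""
    return min(cv_errors, key=cv_errors.get)


def at_upper_boundary(cv_errors: Dict[int, float]) -> bool:
    """True if the best K is the largest K that was tested."""
    return best_k(cv_errors) == max(cv_errors)


if __name__ == "__main__":
    # Invented values for illustration only.
    toy_log = """
    CV error (K=1): 0.61000
    CV error (K=2): 0.58000
    CV error (K=3): 0.55500
    CV error (K=4): 0.54000
    """
    errors = parse_cv_errors(toy_log)
    print(best_k(errors), at_upper_boundary(errors))
```

The boundary check is a useful caution: when the lowest error falls at the largest K tested, the error may still be decreasing, so that K is best read as a lower bound on the supported number of clusters within the range examined.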
The use of historical archived museum collections provides new genomic data to explore hypotheses regarding the distribution of genomic variation in E. helvum. Data were recovered from a majority of samples including specimens stored in both wet and dry curated collections, some over 110 years old. Unfortunately, due to the pre-sequencing processing of libraries, we were unable to recover mitochondrial data for direct comparative analysis of the mitochondrial cytochrome b gene region published by Peel et al. (2013). Although the depth of coverage limited the quality of the dataset and therefore constrains possible interpretations, nuclear data provide some evidence for previously unrecognized variation across the distribution of E. helvum. A minimum depth of 5× is recommended for well supported admixture analyses (Meisner and Albrechtsen, 2018). The current data provided by the single lane of shotgun sequencing for a pool of 41 sample libraries averaged 1.5×, with some genomic regions being represented and other regions completely missing. Despite these limitations, admixture analyses resulted in the most well supported K value and lowest cross-validation error at K = 9, suggesting population structuring into eight subpopulations plus the sister taxon, E. dupreanum.
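The relationship between read count and depth of coverage discussed above follows the standard approximation that depth equals reads times read length divided by genome size. The sketch below applies it to the average read count and 150 bp read length reported in this study; the genome size of 2 Gb is a placeholder assumption, not a value from the text, and the function names are hypothetical.

```python
import math


def mean_depth(n_reads: int, read_length: int, genome_size: int,
               mapped_fraction: float = 1.0) -> float:
    """Approximate mean depth: reads x length x mapped fraction / genome size."""
    return n_reads * read_length * mapped_fraction / genome_size


def reads_for_depth(target_depth: float, read_length: int, genome_size: int,
                    mapped_fraction: float = 1.0) -> int:
    """Smallest number of reads expected to reach the target mean depth."""
    return math.ceil(target_depth * genome_size / (read_length * mapped_fraction))


if __name__ == "__main__":
    GENOME_SIZE = 2_000_000_000  # placeholder assumption, in base pairs
    print(mean_depth(16_851_030, 150, GENOME_SIZE))
    print(reads_for_depth(5.0, 150, GENOME_SIZE))
```

The output depends entirely on the assumed genome size, on the fraction of reads that map, and on whether reads are counted singly or as pairs, so it should not be expected to reproduce the reported 1.5× exactly. The second function shows how the recommended 5× minimum translates into a per-sample read target, which in turn determines how many libraries can share a lane.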
Theoretically, a panmictic population should have the lowest cross-validation error at K = 1, implying that there is no differentiation between subpopulations. Therefore, our analyses of nuclear SNP data for historical collections of E. helvum do not strongly support the Peel et al. (2013) hypothesis of panmixia in this taxon. However, neither do our results refute the idea that significant gene flow is ongoing across the entire range of this species. Before arguing conclusively that populations are strongly substructured in this species, additional sequencing with fewer samples per lane should be attempted and additional sampling and sequencing for each population should be conducted (see Future Directions below). In contrast, results of this study strongly support the validity of taxonomic recognition of E. dupreanum in Madagascar as a distinct species. The lack of admixture between mainland African E. helvum and Malagasy E. dupreanum individuals from three distinct regions within Madagascar supports the hypothesis that this species originated from a single dispersal event to the island, and that the Mozambique Channel represents a strong barrier to gene flow between these sister species. A single dispersal event to Madagascar corresponds well to patterns observed in several other mammalian taxonomic groups (Masters et al., 2006; Yoder et al., 2006) and is consistent with the recognition of E. dupreanum as a distinct species.
Behavioral and Ecological Influences
Specific behavioral and ecological characteristics of species contribute to observed patterns of genomic variation. For example, sex-biased dispersal is known to contribute to discordant patterns of population structuring observed in analyses of separate mitochondrial and nuclear data sets (Larmuseau et al., 2010). Additionally, fission-fusion patterns of population aggregation may result in increased opportunities for genetic admixture (Archie et al., 2008). Migration in particular can structure patterns of genomic variation in subsets of communities when different cohorts migrate together either spatially or temporally. For example, distinct salmon runs, while fully genetically compatible, frequently exhibit signatures of isolation by natal stream or seasonal onset of migration (Kovach et al., 2013). African bovids that migrate across vast areas retain evidence for structuring based on breeding grounds (Rege and Tawah, 1999). Such factors may have promoted patterns of limited substructuring of E. helvum across mainland Africa based on mitochondrial cytochrome b sequences (Peel et al., 2013), while nuclear SNP data presented here suggest at least some level of partitioning across mainland Africa.
Although little is currently known about mating patterns in Eidolon, competition for roost trees during fission-fusion roosting and foraging in this species is thought to drive movement of individuals across large areas (Richter and Cumming, 2006), which might have the effect of reducing regional differentiation. Given the effect just a few individuals can have on estimates of gene flow (Pardo et al., 2005), the tendency of E. helvum to roost in large colonies, some estimated at between five and 10 million bats (Sorensen and Halberg, 2001), might dampen differentiation among regional roosting areas. Migratory patterns observed in this species follow a north-south axis tracking seasonal bursts of resource availability (Fahr et al., 2015), and may result in structuring among previously unrecognized population subsections. In addition, while a large portion of the E. helvum population is non-migratory, some individuals are known to have migrated distances longer than 2000 km (Ossa et al., 2012). Eidolon helvum may also exhibit multiple small-scale migrations (100–300 km) in a stepwise fashion following the seasonal abundance of local food resources within Sub-Saharan Africa (Richter and Cumming, 2006). These behavioral and ecological factors may cloud interpretations of genomic data unless very large numbers of individuals are sampled across even greater geographic areas.
The data indicating panmixia in E. helvum presented by Peel et al. (2013) may be relevant to assessments of the potential spread of pathogens that pose risks to human health. Eidolon helvum is thought to be a potential reservoir host for several zoonotic viruses including Lyssavirus, Henipavirus, and Ebolavirus (Peel et al., 2013; Ogawa et al., 2015). Zoonotic spillover into humans is thought to be possible after exposure to urine, feces, or the preparation/consumption of bushmeat (Kamins et al., 2011; Baker et al., 2012; Drexler et al., 2012). No specific spillover events for Lyssavirus, Henipavirus, or Ebolavirus from E. helvum have been reported to date; however, it is not clear if this is due to lack of circulating viruses (e.g., bats having only transitory infections and not actually acting as reservoir hosts), lack of spillover events, or lack of detection due to poor medical surveillance and/or limited access to specific diagnostic assays in the parts of Africa affected (Mallewa et al., 2007; Baker et al., 2013; Ogawa et al., 2015). Regardless, the increased interactions between bats and humans caused by habitat loss, land use change, and bushmeat hunting increase the likelihood of future spillover events. Evidence for panmixia in a potential vector species across wide swaths of mainland Africa would suggest greater risk for widespread outbreaks of infectious disease than would evidence for population substructuring and limited gene flow (Plowright et al., 2011). Our analyses of genomic data suggest some level of structuring in E. helvum populations, but additional data are necessary to further investigate this possibility. Additional analyses of more samples collected across space and time would benefit epidemiological studies relevant to human health.
The Value of Archival Museum Samples
The DNA retained in archival collections is often highly degraded (Mason et al., 2011; Burrell et al., 2015; Liedigk et al., 2015; Yao et al., 2017), but Next Generation Sequencing techniques are providing researchers with the ability to sequence fragmented DNA faster and in an increasingly cost-effective manner. These methods may make it possible to sequence genomes at low coverage in a manner that makes them useful for studies of population variation, phylogeographic or phylogenomic analyses.
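To give a rough sense of what "low coverage" means in practice, the following sketch applies the standard Lander-Waterman expectation: mean coverage is the number of reads times read length divided by genome size, and, if reads land uniformly at random, the fraction of bases covered at least once is 1 − e^(−coverage). The read count, read length, and genome size used here are hypothetical placeholders and are not values from this study. Degraded archival DNA typically yields shorter and less evenly distributed fragments than this idealized model assumes.

```python
import math


def mean_coverage(n_reads, read_len, genome_size):
    """Expected mean depth: total sequenced bases divided by genome size."""
    return n_reads * read_len / genome_size


def fraction_covered(coverage):
    """Expected fraction of bases hit by at least one read (Poisson model)."""
    return 1.0 - math.exp(-coverage)


if __name__ == "__main__":
    genome_size = 2_000_000_000  # hypothetical 2 Gb genome, illustration only
    read_len = 100               # hypothetical read length in bases
    for n_reads in (2_000_000, 20_000_000, 100_000_000):
        c = mean_coverage(n_reads, read_len, genome_size)
        print(f"{n_reads:>11,} reads: {c:.2f}x mean depth, "
              f"{fraction_covered(c):.1%} of bases covered")
```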
In the current study, high-precision, high-quality sequencing reads were not returned from all samples. However, even the low-yield samples provide insights regarding which preparation types produce the most usable DNA. We obtained useful genetic sequences from tissue samples from the forearm and lip margin of dry study skins as well as from abdominal tissue from ethanol-preserved specimens. Different samples produced reads of varying quality, but we detected no evidence for a systematic pattern in data recovery based on tissue type or age of samples. Most previous studies using archival samples for genetic work have used dry tissue fragments due to concerns about the effects of formalin. Fixation of specimens in formalin prior to long-term storage in alcohol started around the mid-1920s (Simmons, 2014). Formalin has a tendency to shear DNA, reducing fragment length and hence increasing the work necessary for adequate assembly (Ruane and Austin, 2017). However, sometimes fluid-preserved material represents the only record of a species in a particular time and place. In this study, we augmented dry tissue samples with some samples from fluid-preserved specimens that represent otherwise unsampled regions such as Sudan and Liberia.
An additional, but unavoidable, issue with museum specimens may be the quality of documentation. Early records and specific locality information are often missing, and incomplete details regarding the treatment of samples in the field may limit interpretation of data.
The results of this study and those of other molecular studies utilizing archival DNA indicate the potential of these methods to increase our understanding of both past and contemporary biodiversity. As habitat loss and climate change threaten the diversity of mammals in Africa, it is essential to document regions of genetic diversity and endemism as potential conservation units. In some cases, representative samples may only exist in museum collections due to recent local extirpation. Future research on E. helvum can be supplemented by increasing sample sizes using specimens from other natural history collections. Additional samples of E. helvum are available from, among others, the National Museum of Natural History in Washington, DC (Ghana, Nigeria, and Cameroon), the Field Museum of Natural History in Chicago, IL (Sierra Leone and Liberia), the Muséum National d'Histoire Naturelle in Paris (Mali and Mauritania), and the Natural History Museum in London (isolated populations in Niger and South Africa) (Bergmans, 1990). The London samples are particularly important because potential geographic barriers exist between the core populations of E. helvum and those in Niger and South Africa.
The authors would also like to thank the Mammalogy staff at the American Museum of Natural History (AMNH) for permission and access to sampling from the collections. Additional thanks to the research staff in the Sackler Institute for Comparative Genomics at the AMNH for assistance in the lab as well as with the bioinformatic analyses. We would also like to thank our peers at Fordham and the AMNH for constant feedback and support throughout this entire project. Funding was provided by the Fordham University Department of Biological Sciences, Fordham University Graduate School of Arts and Sciences, and the Mammalogy Department at the American Museum of Natural History.