The genus Anneslea Wall. (Pentaphylacaceae, previously Theaceae) contains only four species: A. fragrans Wall. occurs in China and Southeast Asia and is the most widespread of these species, while A. donnaiensis (Gagnep.) Kobuski and A. paradoxa H. H. Nguyen & Yakovlev are found only in Vietnam, and A. steenisii Kobuski is observed only in Sumatra (Angiosperm Phylogeny Group, 2016; Hassler, 2017). For A. fragrans, six varieties have been reported, with four distributed in China and two in Malaysia and Vietnam (Hassler, 2017). Anneslea was listed as a relict genus in tropical Asia (Liao and Jin, 2014); its current status remains unknown and calls for scientific attention. Using “Anneslea” as a keyword, only two papers were found in a search of the Web of Science database (http://apps.webofknowledge.com), both of which analyzed the chemical constituents extracted from A. fragrans.
Anneslea fragrans, an evergreen tree or shrub (3–15 m in height), is an important species (Min and Bruce, 2007). Its Chinese name, which translates to “tea pear” in English, was given because of its reddish, Camellia-like flowers and edible, pear-like fruits. It has strong ecological adaptability to extreme environments, grows quickly, and is highly resistant to pests (Shen and Wang, 2009). It is planted as a garden tree in China.
In this study, we shotgun sequenced the A. fragrans genome with Illumina sequencing technology. Based on the assembled contigs, 30 polymorphic genomic simple sequence repeat (SSR) loci were developed and characterized in three populations of this species. The transferability of these markers was tested in Ternstroemia gymnanthera (Wight & Arn.) Bedd., T. kwangtungensis Merr., and Cleyera pachyphylla Chun ex H. T. Chang, which were previously listed in Theaceae, and later ascribed to Pentaphylacaceae (Min and Bruce, 2007; Angiosperm Phylogeny Group, 2016).
METHODS AND RESULTS
A seedling of A. fragrans was collected from the South China Botanical Garden, Guangdong, China (23°11′9.09″N, 113°22′22.51″E), and planted in the greenhouse of Sun Yat-sen University. Total genomic DNA was extracted from fresh leaves using the modified cetyltrimethylammonium bromide (CTAB) method (Doyle and Doyle, 1987). A DNA library was constructed following the Illumina protocol and sequenced on the HiSeq X Ten System (Illumina, San Diego, California, USA). The raw data were filtered with NGSQCToolkit_v2.3.3 (Patel and Jain, 2012), removing low-quality reads that contained unknown “N” bases or in which more than 10% of bases had a Q value <20. In total, 25.4 million cleaned 145-bp paired reads were obtained and assembled de novo into 445,162 contigs with Edena v3.131028 (Hernandez et al., 2008). The mean contig length and the N50 value were 433 bp and 455 bp, respectively, and the longest contig was 39,694 bp. The cleaned reads and the assembled contigs were deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA; SRR5880481) and Transcriptome Shotgun Assembly (TSA; GFTZ00000000) databases, respectively.
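The read-filtering rule and the N50 statistic reported above can be illustrated with a minimal Python sketch. This is not the NGSQCToolkit or Edena implementation; the function names `passes_filter` and `n50` are hypothetical, and reads are assumed to be supplied as a base string plus a list of already-decoded Phred scores.

```python
def passes_filter(seq, quals, min_q=20, max_low_frac=0.10):
    """Keep a read only if it has no 'N' base and at most 10% of its
    bases have a quality score below min_q (thresholds from the text)."""
    if "N" in seq.upper():
        return False
    low = sum(1 for q in quals if q < min_q)
    return low <= max_low_frac * len(quals)


def n50(lengths):
    """Return the contig length L such that contigs of length >= L
    together cover at least half of the total assembly length."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs supplied")
```

For example, `n50([2, 3, 4, 5, 6])` returns 5, because the two longest contigs (6 + 5 = 11) already cover more than half of the 20 bp total.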
Table 1. Characteristics of 30 polymorphic genomic SSR loci developed for Anneslea fragrans.
Using the Perl script MISA (Thiel et al., 2003) with default parameters, except that mononucleotide repeats were excluded from the analysis, we detected a total of 30,409 SSRs in 25,855 contigs. Among these SSR loci, dinucleotide repeats were the most common (80.4%), followed by trinucleotide (15.3%), tetranucleotide (3.0%), pentanucleotide (0.8%), and hexanucleotide repeats (0.5%).
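To make the SSR search concrete, the following Python sketch finds perfect di- to hexanucleotide repeats with regular expressions. It is an illustration only, not the MISA script itself: the function names are hypothetical, and the minimum repeat counts in `MIN_REPEATS` are assumptions meant to resemble commonly used MISA settings, so they should be checked against the actual configuration file before any comparison with the counts reported here.

```python
import re

# Assumed minimum repeat counts per motif length; mononucleotides are omitted,
# matching the exclusion described in the text.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_reducible(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # One motif of length k followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_reducible(motif):
                continue  # e.g. 'ACAC' runs are counted as 'AC' repeats only
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)
```

Applied to a contig, each returned tuple gives the 0-based start position, the repeat motif, and the number of repeat units; tallying motif lengths across all contigs would give a class breakdown of the kind reported above.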
Using the online Perl scripts p3-in and p3-out (http://pgrc.ipk-gatersleben.de/misa/primer3.html) and Primer3 (Rozen and Skaletsky, 1999), we successfully designed paired primers for 20,179 SSR loci with an expected PCR product size of 100–280 bp and melting temperature of 60°C. The paired primers designed for the 20,179 SSR loci and their characterization are provided in Appendix S1 (apps.1700086_s1.txt). Among them, primers for the 100 SSR loci containing the largest number of dinucleotide or trinucleotide repeats were screened for polymorphism in the following experiments.
Fresh leaves were collected from three populations of A. fragrans in China (population Eshan [ES] located in Yunnan Province, population Yangchun [YC] located in Guangdong Province, and population Jinggangshan [JGS] located in Jiangxi Province; Appendix 1) and then stored in plastic bags with silica gel. Total DNA was extracted with the modified CTAB method. For the first trial experiment, two individuals were randomly selected from each population, PCR amplifications were performed for the selected 100 paired primers following the procedure used in Fan et al. (2013), and agarose gel electrophoresis (1%) was used to check amplification. Seventy-nine loci successfully amplified in the six individuals with expected product size (Table 1, Appendix 2). The assembled sequences for these 79 SSR loci were deposited in the NCBI GenBank database (accession no.: MF579141-MF579219).
The PCR products were further inspected with the Fragment Analyzer Automated CE System (Advanced Analytical Technologies [AATI], Ames, Iowa, USA), in which the Quant-iT PicoGreen dsDNA reagent kit (35–500 bp; Invitrogen, Carlsbad, California, USA) was used. Finally, the raw data were analyzed using PROSize version 2.0 software (AATI); these results showed that among these 79 SSR loci, 30 were polymorphic among the six individuals (Table 1).
Polymorphism of the 30 loci was checked in 54 individuals collected from the three populations. PCR products were inspected to calculate the polymorphism level using the above-mentioned procedures. GenAlEx version 6.5 (Peakall and Smouse, 2012) was used to calculate linkage disequilibrium, deviation from Hardy–Weinberg equilibrium (HWE), average number of alleles per locus, observed heterozygosity, and expected heterozygosity. The tests for linkage disequilibrium showed that 59 of the 435 pairwise tests were significant (P < 0.05; Appendix 3), indicating that some pairs of loci may be linked. The number of alleles per locus ranged from 4 to 10 (7.01 ± 1.60), the observed heterozygosity values ranged from 0.053 to 1.000 (0.817 ± 0.241), and the expected heterozygosity values ranged from 0.572 to 1.000 (0.796 ± 0.145). HWE testing showed that 14, 22, and 20 loci deviated significantly from HWE in populations ES, YC, and JGS, respectively (Table 2).
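As a rough illustration of the per-locus statistics above, the sketch below computes the number of alleles, observed heterozygosity, and expected heterozygosity from diploid genotypes, and counts the locus pairs available for linkage disequilibrium testing. It is not the GenAlEx implementation: the function names are hypothetical, expected heterozygosity is assumed to be the uncorrected form 1 − Σp², and the genotypes in the usage note are invented fragment sizes.

```python
from collections import Counter
from math import comb


def locus_stats(genotypes):
    """genotypes: list of (allele1, allele2) tuples for one locus.
    Returns (number_of_alleles, observed_het, expected_het)."""
    n = len(genotypes)
    counts = Counter(allele for pair in genotypes for allele in pair)
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Expected heterozygosity from allele frequencies p_i = count_i / (2n).
    he = 1.0 - sum((c / (2 * n)) ** 2 for c in counts.values())
    return len(counts), ho, he


def n_locus_pairs(n_loci):
    """Number of unordered locus pairs, i.e. pairwise LD tests."""
    return comb(n_loci, 2)
```

With 30 loci, `n_locus_pairs(30)` gives 435, which matches the number of linkage disequilibrium tests reported. For the invented genotypes `[(150, 152), (150, 150), (152, 154), (150, 152)]`, `locus_stats` returns three alleles, an observed heterozygosity of 0.75, and an expected heterozygosity of about 0.594.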
Transferability of the 30 loci was tested in four to six individuals of three related species: T. gymnanthera, T. kwangtungensis, and C. pachyphylla (Appendix 1). Results showed that 22, 21, and 19 paired primers amplified in T. gymnanthera, T. kwangtungensis, and C. pachyphylla, respectively, and 19 amplified in all three species (Table 3).
In this study, we developed 30 new polymorphic genomic SSR markers based on whole-genome shotgun sequencing of A. fragrans. Our study showed that shotgun sequencing is an efficient way to develop highly polymorphic genomic SSR markers. These SSR markers are valuable in population genetic studies of Anneslea and its relatives.
Table 2. Genetic characterization of 30 polymorphic microsatellites of Anneslea fragrans.
Table 3. Cross-amplification of 30 Anneslea fragrans genomic SSR markers in three related species.
This work was supported by the Basic Work Special Project of the National Ministry of Science and Technology of China (2013FY111500); the Fourth National Survey on Chinese Traditional Medicine Resources (20147002); the Natural Science Foundation of Guangdong Province, China (2016A030313326); the Science and Technology Planning Project of Guangdong Province, China (2015A030302020); and the Chang Hungta Science Foundation of Sun Yat-sen University.