Breadfruit (Artocarpus altilis (Parkinson) Fosberg, Moraceae) is a multipurpose tree crop with great potential for increasing food security, thanks to its nutritious and starchy fruit (Ragone, 1997). In the Pacific Islands, it is a traditional staple crop, typically grown in backyards and small holdings. Breadfruit's wild progenitor, A. camansi Blanco, is a native species of New Guinea where the first steps of breadfruit domestication occurred. Pacific seafarers migrated eastward carrying breadfruit in the form of seeds or cuttings (Kirch, 1997). Other events, such as accumulated somatic mutations and meiotic defects in diploid genomes of A. altilis, resulted in seedless triploid cultivars that predominate in eastern Polynesia (Zerega et al., 2004). Witherup et al. (2013) developed simple sequence repeat (SSR) loci from microsatellite-enriched libraries and validated 25 of them across a large number of A. altilis cultivars, wild congeners, and relatives. This traditional SSR isolation approach is a cost- and labor-intensive process that requires repeat enrichment, cloning, and Sanger sequencing. Next-generation sequencing (NGS) technologies allow good coverage of large genomes, cost-effective identification, and rapid characterization of hundreds of SSRs in nonmodel organisms without previous genomic resources (Zalapa et al., 2012). Gardner et al. (2015) used this technology to develop 15 chloroplast SSRs from transcriptome data of Artocarpus spp. We report here on the development and validation of a new set of 50 nuclear SSR markers for breadfruit and related species using NGS Illumina technology.
METHODS AND RESULTS
Leaf fragments of 41 samples of A. altilis (33 diploids, six triploids), A. camansi, and A. heterophyllus Lam. originating from Vanuatu, New Caledonia, French Polynesia, Tonga, Samoa, and the Mariana Islands were collected from living trees conserved in field genebanks (Appendix 1) and stored in a drying agent (silica gel) at room temperature. DNA was extracted according to the mixed alkyl trimethylammonium bromide (MATAB) protocol described by Risterucci et al. (2000). Total genomic DNA isolated from A. altilis ‘Novan’ (sample VUT002; National Center for Biotechnology Information [NCBI] BioSample SAMN04508170) was used to generate the library with the Nextera DNA Library Preparation Kit (Illumina, San Diego, California, USA) according to the manufacturer's protocol. Paired-end sequencing was carried out at the Grand Plateau Technique Régional platform (Montpellier, France; http://www.gptr-lr-genotypage.com) on an Illumina MiSeq system using the MiSeq Reagent Kit version 3 (2 × 300 bp). The sequences were assembled using ABySS software (Simpson et al., 2009). SSRs were detected using the MISA Perl script (Thiel et al., 2003) with search parameters set as follows: at least five repeats for dinucleotide motifs, four repeats for trinucleotide motifs, and three repeats for tetra-, penta-, and hexanucleotide motifs. Primers were designed with Primer3 software using standard settings (Rozen and Skaletsky, 1999). A total of 2,341,465 paired-end sequences were assembled into 1,281,784 contigs. Among them, 115,499 contigs exhibited at least one microsatellite locus, and PCR primers could be defined on 46,504 contigs, totaling 47,607 SSR loci (Appendix S1). The cumulative length of these contigs was around 15.5 Mb, totaling approximately 6% of the sequence length generated in this study. Raw sequencing data were submitted to the NCBI Sequence Read Archive (accession SRP070931) under BioProject PRJNA312880.
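The repeat thresholds described above can be illustrated with a minimal Python sketch. This is not the MISA script itself: it is a simplified regular-expression search for perfect repeats only, and the function name `find_ssrs` and the example sequence are assumptions made for illustration.

```python
import re

# Minimum number of repeats per motif length, as stated in the text:
# di >= 5, tri >= 4, tetra/penta/hexa >= 3.
MIN_REPEATS = {2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    Simplified illustration only; compound or interrupted repeats,
    which MISA can also report, are not handled here.
    """
    seq = seq.upper()
    hits = []
    for unit, min_rep in MIN_REPEATS.items():
        # One motif of `unit` bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "AAGAAG"), so each locus is reported once
            # under its shortest motif.
            if any(
                motif == motif[:k] * (unit // k)
                for k in range(1, unit)
                if unit % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical contig fragment containing a 12-repeat AAG microsatellite.
    example = "GGC" + "AAG" * 12 + "TTC"
    print(find_ssrs(example))
```

In this sketch, mononucleotide runs are deliberately ignored because the text lists thresholds only for di- to hexanucleotide motifs.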
As a first step, 96 loci were selected according to the following criteria for motif type, repeat length, and amplicon size. We first excluded dinucleotide motifs, because these are prone to enzyme slippage during amplification, which may make allele designation difficult (Guichoux et al., 2011). Only perfect motifs were selected, as they are more likely to follow the stepwise mutation model. We selected loci with lengths of 11 to 16 repeats, as recommended by van Asch et al. (2010). Lastly, we selected loci with amplicon sizes ranging from 100 to 400 bp to facilitate the construction of multiplex sets.
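The selection criteria above amount to a simple filter over candidate loci. The following Python sketch shows one way to express it; the `Locus` record, its field names, and the example loci are hypothetical and do not reflect the format of Appendix S1.

```python
from dataclasses import dataclass


@dataclass
class Locus:
    name: str         # hypothetical locus identifier
    motif: str        # repeat unit, e.g. "AAG"
    repeats: int      # number of motif repeats
    perfect: bool     # True if the repeat is uninterrupted
    amplicon_bp: int  # expected PCR product size in bp


def passes_selection(locus):
    """Apply the four criteria stated in the text."""
    return (
        len(locus.motif) != 2                 # exclude dinucleotide motifs
        and locus.perfect                     # perfect motifs only
        and 11 <= locus.repeats <= 16         # 11 to 16 repeats
        and 100 <= locus.amplicon_bp <= 400   # amplicon of 100 to 400 bp
    )


if __name__ == "__main__":
    candidates = [
        Locus("locus_a", "AAG", 12, True, 250),   # passes all criteria
        Locus("locus_b", "AT", 14, True, 200),    # dinucleotide motif
        Locus("locus_c", "AAGT", 16, True, 420),  # amplicon too long
    ]
    print([c.name for c in candidates if passes_selection(c)])
```

The text does not state how the 96 loci were chosen among those meeting the criteria, so the sketch stops at the filtering step.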
The 96 primer pairs were then tested for amplification with a subset made up of four samples (two A. altilis, one A. camansi, and one A. heterophyllus). Only 15 failed to amplify. The remaining 81 primer pairs were classified according to their polymorphism and the overall quality of the profile. Among them, we selected only 50 polymorphic single-locus markers with no ambiguity in allele size determination (Table 1). These 50 SSRs were assessed using the 41 samples listed in Appendix 1. For comparison, we genotyped the same samples with 18 SSRs developed by Witherup et al. (2013). PCR reactions were performed in solution A (25-µL total volume) containing 2.5 µL of PCR buffer (10 mM Tris-HCl, 50 mM KCl, 2 mM MgCl2, 0.001% glycerol), 2.5 µL of dNTP (Jena Bioscience GmbH, Jena, Germany), 0.25 µL of MgCl2, 0.2 µL of 10 µM forward primer with an M13 tail at the 5′-end (5′-CACGACGTTGTAAAACGAC-3′), 0.25 µL of 10 µM reverse primer, 0.25 µL of fluorescently labeled M13-tail (6-FAM, NED, VIC, or PET [Applied Biosystems, Foster City, California, USA]), 0.1 units of Taq DNA polymerase (Sigma-Aldrich, St. Louis, Missouri, USA), 5 µL of template DNA (5 ng/µL), and 14 µL of H2O. The PCR conditions were as follows: an initial denaturation at 94°C for 5 min; 30 cycles at 94°C for 45 s, 55°C for 45 s, and 72°C for 1 min; and a final extension at 72°C for 10 min. PCR products labeled with the four dyes were pooled in solution B containing 2 µL of 6-FAM, 2 µL of VIC, 2.5 µL of NED, and 3.5 µL of PET products. From solution B, a volume of 4 µL was taken, added to 10 µL of Hi-Di formamide and 0.12 µL of GeneScan 600 LIZ Size Standard, and analyzed on an ABI 3500xL Genetic Analyzer (Life Technologies, Carlsbad, California, USA). Alleles were scored using GeneMapper version 4.1 software (Applied Biosystems). Basic statistics were computed using PowerMarker software (Liu and Muse, 2005).
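The M13-tailing step described above can be sketched in a few lines of Python: the tail sequence given in the text is prepended to the 5′ end of each locus-specific forward primer. The function name and the placeholder primer are assumptions for illustration; actual primer sequences are those listed in Table 1.

```python
# M13 tail sequence as given in the text (5' to 3').
M13_TAIL = "CACGACGTTGTAAAACGAC"


def tail_forward_primer(forward_primer):
    """Return the forward primer with the M13 tail added at its 5' end."""
    return M13_TAIL + forward_primer.upper()


if __name__ == "__main__":
    # Placeholder sequence, not a primer from this study.
    placeholder_forward = "acgtacgtacgtacgtacgt"
    print(tail_forward_primer(placeholder_forward))
```

Because the fluorescent label is carried by the separate M13 oligonucleotide, the same labeled oligonucleotide can be reused for every locus, and only the unlabeled tailed primers are locus-specific.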
Of the 50 loci assessed, all amplified and were polymorphic in A. altilis, 44 in A. camansi, and 21 in A. heterophyllus. The number of alleles per locus ranged from two (mAaCIR0167) to 19 (mAaCIR0121), with an average of seven alleles per locus (Table 2). When genotyping the samples with 18 of the SSRs developed by Witherup et al. (2013), we obtained similar results, but with a smaller number of alleles, ranging from one (MAA3) to 10 (MAA156) with an average of six alleles per locus (Appendix 2).
The Hardy–Weinberg equilibrium (HWE) test was only performed on diploids from Vanuatu and revealed that 20 of the new SSRs exhibited significant deviation from HWE (Table 2). This is not surprising as we did not sample populations but cultivated varieties, most of them clonally propagated and maintained in the form of a few trees planted in backyards or gardens. In the triploids, we calculated the percentage of heterozygous individuals and gave the number of individuals harboring one, two, or three alleles for each microsatellite locus. For 60% of the microsatellite loci, we observed unambiguous genotypes (i.e., with three alleles), ranging from one individual (mAaCIR0178) to five individuals (mAaCIR0080). Fifty percent of the loci were highly informative with a polymorphism information content value (PIC; Botstein et al., 1980), calculated on diploid data, greater than 0.7; only seven had a PIC less than 0.5, with a minimum value of 0.29 for mAaCIR0078. Although less informative, this latter category of loci may have characteristics, such as private alleles, useful for detecting admixture between species.
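For readers unfamiliar with the PIC statistic, the following Python sketch computes it from diploid genotypes using the formula of Botstein et al. (1980), PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j², where p_i is the frequency of allele i. The study itself used PowerMarker; the function names and the example genotypes (allele sizes in bp) are hypothetical.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(genotypes):
    """Estimate allele frequencies from diploid genotypes.

    `genotypes` is an iterable of (allele_1, allele_2) pairs.
    """
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def pic(frequencies):
    """Polymorphism information content (Botstein et al., 1980)."""
    p = list(frequencies)
    sum_squares = sum(x * x for x in p)
    # Correction term summed over all unordered pairs of distinct alleles.
    cross_term = sum(2 * x * x * y * y for x, y in combinations(p, 2))
    return 1 - sum_squares - cross_term


if __name__ == "__main__":
    # Hypothetical genotypes at one locus for four diploid individuals.
    genotypes = [(150, 153), (150, 150), (153, 156), (156, 156)]
    freqs = allele_frequencies(genotypes)
    print(freqs)
    print(round(pic(freqs.values()), 3))
```

As a point of reference, a locus with two equally frequent alleles has a PIC of 0.375, so values above 0.7, as reported for half of the loci, require several alleles at reasonably balanced frequencies.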
These 50 new nuclear SSR loci will be useful for assessing the identity and genetic diversity of breadfruit cultivars on a small geographical scale and for gaining a better understanding of farmer management practices (seed or vegetative propagation methods, exchanges, and dispersal). They will help to optimize the management of national genebanks by identifying duplicates and guiding future collecting activities. Of the 47,607 SSR loci identified, a very large number of additional markers could be further developed to address future research needs (genetic mapping, QTL, and association studies).
The authors thank H. Vignes for assistance with the construction of the library, X. Argout for assistance with bioinformatics analysis, C. Hamelin for data curation of TropGeneDB, X. Perrier for helpful comments on the manuscript, and P. Biggins for editorial input.
APPENDIX 2. Genetic properties of 18 SSR markers developed by Witherup et al. (2013) tested on Artocarpus altilis and congeners.