Variation in breeding systems between species of the same taxonomic group complicates the consistent application of species concepts, and perhaps even the logically consistent circumscription of species. Several genera of arid-zone ephemerals in the Angianthus clade (Asteraceae: Gnaphalieae: Gnaphaliinae) contain both outcrossing and non-outcrossing species. The latter are recognised by producing an order of magnitude fewer pollen grains per anther and an often reduced number of corolla lobes, and they are frequently more widespread than are the former. In its current taxonomy, the genus Pogonolepis comprises an otherwise morphologically indistinguishable pair of one outcrossing and one non-outcrossing species. I generated sequence capture data to test the genetic segregation of P. stricta and P. muelleriana and the utility of sequence capture data for species circumscription and diagnostics. Phylogenetic analysis showed the two species to form two specimen clades, supporting the current taxonomy. Contrary to expectations, non-outcrossing P. muelleriana exhibited lower gene concordance, in line with values expected from recombination, as well as higher heterozygosity than its outcrossing sister species. More research on the breeding system and population structure of the two species may be required to explain these results.
Variation in breeding systems in a taxonomic group can make it difficult, if not impossible, to consistently apply a species concept in its study. This applies particularly to the biological species concept in the sense of a reproductive isolation concept (Mayr 1970), because it assumes sexual reproduction. At a theoretical level, many botanists consider any unifying species concept to be inapplicable in plants (Levin 1979), often with reference to the greater diversity of breeding systems, than in vertebrates at least, and a seemingly easier introgression between species of plants than for animals (Mishler and Donoghue 1982).
However, in practice, many plant taxonomists nonetheless intuitively attempt to be consistent in their approach to species circumscription. Although species concepts are rarely explicitly stated in the methods section of taxonomic treatments, the notes on newly described species often allow the reader to infer the principles applied by taxonomists. Traditionally, many botanists have based the recognition of taxa, including species, on unique combinations of characters subjectively considered ‘important’, usually for being flower or inflorescence characters, even if homoplasious. In the Asteraceae, these include presence or absence of pappus, capitulum scales, or ray florets.
Modern taxonomic revisions in botany frequently use ordination and clustering analyses of morphological or genetic data, in which distinct clusters of specimens are interpreted as species (e.g. Ohlsen et al. 2010; Walsh 2015; De Salas and Schmidt-Lebuhn 2018). Again, without necessarily specifying it, they thus effectively apply the genotypic cluster species concept (Mallet 1995).
The Angianthus clade (Asteraceae: Gnaphalieae: Gnaphaliinae) contains many species that have adapted to aridity and irregular rain fall events in the arid zone of Australia by becoming ephemeral (Schmidt-Lebuhn and Bovill 2021). In addition to their short life cycles and, consequently, small sizes, they are frequently characterised by compound capitula and loss of pappus. Several genera show a mixture of outcrossing and inbreeding or asexual species, for example, Chthonocephalus Steetz (Short 1990a), Gnephosis Cass., specifically the species formerly separated as Chrysocoryne Endl. (Short 2015), Hyalosperma Steetz (Wilson 1989), Millotia Cass. (Short 1995), Myriocephalus Benth. (Short 2000), Podotheca Cass. (Short 1989), and Trichanthodium Sond. & F.Muell. ex Sond. (Short 1990b).
Particularly interesting are the cases of Pogonolepis Steetz and Actinobole Fenzl ex Endl. The two genera represent three pairs of outcrossing and non-outcrossing species that are morphologically indistinguishable except by their anther sizes and number of pollen grains per anther (Short 1985, 1986).
Pogonolepis comprises one such species pair. The genus is found in Western Australia, South Australia, New South Wales and Victoria on often saline sand or loam. Plants are annual herbs, generally richly branched at the base, with major branches procumbent–ascendent or erect and bearing simple capitula either terminally or at the ends of arrays of short secondary branches (Fig. 1a).
Pogonolepis stricta Steetz (Fig. 1b) is found only in Western Australia, from around Lake Macleod and Lake Carnegie in the north to Busselton and Cape Arid National Park in the south. It produces ~2000–4260 pollen grains per anther, and the anthers are ~0.85–1.30 mm long, indicating that the species is outcrossing. Its chromosome number is 2n = 8 or 10 (Short 1986).
Pogonolepis muelleriana (Sond.) P.S.Short (Fig. 1a) occurs in approximately the same parts of Western Australia as does P. stricta except for the furthest south-west, but also in South Australia, New South Wales and Victoria, from Fowlers Bay in the west to Orange in the east. The Western Australian and the eastern populations are separated by a wide disjunction. It produces ~62–404 pollen grains per anther, and the anthers are only 0.38–0.80 mm long, suggesting that the species is either selfing or asexually reproducing. Its chromosome number is 2n = 12 or ~20–24 (Short 1986).
In Actinobole, the outcrossing species are A. drummondianum P.S.Short and A. oldfieldianum P.S.Short, and their non-outcrossing counterparts are A. condensatum (A.Gray) P.S.Short and A. uliginosum (A.Gray) H.Eichler respectively. As in Pogonolepis, the outcrossing species are restricted to Western Australia, whereas one of the two non-outcrossing species, A. uliginosum, ranges across most of the arid zone of Australia. As most of the Angianthus clade is likely to be ancestrally Western Australian (Schmidt-Lebuhn and Bovill 2021), this suggests that the western part of Western Australia may be the area of origin of these species pairs, and that non-outcrossing species are more likely to spread beyond this area, because they need only a single seed to establish a new population.
Short (1985) provided the following four arguments for recognition of otherwise indistinguishable outcrossing and non-outcrossing populations at the species level: (1) absence of individuals with intermediate pollen:ovule ratios and anther sizes; (2) other morphological differences exist, albeit two traits directly related to anther and flower size and, thus, presumably also dependent on the breeding system; (3) precedent, i.e. that the same approach had been taken by other taxonomists; (4) diagnosability of the two species by the number of pollen grains.
None of these arguments is biological and evolutionary, and the underlying species concept remains unclear. Ideally, classification would reflect degrees of relatedness (Schmidt-Lebuhn 2012). Most Asteraceae are ancestrally sexually reproducing and self-incompatible, and it is possible that non-outcrossing populations have arisen repeatedly from outcrossing ones, making the former polyphyletic and genetically a nested subset of the latter. This was, in fact, Short’s (1986) own hypothesis of chromosome number evolution in Pogonolepis.
In the present study, I examine the genetics of the genus Pogonolepis to test whether P. muelleriana and P. stricta are sister species or whether the former has repeatedly arisen from the latter. The system is used as a case study both for the larger number of non-outcrossing species in the Angianthus clade and for the use of sequence capture at species level. Introduced primarily as a marker system for phylogenetics, sequence capture has recently been explored for population genomics (Slimp et al. 2020). With the Angiosperms353 bait kit (Johnson et al. 2019) increasingly becoming a standard for flowering plants, and reference data across that group being produced by the Plant And Fungal Tree Of Life (PAFTOL) and Genomics for Australian Plants (GAP) consortia, it would be beneficial if the same system could be used for species delimitation and the study of breeding systems.
Materials and methods
Forty-four specimens labelled as Pogonolepis were selected from herbarium CANB to represent both species approximately evenly and to cover most of their geographic ranges (Fig. 2). Very small specimens were excluded because destructive sampling would compromise their quality, as were specimens whose colouration suggested that DNA quality would be very poor. After re-identification of some mislabelled sheets (see below), 21 specimens of each species are present in the dataset.
The following three species of the Angianthus clade were chosen as outgroups, and raw reads produced for the Genomics for Australian Plants consortium were downloaded from the Bioplatforms Australia Data Portal ( data.bioplatforms.com): Cephalosorus carpesioides (Turcz.) P.S.Short, Bioplatforms sample ID 80038; Dithyrostegia amplexicaulis A.Gray, 80053; and Hyalochlamys globifera A.Gray, 80073.
Laboratory work was conducted by the Australian Genome Research Facility. Genomic DNA was extracted using the NucleoSpin Plant II Kit (Macherey-Nagel, Düren, Germany) and sonicated using a Covaris (Woburn, MA, USA) AFA plate to a fragment size of ∼350 base pairs. Libraries were built using NEBNext Ultra II (NEB, Ipswich, MA, USA), and sequence capture was conducted on pools of 12 libraries by using the Angiosperms353 MYbaits kit (Daicel Arbor Biosciences, Ann Arbor, MI, USA) NovaSeq. Enriched libraries were sequenced on Illumina NovaSeq. 6000 SP with v1.5 300 cycle chemistry. Sequence data were obtained for all sampled specimens.
Reads were assembled against target sequences using hybpiper-rbgv (Jackson et al. 2021), a Nextflow pipeline adapted from HybPiper (Johnson et al. 2016) against a target file designed for broad representation of Asteraceae by mining transcriptome data for Angiosperms353 targets (McLay et al. 2021).
The results of HybPiper’s paralog finder were analysed with yang-and-smith-rbgv (Jackson et al. 2021), a Nextflow pipeline implementing the four gene-tree-based paralogy resolution algorithms collated by Yang and Smith (2014). Cephalosorus carpesioides, Dithyrostegia amplexicaulis, and Hyalochlamys globifera were used as outgroups. I selected results of the Monophyletic Outgroups algorithm (MO), which recursively moves through each gene tree from the root and removes the smaller of each pair of sister clades sharing sequences from the same sample. This means that, at most, one ortholog group is retrieved from each gene, producing a more complete sample × gene matrix than alternative algorithms. Custom-written Python scripts were used to ensure that gene alignments were in frame and to concatenate them into a supermatrix.
Gene heterozygosity, i.e. the percentage of loci that were heterozygous in a sample, was inferred using the first two steps of the HybPhaser pipeline (Nauheimer et al. 2021). To compare heterozygosity between the two species with their different breeding systems, I calculated mean and median gene heterozygosity across all samples of a species.
Phylogenies of the concatenated supermatrix and of all gene alignments were inferred with IQ-TREE (ver. 2.0.7 and ver. 2.1.3, see http://www.iqtree.org/; Minh et al. 2020a), partitioning the alignment by genes and under automatic model testing. Three branch support measures were inferred, namely, UltraFast Bootstraps (UFB; Minh et al. 2013), gene Concordance Factors (gCF), and site Concordance Factors (sCF), using 100 quartets (Minh et al. 2020b). The latter two measures calculate the percentage of informative gene trees and alignment columns that support a branch in the species tree respectively.
To establish a baseline of gCF expected under free recombination of genes as expected from sexual reproduction, I produced 20 replicate phylogenies after randomly reassigning all gene sequences to samples in the species showing the higher gCF values in the original data.
Enrichment efficiency was very high except in few samples, with 6–65% of reads on target (median 57%). HybPiper retrieved sequences for 307–349 genes per sample (median 342), for a total of 351 genes retrieved across the dataset. Of these, 164–336 genes per sample covered at least 50% of the target length (median 312). Paralog warnings were reported for 2–56 genes per sample (median 19), for a total of 123 genes across the dataset affected by putative paralogy.
The MO algorithm retrieved 254 ortholog groups, excluding other gene alignment because the in- and outgroups were not reciprocally monophyletic. In the final dataset, each sample had 173–249 ortholog groups (median 240), and each gene tree had 8–47 terminals (median 46). The concatenated supermatrix comprised 184 175 characters, of which 39 111 were parsimony-informative, 35 759 variable but not parsimony-informative, and 109 305 constant.
Preliminary analyses showed a small number of samples in unexpected positions. These were all found to be mislabelled or misidentified after morphological re-examination, in particular specimens of P. stricta that were mislabelled as P. muelleriana, presumably because of their occurrence in Western Australia, where many collectors may only expect to encounter the latter species (see Discussion for details). After the relevant corrections were made, the likelihood phylogeny of the concatenated supermatrix resolved Pogonolepis muelleriana and P. stricta as reciprocally monophyletic (Fig. 3).
However, branch support for the monophyly of Pogonolepis stricta is weak; UFB is only 94, where 95 is generally considered significant, gCF is low at 1.59, and sCF of 33.7 is in line with that of alternative topologies (33.6 and 32.6). Pogonolepis muelleriana has stronger support with UFB of 100, gCF of 9.96, and sCF of 44.0, v. alternative topologies at only 30.1 and 25.9.
Genetic diversity of the non-outcrossing P. muelleriana clade was lower than that of the outcrossing P. stricta clade, with 14 113 parsimony-informative, 15 283 variable but non-informative, and 154 779 constant sites in the former v. 22 645 parsimony-informative, 29 071 variable but non-informative, and 132 459 constant sites in the latter.
Gene Concordance Factors were, overall, slightly higher in the non-outcrossing P. muelleriana clade (mean 5.39) than in the outcrossing P. stricta clade (5.08), but this was driven by a single, high-gCF outlier branch subtending the clade of Dodd 330 and Phillips s.n. (CBG21784). The median values were 0.850 in P. muelleriana and 3.995 in P. stricta. In the randomisation, the median gCF values for the affected species ranged from 0.426 to 2.120, with a mean of 1.017 ± 0.447, and a median of 0.868, and were, thus, in all cases considerably lower than in P. stricta. Site Concordance Factors were approximately equal between the two clades, with P. muelleriana showing a mean of 38.5 and median of 35.9, and P. strica a mean of 36.7 and a median of 37.2.
Conversely, mean gene heterozygosity was high and very similar in both clades (92.6% in non-outcrossing P. muelleriana v. 93.9% in outcrossing P. stricta), but, in this case, outliers reduced and reversed the perceived difference, as the medians were 96.1 and 93.7% respectively.
When sampling material for DNA extraction, specimen identifications were tentatively taken as presumably correct. After preliminary analyses of molecular data, several specimens sampled for the project were found to be placed outside the ‘expected’ clades. They were re-examined morphologically to verify whether their placement indicated that the two species of Pogonolepis were not distinct in their current circumscriptions, or whether these sheets were simply misidentified.
Schmidt-Lebuhn 1633, a specimen from New South Wales, where only P. muelleriana is found, was mislabelled as P. stricta, a carry-over from a confused field note. Two other specimens also labelled as P. stricta and placed in the P. muelleriana sequence clade were Western Australian, where P. stricta is more common. Duplicates of Eichler 20304 at AD and PERTH had already been re-identified as P. muelleriana by Phil Short, the authority on the genus. I re-identified Dodd 330 to P. muelleriana after examination of the florets.
Two other specimens were placed outside of both species of Pogonolepis in preliminary analyses. Confusingly, Short 1252, with duplicates in CANB, HO, MEL, and NSW, is indeed P. muelleriana. However, during specimen selection I overlooked that the number was represented with two sheets in CANB, one of which on closer examination turned out to be Millotia muelleri (Sond.) P.S.Short, presumably being the result of a mix-up when duplicates were transferred between herbaria or mounted. This second sheet was accidentally sampled for lab work.
The second, Chandler 802, has now been re-identified as Angianthus newbeyi P.S.Short. This local endemic of the Norseman area in Western Australia (Short 1990c) was previously known from only three other collections, and this fourth is the first and only at CANB.
Unexpected placement of specimens in the molecular phylogeny was thus in all cases confirmed as to be expected from their correct taxonomy after re-examination of morphological characters. All originally mislabelled specimens were retained in the analyses, in the latter two cases as outgroups, because Angianthus and Millotia are, like Pogonolepis, part of the Angianthus clade (Schmidt-Lebuhn and Bovill 2021).
Species delimitation in Pogonolepis
As implied by the above, molecular data confirmed Short’s (1986) taxonomy of Pogonolepis, but not his hypothesis of the polyphyletic origin of P. muelleriana suggested by its diversity in chromosome numbers. On current evidence, it appears that extant populations of P. muelleriana and P. stricta constitute two sister lineages, and that the former may have arisen from a single event. However, clade support for P. stricta is considerably lower than for its sister, potentially reflecting a lower degree of lineage sorting since divergence of the two species. Another caveat is that geographic sampling of P. muelleriana in Western Australia was limited to the southern half of its distribution in that state (Fig. 2), so it remains possible that broader sampling may yet show multiple origins. However, it appears highly likely that all its eastern Australian populations form a clade, because they are well represented in the data.
This result intuitively supports the continued recognition of P. muelleriana and P. stricta as species. However, it is desirable to make methodological and taxonomic approaches transparent. In the present case, the existence in the Angianthus clade of other pairs differing only in reproductive system (Short 1985) and of various other genera with combinations of sexually and asexually reproducing species (Short 1983, 1989, 1990a, 1990b, 2015) means that it would be logical to apply a consistent approach to species delimitation in this clade.
The problem remains that because the two members of each pair have different reproductive systems, naive application of a single species concept is impossible. If, for example, non-outcrossing populations are to be separated from outcrossing ones because there is no gene flow between the two entities (biological species concept), then, logically, each individual of the non-outcrossing populations would have to be its own species. This illustrates the force of arguments that species are better considered as phenomena in need of explanation, rather than a pre-existing concept into which patterns of diversity are forced (Mishler and Wilkins 2018).
Such an approach provides the conceptual flexibility under which, in this case, P. stricta can be considered a biological species and P. muelleriana an agamospecies (Zachos 2016). Although it may be argued that the data presented here show the two taxa to be reciprocally monophyletic and, therefore, the concept of monophyletic species to be equally applicable, the statement that a sexually reproducing species such as P. stricta is monophyletic is a category error. Because it is sexually reproducing, there is no phylogenetic structure to be -phyletic in, be it mono-, para- or poly- (Hennig 1966).
Reproductive systems and gene concordance
To examine the effect of the reproductive strategy on the concordance between gene trees and the concatenated phylogeny, I calculated gCFs. My expectation was they would be higher in non-outcrossing P. muelleriana, because it seemed logical to assume that it would show less recombination between different genes than outcrossing P. strica. However, with the exception of one outlier branch, the opposite was the case, and despite equal and geographically dispersed sampling, the median gCF was much higher in the outcrossing species. Randomisation of terminal names to simulate complete recombination between genes in one species produced gCF values in line with those observed in non-outcrossing P. muelleriana.
Values of sCF were close to 33% across internal nodes in both species, implying that their internal structure is poorly resolved in both cases, as would be expected if there is either no phylogenetic structure because of recombination or too few informative characters. The large number of informative characters in both species suggests that the former is more likely to be the case.
It is nonetheless unclear why the outcrossing species shows unexpectedly high gene concordance and lower heterozygosity than does its non-outcrossing sister. To understand the processes behind the genetic structure of the species, a more detailed understanding of the reproductive system will be required. Short (1986) interpreted P. muelleriana as selfing. Although he cautioned that the possibility of apomixis could not be excluded on available evidence, he associated it with abnormal pollen formation as found in Taraxacum Weber (Richards 1973), which he did not observe in the species. However, there is no reason to assume that apomicts show abortive pollen, and pseudogamous apomicts require (potentially self-)pollination to activate the endosperm even as no fertilisation of the egg cell takes place (Noirot et al. 1997).
There are therefore several possibilities for the reproductive system of P. muelleriana, including selfing, facultative apomixis, and obligate apomixis. In the first two cases, outcrossing is still possible, albeit, presumably, much less likely to happen than in P. stricta, because of the much smaller number of pollen grains. It would be difficult to emasculate the minuscule, sequentially maturing flowers of a capitulum, but flow cytometric examination of fruits would allow selfing and apomixis to be differentiated by the genome-size ratio of embryo and endosperm (Matzk et al. 2001; Chen et al. 2019). In addition to the reproductive system, dispersal distances would influence the genetic structure of a species (Hamrick and Loveless 1986), but the two species of Pogonolepis do not differ in their fruit morphology or adaptations to dispersal.
Another possibility is that an allopolyploid origin of at least part of Pogonolepis muelleriana explains the unexpected genetic structure, because the species is known to have either 2n = 12 or ~20–24 chromosomes, compared with 2n = 8–10 in P. stricta (Short 1986). In this scenario, alleles inherited from divergent parental populations would produce higher heterozygosity and potentially lower gene concordance than expected from a non-outcrossing diploid. HybPiper produced more paralog warnings for samples of P. muelleriana (median 23, mean 25.8) than for P. stricta (median 15, mean 20.0), which could be seen to provide some support for this interpretation under the assumption that autopolyploidy would lead to paralogs too similar to be recognised.
What complicates this interpretation is that gene heterozygosity values in both species of >90% are unexpectedly high and in line with what would be expected of hybrids (Nauheimer et al. 2021). This raises the possibility that the entire genus may have a polyploidisation event in its recent ancestry, even before the duplication in Pogonlepis muelleriana.
Finally, the high heterozygosity itself suggests another possible explanation. The phylogeny was inferred using data assembled with HybPiper, which returns a single sequence for a locus if variants of that locus are not divergent enough to be flagged as possible paralogs. This means that different alleles of a heterozygous locus may be retrieved effectively randomly from each sample, and this could then reduce gene concordance on the phylogeny. Given that non-outcrossing Pogonolepis muelleriana showed higher gene heterozygosity, this effect may be more pronounced in that species, leading to its lower observed gene concordance.
Utility of sequence capture at the species level
The confirmation as being misidentified after morphological re-examination of all specimens seemingly in the ‘wrong’ position in the molecular phylogeny demonstrates the feasibility of diagnosing species affiliation using Angiosperms353 sequence capture data, assuming sufficient reference data are available, in this case for several specimens each of the two species of Pogonolepis and some outgroups.
In addition, the data resolved geographic structure for both species of Pogonolepis. In P. muelleriana, the three Western Australian specimens formed a grade under the eastern specimens, as expected from a western origin of this species followed by dispersal to the east. Specimens of P. stricta were split approximately evenly into two strongly supported clades, marked A and B in Fig. 3. Clade A contained samples from south-western Western Australia in a triangle bounded by just south of Shark Bay, Bunbury, and Hyden. Clade B comprised samples from a northern, mostly interior, area approximately bounded by just east of Shark Bay, Geraldton, the Hamersley Lakes and Lake Way.
That this level of resolution can be achieved inside species suggests sequence capture as an attractive approach for phylogeographic studies, not least because it is able to produce data more reliably from herbarium specimens with potentially degraded DNA than are many other molecular methods.
Raw sequence reads are available on the NCBI Short Read Archive. Data matrices and tree file are available on the CSIRO Data Access Portal, see https://doi.org/10.25919/k9zt-0034.
Declaration of funding
The author acknowledges the contribution of the Genomics for Australian Plants Framework Initiative consortium (see https://www.genomicsforaustralianplants.com/consortium/) in the generation of data used in this publication. The Initiative is supported by funding from Bioplatforms Australia (enabled by NCRIS), the Ian Potter Foundation, Royal Botanic Gardens Foundation (Victoria), Royal Botanic Gardens Victoria, the Royal Botanic Gardens and Domain Trust, the Council of Heads of Australasian Herbaria, CSIRO, Centre for Australian National Biodiversity Research and the Department of Biodiversity, Conservation and Attractions, Western Australia.
The author thanks Chris Jackson and Lars Nauheimer for help during bioinformatic analysis and Katharina Nargar for comments on a previous version of the manuscript. The author is also grateful for the constructive feedback from two anonymous reviewers, whose suggestions on how to expand the discussion of site Concordance Factors and gene heterozygosity made a significant difference to the final version of the manuscript. The study used the lab and sequencing services of the Australian Genome Research Facility.