A complete single unit of a ribosomal RNA gene (rDNA) of M. croslandi was sequenced. The ends of the 18S, 5.8S and 28S rRNA genes were determined by using the sequences of D. melanogaster rDNAs as references. Each of the tandemly repeated rDNA units consists of coding and non-coding regions whose arrangement is the same as that of D. melanogaster rDNA. The intergenic spacer (IGS) contains, as in other species, a region with subrepeats, of which the sequences are different from those previously reported in other insect species. The length of IGSs was estimated to be 7–12 kb by genomic Southern hybridization, showing that an rDNA repeating unit of M. croslandi is 14–19 kb-long. The sequences of the coding regions are highly conserved, whereas IGS and ITS (internal transcribed spacer) sequences are not. We obtained clones with insertions of various sizes of R2 elements, the target sequence of which was found in the 28S rRNA coding region. A short segment in the IGS that follows the 3′ end of the 28S rRNA gene was predicted to form a secondary structure with long stems.
The Australian bulldog ant Myrmecia croslandi is one of the better studied species of ants, and has the lowest chromosome number (2n=2) known in higher organisms. From this large metacentric chromosome, two smaller chromosomes are believed to have been produced by a centric fission (Hirai et al., 1996, and the references therein). This species exhibits chromosome number polymorphism, with 2n=2, 3 and 4 (Crosland and Crozier, 1986; Taylor, 1991). M. croslandi belongs to the Myrmeciinae, which is thought to be an ancestral group of ants (Hölldobler and Wilson, 1990; Ogata, 1991). The phylogenetic position of the Myrmeciinae within Formicidae, however, has not been clarified (Taylor, 1987; Baroni Urbani et al., 1992).
The ribosomal RNA genes (rDNAs) are essential for all organisms and contain a mosaic of DNA sequences showing extremely different evolutionary rates (Pélandakis et al., 1991). 28S rDNA of eukaryotes consists of twelve divergent domains (D1–D12) (Hassouna et al., 1984) that have been widely used in molecular systematics. A phylogenetic tree of hymenopteran insects constructed by using the D2 region and the mitochondrial 16S rDNA sequences is not compatible with one constructed based on morphological data (Schmitz and Moritz, 1998). The analysis of 1100-bp sequences of 18S and 28S rDNAs produced a polydivergent tree for the ant subfamily relationships (Sullender, 1998).
There have been only a few studies on the complete rDNA unit structure in insects: the structures studied were from Drosophila melanogaster (Diptera) (Tautz et al., 1988) and Acyrthosiphon pisum (Hemiptera) (Amako et al., 1996). In the present study, we determined the complete nucleotide sequence of a single unit of rDNA of M. croslandi. This is the third report of such a study in insects and the first in the Hymenoptera. We characterize the unit of rDNAs by comparing it with that of D. melanogaster rDNAs, which has been widely used as a general reference.
MATERIALS AND METHODS
Ant materials and rDNA clones
Myrmecia croslandi male pupae (sample no. IH89-31, with 2n=3), a gift from Hirotami T. Imai, were stored at −80°C. pMc.r1 and pMc.r2 are plasmid DNA clones each carrying an EcoR I fragment of the rDNA of M. croslandi (sample no. IH89-030, with 2n=2) as described in Hirai et al. (1994, 1996).
DNA extraction, genomic library and hybridization probes
High-molecular-mass genomic DNAs were extracted from 10 frozen male pupae following the procedure described by Sambrook et al. (1989). Since the DNA fragments containing the rDNA region of M. croslandi were shown to be between 4.5 and 15 kb in length by Hind III digestion (Hirai et al., 1994), genomic DNA partially digested with Hind III was ligated into Lambda DASH II using the Ligation High kit (Toyobo). The ligated DNAs were then packaged with Gigapack III (Stratagene) and introduced into Escherichia coli XL1-Blue cells as described (Sambrook et al., 1989). Two probes were used for plaque filter hybridization: the 1.5 kb Sal I-EcoR I fragment (pHO-8) containing a portion of the 3′ half of the coding region of the 28S rRNA gene, derived from pMc.r2, and the 0.5 kb PCR-amplified fragment (pMc.18S-2pcr) containing a portion of the 18S rRNA gene near the 5′ end (see Fig. 1). The phage DNAs were isolated as described (Sambrook et al., 1989). Three probes, pHO-10 (a subclone of pHO-8), pHO-8 and pMc.18S-2pcr, were used for hybridization analyses to examine the structures of the λ phage DNAs (see Fig. 1).
Polymerase chain reaction (PCR)
PCR amplification was conducted using the following primer pairs with Ex Taq polymerase (Takara), according to the manufacturer's protocol. The primer pair 18S-1U (5′-AGTAGTCATATGCTTGTCTC-3′) and 18S-1L (5′-AATCATTCAATCGGTAGTAG-3′) amplified most of the 18S rDNA (see Fig. 1, pMc.18S-1pcr). The primer pair 18S-1U and 18S-2L (5′-TGCTGCCTTCCTTGGATGTG-3′) amplified a quarter of the 18S rDNA (pMc.18S-2pcr) (see Fig. 1). For amplification of a portion of the IGS region including the 3′-most end of 28S rDNA and extending downstream (pMc.IGSpcr), the primer set HO-X-2(U) (5′-CCTGGCGGGGTGTTGTACTC-3′) and IGS-2L (5′-ACCGAATATCAGAAGGAAAGAC-3′) was used (see Fig. 1). The amplification conditions were 30 cycles of 1 min at 96°C, 1 min at 58°C, and 2 min at 74°C, following an initial denaturation at 96°C for 2 min. After electrophoresis, the amplified fragment was purified using SUPREC-01 (Takara) and inserted into pCR2.1 using a TA cloning kit (Invitrogen).
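As a rough sanity check on the primer set listed above, the following sketch reports length, GC count and a Wallace-rule melting temperature estimate for each primer. The Wallace rule (2°C per A/T, 4°C per G/C) is a rule of thumb introduced here for illustration and is not part of the published protocol; the function names are hypothetical, and the estimates are only indicative when set against the 58°C annealing step.

```python
# Primer sequences as listed in the text (5' -> 3').
PRIMERS = {
    "18S-1U": "AGTAGTCATATGCTTGTCTC",
    "18S-1L": "AATCATTCAATCGGTAGTAG",
    "18S-2L": "TGCTGCCTTCCTTGGATGTG",
    "HO-X-2(U)": "CCTGGCGGGGTGTTGTACTC",
    "IGS-2L": "ACCGAATATCAGAAGGAAAGAC",
}

# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_count(seq):
    """Count G and C bases in the sequence."""
    return sum(base in "GC" for base in seq)


def wallace_tm(seq):
    """Wallace-rule estimate (assumption, not from the paper): 2*(A+T) + 4*(G+C)."""
    gc = gc_count(seq)
    return 2 * (len(seq) - gc) + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), "nt,", gc_count(seq), "G/C, Tm ~", wallace_tm(seq), "C")
```

The reverse complement is included because the lower (L) primers anneal to the opposite strand, so their binding site on the sequenced strand is found by searching for `reverse_complement(primer)`.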
A half microgram each of ant genomic DNA was singly digested with EcoR I, Hind III and Ban III and separated by electrophoresis on a 1.0% agarose gel. The DNA fragments were transferred to a nylon membrane with 0.4 N NaOH and allowed to hybridize with a probe. The probe was a 0.3 kb Ban III-digested fragment, derived from the 4.0 kb Hind III fragment of λMc.8 (see Fig. 1), labeled with DIG-11-dUTP using a DNA labeling kit (Boehringer Mannheim). Hybridization was carried out in a solution of 5×SSC, 0.1% (w/v) N-lauroylsarcosine, 0.02% SDS and 1% (w/v) casein at 68°C overnight. Filters were washed twice in 2×SSC, 0.1% SDS at room temperature for 5 min and twice in 0.1×SSC, 0.1% SDS at 68°C for 15 min. Hybridization was detected with CDP-Star (Boehringer Mannheim) according to the manufacturer's protocol.
Subcloning and sequencing
Isolation of bacteriophage DNAs, subcloning of fragments into plasmids (pUC 118, 119 and pBluescript), transformation into E. coli (JM109, INVa-F' and XL1-Blue), plasmid growth, isolation of plasmid DNAs, and restriction enzyme digestions were performed according to standard methods (Sambrook et al., 1989). Sequences of subcloned DNAs were determined using a DyeDeoxy terminator Cycle Sequencing kit (Perkin Elmer) and an ABI PRISM 310 Genetic Analyzer (PE Applied Biosystems). Subclones containing a portion of the internal IGS (λMc.1–7 and λMc.11–12) were sequenced using a GPS-1 (genome priming system) kit (BioLabs). The sequences determined were submitted to DDBJ/EMBL/GenBank (accession numbers: AB052895, AB121786, AB121787, AB121788, AB121789 and AB121790).
Cloning and alignment of M. croslandi rDNA
The clone pMc.r2, containing a portion of 28S rDNA, was previously used for fluorescence in situ hybridization to chromosomes of species of Myrmecia (Hirai et al., 1994). We first determined the complete nucleotide sequence of pMc.r2 (accession number AB052895) (Fig. 1). This clone contained part of the 18S rRNA gene, ITS 1, the 5.8S rRNA gene, ITS 2 and part of the 28S rRNA gene, leaving the sequence including the intergenic spacer (IGS) lying between the right portion of the 28S rRNA gene and the left portion of the 18S rRNA gene to be cloned and sequenced. In order to clone and sequence the entire single unit of the rDNA, we amplified the 18S rDNA region that was not contained in pMc.r2, using a forward primer designed using the 18S rRNA gene sequences common to D. melanogaster (Tautz et al., 1988) and Polistes dominulus (Chalwatzis et al., 1995), and a reverse primer designed using the sequence of the 18S rDNA at the 5′ end of pMc.r2. A 1.8 kb fragment obtained was cloned (pMc.18S-1pcr) and sequenced (accession number AB121786).
pMc.r1 is another ant rDNA clone prepared previously and is longer than pMc.r2 (Hirai et al., 1994). Restriction mapping showed that the region around the 3′ end of pMc.r1 was different from that of pMc.r2 (Hirai et al., 1994). We determined the sequence of this region and found that pMc.r1 contained an inserted sequence in the 28S rRNA gene region (Fig. 1).
We then screened a library constructed from genomic DNAs partially digested with Hind III with pMc.18S-2pcr and/or pHO-8 (Fig. 1) as probes. We obtained two clones which hybridized with both pMc.18S-2pcr and pHO-8, three clones with only pHO-8 and two clones with only pMc.18S-2pcr.
These clones were digested with Hind III and probed with pMc.18S-2pcr, pHO-8 and pHO-10. After restriction mapping, subcloning and partial sequencing of both the 5′- and 3′-end regions, the clones were aligned with pMc.r2 and pMc.18S-1pcr (Fig. 1). λMc.1 contained the 18S and 28S regions, λMc.8 contained the upstream sequence of 18S and λMc.11 contained the sequence spanning from part of 18S to downstream of 28S. There is a single Hind III site within the 18S rRNA gene; a Hind III site was also found at the 5′ end of λMc.11 and at the 3′ end of λMc.8.
We designed a forward primer toward the 3′ direction based on the sequence near the 3′ end of λMc.11 and a reverse primer toward the 5′ direction from the sequence near the 5′ end of λMc.8 in order to determine if any Hind III fragments exist between the 3′ end of λMc.11 and the neighbouring Hind III site at the 5′ end of λMc.8 within a portion of the intergenic spacer (IGS) region. An amplified fragment from genomic DNA was cloned (pMc.IGSpcr) and sequenced. We found that the DNA fragment contained the 3′ end portion of λMc.11 and the 5′ end portion of λMc.8 and only one Hind III site (Fig. 1). The DNA sequence data clearly indicate that the rDNA illustrated in Fig. 1 is a complete unit of the tandemly arranged repeated gene, and that we obtained clones covering the entire unit.
The sequence of the rDNA repeating unit
To extend the sequence analyses in either direction from the core rDNA coding region, we further examined subclones of λMc.1, λMc.8 and λMc.11. The 6.0 kb Hind III-Hind III fragment from λMc.1 and the 1.6 kb EcoR I-Hind III fragment at the 3′ end of λMc.11 were sequenced (accession numbers AB121787 and AB121788, respectively). The 1.7 kb Hind III-Ban III fragment at the 5′ end of λMc.8 was also sequenced (accession number AB121789), but the remaining 2.3 kb Ban III-Hind III fragment was not sequenced completely, because it contained small subrepeat structures (see below).
The 5′ and 3′ ends of each rRNA gene were determined by sequence comparison, using the sequences of D. melanogaster as a reference: the 18S rRNA gene was 1,922 bp, the 5.8S rRNA gene was 109 bp and the 28S rRNA gene was 4,243 bp long.
λMc.1, λMc.11 and pMc.r1 each contained an inserted sequence of a different size at the R2 target site (Burke et al., 1999) that we found in 28S rDNA. The R2 element (a non-LTR retrotransposon) recognizes a 54 bp-long sequence at the target site that is highly conserved in Arthropoda. These insertions were partially sequenced and aligned with the known R2 element sequences of insect and other arthropod species (Burke et al., 1999). The results indicated that the insertions indeed represent the R2 element. Among the three clones, only λMc.11 had a duplication of the 26 bp target sequence.
A fragment of λMc.1 (6.0 kb, see Fig. 1) was completely sequenced and a fragment of λMc.8 (8.0 kb, see Fig. 1) was partially sequenced. These sequences showed that the difference in these two clones was due to the number of the internal subrepeat sequences of IGS (see below).
Dot-matrix analyses of rDNA
The rDNA unit of M. croslandi was compared with that of D. melanogaster (Tautz et al., 1988) by dot-matrix analyses (Fig. 2A). For this purpose, the sequence of the ant rDNA was arranged so that the 5′ end of the Hind III site of the 6.0 kb sequence of λMc.1 within the IGS subrepeat was numbered as the first nucleotide, and the 3′ end of the Ban III site of the 1.7 kb sequence of λMc.8 within the IGS subrepeat was numbered as nucleotide 15,186. Twenty bases were compared at a time, requiring 18 bases to be identical to register a dot. Linear-like plots appeared in the rDNA coding regions. The sequences of the rRNA genes were highly conserved with respect to those of D. melanogaster (identity; 18S: 76%, 5.8S: 86%, the first half of 28S: 52%, the latter half of 28S: 65%). No sequence similarity between the two species was detected in the IGS and ITS regions, as has been noted in previous studies (Long and Dawid, 1980; Hillis and Dixon, 1991).
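The dot-matrix criterion described above (a 20-base window, with a dot registered when at least 18 bases are identical) can be sketched as follows. This is a minimal illustration, not the software used in the study; the function name `dot_matrix` and its parameter names are hypothetical, and the naive double loop is only practical for short test sequences rather than a full 15 kb unit.

```python
def dot_matrix(seq_a, seq_b, window=20, min_matches=18):
    """Return (i, j) start positions where the windows seq_a[i:i+window]
    and seq_b[j:j+window] share at least min_matches identical bases."""
    dots = []
    for i in range(len(seq_a) - window + 1):
        win_a = seq_a[i:i + window]
        for j in range(len(seq_b) - window + 1):
            win_b = seq_b[j:j + window]
            # Count positions at which the two windows carry the same base.
            matches = sum(x == y for x, y in zip(win_a, win_b))
            if matches >= min_matches:
                dots.append((i, j))
    return dots


if __name__ == "__main__":
    toy = "ACGTTGCAAGCTTAGGCCTA" * 2  # toy sequence with a 20-base tandem repeat
    for i, j in dot_matrix(toy, toy):
        print(i, j)
```

Comparing a sequence with itself always yields the main diagonal; tandem subrepeats add off-diagonal dots parallel to it, which is the kind of stitch-like pattern a self-comparison plot is used to reveal.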
The ant rDNA was compared with itself using the same method (Fig. 2B). Extensive stitch-like clusters were observed in the IGS region, suggesting that the same sequence arrays are present as tandemly arrayed subrepeats. Minor stitch-like clusters were also observed near the 5′ end of the IGS. Analysis of the sequences corresponding to these minor clusters (nucleotide positions 12477–13116) by the Zuker method for prediction of RNA structure (Zuker, 1989) showed that they could form a secondary structure with long stems (Fig. 3).
Sequences of subrepeats inside the IGS region
The 6.0 kb fragment of λMc.1 contained two different kinds of subrepeats with lengths of 293 bp and 229 bp. The 229 bp subrepeats were found sandwiched between the 293 bp subrepeats sporadically and not in any uniform manner. The 293 bp subrepeats had a Ban III site within the sequence, but the 229 bp subrepeats did not (data not shown but see the database, accession number AB121787) (Fig. 4). Thus, the Ban III fragment sizes are expected to be either 0.3 kb (293 bp) from [293 bp+293 bp] subrepeats or 0.5 kb (522 bp) from [293 bp and 229 bp] subrepeats. When the Ban III digest of the 6.0 kb fragment of λMc.1 was separated by electrophoresis on an agarose gel, a 0.3 kb and a 0.5 kb band were observed. On the other hand, when the Ban III digest of the 8.0 kb fragment of λMc.8 was similarly examined, a 0.8 kb-long band was observed in addition to the bands of 0.3 kb and 0.5 kb (data not shown). Cloning and sequencing of the 0.8 kb fragments showed that they contained tandemly arrayed 0.3 kb and 0.5 kb subrepeats with 293 bp subrepeats that had lost a Ban III site (accession number AB121790).
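The expected Ban III fragment sizes can be checked with a small model of the subrepeat array. In this sketch each subrepeat is a (length, has Ban III site) pair; the position of the site inside the 293 bp subrepeat (`site_offset`) is an arbitrary assumption, since internal fragment sizes do not depend on it as long as every site sits at the same offset. The names used are hypothetical, and the third arrangement is only one possible layout consistent with the 0.8 kb band.

```python
# Subrepeat types from the text: (length in bp, carries a Ban III site).
R293 = (293, True)           # 293 bp subrepeat with its Ban III site
R229 = (229, False)          # 229 bp subrepeat, no Ban III site
R293_NO_SITE = (293, False)  # 293 bp subrepeat that has lost the site


def internal_fragments(units, site_offset=50):
    """Sizes of fragments lying between consecutive Ban III sites.

    site_offset is an assumed position of the site within a 293 bp
    subrepeat; end fragments of the array are not reported.
    """
    cuts = []
    position = 0
    for length, has_site in units:
        if has_site:
            cuts.append(position + site_offset)
        position += length
    return [right - left for left, right in zip(cuts, cuts[1:])]


if __name__ == "__main__":
    print(internal_fragments([R293, R293]))                      # ~0.3 kb band
    print(internal_fragments([R293, R229, R293]))                # ~0.5 kb band
    print(internal_fragments([R293, R293_NO_SITE, R229, R293]))  # ~0.8 kb band
```

Under these assumptions the three arrangements give 293 bp, 522 bp and 815 bp, matching the 0.3 kb, 0.5 kb and 0.8 kb bands.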
Closer examination showed that the 293 bp subrepeat could further be divided into two shorter subunits, 149 bp (A) and 144 bp (B or B′), with similar sequences (identity: A/B 81%, A/B′ 81%, B/B′ 88%). There was a Ban III site within the 149 bp subunit but not within the 144 bp subunit. Similarly, the 229 bp subrepeat could be divided into two shorter subunits of 93 bp (C) and 136 bp (B″) (identity: A/B″ 79%, B/B″ 81%, B′/B″ 94%) (Fig. 4).
Digestion of the λMc.8 DNA with Hind III yielded two fragments of 4.0 kb and 8.0 kb. The results indicated that there is a single Hind III site within the region of subrepeats in the IGS. Digesting the 4.0 kb fragment with Ban III, we obtained 0.3 kb, 0.5 kb and 1.7 kb fragments. The sequence of the 1.7 kb fragment showed that this fragment contained neither 293 bp nor 229 bp subrepeats (see Fig. 1).
Genomic Southern hybridization
The length of the IGS was estimated by genomic Southern hybridization using a 0.3 kb Ban III fragment of the IGS subrepeats from λMc.8 as a probe. Ant genomic DNAs were completely digested with EcoR I, Hind III or Ban III and probed (Fig. 5). In the EcoR I digest, major bands were detected at and around 14 kb. In the Hind III digest, a major band at 8.0 kb, two less prominent bands at 2.4 kb and 2.2 kb, and minor bands at 6.0 kb, 4.0 kb and 2.0 kb were observed, while in the Ban III digest three distinct bands (0.8 kb, 0.5 kb and 0.3 kb) were detected. Taken together (see also Figs. 1 and 4) the results indicate that the IGS region varies in length (7–12 kb) due to variation in the length of subrepeats.
The present study revealed the complete structure and sequence of a single rDNA repeating unit of M. croslandi. The boundaries of the individual coding regions and associated ITSs of the 18S, 5.8S and 28S rRNA genes were determined by using the sequences of D. melanogaster rRNA genes as references (Tautz et al., 1988). As is typical for eukaryotic rDNA units (Long and Dawid, 1980), the unit of M. croslandi rDNA contains, in addition to the core genes, a long intergenic spacer (IGS) sequence. The M. croslandi IGS consists of terminal regions at both ends with unique sequences and a long internal region with repeated short stretches.
Eukaryotic rRNA genes are generally composed of a few hundred to a few thousand copies of tandemly arrayed repeated units, and those of insects are no exception (Long and Dawid, 1980). In D. melanogaster, there are 200–250 copies of rDNA units that are tandemly arrayed (Glover, 1981). Hymenopteran insects are also known to have multiple copies of rDNAs (Bigot et al., 1992). A number of Myrmecia species, including M. croslandi, were shown to carry multiple copies of rDNA over a considerably large area of the chromosome complement (Hirai et al., 1994, 1996). The fact that we could PCR amplify a fragment (cloned as pMc.IGSpcr, see Fig. 1), whose sequence was aligned at its 3′ end with the 5′ end of the rDNA unit, and at its 5′ end with the 3′ end of the rDNA unit (Fig. 1), indicates that the copies are arrayed in tandem.
There have been only a few reports describing the complete molecular structure and the sequence of the rDNA unit in insects (Tautz et al., 1988; Amako et al., 1996). The order of the coding and non-coding (spacer) regions, as is typical of that in the eukaryotes, is shown to be the same among M. croslandi (Hymenoptera), D. melanogaster (Diptera) and A. pisum (Hemiptera) by comparing the sequence results of the entire single repeating unit of their rDNAs (present results; Tautz et al., 1988; Amako et al., 1996). High sequence similarity between M. croslandi and D. melanogaster in the entire coding regions (and for the 28S coding region of A. pisum (Amako et al., 1996)) was demonstrated as expected (data not shown; the A. pisum sequence data for the 18S and 5.8S regions have not been submitted to the database and hence are not available for comparison). The sequences in the coding regions are well conserved among various invertebrate and vertebrate species (Hillis and Dixon, 1991).
In contrast, the sequences in the non-coding regions are quite dissimilar among the three insect species. The subrepeat region within the IGS in M. croslandi consists of two different repeats (293 bp and 229 bp), that in D. melanogaster of three different repeats (95 bp, 330 bp and 240 bp) (Tautz et al., 1988), and that in A. pisum of identical 247 bp repeats (Kwon and Ishikawa, 1992). The sequences of the subrepeats have no similarity among the three species, each belonging to a different order. Tautz et al. (1987) compared the IGS sequences of four Drosophila species and presented the results showing the overall evolutionary relationships between these species.
The complete sequence of the rDNA of M. croslandi revealed in the present study provides the means to design PCR primers anywhere inside the rDNA of M. croslandi. Primers prepared based on the unique sequence of the IGS region amplified a DNA fragment of M. croslandi, but did not amplify any DNA fragment in two other ant species (Myrmecia pilosula and Nothomyrmecia macrops, our unpublished data). This highly specific amplification suggests that the sequence information of IGSs may even be species-specific.
It will thus be of considerable interest to examine the IGS sequences, both unique and subrepeat, among species of the same Genera, Subfamilies and Families in various insect Orders. If the subrepeat patterns and sequences reflect phylogenetic relationships at the Genus, Subfamily, or Family levels, the information should be very useful for the clarification of evolutionary relationships. Even in the cases in which the subrepeat patterns and sequences do not reflect phylogenetic relationships, the information should at least be useful for the identification of a species in the social insects, such as ants, in which mating experiments are often quite difficult, if not impossible, to perform. In any event, for the comparison of distantly related groups of species, only the sequences of the conserved regions appear to be useful, and thus PCR primers must be designed accordingly to amplify only conserved regions.
In the present study, the sequence of the 1.5 kb EcoR I–Hind III fragment around the 3′ end of the 28S rRNA gene obtained from λMc.11 was refractory to determination by the ordinary dye-terminator method. The sequence was finally determined by the GPS (genome priming system) method. The GPS method interrupts stem formation by the insertion of a transposable element, allowing us to determine the nucleotide sequence (Biery et al., 2000). We found a relatively short sequence containing both tandemly and non-tandemly arranged stretches following the end of the coding region of the 28S rRNA gene (Fig. 2B). The sequence represents a characteristic feature of the ant rDNA unit that was previously unknown in other species such as D. melanogaster (Tautz et al., 1988) and A. pisum (Amako et al., 1996). This short sequence is expected to form a secondary structure with long stems (Fig. 3), which may disturb the polymerase extension reaction in the sequencing process. Whether this structure affects the rDNA transcription process in vivo remains to be determined.
We found clones carrying an R2 element (a site-specific non-LTR retrotransposon) inserted at the 54 bp-long consensus R2 target sequence found in the 28S coding region (Fig. 1). One clone had an R2 insertion next to the target sequence and a duplication (26 bp) of the target sequence, supporting the R2 transposition model (Burke et al., 1999). The proportion of rDNA copies carrying R2 elements was high: four of the eight clones isolated from the library constructed with an EcoR I digest of the genomic DNA (pMc.1 to 8), and all four clones isolated from the Hind III library, contained R2 elements (data not shown). It has been suggested that the copy number of functional rDNAs is not large, since the R2 insertion may disturb transcription (Luan and Eickbush, 1996). The large bands hybridized with rDNA fragments detected in some Myrmecia species by FISH analyses (Hirai et al., 1994) might simply represent an extensive array of rDNAs that became inactive due to the insertion of R2 elements (Jakubczak et al., 1991). The number of active rDNAs may be comparable to that observed in other species, compensated to the normal level by amplification of rDNA without R2 insertion.
Knowledge of the complete unit of rDNA of ants may not only contribute to the elucidation of the molecular phylogeny of Hymenoptera, but also provide materials for studies of rDNA functions. The subrepeat region in the IGS has been shown to control the transcription of rDNA in Drosophila (Kohorn and Rae, 1983; Grimaldi and Nocera, 1988), and in other organisms (Grummt, 1982; Mougey et al., 1996). The rate of rDNA transcription is efficiently regulated according to the growth rate of the cells and body size (Zaffran et al., 1998; Giordano et al., 1999; Frank et al., 2002). The mechanisms to be studied include the differences in gene expression between haploid (male) and diploid (female) cells in Hymenoptera, and the differentiation of body sizes in different castes of eusocial insects such as ants.
We thank Hirotami T. Imai of the National Institute of Genetics for his introduction to the studies and generous provision of the ant material. We are grateful to Hajime Ishikawa and Hiromichi Makita of the University of the Air, Zhi-Hui Su and Kazunori Yamazaki of the JT Biohistory Research Hall, and Kyoichi Sawamura of the University of Tsukuba for their valuable advice.