Recent biochemical, genetic and bioinformatic studies have demonstrated that peptide signaling plays a greater than anticipated role in various aspects of plant growth and development. More than a dozen secreted peptides are now recognized as important signals that mediate cell-to-cell communication. Secreted peptide signals often undergo post-translational modification and proteolytic processing, which are important for their function. Such “small post-translationally modified peptide signals” constitute one of the largest groups of peptide signals in plants. In parallel with the discovery of peptide signals, specific receptors for such peptides were identified as being membrane-localized receptor kinases, the largest family of receptor-like molecules in plants. These findings illustrate the critical roles of small peptide ligand-receptor pairs in plant growth and development. This review outlines recent research into secreted peptide signals in plants by focusing on small post-translationally modified peptides.
Peptides are generally defined as polypeptide chains smaller than 100 amino acid residues. Recent biochemical, genetic and bioinformatic analyses have revealed that secreted peptides are important components in intercellular signals that coordinate and specify cellular functions in plants as well as animals. In addition, a number of genes encoding small secreted peptides have been identified in Arabidopsis (Lease and Walker, 2006; Silverstein et al., 2007; Ohyama et al., 2008). Thus, unraveling the signal transduction pathways mediated by secreted peptides is currently one of the most exciting goals in plant biology.
The first signaling peptide identified in plants was tomato systemin, which was reported in 1991 (Pearce et al., 1991), and more than ten secreted peptide signals and putative peptide signal genes have since been identified in plants (Table 1) (Matsubayashi and Sakagami, 2006; Butenko et al., 2009; Wang and Fiers, 2010; Shimada et al., 2011). One structurally characteristic group of peptide signals is “small post-translationally modified peptides”, which include phytosulfokine (PSK) (Matsubayashi and Sakagami, 1996), tracheary element differentiation inhibitory factor (TDIF) (Ito et al., 2006), CLAVATA3 (CLV3) (Fletcher et al., 1999; Kondo et al., 2006; Ohyama et al., 2009) and root meristem growth factor (RGF) (Matsuzaki et al., 2010) (Table 2) (Figure 1). These peptides are characterized by a small mature size (less than 20 amino acids) and the presence of post-translational modifications such as tyrosine sulfation, proline hydroxylation and arabinosylation. In these peptide signals, peptide chain length and post-translational modifications are generally very important for their receptor binding activity and physiological functions. Post-translational modification is thought to affect peptide conformation through steric interactions with the peptide backbone, thereby modulating the binding ability and specificity of peptides to target proteins (Walsh et al., 2005). Post-translational modification can also protect peptides from proteolytic attack and, in some cases, enhance tissue delivery of the peptide through specific transporters in vivo (Seitz, 2000). Another group of secreted peptide signals is “cysteine-rich peptides”, which are characterized by the presence of an even number of cysteine residues (typically 6 or 8) necessary for the formation of intramolecular disulfide bonds (Figure 1).
Several secreted peptide signals such as SCR/SP11 (Schopfer et al., 1999; Takayama et al., 2000), stomagen (Kondo et al., 2010; Sugano et al., 2010), LURE (Okuda et al., 2009) and epidermal patterning factors (EPFs) (Hara et al., 2007; Hara et al., 2009) belong to this group.
In parallel with the discovery of these peptide signals, specific receptors for several peptide signals were identified as being membrane-localized receptor kinases, the largest family of receptor-like molecules in plants (Matsubayashi et al., 2002; Hirakawa et al., 2008; Ogawa et al., 2008; Guo et al., 2010). These findings illustrate the importance of peptide signaling in the regulation of plant growth and development. Here, I outline recent advances in the understanding of peptide signals in Arabidopsis, which are now regarded as a new class of plant hormones, focusing on small post-translationally modified peptides.
STRUCTURAL CHARACTERISTICS OF SMALL POST-TRANSLATIONALLY MODIFIED PEPTIDES
Small post-translationally modified peptides are characterized by the presence of post-translational modifications mediated by specific transferases and by their small sizes (less than 20 amino acids) resulting from proteolytic processing (Figure 1). These peptides are initially translated as ≈100-amino-acid precursor peptides containing an N-terminal secretion signal, and are then structurally modified by specific modification enzymes and processing enzymes localized in the ER or Golgi complex to give mature, biologically functional peptides (Figure 2). This peptide signal group includes PSK (Matsubayashi and Sakagami, 1996), hydroxyproline-rich systemin (HypSys) (Pearce et al., 2001a), TDIF (Ito et al., 2006), PSY1 (Amano et al., 2007), CLV3 (Fletcher et al., 1999; Kondo et al., 2006; Ohyama et al., 2009), CLAVATA3/ESR-Related (CLE) peptides (Cock and McCormick, 2001; Ohyama et al., 2009), C-terminally encoded peptide 1 (CEP1) (Ohyama et al., 2008) and RGF (Matsuzaki et al., 2010) (Table 2).
Table 1. List of peptide signals in plants.
Table 2. List of structurally characterized small post-translationally modified peptide signals in Arabidopsis. The sulfated tyrosine residue is shown as Tyr(SO3H). The hydroxyproline residue is shown as Hyp. The hydroxyproline residue modified with three residues of L-arabinose is shown as [(L-Ara)3]Hyp.
Interestingly, the primary sequences of the precursor polypeptides of these peptide signals have unique structural features (Figure 2). First, they are encoded by multiple paralogous genes possibly generated by gene duplication. Second, these paralogous genes encode approximately 70- to 120-amino-acid secreted polypeptides that exhibit significant sequence diversity, with the exception of conserved regions near the C-terminus that correspond to the mature peptide domains (Figure 2). Limited conservation in amino-acid sequences outside of the mature peptide domain implies that the regions cleaved by proteolysis are not under strong selection pressure. Third, the precursor polypeptides for these peptide signals contain few or no Cys residues. This is in sharp contrast to the presence of multiple Cys residues (typically 6 or 8) in the cysteine-rich peptides that are structurally stabilized by intra-molecular disulfide bonds, thus suggesting that such disulfide bonds interfere with proteolysis by the processing enzymes.
Importantly, these characteristic structural features of the precursors of the small post-translationally modified peptide signals enable prediction of genes encoding novel peptide signal candidates. In other words, genes encoding cysteine-poor secreted peptide families with conserved C-terminal domains may encode small post-translationally modified peptide signals. Indeed, CEP1 and RGF were identified in part by in silico screening of peptide families with these characteristics (Ohyama et al., 2008; Matsuzaki et al., 2010).
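The screening criteria described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the cited studies. The `Precursor` record, the `has_signal_peptide` flag (assumed to come from an external signal-peptide predictor), the cysteine cut-off, the 15-residue window and the 60% identity threshold are all hypothetical choices made for illustration. Comparing the extreme C-terminal residues without gaps is also a simplification, since the conserved domain lies near, not necessarily at, the C-terminus.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Precursor:
    name: str
    sequence: str             # full-length precursor, one-letter amino acid code
    has_signal_peptide: bool  # assumed to be supplied by an external predictor


def is_cys_poor(sequence, max_cys=1):
    """True if the precursor contains few or no Cys residues (cut-off is an assumption)."""
    return sequence.count("C") <= max_cys


def c_terminal_identity(seq_a, seq_b, window=15):
    """Fraction of identical residues in the last `window` positions (ungapped)."""
    tail_a, tail_b = seq_a[-window:], seq_b[-window:]
    n = min(len(tail_a), len(tail_b))
    if n == 0:
        return 0.0
    # Compare from the C-terminus backwards so tails of unequal length stay aligned.
    matches = sum(1 for x, y in zip(reversed(tail_a), reversed(tail_b)) if x == y)
    return matches / n


def candidate_family(members, min_len=70, max_len=120, max_cys=1,
                     window=15, min_identity=0.6):
    """True if a family of paralogs fits all of the precursor features listed above."""
    if len(members) < 2:  # the criteria require multiple paralogous genes
        return False
    for m in members:
        if not m.has_signal_peptide:
            return False
        if not (min_len <= len(m.sequence) <= max_len):
            return False
        if not is_cys_poor(m.sequence, max_cys):
            return False
    # Every pair of paralogs must share a conserved C-terminal window.
    return all(
        c_terminal_identity(a.sequence, b.sequence, window) >= min_identity
        for a, b in combinations(members, 2)
    )
```

A family that passes such a filter is only a candidate. As the text notes, CEP1 and RGF were identified only in part by in silico screening.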
POST-TRANSLATIONAL MODIFICATIONS IN PEPTIDE SIGNALS
Post-translational modifications are known to affect peptide conformation through steric interactions with the peptide backbone, thereby modulating the binding ability and specificity of peptides for target receptor proteins. To date, the following types of post-translational modification have been identified in secreted peptide signals in plants: tyrosine sulfation, proline hydroxylation and hydroxyproline arabinosylation.
Tyrosine sulfation is a post-translational modification found in peptides and proteins synthesized through the secretory pathway (Figure 3A). To date, three tyrosine-sulfated peptide signals, PSK (Matsubayashi and Sakagami, 1996), PSY1 (Amano et al., 2007) and RGF (Matsuzaki et al., 2010), have been identified in plants (Table 2). This modification is mediated by a specific enzyme, tyrosylprotein sulfotransferase (TPST), which catalyzes the transfer of sulfate from 3'-phosphoadenosine 5'-phosphosulfate (PAPS) to the phenolic group of tyrosine (Komori et al., 2009). Although the tyrosine sulfation motif in peptides is not clear-cut, the minimum requirement for tyrosine sulfation in plants is the presence of an aspartic acid residue N-terminally adjacent to a tyrosine residue (Asp-Tyr). Multiple acidic amino acids near this tyrosine residue significantly enhance sulfation (Hanai et al., 2000a).
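As a rough illustration of the minimum sulfation requirement described above, the following Python sketch scans a sequence for Tyr residues directly preceded by Asp and counts the acidic residues nearby. The function name and the five-residue flank are assumptions made for this example. The text does not define how close "near" is, and the count is only a crude indicator, not a predictor of sulfation efficiency.

```python
ACIDIC = {"D", "E"}  # aspartic acid and glutamic acid


def sulfation_candidates(sequence, flank=5):
    """Return (tyr_position, acidic_count) for every Asp-Tyr pair.

    tyr_position is 1-based. acidic_count is the number of Asp/Glu residues
    within `flank` residues on either side of the Tyr (the flank size is an
    assumption, and the count includes the adjacent Asp).
    """
    hits = []
    for i in range(1, len(sequence)):
        if sequence[i] == "Y" and sequence[i - 1] == "D":
            lo = max(0, i - flank)
            hi = min(len(sequence), i + flank + 1)
            neighbours = sequence[lo:i] + sequence[i + 1:hi]
            acidic = sum(1 for residue in neighbours if residue in ACIDIC)
            hits.append((i + 1, acidic))
    return hits
```

For the toy sequence `"AADYEEGY"`, only the first Tyr is reported, because the second is not preceded by Asp.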
Arabidopsis TPST (AtTPST) is a Golgi-localized 62-kDa transmembrane protein (Komori et al., 2009). AtTPST is expressed throughout the plant body, and the highest levels of expression are observed in the root apical meristem. Surprisingly, AtTPST shows no sequence similarity with animal TPSTs, despite both enzymes catalyzing identical sulfate transfer reactions using the same co-substrate, PAPS. AtTPST is a type I transmembrane protein with the transmembrane domain located near the C-terminus, whereas animal TPSTs are type II transmembrane proteins with the transmembrane domain located near the N-terminus (Moore, 2003). This structural diversity strongly suggests that the AtTPST gene has evolved from an ancestral gene distinct from that of animal TPSTs. In other words, plants and animals independently acquired enzymes for tyrosine sulfation through convergent evolution.
A loss-of-function mutant of AtTPST (tpst-1) displayed a marked dwarf phenotype accompanied by stunted roots, loss of root stem cells, pale green leaves and early senescence (Komori et al., 2009). As Arabidopsis TPST (AtTPST) is a single copy gene, phenotypes of its loss-of-function mutant should reflect deficiencies in the biosynthesis of all the functional tyrosine-sulfated peptides.
Hydroxyproline (Hyp) residues have been observed in defense-related hydroxyproline-rich systemin (TobHypSys) (Pearce et al., 2001a), TomHypSys (Pearce and Ryan, 2003), PSY1 (Amano et al., 2007), TDIF (Ito et al., 2006), CEP1 (Ohyama et al., 2008), CLV3 (Kondo et al., 2006; Ohyama et al., 2009), CLE2 (Ohyama et al., 2009) and RGF1 (Matsuzaki et al., 2010) (Figure 3B).
Proline hydroxylation is mediated by prolyl 4-hydroxylase (P4H), which belongs to a family of 2-oxoglutarate-dependent dioxygenases that require 2-oxoglutarate and O2 as co-substrates (Myllyharju, 2003). P4H is a type II membrane protein with the transmembrane domain located near its N-terminus, and it localizes in both the ER and Golgi complex. To date, 13 P4H genes have been identified in Arabidopsis (Hieta and Myllyharju, 2002; Tiainen et al., 2005; Yuasa et al., 2005; Velasquez et al., 2011). Although some sequence motifs have been reported for efficient proline hydroxylation (Shimizu et al., 2005), no definitive consensus sequence has been determined for proline hydroxylation of secreted peptide signals in plants.
Hyp residues in several secreted peptide signals, such as PSY1, CLV3 and CLE2, are further modified with an O-linked L-arabinose chain (Amano et al., 2007; Ohyama et al., 2009) (Figure 3C). Linkage analysis of the sugar moiety and α-L-arabinofuranosidase treatment suggest that arabinose residues are linked with one another via β-1,2-bonds (Ohyama et al., 2009) (Figure 4). This structure is supported by NMR analysis of Hyp-bound linear triarabinoside isolated from cell wall preparations of Arabidopsis cell suspension cultures (Bollig et al., 2007). Linear β-1,2-linked triarabinoside has also been detected in lectins derived from potato tubers (Ashford et al., 1982), and in hydroxyproline-rich glycoproteins derived from cell wall preparations of suspension-cultured tobacco cells (Akiyama and Kato, 1976; Akiyama et al., 1980).
O-glycosylation generally occurs via the successive addition of nucleotide-activated sugars catalyzed by glycosyltransferases in the Golgi complex. It is thought that the biosynthesis of Hyp-bound β-1,2-linked triarabinoside involves two distinct arabinosyltransferases. The first is responsible for the formation of a β-linkage with the 4-position of hydroxyproline (hydroxyproline arabinosyltransferase), and the second forms a β-1,2-linkage between arabinofuranose residues (arabinosyltransferase) (Bollig et al., 2007). Among 461 putative Arabidopsis glycosyltransferase genes (http://www.cazy.org), recent chemical genetic screening suggests that XEG113 (At2g35610) encodes a β-1,3-arabinosyltransferase or a bifunctional β-1,2- and β-1,3-arabinosyltransferase (Gille et al., 2009). In contrast, there have been no reports on hydroxyproline arabinosyltransferase.
In animals and yeasts, biosynthesis of small peptide signals often involves proteolytic processing of precursor polypeptides to produce mature functional peptides. Examination of the primary sequences of many animal peptide signals has shown that cleavage of a precursor polypeptide occurs on the C-terminal side of paired basic amino acids, such as KK, KR, RK and RR. In animals, this cleavage is catalyzed by subtilisin/kexin-like prohormone convertases (Rehemtulla and Kaufman, 1992).
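The paired-basic cleavage rule described above is simple enough to sketch in Python. The function name and the return format are assumptions made for illustration. The sketch only locates the KK, KR, RK and RR motifs and reports the residue after which cleavage would be expected. It does not model any enzyme.

```python
import re

# Lookahead so that overlapping pairs (e.g. "KRR" contains KR and RR) are all reported.
DIBASIC = re.compile(r"(?=([KR][KR]))")


def dibasic_cleavage_sites(sequence):
    """Return (position, motif) for each paired basic motif.

    position is the 1-based index of the second basic residue, i.e. cleavage on
    the C-terminal side of the pair would occur immediately after it.
    """
    return [(m.start() + 2, m.group(1)) for m in DIBASIC.finditer(sequence)]
```

For a toy sequence such as `"AAKRGG"` the function reports a single site after residue 4, whereas a sequence with only isolated basic residues yields an empty list.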
Proteolytic processing is also critical for biosynthesis of small post-translationally modified peptides in plants, but the processing mechanisms of plant peptides are different from those in animals. First, there is no paired basic amino acid motif adjacent to the mature peptide domain within the precursor polypeptides (Figure 2). Instead, processing enzyme activity that cleaves on the N-terminal side of a single Arg residue of the CLV3 precursor polypeptide has been detected in crude cauliflower extract in vitro (Ni and Clark, 2006; Ni et al., 2011), indicating that the substrate specificity of plant subtilases is distinct from that of animal subtilisin/kexin-like prohormone convertases. Second, one of the Arabidopsis subtilases, AtSBT1.1, is responsible for the initial processing of PSK4 in vivo (Srivastava et al., 2008), but its processing site is between Leu and His located somewhat upstream of the mature peptide domain. These observations suggest that basic amino acids may be the initial processing sites, but do not always directly define the boundary of the mature peptide domain. Plant subtilases have greater structural similarity with prokaryotic degrading-type subtilases than with animal processing-type subtilases, due to a lack of the P-domain characteristic of animal enzymes (Berger and Altmann, 2000). Proteolytic processing of plant peptide signals may involve a number of complex steps, such as initial cutting and further trimming of the peptides. These proposed processing mechanisms sharply contrast with those of animal peptide signals, in which the mature peptide is typically generated after initial processing on the C-terminal side of basic amino acids and subsequent removal of terminal basic residues by carboxypeptidases.
It is also unclear how the final processing site is defined by such an ambiguous proteolysis system in plants. One possibility is that the mature peptide domain escapes proteolysis due to the presence of post-translational modifications, which often confer resistance to proteolytic digestion. In addition, many of the small peptide signals contain multiple Pro residues, which also confer resistance to proteolytic digestion by common proteases. CLE peptides contain conserved Pro or Hyp residues at the 4th, 7th and 9th positions. Pro residues are also present in the RGF peptide family at the 5th, 9th and 10th positions. Proteolytic processing of plant peptide signals may involve regulatory mechanisms that are more complex than those of animals.
INFLORESCENCE DEFICIENT IN ABSCISSION (IDA) peptides have a conserved domain near their C-terminus, but there is marked intrafamily sequence diversity in other domains (Butenko et al., 2003), which is similar to the intrafamily sequence diversity in the precursor peptides of the PSK, CLE and RGF peptide families. Indeed, synthetic conserved domain peptides induced early floral abscission at higher concentrations (Stenvik et al., 2008). These sequence characteristics strongly suggest that small peptides encoded within the C-terminal conserved domains act as ligands for corresponding receptors. Biochemical identification of the mature sequences of these small peptides and characterization of ligand-receptor interactions are needed to determine whether these putative secretory peptides function as signals.
METHODS TO IDENTIFY MATURE PEPTIDE STRUCTURES
As stated above, small post-translationally modified peptides are synthesized in the cell as larger precursor polypeptides, which are biologically inactive and undergo a variety of post-translational modifications and proteolytic processing steps to yield the active mature peptides. Thus, the development of a systematic methodology to identify the structures of mature peptides localized in apoplasts is indispensable for the field of secreted peptidomics. One of the currently used techniques is growing submerged cultures of Arabidopsis plants overexpressing target genes. Under this condition, Arabidopsis plants develop cuticle-less hyper-hydric leaves with large intercellular spaces filled with water, through which secreted peptides in the apoplast directly diffuse into the culture medium (Ohyama et al., 2008). Secreted peptides accumulated in the culture medium can be effectively identified by o-chlorophenol extraction followed by LC-MS analysis. Mature peptide structures of CLV3, CLE2, and RGF1 were identified by this approach (Ohyama et al., 2009; Matsuzaki et al., 2010).
Another method is to detect peptides by in situ MALDI-TOF mass spectrometry, in which small molecules including peptides are analyzed by directly scanning thin tissue sections with UV laser pulses. This technique was used for mature peptide analysis of CLV3 overexpressed in callus cells (Kondo et al., 2006). One weakness of this method, however, is that less ionizable molecules such as glycopeptides cannot be detected.
RECEPTORS FOR SMALL POST-TRANSLATIONALLY MODIFIED PEPTIDE SIGNALS
The receptors or putative receptors for peptide signals identified to date belong to the receptor kinase (RK) or receptor-like protein (RLP) families (Shiu and Bleecker, 2001). Among RKs, the largest subfamily is the leucine-rich repeat RK (LRR-RK) family, which consists of 216 members in Arabidopsis. The majority of receptors for small post-translationally modified peptide signals belong to this family. Representative examples include PSKR1 (receptor for PSK) (Matsubayashi et al., 2002), CLV1 (receptor for CLV3) (Clark et al., 1997; Ogawa et al., 2008) and TDR/PXY (receptor for TDIF/CLE41/CLE44) (Hirakawa et al., 2008). In addition, HAESA (HAE) and HAESA-LIKE 2 (HSL2), putative receptors for IDA, are also members of the LRR-RK family (Jinn et al., 2000; Stenvik et al., 2008). LRR-RKs are further divided into 13 subgroups based on similarity within the cytoplasmic kinase domain (Shiu and Bleecker, 2001). Interestingly, PSKR1, CLV1, TDR/PXY and HAE belong to the LRR X or LRR XI subgroups, which have no introns within the nucleotide regions that correspond to the extracellular domain. In contrast, LRR-RK genes such as BAK1 (Li et al., 2002; Nam and Li, 2002) and SERK1 (Karlova et al., 2006), whose products are thought to contribute to receptor heterodimerization, have many introns. One interpretation is that the molecular evolution of LRR-RKs that directly interact with small ligands relies on fine structural optimization by point mutations within the ligand-binding domain, whereas the molecular evolution of LRR-RKs that interact with relatively large ligands or proteins may rely on rearrangement of exon units that allows drastic conformational changes. Among the 13 LRR-RK subgroups, intron-less types are found in LRR III, LRR VII, LRR IX, LRR X, LRR XI and LRR XII.
An increasing number of LRR X and LRR XI members are now being confirmed as receptors for endogenous small peptide ligands, suggesting that these subgroups are an attractive target for binding analysis with peptide ligand candidates.
SMALL POST-TRANSLATIONALLY MODIFIED PEPTIDE SIGNALS IN ARABIDOPSIS
Here, I will briefly summarize structures and functions of currently known small post-translationally modified peptide signals in Arabidopsis (for details of specific peptides, see review articles).
Phytosulfokine (PSK) is a 5-amino-acid secreted peptide which contains two sulfated tyrosines (Table 2). PSK was initially identified as a growth-promoting signal involved in the “density effect” in plant cell cultures (Matsubayashi and Sakagami, 1996). It was later shown that PSK also promotes in vitro tracheary element differentiation of Zinnia mesophyll cells (Matsubayashi et al., 1999), somatic embryogenesis (Kobayashi et al., 1999; Hanai et al., 2000b; Igasaki et al., 2003), and pollen germination (Chen et al., 2000). PSK is produced from ∼80-amino-acid precursor peptides via post-translational sulfation by TPST and proteolytic processing (Figure 2A) (Yang et al., 1999; Matsubayashi et al., 2006). In vitro studies have suggested that this proteolytic processing is mediated, at least in part, by a subtilisin-like serine protease, AtSBT1.1 (Srivastava et al., 2008). Genes encoding PSK precursors are widely expressed in a variety of tissues and upregulated by wounding (Matsubayashi et al., 2006). PSK is recognized by a membrane-localized leucine-rich repeat receptor kinase (LRR-RK), PSKR1 (Matsubayashi et al., 2002). Disruption of PSKR1 and its two homologs in Arabidopsis causes pleiotropic growth defects such as short roots, smaller leaves and early senescence (Matsubayashi et al., 2006; Amano et al., 2007). Although the molecular target of PSK signaling is not well understood, circumstantial evidence has suggested that PSK signaling affects basic potential for growth, and thereby exerts a pleiotropic effect on plant growth and development.
The clavata3 (clv3) mutant was initially identified by phenotypic screening for mutants with progressive shoot apical meristem enlargement. CLV3 encodes a small peptide signal that regulates stem cell fate in the Arabidopsis shoot apical meristem (Fletcher et al., 1999; see Betsuyaku et al., 2011). CLV3 acts as a negative regulator of stem cell maintenance by repressing WUS, which encodes a homeodomain transcription factor that is expressed in the organizing center and promotes the identity of stem cells. Conversely, WUS positively regulates CLV3 expression in the stem cell region. This negative feedback loop ensures that stem cells are restricted to the center of the shoot apical meristem. For detailed physiological functions of CLV3, see the comprehensive review article (Wang and Fiers, 2010).
CLV3 encodes a 96-amino-acid polypeptide that is modified into a small peptide by proteolytic processing and post-translational modification (Figure 2B). In Arabidopsis callus overexpressing CLV3, a 12-amino-acid peptide in which two proline residues are hydroxylated was detected (Kondo et al., 2006). In contrast, mature CLV3 peptide identified in the culture medium of whole-plant submerged cultures of CLV3-overexpressing Arabidopsis plants was a 13-amino-acid glycopeptide, in which the 7th Hyp residue was further modified with three L-arabinose sugars (Ohyama et al., 2009) (Table 2). CLV3 is recognized by three functionally redundant receptors: CLV1 (Clark et al., 1997; Ogawa et al., 2008), CLV2/CORYNE (SOL2) (Kayes and Clark, 1998; Miwa et al., 2008; Muller et al., 2008; Guo et al., 2010) and RPK2 (Kinoshita et al., 2010). Arabinosylated CLV3 peptide interacts more strongly with CLV1 than non-arabinosylated forms (Ohyama et al., 2009). CLV3 also triggers immune signaling and pathogen resistance via the flagellin receptor kinase FLS2 to enhance innate immunity in the shoot apical meristem (Lee et al., 2011). On the other hand, the CLV3-mediated arrest of the shoot apical meristem was not dependent on FLS2, indicating that CLV3 signaling via two different LRR-RKs can be uncoupled (Lee et al., 2011). The effects of CLV3 arabinosylation on FLS2-mediated stem cell immunity remain to be addressed.
CLE Family Peptides
The above-mentioned CLV3 belongs to a family of secreted peptides, designated CLV3/embryo surrounding region (CLE) peptides, which possess a conserved 14-amino-acid domain (called the CLE domain) at or near their C-terminus (Cock and McCormick, 2001; see Betsuyaku et al. 2011). In Arabidopsis, the CLE family consists of 32 members, of which the majority are transcribed in a variety of tissues (Jun et al., 2010). Functional characterization of CLE members has suggested that they potentially affect numerous developmental processes in plants, such as shoot apical meristem development, root apical meristem development and vascular cell differentiation (Wang and Fiers, 2010). CLE40 is specifically expressed in the stele and in differentiating columella cells of the root cap. CLE40 secreted from columella cells promotes distal root meristem differentiation by acting through the receptor kinase ARABIDOPSIS CRINKLY4 (ACR4) (Stahl et al., 2009). TDIF/CLE41/CLE44 expressed in phloem and neighboring cells is recognized by TDR/PXY and controls procambial cell fate by suppressing the xylem cell differentiation of procambial cells and promoting their proliferation (Ito et al., 2006; Hirakawa et al., 2008). However, the exact in vivo function of other CLE peptides remains largely unknown due to the high degree of genetic redundancy.
The mature peptide structure of one CLE member, CLE2, is a 12-amino-acid glycopeptide with the 7th Hyp residue post-translationally modified with three L-arabinose residues (Ohyama et al., 2009) (Table 2). Although the expression pattern and physiological function of CLE2 in Arabidopsis plants have yet to be characterized, the CLE2 glycopeptide binds CLV1 with nanomolar affinity, suggesting that CLE2 potentially functions as a ligand for CLV1. In legumes, the CLE2 orthologs LjCLE-RS1 and LjCLE-RS2 are thought to act as root-derived mobile signals that systemically regulate nodule numbers through the CLV1 ortholog HAR1 receptor kinase (Okamoto et al., 2009). Binding of the CLE2 glycopeptide to CLV1 in Arabidopsis strongly supports the molecular basis of this autoregulation model.
TDIF is a 12-amino-acid peptide in which two proline residues are hydroxylated (Table 2). This peptide was initially identified as a factor inhibiting transdifferentiation of dispersed Zinnia (Zinnia elegans L.) mesophyll cells into tracheary elements (the main conductive cells of the xylem) (Ito et al., 2006; see Betsuyaku et al., 2011). The TDIF peptide suppresses xylem cell development at subnanomolar concentrations and promotes cell division in vitro. The TDIF sequence is identical to the CLE domain sequences of Arabidopsis CLE41 and CLE44, which form a distinct subclade in the Arabidopsis CLE peptide family. Further in vivo studies in Arabidopsis have revealed that the TDIF/CLE41/CLE44 expressed in phloem and neighboring cells is recognized by TDR/PXY, a leucine-rich repeat receptor kinase located in the plasma membrane of procambial cells, and controls procambial cell fate by suppressing xylem cell differentiation of procambial cells and promoting their proliferation (Hirakawa et al., 2008). Interestingly, the WOX4 transcription factor, a homolog of the shoot stem cell regulator WUS, is required for promoting the proliferation of procambial/cambial stem cells in response to the TDIF/CLE41/CLE44 signal (Hirakawa et al., 2010), suggesting some similarity in the mechanisms of stem cell niche maintenance in the shoot and vascular bundle development.
Root meristem growth factor 1 (RGF1) is a 13-amino-acid tyrosine-sulfated peptide involved in maintenance of the root stem cell niche in Arabidopsis (Matsuzaki et al., 2010). RGF1 was identified in a search for sulfated peptides that recover root meristem defects of the tpst-1 mutant, which combined in silico screening of genes encoding sulfated peptides with practical bioassays using synthetic sulfated peptides. This approach is based on the assumption that the severe short root phenotype of the tpst-1 mutant reflects deficiencies in the biosynthesis of all the functional tyrosine-sulfated peptides, including undiscovered peptides. RGFs are produced from ≈100-amino-acid precursor peptides via post-translational sulfation and proteolytic processing (Figure 2C). RGF family peptides are expressed mainly in the stem cell area and the innermost layer of central columella cells, and diffuse into the meristematic region. RGF peptides regulate root development by stabilizing PLETHORA transcription factor proteins, which are specifically expressed in the root meristem and mediate patterning of the root stem cell niche.
Although its mature peptide structure has not been determined, IDA is highly likely to be a small post-translationally modified peptide. An Arabidopsis mutant, inflorescence deficient in abscission (ida), which retains its floral organs indefinitely, was initially identified by screening for mutants with delayed floral abscission (Butenko et al., 2003). The IDA gene encodes a 77-amino-acid polypeptide that is expressed in the floral organ abscission zone throughout the floral abscission process. Studies have also identified five genes paralogous to IDA, designated IDL1-5, in Arabidopsis. Sequence alignment of the deduced peptides of this family indicates the presence of a highly conserved proline-rich domain near the C-terminus, which is similar to the structures of the precursor polypeptides of small post-translationally modified peptides (Figure 2D). Indeed, a synthetic peptide (without modifications) induced early floral abscission in wild-type flowers at higher concentrations (Stenvik et al., 2008).
The available evidence suggests that a small peptide encoded by IDA is a ligand for HAE and HSL2, which are LRR-RLKs that are involved in controlling floral organ abscission (Jinn et al., 2000; Cho et al., 2008; Stenvik et al., 2008). HAE and HSL2 are expressed at the base of petioles and pedicels, as well as in abscission zones of floral organs. Similar to the ida mutant, hae hsl2 double mutant plants were completely deficient in floral organ abscission.
PSY1 is an 18-amino-acid secreted glycopeptide containing one sulfated tyrosine residue and an L-arabinose sugar chain (Table 2) (Figure 2E). This glycopeptide was identified by exhaustive analysis of tyrosine-sulfated peptides in plant cell culture media (Amano et al., 2007). The expression pattern and physiological activity of PSY1 are similar to those of PSK. PSY1 is expressed in various Arabidopsis tissues and promotes cellular proliferation and expansion at nanomolar concentrations.
CEP1 is a 15-amino-acid peptide with two Hyp residues (Table 2). CEP1 was initially identified by in silico gene screening for structural features of prepropeptides of known small post-translationally modified peptides (Ohyama et al., 2008). The common feature of known small post-translationally modified peptide signals is that they are encoded by multiple paralogous genes whose primary products are approximately 70- to 110-amino-acid cysteine-poor secreted polypeptides that share short conserved domains near the C-terminus (Figure 2F). CEP1 family peptides fulfill these criteria. CEP1 is mainly expressed in the lateral root primordia and, when overexpressed or externally applied, significantly arrests root growth. Therefore, it is a strong candidate as a novel peptide signal.
A number of genes encoding secreted peptides have been identified in the Arabidopsis genome. In the TAIR7 protein data set, as many as 979 genes encode potential secreted peptides (SignalP score >0.75) of between 50 and 150 amino acid residues. Given that the Arabidopsis genome contains more than 600 receptor-like kinases for which corresponding ligands have not yet been identified, a substantial proportion of such secreted peptides are expected to function as ligands for receptor-like kinases.
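The selection described above amounts to a two-condition filter, which can be sketched in Python as follows. The record layout (dictionaries with `id`, `signalp_score` and `length` keys) is hypothetical and assumes that signal-peptide scores have already been computed by a separate tool. Treating the length bounds as inclusive is also an assumption, because the text does not say.

```python
def potential_secreted_peptides(records, min_score=0.75, min_len=50, max_len=150):
    """Return the ids of records passing the score and length thresholds.

    Each record is assumed to be a dict with the keys 'id', 'signalp_score'
    and 'length' (a hypothetical layout chosen for this sketch). The score
    must be strictly greater than min_score, and the length bounds are
    treated as inclusive.
    """
    return [
        r["id"]
        for r in records
        if r["signalp_score"] > min_score and min_len <= r["length"] <= max_len
    ]
```

Such a filter only narrows the protein data set to potential secreted peptides. It says nothing about whether a given peptide acts as a signal.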
There are several approaches to identify novel peptide signals or peptide signal candidates in plants. In the classical approach, peptide signals are identified by bioassay-guided fractionation of the appropriate crude samples. The first peptide signal in plants, systemin, was identified in tissue homogenates using this approach in 1991 (Pearce et al., 1991). Since then, new peptide signals have been identified approximately every five years: PSK was identified in cell culture medium in 1996 (Matsubayashi and Sakagami, 1996), defense-related peptide TobHypSys in tissue homogenate in 2001 (Pearce et al., 2001a), endogenous peptide elicitor AtPep1 in Arabidopsis leaf homogenate in 2006 (Huffaker et al., 2006) and TDIF in cell culture medium in 2006 (Ito et al., 2006). The feasibility of this approach largely depends on the sensitivity of the bioassay system and the quality of the materials to be purified.
Phenotype-based mutant screens for Arabidopsis seedlings have also contributed to the identification of peptide signal genes such as CLV3 (Fletcher et al., 1999) and IDA (Butenko et al., 2003). The major limitation of this approach is, however, a lack of phenotypes in the majority of single loss-of-function mutants due to considerable genetic redundancy in higher plants. CLV3 and IDA represent the rare case of small peptide genes that show obvious phenotypes in single mutants. Indeed, no peptide signal genes have been identified by mutant screening since the discovery of IDA in 2003. Thus, an alternative strategy for identification of novel peptide signals is necessary to overcome the limitations of bioassay-based approaches and conventional mutant screening.
To pick out peptide signal candidates from among the hundreds of potential secreted peptide genes, several novel approaches have been reported. One unique approach is gain-of-function screening, by which peptide signal candidates are identified by the phenotypes of plants overexpressing individual genes. EPF1, which encodes a secreted cysteine-rich peptide that controls stomatal patterning, is an example of successful identification by this approach (Hara et al., 2007). A combination of this approach with in silico screening techniques such as coexpression analysis, which is effective in narrowing down candidates, may further increase the probability of peptide signal identification. Stomagen, a cysteine-rich peptide that positively regulates stomatal differentiation, was identified with the help of coexpression analysis using stomata-specific genes (Kondo et al., 2010; Sugano et al., 2010).
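As a rough illustration of how coexpression analysis can narrow down candidates, the sketch below ranks candidate genes by their mean Pearson correlation with a set of "bait" genes, for example tissue-specific markers. It is not the procedure used in the cited studies: the gene names, expression values and the choice of Pearson correlation as the similarity measure are all assumptions made for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no coexpression information
    return sxy / sqrt(sxx * syy)


def rank_by_coexpression(candidates, baits):
    """Sort candidates by mean correlation with the bait genes, highest first."""
    scores = []
    for gene, profile in candidates.items():
        mean_r = sum(pearson(profile, b) for b in baits.values()) / len(baits)
        scores.append((gene, mean_r))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Invented expression values across four hypothetical samples.
baits = {"marker1": [1, 2, 3, 4], "marker2": [2, 4, 6, 8]}
candidates = {
    "cand1": [10, 20, 30, 40],  # rises with the markers
    "cand2": [4, 3, 2, 1],      # falls as the markers rise
    "cand3": [5, 1, 4, 2],      # weakly anti-correlated
}

ranked = rank_by_coexpression(candidates, baits)
for gene, score in ranked:
    print(gene, round(score, 3))
```

Genes at the top of such a ranking would then be the first to be tested by overexpression or peptide application.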
Another novel approach is to focus on post-translational modifications of the peptides. Post-translational modification requires co-substrates such as the sulfation donor PAPS and the arabinosylation donor UDP-L-arabinose, which are synthesized using ATP. Thus, biosynthesis of small post-translationally modified peptides requires considerably more energy than that of normal proteins and peptides. Nevertheless, a number of post-translationally modified peptides have been evolutionarily conserved, suggesting that these “expensive” modified peptides afford physiological merits that exceed their energy costs. In this context, post-translational modifications can be indicative of biologically active peptides. The peptidomics approach targeting sulfated peptides has successfully identified a novel peptide signal, PSY1 (Amano et al., 2007). In addition, accumulating sequence information on the precursor polypeptides for various small post-translationally modified peptides has enabled the prediction of genes encoding this type of peptide in silico (Ohyama et al., 2008). The common feature of known small post-translationally modified peptide signals is that they are encoded by multiple paralogous genes whose primary products are approximately 70- to 110-amino-acid, cysteine-poor, secreted polypeptides that share short conserved domains near the C-terminus (Figure 2). Indeed, this in silico gene screening approach has uncovered a peptide signal, CEP1 (Ohyama et al., 2008).
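The precursor criteria listed above (70 to 110 residues, cysteine-poor, secreted, and sharing a short conserved C-terminal domain with at least one paralog) lend themselves to a simple computational filter. The following is a minimal sketch, not the published screening pipeline: the cysteine cutoff, the 15-residue window, the 60% identity threshold, the gene names and the toy sequences are all assumptions chosen for illustration, and the secretion flag is assumed to come from a separate signal-peptide prediction.

```python
def cterm_identity(a, b, window=15):
    """Fraction of identical residues in the last `window` positions (ungapped)."""
    n = min(window, len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a[-n:], b[-n:])) / n


def screen_precursors(precursors, min_len=70, max_len=110, max_cys=1,
                      window=15, min_identity=0.6):
    """precursors maps gene_id -> (sequence, is_secreted).

    Keeps secreted, cysteine-poor sequences of the right length that share a
    similar C-terminal window with at least one other passing sequence.
    All thresholds except the length range are illustrative assumptions.
    """
    passing = {
        gene: seq
        for gene, (seq, secreted) in precursors.items()
        if secreted and min_len <= len(seq) <= max_len and seq.count("C") <= max_cys
    }
    hits = [
        gene
        for gene, seq in passing.items()
        if any(cterm_identity(seq, other, window) >= min_identity
               for name, other in passing.items() if name != gene)
    ]
    return sorted(hits)


# Invented toy sequences; the C-terminal motifs are not real peptide domains.
toy = {
    "geneX": ("M" + "A" * 64 + "KLDWNPEGHSAGRGH", True),               # paralog pair
    "geneY": ("M" + "S" * 70 + "KLDWTPEGHSPGRGH", True),               # paralog pair
    "geneZ": ("M" + "A" * 60 + "CACACACA" + "KLDWNPEGHSAGRGH", True),  # cysteine-rich
    "geneW": ("M" + "G" * 75 + "Q" * 15, True),                        # no paralog
    "geneV": ("M" + "A" * 64 + "KLDWNPEGHSAGRGH", False),              # not secreted
}

print(screen_precursors(toy))
```

An ungapped comparison of a fixed C-terminal window is a deliberate simplification; a real screen would more likely rely on alignment or motif-search tools to detect the conserved domain.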
Phenotypic analysis of loss-of-function mutants of post-translational modification enzymes also affords important information about the functions of post-translationally modified peptides. As post-translational modification enzymes recognize particular consensus sequences within multiple target peptides, loss-of-function mutations should be reflected as deficiencies in the biosynthesis of modified peptides. Thus, the presence of novel post-translationally modified peptide signals should be revealed through phenotypic analysis of such mutants. Indeed, a novel peptide signal involved in root meristem stem cell niche maintenance, RGF1, was successfully identified by a search for sulfated peptides that recover root meristem defects in the tpst-1 mutant (Matsuzaki et al., 2010). In this context, identification of hydroxyproline arabinosyltransferase, followed by phenotypic analysis of loss-of-function mutants of this enzyme, could provide a clearer picture of the functions of arabinosylated peptide signals in plants.
In addition to the above-mentioned approaches, single-cell whole-transcriptome analysis using next-generation sequencing techniques may greatly contribute to the identification of peptide signals that show cell-type-specific expression. The usefulness of single-cell whole-transcriptome analysis has already been demonstrated in the course of identifying the cysteine-rich peptide, LURE, which is secreted from two synergid cells within the female gametophyte and is involved in the final step of pollen tube guidance (Okuda et al., 2009). The development of novel ideas and techniques will form the basis for further research into plant peptide signals.
Our research was supported by a Funding Program for Next Generation World-Leading Researchers (NEXT Program) from the Japan Society for the Promotion of Science (JSPS) (No. GS025).