The interaction between plants and their pathogens is complex. Plant pathogens have evolved a broad set of proteins that enable a stealthy entry into the plant cell and facilitate the evasion of host defenses. Among other defenses, plants have evolved a series of proteins that monitor their cells for signs of infection. Downstream of these monitors is a signaling and response system triggered upon infection. The molecular basis of the host–pathogen interaction is now much better understood, as a result of the development of genomic data and tools. For example, the complete genomic sequence is available for a model plant, Arabidopsis, and for one of its bacterial pathogens, Pseudomonas syringae pv. tomato DC3000. Detailed molecular analyses of these two organisms have revealed much about plant defenses. Modern genomics tools, including applications of bioinformatics and functional genomics, allow scientists to interpret DNA sequence data and test hypotheses on a broader scale than previously possible.
The relationship between plants and their pathogens is a battle that has taken place over an evolutionary timescale. The metaphor of a biological “arms race” is often used to describe this relationship (Holub 2001). The idea of this metaphor is that one species evolves in response to changes in another competing species (Dawkins and Krebs 1979). For example, the evolution of enhanced virulence in a pathogen increases the selection pressure for resistance in the host plant. This can produce a cycle in which neither competitor retains the upper hand for many generations. Over time, this coevolutionary process leads to a specific and complex series of interactions between plants and their pathogens.
The study of these interactions has a long and rich history in science, with plant pathologists tackling these complex systems first with classical tools, such as physiology, histology, microbiology, and plant breeding and genetics, and more recently with advanced biochemistry and molecular biology approaches. In the last 5 to 10 years, many of the critical host proteins that detect the presence of pathogens have been characterized. Numerous components of the plant signaling system have also been identified that function downstream of the detection molecules. In parallel, the pathogen proteins that are used to suppress host defenses and drive the infection process (so-called effector proteins) have also been identified, using molecular biological technologies and genetics.
The most recent addition to plant pathologists' toolbox is genomics. The development and application of genomics methods have dramatically advanced researchers' understanding of plant–pathogen interactions. The details of the genomics tools are described below in the context of specific examples. First, we introduce and define several terms associated with genomics tools. The central pillar and namesake of genomics is the genome, the complete (or nearly complete) sequence of all of the chromosomes of an organism. Although genomic sequencing is complex and costly, the payoff is profound, because the data provide an encyclopedia of knowledge about the function and evolution of a particular organism. The first step after the assembly and completion of a genomic sequence is identification of the genes. This “annotation” step is performed with tools built by those working in the field of science known as bioinformatics, which applies computational algorithms and statistical methods to the analysis of biological data.
In the year 2000, the first plant genome was completely sequenced (AGI 2000). The plant species that was selected for sequencing is known as Arabidopsis (or mouse-ear cress), shortened from its full Latin name Arabidopsis thaliana. Arabidopsis is not a crop species, but it belongs to the Brassicaceae family, which includes oilseed rape, cabbage, mustard, turnip, and cauliflower. It was selected because it is very easy to grow in the lab, it has a relatively small genome, and it is the focus of study for a large number of researchers. We should note that a specific Arabidopsis ecotype called Col-0 was used for genomic sequencing (a strain collected in Columbia, Missouri, by George Rédei); throughout this paper, our discussion of Arabidopsis will refer to this ecotype.
The genomic sequence was also recently completed for the DC3000 strain of Pseudomonas syringae pv. tomato (abbreviated as PstDC3000), which causes bacterial speck disease on tomato and Arabidopsis (Buell et al. 2003). With the availability of the genomic sequence for both the host and the pathogen, scientists are poised to make tremendous advances in understanding the molecular basis of the interaction between these two organisms. Moreover, genomic analyses of this interaction can build on classical studies, because Pseudomonas has been well characterized using more traditional plant pathology approaches.
In the following sections, we discuss how genomic tools are being used to dissect the interaction between plants and their pathogens. We focus mainly on the interactions between PstDC3000 and Arabidopsis. The analysis of Pseudomonas–Arabidopsis interactions with genomic tools is leading scientists to a better understanding of the plant genes responsible for recognition of and defense against pathogens.
The concept of a gene-for-gene model
Genetic analyses of the interactions between flax (Linum usitatissimum) and flax rust (Melampsora lini) led Harold H. Flor to the concept of a gene-for-gene model. He found that the resistance of flax to a specific flax rust strain could be inherited monogenically by the next generation, and that the “avirulence” of flax rust to a specific flax variety is likewise monogenically heritable (Flor 1942, 1947). It is important to note that the terms avirulence and virulence have different meanings in the field of plant pathology than in other fields. In plant pathology, avirulence is the inability of the pathogen to infect, caused by the plant's detection of a pathogen-produced factor (e.g., avirulence proteins, also called effector proteins), and virulence is the ability to infect. Pathogens may be virulent for several reasons: for example, (a) the pathogen does not produce an avirulence protein, (b) the plant lacks a factor (e.g., a resistance protein, or R protein) that would enable it to detect the avirulence protein, or (c) the pathogen and plant, respectively, lack both the avirulence and the corresponding R proteins.
Figure 1 illustrates these scenarios using Arabidopsis with or without one of the resistance genes (R genes), called RPS2, and PstDC3000 with or without one of the avirulence genes (Avr genes), called avrRpt2. The avirulent interaction is shown in figure 1a, in which Arabidopsis produces the RPS2 protein that detects the AvrRpt2 protein of PstDC3000. This recognition event triggers Arabidopsis disease resistance responses, prevents the invasion of PstDC3000, and results in a healthy plant (figure 1a). The other three scenarios are indicated in figure 1b, 1c, and 1d, in which either Arabidopsis lacks the R gene or PstDC3000 lacks the Avr gene, or both. In each of these cases, Arabidopsis shows a susceptible or diseased phenotype because it fails to recognize and slow the incursion of PstDC3000.
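The four scenarios of the gene-for-gene model reduce to a simple logical rule: resistance results only when the plant carries the R gene and the pathogen carries the matching Avr gene. The short Python sketch below enumerates the combinations. The function and label names are illustrative assumptions, and the sketch deliberately ignores everything except the presence or absence of the two genes.

```python
from itertools import product


def interaction_outcome(plant_has_r_gene: bool, pathogen_has_avr_gene: bool) -> str:
    """Expected phenotype under the gene-for-gene model.

    Recognition (and therefore resistance) requires BOTH partners:
    the plant R gene (e.g., RPS2) and the pathogen Avr gene (e.g., avrRpt2).
    """
    if plant_has_r_gene and pathogen_has_avr_gene:
        return "resistant"    # avirulent interaction: pathogen is detected
    return "susceptible"      # virulent interaction: no recognition event


def all_scenarios() -> dict:
    """Map every (R gene present, Avr gene present) pair to its outcome."""
    return {
        (r, avr): interaction_outcome(r, avr)
        for r, avr in product([True, False], repeat=2)
    }


if __name__ == "__main__":
    for (r, avr), outcome in all_scenarios().items():
        plant = "RPS2" if r else "no RPS2"
        pathogen = "avrRpt2" if avr else "no avrRpt2"
        print(f"{plant:8s} x {pathogen:11s} -> {outcome}")
```

Only one of the four combinations yields a resistant plant, which is why loss of either gene is sufficient to produce disease.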
Overview of type III secretion system
The outer surfaces of plant tissues exposed to air are covered with waxes that form a cuticular layer. One purpose of this layer is to prevent the growth and penetration of pathogens. However, a potential chink in this armor is that plants must exchange gases (oxygen and carbon dioxide) through cellular vents called stomata. Bacteria are small enough to traverse the stomatal opening easily, allowing direct access to the intercellular space (the apoplast). From the apoplast, bacteria can directly contact the plant cells (figure 2b). Plant cells are rich in water and nutrients needed by the bacterial pathogens, but these potential resources are sequestered within a plasma membrane and rigid cell wall that serve as additional barriers to bacterial penetration. To overcome this obstacle, many bacteria have evolved a system that injects effector proteins directly into a host cell (figure 2c, 2d). These proteins are believed to modulate or suppress host defenses and force the host cell to leak water and nutrients. This system is used by both plant and animal pathogens.
The mechanism by which the bacterial effector proteins are injected directly into the host cell is known as the type III protein secretion system (TTSS). This system can be thought of as a tube or syringe from the bacterial cell into the cytoplasm of the plant cell (figure 2d). The proteinaceous components that make up the tunnel from pathogen to host are classified into two groups: Hrc (hypersensitive response [HR] and conserved) and Hrp (HR and pathogenicity). “Hypersensitive response” refers to the reaction of the plant upon successful recognition of the pathogen, a resistance phenomenon that is described below. Hrc proteins are the structural components of the portal spanning the inner and outer membranes of the bacterial cell (figure 2d). This structure secretes TTSS substrates such as effector proteins and Hrp proteins. Hrp proteins are the structural components of the Hrp pilus, the tubelike structure that is believed to penetrate both the plant cell wall and the plasma membrane (figure 2d; Roine et al. 1997). This structure serves as a conduit to transfer effector proteins into the host cell.
Genomic analysis of TTSS in PstDC3000
As a result of physiological analyses and molecular biology, we now know that the function of many, or perhaps all, Avr proteins is to suppress or block the host defense responses (Abramovitch and Martin 2004). One well-documented example is the function of the effector protein AvrPtoB. It has been shown that AvrPtoB suppresses HR in plants and induces susceptibility (Abramovitch et al. 2003). This finding is based on advanced molecular biology approaches, but the function of AvrPto was characterized using a so-called functional genomics approach (Hauck et al. 2003). Functional genomics is one of the branches of genomics, and it includes an extremely broad field of science. DNA microarray technology, which monitors gene expression levels, is one tool of functional genomics. The strategies and methodologies for DNA microarray technology have been reviewed elsewhere (Lockhart et al. 1996, DeRisi et al. 1997). Since the Arabidopsis genome has been completely sequenced, it is possible to array all or almost all of the 29,993 predicted Arabidopsis genes onto glass slides and monitor the expression level of all the genes in different tissues, in different developmental stages, or during specific treatments, such as disease or abiotic stress. Hauck and colleagues (2003) used this technology to analyze the gene expression profile of Arabidopsis that overexpresses avrPto. The results collected through DNA microarray experiments revealed that AvrPto represses a set of genes necessary for the cell wall–based defense mechanism in plants (Hauck et al. 2003).
The expression of avr genes as well as hrc and hrp genes is tightly regulated. It has been shown that HrpR and HrpS regulate the expression of the hrpL gene, which activates the downstream effector protein genes (Xiao et al. 1994, Grimm et al. 1995, Hutcheson et al. 2001). The promoter regions of these genes, which regulate gene expression, were compared, and a consensus sequence was identified from shared sequence motifs (the “hrp box” element; Innes et al. 1993). On the basis of this knowledge, several groups employed bioinformatics methods and identified genes with a hrp box regulatory element from the genomic sequence of PstDC3000 (Fouts et al. 2002, Zwiesler-Vollick et al. 2002, Buell et al. 2003). Although the specific bioinformatics algorithms used by the groups differ, the general idea is to identify stretches of nucleotide sequence that are statistically similar to the nucleotide sequence of the hrp box element. Genes regulated by this hrp box are then predicted to be expressed, and are presumably important, during the infection process.
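The general idea of such a promoter scan can be sketched in a few lines of Python. This is a minimal illustration and not the algorithm of any of the cited groups. It assumes a two-block consensus (GGAACC, a spacer of 15 or 16 nucleotides, then CCAC), which is a commonly quoted form of the hrp box but is not given in this article, and it substitutes a simple mismatch count for the statistical similarity measures used in the published analyses.

```python
# Assumed consensus (not from the text): two conserved blocks separated
# by a spacer of 15 or 16 unconstrained nucleotides.
LEFT_BLOCK = "GGAACC"
RIGHT_BLOCK = "CCAC"
SPACER_LENGTHS = (15, 16)


def count_mismatches(observed: str, consensus: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(a != b for a, b in zip(observed, consensus))


def scan_promoter(sequence: str, max_mismatches: int = 1) -> list:
    """Return sorted (start, spacer_length, mismatches) tuples for every
    window whose conserved blocks differ from the assumed consensus at
    no more than `max_mismatches` positions."""
    seq = sequence.upper()
    hits = []
    for spacer in SPACER_LENGTHS:
        width = len(LEFT_BLOCK) + spacer + len(RIGHT_BLOCK)
        for start in range(len(seq) - width + 1):
            left = seq[start:start + len(LEFT_BLOCK)]
            right = seq[start + len(LEFT_BLOCK) + spacer:start + width]
            mismatches = (count_mismatches(left, LEFT_BLOCK)
                          + count_mismatches(right, RIGHT_BLOCK))
            if mismatches <= max_mismatches:
                hits.append((start, spacer, mismatches))
    return sorted(hits)


if __name__ == "__main__":
    # Made-up upstream region containing one exact match at position 3.
    toy_promoter = "TTT" + "GGAACC" + "A" * 15 + "CCAC" + "TTT"
    print(scan_promoter(toy_promoter))
```

A genome-wide version would apply the same scan to the region upstream of every annotated gene and keep the genes with at least one hit as candidates for experimental follow-up.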
Using their bioinformatics method, Zwiesler-Vollick and colleagues (2002) identified 73 genes that have a hrp box element in the promoter. Among these 73 genes, their method successfully identified 11 of the 12 known TTSS-associated genes; this suggests that their method is highly sensitive. The authors generated a custom DNA microarray, which has the identified genes, and monitored the expression of these genes under hrp-inducing conditions. Six genes, in addition to the known TTSS-associated genes, were highly expressed in this experiment (Zwiesler-Vollick et al. 2002). The authors then tested whether these six genes are real type III effectors, using the C-terminal end of the AvrRpt2 protein, which is important for recognition by the plant RPS2 protein in intercellular interactions. The six genes were fused to the C-terminal end of AvrRpt2 and transformed into PstDC3000 (Zwiesler-Vollick et al. 2002). The transformants were infiltrated into RPS2-expressing Arabidopsis leaves, and the HR was observed; this response was interpreted as a demonstration that the fusion protein was secreted into the plant cell by the TTSS and that the C-terminal end of AvrRpt2 triggered the plant HR (Axtell et al. 2001). The authors had successfully identified a novel hrp-regulated gene in their elegantly planned functional genomics experiment.
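The "11 of 12" figure is a statement about recall (sensitivity): the fraction of already-known genes that the scan recovered. It does not by itself indicate how many of the remaining candidates are genuine, which is why the expression and secretion experiments were needed. A minimal sketch of the arithmetic, using only the counts given above (the variable names are illustrative):

```python
known_ttss_genes = 12   # previously characterized TTSS-associated genes
known_recovered = 11    # of those, the number found by the hrp box scan
total_candidates = 73   # all genes flagged by the scan

# Recall: share of the known genes that the method rediscovered.
recall = known_recovered / known_ttss_genes

# Candidates not in the known set; their status is unresolved until tested.
untested_candidates = total_candidates - known_recovered

if __name__ == "__main__":
    print(f"recall = {recall:.1%}")                       # 91.7%
    print(f"candidates needing follow-up = {untested_candidates}")  # 62
```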
The TTSS and its protein components have now been identified in numerous bacterial pathogens of animals and plants (Cornelis and Van Gijsegem 2000). This suggests that either this system has been maintained through a long evolutionary history, or it was acquired and adopted by diverse bacterial pathogens. Either way, the TTSS has proved such an effective way to penetrate diverse host defense systems that it is now widely used among bacteria. Although the protein components of TTSS are relatively well conserved among different species of bacteria, the effector proteins are more specific to each host–pathogen interaction.
The Arabidopsis defense system: An inducible hypersensitive response
Like the pathogen infection strategies, plant defense mechanisms are multilayered and complex. One of the most effective defense mechanisms against biotrophic pathogens is the inducible HR (Hammond-Kosack and Jones 1996). This response is easily recognizable to the naked eye as an outbreak of small spots on the leaves. The HR consists of localized cell death of the infected cell and the rapid collapse and death of surrounding tissue. This response effectively cuts off the pathogen from living, healthy tissue that it normally uses as a nutrient source. In the absence of these nutrients, the proliferation of the pathogen is halted. With the loss of just a small group of cells surrounding the pathogen, the plant has prevented systemic infection.
Molecular genetic experiments have shown that the HR occurs when plant R proteins recognize the presence of specific effector molecules (e.g., Avr proteins; figure 2c). In this situation, the pathogen is avirulent on the plant host. As mentioned in the section describing the gene-for-gene model, this recognition event is very specific, and if either the pathogen or the host lacks the corresponding Avr or R gene, no HR results. A recognition event triggered by the R gene activates the HR at and around the infection site. The HR not only deprives pathogens of nutrient sources but also triggers a wide variety of defense responses, which include systemic acquired resistance (SAR; figure 2c). SAR serves as a warning throughout the plant of the invasion by a pathogen, and it elevates the plant's defense responses in tissues distal to the site of infection. Compared with the HR, SAR is long lasting and helps to protect plants from numerous species of plant pathogens.
A complex genetic network regulates both HR and SAR (Lam et al. 2001, Belkhadir et al. 2004, Durrant and Dong 2004). During the last 10 years, a large number of the R genes and the genes that regulate this signaling network have been isolated by cloning. The cloned genes were originally identified as traits resulting from artificially induced mutations or as natural variants among Arabidopsis populations. For example, the RPS2 gene was cloned by taking advantage of a single-gene mutation and a naturally susceptible isolate called Wu-0 (Kunkel et al. 1993, Bent et al. 1994). The rps2 mutant was identified in a genetic screen as a loss-of-function mutation in a single gene. This screen was performed as follows: First, resistant seeds were treated with the chemical mutagen EMS (ethyl methanesulfonate); this treatment randomly creates mutations in a small proportion of the genes in each plant. To select the few plants with loss-of-function mutations in the R gene, thousands of EMS-treated seeds were grown and infected with the PstDC3000 bacteria carrying the avrRpt2 gene. Plants showing a susceptible phenotype were selected as the induced mutants; this altered response was genetically mapped, and the gene containing the mutation was cloned (Kunkel et al. 1993, Bent et al. 1994). Using this approach and others, many genes have been cloned that, when mutated, demonstrated altered responses to pathogens. By and large, these genes have been receptor-like R genes, with a smaller subset encoding downstream signaling proteins. The identification of this set of genes involved in the signaling pathways has led to the partial deciphering of signal transduction circuitry. These networks are complex, and scientists have not yet identified the complete set of genes, interactions, and regulatory systems involved. Several recent reviews provide more details on these downstream genes and signaling systems (Nimchuk et al. 2003).
Structure and function of R genes
At least 30 R genes have now been cloned from a variety of plant species. The majority of these genes encode similar proteins, although these proteins function to detect pathogens as diverse as bacteria, viruses, nematodes, fungi, oomycetes, and insects. Comparisons of the cloned R genes have revealed that most encode proteins characterized by two domains: a nucleotide-binding site (NBS) and leucine-rich repeats (LRR; figure 3). The NBS domain, located approximately in the middle of the R protein, comprises several conserved amino acid motifs that are believed to function in ATP hydrolysis (Tameling et al. 2002). The NBS may also act to transfer a signal to the next protein or proteins in one or more defense systems. RPM1, one of the best-studied R genes, recognizes the presence of the AvrRpm1 or AvrB effector protein (Bisgrove et al. 1994); interestingly, RPM1 is one of the few R genes that have been shown to recognize two different effector proteins. Tornero and colleagues (2002) conducted a large-scale mutation screen of RPM1 that confirmed the importance of the NBS domain for disease resistance. This study used an elegant inducible avrRpm1 expression system in Arabidopsis, the sophistication of which is best appreciated by reading the original paper. Ninety-five RPM1 mutants were isolated, and each mutation site was identified (Tornero et al. 2002). The HR phenotype of each mutant was closely observed, and these data were used to dissect the relationship between the phenotype and the site of each mutation within the protein (Tornero et al. 2002). As the authors expected, the mutations in the NBS domain correlated with the loss of HR, indicating the importance of the NBS domain in the recognition of the foreign effector proteins.
The LRR of most R proteins contains 20 to 25 copies of a 25-to-30-amino-acid repeat found at the C-terminal end of the protein. Since LRR domains have been shown to mediate protein–protein interactions in many species (e.g., yeast, human, Drosophila), this domain is predicted to modulate direct or indirect interactions between the R protein and its corresponding effector molecule. Inferences from structural data suggest that the LRR forms a binding surface on which the protein–protein interactions take place (Kobe and Deisenhofer 1994). Compelling data support this idea (Jia et al. 2000, Dodds et al. 2001). However, the large-scale mutation analysis of RPM1 performed by Tornero and colleagues (2002) showed an unexpected result: the authors observed that the occurrence of loss-of-function mutations in the LRR domain was relatively low. This result suggested that the LRR domain of RPM1 may not be critical for the specific protein–protein interaction, or the interacting surface of the LRR may be large enough that no single mutation substantially affects recognition. Recently, researchers identified a protein called RIN4 that interacts with both RPM1 and AvrRpm1 (Mackey et al. 2002). This exciting breakthrough has been summarized elsewhere (Ellis and Dodds 2003, Marathe and Dinesh-Kumar 2003). In the original paper (Tornero et al. 2002), the authors suggested that RPM1 does not interact with RIN4 via the LRR domain and concluded that the region between the N-terminal end and the NBS domain is necessary for the interactions.
A detailed examination of the Arabidopsis genome using bioinformatics tools and molecular biological inferences identified 149 NBS–LRR-encoding genes. These genes are roughly grouped into two classes based on the conserved domain in the N-terminal region of the gene products (figure 3). Fifty-five NBS–LRR genes encode an N-terminal coiled-coil (CC) domain, and the remainder (94 genes) encode an N-terminal Toll/interleukin-1 receptor (TIR) domain. Although the function of the N-terminal domain in R proteins is still not clear, it has been suggested that this domain is necessary for the proper folding or regulation of the NBS–LRR proteins. Data suggest that in a folded state, the protein represses signaling activity, and once the plant recognizes a pathogen attack, the NBS–LRR R protein is unfolded and becomes active (Hwang and Williamson 2003, Zhang et al. 2003). The role of these folding activities has been reviewed elsewhere (Belkhadir et al. 2004).
The CC, TIR, NBS, and LRR domains play key roles in plant defenses. However, detailed structural analysis of R protein with bioinformatics tools suggests the existence of additional and potentially important domains in a subset of R proteins (Meyers et al. 2002, 2003). Therefore, the application of a broad range of functional genomics tools, as well as new and rapidly developing methods such as proteomics, will be necessary for a complete understanding of how R proteins function. Although a discussion of proteomics is beyond the scope of this article, advances in this area relevant to plant biology have been reviewed recently (Provart and McCourt 2004, Sappl et al. 2004).
Evolution of R genes in Arabidopsis
Evolutionary inferences about R genes have also been refined on the basis of the complete genome sequence of Arabidopsis. Prior work had proposed several models for the evolution of genes for disease resistance in plants. These models are based on molecular and classical genetic studies as well as on data from other organisms; they were particularly influenced by models of the evolution of the vertebrate immune system. A central tenet of these models is that over evolutionary timescales, duplication events resulting from recombination can lead to the creation of novel resistance specificities. After gene duplication, one copy retains the original function, freeing the duplicate from selection pressure and allowing it to mutate and diverge. This can create complex arrays or families of genes, some of which encode functional R genes, while others represent stores of unused diversity that may mature over evolutionary time into a novel resistance specificity (Michelmore and Meyers 1998). Comparisons of the chromosomal positions of closely related NBS–LRR-encoding genes have revealed that duplication events played a major role in the expansion of this gene family.
Our understanding of R-gene evolution has also been affected by detailed analyses of the LRR regions. These studies revealed that LRR sequences are hypervariable compared with the other domains in the NBS–LRR proteins. These data are now supported by an analysis of the entire family of NBS–LRR proteins encoded in the Arabidopsis genome (Mondragon-Palomino et al. 2002). The diversity in the LRR, combined with the knowledge that this domain interacts with other proteins, has led various authors to suggest that mutations in this domain are important for the creation of novel NBS–LRR proteins. Presumably, natural selection could act on variants in the LRR sequences to improve recognition of effector proteins that are also evolving in the pathogens. This would represent an “arms race” between the host and pathogen, although we should emphasize that there are other models for R-gene evolution with strong experimental support (Holub 2001). The data and models derived from genomic comparisons are compelling; however, these inferences are retrospective, as we have yet to catch evolution in the act of generating a completely novel resistance gene specificity.
Genomic analysis of Arabidopsis disease resistance signaling
Although traditional genetic approaches have successfully identified many R genes and several key genes in SAR, these approaches have been only partially successful in the identification of the proteins involved downstream in signaling. Transcriptional profiling technologies such as microarrays may be one of the best tools currently available to address this question. As described above, with current microarray platforms, it is possible to monitor the expression profile of all Arabidopsis genes. From these data, patterns of expression can be identified that correlate with specific treatments, biochemical pathways, or signaling events. Microarrays provide a means of transcriptional profiling that is both rapid and relatively inexpensive.
Although microarrays and other technologies have improved substantially in recent years, it is not trivial to obtain gene expression profiles. Furthermore, the real difficulty lies not in the generation of the data but in the analysis. The appropriate statistical treatment of the data is critical. Many bioinformatics tools utilizing novel statistical algorithms have been developed for the analysis of microarray data (Eisen et al. 1998, Alter et al. 2000). Maleck and colleagues (2000) applied different algorithms to 14 gene expression profiles of Arabidopsis under SAR-inducing and SAR-repressing conditions. Their analysis allowed them to identify approximately 300 genes that were differentially expressed (Maleck et al. 2000). With these types of data sets growing in availability, the next step in genomic analysis of plant–pathogen interactions is to develop models of the genetic regulatory networks and circuitry. These models must accurately capture and describe all experimental data to infer how specific gene products regulate one another. This is by far the most challenging step, although substantial progress has been made in the last few years (Liao et al. 2003, Segal et al. 2003). Modeling of the signal circuitry is now being applied to Arabidopsis responses to pathogens (Katagiri and Glazebrook 2003, Agrawal et al. 2004). The next obvious step is to test the models with specific biological experiments. For example, responses can be tested under various treatments (e.g., inoculation with PstDC3000) using mutants that disrupt important signal convergence points or nodes in the model. This strategy may be straightforward if the model is linear and consists of relatively few genes. However, models are not usually linear, containing feedback loops and cross-talk among pathways, and tens or even hundreds of genes may be involved. 
Therefore, it is still too early to properly assess the utility of these models and the depth to which the complexity of the plant disease resistance response has been captured.
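Two elementary calculations underlie much of the expression-profile analysis described above: converting paired intensity measurements to log ratios, and grouping genes whose profiles rise and fall together across conditions. The Python sketch below illustrates both with made-up numbers and hypothetical gene names. It uses Pearson correlation as the similarity measure, a common choice in clustering of microarray data, and is an illustration only, not a reconstruction of the analyses in the cited studies.

```python
import math


def log2_ratios(treated, control):
    """Per-condition log2(treated / control) for one gene.
    Positive values mean induction; negative values mean repression."""
    return [math.log2(t / c) for t, c in zip(treated, control)]


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles.
    Assumes neither profile is constant (nonzero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def coexpressed_with(profiles, reference, min_r=0.9):
    """Names of genes whose profile correlates with the reference gene
    at or above `min_r` (an arbitrary illustrative threshold)."""
    ref_profile = profiles[reference]
    return sorted(
        gene for gene, profile in profiles.items()
        if gene != reference and pearson(profile, ref_profile) >= min_r
    )


if __name__ == "__main__":
    # Toy log2 ratios across four hypothetical treatments.
    toy_profiles = {
        "geneA": [0.1, 1.2, 2.0, 2.1],
        "geneB": [0.0, 1.0, 2.2, 2.0],   # tracks geneA
        "geneC": [2.0, 1.1, 0.2, 0.0],   # opposite trend
    }
    print(coexpressed_with(toy_profiles, "geneA"))
```

Groups of co-expressed genes found this way are candidates for shared regulation, which is the starting point for the network models discussed above; correlation alone does not establish which gene regulates which.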
The recent addition of genomics to the set of more traditional experimental tools has dramatically increased our understanding of plant–pathogen interactions. This review has focused on the Arabidopsis–Pseudomonas interaction, but this is just one of several well-studied plant–pathogen interactions. The lessons learned and information gathered from model systems can be applied to agronomically important crops; this approach has proven to be an effective way to understand diverse interactions between plants and their pathogens. The use of model systems has grown, such that plants other than Arabidopsis are being used as models because of their specialized characteristics. For example, rice has been championed as a model for genomic analyses of the many grass species that are economically important (maize, wheat, barley, sorghum, etc.). Rice is of critical economic importance worldwide, but it also has a small genome, which is in the final stages of sequencing. Even though its genomic sequence has not yet been completed, comparative genomic approaches have been used to identify and characterize NBS–LRR R genes in rice (Bai et al. 2002, Monosi et al. 2004, Zhou et al. 2004). These analyses have identified approximately 500 NBS–LRR-encoding genes in rice, suggesting that its set of pathogen receptors may be somewhat more elaborate than that of Arabidopsis. Although the majority of these genes in rice are similar to Arabidopsis R genes, substantial differences have been observed in the composition of subgroups within the larger NBS–LRR family (Bai et al. 2002, Meyers et al. 2002, 2003). Because Arabidopsis and rice have diverged over many millions of years, these comparative analyses have suggested that the function of the plant defense systems is generally similar from species to species, although the specific details vary. 
This variation is probably due to the types of pathogens that infect the plant, as well as environmental and developmental differences in the plants. Understanding the nature of both the similarities and the differences among distinct plant–pathogen interactions is likely to keep molecular biologists busy for many years to come.
Although we may have a long way to go before we fully understand plant defense mechanisms, there is no question that genomics will play a major role in achieving this goal. The long-term implications of deciphering the molecular basis of plant defenses are profound. Currently, chemical pesticides are widely used to contain the damage caused by plant pathogens of agriculturally important plants. A switch to a greater reliance on genetically encoded defenses, such as the plants' natural defense systems described here, could be highly effective in reducing crop losses while offering numerous advantages over the application of chemicals. Such a switch would reduce the cost of crop protection and diminish the need for petroleum-based pesticides. Ultimately, molecular biologists would like to understand the function of NBS–LRR proteins such that proteins could be synthesized de novo to identify and defend against novel pathogen effector proteins. This could ultimately tip the balance in favor of crop plants and make it possible to more effectively limit the damage and losses caused by pathogens.
Andrew Bent at the University of Wisconsin kindly provided the scanning electron microscope image used in figure 2b. Financial support was provided by awards to B. C. M. from the University of Delaware Research Foundation and the National Science Foundation Plant Genome Research Program.