Before the availability of digital photography resulting from the invention of charged couple devices in 1969, the measurement of plant architecture was a manual process either on the plant itself or on traditional photographs. The introduction of cheap digital imaging devices for the consumer market enabled the wide use of digital images to capture the shape of plant networks such as roots, tree crowns, or leaf venation. Plant networks contain geometric traits that can establish links to genetic or physiological characteristics, support plant breeding efforts, drive evolutionary studies, or serve as input to plant growth simulations. Typically, traits are encoded in shape descriptors that are computed from imaging data. Skeletons are one class of shape descriptors that are used to describe the hierarchies and extent of branching and looping plant networks. While the mathematical understanding of skeletons is well developed, their application within the plant sciences remains challenging because the quality of the measurement depends partly on the interpretation of the skeleton. This article is meant to bridge the skeletonization literature in the plant sciences and related technical fields by discussing best practices for deriving diameters and approximating branching hierarchies in a plant network.
The advent of digital imaging accelerated the development of computational methods to digitally reconstruct the shape of plant networks from imaging data. From reconstructions, detailed measurements such as network width or diameters of network links can be derived for a plant network. The importance of reconstructions and measurements (Godin, 2000; Spalding and Miller, 2013) is founded in the many known or hypothesized links between plant network properties and actual functions of plants (Brodribb and Feild, 2000; Enquist, 2002; Lynch, 2011; Lynch and Brown, 2012). In particular, roots system architecture (Lynch, 1995; de Dorlodot et al., 2007), tree crown architecture (Delagrange et al., 2004), or leaf vein structures (Roth-Nebelsick et al., 2001; Mason et al., 2013; Price et al., 2013) are examples of plant networks. Measurements of such networks are analyzed in the context of genotype to phenotype mapping (El-Soda et al., 2014) as well as breeding efforts (Holland et al., 2003; Finkel, 2009), evolutionary studies of plant species (Green and Hickey, 2005; Fernandez et al., 2010; Feild et al., 2011), or plant growth simulation (Vansteenkiste et al., 2014).
Images encode the shape of plant networks such as roots, tree crowns, or leaf venation. Most commonly, images are digitally represented in a two- or three-dimensional raster of pixels or of voxels, respectively. Some imaging techniques produce point clouds of 3D coordinates that often employ no neighborhood information between the points. Recently, the combination of imaging data with skeletonization algorithms became popular as a tool to extract the hierarchies and geometric measurements of plant networks. Skeletons consist of a collection of (approximated) curves being mostly centered within the plant network of interest (Fig. 1). In this article, I discuss the fundamental characteristics of imaging data in the context of studying plant networks and give an overview of existing skeletonization concepts. Within the skeletonization overview, I highlight concepts that were applied to plant networks. In the remainder of the article, I introduce two basic skeleton concepts in detail and demonstrate their applicability to plant networks on selected examples.
THE BOUNDARY CONCEPT AND OCCLUSION IN IMAGING DATA
Before applying skeletonization algorithms, it is important to understand the intrinsic constraints of imaging data. Imaging technologies capture data about an object surface and/or interior within the specification of an imaging sensor. For example, a digital photo camera collects the light reflected from surfaces that face the photo sensor. At each location of the photo sensor, a value for the received wavelength is stored as an electrical potential or as the number of photons counted within a certain time frame. The stored wavelength corresponds to colors that we recognize in the pixels of the image. In the case of a plant network, this physical process of capturing light is a projection of the plant network onto a plane. The result is a 2D image that encodes the form of a plant network. For leaf venation networks, this projection does not cause any self-occlusion (Fig. 2). However, the projection onto the 2D plane comes at a cost for other plant networks such as heavily branched tree crowns or dense roots. In both cases, branches will occlude parts of the tree crown (Fig. 2) or rootstock. As a result, the digital photograph misses network information at all self-occluded locations. Similar to recording reflections of light, it is also possible to record absorption rates of wavelengths that penetrate the plant network.
3D imaging technologies improve upon the occlusion problem. In general, 3D data are obtained from the reflection of waves from the plant surface or from waves penetrating the plant network such that the absorption rate varies through different tissues. Technologies relying on the reflection of light from a surface include structured light methods that project a known pattern onto the plant network and calculate 3D information from the distortion of the projected pattern (Salvi et al., 2004). Another technology is terrestrial laser scanning (Vosselman and Maas, 2010) that relies on the reflection of laser beams from the surfaces. The round-trip time of the laser from the sensor to surface determines the distance between sensor and surface because of the constant speed of light. In practice, occlusion effects still occur as missing data, but can be minimized by using registration techniques to combine different views of the plant network (Bucksch and Koshelham, 2013). The benefit of the obtained 3D information is that distribution of the network in 3D space allows the use of prior knowledge, such as assuming allometric relations in a branching structure (Aiteanu and Klein, 2014), to recover an approximation of the network. Ideally, the occlusion problem is resolvable with imaging technologies that penetrate the plant network, such as X-ray (Mairhofer et al., 2013) or magnetic resonance imaging (Schulz et al., 2013), that record changing absorption rates in different plant tissues. As a result, not only the outer surface, but also the internal architecture of the plant is resolvable. The important commonality between all these imaging technologies is that the plant network is represented as a boundary between itself and the free space surrounding the object in 2D and 3D. Or, more intuitively spoken, the boundary of a 3D plant network separates its enclosed volume from its “outside” environment. Data representations that just encode the plant network and the “outside” are called binary, because they assign a 1 to every plant network location and a 0 to every background location. In most application scenarios, we can safely assume the boundary of a plant network to be smooth, i.e., the boundary has no sharp corners or breaklines.
Once we understand the concept of a boundary, we can observe that skeletons are derived from this interface between the plant network and environment. Thus, a skeleton representation is computed on the basis of the boundary and, therefore, a skeleton represents boundary properties rather than the hierarchies and extensions of biological interest.
SKELETON CONCEPTS AND PLANT APPLICATIONS
Current literature reviews a plethora of skeleton concepts to be computed from 2D binary images (Lam et al., 1992), 3D images of voxels (Sobiecki et al., 2014), and point clouds (Cornea and Min, 2007; Bucksch, 2011). Two basic skeleton concepts used today are the medial axis (Lee, 1982) and the Reeb graph (Biasotti et al., 2008) shown in Fig. 3. The computation of the medial axis can be performed on 2D binary images (Serra, 1986), 3D voxels (Palágyi, 2002), and point clouds (Amenta et al., 2001). A noteworthy extension is the Hamilton-Jacobi skeleton (Siddiqi et al., 2002), which enhances the medial axis with a flux equation to achieve more robustness to noise. In recent years, the field of computer graphics developed several new skeleton concepts that make use of prior knowledge of the plant network. Rotational symmetry axis (ROSA) assumes rotational symmetry around the skeletal curves to enhance robustness against missing data from point clouds (Tagliasacchi et al., 2009). The assumption of a tree structure to extract a skeleton from point cloud data was introduced in Livny et al. (2010) for trees and has been applied to plant networks of grape clusters (Schöler and Steinhage, 2012; Steinhage et al., 2012). A very recent concept is the L1-medial skeleton (Huang et al., 2013), which shows promising results on point clouds of plants.
The use of skeletonization algorithms to obtain quantitative measurements of plant networks was first applied on tree crowns to estimate the radiative properties by simulating the foliage and fine branches from laser data (Cote et al., 2009) and to derive branch hierarchy information along with length and diameter distributions (Bucksch and Fleck, 2011; Delagrange et al., 2014). Leaf veins were quantified with skeletons to obtain estimates of areole measures and hierarchies (Price et al., 2011, 2013, 2014; Price, 2012). A recent article introduced a framework to identify legumes on the basis of leaf vein features (Larese et al., 2014). The branching structure of roots is another plant network that is often assessed with skeletons to obtain measurements of phenotypic traits (Iyer-Pascuzzi et al., 2010; Clark et al., 2011; Galkovskyi et al., 2012; Pound et al., 2013) to be related to regions of the genome (Topp et al., 2013). Pound et al. (2013) introduced a skeletonization concept for plant phenotyping that uses a probability field computed from the gray values of a 2D image to grow a connected skeleton of young and sparse roots. The growth is initiated at a user-defined location to the root tips. Free software exists that allows manual enforcement of skeleton centeredness and correction of miscomputed branching hierarchy. The references for tree crowns are examples of successful applications developed on the basis of a Reeb graph. Except for Pound et al. (2013), the chosen examples for leaves and roots apply to the medial axis.
Medial axis —The medial axis (Fig. 3) was introduced by Harry Blum (Blum, 1967) and was inspired by line-like structures created by the process of extinguishing fires. Simply imagine a patch of dry grassland, the whole border of which is ignited at one moment in time. If the fire propagates inward at a constant speed, then the fire fronts meet at the location of the medial axis. Although there is no generally agreed-upon definition of the medial axis (Attali et al., 2009), we can use here a simplified definition that can be visually followed in Fig. 3 using the same variables. The medial axis MA of a plant network M with a smooth boundary is the set of points inside the root network that consists of centers (ci ) of circles in two dimensions or spheres in three dimensions, whose boundary touches in at least two points (ti,j and ti,j+1 ) the boundary of the plant network. Here, touching is used as a synonym for the circle/sphere being tangent to the boundary of M.
The definition of the medial axis implies that each local change of the boundary changes the medial axis. Such changes correspond to locally occurring/vanishing maxima, minima, or saddle points on the boundary. Consequently, additional branches occur in the medial axis that may not represent the actual network. Reasons for additional branches include noise in the imaging data and tissue growth that results in a locally bulging boundary (Fig. 4). As a result, direct biological interpretation from the medial axis skeleton is problematic. A simple but efficient correction that makes use of a property of the medial axis is to remove branches that are shorter than the known radius of a circle/sphere at a branching point plus a user-defined threshold. In some applications, a threshold might be hard to define, such that methods to prune the medial axis have to be applied (e.g., Bai et al., 2007).
One problem arising in 3D applications of the medial axis is the occurrence of planes. For many plant applications, this problem is irrelevant, because plant organs are often cylindrical or planar and therefore result in curves. However, there are solutions to reduce these planes to curves (Palágyi and Kuba, 1998; Gorte and Pfeifer, 2004).
Reeb graph —The Reeb graph (Fig. 3) is a skeletonization concept introduced in 1946 (Reeb, 1946) and used to describe the change of an object shape in relation to a function expanded over the object. A simple way to imagine a Reeb graph is to think of a banana sliced—lengthwise—by a knife. The path of the knife along the banana can be thought of as the function that is expanded over the banana. The centers of the cutting area through the banana are the points on the Reeb graph that get connected to the center of the next cutting area in direction of the path. The following formal definition can be graphically followed in Fig. 3. Given a plant network M with smooth boundary and a piecewise linear function F, the level set at a value i is defined as the set of points in M with function value equal to i. As a practical example, if F is the height function whose function values are the values along the y-axis of a Cartesian coordinate system, then the level sets are simply the intersections when slicing through the object at intervals of size s orthogonal to the y-axis. To obtain the Reeb graph, we analyze the change of the connected components of the level sets of F, called contours. The Reeb graph RG of F is obtained by contracting the contours of F to points. For plants, this point can be chosen as the center of the contour, because most contours are convex. The center points of the contours assemble the vertices of the Reeb graph, which are connected via edges to the “closest” vertices obtained at the next function value i+s. For easier understanding, the given definition includes the embedding of the graph into the plant network. The original literature on Reeb graphs uses an abstract definition, which only defines the merge and split of contours free of coordinates. An example of the introduced construction is shown in Fig. 5 with the same binary images used in Fig. 4. Additionally, Fig. 5 demonstrates the robustness of the Reeb graph to small changes on the boundary.
The contours obtained by using height function are horizontal slices through the plant network. Problems can arise because the height function is not rotation invariant. To understand the potential problems, consider the Reeb graph of a vertical trunk of a plant network with a child branch emerging at a 90° angle from a parent branch (Fig. 6). At the level of the child branch, the contours include both the child and parent branch. Hence, only one contour for parent and child branch is extracted. As a result, consecutive center points appear to be “far from each other” or “shifted.”
A different choice of the function F can result in different Reeb graphs that can overcome the problems of the height function. Several solutions have been suggested to choose a suitable function F. Known solutions include the choice of different global functions (Biasotti et al., 2000) or determining a suited function locally (Biasotti et al., 2008; Bucksch et al., 2010), as shown in Fig. 7.
APPROXIMATING THE BRANCHING HIERARCHY OF A PLANT FROM REAL DATA
In Fig. 8, the medial axis and its segmentation of a cowpea root system was computed from a 2D image. The segmentation allows the detection of branches that do not represent the actual plant architecture. The segmentation was achieved by tracing the shortest path from the tip of the medial axis closest to the top row of the image to every root tip that is represented in the medial axis. The tracing resolves approximately for the loops resulting from the projection of the three-dimensional arrangement of roots to the image plane. Note that this strategy only applies to sparse networks, where branches do not occlude each other.
Figure 9 shows a 3D skeleton of a tree crown. Here, the skeleton is represented as a graph embedded into the original point cloud of a tree. All vertices with one incident edge that connect to vertices with three or more incident edges are removed from the originally extracted graph. The resulting skeleton approximates the actual branching structure of the tree crown (Bucksch, 2011).
SKELETON CENTEREDNESS USED TO DERIVE DIAMETERS IN PLANT APPLICATIONS
For plant networks, noncentered skeleton parts occur at locations where a child branch emerges from a parent branch (Fig. 7 and Fig. 10). Noncenteredness is implicit because a skeleton is a one-dimensional representation of a plant network that expands in two or three dimensions. This expansion is what we visually recognize as thickness and can be seen as the result of the expansion process of a growing one-dimensional curve (Bucksch et al., 2014). Therefore, every connected skeleton of a branching plant network travels from a branching point through a region of “thickness” before it reaches the location where the branch emerges from the boundary (Fig. 10). As a result, the skeleton part between the branching point and the point of branch emergence on the boundary is not centered. In the case of the medial axis, we can simply subtract the radius of the medial circle at the branching point from the emerging skeletal branch. In the case of the Reeb graph, we may identify jumps in the size of the contour on suited data. As a result, we obtain an approximation of the point of branch emergence on the boundary. Computing the point of branch emergence corresponds closely to measurements that can be obtained manually on the real plant, e.g., with a measuring tape, when measuring along the surface of the branch from the start of a branch on the trunk surface to the tip. For the Reeb graph, we can detect such regions as “long edges.” Note that vertices are favorable locations to derive measurements from the plant network because the graph can maintain centeredness for its vertices while the edges are not necessarily centered.
A QUANTITATIVE EXAMPLE OF SKELETON-DRIVEN TRAIT COMPUTATION
In the following, I apply the introduced skeleton concepts to images from a freely available rice root data set used in Iyer-Pascuzzi et al. (2010). The data set consists of images that show the root from several side views. I selected one image per individual root of the Caipo and Bala genotypes. Each image was selected such that occlusion of roots is visually minimized, which corresponds to the view of maximal root width. Overall, I used nine images of the Caipo data set and eight images of the Bala data, shown in Fig. 11. Both data sets were imaged with the same imaging setup such that results measured in pixels are comparable. The origin of the image is set to be the upper left corner. The absolute value of the negative y-direction corresponds to the row numbers.
The three example traits—maximum tip diameter, root width, and rooting depth—were computed for the selected Bala and Caipo images. Caipo and Bala were chosen because they visually contrast with each other in these traits (Fig. 11). A tip on the medial axis is defined as a pixel with only one neighbor and the diameter is known from the corresponding medial circle. To estimate root width, I used the difference between maximal and minimal column values occupied by the skeleton. Rooting depth corresponds to the maximal row number reached by one of the tips. I also applied noise filtering with a threshold of two pixels to eliminate branches resulting from noise as described above.
Caipo shows a maximum tip diameter of 3.9 pixels with a standard deviation (SD) of 0.57. Bala resulted in a maximum tip diameter of 4.4 pixels (SD = 0.50). A t-test performed on the two measurement sets resulted in a P = 0.13 and suggests that the maximum tip diameter differentiates both genotypes. Furthermore, I extracted estimates of the two known traits—root width and rooting depth. Root width resulted in an average of 607.11 pixels (SD = 224.90) for Caipo and an average of 504.38 pixels (SD = 66.14) for Bala. Although the t-test resulted only in P = 0.23 for the small set of test images, the genotypes differed strongly in their standard deviation. The strongest differentiation (P < 0.0001) between the two genotypes was achieved with the rooting depth. Rooting depth of Caipo was 1317.80 pixels (SD = 151.35) on average. In comparison, the rooting depth of Bala was about half as deep as the rooting depth of Caipo, resulting in an average of 625.50 pixels (SD = 181.62). I used an early version of the Digital Imaging of Root Traits (DIRT) platform ( www.dirt.biology.gatech.edu) to compute the traits from the medial axis. The computation results are included in Appendix S1 (apps.1400005_s1.xls).
The advancement of image processing algorithms has resulted in a wide variety of software tools throughout biology (Eliceiri et al., 2012) and the plant sciences in particular (Lobet et al., 2013). In addition to end-user software, open-source libraries for scripting languages such as MATLAB or Python (Coelho, 2013; van der Walt et al., 2014) offer access to skeletonization methods. In particular, Python and its scientific libraries have become widely used in interdisciplinary research and education (Deutsch, 2014). Software libraries enable researchers to adapt skeletonization methods to their specific applications. Specific application needs can often be solved via open community platforms such as MATLAB Central, stack overflow, or stack exchange that offer access to source code for easy tasks, such as implementations of 3D skeletonizations in MATLAB (Schena, 2004) or finding branches in a skeleton (stackoverflow.com, 2013). From a user perspective, I introduced two skeletonization basic concepts and provided insight into mechanisms that potentially influence the statistical analysis of data computed with available software packages or code snippets collected on the Internet. However, it remains the responsibility of the researcher to judge and document if measurement errors induced by imperfect data or the skeletonization concept have significant influence on the achieved results.
Skeletons are powerful descriptors for analyzing plant networks, if applied with care. In this article I pointed out two difficulties:
1. Spurious branches have to be removed from an extracted skeletal structure to obtain an improved approximation of the actual plant network.
2. Diameter measurements of the plant network of interest cannot be derived at all locations of the skeleton.
The choice for a particular skeletonization concept is dependent on the application, but knowledge of the basic underlying concepts will make it easier to adapt source code to plant applications. These adaptions result in improved data quality computed from skeletons. As a consequence of improved data quality, the statistical analysis of the naturally large geometric variations within plant networks is facilitated.
Achieving a one-to-one correspondence between the actual plant network and a skeleton extracted from imaging data are only possible on sparse plant networks. The limiting factor is not the skeleton concept, but the imaging technology and its ability to resolve the complete plant network. Therefore, preparing near-ideal input data will involve manual work to correct for the physical limits imposed by certain imaging technologies. The simple examples shown here should help end-users to judge the results computed with the chosen software or encourage users to contact the author of the software to interpret the results correctly.
Although I used several thresholds in this article, I do not propose them as an optimal algorithmic solution, but as a simple tool to start experiencing and understanding skeleton concepts. Appendix S2 (apps.1400005_s2.docx) contains annotated references to freely available data, source/pseudo codes, and executable software referenced in this article.