Reforestation efforts that integrate local ecological knowledge (LEK) have demonstrated their potentia to enhance vegetation recovery by leveraging local communities' adaptive understanding of ecosystems. This study evaluates the impacts of a reforestation project that integrated LEK and was implemented in central Nepal between 2010 and 2016. Using satellite-based remote sensing data and propensity score matching, the research isolates the effects of interventions integrating LEK on vegetation recovery, as measured by the normalized difference vegetation index (NDVI). The analysis, which uses Sentinel-2 imagery and covariates such as topography, soil characteristics, and past vegetation patterns, reveals statistically significant NDVI improvements in project areas over the study period: 0.0341 in 2018, 0.0670 in 2020, and 0.0737 in 2022. Furthermore, by 2022, the project contributed considerably to achieving medium-high vegetation health, increasing the probability of attaining this vegetation health status by 19.14 percentage points. The improvement in vegetation health suggests that the reforestation project not only has enhanced canopy cover but also is associated with improved soil nutrient availability and increased organic carbon storage. These factors contribute to long-term forest regeneration by facilitating nutrient cycling, stabilizing soil, and promoting sustained vegetation growth. Sensitivity analyses confirm the robustness of these results against potential unobserved confounders, underscoring the reliability of the findings. The integration of local communities' knowledge into species selection may have played a crucial role in these improvements while supporting sustainable forest management practices. These findings add to the growing body of evidence supporting the integration of LEK in reforestation initiatives, offering practical implications for global biodiversity conservation and climate change mitigation efforts.
Introduction
Forests rank among the most critical ecosystems on Earth, providing a diverse array of ecosystem services, including carbon sequestration and biodiversity maintenance. Reforestation could contribute to addressing the pressing global challenges of biodiversity conservation and climate change mitigation (Pawson et al 2013; Kemppinen et al 2020).
Sustainable ecosystem and biodiversity management depends on fully recognizing the value of the knowledge–practice–belief complex of Indigenous Peoples related to biodiversity conservation (Gadgil et al 1993). The importance of integrating local ecological knowledge (LEK) into forest management and conservation has long been acknowledged (Berkes et al 2000; Lykke et al 2004). LEK refers to the knowledge held by a specific group of people about their local ecosystems (Olsson and Folke 2001). Although traditional ecological knowledge also means the knowledge and insights acquired through extensive observation of an area or a species (Huntington 2000), this study adopts the term LEK to emphasize its dynamic and adaptive nature, rather than its reliance on historical or cultural continuity (Berkes et al 2000; Joa et al 2018).
LEK complements scientific knowledge, not only in ecosystem conservation and management but also in ecosystem restoration, contributing to areas such as the construction of reference ecosystems, species selection for restoration plantations, site selection for restoration, knowledge about historical land management practices, management of invasive species, and postrestoration monitoring (Uprety et al 2012). Recent empirical studies from diverse regions worldwide similarly demonstrate that LEK contributes to these aspects of ecosystem restoration (Suárez et al 2012; Velázquez-Rosas et al 2018; Fremout et al 2021; Mariscal et al 2022; Haq et al 2023; González-Torres et al 2024).
LEK has been largely undocumented, so research on its role in nature conservation has primarily relied on qualitative methodologies (Abas et al 2022). Most case studies examining the role of LEK in reforestation projects use methods such as interviews, observations, and workshops, although some exceptions incorporate quantitative methods to analyze vegetation distribution and plant utilization (Velázquez-Rosas et al 2018; Sinthumule and Mashau 2020; Fremout et al 2021; Schmidt et al 2021; Mariscal et al 2022; Haq et al 2023). These studies provide rich contextual information and insights into the mechanisms through which LEK may contribute to forest restoration. However, they lack foundational evidence regarding the causal effects of LEK—specifically, whether LEK meaningfully drives forest recovery outcomes.
Few studies have employed techniques such as principal component analysis (PCA) to assess mangrove forest survival rates following reforestation projects that incorporated LEK; one example is the work by Ravaoarinorotsihoarana et al (2023). However, the statistical analyses in this study do not rigorously measure the causal effects of these interventions. Establishing causality in observational studies requires controlling for influences on the outcome variables that are unrelated to the intervention. Differences between treated and untreated groups may stem both from the intervention and from preexisting differences in their attributes—called selection bias—which can distort causal estimates (Rosenbaum and Rubin 1983; Rosenbaum 2020). In the context of forest restoration, vegetation recovery in project areas might simply reflect the selection of sites with favorable baseline conditions, such as soil fertility or sunlight exposure, rather than the actual effects of the intervention.
To rigorously assess the impact of such projects, statistical control of these confounding factors is essential. To date, no study has effectively addressed these issues empirically to demonstrate the causal impact of LEK on forest restoration. Our study addresses this research gap and examines the impacts of a reforestation project that integrates LEK, using remote sensing and statistical causal inference methods. Specifically, we employ satellite imagery data and propensity score matching (PSM) to assess the impact of LEK on local vegetation recovery. PSM is a quasiexperimental approach that leverages observational data to create statistically comparable treatment and control groups, enabling causal inference.
Our analysis focuses on Institute for Himalayan Conservation Japan (IHC) Reforestation Project 2010, implemented in Gandaki Province, central Nepal, in a collaboration between local communities and IHC, a Japanese nongovernmental organization. This project provides a typical case in which local communities, guided by LEK, determined species for reforestation. Although the project's methods are neither novel nor complex, their practical and fundamental nature makes them an ideal case for examining the transferability and applicability of evidence in reforestation practices that integrate LEK.
Research background
Since the 1970s, IHC has promoted nationwide tree-planting projects in rural Nepal, particularly in the foothills of the Annapurna Mountains. This paper examines IHC Reforestation Project 2010 (April 2010–March 2016), funded by the Japan International Cooperation Agency Grassroots Technical Cooperation Project, called the “Mountain Village Reconstruction Support Project Through the Creation of Livelihood Forests” (Soma 2021a, 2021b). The project was implemented at 6 main study sites in Gandaki Province, located in central Nepal (Figure 1).
All project sites were selected according to factors such as elevation, orientation, local ethnicity, and flora. IHC conducted a reforestation project that fully integrated LEK, incorporating ideas from local community members. First, IHC established local forest committees comprising approximately 10 members at each of the 6 project sites. These members, all local residents with extensive knowledge of local forests and ecosystems, were responsible for identifying planting areas, selecting tree species, and liaising with the village council.
Subsequently, IHC official nurseries were established at each site to facilitate intensive seedling production. Each nursery had at least 2 nursery managers, selected from local farmers, who were responsible for seed collection, seedling cultivation, and planting young trees.
To produce young trees, seeds were collected from mountain forests near each project site, and topsoil was taken from the same elevation at which the seeds were collected. Both seeds and topsoil were then transported to the nurseries for seedling development. Tree species and soil were selected according to local ecological characteristics and LEK related to forest resource use, rather than by simply introducing new species to the area.
The annual tree-planting season was scheduled over the second and third weeks of June. Each year, the village council determined the planting dates, with community members volunteering for the planting activities.
The project employed a reforestation methodology that integrated LEK and encompassed seed collection, seedling production in nurseries, transplantation, and scientific evaluation (Soma 2021a, 2021b). A total of 44 tree species were selected for transplantation (Table 1), with species composition varying by project location.
The forest committee and local community members selected tree species and transplantation plots. Table 2 shows the total number of trees planted on public land in the 6 communities between 2010 and 2016, as well as the area being afforested. Transplantation activities were conducted at 63 plots across all 6 study sites, with tree-planting areas allocated by the forest committee and village council. The planting plots included areas affected by landslides, forest gaps created by fallen trees, wastelands, unused lands, cleared lands, and steep slopes. Planting activities were carried out on steep slopes ranging from lowlands to highlands. During the 6-year implementation period starting in 2010, 131,186 trees were planted, and the area covered amounted to 762,090 m2 of public land.
TABLE 2
Basic data on study sites.

All plantation plots in each project site were selected in consultation with the then-existing village development committees (VDCs) and community forest user groups, together with IHC members, nursery managers, the forest committee, and local residents, before being proposed to the local Division Forest Office. Local stakeholders volunteered to care for the newly planted young trees in the plots even after the project ended.
In addition to reforestation efforts on public land, this project involved distributing seedlings to local residents. However, planting these seedlings on their private land was voluntary and not monitored. Therefore, this study focuses exclusively on reforestation efforts conducted on public land. All data presented in this study concerning project sites refer solely to reforestation sites on public land.
Data
This study used geographic information systems, as illustrated in Figure 2. In step 1, data primarily from satellite-derived layers were extracted to construct the analytical dataset. In step 2, land characteristic covariates included in this dataset were used to calculate propensity scores. These scores were then used to match treatment and control groups to estimate the effects of the IHC Reforestation Project on vegetation recovery.
The data acquisition range was confined to the boundaries of the former VDCs encompassing a project site. Until 2017, VDCs were Nepal's smallest administrative divisions; they were then reorganized into gaunpalikas (rural municipalities). Because this study examines a project conducted between 2010 and 2016, the administrative boundaries from that period were used to maintain consistency. Both project and nonproject sites within the VDCs were delineated using GPS boundary data collected at all project sites between May and October 2015. The total analyzed area of the VDCs is 107,251,385 m2, with the project sites occupying 762,090 m2; all project sites analyzed consist of public land.
First, 10-m-resolution normalized difference vegetation index (NDVI) data were obtained from the Sentinel-2 satellite for the study area to assess vegetation recovery. Specifically, Google Earth Engine was used to obtain atmospherically corrected satellite imagery from the “Harmonized Sentinel-2 MSI: Multispectral Instrument, Level 2A” dataset. The mean NDVI values were computed for 3 dry seasons: 2018 (October 2018–May 2019), 2020 (October 2020–May 2021), and 2022 (October 2022–May 2023). Dataset construction and statistical analysis were conducted using these 10-m-resolution grids, which served as the fundamental units of the analyses.
NDVI is also widely used as an indicator of vegetation health (Huete et al 2002). In this study, we classified the mean NDVI of each 10-m grid to construct vegetation health categories: medium (0.4 < NDVI ≤ 0.6), medium high (0.6 < NDVI ≤ 0.8), and high (NDVI > 0.8). Although no universally standardized threshold exists for NDVI-based vegetation health classification, several studies in the Himalayas have adopted thresholds of 0.4–0.5 or higher to define healthy vegetation (Haq et al 2024; Haq et al 2025). Based on this precedent, the adopted threshold is reasonable.
Furthermore, 16 covariates representing land characteristics were computed for each grid. These included topographical, accessibility, land cover, past vegetation, and soil characteristics. Topographical covariates were obtained from the 30-m-resolution digital elevation model by the Japan Aerospace Exploration Agency (ALOS World 3D). The accessibility covariate was derived from road vector data in OpenStreetMap, whereas land cover covariates were derived from Nepal's 2009 Land Cover Dataset (30 m resolution) by the International Center for Integrated Mountain Development. The past vegetation covariate was extracted from Landsat-5 satellite images (30 m resolution), with NDVI values calculated for the dry seasons of October 2003–May 2004 and October 2008–May 2009. Soil characteristic covariates were obtained from a 100-m-resolution digital soil map, jointly developed by the Nepal National Soil Science Research Center and the International Maize and Wheat Improvement Center. These variables are summarized in Table S1 (Takahashi_MRD2025_Supplemental_material_TableS1-S2.pdf) (Supplemental material, https://doi.org/10.1659/mrd.2024.00015.S1) by the project and nonproject locations within the boundaries of the former VDCs.
Assessment methodology
PSM estimates the causal effect of a treatment by constructing treatment and control groups based on propensity scores, which indicate the probabilities of receiving the treatment. As proposed by Rosenbaum and Rubin (1983), it matches treated individual samples with untreated ones that have similar propensity scores. PSM has been used in several studies to assess forest conservation programs. For instance, satellite imagery and PSM have demonstrated the role of protected areas in forest cover restoration and deforestation prevention across Asia (Gaveau et al 2012; Gu et al 2020; Rahman and Islam 2021). Similarly, this approach has been used to examine the impact of forest coffee certification on conservation in Ethiopia (Takahashi and Todo 2013, 2017).
In this study, 10-m square grids were matched based on the propensity scores to form the treatment and control groups, enabling the estimation of the average treatment effects on the treated (ATT). Propensity scores were estimated by fitting a logistic regression model, as follows:
where Yi is a binary variable indicating whether grid i is in the treatment group. The independent variables (COVi) represent the 16 covariates in each grid i (Table 3). β represents the coefficients estimated in the logistic regression model. Following Brookhart et al (2006), the logistic model incorporated covariates that influence both treatment assignment and outcome, as well as covariates unrelated to treatment but correlated with outcome. The selected covariates are based on the understanding that changes in NDVI are influenced not only by the reforestation project but also by natural vegetation recovery and human-induced land management independent of the project.
TABLE 3
List of covariates.

Covariates 1–4 capture topographical characteristics, with south- and north-facing slope dummies accounting for solar radiation and soil moisture. Covariate 5, proximity to roads, indicates land accessibility, crucial for site selection and vegetation management. Covariates 6–8 represent preproject land cover types, which correlate with high NDVI and influence project site selection. Covariate 9, the rate of NDVI change from 2003 to 2008, reflects vegetation recovery trends before the project. Covariates 10–16 relate to soil characteristics, controlling for factors affecting reforestation suitability and natural vegetation recovery potential.
The variance inflation factor (VIF) measures multicollinearity, with values above 10 suggesting instability in coefficient estimates. In this study, VIF for all covariates was below 10, indicating no severe multicollinearity in this logistic regression model. However, soil pH and total nitrogen in soil had VIFs slightly above 5, indicating moderate correlation among the soil covariates. To ensure model robustness, we estimated 2 specifications: one including all covariates (1–16) and another in which the soil covariates (10–16) were dimensionally reduced using PCA. The second model included the first to third principal components, capturing up to 80% of the cumulative proportion, as explanatory variables. As a result, all VIFs fell below 5, confirming effective multicollinearity control.
Using estimated propensity scores from the logistic model, 1-to-1 nearest-neighbor matching with replacement was performed. Each project site grid was matched to the most similar nonproject site grid. Matching with replacement can often decrease bias by allowing repeat use of control individuals with similar characteristics (Stuart 2010). Samples outside the common support range, where treatment and control propensity score distributions overlapped, were excluded. Ensuring common support is essential for causal inference using PSM.
The efficacy of the PSM was assessed using a balancing test, evaluating the standardized mean differences (SMDs) of the covariates in the matched dataset. SMD quantifies the mean difference between treatment and control groups, standardized by the pooled standard deviation for continuous variables or by the difference in proportions for binary variables. PSM aims to balance the distribution of covariates across the groups, making balance testing essential. An absolute SMD < 0.1 suggests that covariates are well balanced between groups.
ATT can be estimated by directly comparing treatment and control group means in the matched dataset. However, slight imbalances may persist even after PSM, potentially biasing treatment effect estimation. Prior research suggests that including covariates used in the propensity score estimation in the analytical model helps mitigate such residual imbalances (Rubin and Thomas 2000; Austin 2011; Elze et al 2017). Therefore, instead of direct comparison, we employed analysis of covariance (ANCOVA) to estimate ATT. The ANCOVA equation is expressed as follows:
where NDVIit represents the mean NDVI for each period t in each grid i, α is the intercept, Pi is a dummy variable that equals 1 if the grid is located within the project sites and 0 otherwise, COVi represents the covariates used to calculate the propensity scores, β represents the coefficients estimated in the ANCOVA model, and εit is the error term.
Furthermore, to assess the effect on vegetation health, we estimated a logistic ANCOVA with the vegetation health classification as the dependent variable. The logistic ANCOVA equation is expressed as follows:
where VHi represents 1 of 3 binary variables for vegetation health: medium, medium high, or high. We estimated 3 separate models, each using 1 binary variable as the dependent variable, but we present a single equation for brevity. Pi is a dummy variable indicating project sites, COVi represents the covariates used in propensity score estimation, and β represents the coefficients estimated in the logistic ANCOVA model. We calculated the marginal effects of the treatment variable from the logistic ANCOVA results to assess the project's impact on vegetation health.
As a final step, we conducted a sensitivity analysis to evaluate the robustness of estimated ATT. PSM assumes treatment assignment is based on observable variables, but in practice, not all relevant factors can be observed. Unobservable natural and social factors may affect project site selection and vegetation restoration. If such hidden bias exists, ATT estimated via PSM may be biased, compromising causal inference.
To address this limitation, we used the Rosenbaum bounds method (Rosenbaum 2002) to test the robustness of estimated ATT against unobserved confounders. This method introduces the parameter Γ, which quantifies the potential influence of unobserved confounders on project site selection and vegetation recovery. Robustness is evaluated by checking whether ATT remains statistically significant at the 5% level as Γ increases, and the critical value of Γ is defined as the value at which ATT loses significance at the 5% level. Although no universal threshold exists, Aakvik (2001) proposes that a critical value of Γ ≥ 1.5 suggests robustness. However, losing significance at low Γ does not necessarily mean ATT is invalid, because the test assesses a worst-case scenario under the influence of unobserved confounders (Becker and Caliendo 2007).
Results
Before presenting the ATT estimation results, we examine indicators to verify that our matching process meets key assumptions and mitigates bias. Figure 3 shows the propensity score distributions for the matched treatment and control groups. The figure compares 2 models: (1) a logistic regression with all 16 covariates as explanatory variables and (2) a model aggregating soil covariates using PCA. The observed congruence of distributions supports the common support assumption. In each model, approximately 1,217,000 nonproject site grids were excluded from the control group because of mismatching or extreme propensity scores. Table 4 shows that post-PSM SMDs fall within –0.1 to 0.1, indicating balanced matching across all models and effective bias mitigation.
TABLE 4
Results from the balancing test.

Table 5 presents the ANCOVA and logistic ANCOVA results, with models incorporating all soil covariates. In these models, the estimated coefficient of the project sites' dummy variable represents ATT for vegetation. Results from models 1–3 indicate ATT for NDVI, showing a statistically significant and consistent increase: 0.0341 in 2018, 0.0670 in 2020, and 0.0737 in 2022. These findings suggest that IHC Reforestation Project 2010 has had a sustained positive impact on vegetation recovery.
TABLE 5
Results from ANCOVA and logistic ANCOVA estimation.

Models 4–6 estimate ATT for vegetation health categories. Because of space constraints, only the results for 2022 are presented. The marginal effects for medium, medium-high, and high vegetation health were –0.1685, 0.1914, and 0.0239, respectively, all of which are statistically significant. The largest marginal effect was observed for medium-high vegetation health, where the project increased the probability of attaining this vegetation health status by 19.14 percentage points. Although the effect on high vegetation health was statistically significant and positive, its magnitude was relatively small. These results suggest that by 2022, the project primarily contributed to achieving medium-high vegetation health. In contrast, the negative effect on medium vegetation health indicates that many grid cells transitioned out of this category over time. Although not reported here, the project's effect on medium vegetation health in 2018 was positive and significant.
In models 1–3, the critical values of Γ were 1.4, 2.3, and 2.4, respectively. Although 2018 ATT is more susceptible to unobserved factors, the 2020 and 2022 estimates are more robust, confirming their reliability. In models 4–6, although the critical value of Γ was relatively small for medium vegetation health, it remained within an acceptable range for both medium-high and high vegetation health. These results indicate that models 4–6 are robust in the estimation of ATT for medium-high and high vegetation health.
Table S2 (Takahashi_MRD2025_Supplemental_material_TableS1-S2.pdf) (Supplemental material, https://doi.org/10.1659/mrd.2024.00015.S1) presents additional results from models incorporating PCA-aggregated soil covariates. Compared with models with all soil covariates, the significance, magnitude, and direction of the ATT estimates, as well as the critical values of Γ, remained mostly unchanged, confirming the model's robustness.
Discussion
Our impact assessment indicates that NDVI values in the areas of IHC Reforestation Project 2010 have consistently improved since its completion in 2016, particularly enhancing vegetation in the medium-high health category. The observed NDVI increase generally suggests greater photosynthetic activity, which is likely to result in higher biomass accumulation and enhanced carbon sequestration. Moreover, previous studies in the Himalayas have demonstrated a positive correlation among NDVI, carbon content, and nutrient availability (Bhardwaj et al 2016; Banday et al 2019; Wani et al 2021; Shah and Sharma 2023).
For instance, Bhardwaj et al (2016) examined forests with NDVI values ranging from 0.0 to 0.5 and reported that the maximum total soil carbon density (carbon at 90.82 t ha–1) was observed in forests with an NDVI of 0.4–0.5. Similarly, Banday et al (2019) focused on forests with NDVI values ranging from 0.0 to 0.5 and found that soil organic carbon and key nutrients—nitrogen, phosphorus, potassium, calcium, and sulfur—were highest in areas with NDVI values of 0.4–0.5.
The increase in medium-high vegetation health suggests that the reforestation project not only has enhanced canopy cover but also is associated with improved soil nutrient availability and increased organic carbon storage. These factors contribute long-term forest regeneration by facilitating nutrient cycling, stabilizing soil, and promoting sustained vegetation growth. These findings highlight the role of reforestation projects that integrate LEK in fostering forest sustainability.
This paper empirically assessed the vegetation recovery effects of reforestation that integrates LEK, rather than investigating the underlying mechanisms driving these impacts. Therefore, we limit our discussion to mechanisms identified in existing studies. Prior research shows that LEK is crucial for species selection in forest restoration (Suárez et al 2012; Fremout et al 2021; Mariscal et al 2022; Haq et al 2023; González-Torres et al 2024). It plays a vital role in understanding propagation methods, environmental adaptability, and the sociocultural values of tree species. Such knowledge enhances reforestation success by guiding the selection of ecologically and culturally significant species (Uprety et al 2012).
In IHC Reforestation Project 2010, members of the forest committee, knowledgeable about the local ecosystem, selected tree species based on factors such as terrain, elevation, and sunlight exposure. They also prioritized species with seeds available in nearby forests. These ecological considerations may have contributed to higher tree survival rates. At the same time, forest committee members selected a diverse range of species to meet site-specific needs, taking into account their use for fodder, firewood, timber, and raw materials for paper (Soma 2021a, 2021b). The demand for forest resources varies across communities, resulting in differences in tree species composition among the 6 study sites (Soma 2021a, 2021b). The selection of species beneficial to local communities likely improved tree survival rates by encouraging resident participation in planting and maintenance.
Study limitations
This study compared project sites that underwent reforestation integrating LEK with areas without reforestation. A more rigorous assessment would compare sites that underwent reforestation with and without the integration of LEK. In addition, our design does not examine the mechanisms of reforestation that integrate LEK. Investigating factors driving vegetation recovery in this project requires in-depth qualitative research at the study sites.
Although NDVI was used as the evaluation index, biomass—reflecting growth in wood and total tree volume—provides a more direct measure of forest recovery. A key limitation of NDVI is its inability to capture tree structural attributes, which are essential for biomass estimation. Light detection and ranging (LiDAR) addresses this limitation by capturing the three-dimensional structure of forests, with unmanned aerial vehicle–mounted LiDAR enabling high-resolution, localized data collection. A promising approach involves integrating LiDAR data from sampled locations with satellite imagery to train predictive models for large-scale biomass estimation. In addition, integrating these estimates with causal inference methods, such as PSM, enhances the accuracy of assessing the reforestation project's impact on forest recovery.
Future studies should incorporate more qualitative and quantitative data. Although our study offers broad-scale observations a decade after the project, verifying the long-term effects of forest regeneration requires prolonged data collection.
ACKNOWLEDGMENTS
We acknowledge the Japan Aerospace Exploration Agency for ALOS World 3D data, the European Commission's Copernicus program for Sentinel-2 imagery, and the U.S. Geological Survey for Landsat-5 imagery. We also acknowledge the Humanitarian OpenStreetMap Team for Nepal Roads vector data, the International Center for Integrated Mountain Development for Nepal's Land Cover Dataset, and the Nepal National Soil Science Research Center and the International Maize and Wheat Improvement Center for generating Nepal's Digital Soil Map, licensed by the Nepal Government.
© 025 Takahashi and Soma.
This open access article is licensed under a Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/). Please credit the authors and the full source.
REFERENCES
Appendices
Supplemental material
TABLE S1 (Takahashi_MRD2025_Supplemental_material_TableS1-S2.pdf) Summary statistics of the land characteristics of the study sites.
TABLE S2 (Takahashi_MRD2025_Supplemental_material_TableS1-S2.pdf) Results from the ANCOVA and logistic ANCOVA estimation for robustness check.