Machine Learning Approaches To Develop Pedotransfer Functions For Tropical Sri Lankan Soils
Poor data availability on soil hydraulic properties in tropical regions hampers many studies, including crop and environmental modeling. The high cost and effort of measurement and the increasing demand for such data have driven researchers to search for alternative approaches. Pedotransfer functions (PTFs) are predictive functions used to estimate soil properties from easily measurable soil parameters. PTFs are popular in temperate regions, but few attempts have been made to develop PTFs in tropical regions. Regression approaches are widely used to develop PTFs worldwide, and recently a few attempts have been made using machine learning methods. PTFs for tropical Sri Lankan soils have already been developed using classical multiple linear regression approaches. However, no attempts have been made to use machine learning approaches. This study aimed to determine the applicability of machine learning algorithms in developing PTFs for tropical Sri Lankan soils. We tested three machine learning algorithms (artificial neural networks (ANN), k-nearest neighbor (KNN), and random forest (RF)) with three different input combinations (sand, silt, and clay (SSC) percentages; SSC and bulk density (BD); SSC, BD, and organic carbon (OC)) to estimate the volumetric water content (VWC) of Sri Lankan soils at −10 kPa and −33 kPa (either of which may represent field capacity (FC), although most studies in Sri Lanka use −33 kPa as the FC) and at −1500 kPa (representing the permanent wilting point (PWP)). The analysis used the open-source data mining software Waikato Environment for Knowledge Analysis (WEKA). Using a wrapper approach and the best-first search method, we selected the most appropriate inputs for developing PTFs with each machine learning algorithm and input level. We developed PTFs to estimate FC and PWP and compared them with the previously reported PTFs for tropical Sri Lankan soils. We found that RF was the best algorithm for developing PTFs for tropical Sri Lankan soils.
We then refined the PTFs by adding VWC at −10 kPa as an input variable, because it is easier to measure than the other targeted VWCs. With the addition of VWC at −10 kPa, the performance of all machine learning algorithms improved, and RF remained the best. We studied the functionality of the fine-tuned PTFs and found that they can estimate the available water content of Sri Lankan soils as well as measurement-based calculations can. We identified RF as a robust alternative to linear regression methods in developing PTFs to estimate the field capacity and permanent wilting point of tropical Sri Lankan soils. Based on these findings, we recommend that PTFs be developed using the RF algorithm in the related software to make up for the data gaps present in tropical regions.
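The available water content referred to above is conventionally taken as the difference between the VWC at FC and the VWC at PWP. The following is a minimal sketch of that calculation, not the procedure used in the study: the function name and all numeric values are hypothetical and are included only to show how a PTF-based estimate can be set beside a measurement-based one.

```python
def available_water_content(vwc_fc, vwc_pwp):
    """Return available water content (cm3/cm3) as VWC at FC minus VWC at PWP."""
    if vwc_pwp > vwc_fc:
        raise ValueError("VWC at PWP should not exceed VWC at FC")
    return vwc_fc - vwc_pwp


# Hypothetical illustrative values (cm3/cm3), not data from the study.
measured = {"fc": 0.30, "pwp": 0.15}    # laboratory-measured VWC at -33 and -1500 kPa
estimated = {"fc": 0.28, "pwp": 0.14}   # the same quantities as a PTF might estimate them

awc_measured = available_water_content(measured["fc"], measured["pwp"])
awc_estimated = available_water_content(estimated["fc"], estimated["pwp"])

# The gap between the two values indicates how well the PTF reproduces
# the measurement-based available water content for this one sample.
print(f"measured AWC:  {awc_measured:.3f}")
print(f"estimated AWC: {awc_estimated:.3f}")
print(f"difference:    {abs(awc_measured - awc_estimated):.3f}")
```

In practice the comparison would be run over a full set of soil samples and summarized with an error statistic rather than judged on a single pair.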