Using kNN, RF and SVM and their Combination Using GR for Soil Texture Modeling

Document Type: Research Paper

Authors

1 Ph.D. Student, Department of Soil Science, Faculty of Agriculture, Lorestan University, Khorramabad, Iran

2 Assistant Professor, Department of Soil Science, Faculty of Agriculture, Lorestan University, Khorramabad, Iran

3 Assistant Professor, Department of Rangeland and Watershed Management, Faculty of Agriculture and Natural Resources, University of Ardakan, Ardakan, Iran.

4 Professor, Department of Soil Science, Faculty of Agriculture, Lorestan University, Khorramabad, Iran.

10.22092/ijsr.2024.364333.735

Abstract

Soil texture is one of the most important soil properties governing soil physical, chemical and biological behavior. Different models are used to model soil textural fractions, each with its own advantages. One way to benefit from several models is to combine their predictions. Because soil texture is compositional data, there is no guarantee that the estimates will sum to 100% when the fractions are estimated separately. Applying log-ratio transformations before modeling is one way to deal with this problem. Little is known about modeling transformed and untransformed (UT) soil texture data using a combination of different models. In the present study, 200 surface soil samples (0-30 cm) were collected from the Kuhdasht region. Random forest (RF), k-nearest neighbors (kNN) and support vector machines (SVM), and their combination using the Granger-Ramanathan (GR) method, were used to model soil texture data. Additive log-ratio (alr), centered log-ratio (clr) and isometric log-ratio (ilr) transformations were used to transform the texture data. Environmental variables derived from Landsat 8 and Sentinel-2 images and a digital elevation model (DEM) were used as inputs for all models. Results indicated that covariates derived from the DEM were more important in modeling soil texture. All models gave better estimates of soil texture fractions when using alr-transformed data than when using UT, clr- or ilr-transformed data. The combined model (i.e., GR) did not show superiority over the other models. Using the GR model, RMSE values for alr-, clr- and ilr-transformed clay data and UT clay data were 5.07%, 4.21%, 5.81% and 6.09%, respectively. For silt, RMSE values (in the same order as for clay) were 7.11%, 5.15%, 9.04% and 6.70%, and for sand they were 9.20%, 7.67%, 11.69% and 8.74%, respectively. Generally, SVM using alr-transformed data showed a slightly higher potential for modeling soil texture.
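To illustrate why a log-ratio transformation addresses the sum-to-100 problem, the following is a minimal sketch of the additive log-ratio (alr) transform and its back-transform. It is not the authors' code: the function names `alr` and `alr_inv`, the choice of sand as the denominator part, and the sample values are assumptions made for illustration only.

```python
import numpy as np

def alr(comp):
    """Additive log-ratio: log of each part divided by the last part."""
    comp = np.asarray(comp, dtype=float)
    return np.log(comp[..., :-1] / comp[..., -1:])

def alr_inv(coords, total=100.0):
    """Back-transform alr coordinates to parts that sum to `total`."""
    coords = np.asarray(coords, dtype=float)
    ones = np.ones(coords.shape[:-1] + (1,))
    expanded = np.concatenate([np.exp(coords), ones], axis=-1)
    return total * expanded / expanded.sum(axis=-1, keepdims=True)

# Hypothetical samples: clay, silt, sand in percent (sand is the denominator).
texture = np.array([[25.0, 35.0, 40.0],
                    [10.0, 20.0, 70.0]])
coords = alr(texture)

# Stand-in for model output: perturbed coordinates, as a model would predict.
predicted = coords + np.array([[0.10, -0.05],
                               [-0.20, 0.15]])
back = alr_inv(predicted)
row_sums = back.sum(axis=1)  # each row sums to 100 by construction
```

Whatever values a model predicts in alr space, the back-transformed clay, silt and sand fractions are positive and sum to 100%, which separate modeling of the untransformed fractions cannot guarantee.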
To sum up, the results indicated that combining different machine learning algorithms does not necessarily improve the estimates. Therefore, instead of a model combination approach that may add complexity, a single appropriate model can be used for modeling soil texture.
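The combination step can be sketched as follows. The abstract does not state which Granger-Ramanathan variant was used, so this sketch assumes the unconstrained least-squares form with an intercept, in which observed values are regressed on the individual model predictions. The function names `gr_weights` and `gr_combine` and the synthetic data are hypothetical and do not come from the study.

```python
import numpy as np

def gr_weights(preds, observed):
    """Least-squares combination weights; the first coefficient is the intercept."""
    design = np.column_stack([np.ones(len(observed)), preds])
    coef, *_ = np.linalg.lstsq(design, observed, rcond=None)
    return coef

def gr_combine(preds, coef):
    """Weighted sum of the model predictions plus the intercept."""
    return coef[0] + preds @ coef[1:]

def rmse(pred, obs):
    return float(np.sqrt(np.mean((pred - obs) ** 2)))

# Synthetic stand-ins for observed clay content and three model predictions.
rng = np.random.default_rng(0)
truth = rng.uniform(10.0, 50.0, size=200)
preds = np.column_stack(
    [truth + rng.normal(0.0, sd, size=200) for sd in (4.0, 5.0, 6.0)]
)

coef = gr_weights(preds, truth)
combined = gr_combine(preds, coef)
rmse_combined = rmse(combined, truth)
rmse_single = [rmse(preds[:, j], truth) for j in range(preds.shape[1])]
```

On the data used to fit the weights, the combined RMSE cannot exceed that of any single model, because each single model is a special case of the weighted sum. That guarantee does not carry over to held-out samples, which is consistent with the finding that the combination did not outperform the individual models.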

Keywords

Main Subjects