Contributors: |
Lorenzetti, R, Barbetti, B, Fantappiè, M, L'Abate, G, Costantini, EAC, Barbetti, R |
Description: |
Estimating the frequency of soil classes in map units is always affected by some degree of uncertainty, especially at small scales, where generalization is greater. The aim of this study was to compare different possible approaches (data mining, geostatistics, deterministic pedology) for assessing the frequency of WRB Reference Soil Groups (RSGs) in the major Italian soil regions. In the soil map of Italy (Costantini et al., 2012), the five most frequent RSGs were listed for each of the 10 major soil regions. The soil map was produced using the national soil geodatabase, which stored 22,015 analyzed and classified pedons, 1,413 soil typological units (STUs) and a set of auxiliary variables (lithology, land use, DEM). Other variables were added to better account for the influence of soil-forming factors (slope, soil aridity index, carbon stock, soil inorganic carbon content, clay, sand, geography of soil regions and soil systems), and a grid with a 1 km mesh was set up. The traditional deterministic pedological approach assessed STU frequency from expert judgment of STU presence in each elementary landscape forming the mapping unit. Different data mining techniques (neural networks, random forests, boosted trees, support vector machines (SVM)) were first compared for their ability to predict RSGs from the auxiliary variables. We selected SVM based on its results on a test set. An SVM model represents the examples as points in space, mapped so that examples of separate categories are divided by a clear gap that is as wide as possible. The geostatistical algorithm we used was indicator collocated cokriging. The class values of the auxiliary variables, available at all grid points, were transformed into indicator variables (values 0, 1). A principal component analysis allowed us to select the variables that explained the largest share of variability, and to correlate each RSG with the first principal component, which explained 51% of the total variability. 
The first principal component was used as the collocated variable. The result was one probability map for each estimated WRB class. These were combined into a single map showing the most probable class at each pixel. The five most frequent RSGs resulting from the three methods were compared. The outcomes were validated against a subset of 10% of the pedons, withheld before processing. An error estimate was produced for each estimated RSG. The first results, obtained in one of the most widespread soil regions (plains and low hills of central and southern Italy), showed that the two most frequent classes were the same for all three methods. The deterministic method differed from the others at the third position, while the statistical methods inverted the third and fourth positions. An advantage of SVM was the possibility of using numeric and categorical variables in the same run, without any prior transformation, which reduced processing time. A Bayesian validation indicated that the SVM method was as reliable as indicator collocated cokriging, and better than the deterministic pedological approach. |
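
The two grid operations described in the abstract, coding class values as 0/1 indicators and collapsing per-class probability maps into a single most-probable-class map, can be illustrated with a minimal NumPy sketch. This is not the authors' code and it does not perform cokriging or SVM fitting; all function names, class codes and probability values below are hypothetical and chosen only for illustration.

```python
import numpy as np

def to_indicators(classes, n_classes):
    """Turn integer class codes into 0/1 indicator columns, one per class."""
    classes = np.asarray(classes)
    return (classes[:, None] == np.arange(n_classes)[None, :]).astype(int)

def most_probable_class(prob_maps):
    """Collapse per-class probability maps (n_classes, rows, cols)
    into one map holding the index of the most probable class per pixel."""
    return np.argmax(np.asarray(prob_maps), axis=0)

def class_frequencies(class_map, n_classes):
    """Share of pixels assigned to each class."""
    counts = np.bincount(np.asarray(class_map).ravel(), minlength=n_classes)
    return counts / counts.sum()

def rank_classes(freqs, top=5):
    """Class indices ordered from most to least frequent (ties keep index order)."""
    order = np.argsort(-np.asarray(freqs), kind="stable")
    return [int(i) for i in order[:top]]

# Hypothetical data: 3 classes on a 2 x 2 grid (made-up probabilities).
prob_maps = np.array([
    [[0.6, 0.2], [0.1, 0.3]],   # class 0
    [[0.3, 0.5], [0.2, 0.3]],   # class 1
    [[0.1, 0.3], [0.7, 0.4]],   # class 2
])
class_map = most_probable_class(prob_maps)
freqs = class_frequencies(class_map, n_classes=3)
ranking = rank_classes(freqs)

# Hypothetical categorical covariate (e.g. lithology codes) at four grid points.
indicators = to_indicators([0, 2, 1, 2], n_classes=3)
```

In a real workflow the probability maps would come from the fitted model (cokriging or a classifier), and the ranking would be computed separately within each soil region rather than over the whole grid.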