The goal of {SOILmilaR} is to provide methods for applying standardized, customizable “similar soils” rules to site-level data derived from various sources.
The core function to implement this method is similar_soils()
, which compares a set of soils against one soil type or combination of soil conditions. similar_soils()
can be called iteratively using the design_mapunit()
function.
The method generally follows the process outlined in Norfleet & Eppinette (1993):
Norfleet, M.L. and Eppinette, R.T. (1993), A Mathematical Model for Determining Similar and Contrasting Inclusions for Map Unit Descriptons. Soil Survey Horizons, 34: 4-5. https://doi.org/10.2136/sh1993.1.0004
Installation
You can install the development version of SOILmilaR from GitHub with:
# install.packages("remotes")
remotes::install_github("brownag/SOILmilaR")
Example
This is an example that shows you how similar_soils()
works.
First we generate a synthetic data set using random soil depths across a range of depth classes, as well as random particle size control section clay content and fragments.
Then we apply a subset of taxonomic rules for loamy soils to assign a taxonomic particle size class.
Then we create some rating functions for properties of interest.
rate_taxpartsize <- function(x) {
dplyr::case_match(x,
c("sandy-skeletal") ~ 1,
c("sandy") ~ 3,
c("loamy", "coarse-loamy", "coarse-silty") ~ 5,
c("fine-loamy", "fine-silty") ~ 7,
c("clayey", "fine") ~ 9,
c("very-fine") ~ 11,
c("loamy-skeletal", "clayey-skeletal") ~ 13,
"fragmental" ~ 15)
}
rate_depthclass <- function(x,
breaks = c(
`very shallow` = 25,
`shallow` = 50,
`moderately deep` = 100,
`deep` = 150,
`very deep` = 1e4
),
...) {
res <- cut(x, c(0, breaks))
factor(res, levels = levels(res), labels = names(breaks), ordered = TRUE)
}
rate_pscs_clay <- function(x, breaks = c(18, 35, 60, 100)) {
res <- cut(x, c(0, breaks))
factor(res, levels = levels(res), ordered = TRUE)
}
To run similar_soils()
, we pass a data.frame or SoilProfileCollection as the first argument. The second argument is a named list that provides a mapping between site-level properties and rating functions. The list elements are functions with minimum arguments x
and ...
, and the element names are the corresponding columns.
m <- list(taxpartsize = rate_taxpartsize,
depth = rate_depthclass,
pscs_clay = rate_pscs_clay)
s <- similar_soils(loamy, m)
#> comparing to dominant reference condition (`7.deep.(18,35]` on 7 rows)
Here we inspect the tabular output of a single run.
head(s)
#> id taxpartsize depth pscs_clay similar_dist similar_single
#> 1 A1 7 moderately deep (18,35] 1 1
#> 2 B1 7 deep (18,35] 0 0
#> 3 C1 7 moderately deep (18,35] 1 1
#> 4 D1 7 deep (18,35] 0 0
#> 5 E1 5 deep (0,18] 3 2
#> 6 F1 5 shallow (0,18] 5 2
#> group similar
#> 1 7.moderately deep.(18,35] TRUE
#> 2 7.deep.(18,35] TRUE
#> 3 7.moderately deep.(18,35] TRUE
#> 4 7.deep.(18,35] TRUE
#> 5 5.deep.(0,18] FALSE
#> 6 5.shallow.(0,18] FALSE
The rating values can be used as surrogates for the detailed properties for calculating distance.
Here, we use cluster::agnes()
to cluster similar sets of rating values, and render a dendrogram.
# inspect distances using agglomerative clustering+dendrogram
d <- cluster::agnes(s[, 5, drop = FALSE], method = "gaverage")
d$height <- d$height + 0.2 # fudge factor for 0-distance
plot(stats::as.dendrogram(d), center = TRUE, type = "triangle")
If we set absolute=FALSE
then the differences between soils can be negative, which can help with processing ratings that are ordinal in nature.
# allow relative contrast ratings to be negative
# (i.e. ordinal factors, concept of "limiting")
# absolute value is still used for "similar" threshold
s2 <- similar_soils(loamy, m, absolute = FALSE)
#> comparing to dominant reference condition (`7.deep.(18,35]` on 7 rows)
# inspect distances unsing agglomerative clustering+dendrogram
d2 <- cluster::agnes(s2[, 5, drop = FALSE], method = "gaverage")
d2$height <- d2$height + 0.2 # fudge factor for 0-distance
plot(stats::as.dendrogram(d2), center = TRUE, type = "triangle")
design_mapunit()
function
A higher-level wrapper function around similar_soils()
is design_mapunit()
. It takes the same inputs as similar_soils()
but it processes the data iteratively until there are no data remaining to be grouped.
Here we use the same input data.frame and rating function.
d <- design_mapunit(loamy, m)
d[order(d$component), ]
#> id taxpartsize depth pscs_clay pscs_frags component
#> 1 A1 fine-loamy 68.07141 34.26617 13.3430897 Alpha
#> 2 B1 fine-loamy 125.65509 25.70668 10.3920511 Alpha
#> 3 C1 fine-loamy 82.03235 29.51870 9.6076022 Alpha
#> 4 D1 fine-loamy 136.54700 27.73477 33.8624746 Alpha
#> 11 A2 fine-loamy 145.74779 20.42760 0.6874675 Alpha
#> 12 B2 fine-loamy 138.76439 25.04729 6.6330011 Alpha
#> 13 C2 fine-loamy 114.43111 25.03331 11.9838727 Alpha
#> 14 D2 fine-loamy 126.47875 24.27037 12.9255822 Alpha
#> 21 A3 fine-loamy 111.48825 30.82608 3.6542921 Alpha
#> 23 C3 fine-loamy 79.15651 30.07310 6.2647017 Alpha
#> 24 D3 fine-loamy 66.55412 18.01062 28.9167000 Alpha
#> 8 H1 loamy-skeletal 137.62819 14.16824 49.8535505 Beta
#> 9 I1 loamy-skeletal 98.41503 15.31168 82.2289934 Beta
#> 10 J1 loamy-skeletal 87.51069 17.81801 38.6778412 Beta
#> 18 H2 loamy-skeletal 59.88691 15.86385 53.8326966 Beta
#> 20 J2 loamy-skeletal 61.63697 17.43131 44.3615694 Beta
#> 28 H3 loamy-skeletal 128.42479 16.45108 57.3262779 Beta
#> 30 J3 loamy-skeletal 85.58064 14.44454 39.3763163 Beta
#> 6 F1 loamy 40.23900 17.59930 27.0047312 Delta
#> 15 E2 loamy 37.83057 14.60978 23.4627516 Delta
#> 22 B3 loamy 45.90668 28.69676 10.0208338 Delta
#> 19 I2 fragmental 71.59082 15.06389 97.3761340 Epsilon
#> 29 I3 fragmental 126.34937 15.40719 97.1617265 Epsilon
#> 5 E1 coarse-loamy 143.15374 14.41170 25.7369392 Gamma
#> 7 G1 coarse-loamy 95.73213 14.98435 23.0575846 Gamma
#> 16 F2 coarse-loamy 89.94654 14.55522 14.9567534 Gamma
#> 17 G2 coarse-loamy 122.22285 14.93214 13.0607596 Gamma
#> 25 E3 coarse-loamy 128.68360 15.90127 12.4687515 Gamma
#> 26 F3 coarse-loamy 86.57938 14.88048 20.4374258 Gamma
#> 27 G3 coarse-loamy 128.15740 15.51927 33.6389675 Gamma
sort(prop.table(table(d$component)), decreasing = TRUE)
#>
#> Alpha Beta Gamma Delta Epsilon
#> 0.36666667 0.23333333 0.23333333 0.10000000 0.06666667
If we assume that the collection observations in x
are spatially representative of the mapunit extent, it is reasonable to consider the resulting proportions to be similar to the abundance represented with component percentages.
A literal interpretation of this output would result in a map unit with 5 groups of similar soils, of which 4 could be major components. Alpha appears to be dominant, Beta and Gamma are co-dominant in second place, Delta and Epsilon are dissimilar, limiting, and very strongly contrasting, with Epsilon as a minor component.
apply(d[3:5], 2, \(dd) {
aggregate(dd, by = list(component = d$component), quantile)
})
#> $depth
#> component x.0% x.25% x.50% x.75% x.100%
#> 1 Alpha 66.55412 80.59443 114.43111 131.51288 145.74779
#> 2 Beta 59.88691 73.60880 87.51069 113.41991 137.62819
#> 3 Delta 37.83057 39.03479 40.23900 43.07284 45.90668
#> 4 Epsilon 71.59082 85.28045 98.97009 112.65973 126.34937
#> 5 Gamma 86.57938 92.83933 122.22285 128.42050 143.15374
#>
#> $pscs_clay
#> component x.0% x.25% x.50% x.75% x.100%
#> 1 Alpha 18.01062 24.65184 25.70668 29.79590 34.26617
#> 2 Beta 14.16824 14.87811 15.86385 16.94120 17.81801
#> 3 Delta 14.60978 16.10454 17.59930 23.14803 28.69676
#> 4 Epsilon 15.06389 15.14972 15.23554 15.32137 15.40719
#> 5 Gamma 14.41170 14.71785 14.93214 15.25181 15.90127
#>
#> $pscs_frags
#> component x.0% x.25% x.50% x.75% x.100%
#> 1 Alpha 0.6874675 6.4488514 10.3920511 13.1343360 33.8624746
#> 2 Beta 38.6778412 41.8689428 49.8535505 55.5794872 82.2289934
#> 3 Delta 10.0208338 16.7417927 23.4627516 25.2337414 27.0047312
#> 4 Epsilon 97.1617265 97.2153284 97.2689302 97.3225321 97.3761340
#> 5 Gamma 12.4687515 14.0087565 20.4374258 24.3972619 33.6389675
If we summarize the properties of the resulting groups, we have:
- Alpha: moderately deep to deep, loamy-skeletal
- Beta: deep to moderately deep, fine-loamy
- Gamma: deep to moderately deep, coarse-loamy
- Delta: shallow, loamy
- Epsilon: moderately deep to deep, fragmental
Depending on the context of where these soils occur within the mapunit, one may opt to further combine (or perhaps split out) unique conditions as the pertain to unique soil properties, landforms, and vegetation.
Also, perhaps some of the rating functions could be adjusted. If Beta and Gamma appear too similar, the rating groups that split coarse and fine loamy could be combined.