A method for applying standardized, customizable "similar soils" rules to site-level data derived from various sources.

Usage

similar_soils(
  x,
  mapping,
  condition = NULL,
  idname = "id",
  thresh = NULL,
  thresh_single = 2,
  thresh_all = 3,
  absolute = TRUE,
  verbose = TRUE
)

Arguments

x

A data.frame or a SoilProfileCollection

mapping

A named list of functions. List element names refer to site-level data columns of x. Each specified function provides a conversion of the data element in x to a value used in the "similar soils" calculation.

condition

integer or character. Default: NULL, in which case the value is calculated internally from the dominant condition of the intersection of mapping results in x. You may specify an integer row ID in x for a specific similar soil contrast, or a character dominant condition matching the name assigned by interaction(), e.g. "4.3" for a two-rating mapping result where the first rating has value 4 and the second has value 3.

idname

ID column name, default "id"

thresh

Deprecated. If used, the value is passed to thresh_all.

thresh_single

Maximum difference in any one property relative to dominant condition in x. Default 2. See details for discussion on the default calculation for similarity.

thresh_all

Sum of differences relative to dominant condition in x. Default 3. See details for discussion on the default calculation for similarity.

absolute

logical. Report absolute difference? Default: TRUE. The absolute difference is always used for comparison against the thresholds (thresh_single and thresh_all).

verbose

logical. Default: TRUE. Print a message about the selected condition.
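
The character form of condition described above follows the naming used by interaction(). The base R sketch below shows how such names arise from two ratings and how a most-frequent combination could be picked out. The vectors r1 and r2 are hypothetical ratings, and taking the most frequent combination as "dominant" is an assumption made for illustration, not a statement of how similar_soils() selects it internally.

```r
# Hypothetical ratings for three soils (not from the package data)
r1 <- c(4, 4, 2)  # first rating
r2 <- c(3, 3, 1)  # second rating

# interaction() joins the ratings with "." to name each combination
g <- interaction(r1, r2)
as.character(g)

# Assumption: the dominant condition is the most frequent combination
dominant <- names(which.max(table(g)))
dominant
```

A name such as "4.3" obtained this way is the kind of value that can be supplied as a character condition.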

Value

A data.frame containing inputs and three new columns: similar_single (maximum difference in any one property, relative to condition), similar_dist (cumulative sum of differences relative to condition), similar (logical; soil is similar to condition)

Details

The sum of differences across conditions (specified by the intersection of output of the functions in mapping) is used as the "distance" of a soil relative to a dominant (or otherwise specified) condition. A threshold value is used to decide which are "similar" and which are not. The functions in mapping can be customized to use alternate thresholds.
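
The calculation described above can be sketched in base R. This is a minimal illustration rather than the package's implementation: the ratings data frame is hypothetical, ordered factor ratings are assumed to have been converted to integer codes already, and the final rule combining the two thresholds is an assumption.

```r
# Hypothetical numeric ratings; the first row is used as the reference condition
ratings <- data.frame(
  taxpartsize = c(13, 7, 5),
  depth       = c(3, 3, 4),
  pscs_clay   = c(1, 3, 1)
)
ref <- unlist(ratings[1, ])

# Absolute per-property differences from the reference (sweep subtracts by column)
diffs <- abs(sweep(as.matrix(ratings), 2, ref))

similar_dist   <- rowSums(diffs)        # sum of differences across properties
similar_single <- apply(diffs, 1, max)  # largest difference in any one property

# Assumed decision rule: both thresholds must be satisfied
similar <- similar_single <= 2 & similar_dist <= 3

data.frame(ratings, similar_dist, similar_single, similar)
```

The second and third rows here have distances of 8 and 9, so neither would count as similar to the reference under these thresholds.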

References

Norfleet, M.L. and Eppinette, R.T. (1993), A Mathematical Model for Determining Similar and Contrasting Inclusions for Map Unit Descriptions. Soil Survey Horizons, 34: 4-5. https://doi.org/10.2136/sh1993.1.0004

Examples


data(loamy, package = "SOILmilaR")

rate_taxpartsize <- function(x) {
  dplyr::case_match(x,
                    c("sandy-skeletal") ~ 1,
                    c("sandy") ~ 3,
                    c("loamy", "coarse-loamy", "coarse-silty") ~ 5,
                    c("fine-loamy", "fine-silty") ~ 7,
                    c("clayey", "fine") ~ 9,
                    c("very-fine") ~ 11,
                    c("loamy-skeletal", "clayey-skeletal") ~ 13,
                    "fragmental" ~ 15)
}


rate_depthclass <- function(x,
                            breaks = c(`very shallow` = 25,
                                       `shallow` = 50,
                                       `moderately deep` = 100,
                                       `deep` = 150,
                                       `very deep` = 1e4),
                            pattern = "R|Cr|Cd|kk|m",
                            hzdesgn = aqp::guessHzDesgnName(x, required = TRUE),
                            ...) {
  res <- cut(x, c(0, breaks))
  factor(res, levels = levels(res), labels = names(breaks), ordered = TRUE)
}

rate_pscs_clay <- function(x,
                           breaks = c(18, 27, 40, 60, 100)) {
  res <- cut(x, c(0, breaks))
  factor(res, levels = levels(res), ordered = TRUE)
}

m <- list(taxpartsize = rate_taxpartsize,
          depth = rate_depthclass,
          pscs_clay = rate_pscs_clay)

s <- similar_soils(loamy, m)
#> comparing to dominant reference condition (`13.moderately deep.(0,18]` on 5 rows)
head(s)
#>   id taxpartsize           depth pscs_clay similar_dist similar_single
#> 1 A1           7 moderately deep   (27,40]            8              6
#> 2 B1           7            deep   (18,27]            8              6
#> 3 C1           7 moderately deep   (27,40]            8              6
#> 4 D1           7            deep   (27,40]            9              6
#> 5 E1           5            deep    (0,18]            9              8
#> 6 F1           5         shallow    (0,18]            9              8
#>                       group similar
#> 1 7.moderately deep.(27,40]   FALSE
#> 2            7.deep.(18,27]   FALSE
#> 3 7.moderately deep.(27,40]   FALSE
#> 4            7.deep.(27,40]   FALSE
#> 5             5.deep.(0,18]   FALSE
#> 6          5.shallow.(0,18]   FALSE

# inspect distances using agglomerative clustering+dendrogram
d <- cluster::agnes(s[, 5, drop = FALSE], method="gaverage")
d$height <- d$height + 0.2 # fudge factor for 0-distance
plot(stats::as.dendrogram(d), center=TRUE, type="triangle")


# allow relative contrast ratings to be negative
# (i.e. ordinal factors, concept of "limiting")
# absolute value is still used for "similar" threshold
s2 <- similar_soils(loamy, m, absolute=FALSE)
#> comparing to dominant reference condition (`13.moderately deep.(0,18]` on 5 rows)

# inspect distances using agglomerative clustering+dendrogram
d2 <- cluster::agnes(s2[, 5, drop = FALSE], method="gaverage")
d2$height <- d2$height + 0.2 # fudge factor for 0-distance
plot(stats::as.dendrogram(d2), center=TRUE, type="triangle")
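
As the Details section notes, the functions in mapping can be customized. One way to do that, sketched below, is to wrap a rating function with different class breaks. The wrapper name rate_pscs_clay_coarse and its break values are hypothetical and chosen only for illustration.

```r
# Same clay rating function as in the example above
rate_pscs_clay <- function(x, breaks = c(18, 27, 40, 60, 100)) {
  res <- cut(x, c(0, breaks))
  factor(res, levels = levels(res), ordered = TRUE)
}

# Hypothetical wrapper with fewer, wider clay classes
rate_pscs_clay_coarse <- function(x) {
  rate_pscs_clay(x, breaks = c(35, 60, 100))
}

clay <- c(12, 30, 45)
default_codes <- as.integer(rate_pscs_clay(clay))        # class index, default breaks
coarse_codes  <- as.integer(rate_pscs_clay_coarse(clay)) # class index, wider breaks

# The wrapper can replace the original entry in the mapping list
m2 <- list(pscs_clay = rate_pscs_clay_coarse)
```

With wider classes, soils that differ modestly in clay content fall into the same class, so they contribute less to the summed distance.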