library(SOILmilaR)

data("loamy", package = "SOILmilaR")

The object loamy is a sample data set representing 3 “synthetic transects”.

We assume a similar “soil forming function” for these transects across each hypothetical “delineation” being transected. Here the values have been customized so that the taxonomic particle size class is generally fine-loamy, with coarser-textured classes where fragment content is higher. Soil depth varies uniformly from shallow to very deep, centered around moderately deep. The pscs_* columns are provided as example numeric quantities that can be used in rating functions.

Next we define some rating functions for particle size class and depth class.

rate_taxpartsize <- function(x) {
  dplyr::case_match(x,
                    c("sandy-skeletal") ~ 1,
                    c("sandy") ~ 3,
                    c("loamy", "coarse-loamy", "coarse-silty") ~ 5,
                    c("fine-loamy", "fine-silty") ~ 7,
                    c("clayey", "fine") ~ 9,
                    c("very-fine") ~ 11,
                    c("loamy-skeletal", "clayey-skeletal") ~ 13,
                    "fragmental" ~ 15)
}

rate_depthclass <- function(x,
                            breaks = c(
                              `very shallow` = 25,
                              `shallow` = 50,
                              `moderately deep` = 100,
                              `deep` = 150,
                              `very deep` = 1e4
                            ),
                            ...) {
  res <- cut(x, c(0, breaks))
  factor(res, levels = levels(res), labels = names(breaks))
}
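
To make the depth rating concrete, the following base R sketch bins a few hypothetical depths (the values are invented for illustration and are not taken from `loamy`) using the same break points as `rate_depthclass()`. It assumes depths are in centimeters, which is how the breaks read but is not stated above.

```r
# Hypothetical depths (assumed cm), chosen so one value lands in each class
depths <- c(20, 45, 80, 120, 200)
breaks <- c(`very shallow` = 25, shallow = 50, `moderately deep` = 100,
            deep = 150, `very deep` = 1e4)

# Same binning idea as rate_depthclass(): intervals are (lower, upper]
depth_class <- cut(depths, c(0, breaks), labels = names(breaks))

as.character(depth_class)
as.integer(depth_class)  # integer codes 1 to 5, one step per depth class
```

Because `cut()` uses intervals that are open on the left by default, a depth of exactly 0 falls outside the first interval and becomes `NA`.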

The above rating functions can be combined in a list (m), which will be used as the mapping argument to similar_soils(). The similar_soils() function applies the rating functions to the columns of the input data x. Target column names in the data match the names of m, providing the “mapping” of data to rating functions.
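
The name-matching idea can be illustrated with plain base R. This is only a sketch of the concept, not the internals of `similar_soils()`; the objects `m_toy` and `x_toy` and their rating functions are hypothetical.

```r
# Toy rating functions and data; all names here are hypothetical
m_toy <- list(
  size  = function(x) ifelse(x == "coarse", 1, 2),
  depth = function(x) as.integer(cut(x, c(0, 50, 100, Inf)))
)
x_toy <- data.frame(
  id    = c("p1", "p2", "p3"),
  size  = c("coarse", "fine", "fine"),
  depth = c(30, 75, 140)
)

# Apply each function to the column that shares its name
rated <- as.data.frame(Map(function(f, col) f(col), m_toy, x_toy[names(m_toy)]))
rated
```

Columns of `x_toy` without a matching name in `m_toy` (here `id`) are simply not rated.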

Now we will demonstrate iterative filtering and application of similar soils criteria.

m <- list(taxpartsize = rate_taxpartsize,
          depth =  rate_depthclass)
x <- loamy
res0 <- similar_soils(x, m)
#> comparing to dominant reference condition (`7.deep` on 7 rows)
res0
#>    id taxpartsize           depth similar_dist similar_single
#> 1  A1           7 moderately deep            1              1
#> 2  B1           7            deep            0              0
#> 3  C1           7 moderately deep            1              1
#> 4  D1           7            deep            0              0
#> 5  E1           5            deep            2              2
#> 6  F1           5         shallow            4              2
#> 7  G1           5 moderately deep            3              2
#> 8  H1          13            deep            6              6
#> 9  I1          13 moderately deep            7              6
#> 10 J1          13 moderately deep            7              6
#> 11 A2           7            deep            0              0
#> 12 B2           7            deep            0              0
#> 13 C2           7            deep            0              0
#> 14 D2           7            deep            0              0
#> 15 E2           5         shallow            4              2
#> 16 F2           5 moderately deep            3              2
#> 17 G2           5            deep            2              2
#> 18 H2          13 moderately deep            7              6
#> 19 I2          15 moderately deep            9              8
#> 20 J2          13 moderately deep            7              6
#> 21 A3           7            deep            0              0
#> 22 B3           5         shallow            4              2
#> 23 C3           7 moderately deep            1              1
#> 24 D3           7 moderately deep            1              1
#> 25 E3           5            deep            2              2
#> 26 F3           5 moderately deep            3              2
#> 27 G3           5            deep            2              2
#> 28 H3          13            deep            6              6
#> 29 I3          15            deep            8              8
#> 30 J3          13 moderately deep            7              6
#>                 group similar
#> 1   7.moderately deep    TRUE
#> 2              7.deep    TRUE
#> 3   7.moderately deep    TRUE
#> 4              7.deep    TRUE
#> 5              5.deep   FALSE
#> 6           5.shallow   FALSE
#> 7   5.moderately deep   FALSE
#> 8             13.deep   FALSE
#> 9  13.moderately deep   FALSE
#> 10 13.moderately deep   FALSE
#> 11             7.deep    TRUE
#> 12             7.deep    TRUE
#> 13             7.deep    TRUE
#> 14             7.deep    TRUE
#> 15          5.shallow   FALSE
#> 16  5.moderately deep   FALSE
#> 17             5.deep   FALSE
#> 18 13.moderately deep   FALSE
#> 19 15.moderately deep   FALSE
#> 20 13.moderately deep   FALSE
#> 21             7.deep    TRUE
#> 22          5.shallow   FALSE
#> 23  7.moderately deep    TRUE
#> 24  7.moderately deep    TRUE
#> 25             5.deep   FALSE
#> 26  5.moderately deep   FALSE
#> 27             5.deep   FALSE
#> 28            13.deep   FALSE
#> 29            15.deep   FALSE
#> 30 13.moderately deep   FALSE

Identifying soils similar to "7.deep" (fine-loamy, deep) corresponds to identifying a major, dominant component in a map unit.

We might consider selecting a different reference condition manually after inspection. If we were to do that we could set, for example, condition = "13.moderately deep" to compare against that condition rather than the "7.deep" that was automatically selected in this example.
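
The `similar_dist` and `similar_single` columns printed above are consistent with a simple reading: the sum and the maximum, respectively, of the absolute rating differences from the reference condition, with depth classes counted as integer steps. The sketch below reproduces a few rows under that assumption. The rule is inferred from the printed output, not taken from the package source, and the `< 2` cutoff is likewise an assumption.

```r
# Ratings for five observations from the output above;
# depth uses integer class codes: shallow = 2, moderately deep = 3, deep = 4
obs <- data.frame(
  id          = c("B1", "A1", "E1", "F1", "I2"),
  taxpartsize = c(7, 7, 5, 5, 15),
  depth       = c(4, 3, 4, 2, 3)
)
ref <- c(taxpartsize = 7, depth = 4)  # the "7.deep" reference condition

# Absolute difference from the reference, property by property
d <- abs(sweep(as.matrix(obs[, names(ref)]), 2, ref))

obs$similar_dist   <- unname(rowSums(d))        # total difference
obs$similar_single <- unname(apply(d, 1, max))  # largest single-property difference
obs$similar        <- obs$similar_single < 2    # assumed threshold rule
obs
```

Under these assumptions the five rows match the values shown for B1, A1, E1, F1 and I2 in `res0`.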

Next, let’s take the remaining dissimilar soils, and re-apply the similarity criteria based on the next-most-dominant condition.

y <- subset(x, !res0$similar, select = c("id", "taxpartsize", "depth"))

res1 <- similar_soils(y, m)
#> comparing to dominant reference condition (`13.moderately deep` on 5 rows)
res1
#>    id taxpartsize           depth similar_dist similar_single
#> 5  E1           5            deep            9              8
#> 6  F1           5         shallow            9              8
#> 7  G1           5 moderately deep            8              8
#> 8  H1          13            deep            1              1
#> 9  I1          13 moderately deep            0              0
#> 10 J1          13 moderately deep            0              0
#> 15 E2           5         shallow            9              8
#> 16 F2           5 moderately deep            8              8
#> 17 G2           5            deep            9              8
#> 18 H2          13 moderately deep            0              0
#> 19 I2          15 moderately deep            2              2
#> 20 J2          13 moderately deep            0              0
#> 22 B3           5         shallow            9              8
#> 25 E3           5            deep            9              8
#> 26 F3           5 moderately deep            8              8
#> 27 G3           5            deep            9              8
#> 28 H3          13            deep            1              1
#> 29 I3          15            deep            3              2
#> 30 J3          13 moderately deep            0              0
#>                 group similar
#> 5              5.deep   FALSE
#> 6           5.shallow   FALSE
#> 7   5.moderately deep   FALSE
#> 8             13.deep    TRUE
#> 9  13.moderately deep    TRUE
#> 10 13.moderately deep    TRUE
#> 15          5.shallow   FALSE
#> 16  5.moderately deep   FALSE
#> 17             5.deep   FALSE
#> 18 13.moderately deep    TRUE
#> 19 15.moderately deep   FALSE
#> 20 13.moderately deep    TRUE
#> 22          5.shallow   FALSE
#> 25             5.deep   FALSE
#> 26  5.moderately deep   FALSE
#> 27             5.deep   FALSE
#> 28            13.deep    TRUE
#> 29            15.deep   FALSE
#> 30 13.moderately deep    TRUE

At this second step, "13.moderately deep" (skeletal, moderately deep) is the dominant condition; "13.deep" (skeletal, deep) is also identified as similar.

One might consider which one of these is the best representative condition for the map unit (including unobserved areas) regardless of what is “dominant” in the observation data.

If there are issues with dissimilar soils being included in the same groups, consider revising the rating functions to ensure dissimilar properties have a distance greater than the set threshold (thresh_single). With similar_soils(), you can specify an alternate condition to compare against, or a thresh_single value higher or lower than 2.
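
To see how the threshold could change group membership, the sketch below takes a few `similar_single` values from the second step (relative to "13.moderately deep") and compares them against two thresholds. It assumes a soil is similar when its largest single-property difference is strictly below the threshold, which matches the printed results but is not confirmed from the package documentation.

```r
# similar_single values relative to "13.moderately deep", copied from res1 above
single <- c(`13.deep` = 1, `15.moderately deep` = 2, `15.deep` = 2, `5.deep` = 8)

# Assumed rule: similar when the largest single difference is below the threshold
by_thresh <- sapply(c(`thresh 2` = 2, `thresh 3` = 3), function(th) single < th)
by_thresh
```

With the higher threshold the two fragmental conditions would join the skeletal group, while the loamy condition remains dissimilar at either setting.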

z <- subset(x, !x$id %in% c(res0$id[res0$similar], res1$id[res1$similar]),
            select = c("id", "taxpartsize", "depth"))
res2 <- similar_soils(z, m)
#> comparing to dominant reference condition (`5.deep` on 4 rows)
res2
#>    id taxpartsize           depth similar_dist similar_single
#> 5  E1           5            deep            0              0
#> 6  F1           5         shallow            2              2
#> 7  G1           5 moderately deep            1              1
#> 15 E2           5         shallow            2              2
#> 16 F2           5 moderately deep            1              1
#> 17 G2           5            deep            0              0
#> 19 I2          15 moderately deep           11             10
#> 22 B3           5         shallow            2              2
#> 25 E3           5            deep            0              0
#> 26 F3           5 moderately deep            1              1
#> 27 G3           5            deep            0              0
#> 29 I3          15            deep           10             10
#>                 group similar
#> 5              5.deep    TRUE
#> 6           5.shallow   FALSE
#> 7   5.moderately deep    TRUE
#> 15          5.shallow   FALSE
#> 16  5.moderately deep    TRUE
#> 17             5.deep    TRUE
#> 19 15.moderately deep   FALSE
#> 22          5.shallow   FALSE
#> 25             5.deep    TRUE
#> 26  5.moderately deep    TRUE
#> 27             5.deep    TRUE
#> 29            15.deep   FALSE

Applying the similar soils criteria again, we find "5.deep" (loamy or coarse-loamy, deep) is the next most dominant condition, and "5.moderately deep" is similar to it.

We are left with a few soils that are not similar to any of the prior 3 sets:

subset(res2, !similar)
#>    id taxpartsize           depth similar_dist similar_single
#> 6  F1           5         shallow            2              2
#> 15 E2           5         shallow            2              2
#> 19 I2          15 moderately deep           11             10
#> 22 B3           5         shallow            2              2
#> 29 I3          15            deep           10             10
#>                 group similar
#> 6           5.shallow   FALSE
#> 15          5.shallow   FALSE
#> 19 15.moderately deep   FALSE
#> 22          5.shallow   FALSE
#> 29            15.deep   FALSE

Of the 5 remaining soils, 3 are "5.shallow" (loamy, shallow) and 2 are fragmental ("15.moderately deep" and "15.deep").
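
A quick tally by group makes the composition of this leftover set explicit. The sketch below re-enters the `id` and `group` values printed above by hand so that it runs on its own; with the objects from this article in memory, `table(subset(res2, !similar)$group)` would presumably give the same counts.

```r
# id and group values copied from the subset(res2, !similar) output above
leftover <- data.frame(
  id    = c("F1", "E2", "I2", "B3", "I3"),
  group = c("5.shallow", "5.shallow", "15.moderately deep", "5.shallow", "15.deep")
)

counts <- sort(table(leftover$group), decreasing = TRUE)
counts
```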

Let’s construct a data.frame with the 4 groups of similar soils, each identified with a Greek letter. We will see which is the most prevalent overall based on the whole data set. We could also assess prevalence within individual transects.

fin <- do.call('rbind', list(
  data.frame(component = greekletters[[1]][1], subset(res0, similar)),
  data.frame(component = greekletters[[1]][2], subset(res1, similar)),
  data.frame(component = greekletters[[1]][3], subset(res2, similar)),
  data.frame(component = greekletters[[1]][4], subset(res2, !similar))
))

# label any unassigned observations
una <- subset(res0, !res0$id %in% fin$id)
if (nrow(una) > 0) {
  fin <- rbind(fin, data.frame(component = "unassigned", una))
}

# put in original order of dataset
fin <- fin[match(x$id, fin$id), ]

We can tabulate the assignments we made and see how that corresponds with our concept for the relative abundance of the soils on the landscape in the typical delineation.

res <- sort(prop.table(table(fin$component)), decreasing = TRUE)
res
#> 
#>     Alpha      Beta     Gamma     Delta 
#> 0.3666667 0.2333333 0.2333333 0.1666667

In this case, we see Alpha, Beta and Gamma as major components, and Delta as a lesser component. However, Delta (shallow or fragmental soils, which are likely strongly contrasting) is dissimilar to all of the prior soils, so if we accept the observation proportions as map unit component percentages we would have a map unit with four major components.
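
Prevalence within individual transects can be tabulated the same way. The sketch below re-enters the component assignments from this article by hand and assumes that the trailing digit of each `id` identifies the transect; that reading of the IDs is an assumption, and `fin_toy` is a hypothetical stand-in for `fin`.

```r
# Component assignments copied from the results in this article
fin_toy <- data.frame(
  id = paste0(rep(LETTERS[1:10], times = 3), rep(1:3, each = 10)),
  component = c(
    "Alpha", "Alpha", "Alpha", "Alpha", "Gamma", "Delta", "Gamma", "Beta", "Beta", "Beta",
    "Alpha", "Alpha", "Alpha", "Alpha", "Delta", "Gamma", "Gamma", "Beta", "Delta", "Beta",
    "Alpha", "Delta", "Alpha", "Alpha", "Gamma", "Gamma", "Gamma", "Beta", "Delta", "Beta"
  )
)

# Assumption: the digit after the letter is the transect number
fin_toy$transect <- sub("^[A-Z]", "", fin_toy$id)

# Row-wise proportions: each transect sums to 1
tab <- prop.table(table(fin_toy$transect, fin_toy$component), margin = 1)
round(tab, 2)
```

Comparing rows of `tab` shows whether the overall proportions hold up within each transect or are driven by one of them.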

It may be that there are other miscellaneous areas, or more rarely observed, unique, contrasting soils that were not captured as distinct by the rating functions. These could be split off either by creating a rating to capture them or by noting their presence as minor components.

# TODO: abstract this concept
cmp <- subset(fin, component == names(res[1]))
ref <- names(tail(sort(table(
  interaction(cmp$taxpartsize, cmp$depth)
)), 1))
fin_sim <- similar_soils(x, m, ref)
#> comparing to dominant reference condition (`7.deep` on 7 rows)

# transfer similarity distance and similar ranking
fin$similar_dist <- fin_sim$similar_dist
fin$similar <- fin_sim$similar # similarity to the dominant condition within Alpha

# original sort order
fin
#>    component id taxpartsize           depth similar_dist similar_single
#> 1      Alpha A1           7 moderately deep            1              1
#> 2      Alpha B1           7            deep            0              0
#> 3      Alpha C1           7 moderately deep            1              1
#> 4      Alpha D1           7            deep            0              0
#> 5      Gamma E1           5            deep            2              0
#> 6      Delta F1           5         shallow            4              2
#> 7      Gamma G1           5 moderately deep            3              1
#> 8       Beta H1          13            deep            6              1
#> 9       Beta I1          13 moderately deep            7              0
#> 10      Beta J1          13 moderately deep            7              0
#> 11     Alpha A2           7            deep            0              0
#> 12     Alpha B2           7            deep            0              0
#> 13     Alpha C2           7            deep            0              0
#> 14     Alpha D2           7            deep            0              0
#> 15     Delta E2           5         shallow            4              2
#> 16     Gamma F2           5 moderately deep            3              1
#> 17     Gamma G2           5            deep            2              0
#> 18      Beta H2          13 moderately deep            7              0
#> 19     Delta I2          15 moderately deep            9             10
#> 20      Beta J2          13 moderately deep            7              0
#> 21     Alpha A3           7            deep            0              0
#> 22     Delta B3           5         shallow            4              2
#> 23     Alpha C3           7 moderately deep            1              1
#> 24     Alpha D3           7 moderately deep            1              1
#> 25     Gamma E3           5            deep            2              0
#> 26     Gamma F3           5 moderately deep            3              1
#> 27     Gamma G3           5            deep            2              0
#> 28      Beta H3          13            deep            6              1
#> 29     Delta I3          15            deep            8             10
#> 30      Beta J3          13 moderately deep            7              0
#>                 group similar
#> 1   7.moderately deep    TRUE
#> 2              7.deep    TRUE
#> 3   7.moderately deep    TRUE
#> 4              7.deep    TRUE
#> 5              5.deep   FALSE
#> 6           5.shallow   FALSE
#> 7   5.moderately deep   FALSE
#> 8             13.deep   FALSE
#> 9  13.moderately deep   FALSE
#> 10 13.moderately deep   FALSE
#> 11             7.deep    TRUE
#> 12             7.deep    TRUE
#> 13             7.deep    TRUE
#> 14             7.deep    TRUE
#> 15          5.shallow   FALSE
#> 16  5.moderately deep   FALSE
#> 17             5.deep   FALSE
#> 18 13.moderately deep   FALSE
#> 19 15.moderately deep   FALSE
#> 20 13.moderately deep   FALSE
#> 21             7.deep    TRUE
#> 22          5.shallow   FALSE
#> 23  7.moderately deep    TRUE
#> 24  7.moderately deep    TRUE
#> 25             5.deep   FALSE
#> 26  5.moderately deep   FALSE
#> 27             5.deep   FALSE
#> 28            13.deep   FALSE
#> 29            15.deep   FALSE
#> 30 13.moderately deep   FALSE

The same table, sorted by similar_dist:

   component id taxpartsize           depth similar_dist similar_single              group similar
2      Alpha B1           7            deep            0              0             7.deep    TRUE
4      Alpha D1           7            deep            0              0             7.deep    TRUE
11     Alpha A2           7            deep            0              0             7.deep    TRUE
12     Alpha B2           7            deep            0              0             7.deep    TRUE
13     Alpha C2           7            deep            0              0             7.deep    TRUE
14     Alpha D2           7            deep            0              0             7.deep    TRUE
21     Alpha A3           7            deep            0              0             7.deep    TRUE
1      Alpha A1           7 moderately deep            1              1  7.moderately deep    TRUE
3      Alpha C1           7 moderately deep            1              1  7.moderately deep    TRUE
23     Alpha C3           7 moderately deep            1              1  7.moderately deep    TRUE
24     Alpha D3           7 moderately deep            1              1  7.moderately deep    TRUE
5      Gamma E1           5            deep            2              0             5.deep   FALSE
17     Gamma G2           5            deep            2              0             5.deep   FALSE
25     Gamma E3           5            deep            2              0             5.deep   FALSE
27     Gamma G3           5            deep            2              0             5.deep   FALSE
7      Gamma G1           5 moderately deep            3              1  5.moderately deep   FALSE
16     Gamma F2           5 moderately deep            3              1  5.moderately deep   FALSE
26     Gamma F3           5 moderately deep            3              1  5.moderately deep   FALSE
6      Delta F1           5         shallow            4              2          5.shallow   FALSE
15     Delta E2           5         shallow            4              2          5.shallow   FALSE
22     Delta B3           5         shallow            4              2          5.shallow   FALSE
8       Beta H1          13            deep            6              1            13.deep   FALSE
28      Beta H3          13            deep            6              1            13.deep   FALSE
9       Beta I1          13 moderately deep            7              0 13.moderately deep   FALSE
10      Beta J1          13 moderately deep            7              0 13.moderately deep   FALSE
18      Beta H2          13 moderately deep            7              0 13.moderately deep   FALSE
20      Beta J2          13 moderately deep            7              0 13.moderately deep   FALSE
30      Beta J3          13 moderately deep            7              0 13.moderately deep   FALSE
29     Delta I3          15            deep            8             10            15.deep   FALSE
19     Delta I2          15 moderately deep            9             10 15.moderately deep   FALSE
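
The dominant-condition lookup marked TODO in the code above could be wrapped in a small helper. The function below, `dominant_condition()`, is hypothetical and not part of SOILmilaR; it is a minimal sketch that returns the most frequent combination of the given columns, joined with "." in the same way as the `group` labels.

```r
# Hypothetical helper, not part of SOILmilaR
dominant_condition <- function(df, cols) {
  # Count each observed combination of the selected columns
  tab <- table(interaction(df[cols], drop = TRUE))
  # Label of the most frequent combination (first one wins on ties)
  names(tab)[which.max(tab)]
}

# Toy ratings, invented for illustration
toy <- data.frame(
  taxpartsize = c(7, 7, 7, 5, 13),
  depth       = c("deep", "deep", "moderately deep", "deep", "deep")
)
dominant_condition(toy, c("taxpartsize", "depth"))
```

Note that `which.max()` returns the first of any tied conditions, whereas the `tail(sort(table(...)), 1)` idiom used above returns the last, so the two can disagree when counts are tied.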

This process of iteratively filtering and grouping soils into similarity groups can be automated using the design_mapunit() function.