Synthetic Transects • SOILmilaR

library(SOILmilaR)

Set pseudorandom number generator seed for consistent results between runs (optional).

set.seed(123)

Next we will generate some sample data representing 5 “synthetic transects”.

We assume a similar “soil forming function” for these transects across each hypothetical “delineation” being transected.

Here the values have been customized so that taxonomic particle size is fine-loamy, but coarser textured in higher fragment material. Soil depth varies from shallow to very deep, uniform but centered around moderately deep. pscs_* quantities provided for example numeric quantities that can be used.

x <- do.call('rbind', lapply(1:3, \(i) data.frame(id = paste0(LETTERS[1:10], i),
           taxpartsize = c("fine-loamy","loamy","fine-loamy","fine-loamy",
                           "coarse-loamy", "coarse-loamy", "coarse-loamy",
                           "loamy-skeletal", "loamy-skeletal", "loamy-skeletal"),
           depth = runif(10, 35, 150),
           pscs_clay = c(runif(4, 18, 35), runif(6, 14, 18)),
           pscs_frags = c(runif(3, 0, 15), runif(4, 10, 34),
                          runif(3, 35, 60) + c(0, 40, 0)))))

Next we define some rating functions for particle size class and depth class.

rate_taxpartsize <- function(x) {
  dplyr::case_match(x,
                    c("sandy-skeletal") ~ 1,
                    c("sandy") ~ 2,
                    c("loamy", "coarse-loamy", "coarse-silty") ~ 3,
                    c("fine-loamy", "fine-silty") ~ 4,
                    c("clayey", "fine") ~ 5,
                    c("very-fine") ~ 6,
                    c("loamy-skeletal", "clayey-skeletal") ~ 7)
}

rate_depthclass <- function(x,
                            breaks = c(
                              `very shallow` = 25,
                              `shallow` = 50,
                              `moderately deep` = 100,
                              `deep` = 150,
                              `very deep` = 1e4
                            ),
                            ...) {
  res <- cut(x, c(0, breaks))
  factor(res, levels = levels(res), labels = names(breaks))
}

The above rating functions can be combined in a list (m) which will be use used as the mapping argument to similar_soils(). The similar_soils() function applies the rating functions to the columns of the input data x. Target column names in the data match the names of m, providing the “mapping” of data to rating functions.

Now we will demonstrate iterative filtering and application of similar soils criteria.

m <- list(taxpartsize = rate_taxpartsize,
          depth =  rate_depthclass)

res0 <- similar_soils(x, m)
#> comparing to dominant reference condition (`7.3` on 6 rows)
res0
#>    id taxpartsize depth similar_dist similar
#> 1  A1           4     3            3   FALSE
#> 2  B1           3     4            5   FALSE
#> 3  C1           4     3            3   FALSE
#> 4  D1           4     4            4   FALSE
#> 5  E1           3     4            5   FALSE
#> 6  F1           3     2            5   FALSE
#> 7  G1           3     3            4   FALSE
#> 8  H1           7     4            1    TRUE
#> 9  I1           7     3            0    TRUE
#> 10 J1           7     3            0    TRUE
#> 11 A2           4     4            4   FALSE
#> 12 B2           3     4            5   FALSE
#> 13 C2           4     4            4   FALSE
#> 14 D2           4     4            4   FALSE
#> 15 E2           3     2            5   FALSE
#> 16 F2           3     3            4   FALSE
#> 17 G2           3     4            5   FALSE
#> 18 H2           7     3            0    TRUE
#> 19 I2           7     3            0    TRUE
#> 20 J2           7     3            0    TRUE
#> 21 A3           4     4            4   FALSE
#> 22 B3           3     2            5   FALSE
#> 23 C3           4     3            3   FALSE
#> 24 D3           4     3            3   FALSE
#> 25 E3           3     4            5   FALSE
#> 26 F3           3     3            4   FALSE
#> 27 G3           3     4            5   FALSE
#> 28 H3           7     4            1    TRUE
#> 29 I3           7     4            1    TRUE
#> 30 J3           7     3            0    TRUE

First, we accept what is identified as the “dominant condition” for the dataset, including the similar soils, will be a major component of the hypothetical mapunit. Identifying soils similar to "7.3" (moderately deep, skeletal), corresponds to the soils supporting a hypothetical major component in a mapunit.

We might consider selecting a different reference condition manually after inspection. If we were to do that we could set, for example, condition="4.4" to select condition "4.4" rather than "7.3" that was automatically selected in this example.

Next, let’s take the remaining dissimilar soils, and re-apply the similarity criteria based on the next-most-dominant condition. We could consider loosening the concept of “similarity” by setting thresh=2.

y <- subset(x, !res0$similar, select = c("id", "taxpartsize", "depth"))

res1 <- similar_soils(y, m)
#> comparing to dominant reference condition (`3.4` on 6 rows)
res1
#>    id taxpartsize depth similar_dist similar
#> 1  A1           4     3            2   FALSE
#> 2  B1           3     4            0    TRUE
#> 3  C1           4     3            2   FALSE
#> 4  D1           4     4            1    TRUE
#> 5  E1           3     4            0    TRUE
#> 6  F1           3     2            2   FALSE
#> 7  G1           3     3            1    TRUE
#> 11 A2           4     4            1    TRUE
#> 12 B2           3     4            0    TRUE
#> 13 C2           4     4            1    TRUE
#> 14 D2           4     4            1    TRUE
#> 15 E2           3     2            2   FALSE
#> 16 F2           3     3            1    TRUE
#> 17 G2           3     4            0    TRUE
#> 21 A3           4     4            1    TRUE
#> 22 B3           3     2            2   FALSE
#> 23 C3           4     3            2   FALSE
#> 24 D3           4     3            2   FALSE
#> 25 E3           3     4            0    TRUE
#> 26 F3           3     3            1    TRUE
#> 27 G3           3     4            0    TRUE

At this second step, "3.4" (loamy, moderately deep) is the dominant condition, also identified as similar are "3.4" (coarse-loamy, deep). One might consider which one of these is the best representative condition for the mapunit (including unobserved areas) regardless of what is “dominant” per se.

If there are issues with dissimilar soils being included in the same groups, consider revising the rating functions to ensure dissimilar properties have a distance greater than the set threshold (thresh). With similar_soils(), you can specify an alternate condition to compare against, or a thresh value higher or lower than 1.

z <- subset(x, !x$id %in% c(res0$id[res0$similar], res1$id[res1$similar]),
            select = c("id", "taxpartsize", "depth"))
res2 <- similar_soils(z, m)
#> comparing to dominant reference condition (`4.3` on 4 rows)
res2
#>    id taxpartsize depth similar_dist similar
#> 1  A1           4     3            0    TRUE
#> 3  C1           4     3            0    TRUE
#> 6  F1           3     2            2   FALSE
#> 15 E2           3     2            2   FALSE
#> 22 B3           3     2            2   FALSE
#> 23 C3           4     3            0    TRUE
#> 24 D3           4     3            0    TRUE

Applying the similar soils criteria a final time, we are left with two components, "4.3" and "3.2"; the latter is dissimilar (and limiting).

Let’s reconstruct a data.frame, with the 4 groups of similar soils each identified with a greek letter. We will see which is the most prevalent overall based on the whole dataset. We could also assess prevalence within individual transects.

fin <- do.call('rbind', list(
  data.frame(component = greekletters[[1]][1], subset(res0, similar)),
  data.frame(component = greekletters[[1]][2], subset(res1, similar)),
  data.frame(component = greekletters[[1]][3], subset(res2, similar)),
  data.frame(component = greekletters[[1]][4], subset(res2, !similar))
))

# label any unassigned observations
una <- subset(res0, !res0$id %in% fin$id)
if (nrow(una) > 0) {
  fin <- rbind(fin, data.frame(component = "unassigned", una))
}

# put in original order of dataset
fin <- fin[match(x$id, fin$id), ]

We can tabulate the assignments we made and see how that corresponds with our concept for the relative abundance of the soils on the landscape in the typical delineation.

res <- sort(prop.table(table(fin$component)), decreasing = TRUE)
res
#> 
#>      Beta     Alpha     Gamma     Delta 
#> 0.4666667 0.3000000 0.1333333 0.1000000

# TODO: abstract this concept
cmp <- subset(fin, component == names(res[1]))
ref <- names(tail(sort(table(
  interaction(cmp$taxpartsize, cmp$depth)
)), 1))
fin_sim <- similar_soils(x, m, ref)
#> comparing to dominant reference condition (`3.4` on 6 rows)

# transfer similarity distance and similar ranking
fin$similar_dist <- fin_sim$similar_dist
fin$similar <- fin_sim$similar # similarity to the dominant condition within Beta

# original sort order
fin
#>    component id taxpartsize depth similar_dist similar
#> 1      Gamma A1           4     3            2   FALSE
#> 2       Beta B1           3     4            0    TRUE
#> 3      Gamma C1           4     3            2   FALSE
#> 4       Beta D1           4     4            1    TRUE
#> 5       Beta E1           3     4            0    TRUE
#> 6      Delta F1           3     2            2   FALSE
#> 7       Beta G1           3     3            1    TRUE
#> 8      Alpha H1           7     4            4   FALSE
#> 9      Alpha I1           7     3            5   FALSE
#> 10     Alpha J1           7     3            5   FALSE
#> 11      Beta A2           4     4            1    TRUE
#> 12      Beta B2           3     4            0    TRUE
#> 13      Beta C2           4     4            1    TRUE
#> 14      Beta D2           4     4            1    TRUE
#> 15     Delta E2           3     2            2   FALSE
#> 16      Beta F2           3     3            1    TRUE
#> 17      Beta G2           3     4            0    TRUE
#> 18     Alpha H2           7     3            5   FALSE
#> 19     Alpha I2           7     3            5   FALSE
#> 20     Alpha J2           7     3            5   FALSE
#> 21      Beta A3           4     4            1    TRUE
#> 22     Delta B3           3     2            2   FALSE
#> 23     Gamma C3           4     3            2   FALSE
#> 24     Gamma D3           4     3            2   FALSE
#> 25      Beta E3           3     4            0    TRUE
#> 26      Beta F3           3     3            1    TRUE
#> 27      Beta G3           3     4            0    TRUE
#> 28     Alpha H3           7     4            4   FALSE
#> 29     Alpha I3           7     4            4   FALSE
#> 30     Alpha J3           7     3            5   FALSE

	component	id	taxpartsize	depth	similar_dist	similar
2	Beta	B1	3	4	0	TRUE
5	Beta	E1	3	4	0	TRUE
12	Beta	B2	3	4	0	TRUE
17	Beta	G2	3	4	0	TRUE
25	Beta	E3	3	4	0	TRUE
27	Beta	G3	3	4	0	TRUE
4	Beta	D1	4	4	1	TRUE
7	Beta	G1	3	3	1	TRUE
11	Beta	A2	4	4	1	TRUE
13	Beta	C2	4	4	1	TRUE
14	Beta	D2	4	4	1	TRUE
16	Beta	F2	3	3	1	TRUE
21	Beta	A3	4	4	1	TRUE
26	Beta	F3	3	3	1	TRUE
1	Gamma	A1	4	3	2	FALSE
3	Gamma	C1	4	3	2	FALSE
6	Delta	F1	3	2	2	FALSE
15	Delta	E2	3	2	2	FALSE
22	Delta	B3	3	2	2	FALSE
23	Gamma	C3	4	3	2	FALSE
24	Gamma	D3	4	3	2	FALSE
8	Alpha	H1	7	4	4	FALSE
28	Alpha	H3	7	4	4	FALSE
29	Alpha	I3	7	4	4	FALSE
9	Alpha	I1	7	3	5	FALSE
10	Alpha	J1	7	3	5	FALSE
18	Alpha	H2	7	3	5	FALSE
19	Alpha	I2	7	3	5	FALSE
20	Alpha	J2	7	3	5	FALSE
30	Alpha	J3	7	3	5	FALSE