R zonal statistics for classes

I am trying to extract zonal statistics for different classes in R.
I have a raster with two classes (0,1) and need the area (or percentage) of each class under the area of each polygon. I tried exactextractr::exact_extract but I don’t manage to extract a frequency of each class. I can get the sum of class 1, but that wont tell me the proportions.

Geographic Information Systems Asked on November 12, 2021

2 Answers

2 Answers

You can write a simple function using prop.table and table to return proportions of multiple classes. The catch is that you have to know what all of the classes are before hand so you can fix the number of expected classes.

Here is an example of what is going on.

Here we set our "known" classes and then set up a loop that randomly samples a vector of 1:10 (some values may be missing in a given iteration). We can take the know classes and create an empty factor in x and then calculate our class proportions. If a value is missing then the resulting freq is 0.

classes <- 1:10

p <- list()
  for(i in 1:10) {
    x <- sample(1:10, 10, replace=TRUE)
    p[[i]] <-, levels = classes))))

Now we can expand this idea to zonal statistics using exact_extract.

Add libraries and create some example data


r <- raster(nrows=180, ncols=360, xmn=571823.6, xmx=616763.6, ymn=4423540, 
            ymx=4453690, resolution=270, crs = CRS("+proj=utm +zone=12 +datum=NAD83 
            +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"))

  r[] <- rpois(ncell(r), lambda=1)

x <- gBuffer(sampleRandom(r, 10, na.rm = TRUE, sp = TRUE),  
             byid = TRUE, width = 1000) 
[email protected] <- data.frame([email protected], ID=paste0("poly", 1:nrow(x)))

  plot(x, add=TRUE)

Now we extract the data and use lapply to apply a function to the resulting list object. We create the known classes by using unique on the raster object. Because you have to read the raster into memory, this could be a real processing bottleneck though.

( e <- exact_extract(r, as(x, "sf")) )   

classes <- sort(unique(r[]))
cp <- lapply(e, FUN=function(x) {[,1], 
                          levels = classes))))} ) 
names(cp) <- x$ID   

You can perform some fancy data wrangling to get a data.frame that relates back to your polygons using a simple for loop with transpose. I set up an empty data.frame first so I can populate it using a simple assignment.

props <- data.frame(matrix(vector(), length(cp), length(classes)+1,
                    dimnames=list(c(), c("ID", paste0("class_",classes)))))
  props$ID <- names(cp)  
for(i in 1:length(cp)){ props[i,][2:ncol(props)] <- t(cp[[i]][,2]) }

Answered by Jeffrey Evans on November 12, 2021

You can get the frequency of a single class by passing a custom summary function to exact_extract. For example, to get the fraction of pixels that have a value of 1, you could run:

exact_extract(rast, polys, function(value, fraction) {
     sum(fraction[value == 1]) / sum(fraction)

If you have an arbitrary number of classes, here's a solution that will provide the frequencies of each class in each polygon. It does not require knowing the classes in advance, nor does it require loading all intersected pixels into memory.


freqs <- exact_extract(r, g, function(value, coverage_fraction) {
  data.frame(value = value,
             frac = coverage_fraction / sum(coverage_fraction)) %>%
    group_by(value) %>%
    summarize(freq = sum(frac), .groups = 'drop') %>%
    pivot_wider(names_from = 'value',
                names_prefix = 'freq_',
                values_from = 'freq')
}) %>% 
  mutate(across(starts_with('freq'), replace_na, 0))

Basically, we provide a function to exact_extract that returns a one-row data frame for each polygon, with a column containing the frequency of each class found in that polygon. Doing this with a callback (i.e., specifying the fun argument) is important, because otherwise R must store every pixel that intersects every polygon in memory at the same time. With the callback, these pixels are reduced to a frequency table as each polygon is processed. Internally, exact_extract uses dplyr::bind_rows to merge the data frames for each polygon, which handles the fact that not all classes are present in each data frame. However, it fills in NA for the frequency of missing classes, so we use a final mutate call to replace these with zero.

Answered by dbaston on November 12, 2021

Add your own answers!

Related Questions

Adding shapefiles to PostGIS database

3  Asked on July 1, 2021 by sam007


Copying coordinates under mouse to clipboard in QGIS

3  Asked on July 1, 2021 by mefistotelis


QGIS SRS storage

0  Asked on July 1, 2021 by sergey-n


LiDAR XYZ Data Cleaning

0  Asked on July 1, 2021 by santosh-gujar


Ask a Question

Get help from others!

© 2021 All rights reserved.