---
title: "A weighted co-occurrence matrix (wecoma) representation"
author: "Jakub Nowosad"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{(2) A weighted co-occurrence matrix (wecoma) representation}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

This vignette explains what a weighted co-occurrence matrix (*wecoma*) representation is and how to calculate it using the **comat** package.
If you do not know what a co-occurrence matrix is, it may be worth reading [the first package vignette](coma.html) first.
The examples below assume the **comat** package is attached, and the `raster_x` and `raster_w` datasets are loaded:

```{r setup}
library(comat)
data(raster_x, package = "comat")
data(raster_w, package = "comat")
```

The `raster_x` object is a matrix with three rows and three columns, containing the values 1, 2, and 3.

```{r}
raster_x
```

We can imagine that the value of 1 (blueish color) represents population A, the value of 2 (dark green) represents population B, and the value of 3 (light green) represents population C.

```{r, echo=FALSE, fig.width=4, fig.height=4}
op = par(mar = rep(0, 4))
raster_x2 = apply(raster_x, 2, rev)
image(1:3, 1:3, t(raster_x2),
      col = c("#5F4690", "#1D6996", "#38A6A5", "#0F8554", "#73AF48", "#EDAD08", "#E17C05", "#CC503E", "#94346E", "#6F4070", "#994E95", "#666666")[3:5],
      axes = FALSE, xlab = "", ylab = "")
par(op)
```

The `raster_w` object is a matrix of the same dimensions.
It has values between 2 and 9.

```{r}
raster_w
```

This object differs from the first one, as it does not represent categories.
Its role is to provide weights for the cells of the previous raster.
We can think of it as the number of occurrences of a population in each cell.

```{r, echo=FALSE, fig.width=4, fig.height=4}
op = par(mar = rep(0, 4))
raster_w2 = apply(raster_w, 2, rev)
image(1:3, 1:3, t(raster_w2),
      col = rev(c("#3D1778", "#583A99", "#715DAA", "#8B7EBB", "#A69DCC", "#BFBADD", "#D8D4EC", "#EDECF9")),
      axes = FALSE, xlab = "", ylab = "")
par(op)
```

The weighted co-occurrence matrix (*wecoma*) representation is a modification of the co-occurrence matrix (*coma*).
In the co-occurrence matrix, each adjacency contributes the constant value 1 to the output.
In the weighted co-occurrence matrix, on the other hand, each adjacency contributes a value derived from the weight matrix.
By default, the contributed value is the average of the weights in the two adjacent cells.

We can use the `get_wecoma()` function to calculate this weighted co-occurrence matrix (*wecoma*) representation:

```{r}
get_wecoma(raster_x, raster_w)
```

In this representation, we do not count the neighbors but sum the contributed values from the weight matrix.
The smallest value (5) represents the relation between adjacent cells of the first and the second category.
This is due to the relatively small weights of the neighboring cells of these classes, but also because there is only one case of adjacent cells of these classes.
The central left cell (blueish, category 1) has a weight of 6, and the bottom left cell (dark green, category 2) has a weight of 4.
The output value, 5, is the average of these two adjacent weights.
On the other hand, the light green region has the largest values in the weight matrix.
Therefore, the output of the `get_wecoma()` function has the largest value (49) for the relation between adjacent cells of the third category.

This function can be adjusted using two additional arguments: `fun` and `na_action`.
The `fun` argument selects the function used to calculate the value that a pair of adjacent cells contributes to the output matrix.
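
To make the role of `fun` more concrete, the chunk below is a minimal sketch of how a single adjacency could contribute to the output under each of the options described in this section. The `adjacency_contribution()` helper, along with its `w_focal` and `w_neighbor` arguments, is hypothetical and written only for illustration. It is not part of **comat** and does not necessarily reflect how the package computes these values internally. The weights 6 and 4 are those of the two adjacent cells discussed above.

```{r}
# Hypothetical helper (not part of comat): value contributed by one adjacency,
# given the weights of the focal cell and of its neighbor.
adjacency_contribution = function(w_focal, w_neighbor,
                                  fun = c("mean", "geometric_mean", "focal")) {
  fun = match.arg(fun)
  switch(fun,
    mean = (w_focal + w_neighbor) / 2,            # arithmetic mean of both weights
    geometric_mean = sqrt(w_focal * w_neighbor),  # geometric mean of both weights
    focal = w_focal                               # weight of the focal cell only
  )
}

# Weights of the adjacent category 1 and category 2 cells (6 and 4)
adjacency_contribution(6, 4, fun = "mean")
adjacency_contribution(6, 4, fun = "geometric_mean")
adjacency_contribution(6, 4, fun = "focal")
```

With `fun = "mean"`, the sketch returns 5, which matches the value reported for the relation between the first and the second category.
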
Three options are available for `fun`: `"mean"` (the average of the weights in the two adjacent cells), `"geometric_mean"` (the geometric mean of these weights), or `"focal"` (the weight of the focal cell).
The `na_action` argument controls how missing values in the weight matrix are handled.
The default, `"replace"`, replaces missing values with 0; `"omit"` does not use cells with missing values; and `"keep"` keeps missing values.

```{r}
get_wecoma(raster_x, raster_w, fun = "focal", na_action = "omit")
```

Similarly to the co-occurrence matrix (*coma*), it is possible to convert *wecoma* to its 1D representation.
This new form is called a weighted co-occurrence vector (*wecove*), and can be created using the `get_wecove()` function, which accepts the output of `get_wecoma()`:

```{r}
my_wecoma = get_wecoma(raster_x, raster_w)
get_wecove(my_wecoma, normalization = "pdf")
```

You can see the weighted co-occurrence matrix (*wecoma*) concept, there described as an *exposure matrix*, in action in the vignettes of the **raceland** package (Nowosad, Dmowska, and Stepinski, 2020):

1. [raceland: R package for a pattern-based, zoneless method for analysis and visualization of racial topography](https://jakubnowosad.com/raceland/articles/raceland-intro1.html)
2. [raceland: Describing local racial patterns of racial landscapes at different spatial scales](https://jakubnowosad.com/raceland/articles/raceland-intro2.html)
3. [raceland: Describing local pattern of racial landscape using SocScape grids](https://jakubnowosad.com/raceland/articles/raceland-intro3.html)

## References

- Jakub Nowosad, Anna Dmowska and Tomasz Stepinski (2020). raceland: Pattern-Based Zoneless Method for Analysis and Visualization of Racial Topography. R package version 1.0.5. https://CRAN.R-project.org/package=raceland