Randomize binary or categorical community matrices using categorical generalizations of binary community null model algorithms. Optionally constrain mixing using spatial (row) and taxonomic (column) weights.
Usage
nullcat(
x,
method = nullcat_methods(),
n_iter = 1000L,
output = c("category", "index"),
swaps = c("auto", "vertical", "horizontal", "alternating"),
wt_row = NULL,
wt_col = NULL,
seed = NULL
)
Arguments
- x
A matrix of categorical data, encoded as integers. Values should represent category or stratum membership for each cell.
- method
Character specifying the randomization algorithm to use. Options include the following; see details and linked functions for more info.
"curvecat": categorical analog to curveball; see curvecat() for details.
"swapcat": categorical analog to swap; see swapcat() for details.
"tswapcat": categorical analog to tswap; see tswapcat() for details.
"r0cat": categorical analog to r0; see r0cat() for details.
"c0cat": categorical analog to c0; see c0cat() for details.
- n_iter
Number of iterations. Default is 1000. Larger values yield more thorough mixing. Ignored for non-sequential methods. Minimum burn-in times can be estimated with suggest_n_iter().
- output
Character indicating type of result to return:
"category" (default) returns the randomized matrix.
"index" returns an index matrix describing where original entries (a.k.a. "tokens") moved. Useful mainly for testing, and for applications like quantize() that care about token tracking in addition to generic integer categories.
- swaps
Character string controlling the direction of token movement. Only used when method is "curvecat", "swapcat", or "tswapcat". Affects the result only when output = "index"; otherwise it affects only computation speed. Options include:
"vertical": Tokens move between rows (stay within columns).
"horizontal": Tokens move between columns (stay within rows).
"alternating": Tokens move in both dimensions, alternating between vertical and horizontal swaps. Provides full 2D mixing without preserving either row or column token sets.
"auto" (default): For output = "category", automatically selects the fastest option based on matrix dimensions. For output = "index", defaults to "alternating" for full mixing. When wt_row or wt_col is supplied, defaults to the appropriate direction, or "alternating" if both are supplied.
- wt_row
An optional square numeric matrix of non-negative weights controlling which pairs of rows are likely to exchange tokens during randomization. Must be nrow(x) by nrow(x). This enables spatially or trait-constrained null models where nearby or similar sites exchange tokens more frequently.
Values are treated as relative weights (not probabilities) and are normalized internally. The diagonal is ignored. The matrix should be symmetric. Only supported for sequential methods (curvecat, swapcat, tswapcat).
When both wt_row and wt_col are supplied, swaps is forced to "alternating", producing a Gibbs-like sweep that applies each weight matrix on its respective margin in alternation.
- wt_col
An optional square numeric matrix of non-negative weights controlling which pairs of columns are likely to exchange tokens during randomization. Must be ncol(x) by ncol(x). See wt_row for details on weight interpretation.
- seed
Integer used to seed the random number generator, for reproducibility.
Value
A matrix of the same dimensions as x, either randomized
categorical values (when output = "category") or an integer index
matrix describing the permutation of entries (when output = "index").
Details
curvecat, swapcat, and tswapcat are sequential algorithms that hold
category multisets fixed in every row and column. These three algorithms
typically reach the same stationary distribution. They differ primarily in
efficiency, with curvecat being the most efficient (i.e. fewest steps to
become fully mixed); swapcat and tswapcat are thus useful mainly for
methodological comparison.
The r0cat algorithm holds category multisets fixed in rows but not columns,
while c0cat does the opposite.
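The row and column constraints described above can be checked directly. The following is a minimal base-R sketch, not part of the package: same_multisets() is a hypothetical helper introduced here for illustration, and the within-row shuffle only mimics the kind of constraint r0cat imposes (row multisets fixed, column multisets free). It is not the package's algorithm.

```r
# Hypothetical helper (not part of nullcat): TRUE if every row (margin = 1)
# or every column (margin = 2) of `a` and `b` holds the same multiset of values.
same_multisets <- function(a, b, margin) {
  stopifnot(identical(dim(a), dim(b)))
  n <- dim(a)[margin]
  all(vapply(seq_len(n), function(i) {
    va <- if (margin == 1) a[i, ] else a[, i]
    vb <- if (margin == 1) b[i, ] else b[, i]
    identical(sort(va), sort(vb))
  }, logical(1)))
}

set.seed(1)
x <- matrix(sample(1:4, 100, replace = TRUE), nrow = 10)

# Naive within-row shuffle: an illustration of a rows-fixed constraint only.
x_rows <- t(apply(x, 1, sample))

same_multisets(x, x_rows, margin = 1)  # row multisets are preserved by construction
same_multisets(x, x_rows, margin = 2)  # column multisets are generally not preserved
```

The same helper could be applied with both margins to the output of a sequential method to confirm that row and column multisets both match the input.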
Note that categorical null models are for cell-level categorical data. Site-level attributes (e.g., land cover) or species-level attributes (e.g., functional traits) should be analyzed using different approaches. See vignette for details.
See also
nullcat_batch() for efficient generation of multiple randomized
matrices; nullcat_commsim() for integration with vegan.
Examples
# Create a categorical matrix
set.seed(123)
x <- matrix(sample(1:4, 100, replace = TRUE), nrow = 10)
# Randomize using curvecat method (preserves row & column margins)
x_rand <- nullcat(x, method = "curvecat", n_iter = 1000)
# Check that row multisets are preserved
all.equal(sort(x[1, ]), sort(x_rand[1, ]))
#> [1] TRUE
# Spatially constrained randomization using row weights
coords <- cbind(runif(10), runif(10))
d <- as.matrix(dist(coords))
W <- exp(-d / 0.3) # exponential distance decay
x_spatial <- nullcat(x, method = "curvecat", n_iter = 1000, wt_row = W)
# Dual-margin weighting (Gibbs-like alternating)
W_row <- exp(-as.matrix(dist(cbind(runif(10), runif(10)))) / 0.3)
W_col <- exp(-as.matrix(dist(cbind(runif(10), runif(10)))) / 0.3)
x_dual <- nullcat(x, method = "curvecat", n_iter = 1000,
                  wt_row = W_row, wt_col = W_col)
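
Before supplying wt_row or wt_col, it can help to confirm that a weight matrix meets the requirements listed under Arguments (square, matching the relevant dimension of x, non-negative, symmetric). A minimal base-R sketch follows. check_weights() is a hypothetical helper, not a package function, and the package may perform its own validation.

```r
# Hypothetical helper (not part of nullcat): checks the documented
# requirements for a weight matrix against a target dimension n.
check_weights <- function(W, n) {
  is.matrix(W) && is.numeric(W) &&
    identical(dim(W), c(as.integer(n), as.integer(n))) &&
    all(W >= 0) &&
    isSymmetric(unname(W))
}

set.seed(42)
x <- matrix(sample(1:4, 100, replace = TRUE), nrow = 10)
coords <- cbind(runif(10), runif(10))
W <- exp(-as.matrix(dist(coords)) / 0.3)  # exponential distance decay

check_weights(W, nrow(x))         # row weights must be nrow(x) by nrow(x)
check_weights(W[1:5, ], nrow(x))  # a non-square matrix fails the check
```

For column weights, the same check would be run against ncol(x) instead of nrow(x).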
