Monday, April 5, 2021

Estimating bias in transferring species distribution models

As some of you may have seen, I had a recent paper come out with Alex Dornburg, Teresa Iglesias, and Katerina Zapfe on the effects of climate change on Australia's only endemic Pokémon, kangaskhan.  


https://besjournals.onlinelibrary.wiley.com/doi/10.1111/2041-210X.13591




While the paper is obviously intended to be humorous (seriously, check out Supplement S1 because it is ridiculous), there's actually a pretty cool new method involved here.  We show that a given study design (i.e., sample size, study area, choice of predictor variables, modeling algorithm, and climate scenario) can create massive biases in the sorts of predictions you might make when building and transferring models.  In some cases these can be so strong that the qualitative prediction you make (e.g., range contraction or expansion) is completely unaffected by the data; the data can only affect the magnitude of the predicted change, not the direction of it.

The super cool bit (in my opinion) is that we show that you can make a fairly simple modification to the Raes and ter Steege (2007) test that allows you to estimate how biased a given design is.  This gives you some idea of which general methodological approaches let the data have the most effect on the outcome, and we even show how you can do this in a spatial context to tell you WHERE your model is more driven by bias and where it's more driven by data.  We think this is a super useful new tool that may give stakeholders some quite valuable information when it comes to applying models to make decisions.

I'll set up a video tutorial on how to do this soon, and eventually we'll probably come up with some sort of wrapper function in ENMTools that simplifies the process.  Right now, though, there are worked examples in the Dryad repo for the supplementary code.  That's here:

Warren, Dan; Dornburg, Alex; Zapfe, Katerina; Iglesias, Teresa (2021), Data and code for analysis of effects of climate change on kangaskhan and summary of simulations from Warren et al. 2020, Dryad, Dataset, https://doi.org/10.5061/dryad.p8cz8w9px
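
In the meantime, here is a toy sketch of the general idea in base R.  It is not the code from the paper or from ENMTools.  It assumes the null replicates are built the way the original Raes and ter Steege test builds them (random points drawn from the study area in place of the real occurrences), and everything else is made up for illustration: the one-dimensional "landscape", the 2-degree warming, the quadratic GLM, and the suitability_change helper.  The spread of changes predicted from random points gives a rough picture of what the design predicts when the data carry no biological signal.

```r
# Toy illustration only: a one-dimensional study area with made-up climate values.
set.seed(42)

n_cells <- 500
present <- data.frame(temp = seq(5, 30, length.out = n_cells))
future  <- data.frame(temp = present$temp + 2)  # assumed uniform 2-degree warming

# Hypothetical helper: fit a presence/background GLM for a set of presence cells,
# then return the change in mean predicted suitability (future minus present).
suitability_change <- function(presence_idx, n_background = 200) {
  background_idx <- sample(n_cells, n_background)
  dat <- data.frame(
    pres = c(rep(1, length(presence_idx)), rep(0, n_background)),
    temp = present$temp[c(presence_idx, background_idx)]
  )
  fit <- glm(pres ~ temp + I(temp^2), data = dat, family = binomial)
  mean(predict(fit, newdata = future, type = "response")) -
    mean(predict(fit, newdata = present, type = "response"))
}

# "Empirical" model: made-up occurrences restricted to cooler cells.
empirical_idx <- sample(which(present$temp > 10 & present$temp < 15), 30)
observed <- suitability_change(empirical_idx)

# Null replicates: same design and sample size, but occurrences are random cells.
null_changes <- replicate(100, suitability_change(sample(n_cells, 30)))

# Compare the change predicted from the data with what the design alone predicts.
print(observed)
print(summary(null_changes))
```

If the null changes pile up well away from zero, the design is pushing predictions in one direction regardless of the data; if the observed change sits inside the null distribution, the data are adding little beyond what the design already implies.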


The "block" crossvalidation feature is currently broken on CRAN

 As part of fixing the recent spatstat-related issues, I somehow managed to roll back some much older changes and as a result broke the block crossvalidation features on the CRAN version of ENMTools.  I'm working on an update now that will fix it, but in the interim if you need that feature please just use the "develop" branch from GitHub.  As before, the code for that is:

install.packages("devtools")

devtools::install_github("danlwarren/ENMTools", ref = "develop")

Sunday, April 4, 2021

ENMTools is back on CRAN, minus ppmlasso models

 We've finished the changes necessary to come up to date with the changes to the spatstat package, and ENMTools is now back on CRAN.  Unfortunately one of the changes we had to make was to disable ppmlasso models, since they're not yet compatible with the new spatstat.  We don't know how long those are going to be unavailable, but it could be a while.

Thursday, April 1, 2021

Specifying regularization multiplier for maxent from ENMTools

 I just got a question about this and I figured I should post about it, since it's quite counterintuitive.  If you need to specify a regularization multiplier in ENMTools for Maxent to use, you need to pass the "-b" flag and the numerical value as two separate elements of the args vector, with the number given as a string.  For instance, if you wanted to model Iberolacerta monticola from the sample data using a regularization multiplier of 5, you'd do this:


library(ENMTools)

mont <- iberolacerta.clade$species$monticola

mont.mx <- enmtools.maxent(mont, euro.worldclim, args = c("-b", "5"))
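
If you want to try several multipliers, one way to avoid typing the flag by hand each time is a small helper that builds the args vector.  This is just a sketch: reg_args is a hypothetical function name, not part of ENMTools, and the commented-out loop assumes the same mont and euro.worldclim objects as above and a working Maxent setup.

```r
# Hypothetical helper (not part of ENMTools): build the two-element args vector
# for a given regularization multiplier, with the number converted to a string.
reg_args <- function(multiplier) {
  c("-b", as.character(multiplier))
}

print(reg_args(5))

# Assumed usage, following the call above (requires ENMTools and Maxent):
# models <- lapply(c(0.5, 1, 2, 5), function(b) {
#   enmtools.maxent(mont, euro.worldclim, args = reg_args(b))
# })
```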