Friday, August 28, 2020

ENMTools version 1.0.1 is on CRAN! Clamping, variable importance, and progress bars!

 

Enhancements

  • Added variable importance tests via interface with the vip package
  • Added clamping for the predict functions, including plots of where clamping is happening
  • Added clamping for model construction functions, with a TRUE/FALSE switch defaulting to TRUE
  • Changed naming conventions for predict functions so that the suitability raster is in the $suitability slot, just as with modeling functions
  • Added progress bars for a lot of tests
  • Added “verbose” option for a lot of functions, defaulting to FALSE
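For anyone unfamiliar with the term, clamping restricts each environmental predictor to the range of values seen in the training data before a model is projected. Here is a minimal base R sketch of that idea. It is not the ENMTools implementation; the function name clamp.to.range and the temperature values are made up for illustration.

```r
# Clamp a vector of predictor values to the range observed in training data.
# (Hypothetical helper for illustration; not part of ENMTools.)
clamp.to.range <- function(values, training.values) {
  lims <- range(training.values, na.rm = TRUE)
  pmin(pmax(values, lims[1]), lims[2])
}

# Made-up temperatures: training data spans 5 to 15
training.temp   <- c(5, 8, 12, 15)
projection.temp <- c(2, 10, 20)

clamped <- clamp.to.range(projection.temp, training.temp)

# TRUE wherever clamping changed a value, which is the kind of
# information a "where is clamping happening" plot would show
was.clamped <- clamped != projection.temp

print(data.frame(projection.temp, clamped, was.clamped))
```

Values inside the training range pass through unchanged; values outside it are pulled back to the nearest limit.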

Bug fixes

  • Fixed interactive.plot generic and moved the function to its own file to make it easier to extend
  • Temporarily suppressing some warnings coming out of leaflet that are being produced by the recent rgdal changes
  • Fixed background sampling code to resample when necessary
  • Changed enmtools.ranger demo code to actually use ranger instead of rf
  • Fixed code for calculating p values for some of the hypothesis tests; the old code was getting wrong answers when there were repeated values
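To see why repeated values matter for this kind of calculation, here is a small base R sketch showing how the treatment of ties changes a one-sided p value. This is only an illustration with made-up numbers; it is neither the old nor the new ENMTools code.

```r
# Hypothetical AUC values: one empirical score and four null replicates,
# two of which tie with the empirical score.
empirical   <- 0.8
null.values <- c(0.5, 0.8, 0.8, 0.6)

# A strict comparison ignores the tied replicates entirely.
p.strict <- mean(null.values > empirical)

# Counting ties, and including the empirical value in the distribution,
# gives a more conservative answer.
p.ties <- mean(c(empirical, null.values) >= empirical)

print(c(strict = p.strict, ties = p.ties))
```

With no ties the two versions differ only by the contribution of the empirical value itself, but with ties they can diverge substantially.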

Thursday, August 27, 2020

How to deal with recalibration errors


Due to changes to the CalibratR package, some of the recalibration methods don't work properly on some systems.  This is due to the way CalibratR uses the parallel package, which Mac OS (and maybe other systems) doesn't seem to like.  There is a workaround, though: just copy the following code and run it before you run enmtools.calibrate, and all should be well!


if (Sys.getenv("RSTUDIO") == "1" && !nzchar(Sys.getenv("RSTUDIO_TERM")) && Sys.info()["sysname"] == "Darwin" && getRversion() >= "4.0.0") {
    parallel:::setDefaultClusterOptions(setup_strategy = "sequential")
}

Tuesday, August 11, 2020

Hacking together the Bohl et al. test

 A recent paper by Bohl et al. suggested a new method for testing statistical significance of ENM predictions.  It's similar to a test by Raes and ter Steege and existing tests in ENMTools, but as I understand it the difference is as follows:

Raes and ter Steege choose random points from the study area to build a data set equivalent to the size of the empirical data set, and compare the performance of the models on training data to the performance on random training data.  This is what you get in ENMTools if you set rts.reps > 0 and test.prop = 0.

ENMTools' implementation of the Raes and ter Steege test added the ability (via setting test.prop > 0) to split the randomly drawn spatial data into training and test subsets and compare your empirical model's ability to predict your empirical test data to the ability of random training data to predict random test data.

The Bohl et al. test compares the ability of your model to predict your empirical test data to the ability of randomly drawn training points to predict your empirical test data.  As such the data for the replicate models would be the same as in ENMTools for test.prop > 0, but the data the models are evaluated on would be test data from the empirical data set instead of test data that was randomly drawn from the study area.
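The difference between the three designs comes down to which points each null replicate is trained on and which points it is evaluated on. Here is a rough base R sketch of that bookkeeping. It builds no models, and the study area, sample sizes, and object names are all invented for illustration.

```r
set.seed(42)  # make the random draws repeatable

# Hypothetical study area: 1000 candidate cells with coordinates
study.area <- data.frame(Longitude = runif(1000, -10, 5),
                         Latitude  = runif(1000, 36, 44))

# Hypothetical empirical occurrences, split 70/30 into training and test sets
empirical <- study.area[sample(nrow(study.area), 50), ]
test.rows <- sample(nrow(empirical), 15)
emp.train <- empirical[-test.rows, ]
emp.test  <- empirical[test.rows, ]

# One null replicate: the same number of points drawn at random, same split
random.pts <- study.area[sample(nrow(study.area), 50), ]
rand.train <- random.pts[-test.rows, ]
rand.test  <- random.pts[test.rows, ]

# Which data each null replicate is trained and evaluated on
designs <- list(
  raes.ter.steege = list(train = random.pts, evaluate = random.pts),
  enmtools.split  = list(train = rand.train, evaluate = rand.test),
  bohl            = list(train = rand.train, evaluate = emp.test)
)

# Number of points used for training and evaluation under each design
print(sapply(designs, function(d) c(train = nrow(d$train),
                                    evaluate = nrow(d$evaluate))))
```

The last two designs share their training data and differ only in the evaluation set: random test points in one case, the empirical test points in the other.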

At this point I would not venture to say which of these approaches is better, as I don't feel that I fully understand them myself.  They each reflect different null hypotheses, and so perhaps the answer to "which is better" is a question of which one reflects the null you're most interested in rejecting.  I think there's a lot more work to be done in this area, and I'm not sure there's going to be a one-size-fits-all answer.

All of that aside, at some point we need to implement the Bohl et al. test in ENMTools.  Until then, it's fairly easy to hack together as is.  You can use the existing rts.reps argument to generate the reps, and then just evaluate those models on your empirical test data.  Here's a quick and dirty example using some of the built-in data from ENMTools.


library(ENMTools)
library(dplyr)
library(ggplot2)

# Build the empirical model plus 10 replicate models from random points
monticola.gam <- enmtools.gam(iberolacerta.clade$species$monticola,
                              euro.worldclim,
                              test.prop = 0.3,
                              rts.reps = 10)

# Empirical test presences, and the background points used for the model
test.pres <- monticola.gam$test.data
test.bg <- monticola.gam$analysis.df %>%
  filter(presence == 0) %>%
  select(Longitude, Latitude)

# Evaluate a model on the empirical test data
bohl.test <- function(thismodel){
  dismo::evaluate(test.pres, test.bg, thismodel, euro.worldclim)
}

# AUC of each replicate model on the empirical test data
null.dist <- sapply(monticola.gam$rts.test$rts.models,
                    FUN = function(x) bohl.test(x$model)@auc)

# Put the empirical model's test AUC first and label it
null.dist <- c(monticola.gam$test.evaluation@auc, null.dist)
names(null.dist)[1] <- "empirical"

# Histogram of AUC values, with a dashed line at the empirical value
qplot(null.dist, geom = "histogram", fill = "density", alpha = 0.5) +
  geom_vline(xintercept = null.dist["empirical"], linetype = "longdash") +
  xlim(0,1) + guides(fill = FALSE, alpha = FALSE) + xlab("AUC") +
  ggtitle("Model performance in geographic space on test data") +
  theme(plot.title = element_text(hjust = 0.5))


Ta da!!!!


Wednesday, July 29, 2020

install.packages("ENMTools") - ENMTools is now on CRAN!

Hey everybody! I'm happy to announce that we've finally put ENMTools onto CRAN!  It took a while, but everything seems to be working.  From now on you can install ENMTools just by typing:

install.packages("ENMTools")

After which, to get all of the dependencies, you should go ahead and do:

library(ENMTools)
install.extras()

After that, everything should work!  You might still need to put the maxent.jar file in the right place for dismo if you haven't done that already, but see the dismo maxent help file for that.
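If you're not sure whether maxent.jar is already in place, a quick check like the one below can tell you. This is just a convenience sketch and not part of ENMTools; it relies on dismo looking for the file in the "java" folder of its own installation, as described in the dismo maxent help file.

```r
# Folder where dismo expects maxent.jar (empty string if dismo isn't installed)
dismo.java <- system.file("java", package = "dismo")

if (!nzchar(dismo.java)) {
  jar.present <- FALSE
  message("dismo does not appear to be installed")
} else {
  jar.path <- file.path(dismo.java, "maxent.jar")
  jar.present <- file.exists(jar.path)
  if (jar.present) {
    message("Found maxent.jar at ", jar.path)
  } else {
    message("maxent.jar not found; it should be copied to ", jar.path)
  }
}
```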

Tuesday, June 2, 2020

Introductory tutorials for the R version of ENMTools

Hey everybody!  I've started recording quick tutorials on the most important bits of ENMTools.  Here's one on how to install ENMTools and all of its dependencies:



And here's one on how to build ENMTools species objects and some quick models:





Sunday, May 24, 2020

Code snippet from Tyler Smith for fast plotting of ENMTools models

Over on the ENMTools GitHub page, Tyler Smith asked a question about plotting ENMTools models.  He pointed out that large models are very slow to plot, largely because of our use of ggplot2.  We like ggplot for this because it allows us to store plots in objects easily, and makes it possible for users to modify plots after the fact using all of the features of ggplot and the extensions people have written for it.  That said, it's probably frustrating to have a long draw time if you just want to take a quick peek at your model's predictions.  Tyler provided a code chunk that does a nice quick plot using base graphics.  The end result looks a lot like the standard ggplot plots we've been returning, but takes a fraction of the time to display.  Here's that code:

library(viridis) # to match ENMTools color palette

plotTWS <- function(x, ...) {
  plot(x$suitability, col = viridis(100, option = "B" ),
       xlab =  "Longitude", ylab = "Latitude",
       main = paste("Maxent model for", x$species.name),
       bty = 'l', box = FALSE)
  points(subset(x$analysis.df, presence == 1), pch = 21,
       bg = "white")
  points(x$test.data, pch = 21, bg = "green")
}

We're going to see if we can work out a quicker way to do our built-in plots using ggplot, but for now this is a nice workaround!