Sunday, June 16, 2013

Version 1.4.2, adding sampling without replacement to "Resample From Raster" function

By request, I have added a radio button for sampling with or without replacement to the "resample from raster" function.  This function was initially intended for simulating data for methodological studies, but can also be used to sample random points for conducting significance tests for AUC values a la Raes and ter Steege 2007 (using the "constant" setting).  The initial setup was to always resample with replacement.  This isn't ideal for the Raes and ter Steege test, but was unlikely to have any real impact except on models built over very small geographic regions and/or those with very coarse resolution (i.e., study areas with a very small number of grid cells).
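For anyone who wants to see the difference between the two settings, here is a rough Python sketch. It is not the ENMTools code (that is Perl). The `sample_cells` function, the `NODATA` value of -9999, and the tiny grid are all made up for illustration.

```python
import random

NODATA = -9999  # assumed nodata marker; a real .asc file declares its own


def sample_cells(grid, n, replace, rng):
    """Pick n (row, col) cells that hold data, with or without replacement."""
    # Only cells inside the study area (i.e., not nodata) are eligible.
    valid = [
        (r, c)
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
        if value != NODATA
    ]
    if replace:
        # The same cell can be drawn more than once.
        return [rng.choice(valid) for _ in range(n)]
    if n > len(valid):
        raise ValueError("cannot draw more cells than exist without replacement")
    # Each cell can be drawn at most once.
    return rng.sample(valid, n)


# Hypothetical 3x3 study area with six data cells.
grid = [
    [NODATA, 0.2, 0.4],
    [0.1, 0.5, NODATA],
    [0.3, NODATA, 0.9],
]
rng = random.Random(42)
with_repl = sample_cells(grid, 5, True, rng)
without_repl = sample_cells(grid, 5, False, rng)
```

With only six data cells, drawing five points with replacement will often return duplicates, while the draw without replacement never does. That is the small-study-area case where the choice starts to matter.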

I'll post a detailed tutorial eventually, once I get a spare moment to breathe.  Long story short: if you have N data points and want to do X replicates, you load up a raster file that has data in grid cells for your study area and nodata values outside the study area.  This can even be the .asc file for your model itself.  Use the "resample from raster" tool with the "constant" sampling function to sample N data points for each of X replicates.  Then build one model per replicate using the same study area, model construction settings, and environmental predictors as in the model for your empirical data.  Collect the training and test AUC scores from all of those replicate models, and use them as the null distributions against which to compare your empirical training and test AUC values.  Guidance on how to do that is here:

Species In Space
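As a rough idea of what that final comparison can look like, here is a small Python sketch. It uses one common convention for a one-sided p-value from a null distribution (counting the empirical value as one of the draws), which is an assumption on my part and not necessarily what the linked guidance prescribes. The `null_p_value` function and the AUC numbers are invented for illustration.

```python
def null_p_value(empirical_auc, null_aucs):
    """One-sided p-value: share of null AUCs at least as large as the empirical one."""
    exceed = sum(1 for auc in null_aucs if auc >= empirical_auc)
    # Add 1 to numerator and denominator so the empirical value counts as a draw.
    return (exceed + 1) / (len(null_aucs) + 1)


# Made-up AUC scores from X = 9 replicate null models; not real results.
null_aucs = [0.61, 0.58, 0.66, 0.63, 0.70, 0.59, 0.64, 0.62, 0.67]

# Made-up empirical AUC for the real model.
p = null_p_value(0.82, null_aucs)
```

You would run this once for the training AUC values and once for the test AUC values. In practice you would want far more than nine replicates, since the smallest p-value this can return is 1 / (X + 1).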

The new version is here:

ENMTools 1.4.2

Perl version only; see my previous kvetching about ActiveState if you want to know why.

Thanks to Marie-France Ostrowski for the suggestion and Renee Catullo for testing it.
