Thursday, July 28, 2016

New features for ENMTools model objects and functions: response plots, model evaluation, and new color ramps

One of the advantages of the enmtools.species object structure is that I can now provide a much more accessible interface to much of dismo's modeling functionality, and can add new functionality that automates a lot of the outputs you might typically want from an SDM/ENM.  You've already seen some of this here, but in the past week I've added a lot more.  For one thing I've switched from the default color ramps, such as this:



To viridis color ramps, e.g.,


This has a number of advantages.  First, viridis color ramps are very pretty, and this particular one has a very familiar Maxent-y sort of look about it which makes it easy for an experienced SDM modeler to interpret.  More importantly, the viridis color ramps are designed with accessibility in mind.  The authors put a ton of work into figuring out a set of color ramps that are interpretable when printed in greyscale AND accessible to people with varying types of color blindness.  That's pretty awesome.

You'll also notice that there are two sets of points plotted there.  Those are training and test points.  You can now call all of the enmtools modeling functions with an argument "test.prop", e.g.,

ahli.glm = enmtools.glm(pres ~ layer.1 + layer.2 + layer.3 + layer.4, ahli, env, test.prop = 0.2)

And they will automatically withhold that proportion of your data for model testing.  Now when you call your model object, you get training and test evaluation metrics!

>ahli.glm


Formula:  presence ~ layer.1 + layer.2 + layer.3 + layer.4



Data table (top ten lines): 

|   | Longitude| Latitude| layer.1| layer.2| layer.3| layer.4| presence|
|:--|---------:|--------:|-------:|-------:|-------:|-------:|--------:|
|1  |  -80.0106|  21.8744|    2765|    1235|    1174|     252|        1|
|2  |  -79.9086|  21.8095|    2289|    1732|     957|     231|        1|
|3  |  -79.8065|  21.7631|    2158|    1870|     983|     253|        1|
|4  |  -79.8251|  21.8095|    2207|    1877|     967|     259|        1|
|5  |  -79.8807|  21.8374|    2244|    1828|     945|     249|        1|
|6  |  -79.9550|  21.8374|    2250|    1766|     919|     235|        1|
|7  |  -80.3446|  22.0136|    2201|    1822|     978|     277|        1|
|8  |  -80.2983|  21.9951|    2214|    1786|     986|     284|        1|
|10 |  -80.1591|  21.9673|    2984|     965|    1311|     237|        1|
|11 |  -80.1498|  21.9858|    3042|     841|    1371|     221|        1|


Model:  
Call:
glm(formula = f, family = "binomial", data = analysis.df[, -c(1, 
    2)])

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-0.65556  -0.18280  -0.12121  -0.08065   3.10812  

Coefficients:
             Estimate Std. Error z value Pr(>|z|)  
(Intercept) 40.033648  27.208173   1.471   0.1412  
layer.1     -0.012770   0.007165  -1.782   0.0747 .
layer.2     -0.009662   0.007346  -1.315   0.1884  
layer.3      0.006954   0.006638   1.047   0.2949  
layer.4     -0.020317   0.025631  -0.793   0.4280  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 130.29  on 1011  degrees of freedom
Residual deviance: 119.18  on 1007  degrees of freedom
AIC: 129.18

Number of Fisher Scoring iterations: 8



Model fit (training data):  class          : ModelEvaluation 
n presences    : 12 
n absences     : 1000 
AUC            : 0.7485833 
cor            : 0.09753628 
max TPR+TNR at : -4.82228 


Proportion of data wittheld for model fitting:  0.2

Model fit (test data):  class          : ModelEvaluation 
n presences    : 4 
n absences     : 1000 
AUC            : 0.74075 
cor            : 0.05048264 
max TPR+TNR at : -4.570937 


So that's cool, obviously.  Even cooler is that your model object now contains marginal response functions.  These are calculated by varying each predictor from the minimum value to the maximum value found in the environmental layers, while holding the other predictors constant at the mean value across all presence points.  At present these plots aren't printed by default when you call your object, but I may change that.  For now, you can type:

>ahli.glm$response.plots

And you get:






I may do some more tweaking in the future, but these are ggplot2 plots so you can easily modify them however you want.

No comments:

Post a Comment