Wrapper function for calculating the predictive distribution model's
confidence, consistency, and optionally some
well-known goodness-of-fit measures as well. The calculated measures are as
follows:
- confidence in predictions (CP) and confidence in positive predictions (CPP) within known presences for the training and evaluation subsets
- consistency of predictions (difference of CPs; DCP) and consistency of positive predictions (difference of CPPs; DCPP)
- Area Under the ROC Curve (AUC) - optional (see parameter 'goodness')
- maximum of the True Skill Statistic (maxTSS) - optional (see parameter 'goodness')
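For quick reference, the call signature below is implied by the 'Arguments' section and the documented defaults (the argument order is an assumption based on that section):

measures(observations, predictions, evaluation_mask, goodness = FALSE, df = FALSE)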
Arguments
- observations
Either an integer or logical vector containing the binary observations where presences are encoded as 1s/TRUEs and absences as 0s/FALSEs.
- predictions
A numeric vector containing the predicted probabilities of occurrence, typically within the [0, 1] interval. length(predictions) should be equal to length(observations) and the order of the elements should match.
- evaluation_mask
A logical vector (mask) of the evaluation subset. Its ith element indicates whether the ith element of observations was used for evaluation (TRUE) or for training (FALSE). length(evaluation_mask) should be equal to length(observations) and the order of the elements should match, i.e. observations[evaluation_mask] were the evaluation subset and observations[!evaluation_mask] were the training subset. See the sketch after this list for constructing valid inputs.
- goodness
Logical vector of length one, defaults to FALSE. Indicates whether goodness-of-fit measures (AUC and maxTSS) should be calculated. If set to TRUE, the external package ROCR (Sing et al. 2005) is needed for the calculation (see section 'Note').
- df
Logical vector of length one, defaults to FALSE. Indicates whether the returned value should be a one-row data.frame that is rbind()able if measures() is called on multiple models in a for loop or an lapply(). See sections 'Value' and 'Examples' for details.
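A minimal sketch of valid inputs, assuming a random 20% evaluation split (the split ratio, sample sizes, and the runif()-generated predictions are illustrative choices, not requirements of the package):

observations <- c(rep(x = FALSE, times = 80), rep(x = TRUE, times = 20))
predictions <- runif(n = length(observations), min = 0, max = 1) # probabilities within [0, 1]
evaluation_mask <- seq_along(observations) %in%
  sample(x = seq_along(observations), size = 20) # random 20% used for evaluation
measures(observations = observations, predictions = predictions,
         evaluation_mask = evaluation_mask)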
Value
A named numeric vector (if df is FALSE; the default) or
a data.frame (if df is TRUE) of one row.
length() of the vector or ncol() of the data.frame is
6 (if goodness is FALSE; the default) or 8 (if
goodness is TRUE). The names of the elements/columns are as
follows:
- CP_train
confidence in predictions within known presences (CP) for the training subset
- CP_eval
confidence in predictions within known presences (CP) for the evaluation subset
- DCP
consistency of predictions (difference of CPs)
- CPP_train
confidence in positive predictions within known presences (CPP) for the training subset
- CPP_eval
confidence in positive predictions within known presences (CPP) for the evaluation subset
- DCPP
consistency of positive predictions (difference of CPPs; see the sketch after this list)
- AUC
Area Under the ROC Curve (Hanley and McNeil 1982; calculated by
ROCR::performance()). This element/column is available only if parameter 'goodness' is set to TRUE. If package ROCR is not available but parameter 'goodness' is set to TRUE, the value of AUC is NA_real_ and a warning is raised.
- maxTSS
Maximum of the True Skill Statistic (Allouche et al. 2006; calculated by
ROCR::performance()). This element/column is available only if parameter 'goodness' is set to TRUE. If package ROCR is not available but parameter 'goodness' is set to TRUE, the value of maxTSS is NA_real_ and a warning is raised.
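The consistency measures are simple differences, DCP = CP_eval - CP_train and DCPP = CPP_eval - CPP_train, as the numbers in section 'Examples' confirm, so negative values indicate that confidence dropped from the training to the evaluation subset. A minimal sketch verifying this relationship (the tiny, randomly generated input is for illustration only; real datasets should be much larger):

obs <- rep(x = c(FALSE, TRUE), each = 50)
preds <- c(runif(n = 50, min = 0, max = 0.7), runif(n = 50, min = 0.3, max = 1))
mask <- rep(x = c(FALSE, TRUE), times = 50) # every 2nd element used for evaluation
m <- measures(observations = obs, predictions = preds, evaluation_mask = mask)
all.equal(m[["DCP"]], m[["CP_eval"]] - m[["CP_train"]]) # expected: TRUE
all.equal(m[["DCPP"]], m[["CPP_eval"]] - m[["CPP_train"]]) # expected: TRUE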
Note
Since confcons is a light-weight, stand-alone package, it does
not import package ROCR (Sing et al. 2005), i.e. installing
confcons does not automatically install ROCR. If you
need AUC and maxTSS (i.e., parameter 'goodness' is set to
TRUE), you should install ROCR yourself or install confcons along
with its dependencies (i.e., devtools::install_github(repo =
"bfakos/confcons", dependencies = TRUE)), as sketched below.
References
Allouche O, Tsoar A, Kadmon R (2006): Assessing the accuracy of species distribution models: prevalence, kappa and the true skill statistic (TSS). Journal of Applied Ecology 43(6): 1223-1232. doi:10.1111/j.1365-2664.2006.01214.x.
Hanley JA, McNeil BJ (1982): The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143(1): 29-36. doi:10.1148/radiology.143.1.7063747.
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005): ROCR: visualizing classifier performance in R. Bioinformatics 21(20): 3940-3941. doi:10.1093/bioinformatics/bti623.
See also
confidence for calculating confidence,
consistency for calculating consistency,
ROCR::performance() for calculating AUC and
TSS
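For orientation, a hedged sketch of obtaining AUC directly from ROCR on toy data (this mirrors the kind of call measures(goodness = TRUE) relies on; the package's exact internal invocation is not reproduced here):

if (requireNamespace(package = "ROCR", quietly = TRUE)) {
  toy_labels <- c(0, 0, 1, 1, 0, 1)
  toy_probs <- c(0.2, 0.4, 0.8, 0.9, 0.3, 0.6)
  pred_obj <- ROCR::prediction(predictions = toy_probs, labels = toy_labels)
  auc <- ROCR::performance(prediction.obj = pred_obj, measure = "auc")@y.values[[1]]
  print(auc) # AUC of the toy predictions
}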
Examples
set.seed(12345)
dataset <- data.frame(
observations = c(rep(x = FALSE, times = 500),
rep(x = TRUE, times = 500)),
predictions_model1 = c(runif(n = 250, min = 0, max = 0.6),
runif(n = 250, min = 0.1, max = 0.7),
runif(n = 250, min = 0.4, max = 1),
runif(n = 250, min = 0.3, max = 0.9)),
predictions_model2 = c(runif(n = 250, min = 0.1, max = 0.55),
runif(n = 250, min = 0.15, max = 0.6),
runif(n = 250, min = 0.3, max = 0.9),
runif(n = 250, min = 0.25, max = 0.8)),
evaluation_mask = c(rep(x = FALSE, times = 250),
rep(x = TRUE, times = 250),
rep(x = FALSE, times = 250),
rep(x = TRUE, times = 250))
)
# Default parameterization, return a vector without AUC and maxTSS:
conf_and_cons <- measures(observations = dataset$observations,
predictions = dataset$predictions_model1,
evaluation_mask = dataset$evaluation_mask)
print(conf_and_cons)
#> CP_train CP_eval DCP CPP_train CPP_eval DCPP
#> 0.6120000 0.4760000 -0.1360000 0.6120000 0.4229075 -0.1890925
names(conf_and_cons)
#> [1] "CP_train" "CP_eval" "DCP" "CPP_train" "CPP_eval" "DCPP"
conf_and_cons[c("CPP_eval", "DCPP")]
#> CPP_eval DCPP
#> 0.4229075 -0.1890925
# Calculate AUC and maxTSS as well if package ROCR is installed:
if (requireNamespace(package = "ROCR", quietly = TRUE)) {
conf_and_cons_and_goodness <- measures(observations = dataset$observations,
predictions = dataset$predictions_model1,
evaluation_mask = dataset$evaluation_mask,
goodness = TRUE)
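  # With goodness = TRUE, two extra elements, AUC and maxTSS, are appended:
  print(names(conf_and_cons_and_goodness))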
}
# Calculate the measures for multiple models in a for loop:
model_IDs <- as.character(1:2)
for (model_ID in model_IDs) {
column_name <- paste0("predictions_model", model_ID)
conf_and_cons <- measures(observations = dataset$observations,
predictions = dataset[, column_name, drop = TRUE],
evaluation_mask = dataset$evaluation_mask,
df = TRUE)
if (model_ID == model_IDs[1]) {
conf_and_cons_df <- conf_and_cons
} else {
conf_and_cons_df <- rbind(conf_and_cons_df, conf_and_cons)
}
}
conf_and_cons_df
#> CP_train CP_eval DCP CPP_train CPP_eval DCPP
#> 1 0.612 0.476 -0.136 0.6120000 0.4229075 -0.1890925
#> 2 0.668 0.568 -0.100 0.6391304 0.4653465 -0.1737839
# Calculate the measures for multiple models in a lapply():
conf_and_cons_list <- lapply(X = model_IDs,
FUN = function(model_ID) {
column_name <- paste0("predictions_model", model_ID)
measures(observations = dataset$observations,
predictions = dataset[, column_name, drop = TRUE],
evaluation_mask = dataset$evaluation_mask,
df = TRUE)
})
conf_and_cons_df <- do.call(what = rbind,
args = conf_and_cons_list)
conf_and_cons_df
#> CP_train CP_eval DCP CPP_train CPP_eval DCPP
#> 1 0.612 0.476 -0.136 0.6120000 0.4229075 -0.1890925
#> 2 0.668 0.568 -0.100 0.6391304 0.4653465 -0.1737839