Create a confusion matrix and calculate its metrics. This function is an agnostic yardstick wrapper: it applies all yardstick functions that are confusion-matrix metrics, without relying on internally hard-coded function names.
confusion_matrix(data, ...)
# Default S3 method
confusion_matrix(data, truth, estimate, na.rm = getOption("na.rm", FALSE), ...)
data: Either a data.frame containing the columns specified by the truth and estimate arguments, or a table/matrix where the true class results should be in the columns of the table.

...: Not currently used.
truth: The column identifier for the true class results (that is, a factor). This should be an unquoted column name, although this argument is passed by expression and supports quasiquotation (you can unquote column names). For _vec() functions, a factor vector.
estimate: The column identifier for the predicted class results (that is also a factor). As with truth, this can be specified in different ways, but the primary method is to use an unquoted variable name. For _vec() functions, a factor vector. A short sketch follows the argument descriptions below.

na.rm: A logical to indicate whether empty (missing) values must be removed.
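As a minimal sketch of the default method (the data frame and its columns actual and predicted are made up for illustration), truth and estimate are passed as unquoted column names:

predictions <- tibble::tibble(
  actual    = factor(c("Yes", "No", "Yes", "No"), levels = c("Yes", "No")),
  predicted = factor(c("Yes", "No", "No", "No"),  levels = c("Yes", "No"))
)
confusion_matrix(predictions, truth = actual, estimate = predicted)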
na.rm
The 'certestats' package supports a global default setting for na.rm in many mathematical functions. This can be set with options(na.rm = TRUE) or options(na.rm = FALSE).
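Since the default of na.rm is getOption("na.rm", FALSE), setting the option once affects subsequent calls:

options(na.rm = TRUE)   # remove missing values by default from now on
options(na.rm = FALSE)  # back to the package default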
For normality(), quantile() and IQR(), this also applies to the type argument. The default, type = 7, is the default of base R. Use type = 6 to comply with SPSS.
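As an illustration of the type argument (using a made-up numeric vector, and assuming quantile() here follows the same algorithm numbering as base stats::quantile()):

x <- c(1, 2, 3, 4, 10)
quantile(x, probs = 0.25, type = 7)  # 2.0 with the base R default algorithm
quantile(x, probs = 0.25, type = 6)  # 1.5 with the SPSS-compatible algorithm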
df <- tibble::tibble(name = c("Predict Yes", "Predict No"),
                     "Actual Yes" = c(123, 26),
                     "Actual No" = c(13, 834))
df
#> # A tibble: 2 × 3
#> name `Actual Yes` `Actual No`
#> <chr> <dbl> <dbl>
#> 1 Predict Yes 123 13
#> 2 Predict No 26 834
confusion_matrix(df)
#>
#> ── Confusion Matrix ────────────────────────────────────────────────────────────
#>
#> Actual Yes Actual No
#> Actual Yes 123 13
#> Actual No 26 834
#>
#>
#> ── Model Metrics ───────────────────────────────────────────────────────────────
#>
#> Accuracy 0.961
#> Area under the Precision Recall Curve (APRC) 0.125
#> Area under the Receiver Operator Curve (AROC) 0.063
#> Balanced Accuracy 0.937
#> Brier Score for Classification Models (BSCM) 3.389
#> Costs Function for Poor Classification (CFPC) 1.688
#> F Measure 0.863
#> Gain Capture -0.874
#> J-Index 0.874
#> Kappa 0.840
#> Matthews Correlation Coefficient (MCC) 0.842
#> Mean log Loss for Multinomial Data (MLMD) 31.122
#> Negative Predictive Value (NPV) 0.985
#> Positive Predictive Value (PPV) 0.826
#> Precision 0.826
#> Prevalence 0.150
#> Recall 0.904
#> Sensitivity 0.904
#> Specificity 0.970
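As a quick hand calculation (not part of the package output), the reported accuracy can be reproduced directly from the four cells of the matrix:

(123 + 834) / (123 + 13 + 26 + 834)
#> [1] 0.9608434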