These are simple distribution metric functions.
se(x, na.rm = getOption("na.rm", FALSE))
ci(x, level = 0.95, na.rm = getOption("na.rm", FALSE))
sum_of_squares(x, correct_mean = TRUE, na.rm = getOption("na.rm", FALSE))
cv(x, na.rm = getOption("na.rm", FALSE))
cqv(x, na.rm = getOption("na.rm", FALSE))
mse(actual, predicted, na.rm = getOption("na.rm", FALSE))
mape(actual, predicted, na.rm = getOption("na.rm", FALSE))
rmse(actual, predicted, na.rm = getOption("na.rm", FALSE))
mae(actual, predicted, na.rm = getOption("na.rm", FALSE))
z_score(x, na.rm = getOption("na.rm", FALSE))
midhinge(x, na.rm = getOption("na.rm", FALSE))
ewma(x, lambda, na.rm = getOption("na.rm", FALSE))
rr_ewma(x, lambda, na.rm = getOption("na.rm", FALSE))
normalise(n, n_ref, per = 1000)
normalize(n, n_ref, per = 1000)
scale_sd(x)
centre_mean(x)
percentiles(x, na.rm = getOption("na.rm", FALSE))
deciles(x, na.rm = getOption("na.rm", FALSE))
x
values
na.rm
a logical to indicate whether empty values (NA) must be removed from x
level
confidence level, defaults to 0.95 (95%)
correct_mean
with TRUE (the default), a correction for the mean is applied: the mean of x is subtracted from each value before squaring and summing, so the result describes the spread of x around its mean. With FALSE, all x^2 are simply added together, so the result also reflects the location of x in the data.
actual
vector of actual values
predicted
vector of predicted values
lambda
smoothing parameter, a value between 0 and 1. A value of 0 is equal to x, a value of 1 is equal to the mean of x. The EWMA looks back and therefore has a delay; the rrEWMA takes the mean of a 'forward' and a 'backward' EWMA.
n
number to be normalised
n_ref
reference to use for normalisation
per
normalisation factor
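The description of lambda can be made concrete with a minimal base R sketch. It assumes the recursion s[t] = lambda * s[t-1] + (1 - lambda) * x[t], started at mean(x), because that reproduces the two limiting cases described above. The helper names ewma_sketch() and rr_ewma_sketch() are hypothetical, and the package's own ewma() and rr_ewma() may be implemented differently.

```r
# Hypothetical helper, not part of certestats: a plain EWMA recursion.
ewma_sketch <- function(x, lambda) {
  stopifnot(is.numeric(x), length(x) > 0, lambda >= 0, lambda <= 1)
  out <- numeric(length(x))
  previous <- mean(x)  # assumed starting value
  for (i in seq_along(x)) {
    previous <- lambda * previous + (1 - lambda) * x[i]
    out[i] <- previous
  }
  out
}

# Hypothetical helper: mean of a forward and a backward EWMA.
rr_ewma_sketch <- function(x, lambda) {
  forward <- ewma_sketch(x, lambda)
  backward <- rev(ewma_sketch(rev(x), lambda))
  (forward + backward) / 2
}

x <- c(2, 4, 6, 8, 10)
ewma_sketch(x, lambda = 0)       # reproduces x
ewma_sketch(x, lambda = 1)       # constant at mean(x)
rr_ewma_sketch(x, lambda = 0.5)  # smoothed without a one-sided delay
```

Running the backward pass on rev(x) and reversing the result is what lets the two delays offset each other in this sketch.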
These are the explanations of the functions:
se()
calculates the standard error: sd / square root of length
ci()
calculates the confidence interval for a mean (defaults to 95%) and returns a vector of length 2
sum_of_squares()
calculates the sum of (x - mean(x)) ^ 2
cv()
calculates the coefficient of variation: standard deviation / mean
cqv()
calculates the coefficient of quartile variation: (Q3 - Q1) / (Q3 + Q1)
mse()
calculates the mean squared error
mape()
calculates the mean absolute percentage error
rmse()
calculates the root mean squared error
mae()
calculates the mean absolute error
z_score()
calculates the number of standard deviations from the mean: (x - mean) / sd
midhinge()
calculates the mean of the first and third quartiles: (Q1 + Q3) / 2
ewma()
calculates the EWMA (exponentially weighted moving average)
rr_ewma()
calculates the rrEWMA (reversed-recombined exponentially weighted moving average)
normalise()
normalises the data based on a reference: (n / n_ref) * per
scale_sd()
normalises the data to have a standard deviation of 1, while retaining the mean
centre_mean()
normalises the data to have a mean of 0, while retaining the standard deviation
percentiles() and deciles()
take a numeric vector as input and return the lowest percentile or decile for each value
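The formulas in this list translate almost directly to base R. The sketch below only makes them concrete and is not the package's source. The helper names are hypothetical, missing values are not handled, and quantile() is used with its default type = 7. The t-based interval in ci_sketch() and the formula in scale_sd_sketch() are assumptions, since the list does not spell them out.

```r
# Hypothetical base R equivalents of the formulas listed above
se_sketch      <- function(x) sd(x) / sqrt(length(x))
cv_sketch      <- function(x) sd(x) / mean(x)
z_score_sketch <- function(x) (x - mean(x)) / sd(x)

cqv_sketch <- function(x) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  (q[2] - q[1]) / (q[2] + q[1])
}

midhinge_sketch <- function(x) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  (q[1] + q[2]) / 2
}

# Assumption: a two-sided interval based on the t distribution
ci_sketch <- function(x, level = 0.95) {
  margin <- qt(1 - (1 - level) / 2, df = length(x) - 1) * se_sketch(x)
  c(mean(x) - margin, mean(x) + margin)
}

normalise_sketch   <- function(n, n_ref, per = 1000) (n / n_ref) * per
# Assumption: rescale around the mean so that the mean is retained
scale_sd_sketch    <- function(x) (x - mean(x)) / sd(x) + mean(x)
centre_mean_sketch <- function(x) x - mean(x)

x <- c(2, 4, 4, 5, 7, 9, 11)
se_sketch(x)
ci_sketch(x)
cqv_sketch(x)
midhinge_sketch(x)
normalise_sketch(5, 250)     # 5 out of 250, expressed per 1000
sd(scale_sd_sketch(x))       # approximately 1
mean(centre_mean_sketch(x))  # approximately 0
```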
na.rm
This 'certestats' package supports a global default setting for na.rm in many mathematical functions. This can be set with options(na.rm = TRUE) or options(na.rm = FALSE).
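As a sketch of how such a global default behaves, the hypothetical function below uses the same default as the signatures at the top of this page, na.rm = getOption("na.rm", FALSE). Only base R is used, and se_sketch() is an illustrative name, not the package's se().

```r
# Hypothetical function using the option-driven default for na.rm
se_sketch <- function(x, na.rm = getOption("na.rm", FALSE)) {
  if (isTRUE(na.rm)) {
    x <- x[!is.na(x)]
  }
  sd(x) / sqrt(length(x))
}

x <- c(1, 2, 3, NA)

old <- options(na.rm = FALSE)
without_removal <- se_sketch(x)  # NA, because the missing value is kept

options(na.rm = TRUE)
with_removal <- se_sketch(x)     # computed on 1, 2 and 3 only

# An explicit argument still overrides the global option
explicit <- se_sketch(x, na.rm = FALSE)

options(old)  # restore the previous setting
```

Because the default is evaluated each time the function is called, changing the option takes effect immediately without redefining the function.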
For normality(), quantile() and IQR(), this also applies to the type argument. The default, type = 7, is the default of base R. Use type = 6 to comply with SPSS.
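The difference between the two quantile types can be seen with base R's quantile() and IQR() alone. The data below are made up for illustration.

```r
x <- c(1, 2, 3, 4, 10)

# type = 7 is the base R default
q1_type7 <- quantile(x, 0.25, type = 7, names = FALSE)

# type = 6 is the definition that SPSS uses
q1_type6 <- quantile(x, 0.25, type = 6, names = FALSE)

c(type7 = q1_type7, type6 = q1_type6)

# IQR() accepts the same type argument
IQR(x, type = 7)
IQR(x, type = 6)
```

For this small vector the first quartile is 2 with type = 7 and 1.5 with type = 6, so the choice of type visibly changes quartile-based metrics on small samples.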
For the sum of squares: https://www.thoughtco.com/sum-of-squares-formula-shortcut-3126266
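The shortcut formula behind that reference, sum((x - mean(x))^2) = sum(x^2) - sum(x)^2 / n, can be checked in base R. This sketch only illustrates the arithmetic for the two correct_mean settings and does not call the package.

```r
x <- c(0, 1, 2, 3, 4, 5, 5, 5, 5, 5, 5, 5, 6)
n <- length(x)

# Corrected for the mean (what correct_mean = TRUE describes)
ss_corrected <- sum((x - mean(x))^2)

# Shortcut formula: the same quantity without subtracting the mean first
ss_shortcut <- sum(x^2) - sum(x)^2 / n

# Uncorrected (what correct_mean = FALSE describes)
ss_raw <- sum(x^2)

all.equal(ss_corrected, ss_shortcut)
ss_raw - ss_corrected  # equals n * mean(x)^2
```

The gap between the raw and the corrected sum is n * mean(x)^2, which is why the uncorrected version also depends on where the data are located.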
x <- c(0, 1, 2, 3, 4, 5, 5, 5, 5, 5, 5, 5, 6)
percentiles(x)
#> [1] 1 8 17 25 33 42 42 42 42 42 42 42 100
deciles(x)
#> [1] 1 1 2 2 3 5 5 5 5 5 5 5 10
percentiles(rnorm(10))
#> [1] 22 78 1 67 89 100 11 45 55 34
library(dplyr, warn.conflicts = FALSE)
tib <- as_tibble(matrix(as.integer(runif(40, min = 1, max = 7)), ncol = 4),
                 .name_repair = function(...) LETTERS[1:4])
tib
#> # A tibble: 10 × 4
#>        A     B     C     D
#>    <int> <int> <int> <int>
#>  1     2     1     5     3
#>  2     5     2     3     2
#>  3     5     2     4     2
#>  4     2     4     4     4
#>  5     6     3     1     3
#>  6     5     3     5     5
#>  7     1     5     5     2
#>  8     4     6     6     5
#>  9     5     2     6     1
#> 10     5     2     3     3
# percentiles per column
tib |> mutate_all(percentiles)
#> # A tibble: 10 × 4
#>        A     B     C     D
#>    <dbl> <dbl> <dbl> <dbl>
#>  1    12     1    56    45
#>  2    45    12    12    12
#>  3    45    12    34    12
#>  4    12    78    34    78
#>  5   100    56     1    45
#>  6    45    56    56    89
#>  7     1    89    56    12
#>  8    33   100    89    89
#>  9    45    12    89     1
#> 10    45    12    12    45
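
The error metrics mse(), rmse(), mae() and mape() are not covered by the examples above. A base R sketch of their usual textbook definitions follows. The data are made up, and whether the package's mape() returns a fraction or a percentage is not stated on this page, so the fraction form below is an assumption.

```r
actual    <- c(10, 20, 30, 40)
predicted <- c(12, 18, 33, 36)
errors    <- actual - predicted

mse_value  <- mean(errors^2)              # mean squared error
rmse_value <- sqrt(mse_value)             # root mean squared error
mae_value  <- mean(abs(errors))           # mean absolute error
mape_value <- mean(abs(errors / actual))  # mean absolute percentage error, as a fraction

c(mse = mse_value, rmse = rmse_value, mae = mae_value, mape = mape_value)
```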