Releases: serkor1/SLmetrics
{SLmetrics} v0.3-1
Note
Version 0.3-1 is considered a pre-release of {SLmetrics}. We do not
expect any breaking changes, unless a major bug/issue is reported and
its nature forces breaking changes.
🚀 Improvements
- OpenMP Support (PR #40):
{SLmetrics} now supports parallelization through OpenMP. OpenMP
can be enabled and disabled as follows:
# 1) probability distribution
# generator
rand.sum <- function(n){
x <- sort(runif(n-1))
c(x,1) - c(0,x)
}
# 2) generate probability
# matrix
set.seed(1903)
pk <- t(replicate(100,rand.sum(1e3)))
# 3) Enable OpenMP
SLmetrics::setUseOpenMP(TRUE)
#> OpenMP usage set to: enabled
system.time(SLmetrics::entropy(pk))
#> user system elapsed
#> 0.211 0.001 0.010
# 4) Disable OpenMP
SLmetrics::setUseOpenMP(FALSE)
#> OpenMP usage set to: disabled
system.time(SLmetrics::entropy(pk))
#> user system elapsed
#> 0.001 0.000 0.001
- Entropy with soft labels
(#37): entropy(),
cross.entropy() and relative.entropy()
have been introduced. These
functions are heavily inspired by {scipy}. The functions can be used
as follows:
# 1) Define actual
# and observed probabilities
# 1.1) actual probabilities
pk <- matrix(
cbind(1/2, 1/2),
ncol = 2
)
# 1.2) observed (estimated) probabilities
qk <- matrix(
cbind(9/10, 1/10),
ncol = 2
)
# 2) calculate
# Entropy
cat(
"Entropy", SLmetrics::entropy(
pk
),
"Relative Entropy", SLmetrics::relative.entropy(
pk,
qk
),
"Cross Entropy", SLmetrics::cross.entropy(
pk,
qk
),
sep = "\n"
)
#> Entropy
#> 0.6931472
#> Relative Entropy
#> 0.5108256
#> Cross Entropy
#> 1.203973
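These values follow from the standard definitions with natural logarithms, and they satisfy the identity cross-entropy = entropy + relative entropy. A minimal base R check of the output above:
# verify the three values manually
# using their textbook definitions
sum(-pk * log(pk)) # entropy
#> [1] 0.6931472
sum(pk * log(pk / qk)) # relative entropy (KL divergence)
#> [1] 0.5108256
sum(-pk * log(qk)) # cross entropy
#> [1] 1.203973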
⚠️ Breaking changes
- logloss(): The argument response has been renamed to qk,
as in the entropy()-family, to maintain some degree of consistency.
- entropy.factor(): The function has been removed, mainly to keep
the documentation from growing too large. The logloss()-function
replaces it.
🐛 Bug-fixes
- Plot-method in ROC and prROC
(#36): Fixed a bug in
plot.ROC() and plot.prROC()
where, if panels = FALSE, additional
lines would be added to the plot.
{SLmetrics} v0.3-0
Note
Version 0.3-0 is considered a pre-release of {SLmetrics}. We do not
expect any breaking changes, unless a major bug/issue is reported and
its nature forces breaking changes.
See NEWS or commit history for detailed changes.
📚 What?
🚀 New features
This update introduces four new features, described below.
- Cross-Entropy Loss (PR #34): Weighted and unweighted cross-entropy loss. The function can be used as follows,
# 1) define classes and
# observed classes (actual)
classes <- c("Class A", "Class B")
actual <- factor(
c("Class A", "Class B", "Class A"),
levels = classes
)
# 2) define probabilities
# and construct response_matrix
response <- c(
0.2, 0.8,
0.8, 0.2,
0.7, 0.3
)
response_matrix <- matrix(
response,
nrow = 3,
ncol = 2,
byrow = TRUE
)
colnames(response_matrix) <- classes
response_matrix
#> Class A Class B
#> [1,] 0.2 0.8
#> [2,] 0.8 0.2
#> [3,] 0.7 0.3
# 3) calculate entropy
SLmetrics::entropy(
actual,
response_matrix
)
#> [1] 1.19185
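The value above suggests that the factor-method computes the average negative log-likelihood of the observed class. Assuming that, a minimal manual check in base R:
# minimal sketch, assuming entropy() here is the
# mean negative log-likelihood of the true class
-mean(log(response_matrix[cbind(seq_along(actual), as.integer(actual))]))
#> [1] 1.19185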
- Relative Root Mean Squared Error (Commit 5521b5b):
The function normalizes the Root Mean Squared Error by a factor. There is no single agreed-upon way of normalizing it, so in {SLmetrics} the RMSE can be normalized using three options: mean-, range- and IQR-normalization. It can be used as follows,
# 1) define values
actual <- rnorm(1e3)
predicted <- actual + rnorm(1e3)
# 2) calculate Relative Root Mean Squared Error
cat(
"Mean Relative Root Mean Squared Error", SLmetrics::rrmse(
actual = actual,
predicted = predicted,
normalization = 0
),
"Range Relative Root Mean Squared Error", SLmetrics::rrmse(
actual = actual,
predicted = predicted,
normalization = 1
),
"IQR Relative Root Mean Squared Error", SLmetrics::rrmse(
actual = actual,
predicted = predicted,
normalization = 2
),
sep = "\n"
)
#> Mean Relative Root Mean Squared Error
#> 2751.381
#> Range Relative Root Mean Squared Error
#> 0.1564043
#> IQR Relative Root Mean Squared Error
#> 0.7323898
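The normalization options presumably divide the RMSE by the mean, the range and the interquartile range of the actual values, respectively; the large mean-normalized value above is consistent with this, since the mean of standard-normal actuals is close to zero. A rough base R sketch under that assumption:
# rough sketch, assuming normalization = 0, 1, 2
# divide the RMSE by mean, range and IQR respectively
rmse <- sqrt(mean((actual - predicted)^2))
rmse / mean(actual) # mean-normalized
rmse / diff(range(actual)) # range-normalized
rmse / IQR(actual) # IQR-normalized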
- Weighted Receiver Operating Characteristic and Precision-Recall Curves (PR #31):
These functions return the weighted versions of TPR and FPR, and of
precision and recall, in weighted.ROC() and weighted.prROC()
respectively. The weighted.ROC()-function can be used as follows
(the syntax is the same for weighted.prROC()),
# 1) simulate classes, response
# probabilities and sample weights
actual <- factor(
  sample(c("Class 1", "Class 2"), size = 1e6, replace = TRUE, prob = c(0.7, 0.3))
)
response <- ifelse(actual == "Class 1", rbeta(1e6, 2, 5), rbeta(1e6, 5, 2))
w <- ifelse(actual == "Class 1", runif(1e6, 0.5, 1.5), runif(1e6, 1, 2))
# Plot
plot(SLmetrics::weighted.ROC(actual, response, w))
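Conceptually, the weighted curves replace observation counts with sums of sample weights. As an illustration only, not necessarily how the package computes it, the weighted TPR and FPR at a single threshold, taking "Class 1" as the positive class, could be computed as:
# illustrative only: weighted TPR and FPR
# at one threshold, with "Class 1" as positive
threshold <- 0.5
positive <- actual == "Class 1"
pred_pos <- response >= threshold
sum(w[pred_pos & positive]) / sum(w[positive]) # weighted TPR
sum(w[pred_pos & !positive]) / sum(w[!positive]) # weighted FPR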
⚠️ Breaking Changes
- Weighted Confusion Matrix: The
w-argument in cmatrix() has been
removed in favor of the more verbose weighted confusion matrix call,
the weighted.cmatrix()-function. See below,
Prior to version 0.3-0
the weighted confusion matrix was part of
the cmatrix()-function and was called as follows,
SLmetrics::cmatrix(
actual = actual,
predicted = predicted,
w = weights
)
This solution, although simple, was inconsistent with the remaining
implementation of weighted metrics in {SLmetrics}. To regain consistency
and simplicity, the weighted confusion matrix is now retrieved as
follows,
# 1) define factors
actual <- factor(sample(letters[1:3], 100, replace = TRUE))
predicted <- factor(sample(letters[1:3], 100, replace = TRUE))
weights <- runif(length(actual))
# 2) without weights
SLmetrics::cmatrix(
actual = actual,
predicted = predicted
)
#> a b c
#> a 7 8 18
#> b 6 13 15
#> c 15 14 4
# 2) with weights
SLmetrics::weighted.cmatrix(
actual = actual,
predicted = predicted,
w = weights
)
#> a b c
#> a 3.627355 4.443065 7.164199
#> b 3.506631 5.426818 8.358687
#> c 6.615661 6.390454 2.233511
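Each cell of the weighted confusion matrix is, presumably, the sum of the weights of the observations falling into it. One cell can be checked manually:
# check a single cell: sum of weights where both
# actual and predicted are "a"; this should match
# the (a, a) entry printed above (3.627355)
sum(weights[actual == "a" & predicted == "a"])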
🐛 Bug-fixes
- Return named vectors: The classification metrics were not returning
named vectors when micro == NULL. This has been fixed.
{SLmetrics} v0.2-0
Note
Version 0.2-0 is considered a pre-release of {SLmetrics}. We do not
expect any breaking changes, unless a major bug/issue is reported and
its nature forces breaking changes.
Improvements
- documentation: The documentation has gotten some extra love;
all functions now have their formulas embedded, and the details section
has been freed from the general description of [factor] creation.
This makes room for future expansions of the various functions
where more details are required.
- weighted classification metrics: The cmatrix()-function now
accepts the argument w, the sample weights; if passed, the
respective method will return the weighted metric. Below is an example
using sample weights for the confusion matrix,
# 1) define factors
actual <- factor(sample(letters[1:3], 100, replace = TRUE))
predicted <- factor(sample(letters[1:3], 100, replace = TRUE))
weights <- runif(length(actual))
# 2) without weights
SLmetrics::cmatrix(
actual = actual,
predicted = predicted
)
#> a b c
#> a 16 6 8
#> b 14 10 11
#> c 5 15 15
# 2) with weights
SLmetrics::cmatrix(
actual = actual,
predicted = predicted,
w = weights
)
#> a b c
#> a 8.796270 3.581817 3.422532
#> b 6.471277 4.873632 5.732148
#> c 0.908202 8.319738 8.484611
Weighted metrics can be calculated manually or by using the
foo.cmatrix()-method,
# 1) weighted confusion matrix
# and weighted accuracy
confusion_matrix <- SLmetrics::cmatrix(
actual = actual,
predicted = predicted,
w = weights
)
# 2) pass into accuracy
# function
SLmetrics::accuracy(
confusion_matrix
)
#> [1] 0.4379208
# 3) calculate the weighted
# accuracy manually
SLmetrics::weighted.accuracy(
actual = actual,
predicted = predicted,
w = weights
)
#> [1] 0.4379208
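The weighted accuracy is simply the weighted proportion of correct predictions, i.e. the sum of the diagonal of the weighted confusion matrix over its total sum. Assuming the cmatrix object can be treated as a plain numeric matrix, the value can be reproduced manually:
# manual check, assuming the confusion matrix
# behaves like a plain numeric matrix
m <- as.matrix(confusion_matrix)
sum(diag(m)) / sum(m)
#> [1] 0.4379208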
Please note, however, that it is not possible to pass a cmatrix()
into weighted.accuracy(),
try(
  SLmetrics::weighted.accuracy(
    confusion_matrix
  )
)
#> Error in UseMethod(generic = "weighted.accuracy", object = ..1) :
#> no applicable method for 'weighted.accuracy' applied to an object of class "cmatrix"
- Unit-testing: All functions are now tested for edge cases in
balanced and imbalanced classification problems, and in regression
problems, individually. This enables a more robust development
process and prevents avoidable bugs.
Bug-fixes
- Floating precision: Metrics would give different results based on
the method used, meaning that foo.cmatrix() and foo.factor()
would produce different results (see issue #16). This has been fixed
by using the higher-precision Rcpp::NumericMatrix instead of
Rcpp::IntegerMatrix.
- Miscalculation of Confusion Matrix elements: An error in how
FN, TN, FP and TP were calculated has been fixed. No issue was
raised for this bug, and it was not caught by the unit tests, as the
total number of samples was too high to spot the error. It has,
however, been fixed now. This means that all metrics that use
these elements explicitly are now stable and produce the desired output.
- Calculation Error in Fowlkes-Mallows Index: A bug in the
calculation of the fmi()-function has been fixed. The
fmi()-function now correctly calculates the measure.
- Calculation Error in Pinball Deviance and Concordance Correlation
Coefficient: See issue #19. Switched to an unbiased
variance calculation in the ccc()-function. The pinball()-function
was missing a weighted quantile function. The issue is now fixed.
- Calculation Error in Balanced Accuracy: See issue
#24. The function now
correctly adjusts for random chance, and the result matches that of
{scikit-learn}.
- Calculation Error in F-beta Score: See issue
#23. The function wasn't
respecting na.rm and micro; this has been fixed accordingly.
- Calculation Error in Relative Absolute Error: The function was
incorrectly calculating means instead of sums. This has been fixed.
Breaking changes
- All regression metrics have had the na.rm- and w-arguments removed.
All weighted regression metrics now have a separate
weighted.foo()-function, to increase consistency across all metrics.
See the example below,
# 1) define regression problem
actual <- rnorm(n = 1e3)
predicted <- actual + rnorm(n = 1e3)
w <- runif(n = 1e3)
# 2) unweighted metrics
SLmetrics::rmse(actual, predicted)
#> [1] 0.9613081
# 3) weighted metrics
SLmetrics::weighted.rmse(actual, predicted, w = w)
#> [1] 0.957806
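For reference, a weighted RMSE of this kind is typically the square root of the weighted mean of squared errors; a minimal base R sketch, assuming that definition:
# minimal sketch, assuming weighted.rmse() is the
# square root of the weighted mean squared error
sqrt(sum(w * (actual - predicted)^2) / sum(w))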
- The rrmse()-function has been removed in favor of the
rrse()-function. The former was incorrectly specified and
described in the package.
{SLmetrics} v0.1-1
Note
Version 0.1-1 is considered a pre-release of {SLmetrics}. We do not
expect any breaking changes, unless a major bug/issue is reported and
its nature forces breaking changes.
General
- Backend changes: All pair-wise metrics have been moved from {Rcpp} to
C++, which has reduced execution time by half. All pair-wise metrics
are now faster.
Improvements
- NA-controls: All pair-wise metrics that don't have a
micro-argument were handling missing values according to C++
and {Rcpp} internals. See the related issue. Thank you
@EmilHvitfeldt for pointing this out. This has now been fixed, so the
functions use an na.rm-argument to explicitly control for this.
See below,
# 1) define factors
actual <- factor(c("no", "yes"))
predicted <- factor(c(NA, "no"))
# 2) accuracy with na.rm = TRUE
SLmetrics::accuracy(
actual = actual,
predicted = predicted,
na.rm = TRUE
)
#> [1] 0
# 2) accuracy with na.rm = FALSE
SLmetrics::accuracy(
actual = actual,
predicted = predicted,
na.rm = FALSE
)
#> [1] NaN
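Conceptually, na.rm = TRUE drops the incomplete pairs before the metric is computed; a rough base R analogue of the accuracy call above:
# rough analogue of na.rm = TRUE: drop pairs with
# missing values, then compute the share of matches
ok <- !is.na(actual) & !is.na(predicted)
mean(as.character(actual)[ok] == as.character(predicted)[ok])
#> [1] 0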
Bug-fixes
- The plot.prROC()- and plot.ROC()-functions now add a line to
the plot when panels = FALSE. See Issue
#9.
# 1) define actual
# classes
actual <- factor(
sample(letters[1:2], size = 100, replace = TRUE)
)
# 2) define response
# probabilities
response <- runif(100)
# 3) calculate
# ROC and prROC
# 3.1) ROC
roc <- SLmetrics::ROC(
actual,
response
)
# 3.2) prROC
prroc <- SLmetrics::prROC(
actual,
response
)
# 4) plot with panels
# FALSE
par(mfrow = c(1,2))
plot(
roc,
panels = FALSE
)
plot(
prroc,
panels = FALSE
)
{SLmetrics} v0.1-0
Note
Version 0.1-0 is considered a pre-release of {SLmetrics}. We do not
expect any breaking changes, unless a major bug/issue is reported and
its nature forces breaking changes.
General
- {SLmetrics} is a collection of Machine Learning performance
evaluation functions for supervised learning. Visit the online
documentation on GitHub
Pages.
Examples
Supervised classification metrics
# 1) actual classes
print(
actual <- factor(
sample(letters[1:3], size = 10, replace = TRUE)
)
)
#> [1] b a b b a c b c c a
#> Levels: a b c
# 2) predicted classes
print(
predicted <- factor(
sample(letters[1:3], size = 10, replace = TRUE)
)
)
#> [1] c c a b a b c c a c
#> Levels: a b c
# 1) calculate confusion
# matrix and summarise
# it
summary(
confusion_matrix <- SLmetrics::cmatrix(
actual = actual,
predicted = predicted
)
)
#> Confusion Matrix (3 x 3)
#> ================================================================================
#> a b c
#> a 1 0 2
#> b 1 1 2
#> c 1 1 1
#> ================================================================================
#> Overall Statistics (micro average)
#> - Accuracy: 0.30
#> - Balanced Accuracy: 0.31
#> - Sensitivity: 0.30
#> - Specificity: 0.65
#> - Precision: 0.30
# 2) calculate class-wise
# false positive rate
SLmetrics::fpr(
confusion_matrix
)
#> a b c
#> 0.2857143 0.1666667 0.5714286
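The class-wise rates follow directly from the confusion matrix: for each class, FP is its column sum minus the diagonal entry, and TN is everything outside its row and column. Assuming the confusion matrix can be treated as a plain matrix, a manual check:
# manual check of the class-wise FPR
m <- as.matrix(confusion_matrix)
FP <- colSums(m) - diag(m)
TN <- sum(m) - rowSums(m) - colSums(m) + diag(m)
FP / (FP + TN)
#> a b c
#> 0.2857143 0.1666667 0.5714286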
Supervised regression metrics
# 1) actual values
actual <- rnorm(n = 100)
# 2) predicted values
predicted <- actual + rnorm(n = 100)
# 1) calculate
# huber loss
SLmetrics::huberloss(
actual = actual,
predicted = predicted
)
#> [1] 0.394088
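Huber loss is quadratic for small errors and linear for large ones. A minimal base R sketch, assuming a threshold (delta) of 1; the parameter name and default here are an assumption, not the package's documented interface:
# minimal sketch of Huber loss with an assumed
# threshold (delta) of 1: quadratic for small
# errors, linear beyond delta
huber <- function(actual, predicted, delta = 1) {
  error <- abs(actual - predicted)
  mean(
    ifelse(
      error <= delta,
      0.5 * error^2,
      delta * (error - 0.5 * delta)
    )
  )
}
huber(actual, predicted)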