{SLmetrics} Version 0.3-0 🚀 (#33)

> [!NOTE]
>
> See NEWS or commit history for detailed changes.

## 📚 What?

### 🚀 New features

This update introduces four new features. These are described below,

**Cross-Entropy Loss (PR #34):** Weighted and unweighted cross-entropy loss. The function can be used as follows,

``` r
# 1) define classes and
# observed classes (actual)
classes <- c("Class A", "Class B")

actual <- factor(
  c("Class A", "Class B", "Class A"),
  levels = classes
)

# 2) define probabilities
# and construct response_matrix
response <- c(
  0.2, 0.8,
  0.8, 0.2,
  0.7, 0.3
)

response_matrix <- matrix(
  response,
  nrow = 3,
  ncol = 2,
  byrow = TRUE
)

colnames(response_matrix) <- classes
response_matrix
#>      Class A Class B
#> [1,]     0.2     0.8
#> [2,]     0.8     0.2
#> [3,]     0.7     0.3

# 3) calculate entropy
SLmetrics::entropy(
  actual,
  response_matrix
)
#> [1] 1.19185
```

**Relative Root Mean Squared Error (Commit 5521b5b49b1e268c50d6b8d61ae1c6243c4944b3):** The function normalizes the Root Mean Squared Error by a factor. There is no official way of normalizing it, and in {SLmetrics} the RMSE can be normalized using one of three options: mean-, range- and IQR-normalization. It can be used as follows,

``` r
# 1) define values
actual <- rnorm(1e3)
predicted <- actual + rnorm(1e3)

# 2) calculate Relative Root Mean Squared Error
cat(
  "Mean Relative Root Mean Squared Error", SLmetrics::rrmse(
    actual        = actual,
    predicted     = predicted,
    normalization = 0
  ),
  "Range Relative Root Mean Squared Error", SLmetrics::rrmse(
    actual        = actual,
    predicted     = predicted,
    normalization = 1
  ),
  "IQR Relative Root Mean Squared Error", SLmetrics::rrmse(
    actual        = actual,
    predicted     = predicted,
    normalization = 2
  ),
  sep = "\n"
)
#> Mean Relative Root Mean Squared Error
#> 2751.381
#> Range Relative Root Mean Squared Error
#> 0.1564043
#> IQR Relative Root Mean Squared Error
#> 0.7323898
```

**Weighted Receiver Operator Characteristics and Precision-Recall Curves (PR #31):** These functions return the weighted versions of `TPR` and `FPR` in `weighted.ROC()`, and of `precision` and `recall` in `weighted.prROC()`. The `weighted.ROC()`-function[^1] can be used as follows,

``` r
# 1) define observed classes, response
# values and sample weights
actual <- factor(
  sample(
    c("Class 1", "Class 2"),
    size    = 1e6,
    replace = TRUE,
    prob    = c(0.7, 0.3)
  )
)

response <- ifelse(
  actual == "Class 1",
  rbeta(sum(actual == "Class 1"), 2, 5),
  rbeta(sum(actual == "Class 2"), 5, 2)
)

w <- ifelse(
  actual == "Class 1",
  runif(sum(actual == "Class 1"), 0.5, 1.5),
  runif(sum(actual == "Class 2"), 1, 2)
)
```

``` r
# Plot
plot(SLmetrics::weighted.ROC(actual, response, w))
```

![](https://i.imgur.com/YG9kqZa.png)

### ⚠️ Breaking Changes

- **Weighted Confusion Matrix:** The `w`-argument in `cmatrix()` has been removed in favor of the more verbose `weighted.cmatrix()`-function. See below,

Prior to version `0.3-0` the weighted confusion matrix was a part of the `cmatrix()`-function and was called as follows,

``` r
SLmetrics::cmatrix(
  actual    = actual,
  predicted = predicted,
  w         = weights
)
```

This solution, although simple, was inconsistent with the remaining implementation of weighted metrics in {SLmetrics}. To regain consistency and simplicity the weighted confusion matrix is now retrieved as follows,

``` r
# 1) define factors
actual    <- factor(sample(letters[1:3], 100, replace = TRUE))
predicted <- factor(sample(letters[1:3], 100, replace = TRUE))
weights   <- runif(length(actual))

# 2) without weights
SLmetrics::cmatrix(
  actual    = actual,
  predicted = predicted
)
#>    a  b  c
#> a  7  8 18
#> b  6 13 15
#> c 15 14  4
```

``` r
# 3) with weights
SLmetrics::weighted.cmatrix(
  actual    = actual,
  predicted = predicted,
  w         = weights
)
#>          a        b        c
#> a 3.627355 4.443065 7.164199
#> b 3.506631 5.426818 8.358687
#> c 6.615661 6.390454 2.233511
```

### 🐛 Bug-fixes

- **Return named vectors:** The classification metrics were not returning named vectors when `micro == NULL`. This has been fixed.

[^1]: The syntax is the same for `weighted.prROC()`
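
As a sanity check on the numbers in the description above, the two loss functions can be sketched in base R. This is a minimal illustration and not the package's implementation. The helper `rrmse_sketch()` and its string-valued `normalization` argument are hypothetical, and the sketch assumes that the normalizing factor is computed from `actual` and that the IQR uses R's default quantile type.

``` r
# Cross-entropy by hand: mean negative log-probability
# assigned to the observed class (assumed definition)
classes <- c("Class A", "Class B")
actual  <- factor(c("Class A", "Class B", "Class A"), levels = classes)

response_matrix <- matrix(
  c(0.2, 0.8,
    0.8, 0.2,
    0.7, 0.3),
  nrow = 3, ncol = 2, byrow = TRUE,
  dimnames = list(NULL, classes)
)

# pick, for each row, the probability of the observed class
p_observed <- response_matrix[cbind(seq_along(actual), as.integer(actual))]

manual_entropy <- -mean(log(p_observed))
round(manual_entropy, 5)
#> [1] 1.19185

# Hypothetical helper: RMSE divided by the mean, range or IQR
# of the observed values (assumed normalizing factors)
rrmse_sketch <- function(actual, predicted,
                         normalization = c("mean", "range", "iqr")) {
  normalization <- match.arg(normalization)
  rmse <- sqrt(mean((actual - predicted)^2))

  denominator <- switch(
    normalization,
    mean  = mean(actual),
    range = diff(range(actual)),
    iqr   = IQR(actual)
  )

  rmse / denominator
}
```

The hand-computed cross-entropy agrees with the `1.19185` shown for `SLmetrics::entropy()`, which suggests the loss is averaged over observations in that example. The very large mean-normalized value in the RRMSE output is what one would expect under this sketch when `actual` is drawn from `rnorm()`, since its mean is close to zero.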
serkor1 · Dec 30, 2024 · 878972a
1 parent 2c896fa
Showing 135 changed files with 4,862 additions and 3,746 deletions.
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: SLmetrics
 Title: Machine Learning Performance Evaluation on Steroids
-Version: 0.2-0
+Version: 0.3-0
 Authors@R: c(
     person(
       given   = "Serkan", 

diff --git a/Makefile b/Makefile
@@ -9,38 +9,27 @@ PKGNAME = SLmetrics
 VERSION = $(shell grep "^Version:" DESCRIPTION | sed "s/Version: //")
 TARBALL = $(PKGNAME)_$(VERSION).tar.gz
 
-py-setup:
-	@echo "Setting up Python environment"
-	@echo "============================="
-	@python -m venv .venv
-	@echo "Activating virtual environment"
-	@pip cache purge
-	@python -m pip install --upgrade pip
-	@pip install numpy scipy torch torchmetrics scikit-learn imbalanced-learn mkl mkl-service mkl_fft mkl_random
-	@echo "Done!"
-
-py-check:
-	@echo "Checking installed python modules"
-	@echo "================================="
-	@pip list
-
 document:
 	clear
 	@echo "Documenting {$(PKGNAME)}"
 	@Rscript tools/document.R
 
 build: document
 	@echo "Installing {$(PKGNAME)}"
+	rm -f src/*.o src/*.so
 	R CMD build .
 	R CMD INSTALL $(TARBALL)
 	rm -f $(TARBALL)
+	rm -f src/*.o src/*.so
 
 check: document
 	@echo "Checking {$(PKGNAME)}"
+	rm -f src/*.o src/*.so
 	R CMD build .
 	R CMD check $(TARBALL)
 	rm -f $(TARBALL)
 	rm -rf $(PKGNAME).Rcheck
+	rm -f src/*.o src/*.so
 
 build-site:
 	@echo "Building {pkgdown}"

diff --git a/NAMESPACE b/NAMESPACE
@@ -13,6 +13,7 @@ S3method(csi,cmatrix)
 S3method(csi,factor)
 S3method(dor,cmatrix)
 S3method(dor,factor)
+S3method(entropy,factor)
 S3method(fallout,cmatrix)
 S3method(fallout,factor)
 S3method(fbeta,cmatrix)
@@ -28,6 +29,7 @@ S3method(fpr,factor)
 S3method(huberloss,numeric)
 S3method(jaccard,cmatrix)
 S3method(jaccard,factor)
+S3method(logloss,factor)
 S3method(mae,numeric)
 S3method(mape,numeric)
 S3method(mcc,cmatrix)
@@ -61,6 +63,7 @@ S3method(recall,cmatrix)
 S3method(recall,factor)
 S3method(rmse,numeric)
 S3method(rmsle,numeric)
+S3method(rrmse,numeric)
 S3method(rrse,numeric)
 S3method(rsq,numeric)
 S3method(selectivity,cmatrix)
@@ -79,19 +82,23 @@ S3method(tpr,cmatrix)
 S3method(tpr,factor)
 S3method(tscore,cmatrix)
 S3method(tscore,factor)
+S3method(weighted.ROC,factor)
 S3method(weighted.accuracy,factor)
 S3method(weighted.baccuracy,factor)
 S3method(weighted.ccc,numeric)
 S3method(weighted.ckappa,factor)
+S3method(weighted.cmatrix,factor)
 S3method(weighted.csi,factor)
 S3method(weighted.dor,factor)
+S3method(weighted.entropy,factor)
 S3method(weighted.fallout,factor)
 S3method(weighted.fbeta,factor)
 S3method(weighted.fdr,factor)
 S3method(weighted.fer,factor)
 S3method(weighted.fpr,factor)
 S3method(weighted.huberloss,numeric)
 S3method(weighted.jaccard,factor)
+S3method(weighted.logloss,factor)
 S3method(weighted.mae,numeric)
 S3method(weighted.mape,numeric)
 S3method(weighted.mcc,factor)
@@ -103,11 +110,13 @@ S3method(weighted.phi,factor)
 S3method(weighted.pinball,numeric)
 S3method(weighted.plr,factor)
 S3method(weighted.ppv,factor)
+S3method(weighted.prROC,factor)
 S3method(weighted.precision,factor)
 S3method(weighted.rae,numeric)
 S3method(weighted.recall,factor)
 S3method(weighted.rmse,numeric)
 S3method(weighted.rmsle,numeric)
+S3method(weighted.rrmse,numeric)
 S3method(weighted.rrse,numeric)
 S3method(weighted.rsq,numeric)
 S3method(weighted.selectivity,factor)
@@ -128,6 +137,7 @@ export(ckappa)
 export(cmatrix)
 export(csi)
 export(dor)
+export(entropy)
 export(fallout)
 export(fbeta)
 export(fdr)
@@ -136,6 +146,7 @@ export(fmi)
 export(fpr)
 export(huberloss)
 export(jaccard)
+export(logloss)
 export(mae)
 export(mape)
 export(mcc)
@@ -153,6 +164,7 @@ export(rae)
 export(recall)
 export(rmse)
 export(rmsle)
+export(rrmse)
 export(rrse)
 export(rsq)
 export(selectivity)
@@ -162,19 +174,23 @@ export(specificity)
 export(tnr)
 export(tpr)
 export(tscore)
+export(weighted.ROC)
 export(weighted.accuracy)
 export(weighted.baccuracy)
 export(weighted.ccc)
 export(weighted.ckappa)
+export(weighted.cmatrix)
 export(weighted.csi)
 export(weighted.dor)
+export(weighted.entropy)
 export(weighted.fallout)
 export(weighted.fbeta)
 export(weighted.fdr)
 export(weighted.fer)
 export(weighted.fpr)
 export(weighted.huberloss)
 export(weighted.jaccard)
+export(weighted.logloss)
 export(weighted.mae)
 export(weighted.mape)
 export(weighted.mcc)
@@ -186,11 +202,13 @@ export(weighted.phi)
 export(weighted.pinball)
 export(weighted.plr)
 export(weighted.ppv)
+export(weighted.prROC)
 export(weighted.precision)
 export(weighted.rae)
 export(weighted.recall)
 export(weighted.rmse)
 export(weighted.rmsle)
+export(weighted.rrmse)
 export(weighted.rrse)
 export(weighted.rsq)
 export(weighted.selectivity)

diff --git a/NEWS.Rmd b/NEWS.Rmd
@@ -15,14 +15,120 @@ knitr::opts_chunk$set(
 set.seed(1903) 
 ```
 
-# Version 0.2-0
+# Version 0.3-0
 
-> Version 0.2-0 is considered pre-release of {SLmetrics}. We do not
+> Version 0.3-0 is considered pre-release of {SLmetrics}. We do not
 > expect any breaking changes, unless a major bug/issue is reported and its nature
 > forces breaking changes.
 
 ## Improvements
 
+## New Feature
+
+* **Relative Root Mean Squared Error:** The function normalizes the Root Mean Squared Error by a facttor. There is no official way of normalizing it - and in {SLmetrics} the RMSE can be normalized using three options; mean-, range- and IQR-normalization. It can be used as follows,
+
+```{r}
+# 1) define values
+actual <- rnorm(1e3)
+predicted <- actual + rnorm(1e3)
+
+# 2) calculate Relative Root Mean Squared Error
+cat(
+  "Mean Relative Root Mean Squared Error", SLmetrics::rrmse(
+    actual        = actual,
+    predicted     = predicted,
+    normalization = 0
+  ),
+  "Range Relative Root Mean Squared Error", SLmetrics::rrmse(
+    actual        = actual,
+    predicted     = predicted,
+    normalization = 1
+  ),
+  "IQR Relative Root Mean Squared Error", SLmetrics::rrmse(
+    actual        = actual,
+    predicted     = predicted,
+    normalization = 2
+  ),
+  sep = "\n"
+)
+```
+
+* **Cross Entropy:** Weighted and unweighted Cross Entropy, with and without normalization. The function can be used as follows,
+
+```{r}
+# Create factors and response probabilities
+actual   <- factor(c("Class A", "Class B", "Class A"))
+weights  <- c(0.3,0.9,1) 
+response <- matrix(cbind(
+    0.2, 0.8,
+    0.8, 0.2,
+    0.7, 0.3
+),nrow = 3, ncol = 2)
+
+cat(
+    "Unweighted Cross Entropy:",
+    SLmetrics::entropy(
+        actual,
+        response
+    ),
+    "Weighted Cross Entropy:",
+    SLmetrics::weighted.entropy(
+        actual   = actual,
+        response = response,
+        w        = weights
+    ),
+    sep = "\n"
+)
+```
+
+* **Weighted Receiver Operator Characteristics:** `weighted.ROC()`, the function calculates the weighted True Positive and False Positive Rates for each threshold.
+
+* **Weighted Precision-Recall Curve:** `weighted.prROC()`, the function calculates the weighted Recall and Precsion for each threshold.
+
+## Breaking Changes
+
+* **Weighted Confusion Matix:** The `w`-argument in `cmatrix()` has been removed in favor of the more verbose weighted confusion matrix call `weighted.cmatrix()`-function. See below,
+
+Prior to version `0.3-0` the weighted confusion matrix were a part of the `cmatrix()`-function and were called as follows,
+
+```{r, eval = FALSE}
+SLmetrics::cmatrix(
+    actual    = actual,
+    predicted = predicted,
+    w         = weights
+)
+```
+
+This solution, although simple, were inconsistent with the remaining implementation of weighted metrics in {SLmetrics}. To regain consistency and simplicity the weighted confusion matrix are now retrieved as follows,
+
+```{r}
+# 1) define factors
+actual    <- factor(sample(letters[1:3], 100, replace = TRUE))
+predicted <- factor(sample(letters[1:3], 100, replace = TRUE))
+weights   <- runif(length(actual))
+
+# 2) without weights
+SLmetrics::cmatrix(
+    actual    = actual,
+    predicted = predicted
+)
+
+# 2) with weights
+SLmetrics::weighted.cmatrix(
+    actual    = actual,
+    predicted = predicted,
+    w         = weights
+)
+```
+
+## Bug-fixes
+
+* **Return named vectors:** The classification metrics when `micro == NULL` were not returning named vectors. This has been fixed. 
+
+# Version 0.2-0
+
+## Improvements
+
 * **documentation:** The documentation has gotten some extra love, and now all functions have their formulas embedded, the details section have been freed from a general description of [factor] creation. This will make room for future expansions on the various functions where more details are required.
 
 * **weighted classification metrics:** The `cmatrix()`-function now accepts the argument `w` which is the sample weights; if passed the respective method will return the weighted metric. Below is an example using sample weights for the confusion matrix,
@@ -40,7 +146,7 @@ SLmetrics::cmatrix(
 )
 
 # 2) with weights
-SLmetrics::cmatrix(
+SLmetrics::weighted.cmatrix(
     actual    = actual,
     predicted = predicted,
     w         = weights