Skip to content

Commit

Permalink
Adding cumulative counts and percents to ard_categorical() (#373)
Browse files Browse the repository at this point in the history
**What changes are proposed in this pull request?**
* The `ard_categorical()` function can now return cumulative counts and
percentages with `ard_categorical(statistic = varname ~ c('n_cum',
'p_cum'))`. (#145)

**Reference GitHub issue associated with pull request.** _e.g., 'closes
#<issue number>'_
closes #145


--------------------------------------------------------------------------------

Pre-review Checklist (if item does not apply, mark is as complete)
- [x] **All** GitHub Action workflows pass with a ✅
- [x] PR branch has pulled the most recent updates from master branch:
`usethis::pr_merge_main()`
- [x] If a bug was fixed, a unit test was added.
- [x] Code coverage is suitable for any new functions/features
(generally, 100% coverage for new code): `devtools::test_coverage()`
- [x] Request a reviewer

Reviewer Checklist (if item does not apply, mark is as complete)

- [ ] If a bug was fixed, a unit test was added.
- [ ] Run `pkgdown::build_site()`. Check the R console for errors, and
review the rendered website.
- [ ] Code coverage is suitable for any new functions/features:
`devtools::test_coverage()`

When the branch is ready to be merged:
- [ ] Update `NEWS.md` with the changes from this pull request under the
heading "`# cards (development version)`". If there is an issue
associated with the pull request, reference it in parentheses at the end
update (see `NEWS.md` for examples).
- [ ] **All** GitHub Action workflows pass with a ✅
- [ ] Approve Pull Request
- [ ] Merge the PR. Please use "Squash and merge" or "Rebase and merge".

_Optional Reverse Dependency Checks_:

- Install `checked` with `pak::pak("Genentech/checked")` or
`pak::pak("checked")`
- Check dev versions of `cardx`, `gtsummary`, and `tfrmt` which are in
the `ddsjoberg` R Universe

  ```shell
Rscript -e "options(checked.check_envvars = c(NOT_CRAN = TRUE));
checked::check_rev_deps(path = '.', repos =
c('https://ddsjoberg.r-universe.dev', 'https://cloud.r-project.org'))"
  ```

- Check CRAN reverse dependencies but run tests skipped on CRAN

  ```shell
Rscript -e "options(checked.check_envvars = c(NOT_CRAN = TRUE));
checked::check_rev_deps(path = '.', repos =
'https://cloud.r-project.org')"
  ```

- Check CRAN reverse dependencies in a CRAN-like environment

  ```shell
Rscript -e "options(checked.check_envvars = c(NOT_CRAN = FALSE),
checked.check_build_args = '--as-cran'); checked::check_rev_deps(path =
'.', repos = 'https://cloud.r-project.org')"
  ```

---------

Co-authored-by: Becca Krouse <[email protected]>
  • Loading branch information
ddsjoberg and bzkrouse authored Jan 16, 2025
1 parent d9db1d4 commit e027ad7
Show file tree
Hide file tree
Showing 15 changed files with 396 additions and 73 deletions.
6 changes: 3 additions & 3 deletions .github/PULL_REQUEST_TEMPLATE.md
Original file line number Diff line number Diff line change
Expand Up @@ -33,17 +33,17 @@ _Optional Reverse Dependency Checks_:
- Check dev versions of `cardx`, `gtsummary`, and `tfrmt` which are in the `ddsjoberg` R Universe

```shell
Rscript -e "options(checked.check_envvars = c(NOT_CRAN = TRUE)); checked::check_rev_deps(path = '.', repos = c('https://ddsjoberg.r-universe.dev', 'https://cloud.r-project.org'))"
Rscript -e "options(checked.check_envvars = c(NOT_CRAN = TRUE)); checked::check_rev_deps(path = '.', n = parallel::detectCores() - 2L, repos = c('https://ddsjoberg.r-universe.dev', 'https://cloud.r-project.org'))"
```

- Check CRAN reverse dependencies but run tests skipped on CRAN

```shell
Rscript -e "options(checked.check_envvars = c(NOT_CRAN = TRUE)); checked::check_rev_deps(path = '.', repos = 'https://cloud.r-project.org')"
Rscript -e "options(checked.check_envvars = c(NOT_CRAN = TRUE)); checked::check_rev_deps(path = '.', n = parallel::detectCores() - 2, repos = 'https://cloud.r-project.org')"
```

- Check CRAN reverse dependencies in a CRAN-like environment

```shell
Rscript -e "options(checked.check_envvars = c(NOT_CRAN = FALSE), checked.check_build_args = '--as-cran'); checked::check_rev_deps(path = '.', repos = 'https://cloud.r-project.org')"
Rscript -e "options(checked.check_envvars = c(NOT_CRAN = FALSE), checked.check_build_args = '--as-cran'); checked::check_rev_deps(path = '.', n = parallel::detectCores() - 2, repos = 'https://cloud.r-project.org')"
```
29 changes: 20 additions & 9 deletions .github/workflows/test-coverage.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -4,9 +4,10 @@ on:
push:
branches: [main, master]
pull_request:
branches: [main, master]

name: test-coverage
name: test-coverage.yaml

permissions: read-all

jobs:
test-coverage:
Expand All @@ -15,36 +16,46 @@ jobs:
GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}

steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4

- uses: r-lib/actions/setup-r@v2
with:
use-public-rspm: true

- uses: r-lib/actions/setup-r-dependencies@v2
with:
extra-packages: any::covr
needs: coverage, check
extra-packages: any::covr, any::xml2
needs: coverage

- name: Test coverage
run: |
covr::codecov(
cov <- covr::package_coverage(
quiet = FALSE,
clean = FALSE,
install_path = file.path(Sys.getenv("RUNNER_TEMP"), "package")
install_path = file.path(normalizePath(Sys.getenv("RUNNER_TEMP"), winslash = "/"), "package")
)
covr::to_cobertura(cov)
shell: Rscript {0}

- uses: codecov/codecov-action@v4
with:
# Fail if error if not on PR, or if on PR and token is given
fail_ci_if_error: ${{ github.event_name != 'pull_request' || secrets.CODECOV_TOKEN }}
file: ./cobertura.xml
plugin: noop
disable_search: true
token: ${{ secrets.CODECOV_TOKEN }}

- name: Show testthat output
if: always()
run: |
## --------------------------------------------------------------------
find ${{ runner.temp }}/package -name 'testthat.Rout*' -exec cat '{}' \; || true
find '${{ runner.temp }}/package' -name 'testthat.Rout*' -exec cat '{}' \; || true
shell: bash

- name: Upload test results
if: failure()
uses: actions/upload-artifact@v3
uses: actions/upload-artifact@v4
with:
name: coverage-test-failures
path: ${{ runner.temp }}/package
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,7 @@ Suggests:
spelling (>= 2.2.0),
testthat (>= 3.2.0),
withr (>= 3.0.0)
Config/Needs/check: hms
Config/Needs/coverage: hms
Config/Needs/website: rmarkdown, jsonlite, yaml, gtsummary, tfrmt,
insightsengineering/nesttemplate
Config/testthat/edition: 3
Expand Down
94 changes: 77 additions & 17 deletions R/ard_categorical.R
Original file line number Diff line number Diff line change
Expand Up @@ -17,13 +17,13 @@
#' Arguments may be used in conjunction with one another.
#' @param variables ([`tidy-select`][dplyr::dplyr_tidy_select])\cr
#' columns to include in summaries. Default is `everything()`.
#' @param denominator (`data.frame`, `integer`)\cr
#' Specify this *optional* argument to change the denominator,
#' e.g. the `"N"` statistic. Default is `NULL`. See below for details.
#' @param denominator (`string`, `data.frame`, `integer`)\cr
#' Specify this argument to change the denominator,
#' e.g. the `"N"` statistic. Default is `'column'`. See below for details.
#' @param statistic ([`formula-list-selector`][syntax])\cr
#' a named list, a list of formulas,
#' or a single formula where the list element one or more of `c("n", "N", "p")`
#' (or the RHS of a formula).
#' or a single formula where the list element one or more of `c("n", "N", "p", "n_cum", "p_cum")`
#' (on the RHS of a formula).
#' @param stat_label ([`formula-list-selector`][syntax])\cr
#' a named list, a list of formulas, or a single formula where
#' the list element is either a named list or a list of formulas defining the
Expand All @@ -45,14 +45,18 @@
#' In such cases, use the `denominator` argument to specify a new definition
#' of `"N"`, and subsequently `"p"`.
#' The argument expects one of the following inputs:
#' - a string: one of `"column"`, `"row"`, or `"cell"`.
#' - `"column"`, the default, returns percentages where the sum is equal to
#' one within the variable after the data frame has been subset with `by`/`strata`.
#' - `"row"` gives 'row' percentages where `by`/`strata` columns are the 'top'
#' of a cross table, and the variables are the rows. This is well-defined
#' for a single `by` or `strata` variable, and care must be taken when there
#' are more to ensure the the results are as you expect.
#' - `"cell"` gives percentages where the denominator is the number of non-missing
#' rows in the source data frame.
#' - a data frame. Any columns in the data frame that overlap with the `by`/`strata`
#' columns will be used to calculate the new `"N"`.
#' - an integer. This single integer will be used as the new `"N"`
#' - a string: one of `"column"`, `"row"`, or `"cell"`. `"column"` is equivalent
#' to `denominator=NULL`. `"row"` gives 'row' percentages where `by`/`strata`
#' columns are the 'top' of a cross table, and the variables are the rows.
#' `"cell"` gives percentages where the denominator is the number of non-missing
#' rows in the source data frame.
#' - a structured data frame. The data frame will include columns from `by`/`strata`.
#' The last column must be named `"...ard_N..."`. The integers in this column will
#' be used as the updated `"N"` in the calculations.
Expand Down Expand Up @@ -104,7 +108,7 @@ ard_categorical.data.frame <- function(data,
by = dplyr::group_vars(data),
strata = NULL,
statistic = everything() ~ c("n", "p", "N"),
denominator = NULL,
denominator = "column",
fmt_fn = NULL,
stat_label = everything() ~ default_stat_labels(),
...) {
Expand Down Expand Up @@ -137,8 +141,8 @@ ard_categorical.data.frame <- function(data,
)
check_list_elements(
x = statistic,
predicate = \(x) is.character(x) && all(x %in% c("n", "p", "N")),
error_msg = "Elements passed in the {.arg statistic} argument must be one or more of {.val {c('n', 'p', 'N')}}"
predicate = \(x) is.character(x) && all(x %in% c("n", "p", "N", "n_cum", "p_cum")),
error_msg = "Elements passed in the {.arg statistic} argument must be one or more of {.val {c('n', 'p', 'N', 'n_cum', 'p_cum')}}"
)

# return empty ARD if no variables selected ----------------------------------
Expand Down Expand Up @@ -247,7 +251,7 @@ ard_categorical.data.frame <- function(data,
imap(
statistics_tabulation,
function(x, variable) {
if (any(c("N", "p") %in% x[["tabulation"]])) {
if (any(c("N", "p", "p_cum") %in% x[["tabulation"]])) {
TRUE
} else {
NULL
Expand Down Expand Up @@ -282,22 +286,32 @@ ard_categorical.data.frame <- function(data,
))
}
}
if ("p" %in% tab_stats[["tabulation"]]) {
if (any(c("p", "p_cum") %in% tab_stats[["tabulation"]])) {
df_result_tabulation <-
df_result_tabulation |>
dplyr::mutate(
...ard_p... = .data$...ard_n... / .data$...ard_N...
)
}

df_result_tabulation <-
.add_cum_count_stats(
df_result_tabulation,
variable = variable,
by = by,
strata = strata,
denominator = denominator,
tab_stats = tab_stats
)

df_result_tabulation |>
.nesting_rename_ard_columns(variable = variable, by = by, strata = strata) |>
dplyr::mutate(
across(any_of(c("...ard_n...", "...ard_N...", "...ard_p...")), as.list),
across(any_of(c("...ard_n...", "...ard_N...", "...ard_p...", "...ard_n_cum...", "...ard_p_cum...")), as.list),
across(c(matches("^group[0-9]+_level$"), any_of("variable_level")), as.list)
) |>
tidyr::pivot_longer(
cols = any_of(c("...ard_n...", "...ard_N...", "...ard_p...")),
cols = any_of(c("...ard_n...", "...ard_N...", "...ard_p...", "...ard_n_cum...", "...ard_p_cum...")),
names_to = "stat_name",
values_to = "stat"
) |>
Expand Down Expand Up @@ -334,6 +348,52 @@ ard_categorical.data.frame <- function(data,
)
}



.add_cum_count_stats <- function(x, variable, by, strata, denominator, tab_stats) {
# if no cumulative stats were requested, return the object
if (!any(c("p_cum", "n_cum") %in% tab_stats[["tabulation"]])) {
return(x)
}

# to return cumulative stats, the denominator must be 'column' or 'row'
if (!is_string(denominator) || !denominator %in% c("column", "row")) {
cli::cli_abort(
"The {.arg denominator} argument must be one of {.val {c(\"column\", \"row\")}}
when cumulative statistics {.val n_cum} or {.val p_cum} are specified, which
were requested for variable {.var {variable}}.",
call = get_cli_abort_call()
)
}

# calculate the cumulative statistics
if (denominator %in% "column") {
x <- x |>
dplyr::mutate(
.by = any_of(c(by, strata)),
...ard_n_cum... = switch("n_cum" %in% tab_stats[["tabulation"]],
cumsum(.data$...ard_n...)
),
...ard_p_cum... = switch("p_cum" %in% tab_stats[["tabulation"]],
cumsum(.data$...ard_p...)
)
)
} else if (denominator %in% "row") {
x <- x |>
dplyr::mutate(
.by = any_of(variable),
...ard_n_cum... = switch("n_cum" %in% tab_stats[["tabulation"]],
cumsum(.data$...ard_n...)
),
...ard_p_cum... = switch("p_cum" %in% tab_stats[["tabulation"]],
cumsum(.data$...ard_p...)
)
)
}

x
}

#' Results from `table()` as Data Frame
#'
#' Takes the results from [table()] and returns them as a data frame.
Expand Down
2 changes: 2 additions & 0 deletions R/default_stat_labels.R
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,8 @@ default_stat_labels <- function() {
n = "n",
N = "N",
p = "%",
n_cum = "Cumulative n",
p_cum = "Cumulative %",
N_obs = "Vector Length",
N_miss = "N Missing",
N_nonmiss = "N Non-missing",
Expand Down
2 changes: 1 addition & 1 deletion README.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@ knitr::opts_chunk$set(

<!-- badges: start -->
[![CRAN status](https://www.r-pkg.org/badges/version/cards)](https://CRAN.R-project.org/package=cards)
[![Codecov test coverage](https://codecov.io/gh/insightsengineering/cards/branch/main/graph/badge.svg)](https://app.codecov.io/gh/insightsengineering/cards?branch=main)
[![Codecov test coverage](https://codecov.io/gh/insightsengineering/cards/graph/badge.svg)](https://app.codecov.io/gh/insightsengineering/cards)
[![Downloads](https://cranlogs.r-pkg.org/badges/cards)](https://cran.r-project.org/package=cards)
[![R-CMD-check](https://github.com/insightsengineering/cards/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/insightsengineering/cards/actions/workflows/R-CMD-check.yaml)
[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
Expand Down
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@
[![CRAN
status](https://www.r-pkg.org/badges/version/cards)](https://CRAN.R-project.org/package=cards)
[![Codecov test
coverage](https://codecov.io/gh/insightsengineering/cards/branch/main/graph/badge.svg)](https://app.codecov.io/gh/insightsengineering/cards?branch=main)
coverage](https://codecov.io/gh/insightsengineering/cards/graph/badge.svg)](https://app.codecov.io/gh/insightsengineering/cards)
[![Downloads](https://cranlogs.r-pkg.org/badges/cards)](https://cran.r-project.org/package=cards)
[![R-CMD-check](https://github.com/insightsengineering/cards/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/insightsengineering/cards/actions/workflows/R-CMD-check.yaml)
[![Lifecycle:
Expand Down
28 changes: 17 additions & 11 deletions man/ard_categorical.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

26 changes: 16 additions & 10 deletions man/ard_dichotomous.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading

0 comments on commit e027ad7

Please sign in to comment.