diff --git a/main/404.html b/main/404.html index 72e9514b..226a44de 100644 --- a/main/404.html +++ b/main/404.html @@ -1,4 +1,5 @@ - + +
@@ -42,17 +43,7 @@This file outlines how to propose and make changes to rbmi
as well as providing details about more obscure aspects of the package’s development process.
In order to develop or contribute to rbmi
you will need access to a C/C++ compiler. If you are on Windows you should install rtools or if you are on macOS you should install Xcode. Likewise, you will also need to install all of the package’s development dependencies. This can be done by launching R from within the project root and executing:
devtools::install_dev_deps()
If you want to make a code contribution, it’s a good idea to first file an issue and make sure someone from the team agrees that it’s needed. If you’ve found a bug, please file an issue that illustrates the bug with a minimal reprex (this will also help you write a unit test, if needed).
This project uses a simple GitHub flow model for development. That is, code changes should be done in their own feature branch based off of the main
branch and merged back into the main
branch once complete.
Pull Requests will not be accepted unless all CI/CD checks have passed. (See the CI/CD section for more information).
Pull Requests relating to any of the package’s core R code must be accompanied by a corresponding unit test. Any pull requests containing changes to the core R code that do not contain a unit test to demonstrate that it is working as intended will not be accepted. (See the Unit Testing section for more information).
Pull Requests should add a few lines about what has been changed to the NEWS.md
file.
We use roxygen2, with Markdown syntax, for documentation.
Please ensure your code conforms to lintr. You can check this by running lintr::lint("FILE NAME")
on any files you have modified and ensuring that the findings are kept to as few as possible. We do not have any hard requirements on following lintr
’s conventions but do encourage developers to follow its guidance as closely as possible.
This project uses 4 space indents; contributions not following this will not be accepted.
This project makes use of S3 and R6 for OOP. Usage of S4 and other OOP systems should be avoided unless absolutely necessary to ensure consistency. Having said that it is recommended to stick to S3 unless modification in place or other R6 specific features are required.
The current desire of this package is to keep the dependency tree as small as possible. To that end you are discouraged from adding any additional packages to the “Depends” / “Imports” section unless absolutely essential. If you are importing a package just to use a single function then consider just copying the source code of that function instead, though please check the licence and include proper attribution/notices. There are no such expectations for “Suggests” and you are free to use any package in the vignettes / unit tests, though again please be mindful to not be unnecessarily excessive with this.
This project uses testthat
to perform unit testing in combination with GitHub Actions for CI/CD.
Due to the stochastic nature of this package some unit tests take a considerable amount of time to execute. To avoid issues with usability, unit tests that take more than a couple of seconds to run should be deferred to the scheduled testing. These are tests that are only run occasionally on a periodic basis (currently twice a month) and not on every pull request / push event.
To defer a test to the scheduled build simply include skip_if_not(is_full_test())
to the top of the test_that()
block i.e.
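As a minimal sketch of what such a deferred test might look like: `is_full_test()` is named in the text above, but its implementation is not shown here, so the version below is a hypothetical stand-in that always returns `FALSE`. The test description and expectation are also illustrative only.

```r
library(testthat)

# Hypothetical stand-in: the real is_full_test() comes from the package's own
# test helpers; this placeholder simply pretends we are NOT in a full test run.
is_full_test <- function() FALSE

test_that("a slow, simulation-heavy check", {
    # Skips the remainder of this block unless a full (scheduled) run is active
    skip_if_not(is_full_test())
    expect_equal(1 + 1, 2)
})
```

With the stand-in returning `FALSE` the block is reported as skipped rather than run.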
@@ -122,39 +82,32 @@Scheduled Testinghttps://github.com/insightsengineering/rbmi” -> “Actions” -> “Bi-Weekly” -> “Run Workflow”. It is advisable to do this before releasing to CRAN.
To support CI/CD, in terms of reducing setup time, a Docker image has been created which contains all the packages and system dependencies required for this project. The image can be found at:
This image is automatically re-built once a month to contain the latest version of R and its packages. The code to create this image can be found in misc/docker
.
To build the image locally run the following from the project root directory:
docker build -f misc/docker/Dockerfile -t rbmi:latest .
A particular issue with testing this package is reproducibility. For the most part this is handled well via set.seed()
however stan
/rstan
does not guarantee reproducibility even with the same seed if run on different hardware.
This issue surfaces itself when testing the print messages of the pool
object which displays treatment estimates which are thus not identical when run on different machines. To address this issue pre-made pool
objects have been generated and stored in R/sysdata.rda
(which itself is generated by data-raw/create_print_test_data.R
). The generated print messages are compared to expected values which are stored in tests/testthat/_snaps/
(which themselves are automatically created by testthat::expect_snapshot()
)
This package currently uses the mmrm
package to fit MMRM models. This package is still fairly new but has so far proven to be very stable, fast and reliable. If you do spot any issues with the MMRM package please do raise them in the corresponding GitHub Repository - link
As the mmrm
package uses TMB
it is not uncommon to see warnings about inconsistent versions between what TMB
and the Matrix
package were compiled with. In order to resolve this you may wish to re-compile these packages from source using:
install.packages(c("TMB", "mmrm"), type = "source")
Note that you will need to have rtools installed if you are on a Windows machine or Xcode if you are running macOS (or otherwise have access to a C/C++ compiler).
The Bayesian models fitted by this package are implemented via stan/rstan. The code for this can be found in inst/stan/MMRM.stan
. Note that the package will automatically take care of compiling this code when you install it or run devtools::load_all()
. Please note that the package won’t recompile the code unless you have changed the source code or you delete the src
directory.
CRAN imposes a 10-minute run limit on building, compiling and testing your package. To keep to this limit the vignettes are pre-built; that is to say that simply changing the source code will not automatically update the vignettes, you will need to manually re-build them.
To do this you need to run:
Rscript vignettes/build.R
@@ -162,16 +115,14 @@ The misc/
folder in this project is used to hold useful scripts, analyses, simulations & infrastructure code that we wish to keep but isn’t essential to the build or deployment of the package. Feel free to store additional stuff in here that you feel is worth keeping.
Likewise, local/
has been added to the .gitignore
file meaning anything stored in this folder won’t be committed to the repository. For example, you may find this useful for storing personal scripts for testing or more generally exploring the package during development.
Version 2.0, January 2004 <http://www.apache.org/licenses/>
“License” shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document.
“Licensor” shall mean the copyright owner or entity authorized by the copyright owner that is granting the License.
“Legal Entity” shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, “control” means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity.
@@ -91,21 +59,17 @@Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form.
Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed.
You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions:
Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions.
This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file.
Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License.
In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages.
While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets []
replaced with your own identifying information. (Don’t include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same “printed page” as the copyright notice for easier identification within third-party archives.
Source: inst/CITATION
Gower-Page C, Noci A, Gravestock I (2024). +
Gower-Page C, Noci A, Gravestock I (2025). rbmi: Reference Based Multiple Imputation. R package version 1.3.1, https://github.com/insightsengineering/rbmi, https://insightsengineering.github.io/rbmi/.
@Manual{, title = {rbmi: Reference Based Multiple Imputation}, author = {Craig Gower-Page and Alessandro Noci and Isaac Gravestock}, - year = {2024}, + year = {2025}, note = {R package version 1.3.1, https://github.com/insightsengineering/rbmi}, url = {https://insightsengineering.github.io/rbmi/}, }@@ -131,8 +99,7 @@
CRAN release: 2024-12-11
rstan
model were not being correctly cleared (#459)
CRAN release: 2024-10-16
rstan
to be a suggested package to simplify the installation process. This means that the Bayesian imputation functionality will not be available by default. To use this feature, you will need to install rstan
separately (#441)
seed
argument to method_bayes()
in favour of using the base set.seed()
function (#431)rbmi
(#406)lsmeans()
for better consistency with the emmeans
package (#412)
lsmeans(..., weights = "proportional")
to lsmeans(..., weights = "counterfactual")
to more accurately reflect the weights used in the calculation.
lsmeans(..., weights = "proportional_em")
which provides consistent results with emmeans(..., weights = "proportional")
lsmeans(..., weights = "proportional")
has been left in the package for backwards compatibility and is an alias for lsmeans(..., weights = "counterfactual")
but now gives a message prompting users to use either “proportional_em” or “counterfactual” instead.
analyse()
function (#370)mmrm
package (#437)rbmi
citation detail (#423 #425)
CRAN release: 2023-11-24
CRAN release: 2023-09-20
CRAN release: 2022-11-14
rbmi
depends on
CRAN release: 2022-10-25
|>
in testing code so package is backwards compatible with older servers
glmmTMB
dependency with the mmrm
package. This has resulted in the package being more stable (fewer model fitting convergence issues) as well as speeding up run times 3-fold.
CRAN release: 2022-05-18
delta_template()
draws()
simulate_data()
CRAN release: 2022-03-02
+add()
-Adds content to the end of the stack (must be a list)
-pop()
-Retrieve content from the stack
-clone()
-The objects of this class are cloneable with this method.
Named list containing the simulation parameters of the multivariate normal distribution assumed for the given treatment group. It contains the following elements:
mu
: Numeric vector indicating the mean outcome trajectory. It should include the outcome
at baseline.
sigma
Covariance matrix of the outcome trajectory.
Numeric variable that specifies the longitudinal outcome.
Factor variable that specifies the id of each subject.
A binary variable that takes value 1
if the corresponding outcome is affected
by the ICE and 0
otherwise.
Function implementing trajectories after the intercurrent event (ICE). Must
be one of getStrategies()
. See getStrategies()
for details.
Optional. Named list containing the simulation parameters of the reference arm. It contains the following elements:
mu
: Numeric vector indicating the mean outcome trajectory assuming no ICEs. It should
include the outcome at baseline.
sigma
Covariance matrix of the outcome trajectory assuming no ICEs.
Named list containing the simulation parameters of the multivariate normal distribution assumed for the given treatment group. It contains the following elements:
mu
: Numeric vector indicating the mean outcome trajectory. It should include the
outcome at baseline.
sigma
Covariance matrix of the outcome trajectory.
Numeric variable that specifies the longitudinal outcome.
Function implementing trajectories after the intercurrent event (ICE).
Must be one of getStrategies()
. See getStrategies()
for details.
Optional. Named list containing the simulation parameters of the reference arm. It contains the following elements:
mu
: Numeric vector indicating the mean outcome trajectory assuming no ICEs. It should
include the outcome at baseline.
sigma
Covariance matrix of the outcome trajectory assuming no ICEs.
outcome
should be specified such that all-and-only the post-ICE observations
(i.e. the
observations to be adjusted) are set to NA
.
An imputations
object as created by impute()
.
An analysis function to be applied to each imputed dataset. See details.
A data.frame
containing the delta transformation to be applied to the imputed
datasets prior to running fun
. See details.
Additional arguments passed onto fun
.
The number of parallel processes to use when running this function. Can also be a
cluster object created by make_rbmi_cluster()
. See the parallelisation section below.
Should imputations
be checked to ensure it conforms to the required format
(default = TRUE
) ? Can gain a small performance increase if this is set to FALSE
when
analysing a large number of samples.
This function works by performing the following steps:
Extract a dataset from the imputations
object.
Apply any delta adjustments as specified by the delta
argument.
Run the analysis function fun
on the dataset.
Repeat steps 1-3 across all of the datasets inside the imputations
object.
Collect and return all of the analysis results.
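The steps above can be sketched in plain base R. This is only an illustration of the extract / adjust / analyse / collect loop under stated assumptions: `imputed_dfs`, `offset` and `fun` are hypothetical stand-ins, and the constant offset is a simplification of the `delta` data.frame described elsewhere in this documentation.

```r
# Hypothetical stand-in for the datasets held inside an imputations object
imputed_dfs <- list(
    data.frame(subjid = c("1", "2", "3", "4"), outcome = c(1, 2, 3, 4)),
    data.frame(subjid = c("1", "2", "3", "4"), outcome = c(2, 3, 4, 5))
)

# Hypothetical analysis function: returns a named list whose elements are
# themselves lists containing est, se and df
fun <- function(dat, ...) {
    list(
        mean_outcome = list(
            est = mean(dat$outcome),
            se = sd(dat$outcome) / sqrt(nrow(dat)),
            df = nrow(dat) - 1
        )
    )
}

# Simplifying assumption: a single constant delta applied to every row
offset <- 0.5

# Steps 1-4: take each dataset, apply the adjustment, run the analysis
results <- lapply(imputed_dfs, function(dat) {
    dat$outcome <- dat$outcome + offset
    fun(dat)
})

# Step 5: results is a list with one set of analysis results per dataset
length(results)
```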
The analysis function fun
must take a data.frame
as its first
argument. All other options to analyse()
are passed onto fun
via ...
.
fun
must return a named list with each element itself being a
@@ -157,9 +111,7 @@
est
(or additionally se
and df
if
you had originally specified method_bayes()
or method_approxbayes()
)
i.e.:
myfun <- function(dat, ...) {
mod_1 <- lm(data = dat, outcome ~ group)
mod_2 <- lm(data = dat, outcome ~ group + covar)
x <- list(
@@ -175,9 +127,7 @@ Details
)
)
return(x)
}
Please note that the vars$subjid
column (as defined in the original call to
draws()
) will be scrambled in the data.frames that are provided to fun
.
This is to say they will not contain the original subject values and as such
@@ -198,11 +148,7 @@
draws()
) and delta
. Essentially this data.frame
is merged onto the imputed dataset by vars$subjid
and vars$visit
and then
the outcome variable is modified by:
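As a rough base R sketch of this merge-and-offset step: the data, the column names `subjid` / `visit` / `outcome`, and the choice to treat rows without a matching delta as a zero offset are all assumptions made for illustration, not a description of the package's internal code.

```r
# Hypothetical imputed dataset
imputed <- data.frame(
    subjid = c("A", "A", "B", "B"),
    visit = c("v1", "v2", "v1", "v2"),
    outcome = c(1.0, 2.0, 3.0, 4.0)
)

# Hypothetical delta data.frame: only the second visit is adjusted
delta <- data.frame(
    subjid = c("A", "B"),
    visit = c("v2", "v2"),
    delta = c(0.5, -1.0)
)

# Merge delta onto the imputed data by subject and visit, keeping all rows
merged <- merge(imputed, delta, by = c("subjid", "visit"), all.x = TRUE)

# Assumption of this sketch: rows with no matching delta get an offset of 0
merged$delta[is.na(merged$delta)] <- 0

# The outcome is then shifted by the delta value
merged$outcome <- merged$outcome + merged$delta
merged <- merged[order(merged$subjid, merged$visit), ]
```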
Please note that in order to provide maximum flexibility, the delta
argument
can be used to modify any/all outcome values including those that were not
imputed. Care must be taken when defining offsets. It is recommended that you
@@ -211,8 +157,7 @@
To speed up the evaluation of analyse()
you can use the ncores
argument to enable parallelisation.
@@ -220,9 +165,7 @@
my_custom_fun <- function(...) <some analysis code>
cl <- make_rbmi_cluster(
4,
objects = list("my_custom_fun" = my_custom_fun),
@@ -233,9 +176,7 @@ Parallelisation fun = my_custom_fun,
ncores = cl
)
parallel::stopCluster(cl)
Note that there is significant overhead both with setting up the sub-processes and with
transferring data back-and-forth between the main process and the sub-processes. As such
parallelisation of the analyse()
function tends to only be worth it when you have
@@ -251,9 +192,7 @@
Finally, if you are doing a tipping point analysis you can get a reasonable performance
improvement by re-using the cluster between each call to analyse()
e.g.
extract_imputed_dfs()
for manually extracting imputed
datasets.
delta_template()
for creating delta data.frames.
ancova()
for the default analysis function.
A data.frame
containing the data to be used in the model.
A vars
object as generated by set_vars()
. Only the group
,
visit
, outcome
and covariates
elements are required. See details.
An optional character vector specifying which visits to
fit the ancova model at. If NULL
, a separate ancova model will be fit to the
outcomes for each visit (as determined by unique(data[[vars$visit]])
).
See details.
Character, either "counterfactual"
(default), "equal"
,
"proportional_em"
or "proportional"
.
Specifies the weighting strategy to be used when calculating the lsmeans.
See the weighting section for more details.
The function works as follows:
Select the first value from visits
.
Subset the data to only the observations that occurred on this visit.
Fit a linear model as vars$outcome ~ vars$group + vars$covariates
.
Extract the "treatment effect" & least square means for each treatment group.
Repeat points 2-4 for all other values in visits
.
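The per-visit loop described above can be sketched with `lm()` directly. This is a simplified illustration under stated assumptions: the dataset and its column names are hypothetical, only the treatment effect (not the least square means) is extracted, and the result names simply follow the `trt_<visit>` pattern.

```r
# Hypothetical long-format data: 2 visits, 2 groups, 1 continuous covariate
dat <- data.frame(
    visit = rep(c("visit_1", "visit_2"), each = 8),
    group = factor(rep(c("ref", "alt"), times = 8), levels = c("ref", "alt")),
    age = rep(c(30, 35, 40, 45, 50, 55, 60, 65), times = 2)
)
dat$outcome <- 1 + 2 * (dat$group == "alt") + 0.1 * dat$age +
    rep(c(0.1, -0.2, 0.3, -0.1, 0.2, -0.3, 0.05, -0.05), times = 2)

results <- list()
for (v in unique(dat$visit)) {
    # Subset to the observations that occurred at this visit
    sub <- dat[dat$visit == v, ]
    # Fit the linear model outcome ~ group + covariates
    mod <- lm(outcome ~ group + age, data = sub)
    # Extract the treatment effect (second factor level vs first)
    trt <- coef(summary(mod))["groupalt", ]
    results[[paste0("trt_", v)]] <- list(
        est = trt[["Estimate"]],
        se = trt[["Std. Error"]],
        df = df.residual(mod)
    )
}
names(results)
```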
If no value for visits
is provided then it will be set to
unique(data[[vars$visit]])
.
In order to meet the formatting standards set by analyse()
the results will be collapsed
into a single list suffixed by the visit name, e.g.:
list(
trt_visit_1 = list(est = ...),
lsm_ref_visit_1 = list(est = ...),
lsm_alt_visit_1 = list(est = ...),
@@ -153,9 +107,7 @@ Details
lsm_ref_visit_2 = list(est = ...),
lsm_alt_visit_2 = list(est = ...),
...
)
Please note that "ref" refers to the first factor level of vars$group
which does not necessarily
coincide with the control arm. Analogously, "alt" refers to the second factor level of vars$group
.
"trt" refers to the model contrast translating the mean difference between the second level and first level.
set_vars(covariates = c("sex*age"))
.
For weights = "counterfactual"
(the default) the lsmeans are obtained by
@@ -178,11 +128,7 @@
Note that to ensure backwards compatibility with previous versions of rbmi
weights = "proportional"
is an alias for weights = "counterfactual"
.
To get results consistent with emmeans
's weights = "proportional"
@@ -190,59 +136,42 @@
For weights = "equal"
the lsmeans are obtained by taking the model fitted
value of a hypothetical patient whose covariates are defined as follows:
Continuous covariates are set to mean(X)
Dummy categorical variables are set to 1/N
where N
is the number of levels
Continuous * continuous interactions are set to mean(X) * mean(Y)
Continuous * categorical interactions are set to mean(X) * 1/N
Dummy categorical * categorical interactions are set to 1/N * 1/M
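To make the "hypothetical patient" idea concrete, the sketch below builds such a patient by hand for a model with one continuous and one two-level categorical covariate and no interactions. The data and variable names are hypothetical, and this is an illustration of the rules listed above rather than the package's own implementation.

```r
# Hypothetical data: 2 groups, one continuous and one categorical covariate
dat <- data.frame(
    group = factor(rep(c("A", "B"), each = 8)),
    age = c(30, 42, 55, 61, 38, 47, 52, 66, 33, 45, 58, 63, 36, 49, 54, 68),
    sex = factor(rep(c("F", "M", "M", "F"), times = 4))
)
dat$outcome <- 2 + 1.5 * (dat$group == "B") + 0.05 * dat$age +
    0.8 * (dat$sex == "M") + rep(c(0.2, -0.1, 0.3, -0.4), times = 4)

mod <- lm(outcome ~ group + age + sex, data = dat)
beta <- coef(mod)  # order: (Intercept), groupB, age, sexM

# Hypothetical patient: continuous covariate at its mean, the categorical
# dummy at 1/N where N is the number of levels of the variable
n_sex <- nlevels(dat$sex)
x_a <- c(1, 0, mean(dat$age), 1 / n_sex)  # patient in group A
x_b <- c(1, 1, mean(dat$age), 1 / n_sex)  # same patient in group B

lsm_a <- sum(beta * x_a)
lsm_b <- sum(beta * x_b)

# With no group interactions the difference equals the group coefficient
lsm_b - lsm_a
```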
In comparison to emmeans
this approach is equivalent to:
For weights = "proportional_em"
the lsmeans are obtained as per weights = "equal"
except instead of weighting each observation equally they are weighted by the proportion
in which the given combination of categorical values occurred in the data.
In comparison to emmeans
this approach is equivalent to:
Note that this is not to be confused with weights = "proportional"
which is an alias
for weights = "counterfactual"
.
A data.frame
containing the data to be used in the model.
Character, the name of the outcome variable in data
.
Character, the name of the group variable in data
.
Character vector containing the name of any additional covariates to be included in the model as well as any interaction terms.
Character, either "counterfactual"
(default), "equal"
,
"proportional_em"
or "proportional"
.
Specifies the weighting strategy to be used when calculating the lsmeans.
See the weighting section for more details.
group
must be a factor variable with only 2 levels.
outcome
must be a continuous numeric variable.
For weights = "counterfactual"
(the default) the lsmeans are obtained by
@@ -149,11 +103,7 @@
Note that to ensure backwards compatibility with previous versions of rbmi
weights = "proportional"
is an alias for weights = "counterfactual"
.
To get results consistent with emmeans
's weights = "proportional"
@@ -161,55 +111,40 @@
For weights = "equal"
the lsmeans are obtained by taking the model fitted
value of a hypothetical patient whose covariates are defined as follows:
Continuous covariates are set to mean(X)
Dummy categorical variables are set to 1/N
where N
is the number of levels
Continuous * continuous interactions are set to mean(X) * mean(Y)
Continuous * categorical interactions are set to mean(X) * 1/N
Dummy categorical * categorical interactions are set to 1/N * 1/M
In comparison to emmeans
this approach is equivalent to:
For weights = "proportional_em"
the lsmeans are obtained as per weights = "equal"
except instead of weighting each observation equally they are weighted by the proportion
in which the given combination of categorical values occurred in the data.
In comparison to emmeans
this approach is equivalent to:
Note that this is not to be confused with weights = "proportional"
which is an alias
for weights = "counterfactual"
.
A data.frame
with 608 rows and 11 variables:
PATIENT
: patient IDs.
HAMATOTL
: total score Hamilton Anxiety Rating Scale.
PGIIMP
: patient's Global Impression of Improvement Rating Scale.
RELDAYS
: number of days between visit and baseline.
BASVAL
: baseline outcome value.
HAMDTL17
: Hamilton 17-item rating scale value.
CHANGE
: change from baseline in the Hamilton 17-item rating scale.
The relevant endpoint is the Hamilton 17-item rating scale for depression (HAMD17) for which baseline and weeks 1, 2, 4, and 6 assessments are included. Study drug discontinuation occurred in 24% of subjects from the active drug and 26% from @@ -140,16 +102,14 @@
Goldstein, Lu, Detke, Wiltse, Mallinckrodt, Demitrack. Duloxetine in the treatment of depression: a double-blind placebo-controlled comparison with paroxetine. J Clin Psychopharmacol 2004;24: 389-399.
data.frame
which will have its outcome
column adjusted.
data.frame
(must contain a column called delta
).
character vector of variables in both data
and delta
that will be used
to merge the 2 data.frames together by.
character, name of the outcome variable in data
.
A list of lists containing the analysis results for each imputation
See analyse()
for details on what this object should look like.
The method object as specified in draws()
.
The delta dataset used. See analyse()
for details on how this
should be specified.
The analysis function that was used.
The character name of the analysis function (used for printing purposes).
A method
object as generated by either method_bayes()
,
method_approxbayes()
, method_condmean()
or method_bmlmi()
.
A list of sample_single
objects. See sample_single()
.
R6 longdata
object containing all relevant input data information.
Fixed effects formula object used for the model specification.
Absolute number of failures of the model fit.
If method_bayes()
is chosen, returns the MCMC Stan fit object. Otherwise NULL
.
A draws
object which is a named list containing the following:
data
: R6 longdata
object containing all relevant input data information.
method
: A method
object as generated by either method_bayes()
,
method_approxbayes()
or method_condmean()
.
samples
: list containing the estimated parameters of interest.
Each element of samples
is a named list containing the following:
ids
: vector of characters containing the ids of the subjects included in the original dataset.
beta
: numeric vector of estimated regression coefficients.
sigma
: list of estimated covariance matrices (one for each level of vars$group
).
theta
: numeric vector of transformed covariances.
failed
: Logical. TRUE
if the model fit failed.
ids_samp
: vector of characters containing the ids of the subjects included in the given sample.
fit
: if method_bayes()
is chosen, returns the MCMC Stan fit object. Otherwise NULL
.
n_failures
: absolute number of failures of the model fit.
Relevant only for method_condmean(type = "bootstrap")
, method_approxbayes()
and method_bmlmi()
.
formula
: fixed effects formula object used for the model specification.
A list of imputations_list
's as created by imputation_df()
A longdata
object as created by longDataConstructor()
A method
object as created by method_condmean()
, method_bayes()
or
method_approxbayes()
A named vector. Identifies the references to be used when generating the
imputed values. Should be of the form c("Group" = "Reference", "Group" = "Reference")
.
Converts a design matrix + key variables into a common format.
In particular this function does the following:
Renames all covariates as V1
, V2
, etc to avoid issues of special characters in variable names
Ensures all key variables are of the right type
Inserts the outcome, visit and subjid variables into the data.frame
naming them as outcome
, visit
and subjid
If provided will also insert the group variable into the data.frame
named as group
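A minimal base R sketch of the same idea follows. It is not the package's implementation: the input data.frame `raw` and its column names are hypothetical, and the type conversions shown are assumptions about what "the right type" means for each key variable.

```r
# Hypothetical raw data with awkward covariate names after dummy expansion
raw <- data.frame(
    id = c("p1", "p1", "p2", "p2"),
    vis = c("wk1", "wk2", "wk1", "wk2"),
    y = c(1.2, NA, 0.7, 1.9),
    age = c(40, 40, 55, 55),
    sex = factor(c("F", "F", "M", "M"))
)

# Expand covariates into a design matrix (intercept, age, sexM)
design <- stats::model.matrix(~ age + sex, data = raw)

# Rename all design columns as V1, V2, ... to avoid special characters
colnames(design) <- paste0("V", seq_len(ncol(design)))

# Insert the key variables under fixed names with explicit types
common <- as.data.frame(design)
common$outcome <- as.numeric(raw$y)
common$visit <- factor(raw$vis)
common$subjid <- factor(raw$id)

names(common)
```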
a data.frame
or matrix
containing the covariates to use in the MMRM model.
Dummy variables must already be expanded out, i.e. via stats::model.matrix()
. Cannot contain
any missing values
a numeric vector. The outcome value to be regressed on in the MMRM model.
a character / factor vector. Indicates which visit the outcome value occurred on.
a character / factor vector. The subject identifier used to link separate visits that belong to the same subject.
a character / factor vector. Indicates which treatment group the patient belongs to.
an mmrm data.frame
as created by as_mmrm_df()
Character - The covariance structure to be used, must be one of "us"
(default),
"ad"
, "adh"
, "ar1"
, "ar1h"
, "cs"
, "csh"
, "toep"
, or "toeph"
)
The outcome column may contain NA's but none of the other variables listed in the formula should contain missing values
Collapse multiple categorical variables into distinct unique categories. e.g.
- - +would return
- - +check_ESS()
works as follows:
Extract the ESS from stan_fit
for each parameter of the model.
check_ESS()
works as follows:
Extract the ESS from stan_fit
for each parameter of the model.
Compute the relative ESS (i.e. the ESS divided by the number of draws).
Check whether for any of the parameters the relative ESS is lower than threshold
.
If for at least one parameter the relative ESS is below the threshold,
a warning is thrown.
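The steps above can be sketched in base R as follows. The ESS values, the number of draws and the threshold are made-up numbers, and `flag_low_ess()` is a hypothetical helper rather than part of the package.

```r
# Hypothetical helper: returns the names of parameters whose relative ESS
# falls below the threshold and warns if there are any.
flag_low_ess <- function(ess, n_draws, threshold) {
  relative_ess <- ess / n_draws
  low <- names(relative_ess)[relative_ess < threshold]
  if (length(low) > 0) {
    warning(
      "Relative ESS below threshold for: ",
      paste(low, collapse = ", ")
    )
  }
  low
}

# Made-up effective sample sizes per parameter (not from a real Stan fit).
ess <- c(beta_1 = 480, beta_2 = 510, sigma_1 = 150)
low <- flag_low_ess(ess, n_draws = 500, threshold = 0.4)
```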
Check that:
-There are no divergent iterations.
Check that:
There are no divergent iterations.
The Bayesian Fraction of Missing Information (BFMI) is sufficiently high.
The number of iterations that saturated the max treedepth is zero.
Please see rstan::check_hmc_diagnostics()
for details.
Please see rstan::check_hmc_diagnostics()
for details.
Performs checks of the quality of the MCMC. See check_ESS()
and check_hmc_diagn()
for details.
the covariance matrix with dimensions equal to index_mar
for
the subject's original group
the covariance matrix with dimensions equal to index_mar
for
the subject's reference group
A logical vector indicating which visits meet the MAR assumption for the subject. I.e. this identifies the observations that occur after a non-MAR intercurrent event (ICE).
Carpenter, James R., James H. Roger, and Michael G. Kenward. "Analysis of longitudinal trials with protocol deviation: a framework for relevant, accessible assumptions, and inference via multiple imputation." Journal of Biopharmaceutical statistics 23.6 (2013): @@ -128,8 +88,7 @@
a list of imputation_list_single()
objects
A list with 1 element per required imputation_df. Each element +
A list with 1 element per required imputation_df. Each element
must contain a vector of "ID"'s which correspond to the imputation_single()
ID's
that are required for that dataset. The total number of ID's must be equal to the
total number of rows within all of imputes$imputations
imputes = list(
+imputes = list(
imputation_list_single(
id = "Tom",
imputations = matrix(
@@ -135,13 +95,9 @@ Argumentssample_ids <- list(
c("Tom", "Harry", "Tom"),
c("Tom")
-)
-
-
+)
Then convert_to_imputation_df(imputes, sample_ids)
would result in:
imputation_list_df(
+imputation_list_df(
imputation_df(
imputation_single_t_1_1,
imputation_single_h_1_1,
@@ -158,19 +114,14 @@ Arguments imputation_df(
imputation_single_t_3_2
)
-)
-
-
+)
Note that the different repetitions (i.e. the value set for D) are grouped together -sequentially.
- +sequentially. - - + - + - + + + - - diff --git a/main/reference/d_lagscale.html b/main/reference/d_lagscale.html index e8fbba4b..4413b810 100644 --- a/main/reference/d_lagscale.html +++ b/main/reference/d_lagscale.html @@ -1,22 +1,7 @@ - - - - - - -a numeric vector. Determines the baseline amount of delta to be applied to each visit.
a numeric vector. Determines the scaling to be applied
to delta
based upon which visit the ICE occurred on. Must be the same
length as delta.
logical vector. Indicates whether a visit is "post-ICE" or not.
an imputation
object as created by impute()
.
NULL
or a numeric vector. Determines the baseline amount of delta
to be applied to each visit. See details. If a numeric vector it must have
the same length as the number of unique visits in the original dataset.
NULL
or a numeric vector. Determines the scaling to be applied
to delta
based upon which visit the ICE occurred on. See details. If a
numeric vector it must have the same length as the number of unique visits in
the original dataset.
Logical, if TRUE
then non-missing post-ICE data will have a delta value
of 0 assigned. Note that the calculation (as described in the details section) is performed
first and then overwritten with 0's at the end (i.e. the delta values for missing
post-ICE visits will stay the same regardless of this option).
To apply a delta adjustment the analyse()
function expects
a delta data.frame
with 3 variables: vars$subjid
, vars$visit
and delta
(where vars
is the object supplied in the original call to draws()
@@ -139,49 +98,37 @@
Let delta = c(5,6,7,8)
and dlag=c(1,2,3,4)
(i.e. assuming there are 4 visits)
and let's say that the subject had an ICE on visit 2. The calculation would then be
as follows:
v1 v2 v3 v4
+v1 v2 v3 v4
--------------
5 6 7 8 # delta assigned to each visit
0 1 2 3 # lagged scaling starting from the first visit after the subjects ICE
--------------
0 6 14 24 # delta * lagged scaling
--------------
- 0 6 20 44 # accumulative sum of delta to be applied to each visit
-
-
+ 0 6 20 44 # accumulative sum of delta to be applied to each visit
That is to say the subject would have a delta offset of 0 applied for visit-1, 6 for visit-2, 20 for visit-3 and 44 for visit-4. As a comparison, let's say that the subject instead had their ICE on visit 3; the calculation would then be as follows:
- -v1 v2 v3 v4
+v1 v2 v3 v4
--------------
5 6 7 8 # delta assigned to each visit
0 0 1 2 # lagged scaling starting from the first visit after the subjects ICE
--------------
0 0 7 16 # delta * lagged scaling
--------------
- 0 0 7 23 # accumulative sum of delta to be applied to each visit
-
-
+ 0 0 7 23 # accumulative sum of delta to be applied to each visit
In terms of practical usage, let's say that you wanted a delta of 5 to be used for all post-ICE visits
regardless of their proximity to the ICE visit. This can be achieved by setting
delta = c(5,5,5,5)
and dlag = c(1,0,0,0)
. For example, let's say a subject had their
ICE on visit-1, then the calculation would be as follows:
v1 v2 v3 v4
+v1 v2 v3 v4
--------------
5 5 5 5 # delta assigned to each visit
1 0 0 0 # lagged scaling starting from the first visit after the subjects ICE
--------------
5 0 0 0 # delta * lagged scaling
--------------
- 5 5 5 5 # accumulative sum of delta to be applied to each visit
-
-
+ 5 5 5 5 # accumulative sum of delta to be applied to each visit
Another way of using these arguments
is to set delta
to be the difference in time between visits and dlag
to be the
amount of delta per unit of time. For example, let's say that we have a visit on weeks
@@ -189,49 +136,39 @@
delta = c(0,4,1,3)
(the difference in weeks between each visit)
and dlag = c(3, 3, 3, 3)
. For example, let's say we have a subject who had their ICE on week-5
(i.e. visit-2) then the calculation would be:
-
-v1 v2 v3 v4
+v1 v2 v3 v4
--------------
0 4 1 3 # delta assigned to each visit
0 0 3 3 # lagged scaling starting from the first visit after the subjects ICE
--------------
0 0 3 9 # delta * lagged scaling
--------------
- 0 0 3 12 # accumulative sum of delta to be applied to each visit
-
-
+ 0 0 3 12 # accumulative sum of delta to be applied to each visit
i.e. on week-6 (1 week after the ICE) they have a delta of 3 and on week-9 (4 weeks after the ICE) they have a delta of 12.
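The arithmetic in the worked examples above can be reproduced with a few lines of base R. This is a sketch only: `delta_offset()` and its `first_scaled` argument (the index of the first visit that receives a non-zero lagged scaling) are hypothetical names, not part of the package.

```r
# Hypothetical helper reproducing the tables above.
# first_scaled: index of the first visit that receives lagged scaling.
delta_offset <- function(delta, dlag, first_scaled) {
  stopifnot(length(delta) == length(dlag))
  n <- length(delta)
  n_pre <- first_scaled - 1
  # Visits before first_scaled get a scaling of 0; dlag is then applied
  # in order, starting from its first element.
  scaling <- c(rep(0, n_pre), dlag[seq_len(n - n_pre)])
  # Cumulative sum of delta * lagged scaling gives the offset per visit.
  cumsum(delta * scaling)
}

ex1 <- delta_offset(c(5, 6, 7, 8), c(1, 2, 3, 4), first_scaled = 2)
ex2 <- delta_offset(c(5, 6, 7, 8), c(1, 2, 3, 4), first_scaled = 3)
ex3 <- delta_offset(c(5, 5, 5, 5), c(1, 0, 0, 0), first_scaled = 1)
ex4 <- delta_offset(c(0, 4, 1, 3), c(3, 3, 3, 3), first_scaled = 3)
```

Note that in the last example the scaling in the table starts at visit 3, the first visit after the ICE on week-5, which is why `first_scaled = 3` is used there.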
Please note that this function also returns several utility variables so that
the user can create their own custom logic for defining what delta
-should be set to. These additional variables include:
is_mar
- If the observation was missing would it be regarded as MAR? This variable
+should be set to. These additional variables include:
is_mar
- If the observation was missing would it be regarded as MAR? This variable
is set to FALSE
for observations that occurred after a non-MAR ICE, otherwise it is set to TRUE
.
is_missing
- Is the outcome variable for this observation missing.
is_post_ice
- Does the observation occur after the patient's ICE as defined by the
data_ice
dataset supplied to draws()
.
strategy
- What imputation strategy was assigned to this subject.
The design and implementation of this function is largely based upon the same functionality +
The design and implementation of this function is largely based upon the same functionality as implemented in the so-called "five macros" by James Roger. See Roger (2021).
Roger, James. Reference-based mi via multivariate normal rm (the “five macros” and miwithd), 2021. URL https://www.lshtm.ac.uk/research/centres-projects-groups/missing-data#dia-missing-data.
draws(data, data_ice = NULL, vars, method, ncores = 1, quiet = FALSE)
# S3 method for class 'approxbayes'
@@ -141,84 +108,64 @@ Usage
A data.frame
containing the data to be used in the model. See details.
A data.frame
that specifies the information related
to the ICEs and the imputation strategies. See details.
A vars
object as generated by set_vars()
. See details.
A method
object as generated by either method_bayes()
,
method_approxbayes()
, method_condmean()
or method_bmlmi()
.
It specifies the multiple imputation methodology to be used. See details.
A single numeric specifying the number of cores to use in creating the draws object.
Note that this parameter is ignored for method_bayes()
(Default = 1). Can also be a cluster object
generated by make_rbmi_cluster()
Logical, if TRUE
will suppress printing of progress information that is printed to
the console.
A draws
object which is a named list containing the following:
data
: R6 longdata
object containing all relevant input data information.
A draws
object which is a named list containing the following:
data
: R6 longdata
object containing all relevant input data information.
method
: A method
object as generated by either method_bayes()
,
method_approxbayes()
or method_condmean()
.
samples
: list containing the estimated parameters of interest.
-Each element of samples
is a named list containing the following:
ids
: vector of characters containing the ids of the subjects included in the original dataset.
samples
: list containing the estimated parameters of interest.
+Each element of samples
is a named list containing the following:
ids
: vector of characters containing the ids of the subjects included in the original dataset.
beta
: numeric vector of estimated regression coefficients.
sigma
: list of estimated covariance matrices (one for each level of vars$group
).
theta
: numeric vector of transformed covariances.
failed
: Logical. TRUE
if the model fit failed.
ids_samp
: vector of characters containing the ids of the subjects included in the given sample.
fit
: if method_bayes()
is chosen, returns the MCMC Stan fit object. Otherwise NULL
.
n_failures
: absolute number of failures of the model fit.
Relevant only for method_condmean(type = "bootstrap")
, method_approxbayes()
and method_bmlmi()
.
formula
: fixed effects formula object used for the model specification.
draws
performs the first step of the multiple imputation (MI) procedure: fitting the
base imputation model. The goal is to estimate the parameters of interest needed
for the imputation phase (i.e. the regression coefficients and the covariance matrices
from a MMRM model).
The function distinguishes between the following methods:
-Bayesian MI based on MCMC sampling: draws
returns the draws
+
The function distinguishes between the following methods:
Bayesian MI based on MCMC sampling: draws
returns the draws
from the posterior distribution of the parameters using a Bayesian approach based on
MCMC sampling. This method can be specified by using method = method_bayes()
.
Approximate Bayesian MI based on bootstrapping: draws
returns
@@ -237,20 +184,16 @@
Bootstrapped Maximum Likelihood MI: draws
returns the MMRM parameter estimates from
a given number of bootstrap samples needed to perform random imputations of the bootstrapped samples.
This method can be specified by using method = method_bmlmi()
.
Bayesian MI based on MCMC sampling has been proposed in Carpenter, Roger, and Kenward (2013) who first introduced +
Bayesian MI based on MCMC sampling has been proposed in Carpenter, Roger, and Kenward (2013) who first introduced reference-based imputation methods. Approximate Bayesian MI is discussed in Little and Rubin (2002). Conditional mean imputation methods are discussed in Wolbers et al (2022). Bootstrapped Maximum Likelihood MI is described in Von Hippel & Bartlett (2021).
-The argument data
contains the longitudinal data. It must have at least the following variables:
subjid
: a factor vector containing the subject ids.
The argument data
contains the longitudinal data. It must have at least the following variables:
subjid
: a factor vector containing the subject ids.
visit
: a factor vector containing the visit the outcome was observed on.
group
: a factor vector containing the group that the subject belongs to.
outcome
: a numeric vector containing the outcome variable. It might contain missing values.
Additional baseline or time-varying covariates must be included in data
.
data
must have one row per visit per subject. This means that incomplete
+
data
must have one row per visit per subject. This means that incomplete
outcome data must be set as NA
instead of having the related row missing. Missing values
in the covariates are not allowed.
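The "one row per visit per subject" requirement can be illustrated with base R. The sketch below uses hypothetical toy data and is not how the package itself does it; within rbmi, `expand_locf()` is the utility intended for this purpose.

```r
# Hypothetical toy data: subject "B" has no row for visit "v2".
obs <- data.frame(
  subjid = c("A", "A", "B"),
  visit = c("v1", "v2", "v1"),
  outcome = c(1.2, 1.5, 0.9),
  stringsAsFactors = FALSE
)

# Build every subject-by-visit combination ...
full_grid <- expand.grid(
  subjid = unique(obs$subjid),
  visit = unique(obs$visit),
  stringsAsFactors = FALSE
)

# ... and left-join the observed data so absent rows appear with NA outcome.
complete <- merge(full_grid, obs, by = c("subjid", "visit"), all.x = TRUE)
complete <- complete[order(complete$subjid, complete$visit), ]
```

In real use `subjid` and `visit` would additionally need to be converted to factors, as described above.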
If data
is incomplete
@@ -268,21 +211,16 @@
draws()
.
The argument data_ice
contains information about the occurrence of ICEs. It is a
-data.frame
with 3 columns:
Subject ID: a character vector containing the ids of the subjects that experienced
+data.frame
with 3 columns:
Subject ID: a character vector containing the ids of the subjects that experienced
the ICE. This column must be named as specified in vars$subjid
.
Visit: a character vector containing the first visit after the occurrence of the ICE
(i.e. the first visit affected by the ICE).
The visits must be equal to one of the levels of data[[vars$visit]]
.
If multiple ICEs happen for the same subject, then only the first non-MAR visit should be used.
This column must be named as specified in vars$visit
.
Strategy: a character vector specifying the imputation strategy to address the ICE for this subject. +
Strategy: a character vector specifying the imputation strategy to address the ICE for this subject.
This column must be named as specified in vars$strategy
.
-Possible imputation strategies are:
"MAR"
: Missing At Random.
"MAR"
: Missing At Random.
"CIR"
: Copy Increments in Reference.
"CR"
: Copy Reference.
"JR"
: Jump to Reference.
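As an illustration, a `data_ice` data.frame with the three columns described above might look as follows. The subject ids, visit labels and column names are hypothetical; the column names would have to match whatever is passed to `set_vars()`.

```r
# Hypothetical ICE data: one row per subject who experienced an ICE.
data_ice <- data.frame(
  subjid = c("pt_01", "pt_07"),
  visit = c("visit_3", "visit_2"),   # first visit affected by the ICE
  strategy = c("JR", "MAR"),         # imputation strategy for that subject
  stringsAsFactors = FALSE
)

# Only one ICE row per subject is expected (the first non-MAR visit).
one_row_per_subject <- !anyDuplicated(data_ice$subjid)

# Strategies listed above; used here only as a sanity check.
listed_strategies <- c("MAR", "CIR", "CR", "JR")
all_listed <- all(data_ice$strategy %in% listed_strategies)
```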
The data_ice
argument is necessary at this stage since (as explained in Wolbers et al (2022)), the model is fitted
+
+
The data_ice
argument is necessary at this stage since (as explained in Wolbers et al (2022)), the model is fitted
after removing the observations which are incompatible with the imputation model, i.e.
any observed data on or after data_ice[[vars$visit]]
that are addressed with an imputation
strategy different from MAR are excluded for the model fit. However such observations
@@ -307,9 +243,7 @@
impute()
; this means that subjects who didn't have a record in data_ice
will always have their
missing data imputed under the MAR assumption even if their strategy is updated.
The vars
argument is a named list that specifies the names of key variables within
-data
and data_ice
. This list is created by set_vars()
and contains the following named elements:
subjid
: name of the column in data
and data_ice
which contains the subject ids variable.
data
and data_ice
. This list is created by set_vars()
and contains the following named elements:subjid
: name of the column in data
and data_ice
which contains the subject ids variable.
visit
: name of the column in data
and data_ice
which contains the visit variable.
group
: name of the column in data
which contains the group variable.
outcome
: name of the column in data
which contains the outcome variable.
vars$group
is set as stratification variable.
Needed only for method_condmean(type = "bootstrap")
and method_approxbayes()
.
strategy
: name of the column in data_ice
which contains the subject-specific imputation strategy.
In our experience, Bayesian MI (method = method_bayes()
) with a relatively low number of
+
In our experience, Bayesian MI (method = method_bayes()
) with a relatively low number of
samples (e.g. n_samples
below 100) frequently triggers STAN warnings about R-hat such as
"The largest R-hat is X.XX, indicating chains have not mixed". In many instances, this warning
might be spurious, i.e. standard diagnostics analysis of the MCMC samples do not indicate any
@@ -331,8 +264,7 @@
James R Carpenter, James H Roger, and Michael G Kenward. Analysis of longitudinal trials with protocol deviation: a framework for relevant, accessible assumptions, and inference via multiple imputation. Journal of Biopharmaceutical Statistics, 23(6):1352–1371, 2013.
@@ -349,20 +281,16 @@method_bayes()
, method_approxbayes()
, method_condmean()
, method_bmlmi()
for setting method
.
method_bayes()
, method_approxbayes()
, method_condmean()
, method_bmlmi()
for setting method
.
set_vars()
for setting vars
.
expand_locf()
for expanding data
in case of missing rows.
For more details see the quickstart vignette:
-vignette("quickstart", package = "rbmi")
.
vignette("quickstart", package = "rbmi")
.An expression to be evaluated. Should be a call to mmrm::mmrm()
.
This function was originally developed for use with glmmTMB which needed more hand-holding and dropping of false-positive warnings. It is not as important now but is kept around in case we need to catch false-positive warnings again in the future.
dataset to expand or fill in.
variables and the levels that should be expanded out (note that duplicate entries of levels will result in multiple rows for that level).
character vector containing the names of variables that need to be filled in.
character vector containing the names of variables to group
by when performing LOCF imputation of var
.
character vector containing the names of additional variables to sort the data.frame
by before performing LOCF.
The draws()
function makes the assumption that all subjects and visits are present
in the data.frame
and that all covariate values are non missing; expand()
,
fill_locf()
and expand_locf()
are utility functions to support users in ensuring
@@ -144,10 +102,8 @@
c(group, order)
before performing the LOCF imputation; the data.frame
will be returned in the original sort order however.
expand_locf()
is a simple composition function of fill_locf()
and expand()
i.e.
-fill_locf(expand(...))
.
fill_locf(expand(...))
.The fill_locf()
function performs last observation carried forward imputation.
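To make the idea concrete, here is a minimal base R sketch of last observation carried forward on a single vector. `locf_vec()` is a hypothetical helper, not the package's implementation, and it ignores the grouping and ordering options that `fill_locf()` provides.

```r
# Hypothetical helper: carry the last non-missing value forward.
# Leading missing values have nothing to carry forward and stay NA.
locf_vec <- function(x) {
  # idx counts how many non-missing values have been seen so far (0 at start).
  idx <- cumsum(!is.na(x))
  # Prepend NA so that idx == 0 maps to NA.
  c(NA, x[!is.na(x)])[idx + 1]
}

filled <- locf_vec(c(NA, 1, NA, NA, 2, NA))
```

The fact that a leading `NA` remains `NA` is the reason missing first values need separate handling.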
@@ -158,9 +114,7 @@
library(dplyr)
+library(dplyr)
dat_expanded <- expand(
data = dat,
@@ -169,16 +123,13 @@ Missing First Values)
dat_filled <- dat_expanded %>%
- left_join(baseline_covariates, by = "subject")
-
-
+ left_join(baseline_covariates, by = "subject")
if (FALSE) { # \dontrun{
dat_expanded <- expand(
data = dat,
@@ -207,8 +158,7 @@ Examples
-
A data.frame
containing longdata$get_data(longdata$ids)
, but MNAR outcome
values are set to NA
.
A named list of length 2 containing:
-beta
: a list of length equal to the number of draws containing
+
A named list of length 2 containing:
beta
: a list of length equal to the number of draws containing
the draws from the posterior distribution of the regression coefficients.
sigma
: a list of length equal to the number of draws containing
the draws from the posterior distribution of the covariance matrices. Each element
of the list is a list with length equal to 1 if same_cov = TRUE
or equal to the
number of groups if same_cov = FALSE
.
An imputation object as generated by imputation_df()
.
A longdata
object as generated by longDataConstructor()
.
Either NULL
or a data.frame
. Is used to offset outcome values in the imputed dataset.
Logical. If TRUE
an attribute called "idmap" is attached to
the return object which contains a list
that maps the old subject ids
to the new subject ids.
extract_imputed_dfs(
imputations,
index = seq_along(imputations$imputations),
@@ -92,55 +59,43 @@ Usage
An imputations
object as created by impute()
.
The indexes of the imputed datasets to return. By default,
all datasets within the imputations
object will be returned.
A data.frame
containing the delta transformation to be
applied to the imputed dataset. See analyse()
for details on the
format and specification of this data.frame
.
Logical. The subject IDs in the imputed data.frame
's are
replaced with new IDs to ensure they are unique. Setting this argument to
TRUE
attaches an attribute, called idmap
, to the returned data.frame
's
that will provide a map from the new subject IDs to the old subject IDs.
delta_template()
for creating delta data.frames.
delta_template()
for creating delta data.frames.
The design matrix of the fixed effects.
The response variable. Must be numeric.
Character vector containing the group variable.
Character vector containing the subjects IDs.
Character vector containing the visit variable.
A method
object as generated by method_bayes()
.
Specify whether the stan sampling log should be printed to the console.
A named list composed of the following:
-samples
: a named list containing the draws for each parameter. It corresponds to the output of extract_draws()
.
A named list composed of the following:
samples
: a named list containing the draws for each parameter. It corresponds to the output of extract_draws()
.
fit
: a stanfit
object.
The Bayesian model assumes a multivariate normal likelihood function and weakly-informative priors for the model parameters: in particular, uniform priors are assumed for the regression coefficients and inverse-Wishart priors for the covariance matrices. The chain is initialized using the REML parameter estimates from MMRM as starting values.
-The function performs the following steps:
-Fit MMRM using a REML approach.
The function performs the following steps:
Fit MMRM using a REML approach.
Prepare the input data for the MCMC fit as described in the data{}
block of the Stan file. See prepare_stan_data()
for details.
Run the MCMC according to the input arguments, using the REML parameter estimates from step 1 as starting values.
Perform diagnostic checks of the MCMC. See check_mcmc()
for details.
Extract the draws from the model fit.
The chains perform method$n_samples
draws by keeping one every method$burn_between
iterations. Additionally
+
The chains perform method$n_samples
draws by keeping one every method$burn_between
iterations. Additionally
the first method$burn_in
iterations are discarded. The total number of iterations will
then be method$burn_in + method$burn_between*method$n_samples
.
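The iteration count can be checked with simple arithmetic. The values below are made up for illustration and are not the package defaults; `method` here is a plain list standing in for a real `method` object.

```r
# Illustrative values only (not the defaults of method_bayes()).
method <- list(burn_in = 200, burn_between = 50, n_samples = 20)

# Total iterations: the burn-in plus burn_between iterations per retained draw.
total_iterations <- method$burn_in + method$burn_between * method$n_samples

# Fraction of all iterations that ends up being retained as samples.
retained_fraction <- method$n_samples / total_iterations
```

With these numbers the chain runs 1200 iterations in total to retain 20 draws.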
The purpose of method$burn_in
is to ensure that the samples are drawn from the stationary
@@ -175,8 +124,7 @@
a data.frame
or matrix
containing the covariates to use in the MMRM model.
Dummy variables must already be expanded out, i.e. via stats::model.matrix()
. Cannot contain
any missing values
a numeric vector. The outcome value to be regressed on in the MMRM model.
a character / factor vector. The subject identifier used to link separate visits that belong to the same subject.
a character / factor vector. Indicates which visit the outcome value occurred on.
a character / factor vector. Indicates which treatment group the patient belongs to.
a character value. Specifies which covariance structure to use. Must be one of "us"
(default),
"ad"
, "adh"
, "ar1"
, "ar1h"
, "cs"
, "csh"
, "toep"
, or "toeph"
)
logical. Specifies whether restricted maximum likelihood should be used
logical. Used to specify if a shared or individual covariance matrix should be
used per group
A simul_pars
object as generated by set_simul_pars()
. It specifies
the simulation parameters of the given group.
Function implementing trajectories after the intercurrent event (ICE).
Must be one of getStrategies()
. See getStrategies()
for details. If NULL
then post-ICE
outcomes are untouched.
Optional. Named list containing the simulation parameters of the -reference arm. It contains the following elements:
-mu
: Numeric vector indicating the mean outcome trajectory assuming no ICEs. It should
+
Optional. Named list containing the simulation parameters of the +reference arm. It contains the following elements:
mu
: Numeric vector indicating the mean outcome trajectory assuming no ICEs. It should
include the outcome at baseline.
sigma
: Covariance matrix of the outcome trajectory assuming no ICEs.
If NULL
, then these parameters are inherited from pars_group
.
A data.frame
containing the simulated data. It includes the following variables:
id
: Factor variable that specifies the id of each subject.
A data.frame
containing the simulated data. It includes the following variables:
id
: Factor variable that specifies the id of each subject.
visit
: Factor variable that specifies the visit of each assessment. Visit 0
denotes
the baseline visit.
group
: Factor variable that specifies which treatment group each subject belongs to.
outcome
: Numeric variable that specifies the longitudinal outcome including ICE1, ICE2
and the intermittent missing values.
By default Jump to Reference (JR), Copy Reference (CR), Copy Increments in Reference (CIR), Last Mean Carried Forward (LMCF) and Missing at Random (MAR) are defined.
@@ -121,8 +83,7 @@if (FALSE) { # \dontrun{
getStrategies()
getStrategies(
@@ -134,8 +95,7 @@ Examples
-
A longDataConstructor()
object
A method
object
A Stack()
object (this is only exposed for unit testing purposes)
an imputations object created by impute()
.
R6 longdata
object containing all relevant input data information.
A method
object as generated by either
method_approxbayes()
or method_condmean()
with argument type = "bootstrap"
.
A stack object containing the subject ids to be used on each mmrm iteration.
Number of samples needed to be created
Logical. If TRUE
the function returns method$n_samples + 1
samples where
the first sample contains the parameter estimates from the original dataset and method$n_samples
samples contain the parameter estimates from bootstrap samples.
@@ -132,59 +92,45 @@
Logical. If TRUE
, the sampled subject ids are returned. Otherwise
the subject ids from the original dataset are returned. These values are used to tell impute()
what subjects should be used to derive the imputed dataset.
Number of failed samples that are allowed before throwing an error
Number of processes to parallelise the job over
Logical, If TRUE
will suppress printing of progress information that is printed to
the console.
A draws
object which is a named list containing the following:
data
: R6 longdata
object containing all relevant input data information.
A draws
object which is a named list containing the following:
data
: R6 longdata
object containing all relevant input data information.
method
: A method
object as generated by either method_bayes()
,
method_approxbayes()
or method_condmean()
.
samples
: list containing the estimated parameters of interest.
-Each element of samples
is a named list containing the following:
ids
: vector of characters containing the ids of the subjects included in the original dataset.
samples
: list containing the estimated parameters of interest.
+Each element of samples
is a named list containing the following:
ids
: vector of characters containing the ids of the subjects included in the original dataset.
beta
: numeric vector of estimated regression coefficients.
sigma
: list of estimated covariance matrices (one for each level of vars$group
).
theta
: numeric vector of transformed covariances.
failed
: Logical. TRUE
if the model fit failed.
ids_samp
: vector of characters containing the ids of the subjects included in the given sample.
fit
: if method_bayes()
is chosen, returns the MCMC Stan fit object. Otherwise NULL
.
n_failures
: absolute number of failures of the model fit.
Relevant only for method_condmean(type = "bootstrap")
, method_approxbayes()
and method_bmlmi()
.
formula
: fixed effects formula object used for the model specification.
This function takes a Stack
object which contains multiple lists of patient ids. The function
takes this Stack and pulls a set of ids and then constructs a dataset just consisting of these
patients (i.e. potentially a bootstrap or a jackknife sample).
ests
must be provided in the following order: the first D elements are related to analyses from
random imputation of one bootstrap sample. The second set of D elements (i.e. from D+1 to 2*D)
are related to the second bootstrap sample and so on.
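The expected ordering of `ests` can be sketched as follows, assuming hypothetical values of `B = 3` bootstrap samples and `D = 2` random imputations per sample. The numbers stand in for real analysis estimates.

```r
B <- 3  # assumed number of bootstrap samples
D <- 2  # assumed number of random imputations per bootstrap sample

# Placeholder estimates, ordered as required:
# elements 1..D belong to bootstrap sample 1, D+1..2*D to sample 2, etc.
ests <- c(0.11, 0.13, 0.21, 0.19, 0.30, 0.33)
stopifnot(length(ests) == B * D)

# Label each estimate with its bootstrap sample and group accordingly.
boot_id <- rep(seq_len(B), each = D)
by_bootstrap <- split(ests, boot_id)
```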
Von Hippel, Paul T and Bartlett, Jonathan W. Maximum likelihood multiple imputation: Faster imputations and consistent standard errors without posterior draws. 2021
get_example_data()
simulates a 1:1 randomized trial of
an active drug (intervention) versus placebo (control) with 100 subjects per
group and 6 post-baseline assessments (bi-monthly visits until 12 months).
One intercurrent event corresponding to treatment discontinuation is also simulated.
-Specifically, data are simulated under the following assumptions:
The mean outcome trajectory in the placebo group increases linearly from +Specifically, data are simulated under the following assumptions:
The mean outcome trajectory in the placebo group increases linearly from 50 at baseline (visit 0) to 60 at visit 6, i.e. the slope is 10 points/year.
The mean outcome trajectory in the intervention group is identical to the placebo group up to visit 2. From visit 2 onward, the slope decreases by 50% to 5 points/year.
Study drop-out at the study drug discontinuation visit occurs with a probability of 50% leading to missing outcome data from that time point onward.
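The two mean trajectories described above can be written out in base R. This is a sketch of the stated assumptions only (visits 0 to 6, bi-monthly, 12 months in total) and does not reproduce the simulation code behind `get_example_data()`.

```r
visits <- 0:6

# Placebo: linear increase from 50 at visit 0 to 60 at visit 6
# (10 points/year spread over 6 bi-monthly visits).
mu_placebo <- 50 + (10 / 6) * visits

# Intervention: identical up to visit 2, then half the slope (5 points/year).
mu_intervention <- ifelse(
  visits <= 2,
  mu_placebo,
  mu_placebo[3] + (5 / 6) * (visits - 2)
)
```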
A longDataConstructor()
object
A method
object
A Stack()
object (this is only exposed for unit testing purposes)
vector of characters containing the ids of the subjects.
R6 longdata
object containing all relevant input data information.
A method
object as generated by either
method_approxbayes()
or method_condmean()
.
A named list of class sample_single
. It contains the following:
ids
vector of characters containing the ids of the subjects included in the original dataset.
A named list of class sample_single
. It contains the following:
ids
vector of characters containing the ids of the subjects included in the original dataset.
beta
numeric vector of estimated regression coefficients.
sigma
list of estimated covariance matrices (one for each level of vars$group
).
theta
numeric vector of transformed covariances.
failed
logical. TRUE
if the model fit failed.
ids_samp
vector of characters containing the ids of the subjects included in the given sample.
The column is_avail
must be a character or numeric 0
or 1