Expectation of factors in DMT01 #675

barnett11 · 2023-10-16T11:03:55Z

Hi,
This issue covers beyond DMT01, but primarily impacts this extremely common table.
We can pass many variables into DMT01 to summarize (normally Demographic and Baseline Characteristics). If a category of a variable is present in one treatment group, but not another, then DMT01 fails and requires the variable to be passed as factor. This is a little erratic, especially as it will happen entirely dependent on data and its spread across treatment groups - and will eg. cause an error as data changes, or when applying filtering.
I understand this is not a chevron issue, but broader - but is there a better way we can process these types of templates so that the user is not forced to keep updating columns to factors just to display 0 under a treatment group?

The text was updated successfully, but these errors were encountered:

BFalquet · 2023-10-16T11:16:47Z

@clarkliming

clarkliming · 2023-10-16T11:49:28Z

I totally agree that this should be fixed. @shajoezhu is it possible to fix this in rtables?

clarkliming · 2023-10-16T15:49:44Z

depending on the requirement, it has different solutions.

if empty space is needed, then we may want to update rtables so that even different stats they can be used (and non-existing will be shown as empty fields. However, there is a need to smartly pick up the order the these stats, e.g. col1 has a,b, col 2 has b, c we need to infer that all stats are a,b, c.
if 0 is needed, then probably we can stop any "character" input and only allow "factor".
Or, instead of really analyzing within each "row and column" split independently, we need to know the data in each column split. Then in a_summary a something like df_rows is needed to know all the levels

current solution: there is already a format related thing to convert character to factor. If new variablee need to be added, of course user can update the formats file. If new levels comes in, if proper format is used, the 0 will be still displayed

clarkliming · 2023-11-23T06:04:05Z

https://github.com/insightsengineering/chevron.adr/blob/main/docs/decisions/0009-chevron-only-factor.md

@BFalquet @barnett11 please have a look at the proposal

@BFalquet please help evaluate the impact of only allowing factor in chevron
@barnett11 please also consider if it is fine to only allow factor in chevron?

BFalquet · 2023-11-23T07:49:53Z

I agree with the diagnosis, we can easily convert to character with the risk of loosing (or at least not gaining) information about the missing levels. We can probably mitigate the risk by creating reformatting rules that introduces level that we know should be present (maybe possible for the different flag and other very controlled variables). For more exotic variables, unfortunately, there is not much we can do.

clarkliming · 2023-11-23T07:53:16Z

@BFalquet it is converting all character to factor, so won't lose any levels

barnett11 · 2023-11-23T08:00:24Z

We had this originally is that correct, with a function to convert the dataset being read in to all factors (was it df_explicit_na?). Would this be the advised way again then, and why did we remove this before (I can't remember sorry)

clarkliming · 2023-11-26T23:43:03Z

We had this originally is that correct, with a function to convert the dataset being read in to all factors (was it df_explicit_na?). Would this be the advised way again then, and why did we remove this before (I can't remember sorry)

df_explicit_na is doing to much things and all NA values will be treated identically (to be converted to something else) which is not ideal I believe. And a lot of analysis does not require factor I think, e.g. for counting "Y" or "N", so this step is not added to our script.

However now, we may consider at least convert those "analyze_vars" to use "factor" only, and users should take care of the "character-factor" conversion (for odering , keep missing levels, etc)

BFalquet added the chevron label Oct 16, 2023

clarkliming removed the chevron label Oct 16, 2023

clarkliming mentioned this issue Oct 19, 2023

character or factor in dataset #600

Open

BFalquet added the chevron label Oct 25, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Expectation of factors in DMT01 #675

Expectation of factors in DMT01 #675

barnett11 commented Oct 16, 2023

BFalquet commented Oct 16, 2023

clarkliming commented Oct 16, 2023

clarkliming commented Oct 16, 2023

clarkliming commented Nov 23, 2023

BFalquet commented Nov 23, 2023

clarkliming commented Nov 23, 2023

barnett11 commented Nov 23, 2023

clarkliming commented Nov 26, 2023

Expectation of factors in DMT01 #675

Expectation of factors in DMT01 #675

Comments

barnett11 commented Oct 16, 2023

BFalquet commented Oct 16, 2023

clarkliming commented Oct 16, 2023

clarkliming commented Oct 16, 2023

clarkliming commented Nov 23, 2023

BFalquet commented Nov 23, 2023

clarkliming commented Nov 23, 2023

barnett11 commented Nov 23, 2023

clarkliming commented Nov 26, 2023