3

tbl_summary() from the gtsummary package does not treat all numeric variables the same way, and I can't figure out how to change that. For example:

mtcars only has numeric variables, so when I run this, I expect the mean of every variable to be calculated. Instead, it treats cyl, gear, and carb as categorical.

tbl_summary(mtcars, statistic = list(all_numeric() ~ "{mean} ({sd})",
                                      all_categorical() ~ "{n} / {N} ({p}%)"))

I actually have a much bigger dataset, and tbl_summary is treating some of its numeric variables as categorical. Could it be because the N's are so small (say, many rows are missing) and tbl_summary does not want to calculate a mean for such a small N?

I can't wrap my mind around this!

Just a further example from my data. Q12_5_TEXT is a numeric variable, but this is the output from tbl_summary.

[Screenshot of the tbl_summary output for Q12_5_TEXT; image not included]

1
  • @Daniel D. Sjoberg please let me know if you have any suggestions! Commented Sep 24, 2020 at 0:24

3 Answers

4

Variables with few unique levels are summarized categorically. For example, mtcars$cyl only has three unique levels: 4, 6, 8. With only three levels, a categorical summary is more appropriate than a mean or median.

Use the type= argument to change the default summary type.
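As a minimal sketch of what that could look like for the mtcars example from the question: the call below names the three variables explicitly and asks for a continuous summary. It assumes a reasonably recent gtsummary in which bare column names are accepted on the left-hand side of the formula; the choice of `"{mean} ({sd})"` simply mirrors the question.

```r
library(gtsummary)

# Override the default summary type for the numeric variables that
# tbl_summary() would otherwise summarize categorically because they
# have only a few unique values.
tbl <- tbl_summary(
  mtcars,
  type = list(c(cyl, gear, carb) ~ "continuous"),
  statistic = list(all_continuous() ~ "{mean} ({sd})")
)

tbl
```

Variables not named in `type` keep their default summary type, so only cyl, gear, and carb are affected by the override.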


2 Comments

How do I change this default? type = all_continuous() ~ "continuous2" does not work.
continuous2 is available in version 1.3.5 and higher. If installing the new version doesn't solve your issue, then please post a reprex.
3

I tried type = all_continuous() ~ "continuous2" with version 1.3.5, and it didn't change the summary type:

library(tidyverse)
library(gtsummary)

nrows <- 30

df <- tibble(
  a = sample(c(0, 1, 3.5, 7.5), nrows, replace = TRUE),
  b = sample(c("Group I", "Group II"), nrows, replace = TRUE)
)

df %>% 
  tbl_summary(
    by = b,
    type = all_continuous() ~ "continuous2",
    statistic = all_continuous() ~ "{mean} ({sd})"
  )

The output summarizes variable 'a' as if it were categorical, in spite of the type argument. I ran into this same issue, which is why I came here for the answer. If there is a different argument I should be using, I would greatly appreciate a pointer!

2 Comments

If 1.3.5 doesn't have this functionality yet, is there any way you could add it in the upcoming 1.3.6 update?
You need type = a ~ "continuous2". 😸
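The suggestion in the comment above can be sketched as follows, reusing the toy data from this answer. The likely reason the original attempt had no effect is that `all_continuous()` in `type` only selects variables that are already treated as continuous, and 'a' is not one of them. The `set.seed()` call and the use of `data.frame()` instead of `tibble()` are assumptions added here to keep the sketch reproducible and light on dependencies.

```r
library(gtsummary)

set.seed(1)  # assumption: added only so the example is reproducible
nrows <- 30

df <- data.frame(
  a = sample(c(0, 1, 3.5, 7.5), nrows, replace = TRUE),
  b = sample(c("Group I", "Group II"), nrows, replace = TRUE)
)

# Name the variable directly instead of relying on all_continuous(),
# since 'a' has too few unique values to be typed as continuous by default.
tbl <- tbl_summary(
  df,
  by = b,
  type = a ~ "continuous2",
  statistic = a ~ "{mean} ({sd})"
)

tbl
```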
2

I had this same issue and I fixed it by telling tbl_summary that the categorical variables are in fact continuous. Try:

df %>% 
  tbl_summary(
    by = b,
    type = list(all_continuous() ~ "continuous2",
                all_categorical() ~ "continuous2"),
    statistic = all_continuous() ~ "{mean} ({sd})"
  )

1 Comment

Thanks! Seems like a bug to have to declare that our continuous variables that are being treated as categorical are in fact continuous.
