Plot multiple variables in the same bar plot

Question

My data frame looks like this (1322 rows in total):

[image: My dataframe]

I'd like to make a bar plot showing the percentage of each rating of the CFS score. It should look similar to this:

[image: example of the desired plot, not reproduced here]

With this code, I can make a single bar plot for the column cfs_triage:

ggplot(data = df) + 
  geom_bar(mapping = aes(x = cfs_triage, y = (..count..)/sum(..count..)))

[image: My very basic barplot]

But I can't figure out how to make one with the three variables next to one another.

Thank you in advance to everyone who helps me make this bar plot with the percentage of ratings for these three variables! (I'm not sure my explanation is very clear, but I hope it is :))

Allan Cameron · Accepted Answer · 2022-11-25 15:11:50Z

2

Your best bet here is to pivot your data into long format. We don't have your data, but we can reproduce a similar data set like this:

set.seed(1)

df <- data.frame(cfs_triage  = sample(10, 1322, TRUE, prob = 1:10), 
                 cfs_silver  = sample(10, 1322, TRUE), 
                 cfs_student = sample(10, 1322, TRUE, prob = 10:1))

df[] <- lapply(df, function(x) { x[sample(1322, 300)] <- NA; x})

Now the dummy data set looks a lot like yours:

head(df)
#>   cfs_triage cfs_silver cfs_student
#> 1          9         NA           1
#> 2          8          4           2
#> 3         NA          8          NA
#> 4         NA         10           9
#> 5          9          5          NA
#> 6          3          1          NA

If we pivot into long format, then we will end up with two columns: one containing the values, and one containing the column name that the value belonged to in the original data frame:

library(tidyverse)

df_long <- df %>%
  pivot_longer(everything())

head(df_long)
#> # A tibble: 6 x 2
#>   name        value
#>   <chr>       <int>
#> 1 cfs_triage      9
#> 2 cfs_silver     NA
#> 3 cfs_student     1
#> 4 cfs_triage      8
#> 5 cfs_silver      4
#> 6 cfs_student     2

This then allows us to plot with value on the x axis, and we can use name as a grouping / fill variable:

ggplot(df_long, aes(value, fill = name)) +
  geom_bar(position = 'dodge') +
  scale_fill_grey(name = NULL) +
  theme_bw(base_size = 16) +
  scale_x_continuous(breaks = 1:10)
#> Warning: Removed 900 rows containing non-finite values (`stat_count()`).

Created on 2022-11-25 with reprex v2.0.2
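The plot above shows counts, while the question asks for percentages. One possible way to get there, sketched under the assumption that "percentage" means the share of each score within each variable (ignoring missing values), is to count first and then plot the precomputed shares with geom_col(). The names df_pct, n and pct are only illustrative and do not appear in the original answer:

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Same dummy data as in the answer above (assumed to resemble the real data).
set.seed(1)
df <- data.frame(cfs_triage  = sample(10, 1322, TRUE, prob = 1:10),
                 cfs_silver  = sample(10, 1322, TRUE),
                 cfs_student = sample(10, 1322, TRUE, prob = 10:1))
df[] <- lapply(df, function(x) { x[sample(1322, 300)] <- NA; x })

# Count each score per variable, then turn the counts into
# within-variable shares (each variable's shares sum to 1).
df_pct <- df %>%
  pivot_longer(everything()) %>%
  filter(!is.na(value)) %>%
  count(name, value) %>%
  group_by(name) %>%
  mutate(pct = n / sum(n)) %>%
  ungroup()

# geom_col() plots the precomputed shares as they are.
p <- ggplot(df_pct, aes(value, pct, fill = name)) +
  geom_col(position = "dodge") +
  scale_fill_grey(name = NULL) +
  scale_x_continuous(breaks = 1:10) +
  scale_y_continuous(labels = scales::label_percent()) +
  theme_bw(base_size = 16)
print(p)
```

Dropping the NA rows before counting means each variable is normalised by its own number of non-missing ratings; if the missing ratings should count towards the denominator, the filter() step would have to change.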


1 Comment

esptang Over a year ago

Thank you for your help. Pivoting the data frame is a brilliant idea. As I'd like to have percentages rather than counts on the y-axis, I tried to change it with this code:

ggplot(df_long, aes(x = value, y = (..count..)/sum(..count..), fill = name)) +
  geom_bar(position = 'dodge') +
  scale_fill_grey(name = NULL) +
  theme_bw(base_size = 16) +
  scale_x_continuous(breaks = 1:10)

This code changes the y-axis labels to decimals (proportions) but not the values in the plot. Any idea how to change that? Thank you in advance.
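A likely explanation for what the comment describes, offered as an assumption rather than something confirmed in the thread: sum(count) seems to be taken over all bars together, so every bar is divided by the same total and only the scale changes. stat_count() also provides a computed variable prop, the proportion within each group, and because fill = name defines the groups here, mapping y to it should give shares per variable. A minimal sketch on the dummy data:

```r
library(tidyr)
library(ggplot2)

# Same dummy data as in the answer above (assumed to resemble the real data).
set.seed(1)
df <- data.frame(cfs_triage  = sample(10, 1322, TRUE, prob = 1:10),
                 cfs_silver  = sample(10, 1322, TRUE),
                 cfs_student = sample(10, 1322, TRUE, prob = 10:1))
df[] <- lapply(df, function(x) { x[sample(1322, 300)] <- NA; x })

# Long format, with missing ratings dropped up front to avoid the warning.
df_long <- drop_na(pivot_longer(df, everything()), value)

# after_stat(prop) is the proportion within each group; the groups
# come from the discrete fill variable (name), not from x.
p <- ggplot(df_long, aes(x = value, fill = name)) +
  geom_bar(aes(y = after_stat(prop)), position = "dodge") +
  scale_fill_grey(name = NULL) +
  scale_x_continuous(breaks = 1:10) +
  scale_y_continuous(labels = scales::label_percent()) +
  theme_bw(base_size = 16)
print(p)
```

This only works as intended while x stays numeric. If value were converted to a factor, each bar would become its own group and every proportion would be 1.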

TarJae · Accepted Answer · 2022-11-25 15:47:25Z

2

Maybe you need something like this. The formatting was taken from @Allan Cameron (many thanks!):

library(tidyverse)
library(scales)

df %>% 
  mutate(id = row_number()) %>% 
  pivot_longer(-id) %>% 
  group_by(id) %>% 
  mutate(percent = value/sum(value, na.rm = TRUE)) %>% 
  mutate(percent = ifelse(is.na(percent), 0, percent)) %>% 
  mutate(my_label = str_trim(paste0(format(100 * percent, digits = 1), "%"))) %>% 
  ggplot(aes(x = factor(name), y = percent, fill = factor(name), label = my_label))+
  geom_col(position = position_dodge())+
  geom_text(aes(label = my_label), vjust=-1) +
  facet_wrap(. ~ id, nrow=1,  strip.position = "bottom")+
  scale_fill_grey(name = NULL) +
  scale_y_continuous(labels = scales::percent)+
  theme_bw(base_size = 16)+
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))


2 Comments

esptang Over a year ago

[PART 1] Thank you for this code idea. Unfortunately, the ggplot part of the code is not working for me: the console keeps waiting (the > prompt doesn't reappear). I split the code in two:

df_1 <- df %>%
  mutate(id = row_number()) %>%
  pivot_longer(-id) %>%
  group_by(id) %>%
  mutate(percent = value/sum(value, na.rm = TRUE)) %>%
  mutate(percent = ifelse(is.na(percent), 0, percent)) %>%
  mutate(my_label = str_trim(paste0(format(100 * percent, digits = 1), "%")))

and

esptang Over a year ago

[PART 2]

ggplot(data = df_1, aes(x = factor(name), y = percent, fill = factor(name), label = my_label)) +
  geom_col(position = position_dodge()) +
  geom_text(aes(label = my_label), vjust=-1) +
  facet_wrap(. ~ id, nrow=1,  strip.position = "bottom") +
  scale_fill_grey(name = NULL) +
  scale_y_continuous(labels = scales::percent) +
  theme_bw(base_size = 16) +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))

The code for creating the new data frame works, but the ggplot part doesn't. Any idea what the problem is? Thank you in advance.
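One plausible reason for the console appearing to hang, stated here as an assumption since the thread does not confirm it: facet_wrap(. ~ id) creates one panel per row, so with 1322 rows ggplot2 has to lay out 1322 panels in a single row, which can take a very long time. A rough way to check is to count the panels and draw only a handful of ids. The names n_panels and df_small are illustrative:

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Same dummy data as in the first answer (assumed to resemble the real data).
set.seed(1)
df <- data.frame(cfs_triage  = sample(10, 1322, TRUE, prob = 1:10),
                 cfs_silver  = sample(10, 1322, TRUE),
                 cfs_student = sample(10, 1322, TRUE, prob = 10:1))
df[] <- lapply(df, function(x) { x[sample(1322, 300)] <- NA; x })

# Per-row shares, following the answer's approach.
df_1 <- df %>%
  mutate(id = row_number()) %>%
  pivot_longer(-id) %>%
  group_by(id) %>%
  mutate(percent = value / sum(value, na.rm = TRUE)) %>%
  ungroup() %>%
  mutate(percent = coalesce(percent, 0))

# One facet per id: this is how many panels the full plot would need.
n_panels <- n_distinct(df_1$id)

# Plotting only the first few ids keeps the number of panels manageable.
df_small <- filter(df_1, id <= 5)

p <- ggplot(df_small, aes(x = name, y = percent, fill = name)) +
  geom_col() +
  facet_wrap(~ id, nrow = 1, strip.position = "bottom") +
  scale_fill_grey(name = NULL) +
  scale_y_continuous(labels = scales::label_percent()) +
  theme_bw(base_size = 16) +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
print(p)
```

Note that this approach shows each row's scores as shares of that row's total, which is a different quantity from the distribution of ratings per variable that the question describes.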
