
I know this gets asked a lot, but I'm having trouble making a 100% stacked bar plot in R. There are plenty of pages explaining how, but nothing has worked for me, and I suspect the data I'm importing isn't laid out correctly, so I'd like to know what I'm doing wrong in that respect. My data looks like the data in the attached picture. I can create the exact chart I want in Excel, which is also shown (the bar graph on the right; I couldn't attach more than one picture, so both are in the same image), but for various reasons I need it in R. Is the way the data is laid out in Excel incorrect, and if so, how do I fix it?

[Image: data being used on left, correct Excel graph on right]

    Can you add some code that you tried and where things went wrong? Right now it seems like a duplicate to me, possibly of, e.g., this question. But there may be subtle differences that we'll be able to see once you have added some code. Read here for ideas on how to make your question reproducible. Commented Aug 7, 2018 at 18:01

1 Answer


In ggplot2 at least, you need to convert your data from "wide" to "long" format. Below, I use the tidyr::gather function to "gather" the two data columns ("running" and "jumping") into a single "fraction" column, which you can then color by "activity".

library(magrittr)                       # For pipe (%>%)

dat <- tibble::tibble(
  weeks = 1:15,
  running = runif(15, 0, 1),
  jumping = 1 - running
)

dat
#> # A tibble: 15 x 3
#>    weeks running jumping
#>    <int>   <dbl>   <dbl>
#>  1     1  0.675   0.325 
#>  2     2  0.727   0.273 
#>  3     3  0.430   0.570 
#>  4     4  0.324   0.676 
#>  5     5  0.809   0.191 
#>  6     6  0.260   0.740 
#>  7     7  0.433   0.567 
#>  8     8  0.872   0.128 
#>  9     9  0.0288  0.971 
#> 10    10  0.903   0.0970
#> 11    11  0.295   0.705 
#> 12    12  0.538   0.462 
#> 13    13  0.342   0.658 
#> 14    14  0.291   0.709 
#> 15    15  0.877   0.123

library(ggplot2)

dat_long <- dat %>%
  tidyr::gather(activity, fraction, running, jumping)

dat_long
#> # A tibble: 30 x 3
#>    weeks activity fraction
#>    <int> <chr>       <dbl>
#>  1     1 running    0.675 
#>  2     2 running    0.727 
#>  3     3 running    0.430 
#>  4     4 running    0.324 
#>  5     5 running    0.809 
#>  6     6 running    0.260 
#>  7     7 running    0.433 
#>  8     8 running    0.872 
#>  9     9 running    0.0288
#> 10    10 running    0.903 
#> # ... with 20 more rows

ggplot(dat_long) +
  aes(x = factor(weeks), y = fraction, fill = activity) +
  geom_col()
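
The example data above already sums to 1 within each week, so plain stacking gives 100% bars. If the imported columns hold raw counts instead (an assumption here, since the question's data is only visible in a picture), one option is position = "fill", which rescales every bar to a total of 1. A minimal sketch with made-up counts; the counts_long data frame and its values are hypothetical:

```r
library(ggplot2)

# Hypothetical raw counts (not from the question); weeks do not sum to 1.
counts_long <- data.frame(
  weeks    = rep(1:3, times = 2),
  activity = rep(c("running", "jumping"), each = 3),
  count    = c(10, 20, 30, 30, 20, 10)
)

p <- ggplot(counts_long) +
  aes(x = factor(weeks), y = count, fill = activity) +
  geom_col(position = "fill") +   # rescale each bar to a total of 1
  scale_y_continuous(labels = function(x) paste0(100 * x, "%"))

p
```

With position = "fill" the y axis shows shares rather than the original counts, so the label function only changes how the ticks are printed.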

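You can also do this in base R by converting the data to a matrix with one row per activity and one column per week. (Note that I use [, -1] to drop the first column, weeks, and t() to transpose the result. The values printed below differ from those above, presumably because the random data was regenerated.)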

dat_tmat <- t(as.matrix(dat[, -1]))
dat_tmat
#>              [,1]      [,2]      [,3]      [,4]       [,5]      [,6]
#> running 0.5227949 0.5352537 0.5879579 0.2678927 0.93068128 0.2948861
#> jumping 0.4772051 0.4647463 0.4120421 0.7321073 0.06931872 0.7051139
#>               [,7]      [,8]      [,9]       [,10]      [,11]     [,12]
#> running 0.07729363 0.8925416 0.5503279 0.007479232 0.02991765 0.5832765
#> jumping 0.92270637 0.1074584 0.4496721 0.992520768 0.97008235 0.4167235
#>             [,13]     [,14]     [,15]
#> running 0.8660134 0.1156794 0.3176998
#> jumping 0.1339866 0.8843206 0.6823002

barplot(dat_tmat, col = c("blue", "red"))
legend("topleft", c("running", "jumping"), col = c("blue", "red"), lwd = 5, bg = "white")
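
As with the ggplot2 version, this works as a 100% chart because each column of the matrix already sums to 1. For raw counts, a rough sketch is to normalize each column with prop.table() before plotting; the counts matrix below is hypothetical, not the question's data:

```r
# Hypothetical raw counts: one row per activity, one column per week.
counts <- rbind(running = c(10, 20, 30), jumping = c(30, 20, 10))
colnames(counts) <- 1:3

# Divide each column by its total so every bar sums to 1.
shares <- prop.table(counts, margin = 2)

barplot(shares, col = c("blue", "red"), legend.text = rownames(shares))
```

Here legend.text lets barplot() draw the legend itself, using the same fill colors as the bars.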


4 Comments

Thank you!! Is there a way to do this by importing the data, instead of typing it all up? (for some reason I can't tag you @Alexey)
Of course. R can import a wide variety of data types. I would read through the "Data Import" chapter of "R for Data Science" by Garrett Grolemund and Hadley Wickham (r4ds.had.co.nz/data-import.html). There are R packages for reading directly from Excel, but it's probably easier to export to CSV. Also, if this answer works for you, please accept it (click the grey check mark) and upvote it (click the up arrow).
What I mean is, how do I convert the imported data from wide to long, and then do the same thing?
As I said in my answer, tidyr::gather will convert data from wide to long. The code I have above already does this, and you can find more information and examples in the documentation (?tidyr::gather at the R prompt). To convert from long to wide, use tidyr::spread.
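
To make the import step from these comments concrete, here is a minimal sketch that reads a CSV and reshapes it with the same tidyr::gather call used in the answer. It assumes the Excel sheet was exported to CSV with columns named weeks, running and jumping; the file name and values are hypothetical, and the file is written to a temporary directory only so the example is self-contained:

```r
# Write a small stand-in CSV; in practice this would be the file
# exported from Excel (the name "activity.csv" is hypothetical).
csv_path <- file.path(tempdir(), "activity.csv")
writeLines(c("weeks,running,jumping",
             "1,0.7,0.3",
             "2,0.4,0.6",
             "3,0.9,0.1"), csv_path)

# Import the wide data: one row per week, one column per activity.
dat <- read.csv(csv_path)

# Wide to long: one row per week/activity pair.
dat_long <- tidyr::gather(dat, activity, fraction, running, jumping)
dat_long
```

The resulting dat_long has the same three columns as in the answer, so the ggplot code there should apply unchanged. In newer versions of tidyr, gather is superseded by pivot_longer, which does the same reshaping.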
