I have a dataframe that stores marks for questions against multiple ids.

ID, Q1, Q2, Q3, Q4, Q5
R1,  4,  3,  3,  2,  1
R2,  3,  2,  3,  2,  4
R3,  5,  1,  3,  4,  3
R4,  1,  3,  3,  5,  3
...
...

I want to plot the average marks of the 5 questions in a single plot.

How do I go about doing this in R using the ggplot2 package? What would be my 'x' and 'y' aesthetics?

3 Answers

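You need to start by transforming your data. Here I compute the column means, turn them into a data.frame with one column for the question labels and another for the averages, and then feed that to ggplot. The labels go on the x aesthetic and the averages on y.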

library(ggplot2)
col_means <- colMeans(data[paste0("Q", 1:5)])
col_meansdf <- stack(col_means)
col_meansdf
#   values ind
# 1   3.25  Q1
# 2   2.25  Q2
# 3   3.00  Q3
# 4   3.25  Q4
# 5   2.75  Q5

ggplot(col_meansdf, aes(x = ind, y = values)) + 
  geom_col()


# or in one step:
# (qplot() is deprecated as of ggplot2 3.4.0; it still works, but with a warning)
qplot(
  x = paste0("Q", 1:5), 
  y = colMeans(data[paste0("Q", 1:5)]), 
  geom = "col"
)

Reproducible data:

data <- read.table(
  text = "ID, Q1, Q2, Q3, Q4, Q5
  R1,  4,  3,  3,  2,  1
  R2,  3,  2,  3,  2,  4
  R3,  5,  1,  3,  4,  3
  R4,  1,  3,  3,  5,  3", 
  header = TRUE,
  sep = ","
)
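
On data size: the stacked data frame holds one row per question, however many ids the original has, so it stays tiny. Here is a minimal sketch of that, assuming a made-up 100,000-row data frame called big. The name, the size, and the random marks are illustrative assumptions, not part of the question.

```r
library(ggplot2)

# Hypothetical large data set: 100,000 ids with random marks from 1 to 5.
set.seed(1)
n <- 1e5
question_cols <- paste0("Q", 1:5)

big <- data.frame(ID = paste0("R", seq_len(n)))
for (q in question_cols) {
  big[[q]] <- sample(1:5, n, replace = TRUE)
}

# colMeans() collapses every question column to a single number,
# so the summary has one row per question regardless of n.
means_df <- stack(colMeans(big[question_cols]))
nrow(means_df)

p <- ggplot(means_df, aes(x = ind, y = values)) +
  geom_col() +
  labs(x = "Question", y = "Average mark")
print(p)
```

Only the five-row summary is handed to ggplot, so the cost of the extra data frame should be negligible next to computing the means themselves.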
2 Comments

Hey, thanks for such a prompt reply. I am new to R, so excuse my naivety. I was wondering if there is any way to do this without creating a new data frame. My original data frame is quite big and creating a new dataframe would significantly increase analysis time.
@SouravAdhikari see the second solution marked with the comment # or in one step:
One-liner with geom_col:

ggplot(data.frame(mean = colMeans(df), question = names(df))) +
  geom_col(aes(question, mean))

Data

df <- data.frame(Q1 = c(4,3,5,1), 
           Q2 = c(3,2,1,3),
           Q3 = c(2,2,4,5),
           Q4 = c(1,4,3,3))
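
The sample data here contains only question columns. colMeans() needs numeric input, so with the ID column from the question present it would have to be dropped first. A possible sketch, assuming the question's data is held in a data frame called marks (a name chosen here for illustration):

```r
library(ggplot2)

# The question's data, including the non-numeric ID column.
marks <- data.frame(
  ID = c("R1", "R2", "R3", "R4"),
  Q1 = c(4, 3, 5, 1),
  Q2 = c(3, 2, 1, 3),
  Q3 = c(3, 3, 3, 3),
  Q4 = c(2, 2, 4, 5),
  Q5 = c(1, 4, 3, 3)
)

# Keep only the numeric columns before averaging.
is_num <- vapply(marks, is.numeric, logical(1))
means <- colMeans(marks[is_num])

means_df <- data.frame(question = names(means), mean = unname(means))

p <- ggplot(means_df) +
  geom_col(aes(question, mean))
print(p)
```

Selecting by is.numeric() rather than by position is one design choice among several; marks[-1] or marks[paste0("Q", 1:5)] would work just as well when the layout is known.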

You can do this with stat_summary after converting the data from wide to long format. Change geom = "point" to whichever geom you prefer; see ?stat_summary for the other options.

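library(dplyr)    # for %>%
library(tidyr)    # for gather()
library(ggplot2)

long <- df1 %>%
  gather(Question, Answer, -ID)

# fun replaces the deprecated fun.y argument (ggplot2 >= 3.3.0)
ggplot(long, aes(Question, Answer)) +
  stat_summary(geom = "point", fun = mean)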

Data.

df1 <- read.csv(text = "
ID, Q1, Q2, Q3, Q4, Q5
R1,  4,  3,  3,  2,  1
R2,  3,  2,  3,  2,  4
R3,  5,  1,  3,  4,  3
R4,  1,  3,  3,  5,  3
")
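
As an illustration of swapping the geom, the sketch below draws the means as bars and adds standard-error bars with a second stat_summary layer. It assumes ggplot2 3.3.0 or later (for the fun argument) and uses tidyr's pivot_longer() in place of gather(). The error bars are an addition for illustration, not something the question asked for.

```r
library(tidyr)
library(ggplot2)

# Same data as above, built directly so the example is self-contained.
df1 <- data.frame(
  ID = c("R1", "R2", "R3", "R4"),
  Q1 = c(4, 3, 5, 1),
  Q2 = c(3, 2, 1, 3),
  Q3 = c(3, 3, 3, 3),
  Q4 = c(2, 2, 4, 5),
  Q5 = c(1, 4, 3, 3)
)

# Wide to long: one row per (ID, Question) pair.
long <- pivot_longer(df1, cols = -ID,
                     names_to = "Question", values_to = "Answer")

p <- ggplot(long, aes(Question, Answer)) +
  # Bars at the per-question mean.
  stat_summary(geom = "bar", fun = mean) +
  # mean_se() returns the mean plus and minus one standard error.
  stat_summary(geom = "errorbar", fun.data = mean_se, width = 0.2)
print(p)
```

Because stat_summary computes the means at plot time, no separate summary data frame is needed; the long data frame is the only intermediate object.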
