I have a large data frame and am trying to total the scores for many questions. Here is some sample data:

Q1 = c("apple", "banana", "cider", "muffin", "chocolate")
Q2 = c("orange", "kiwi", "calzone", "cupcake", "cake")
ID = c("P1", "P2", "P3", "P4", "P5")

mydf = data.frame(Q1,Q2,ID)

answer_key = c("apple", "kiwi", "pizza", "dessert", "cake")

I've been trying to use ifelse and %in% on the whole data frame:

mydf = ifelse(mydf %in% answer_key, 1,0)

but it doesn't work, and it returns a vector when I need a dataframe. I just want to replace my values without having to do this for each question because there are many:

mydf$Q1 <-ifelse(mydf$Q1 == "apple", 1, 0)
mydf$Q2 <-ifelse(mydf$Q2 == "kiwi", 1, 0)

another one: mydf[1:2] <- +(mydf[1:2] == answer_key) Commented Apr 10, 2021 at 22:34

2 Answers

Perhaps this is what you're looking for?

library(dplyr)
mydf %>%
   mutate(across(Q1:Q2,~ +(. %in% answer_key)))
  Q1 Q2 ID
1  1  0 P1
2  0  1 P2
3  0  0 P3
4  0  0 P4
5  0  1 P5

Or a bit messy with base R:

mydf[,c("Q1","Q2")] <- sapply(mydf[,c("Q1","Q2")],function(x) +(x%in%answer_key))
mydf
  Q1 Q2 ID
1  1  0 P1
2  0  1 P2
3  0  0 P3
4  0  0 P4
5  0  1 P5
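
To illustrate why the whole-data-frame attempt in the question returns a short vector, and how a column-wise version keeps the data frame, here is a minimal base R sketch. The names `whole_frame`, `scored`, and `question_cols` are made up for this example and are not from the original posts.

```r
# Sample data from the question
mydf <- data.frame(
  Q1 = c("apple", "banana", "cider", "muffin", "chocolate"),
  Q2 = c("orange", "kiwi", "calzone", "cupcake", "cake"),
  ID = c("P1", "P2", "P3", "P4", "P5")
)
answer_key <- c("apple", "kiwi", "pizza", "dessert", "cake")

# A data frame is a list of columns, so %in% tests each whole column
# (not each cell) and returns one value per column.
whole_frame <- mydf %in% answer_key
length(whole_frame)  # one element per column, not one per cell

# Scoring column by column and assigning back into the selected columns
# keeps the data frame and leaves ID untouched.
question_cols <- c("Q1", "Q2")  # hypothetical: list your question columns here
scored <- mydf
scored[question_cols] <- lapply(
  scored[question_cols],
  function(x) as.integer(x %in% answer_key)
)
scored
```

Note that `%in% answer_key` checks membership in the entire key, so a response such as "cake" in Q1 would also score 1. If each question has exactly one correct answer, a per-question comparison may be a better fit.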

1 Comment

Or another way: mydf[1:2] <- +(`dim<-`(as.matrix(mydf[1:2]) %in% answer_key, dim(mydf[1:2])))

I hope this is what you are looking for:

library(dplyr)

mydf %>%
  mutate(across(Q1:Q2, ~ ifelse(.x %in% answer_key, 1, 0)))

  Q1 Q2 ID
1  1  0 P1
2  0  1 P2
3  0  0 P3
4  0  0 P4
5  0  1 P5
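
Since the question mentions counting up scores, one possible follow-up once the columns are 0/1 is a per-participant total. This is only a sketch: the `scored` data frame below is typed in from the output above, and the `total` column and `question_cols` vector are hypothetical names.

```r
# Scored 0/1 columns, copied from the output shown above
scored <- data.frame(
  Q1 = c(1, 0, 0, 0, 0),
  Q2 = c(0, 1, 0, 0, 1),
  ID = c("P1", "P2", "P3", "P4", "P5")
)

# Sum across the question columns only, so ID is not included
question_cols <- c("Q1", "Q2")  # hypothetical: list your question columns here
scored$total <- rowSums(scored[question_cols])
scored
```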

1 Comment

I edited my code because the previous version would not have worked on a large data set.
