
I am a new R user and I have been trying to execute an if statement nested within a for loop in order to code a new variable. I have a data.frame where some guys previously forgot to code the "condition" variable (a factor with 3 levels: old, new, lure) from E-prime. The task has two phases, encoding and retrieval (Block1 and Block2), two sets of images (A and B), and a unique Word ID.

So I have this:

phase <- rep(c("Block1", "Block2"), each = 7)
condition <- rep(NA, times = 14)
setAorB <- rep(c("A", "B"), times = c(9,5)) 
WordID <- c(23,34,56,76,45,88,99,23,34,56,76,45,100,105)

loris_data <- data.frame(phase,condition,setAorB,WordID) 

which gives me:

> loris_data
   phase     condition setAorB WordID
1  Block1        NA       A     23
2  Block1        NA       A     34
3  Block1        NA       A     56
4  Block1        NA       A     76
5  Block1        NA       A     45
6  Block1        NA       A     88
7  Block1        NA       A     99
8  Block2        NA       A     23
9  Block2        NA       A     34
10 Block2        NA       B     56
11 Block2        NA       B     76
12 Block2        NA       B     45
13 Block2        NA       B    100
14 Block2        NA       B    105

What I would like to achieve is: at retrieval (Block2), if setAorB is "A", then condition is "old". I tried this basic loop but, obviously, it only works for the old items, because it does not distinguish lures from new items.

for(i in 1:length(loris_data$condition)) {
      if(loris_data$setAorB[i] == "A") {
            loris_data$condition[i] <-"old"}
      else {
            loris_data$condition[i] <- "new"
      }
    }

Then, I would like to say: if setAorB is "B" and the WordID is the same as one in A (which means it is a lure), then the condition is "lure"; otherwise, if setAorB is "B" but it has a unique WordID, then the condition is "new".

This would be the expected output:

> loris_data
    phase    condition setAorB WordID
1  Block1      <NA>       A     23
2  Block1      <NA>       A     34
3  Block1      <NA>       A     56
4  Block1      <NA>       A     76
5  Block1      <NA>       A     45
6  Block1      <NA>       A     88
7  Block1      <NA>       A     99
8  Block2       old       A     23
9  Block2       old       A     34
10 Block2      lure       B     56
11 Block2      lure       B     76
12 Block2      lure       B     45
13 Block2       new       B    100
14 Block2       new       B    105

Can anyone help with this code as I am still learning and I am struggling quite a lot?

  • I guess it's easy to achieve what you want, but please post the expected final output. Commented Nov 11, 2017 at 14:06
  • I have just edited the post to include the expected final output. Commented Nov 11, 2017 at 14:23
  • Perhaps you need library(data.table); setDT(loris_data)[phase == "Block2", condition := c('new', 'old', 'lure')[as.integer(factor(1 + 2*(setAorB == "A") + 4 * (setAorB == "B" & WordID %in% loris_data$WordID[loris_data$setAorB=="A"])))] ]. Convert the column condition to character first, or create it with condition <- rep(NA_character_, times = 14). Commented Nov 11, 2017 at 14:29

3 Answers


Quick and dirty solution using data.table:

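library(data.table)
setDT(loris_data)  # convert the data.frame to a data.table by reference
# Start with "old" for set A and "new" for set B
loris_data[, condition := ifelse(setAorB == "A", "old", "new")]
# Only retrieval (Block2) rows get a condition
loris_data[phase != "Block2", condition := NA_character_]
# Set B words already seen in Block1 are lures
loris_data[phase == "Block2" & setAorB == "B" & WordID %in% loris_data[phase == "Block1", WordID], condition := "lure"]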
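
For comparison, the same three steps can probably be written against a plain data.frame with logical indexing. This is only a sketch in base R: the names loris_df, is_retrieval, is_set_b and seen_before are made up for the illustration, and it assumes (as the data.table code does) that a lure is a set B word that already appeared in Block1.

```r
# Rebuild the example data from the question (base R only).
loris_df <- data.frame(
  phase     = rep(c("Block1", "Block2"), each = 7),
  condition = NA_character_,
  setAorB   = rep(c("A", "B"), times = c(9, 5)),
  WordID    = c(23, 34, 56, 76, 45, 88, 99, 23, 34, 56, 76, 45, 100, 105),
  stringsAsFactors = FALSE
)

# Logical masks for the rows of interest (hypothetical helper names).
is_retrieval <- loris_df$phase == "Block2"
is_set_b     <- loris_df$setAorB == "B"
# Assumption: a lure is a set B word that was also shown during Block1.
seen_before  <- loris_df$WordID %in% loris_df$WordID[loris_df$phase == "Block1"]

# Fill in the retrieval rows; Block1 rows keep their NA.
loris_df$condition[is_retrieval & !is_set_b] <- "old"
loris_df$condition[is_retrieval & is_set_b]  <- "new"
loris_df$condition[is_retrieval & is_set_b & seen_before] <- "lure"

loris_df
```

The last assignment overwrites "new" with "lure" only where the word was seen before, which mirrors the order of the data.table statements.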

3 Comments

@LorisNaspi Happy to help :-)
Would you suggest data.tables over data.frames?
@LorisNaspi if your data is big, then I would go for data.table

A solution using dplyr. loris_data2 is the final output.

library(dplyr)

loris_data2 <- loris_data %>%
  group_by(WordID) %>%
  mutate(WordID_count = row_number()) %>%
  ungroup() %>%
  mutate(condition = case_when(
    phase %in% "Block2" & setAorB %in% "A"                        ~ "old",
    phase %in% "Block2" & setAorB %in% "B" & WordID_count > 1     ~ "lure",
    phase %in% "Block2" & setAorB %in% "B" & WordID_count == 1    ~ "new",
    TRUE                                                          ~ NA_character_
  )) %>%
  select(-WordID_count)

loris_data2
# # A tibble: 14 x 4
#     phase condition setAorB WordID
#    <fctr>     <chr>  <fctr>  <dbl>
#  1 Block1      <NA>       A     23
#  2 Block1      <NA>       A     34
#  3 Block1      <NA>       A     56
#  4 Block1      <NA>       A     76
#  5 Block1      <NA>       A     45
#  6 Block1      <NA>       A     88
#  7 Block1      <NA>       A     99
#  8 Block2       old       A     23
#  9 Block2       old       A     34
# 10 Block2      lure       B     56
# 11 Block2      lure       B     76
# 12 Block2      lure       B     45
# 13 Block2       new       B    100
# 14 Block2       new       B    105

Explanation

My solution first creates a new column called WordID_count, which records, for each row, how many times that WordID has appeared so far (1 for the first occurrence, 2 for the second, and so on). This is done by the following code.

loris_data %>%
  group_by(WordID) %>%
  mutate(WordID_count = row_number()) %>%
  ungroup()

# # A tibble: 14 x 5
#     phase condition setAorB WordID WordID_count
#    <fctr>     <lgl>  <fctr>  <dbl>        <int>
#  1 Block1        NA       A     23            1
#  2 Block1        NA       A     34            1
#  3 Block1        NA       A     56            1
#  4 Block1        NA       A     76            1
#  5 Block1        NA       A     45            1
#  6 Block1        NA       A     88            1
#  7 Block1        NA       A     99            1
#  8 Block2        NA       A     23            2
#  9 Block2        NA       A     34            2
# 10 Block2        NA       B     56            2
# 11 Block2        NA       B     76            2
# 12 Block2        NA       B     45            2
# 13 Block2        NA       B    100            1
# 14 Block2        NA       B    105            1
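
If it helps to see the counting step on its own, the same running count can be sketched in base R with ave(). This is not part of the dplyr pipeline, just an illustration on the WordID vector from the question.

```r
# WordID values from the question, in row order.
WordID <- c(23, 34, 56, 76, 45, 88, 99, 23, 34, 56, 76, 45, 100, 105)

# Within each WordID, number the occurrences 1, 2, ... in row order.
WordID_count <- ave(seq_along(WordID), WordID, FUN = seq_along)

WordID_count
# [1] 1 1 1 1 1 1 1 2 2 2 2 2 1 1
```

Note that this is a running count, not a total: only the second and later occurrences get a value above 1. That is enough here because the Block1 rows come before the Block2 rows.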

After that, the condition column is ready to be filled in. This is done by the following code.

mutate(condition = case_when(
    phase %in% "Block2" & setAorB %in% "A"                        ~ "old",
    phase %in% "Block2" & setAorB %in% "B" & WordID_count > 1     ~ "lure",
    phase %in% "Block2" & setAorB %in% "B" & WordID_count == 1    ~ "new",
    TRUE                                                          ~ NA_character_
  ))

mutate is the function that creates a new column or updates an existing one. case_when is an alternative to multiple nested ifelse statements. The code does the following:

  1. If phase matches Block2 and setAorB matches A, the condition is old.

  2. If phase matches Block2, setAorB matches B, and WordID_count is larger than 1, the condition is lure.

  3. If phase matches Block2, setAorB matches B, and WordID_count is equal to 1, the condition is new.

  4. In all other cases, the condition is NA.
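
Since the question started from an if statement inside a for loop, the four rules above could also be sketched as an explicit loop in base R. The data frame name loris_df is hypothetical, and the running count is computed with ave() instead of dplyr; treat this as an illustration of the rules rather than a replacement for the case_when code.

```r
# Rebuild the example data from the question.
loris_df <- data.frame(
  phase     = rep(c("Block1", "Block2"), each = 7),
  condition = NA_character_,
  setAorB   = rep(c("A", "B"), times = c(9, 5)),
  WordID    = c(23, 34, 56, 76, 45, 88, 99, 23, 34, 56, 76, 45, 100, 105),
  stringsAsFactors = FALSE
)

# Running count of each WordID in row order (1 = first occurrence).
loris_df$WordID_count <- ave(seq_len(nrow(loris_df)), loris_df$WordID,
                             FUN = seq_along)

for (i in seq_len(nrow(loris_df))) {
  if (loris_df$phase[i] == "Block2" && loris_df$setAorB[i] == "A") {
    loris_df$condition[i] <- "old"    # rule 1
  } else if (loris_df$phase[i] == "Block2" && loris_df$WordID_count[i] > 1) {
    loris_df$condition[i] <- "lure"   # rule 2: set B, seen before
  } else if (loris_df$phase[i] == "Block2") {
    loris_df$condition[i] <- "new"    # rule 3: set B, first occurrence
  }
  # Rule 4: Block1 rows are left as NA.
}

# Drop the helper column, as select(-WordID_count) does.
loris_df$WordID_count <- NULL
loris_df
```

The vectorised case_when version is usually preferable in R; the loop is shown only because it maps one-to-one onto the numbered rules.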

The final piece of the code is select(-WordID_count), which simply removes the WordID_count column as it is not part of the original data frame.

1 Comment

Thank you very much

Have you tried an ifelse approach? ifelse is a built-in R function that applies a test to every element of a vector and returns one value where the test is TRUE and another where it is FALSE. For example:

# Test setAorB (not condition, which is still all NA) and store the result in condition
loris_data$condition <- ifelse(test = loris_data$setAorB == "A",
       yes =  "old", 
       no = "new")

If you would like to nest another ifelse inside the no argument, that works perfectly well. Let me know if it works.
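
A nested version might look like the sketch below. It assumes that a lure is a set B word whose WordID already appeared in Block1; loris_df and in_block1 are names invented for the example.

```r
# Rebuild the example data from the question.
loris_df <- data.frame(
  phase     = rep(c("Block1", "Block2"), each = 7),
  condition = NA_character_,
  setAorB   = rep(c("A", "B"), times = c(9, 5)),
  WordID    = c(23, 34, 56, 76, 45, 88, 99, 23, 34, 56, 76, 45, 100, 105),
  stringsAsFactors = FALSE
)

# WordIDs shown at encoding (assumption: these define what counts as a lure).
in_block1 <- loris_df$WordID[loris_df$phase == "Block1"]

loris_df$condition <- ifelse(
  loris_df$phase != "Block2", NA_character_,          # encoding rows stay NA
  ifelse(loris_df$setAorB == "A", "old",              # retrieval, set A
         ifelse(loris_df$WordID %in% in_block1,
                "lure",                               # set B, seen before
                "new"))                               # set B, never seen
)

loris_df
```

Each ifelse handles one decision, and the no branch passes the remaining rows on to the next test.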

1 Comment

loris_data$condition <- ifelse(test = loris_data[, "setAorB"] == "A", yes = "old", no = "new") This code does not discriminate between "lure" and "new" items, though. Please look at my expected output.
