I have the following dataframe:


 df <- data.frame(part = c(604, 604, 604, 604, 604, 604, 604, 604, 604, 604, 604, 604, 604, 604, 604, 604, 604, 604, 604, 604), trialN = c(10,13,14,17,19,21,23,31,32,34,35,37,39,41,44,45,47,49,51,53), goal = c(83, 83, 83, 83, 83, 83, 83, 83, 83, 83, 84, 84,84,84,84,84,84,84,84,84), task = c(200, 200,200,200,200,200,200,200,200,200,200,200,200,200,200,200,200,200,200,200), choice = c( 13,13,13,14,14,13,14,14,13,13,14,14,13,13,14,14,13,14,14,14), rt = c(5.566418,5.565599,5.205317,4.686274,5.132267,6.082986,5.874290,3.181723,3.556449,4.257331,5.494879,3.760212,4.260871,4.150411,3.395041,4.917050,2.693578,3.724043,5.593926,3.796483), maxValueL = c(86,95,34,27,66,85,42,99,95,59,36,96,71,98,38,31,98,7,92,64), maxValueR = c(62,99,32,85,38,60,82,65,78,13,47,91,5,43,89,33,10,99,17,49))


    part trialN goal task choice       rt maxValueL maxValueR
1   604     10   83  200     13 5.566418        86        62
2   604     13   83  200     13 5.565599        95        99
3   604     14   83  200     13 5.205317        34        32
4   604     17   83  200     14 4.686274        27        85
5   604     19   83  200     14 5.132267        66        38
6   604     21   83  200     13 6.082986        85        60
7   604     23   83  200     14 5.874290        42        82
8   604     31   83  200     14 3.181723        99        65
9   604     32   83  200     13 3.556449        95        78
10  604     34   83  200     13 4.257331        59        13
11  604     35   84  200     14 5.494879        36        47
12  604     37   84  200     14 3.760212        96        91
13  604     39   84  200     13 4.260871        71         5
14  604     41   84  200     13 4.150411        98        43
15  604     44   84  200     14 3.395041        38        89
16  604     45   84  200     14 4.917050        31        33
17  604     47   84  200     13 2.693578        98        10
18  604     49   84  200     14 3.724043         7        99
19  604     51   84  200     14 5.593926        92        17
20  604     53   84  200     14 3.796483        64        49

My aim is to convert columns 3 ("goal"), 4 ("task") and 5 ("choice") from their actual values to either 1 or 0, like this:

# 13=1,14=0 and 83=1, 84=0 and 200=1, 201=0

    part trialN goal task choice       rt maxValueL maxValueR
1   604     10    1    1      1 5.566418        86        62
2   604     13    1    1      1 5.565599        95        99
3   604     14    1    1      1 5.205317        34        32
4   604     17    1    1      0 4.686274        27        85
5   604     19    1    1      0 5.132267        66        38
6   604     21    1    1      1 6.082986        85        60
7   604     23    1    1      0 5.874290        42        82
8   604     31    1    1      0 3.181723        99        65
9   604     32    1    1      1 3.556449        95        78
10  604     34    1    1      1 4.257331        59        13
11  604     35    0    1      0 5.494879        36        47
12  604     37    0    1      0 3.760212        96        91
13  604     39    0    1      1 4.260871        71         5
14  604     41    0    1      1 4.150411        98        43
15  604     44    0    1      0 3.395041        38        89
16  604     45    0    1      0 4.917050        31        33
17  604     47    0    1      1 2.693578        98        10
18  604     49    0    1      0 3.724043         7        99
19  604     51    0    1      0 5.593926        92        17
20  604     53    0    1      0 3.796483        64        49

I tried the following code but it doesn't work:

for(i in 1:nrow(choices_part)){
  if(choices_part[i, 1] == c("goal", "task", "choice")){
    choices_part[i, 3:5] <- 1
  } 
  else {
    choices_part[i,length(choices_part)] <- choices_part[i, length(choices_part)]
  }
}

Can anyone help me with this?

1
  • @RonakShah I just want to put those 3 variables in the standard dummy format for analysis. When I run your code I get this: Error: cols must evaluate to column positions or names, not a list. Any idea? Commented Mar 4, 2020 at 8:58

2 Answers

3

You can do:

library(dplyr)
cols <- c('goal', 'task', 'choice')

df %>% mutate_at(vars(cols), ~as.integer(. %in% c(13, 83, 200)))

#   part trialN goal task choice       rt maxValueL maxValueR
#1   604     10    1    1      1 5.566418        86        62
#2   604     13    1    1      1 5.565599        95        99
#3   604     14    1    1      1 5.205317        34        32
#4   604     17    1    1      0 4.686274        27        85
#5   604     19    1    1      0 5.132267        66        38
#6   604     21    1    1      1 6.082986        85        60
#...
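
In dplyr 1.0.0 and later, mutate_at() is superseded by across(). The following is a minimal sketch of the same recoding in that style, assuming a recent dplyr is installed. df_demo and df_new are hypothetical names for a small stand-in data frame, not the question's data.

```r
library(dplyr)

# Hypothetical stand-in with the same three columns as the question's df
df_demo <- data.frame(goal   = c(83, 83, 84),
                      task   = c(200, 200, 200),
                      choice = c(13, 14, 14))
cols <- c("goal", "task", "choice")

# all_of() selects the columns named in an external character vector;
# .x stands for the values of each selected column in turn
df_new <- df_demo %>%
  mutate(across(all_of(cols), ~ as.integer(.x %in% c(13, 83, 200))))
```

Wrapping cols in all_of() makes it explicit that cols is a character vector of column names rather than a column of the data frame.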

Or in base R:

df[cols] <- lapply(df[cols], function(x) as.integer(x %in% c(13, 83, 200)))

This assumes that the columns goal, task and choice have no values in common, as in the example shared. If, say, 13 could also appear in goal, it would be coded as 1 there as well.
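
If the columns could share values, the %in% approach would no longer be safe. A base R sketch that compares each column only against its own "1" value might look like this. The stand-in data frame df_demo and the lookup vector one_value are hypothetical names, and the overlapping values (13 in goal, 83 in choice) are made up to illustrate the point.

```r
# Hypothetical data with deliberately overlapping values across columns
df_demo <- data.frame(goal   = c(83, 84, 13),
                      task   = c(200, 201, 200),
                      choice = c(13, 14, 83))

# The value that should become 1 in each column (mapping from the question)
one_value <- c(goal = 83, task = 200, choice = 13)

# Compare each column only against its own target value
for (col in names(one_value)) {
  df_demo[[col]] <- as.integer(df_demo[[col]] == one_value[[col]])
}
```

Note that every value other than the target becomes 0 here, not only 84, 201 and 14.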

2 Comments

Great solution, thanks so much, very elegant! One question, so I understand what you did in the first solution: what does ~ do? What does . do?
~ is another way of writing an anonymous function (like the one in lapply), known as formula syntax, whereas . refers to the values of each individual column.
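
To make the equivalence concrete, here is a small sketch comparing the two spellings. It assumes the rlang package is installed (dplyr depends on it). rlang::as_function() is used here only to illustrate a comparable formula-to-function conversion, and f_anon and f_formula are hypothetical names.

```r
library(rlang)

# Explicit anonymous function, as in the lapply() call
f_anon <- function(x) as.integer(x %in% c(13, 83, 200))

# Formula shorthand turned into an ordinary function;
# the dot stands for the function's first argument
f_formula <- as_function(~ as.integer(. %in% c(13, 83, 200)))

x <- c(13, 14, 83, 84)
same_result <- identical(f_anon(x), f_formula(x))
```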
2

You can convert the data frame to a data.table and do it like this:

library(data.table)
setDT(df)
df[, ':=' (goal   = ifelse(goal == 83, 1, 0),
           task   = ifelse(task == 200, 1, 0),
           choice = ifelse(choice == 13, 1, 0))]

2 Comments

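Nice solution! Easier to understand at my level of experience with R. What does ':=' mean?
':=' is data.table's assignment-by-reference operator. The functional form ':='(...) is used when creating or updating multiple columns at once; if you need just a single column, you can write goal := ... instead of the initial ':='.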
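
The two forms of := can be sketched side by side as follows, assuming data.table is installed. dt_demo is a hypothetical two-row table, and as.integer() is swapped in for ifelse() only to keep the lines short.

```r
library(data.table)

# Hypothetical stand-in for the question's data
dt_demo <- data.table(goal   = c(83, 84),
                      task   = c(200, 200),
                      choice = c(13, 14))

# Single column: infix form, modifies dt_demo in place
dt_demo[, goal := as.integer(goal == 83)]

# Several columns at once: functional form
dt_demo[, `:=`(task   = as.integer(task == 200),
               choice = as.integer(choice == 13))]
```

Because := works by reference, no reassignment such as dt_demo <- ... is needed after either call.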
