I have dataset with multiple timepoints for e.g. col1, col2, col3, col4 with each column having similar categories (A,B,C,D)
|ID | col1 | col2| col3| col4| ColA
------------------------------
|1 |A |A |B |B
------------------------------
|2 |A |B |B |C
-------------------------------
|3 |B |B |C |C
-------------------------------
|4 |B |C |C |D
-------------------------------
|5 |C |C |D |D
--------------------------------
Now I want to create a new variable ("colA") such that if A is found in any of the col1-col4 then it will be assigned value 1 else value 0. Similar for 1 for B new in "colB", C for "colC", and D for "colD". Here ColA will have 1,1,0,0,0 and ColB will have 1,1,1,1,0 values so on.
I tried multiple options which I found here For eg. (without loop)
df %>%
mutate(colA=case_when(starts_with("col")=="A") ~ 1, TRUE~0)
or
(with a loop)
for (I in 1:4) {
df[,"colA"] <- ifelse(unlist(df[,paste0("col",i)]=="A",1,0)
}
but it did not work. in for loop, the data gets overwritten at the end of the loop, so colA reads as 0,0,0,0,0 as there are no A's in col4. I am using multiple datasets where range of i may vary from 1:2 to 1:4.
structure(list(id = 1:6, col1 = c(age11 = "A", age12 = "A", age13 = "B",
age14 = "A", age15 = "C", age16 = "B"), col2 = c(age21 = "A",
age22 = NA, age23 = "C", age24 = "A", age25 = "C", age26 = "B"
), col3 = c(age31 = "B", age32 = "B", age33 = NA, age34 = NA,
age35 = "C", age36 = "B"), col4 = c(age41 = "B", age42 = NA,
age43 = NA, age44 = NA, age45 = "D", age46 = "C")), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"))
dput(head(MY_DATA))and paste the output into the body of your question? That will create a code recipe to make an exact copy of the first 6 rows of your data.