I am converting extensive Stata code to R.
In Stata, if I have a series of variables such as var1, var2, var3, etc., and I want to recode a specific value wherever it appears in any of them, I can use the statement recode var* (9999 = -9). In this case, I want to change every 9999 in the variable series to NA.
HERE IS THE CODE I TRIED
data = data.frame(var1=c(5, 56, 9999, 56, 78, 51),
                  var2=c(9999, 56, 43, 56, 78, 9999),
                  var3=c(5, 34, 56, 78, 76, 79))
varlist=gsub(" ","",paste("data$var",1:3,sep=""))
varlist
summary(data$var2)
for (v in varlist){
  v[v=="9999"] = NA
}
summary(data$var2)
data$var2[data$var2==9999] = NA
summary(data$var2)
LOOP DIDN'T WORK, SINGLE ASSIGNMENT (data$var2[data$var2==9999] = NA) DID AS SHOWN BY WHAT THE THREE summary(data$var2) SHOW:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     43      56      67    3372    7519    9999

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     43      56      67    3372    7519    9999

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
  43.00   52.75   56.00   58.25   61.50   78.00       2
THE CONTENTS OF VARLIST ARE [1] "data$var1" "data$var2" "data$var3"
I ALSO TRIED THE FOLLOWING LOOP BASED ON WHAT I FOUND ON STACKOVERFLOW (https://forum.posit.co/t/using-variables-names-in-loops/128653/2):
for (i in 1:3) {
  variable = paste0("data$var", i)
  variable[variable==9999] = NA
}
IT ALSO DIDN'T WORK.
CLEARLY I'M MISSING SOMETHING HERE.
With dplyr loaded, data <- data |> mutate(across(starts_with("var"), \(x) ifelse(x == 9999, NA, x))) should give you what you want.
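As for why the loops in the question change nothing: v (and variable) holds the text "data$var1", not the column with that name, so the comparison and the assignment only ever touch that one string and never reach the data frame. If you would rather keep a loop in base R, here is a minimal sketch. It assumes the same example data as in the question, and it stores bare column names in varlist (my change, not in the original code) so that data[[v]] can look up each column by name.

```r
# Same example data as in the question
data <- data.frame(
  var1 = c(5, 56, 9999, 56, 78, 51),
  var2 = c(9999, 56, 43, 56, 78, 9999),
  var3 = c(5, 34, 56, 78, 76, 79)
)

# Bare column names ("var1", "var2", "var3"), without the "data$" prefix
varlist <- paste0("var", 1:3)

# data[[v]] fetches the column whose name is stored in the string v,
# so the assignment modifies the data frame itself
for (v in varlist) {
  data[[v]][data[[v]] == 9999] <- NA
}

summary(data$var2)
```

The loop can also be dropped entirely: data[varlist][data[varlist] == 9999] <- NA does the same replacement in one step.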