This question is related to a previous topic: How to use custom function to create new binary variables within existing dataframe?

I would like to use a similar function, but pass a vector that specifies which ICD9 diagnosis variables in the dataframe to search (e.g., "diag_1", "diag_2", "diag_3", etc.).

I tried

y <- c("diag_1", "diag_2", "diag_3")

diagnosis_func(patient_db, y, "2851", "Anemia")

but I get the following error:

Error in `[[<-`(`*tmp*`, i, value = value) : 
  recursive indexing failed at level 2 

Below is the working function by Benjamin from the referenced post. However, it works for only one diagnosis variable at a time. Ultimately, I need to create a new binary variable that indicates whether a patient has a specific diagnosis by querying the 25 diagnosis variables of the dataframe.

Note: target_col is the argument that takes the ICD9 diagnosis variables ("diag_1" ... "diag_20"); it is the one I would like to pass as a vector.

diagnosis_func <- function(data, target_col, icd, new_col){
  pattern <- sprintf("^(%s)", 
                 paste0(icd, collapse = "|"))

  data[[new_col]] <- grepl(pattern = pattern, 
                       x = data[[target_col]]) + 0L
  data
}

diagnosis_func(patient_db, "diag_1", "2851", "Anemia")

This non-function version works for multiple diagnosis variables. However, I have not figured out how to turn it into a function like the one above.

 pattern = paste("^(", paste0("2851", collapse = "|"), ")", sep = "")

df$anemia<-ifelse(rowSums(sapply(df[c("diag_1","diag_2","diag_3")], grepl, pattern = pattern)) != 0,"1","0")

Any help or guidance on how to get this function to work would be greatly appreciated.

Best, Albit

  • Probably better to feed the vector to lapply. Something like lapply(y, function(i) diagnosis_func(data=df, target_col=i, icd=icd, new_col=i)). Maybe you'd have to tweak your function a bit, but this would be the better route, I suspect. Commented Mar 14, 2017 at 14:22
  • Thanks lmo! will try this Commented Mar 14, 2017 at 14:37
  • Albit, the problem is that the grepl in Benjamin's function works on only one column of your data frame. Let's say you have multiple columns, target_col <- c("diag_1", "diag_2", "diag_3"). To apply grepl to all of them, you can try something like this: apply(data[target_col], 2, function(x) grepl(pattern=pattern, x)). Let me know if this works. Commented Mar 15, 2017 at 1:35
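
The point made in the last comment can be checked on a small example. This is only a sketch: the data frame `toy` and its values are made up for illustration and are not from the question. It shows that `[[` expects a single column name (a vector of names is treated as recursive indexing and fails), while `[` keeps several columns that grepl can then be applied to one by one.

```r
# Toy data frame; column names mirror the question, values are invented.
toy <- data.frame(
  diag_1 = c("2851", "4019", "25000"),
  diag_2 = c("4280", "2851", "4019"),
  stringsAsFactors = FALSE
)
cols <- c("diag_1", "diag_2")

# `[` with several names returns a data frame holding those columns.
single_bracket_ok <- is.data.frame(toy[cols])

# `[[` extracts exactly one element, so a vector of names is read as
# nested (recursive) indexing, which raises an error here.
double_bracket_fails <- inherits(try(toy[[cols]], silent = TRUE), "try-error")

# Run grepl on each selected column; the result is a logical matrix
# with one row per patient and one column per diagnosis variable.
hits <- sapply(toy[cols], grepl, pattern = "^(2851)")

# A patient is flagged if the code appears in any of the columns.
any_hit <- rowSums(hits) > 0
```

Collapsing the matrix with rowSums is the same step used in the non-function version in the question.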

1 Answer

Try this modified version of Benjamin's function:

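diagnosis_func <- function(data, target_col, icd, new_col){
  # Regex that matches any of the ICD9 codes at the start of the string
  pattern <- sprintf("^(%s)", 
                     paste0(icd, collapse = "|"))

  # One 0/1 column per diagnosis variable in target_col
  new <- apply(data[target_col], 2, function(x) grepl(pattern=pattern, x)) + 0L
  # 1 if the code was found in at least one diagnosis variable, else 0
  data[[new_col]] <- ifelse(rowSums(new)>0, 1,0)
  data
}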
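
As a quick check, here is a usage sketch on a small made-up data frame. The function is repeated so the block runs on its own; the contents of `patient_db` are invented for illustration and only the column names follow the question. Note that the pattern is anchored at the start of the string, so "28510" also counts as a match for "2851".

```r
# The modified function from above, repeated so this block is self-contained.
diagnosis_func <- function(data, target_col, icd, new_col) {
  pattern <- sprintf("^(%s)", paste0(icd, collapse = "|"))
  new <- apply(data[target_col], 2, function(x) grepl(pattern = pattern, x)) + 0L
  data[[new_col]] <- ifelse(rowSums(new) > 0, 1, 0)
  data
}

# Hypothetical patient data: four patients, three diagnosis variables.
patient_db <- data.frame(
  id     = 1:4,
  diag_1 = c("2851", "4019", "25000", "486"),
  diag_2 = c("4280", "28510", "4019", "5849"),
  diag_3 = c("486", "5849", "4280", "4019"),
  stringsAsFactors = FALSE
)

# Pass the diagnosis variables as a vector.
y <- c("diag_1", "diag_2", "diag_3")
result <- diagnosis_func(patient_db, y, "2851", "Anemia")

# Several codes can be searched at once; they are joined with "|".
result2 <- diagnosis_func(patient_db, y, c("2851", "4280"), "AnemiaOrCHF")

print(result$Anemia)
print(result2$AnemiaOrCHF)
```

One caveat: with a single-row data frame, apply returns a plain vector rather than a matrix, so rowSums would fail; that edge case is not handled in this sketch.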