test.vector <- c("jdoe","John Doe","jodoe","Sarah Scarlet","sscarlet","scarlet")
test.df <- data.frame("Full.Name" = c("John Doe","Sarah Scarlet"),
                      "alias1" = c("jdoe","sscarlet"),
                      "alias2" = c("jodoe","scarlet"))
want.vector <- c("John Doe","John Doe","John Doe","Sarah Scarlet","Sarah Scarlet","Sarah Scarlet")
> test.vector
[1] "jdoe" "John Doe" "jodoe" "Sarah Scarlet" "sscarlet" "scarlet"
> test.df
      Full.Name   alias1  alias2
1      John Doe     jdoe   jodoe
2 Sarah Scarlet sscarlet scarlet
> want.vector
[1] "John Doe" "John Doe" "John Doe" "Sarah Scarlet" "Sarah Scarlet" "Sarah Scarlet"
All the search results I found, like this one, involve exactly one match per value, and merge() or join() is used.
However, in my case each full name can have several aliases, and I am not sure how to approach this.
A few things I tried (with butchered syntax):
- str_replace(test.vector, test.df[,-1], test.df[,1])
- recode(test.vector, test.df)
- join with by = c(test.df[,-1], test.vector) after changing test.vector into a data frame
One thing to note is that the actual test.df I have for the project has multiple alias columns that are quite sparse (since each alias relates to a specific location/position). I am not sure whether that makes a significant difference compared with the example above.
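To make the goal concrete, here is a rough base-R sketch of the kind of lookup I have in mind. It is only one possible direction, not a confirmed solution: the names alias.cols, lookup, and result are made up for illustration. It assumes every column other than Full.Name holds aliases, that sparse cells are NA, and that no alias belongs to two different people.

```r
# Sample data from the question (stringsAsFactors = FALSE so columns stay character)
test.vector <- c("jdoe", "John Doe", "jodoe", "Sarah Scarlet", "sscarlet", "scarlet")
test.df <- data.frame(Full.Name = c("John Doe", "Sarah Scarlet"),
                      alias1 = c("jdoe", "sscarlet"),
                      alias2 = c("jodoe", "scarlet"),
                      stringsAsFactors = FALSE)

# Assumption: every column except Full.Name is an alias column
alias.cols <- setdiff(names(test.df), "Full.Name")

# Stack all alias columns into one long vector (column by column),
# and repeat the full names once per alias column so the two line up
aliases <- unlist(test.df[alias.cols], use.names = FALSE)
full.names <- rep(test.df$Full.Name, times = length(alias.cols))

# Drop empty (NA) cells, which would be common in a sparse table
keep <- !is.na(aliases)

# Named vector: names are aliases, values are full names;
# full names are also mapped to themselves
lookup <- c(setNames(full.names[keep], aliases[keep]),
            setNames(test.df$Full.Name, test.df$Full.Name))

# Look up each element; anything without a match comes back NA,
# so fall back to the original value in that case
result <- unname(lookup[test.vector])
result[is.na(result)] <- test.vector[is.na(result)]
```

For the sample data above, result should be identical to want.vector. Dropping the NA cells before building the lookup is what would let this cope with sparse alias columns, if my assumptions hold.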