I have a large data set grouped by two identifiers (Group and ID), with an Initial column that lists the elements present in an initial time period and a Post column that lists the elements that occur after that period. A working example is below:
SampleDF <- data.frame(Group = c(0, 0, 1), ID = c(2, 2, 3),
                       Initial = c('F28D,G06F', 'F24J', 'G01N'),
                       Post = c('G06F', 'H02G', 'F23C,H02G,G01N'))
I want to compare elements in Initial and Post for each Group/ID combination to find out when elements match, when only new elements exist, and when both pre-existing and new elements exist. Ideally, I would like to end up with a new Type variable with the following output:
SampleDF <- cbind(SampleDF, Type = c(0, 1, 2))
where (relative to Initial) 0 indicates that there are no new element(s) in Post, 1 indicates that there are only new element(s) in Post, and 2 indicates that there are both pre-existing and new element(s) in Post.
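To make the rule concrete, here is a minimal sketch of how it could be computed in base R. It assumes the comparison is done row by row (the sample has two rows with the same Group/ID pair), that elements are comma-separated, and that stray whitespace around an element should be ignored. The helper name `classify_type` is hypothetical, not something from an existing package.

```r
# Hypothetical helper: classify one Initial/Post pair according to the rule above.
# Assumes comma-separated elements; surrounding whitespace is trimmed.
classify_type <- function(initial, post) {
  init_set <- trimws(strsplit(initial, ",", fixed = TRUE)[[1]])
  post_set <- trimws(strsplit(post, ",", fixed = TRUE)[[1]])
  is_new <- !(post_set %in% init_set)  # TRUE for Post elements absent from Initial
  if (!any(is_new)) {
    0  # no new elements in Post
  } else if (all(is_new)) {
    1  # only new elements in Post
  } else {
    2  # both pre-existing and new elements in Post
  }
}

# Sample data rebuilt here so the sketch runs on its own.
SampleDF <- data.frame(
  Group = c(0, 0, 1),
  ID = c(2, 2, 3),
  Initial = c("F28D,G06F", "F24J", "G01N"),
  Post = c("G06F", "H02G", "F23C,H02G,G01N"),
  stringsAsFactors = FALSE
)

# Apply the helper to each row (row-wise comparison is an assumption).
SampleDF$Type <- mapply(classify_type, SampleDF$Initial, SampleDF$Post,
                        USE.NAMES = FALSE)
```

On the three sample rows this reproduces the desired Type values. If the comparison should instead pool all rows that share a Group/ID pair, the Initial and Post strings would need to be combined per group before calling the helper.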