I have the following dataframe in R:
df <- data.frame(Sample_name = c("01_00H_NA_DNA", "01_00H_NA_RNA", "01_00H_NA_S", "01_00H_NW_DNA", "01_00H_NW_RNA", "01_00H_NW_S", "01_00H_OM_DNA", "01_00H_OM_RNA", "01_00H_OM_S", "01_00H_RL_DNA", "01_00H_RL_RNA", "01_00H_RL_S"),
Pair = c("","", "S1","","","S2","","","S3","", "","S5"))
I would like to generate a new variable Label such that rows whose Sample_name is identical up to the last underscore (that is, everything before the DNA, RNA or S suffix) get the same numeric label ID. Not every row will start with 01_00H, but there will always be a shared prefix up to the last underscore to group on.
Furthermore, I would also like to fill the Pair variable so that all rows with the same label share one value: S1 for every row of the first label, and so on. The existing Pair values are not consecutive, i.e. S3 is followed by S5.
Resulting dataframe will look something like this:
This has been incredibly hard to do. I followed "How to create new column in dataframe based on partial string matching other column in R", but it only helped partially, for direct 1:1 renaming.
Any solutions from useRs will be much appreciated. Thanks!

Does factor(tmp <- sub("(^.+)_(DNA|RNA|S)$", "\\1", df$Sample_name), labels=seq_along(unique(tmp))) for instance work for your real data?

Label in my df. How do I complete the second part of the question, where all identical labels get a corresponding Pair value (i.e. the missing rows in Pair get the same Pair ID, S1 or S2 or S3, as the one available value for that group in the original df)?
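
One possible way to cover both parts in base R is sketched below. It is not the only approach and rests on some assumptions: every Sample_name ends in _DNA, _RNA or _S, each group has at most one non-empty Pair value, and missing Pair entries are empty strings (or NA). The helper name `key` is introduced here only for illustration.

```r
df <- data.frame(
  Sample_name = c("01_00H_NA_DNA", "01_00H_NA_RNA", "01_00H_NA_S",
                  "01_00H_NW_DNA", "01_00H_NW_RNA", "01_00H_NW_S",
                  "01_00H_OM_DNA", "01_00H_OM_RNA", "01_00H_OM_S",
                  "01_00H_RL_DNA", "01_00H_RL_RNA", "01_00H_RL_S"),
  Pair = c("", "", "S1", "", "", "S2", "", "", "S3", "", "", "S5"),
  stringsAsFactors = FALSE
)

# Grouping key: Sample_name with the trailing _DNA, _RNA or _S removed
# (assumes every name ends in one of these suffixes).
key <- sub("_(DNA|RNA|S)$", "", df$Sample_name)

# Number the groups in order of first appearance.
df$Label <- match(key, unique(key))

# Within each group, copy the first non-empty Pair value to every row.
# A group with no non-empty value ends up as NA.
df$Pair <- ave(df$Pair, key,
               FUN = function(p) p[!is.na(p) & p != ""][1])

print(df)
```

Unlike the factor() suggestion, match(key, unique(key)) numbers the groups by first appearance rather than by sorted order; for the sample data both give the same labels.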