R replace values in column based on match between columns

Question

I have two dataframes, each with the same columns. Some columns have the same values in the same order in both dataframes (X1, X2 below). Other columns have the same values, but in a different order (Y1). This is only a problem for some levels of the first variable (here, the order of rows in Y1 differs for X1 == "a", but not for X1 == "b"). Example:

df1 <- data.frame("X1" = c("a", "a", "a", "b", "b", "b"),
                  "X2" = c("1", "2", "3", "1", "2", "3"),
                  "Y1" = c("d", "d", "f", "g", "h", "i"))
df2 <- data.frame("X1" = c("a", "a", "a", "b", "b", "b"),
                  "X2" = c("1", "2", "3", "1", "2", "3"),
                  "Y1" = c("f", "d", "d", "g", "h", "i"))

I would like to change the values of df2$X1 and df2$X2 such that the two dataframes are matched on values of Y1.

I would like to change X1 and X2 rather than Y1 because there are many Y variables. I would like to do this only for rows where X1 == "a".

The output should look like this:

df2 <- data.frame("X1" = c("a", "a", "a", "b", "b", "b"),
                  "X2" = c("3", "1", "2", "1", "2", "3"),
                  "Y1" = c("f", "d", "d", "g", "h", "i"))

Well, I want to arrange the X1, X2 columns differently so that they align with the Y1 column the same way they do in the other dataframe.
– simoncolumbus, Commented Oct 25, 2019 at 23:40
Could you please provide your desired output as a data.frame?
– mnist, Commented Oct 26, 2019 at 0:06
Your duplicate values make this more complicated (you have two values of d for Y1 which correspond to different values of X2).
– prosoitos, Commented Oct 26, 2019 at 3:19

prosoitos · Accepted Answer · 2019-10-26 03:38:47Z

What is a little tricky in your situation is that you have duplicates in the Y1 columns which correspond to different values in the X2 columns. So you will have to make these unique.

First, make sure that your Y1 columns are character vectors and not factors (before R 4.0.0, data.frame() converted strings to factors by default):

df1 <- data.frame("X1" = c("a", "a", "a", "b", "b", "b"),
                  "X2" = c("1", "2", "3", "1", "2", "3"),
                  "Y1" = c("d", "d", "f", "g", "h", "i"),
                  stringsAsFactors = FALSE)

df2 <- data.frame("X1" = c("a", "a", "a", "b", "b", "b"),
                  "X2" = c("1", "2", "3", "1", "2", "3"),
                  "Y1" = c("f", "d", "d", "g", "h", "i"),
                  stringsAsFactors = FALSE)
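
If your data frames already exist and the column is a factor, you probably do not need to rebuild them. A minimal sketch of converting in place, using a made-up data frame (df_old is a hypothetical name, not part of the question):

```r
# Hypothetical example: a data frame whose string column was stored as a factor
df_old <- data.frame(Y1 = c("d", "d", "f"), stringsAsFactors = TRUE)
is.factor(df_old$Y1)       # TRUE

# as.character() returns the level labels, not the underlying integer codes
df_old$Y1 <- as.character(df_old$Y1)
is.character(df_old$Y1)    # TRUE
```

The conversion matters here because make.unique() only accepts character vectors.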

Give unique names to your Y1 duplicates:

df1$Y1uniq <- make.unique(df1$Y1)

df2$Y1uniq <- make.unique(df2$Y1)
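
To see what this step does, here is a small sketch on the bare Y1 vectors from the question (y1_df1, y1_df2, u1 and u2 are names introduced only for this illustration). make.unique() leaves the first occurrence of a value alone and appends .1, .2, ... to later ones, so the repeated "d" values are paired up in order of appearance:

```r
y1_df1 <- c("d", "d", "f", "g", "h", "i")
y1_df2 <- c("f", "d", "d", "g", "h", "i")

u1 <- make.unique(y1_df1)  # "d" "d.1" "f" "g" "h" "i"
u2 <- make.unique(y1_df2)  # "f" "d" "d.1" "g" "h" "i"

# Every key is now distinct, so match() has exactly one candidate per value
anyDuplicated(u1)          # 0
```

Note that this pairing by order of appearance is a choice, not something the data dictates: with identical Y1 values there is nothing else in this example to tell the two "d" rows apart.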

Then you can use match() on those unique values (and drop the helper column once you no longer need it; here, selecting columns 1:3 leaves it out):

df1[match(df2$Y1uniq, df1$Y1uniq), 1:3]

Output:

  X1 X2 Y1
3  a  3  f
1  a  1  d
2  a  2  d
4  b  1  g
5  b  2  h
6  b  3  i
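
The question asks to change X1 and X2 in df2 only for rows where X1 == "a". One possible variant of the same idea is sketched below. It is an adaptation, not a separately tested solution, and it assumes that Y1 values are matched within the same X1 group and that repeated values are paired in order of appearance. The names rows_a, key1, key2 and idx are introduced here for illustration:

```r
df1 <- data.frame(X1 = c("a", "a", "a", "b", "b", "b"),
                  X2 = c("1", "2", "3", "1", "2", "3"),
                  Y1 = c("d", "d", "f", "g", "h", "i"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(X1 = c("a", "a", "a", "b", "b", "b"),
                  X2 = c("1", "2", "3", "1", "2", "3"),
                  Y1 = c("f", "d", "d", "g", "h", "i"),
                  stringsAsFactors = FALSE)

# Rows of df2 that should be realigned
rows_a <- df2$X1 == "a"

# Build keys from X1 and Y1 so values are only matched within the same group,
# then make repeated keys distinct
key1 <- make.unique(paste(df1$X1, df1$Y1, sep = "_"))
key2 <- make.unique(paste(df2$X1, df2$Y1, sep = "_"))

# For each selected row of df2, find the df1 row with the same key
idx <- match(key2[rows_a], key1)

# Overwrite X1 and X2 in those rows only; all Y columns stay untouched
df2[rows_a, c("X1", "X2")] <- df1[idx, c("X1", "X2")]
df2
```

Because only X1 and X2 are written back, this works the same way no matter how many Y columns the data frames have. If any key in df2 has no counterpart in df1, idx will contain NA, so it is worth checking anyNA(idx) before assigning.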
