1

I have two dataframes, each with the same columns. Some columns have the same values in the same order in both dataframes (X1, X2 below). Other columns have the same values, but in a different order (Y1). This is only a problem for some levels of first variables (here, the order of rows in Y1 differs for X1 == "a", but not X1 == "b"). Example:

df1 <- data.frame("X1" = c("a", "a", "a", "b", "b", "b"),
                  "X2" = c("1", "2", "3", "1", "2", "3"),
                  "Y1" = c("d", "d", "f", "g", "h", "i"))
df2 <- data.frame("X1" = c("a", "a", "a", "b", "b", "b"),
                  "X2" = c("1", "2", "3", "1", "2", "3"),
                  "Y1" = c("f", "d", "d", "g", "h", "i"))

I would like to change the values of df2$X1 and df2$X2 such that the two dataframes are matched on values of Y1.

I would like to change X1 and X2 rather than Y1 because there are many Y variables. I would like to do this only for df$X1 == "a".

The output should looks like this:

df2 <- data.frame("X1" = c("a", "a", "a", "b", "b", "b"),
                  "X2" = c("3", "1", "2", "1", "2", "3"),
                  "Y1" = c("f", "d", "d", "g", "h", "i"))
5
  • Do you just want to arrange them differently? Commented Oct 25, 2019 at 22:32
  • Well, I want to arrange the X1, X2 columns differently so that they align with the Y1 column the same way they do in the other dataframe. Commented Oct 25, 2019 at 23:40
  • could you please provide your desired output as a data.frame Commented Oct 26, 2019 at 0:06
  • Done - apologies for the unclear description. Commented Oct 26, 2019 at 0:50
  • Your duplicate values make this more complicated (you have two values of d for Y1 which correspond to different values of X2) Commented Oct 26, 2019 at 3:19

1 Answer 1

1

What is a little tricky in your situation is that you have duplicates in the Y1 columns which correspond to different values in the X2 columns. So you will have to make these unique.

First, make sure that your Y1 columns are character vectors and not factors:

df1 <- data.frame("X1" = c("a", "a", "a", "b", "b", "b"),
                  "X2" = c("1", "2", "3", "1", "2", "3"),
                  "Y1" = c("d", "d", "f", "g", "h", "i"),
                  stringsAsFactors = F)

df2 <- data.frame("X1" = c("a", "a", "a", "b", "b", "b"),
                  "X2" = c("1", "2", "3", "1", "2", "3"),
                  "Y1" = c("f", "d", "d", "g", "h", "i"),
                  stringsAsFactors = F)

Give unique names to your Y1 duplicates:

df1$Y1uniq <- make.unique(df1$Y1)

df2$Y1uniq <- make.unique(df2$Y1)

Then you can use match() using those uniques values (and remove that column once you don't need it anymore):

df1[match(df2$Y1uniq, df1$Y1uniq), ][ , 1:3]

Output:

  X1 X2 Y1
3  a  3  f
1  a  1  d
2  a  2  d
4  b  1  g
5  b  2  h
6  b  3  i
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.