
I have two dataframes with different dimensions,

df1 <- data.frame(names= sample(LETTERS[1:10]), duration=sample(0:100, 10))

> df1
   names duration
1      J       97
2      G       57
3      H       53
4      A       23
5      E      100
6      D       90
7      C       73
8      F       60
9      B       37
10     I       67

df2 <- data.frame(names= LETTERS[1:5], names_new=letters[1:5])

> df2
  names names_new
1     A         a
2     B         b
3     C         c
4     D         d
5     E         e

I want to replace the values of df1$names that also occur in df2$names with the corresponding values of df2$names_new. My desired output would be:

> df1
   names duration
1      J       97
2      G       57
3      H       53
4      a       23
5      e      100
6      d       90
7      c       73
8      F       60
9      b       37
10     I       67

This is the code I'm using, but I wonder if there is a cleaner way to do it with fewer steps:

df2[,1] <- as.character(df2[,1])
df2[,2] <- as.character(df2[,2])
df1[,1] <- as.character(df1[,1])    

match(df1[,1], df2[,1]) -> id
which(!is.na(id)==TRUE) -> idx
id[!is.na(id)] -> id

df1[idx,1] <- df2[id,2]

Many thanks

You should set the seed (set.seed(1)) to make the example reproducible. Commented Jun 10, 2014 at 17:51

5 Answers

5

Here's an approach from qdapTools:

library(qdapTools)
df1$names <- df1$names %lc+% df2

%lc+% is a binary operator version of lookup: the left side holds the terms and the right side is the lookup table. The + means that any terms without a match revert to their original values. lookup is built on the data.table package and is pretty speedy.

Here is the output including set.seed(1) for reproducibility:

set.seed(1)
df1 <- data.frame(names = sample(LETTERS[1:10]), duration = sample(0:100, 10), stringsAsFactors = FALSE)
df2 <- data.frame(names = LETTERS[1:5], names_new = letters[1:5], stringsAsFactors = FALSE)

library(qdapTools)
df1$names <- df1$names %lc+% df2

df1

##    names duration
## 1      c       20
## 2      d       17
## 3      e       68
## 4      G       37
## 5      b       74
## 6      H       47
## 7      I       98
## 8      F       93
## 9      J       35
## 10     a       71
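
For readers who would rather not add a dependency, the same look-up-with-fallback idea can be sketched in base R with a named character vector. This is only an illustration of the behaviour, not how qdapTools implements it. The names `lut` and `hit` are made up for the example, and a small fixed data frame is used instead of the sampled one.

```r
# Small fixed example; both columns are character, not factor
df1 <- data.frame(names = c("J", "A", "E", "F", "B"),
                  duration = c(97, 23, 100, 60, 37),
                  stringsAsFactors = FALSE)
df2 <- data.frame(names = LETTERS[1:5], names_new = letters[1:5],
                  stringsAsFactors = FALSE)

# Named vector used as the lookup table:
# element names are the old values, elements are the new values
lut <- setNames(df2$names_new, df2$names)

# Replace only the entries that have a key in the table;
# everything else keeps its original value
hit <- df1$names %in% names(lut)
df1$names[hit] <- unname(lut[df1$names[hit]])
```

Indexing `lut` by name handles repeated values in df1$names as well, since each element is looked up independently.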

1 Comment

Extremely slick and fast solution!
2

Are all names in df2 also in df1? And do you intend to keep them as a factor? If so, you might find this solution helpful.

idx <- match(levels(df2$names), levels(df1$names))
levels(df1$names)[idx] <- levels(df2$names_new)
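
If some names in df2 might be missing from df1, the same idea can be adapted by matching in the other direction, from the levels of df1$names to df2$names. A minimal sketch, assuming both columns are factors (created here explicitly with stringsAsFactors = TRUE); `lv`, `m` and `found` are illustrative names.

```r
df1 <- data.frame(names = c("J", "A", "C", "F"),
                  duration = c(97, 23, 73, 60),
                  stringsAsFactors = TRUE)
df2 <- data.frame(names = LETTERS[1:5], names_new = letters[1:5],
                  stringsAsFactors = TRUE)

lv <- levels(df1$names)                    # levels present in df1
m  <- match(lv, as.character(df2$names))   # row of each level in df2, NA if absent
found <- !is.na(m)

# Relabel only the levels that have a replacement; the column stays a factor
levels(df1$names)[found] <- as.character(df2$names_new)[m[found]]
```

Because only the level labels change, no row of df1 is touched and levels without a match are left as they were.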


1

This works but requires that names and names_new are character and not factor.

set.seed(1)
df1 <- data.frame(names = sample(LETTERS[1:10]), duration = sample(0:100, 10), stringsAsFactors = FALSE)
df2 <- data.frame(names = LETTERS[1:5], names_new = letters[1:5], stringsAsFactors = FALSE)


rownames(df1) <- df1$names
df1[df2$names,]$names <- df2$names_new


0

Another option using merge:

transform(merge(df1,df2,all.x=TRUE),
          names=ifelse(is.na(names_new),as.character(names),
                                        as.character(names_new)))
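
Note that merge sorts the result by the key column by default (sort = TRUE) and keeps names_new as an extra column. If the original row order matters, one possible workaround is sketched below; the helper column `row_id` and the result name `out` are assumptions made for the illustration, and the data is a small fixed example.

```r
df1 <- data.frame(names = c("J", "A", "E", "F", "B"),
                  duration = c(97, 23, 100, 60, 37),
                  stringsAsFactors = FALSE)
df2 <- data.frame(names = LETTERS[1:5], names_new = letters[1:5],
                  stringsAsFactors = FALSE)

# Remember the original row order before merging
df1$row_id <- seq_len(nrow(df1))

# Left join: every row of df1 is kept, names_new is NA where there is no match
out <- merge(df1, df2, by = "names", all.x = TRUE)

# Use the new name where one exists, otherwise keep the old one
out$names <- ifelse(is.na(out$names_new), out$names, out$names_new)

# Restore the original order and drop the helper columns
out <- out[order(out$row_id), c("names", "duration")]
rownames(out) <- NULL
```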


0

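Another way using match would be the following (if df1$names and df2$names_new are character, of course). It assumes every value of df2$names occurs in df1$names; otherwise match returns NA and the assignment fails.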

df1[match(df2$names, df1$names), "names"] <- df2$names_new
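
The one-liner above only reaches the first occurrence of each name in df1, because match returns first positions. A hedged variant that matches in the other direction copes with repeated names in df1 and with df2 entries that never appear in df1. The data below is a made-up example chosen to show both cases, and `m` and `found` are illustrative names.

```r
df1 <- data.frame(names = c("A", "J", "A", "B"),
                  duration = c(1, 2, 3, 4),
                  stringsAsFactors = FALSE)
df2 <- data.frame(names = c("A", "B", "Z"),
                  names_new = c("a", "b", "z"),
                  stringsAsFactors = FALSE)

# For each row of df1: the matching row of df2, or NA if there is none
m <- match(df1$names, df2$names)
found <- !is.na(m)

# Overwrite only the rows that found a match
df1$names[found] <- df2$names_new[m[found]]
```

Here both "A" rows are renamed, "J" is left alone, and the unused "Z" entry in df2 causes no error.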

