
I have a huge data frame df containing numbers in the column "a". I also have a data frame name that contains the names corresponding to these numbers.

df:                          
 a   b     c                   name:
 1   val1  val2                  1  cat
 1   val1  val2                  2  dog
 2   val1  val2                  3  rabbit
 3   val1  val2
 3   val1  val2
 3   val1  val2

Now I want to replace the numbers with the names. The new data frame should look like this:

df:                                      
   a        b     c      
   cat      val1  val2                  
   cat      val1  val2                  
   dog      val1  val2                  
   rabbit   val1  val2
   rabbit   val1  val2
   rabbit   val1  val2

This is how I implemented it. It works, but I am not happy with it because the names are hard-coded:

  df$a<-replace(df$a, df$a==1, "cat" )
  df$a<-replace(df$a, df$a==2, "dog" )
  df$a<-replace(df$a, df$a==3, "rabbit" )

How can I get the new values from my data frame name?

Thank you!

  • Just a hint: you can look at the join functions in dplyr or data.table, for example. Commented Jul 15, 2017 at 18:27
  • df$a <- factor(df$a, levels = name[, 1], labels = name[, 2]) Commented Jul 15, 2017 at 18:29
  • Thank you very much that helped :-) Commented Jul 15, 2017 at 18:37

3 Answers


data:

df = data.frame(a = c(1,1,2,3,3,3), b = rep('val1', 6), c = rep('val2', 6))

replace values with characters:

df$a = c('cat', 'dog', 'rabbit')[ match(df$a, sort(unique(df$a))) ]

output

df
#       a    b    c
#1    cat val1 val2
#2    cat val1 val2
#3    dog val1 val2
#4 rabbit val1 val2
#5 rabbit val1 val2
#6 rabbit val1 val2

2 Comments

This solution implicitly assumes that 'cat' replaces the smallest (once sorted) value of df$a. Would it be possible to have pairs, e.g., c(1, 'cat'), which indicate which value is to be replaced with which?
For that case you will have to merge the data by index, as is done in the solution of @manotheshark. If your dataset is big, replace the data.frame with a data.table and then perform the merge.
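
The pairing asked about here can also be expressed with match() alone, by matching against a lookup table instead of sort(unique(df$a)). This is only a minimal sketch: the lookup table follows the question's data frame name, but its column names id and label are assumptions made for illustration, since the question does not name them.

```r
df <- data.frame(a = c(1, 1, 2, 3, 3, 3), b = "val1", c = "val2")

# Lookup table of (number, name) pairs.
# The column names `id` and `label` are assumed for this sketch.
name <- data.frame(id = c(1, 2, 3),
                   label = c("cat", "dog", "rabbit"),
                   stringsAsFactors = FALSE)

# match() gives, for each value of df$a, its row position in name$id;
# indexing name$label with those positions yields the replacement.
# Values of df$a that do not occur in name$id become NA.
df$a <- name$label[match(df$a, name$id)]
```

Because each number is looked up by value, the order of the rows in the lookup table does not matter, and the row order of df is left unchanged.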

This approach merges the two data.frames. It does not require hard-coding any values; new values only need to be added to the data.frames.

df <- data.frame(a = c(1,1,2,3,3,3), b = "val1", c = "val2")
df.name <- data.frame(a = 1:3, name=c("cat", "dog", "rabbit"))

df1 <- merge(df, df.name, by = "a")  # merge two data.frames by `a`

Some cleanup is required if you want the name to be stored in column a

df1$a <- df1$name
df1$name <- NULL

       a    b    c
1    cat val1 val2
2    cat val1 val2
3    dog val1 val2
4 rabbit val1 val2
5 rabbit val1 val2
6 rabbit val1 val2
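
One point to keep in mind with merge(): by default it sorts the result on the key column and drops rows that have no match. The example data is already sorted by a, so the effect is not visible there. The following is a hedged sketch of one way to keep the original row order and unmatched rows; the helper column row_id is a hypothetical name introduced here, not part of the answer above.

```r
# Same data as above, but deliberately unsorted in column a
df <- data.frame(a = c(3, 1, 2, 3, 1, 3), b = "val1", c = "val2")
df.name <- data.frame(a = 1:3, name = c("cat", "dog", "rabbit"),
                      stringsAsFactors = FALSE)

df$row_id <- seq_len(nrow(df))  # hypothetical helper column: remembers the original order

# all.x = TRUE keeps rows of df without a match in df.name (their name becomes NA)
df1 <- merge(df, df.name, by = "a", all.x = TRUE)

df1 <- df1[order(df1$row_id), ]  # restore the original row order
df1$a <- df1$name                # same cleanup as above
df1$name <- NULL
df1$row_id <- NULL
rownames(df1) <- NULL
```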


sample data:

df = data.frame(a = c(1,1,2,3,3,3), b = rep('val1', 6), c = rep('val2', 6))
df

#   a    b    c
# 1 1 val1 val2
# 2 1 val1 val2
# 3 2 val1 val2
# 4 3 val1 val2
# 5 3 val1 val2
# 6 3 val1 val2

using dplyr's recode(), you can achieve this:

library(dplyr)

df %>% mutate(a = recode(a, '1' = 'cat', '2' = 'dog', '3' = 'rabbit'))

#        a    b    c
# 1    cat val1 val2
# 2    cat val1 val2
# 3    dog val1 val2
# 4 rabbit val1 val2
# 5 rabbit val1 val2
# 6 rabbit val1 val2

1 Comment

The difficulty with this solution is that the replacements must be typed manually ('1' = 'cat', etc.). Would it be possible to have two lists, one indicating what is to be replaced and the second indicating what it is to be replaced with?
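
One possible way to work from two parallel vectors, sketched here in base R rather than with recode(), is to build a named lookup vector and index it by the old values. The vector names from and to are hypothetical and chosen only for this illustration.

```r
df <- data.frame(a = c(1, 1, 2, 3, 3, 3), b = "val1", c = "val2")

from <- c(1, 2, 3)                 # values to be replaced (hypothetical name)
to   <- c("cat", "dog", "rabbit")  # replacements, in the same order (hypothetical name)

# Named character vector: the names are "1", "2", "3"
lookup <- setNames(to, from)

# Index the lookup by name; values of df$a not present in `from` become NA
df$a <- unname(lookup[as.character(df$a)])
```

In practice, from and to could be taken from the two columns of the lookup data frame, so nothing needs to be typed by hand.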
