I want to create a function that renames specific values in a column to other values specified by the function, something like this (although in reality there would be many more values to rename):

func <- function(x) x %>%
                    mutate(col_name = ifelse(col_name == "something", "something else",
                                      ifelse(col_name == "something2", "something_else2", col_name)))

Note that it isn't the column names that I want to change but the values in the columns. However, I would like this to work regardless of which column the values are in (i.e. the function should work across the whole data frame). Also, this only works if the values named in the function are present, and I would like it to ignore the ones that aren't present in the columns. Here is a small reproducible example (the column values are arbitrary):

col1 <- c("a","b","c","d","e")
col2 <- c("b","f","d","c","g")

df <- data.frame(col1, col2)

col3 <- c("a","h","i","b","c")
col4 <- c("c","d","j","a","g")

df2 <- data.frame(col3, col4)

Which looks like this:

df:
  col1 col2
1    a    b
2    b    f
3    c    d
4    d    c
5    e    g

df2:
  col3 col4
1    a    c
2    h    d
3    i    j
4    b    a
5    c    g

Say that I want to rename the values like this:

df:

   col1 col2
1  can  chi
2  chi  pig
3  equ  she
4  she  equ
5  fox  bov

df2:

   col3 col4
1  can  equ
2  avi  she
3  tyr  asp
4  chi  can
5  equ  bov

So what I am hoping for is a function that renames multiple values in data frame columns regardless of their position in the data frame, and that ignores values named in the function but not found in the data frame.

  • Could you edit the question with the desired output? Commented Oct 6, 2017 at 13:26
  • I'm confused; you want to rename the variables, but not change the column names? In a usual data frame, each column is a variable, so those would be the same. Could you add to your reproducible example what you want the output to be? Commented Oct 6, 2017 at 13:28
  • I think you are confusing a "variable" with a "record/value/entry/element" of a column. In R, a column is a variable. If you want to refer to things inside a column, you would usually call them either "elements" or "values". Commented Oct 6, 2017 at 13:45
  • Thanks for feedback, I'm quite new to programming. I have edited the question and changed some words, and added the result I want to get. Commented Oct 6, 2017 at 13:59
  • @Haakonkas Good job on providing a reproducible example! See if my answer is satisfactory. Btw, it seems that your df2 has a typo, the first row of col4 should be equ? Commented Oct 6, 2017 at 14:36

1 Answer

Recode all columns

library(dplyr)
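func = function(x, originals = letters[1:10], 
                rename_tos = c("can", "chi", "equ", "she", "fox", "pig", "bov", "avi", "tyr", "asp")){
  # Build the lookup table: names are the original values, elements are the replacements
  names(rename_tos) = originals
  x %>%
    # Convert factors to character so indexing is by value, not by factor code
    mutate_if(is.factor, as.character) %>%
    # Look up every value of each column; values not in originals become NA
    lapply(function(y) rename_tos[y]) %>%
    data.frame(row.names = NULL) 
}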

Results:

> func(df)
  col1 col2
1  can  chi
2  chi  pig
3  equ  she
4  she  equ
5  fox  bov

> func(df2)
  col3 col4
1  can  equ
2  avi  she
3  tyr  asp
4  chi  can
5  equ  bov

Notes:

The method I used is basically to create a lookup table (a named vector) for the renames and to index the rename_tos vector with the column values. Here, I've set originals and rename_tos as default arguments of the function, but you can also supply your own.
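Indexing a named vector with a value that is not among its names returns NA, so the lookup above turns unmatched values into NA. The following is a minimal base R sketch of the same lookup-table idea that leaves unmatched values and non-text columns untouched. The helper name recode_values and the example data are hypothetical and not part of the original answer.

```r
# Hypothetical helper (not from the answer): recode every text column with a
# named-vector lookup, leaving values without a lookup entry unchanged.
recode_values <- function(x, lookup) {
  x[] <- lapply(x, function(col) {
    # Skip columns that are neither character nor factor (e.g. numeric data)
    if (!is.character(col) && !is.factor(col)) return(col)
    col <- as.character(col)
    hit <- col %in% names(lookup)          # which values have a replacement
    col[hit] <- unname(lookup[col[hit]])   # replace only those values
    col
  })
  x
}

# Lookup table: names are the original values, elements are the replacements
lookup <- setNames(
  c("can", "chi", "equ", "she", "fox", "pig", "bov", "avi", "tyr", "asp"),
  letters[1:10]
)

# Example data (assumed): "x" has no entry in the lookup, n is numeric
df <- data.frame(col1 = c("a", "b", "x"), n = 1:3, stringsAsFactors = FALSE)
res <- recode_values(df, lookup)
print(res)
```

Assigning into x[] keeps the data frame structure, so the result does not need to be rebuilt with data.frame().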

User-supplied column names

If you want to recode the values only in the columns you specify and leave the other columns unchanged, you can do something like the following:

library(dplyr)
library(rlang)

func = function(x, ..., originals = letters[1:10], 
                rename_tos = c("can", "chi", "equ", "she", "fox", "pig", "bov", "avi", "tyr", "asp")){
  names(rename_tos) = originals
  dots = quos(...)
  x %>%
    mutate_at(vars(!!! dots), as.character) %>%
    mutate_at(vars(!!! dots), funs(rename_tos[.])) %>%
    data.frame(row.names = NULL) 
}

Result:

> func(df, col2)
  col1 col2
1    a  chi
2    b  pig
3    c  she
4    d  equ
5    e  bov

> func(df2, col3, col4)
  col3 col4
1  can  equ
2  avi  she
3  tyr  asp
4  chi  can
5  equ  bov

> func(df2, c(col3, col4))
  col3 col4
1  can  equ
2  avi  she
3  tyr  asp
4  chi  can
5  equ  bov

Notes:

Here, I added the ... argument to allow the user to input their own column names. I used quos from rlang to quote the ... arguments and later unquoted them with !!! inside vars in the mutate_at calls. For example, if the user supplied func(df, col2), the first argument of mutate_at evaluates to vars(col2). This works with multiple arguments as well as with a vector of arguments, as one can see in the results.
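Since this answer was written, funs() has been deprecated in dplyr, and across() has become the usual replacement for the mutate_at family. The following is a sketch of the same interface, assuming dplyr 1.0 or later; the helper name recode_selected, the shortened lookup, and the example data are hypothetical. Unlike the function above, it keeps values that have no entry in the lookup instead of returning NA.

```r
library(dplyr)

# Hypothetical helper (not from the answer): recode only the columns passed
# through ..., keeping values that have no entry in `lookup`.
recode_selected <- function(x, ..., lookup) {
  swap <- function(v) {
    v <- as.character(v)
    hit <- v %in% names(lookup)        # which values have a replacement
    v[hit] <- unname(lookup[v[hit]])   # replace only those values
    v
  }
  # c(...) forwards the user's column selection to across()
  x %>% mutate(across(c(...), swap))
}

# Shortened lookup for illustration
lookup <- c(a = "can", b = "chi", c = "equ", d = "she", e = "fox")

# Example data (assumed): "f" and "z" have no entry in the lookup
df <- data.frame(col1 = c("a", "b", "z"),
                 col2 = c("b", "f", "d"),
                 stringsAsFactors = FALSE)

out <- recode_selected(df, col2, lookup = lookup)
print(out)
```

Only col2 is recoded here; col1 is returned as supplied because it was not selected.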


3 Comments

Thank you for answering! This might seem to work. However, my data frame contains other columns with data as well, and this seems to convert all other values to NA instead of ignoring them. Also, it saved the data frame as a value. How would you go on and make it ignore other values in the data frame, perhaps use gsub(regEx to match multiple columns), or maybe add something that makes it only apply this to the columns containing the values in "originals"?
@Haakonkas See my updates. This should give you what you want.
This is brilliant! Works perfectly! Thank you!!
