Replacing occurrences of a number in multiple columns of data frame with another value in R

Question

ETA: the point of the below, by the way, is to avoid iterating through my entire set of column vectors, in case the proposed solution was going to be exactly that (just do what is known to work, one column at a time).

There are plenty of examples of replacing values in a single vector of a data frame in R with some other value.

And also how to replace all values of NA with something else:

How to replace all <NA> values in a data.frame with another ( not 0) value

What I'm looking for is analogous to the last question, but basically trying to replace one value with another. I'm having trouble generating a data frame of logical values mapped to my actual data frame for cases where multiple columns meet a criterion, or simply trying to do the actions from the first two questions on more than one column.

An example:

data <- data.frame(name = rep(letters[1:3], each = 3), var1 = rep(1:9), var2 = rep(3:5, each = 3))

data
  name var1 var2
1    a    1    3
2    a    2    3
3    a    3    3
4    b    4    4
5    b    5    4
6    b    6    4
7    c    7    5
8    c    8    5
9    c    9    5

And say I want all of the values of 4 in var1 and var2 to be 10.

I'm sure this is elementary and I'm just not thinking through it properly. I have been trying things like:

data[data[, 2:3] == 4, ]

That doesn't work, but if I do the same with data[, 2] instead of data[, 2:3], things work fine. It seems that logical tests (like is.na()) work on multiple rows/columns, but that numerical comparisons aren't playing as nicely?

Anthony Damico · Accepted Answer · 2013-02-06 20:14:58Z

76

you want to search through the whole data frame for any value that matches the value you're trying to replace. the same way you can run a logical test like replacing all missing values with 10..

data[ is.na( data ) ] <- 10

you can also replace all 4s with 10s.

data[ data == 4 ] <- 10

at least i think that's what you're after?

and let's say you wanted to ignore the first column (since it's all letters)

# identify which columns contain the values you might want to replace
data[ , 2:3 ]

# subset it with extended bracketing..
data[ , 2:3 ][ data[ , 2:3 ] == 4 ]
# ..those were the values you're going to replace

# now overwrite 'em with tens
data[ , 2:3 ][ data[ , 2:3 ] == 4 ] <- 10

# look at the final data
data
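The same replacement can also be written with column names instead of positions. This is a minimal sketch, assuming the example data from the question; the `cols` vector is an illustrative name, not part of the answer.

```r
# Rebuild the example data from the question.
data <- data.frame(
  name = rep(letters[1:3], each = 3),
  var1 = 1:9,
  var2 = rep(3:5, each = 3)
)

# Illustrative assumption: pick the columns by name rather than by position.
cols <- c("var1", "var2")

# Subset to those columns, then index that subset with a logical matrix
# of the same shape and overwrite the matching cells.
data[cols][data[cols] == 4] <- 10

data$var1  # 1 2 3 10 5 6 7 8 9
data$var2  # 3 3 3 10 10 10 5 5 5
```

Selecting by name keeps working if the columns are later reordered, which positional indexes like 2:3 do not guarantee.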

edited Feb 6, 2013 at 20:14

answered Feb 6, 2013 at 20:09

Anthony Damico


1 Comment

Hendy Over a year ago

I flipping swear I tried this and it wasn't working for me before. I hope to get to the point where I don't kick myself every time I post to SO... By the way -- you're the 1min R video guy, aren't you!? Those rock.

liuminzhao · Answer · 2013-02-06 20:12:09Z

7

Basically data[, 2:3]==4 gave you the index for data[,2:3] instead of data:

R > data[, 2:3] ==4
       var1  var2
 [1,] FALSE FALSE
 [2,] FALSE FALSE
 [3,] FALSE FALSE
 [4,]  TRUE  TRUE
 [5,] FALSE  TRUE
 [6,] FALSE  TRUE
 [7,] FALSE FALSE
 [8,] FALSE FALSE
 [9,] FALSE FALSE

So you may try this:

R > data[,2:3][data[, 2:3] ==4]
[1] 4 4 4 4
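To make the shape mismatch concrete, here is a small sketch, assuming the example data from the question. The comparison yields a 9 x 2 logical matrix, while data is 9 x 3. One way to index data directly is to pad the matrix with an all-FALSE column for name; `mask` and `full_mask` are illustrative names.

```r
data <- data.frame(
  name = rep(letters[1:3], each = 3),
  var1 = 1:9,
  var2 = rep(3:5, each = 3)
)

# The comparison only covers columns 2 and 3.
mask <- data[, 2:3] == 4
dim(mask)  # 9 2

# Pad with a FALSE column for `name` so the mask has the same shape as data.
full_mask <- cbind(name = FALSE, mask)
dim(full_mask)  # 9 3

# A logical matrix with the same dimensions as the data frame can index it directly.
data[full_mask] <- 10
```

Because no cell in the padded column is TRUE, the name column is left untouched.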

answered Feb 6, 2013 at 20:12

liuminzhao


1 Comment

Hendy Over a year ago

Thanks for this; also works. I just think the one from Anthony is a tad simpler. Big thanks for explaining why mine wasn't working though; after playing around some more, I see what you mean: me trying to apply values to data based on a comparison that was also subsetting makes a lot more sense.

Dinre · Answer · 2013-02-06 20:35:46Z

2

Just to provide a different answer, I thought I would write up a vector-math approach:

You can create a transformation matrix (really a data frame here, but it will work the same) using the vectorized 'ifelse' function, and then multiply the transformation matrix by your original data, like so:

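# Replace .search_Value with .sub_Value in the chosen columns by multiplying
# matching cells by .sub_Value / .search_Value and all other cells by 1.
# Assumes numeric columns and a non-zero .search_Value.
df.Rep <- function(.data_Frame, .search_Columns, .search_Value, .sub_Value){
    .data_Frame[, .search_Columns] <- ifelse(.data_Frame[, .search_Columns] == .search_Value, .sub_Value / .search_Value, 1) * .data_Frame[, .search_Columns]
    return(.data_Frame)
}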

To replace all values 4 with 10 in the data frame 'data' in columns 2 through 3, you would use the function like so:

# Either of these will work.  I'm just showing options.
df.Rep(data, 2:3, 4, 10)
df.Rep(data, c("var1","var2"), 4, 10)

#   name var1 var2
# 1    a    1    3
# 2    a    2    3
# 3    a    3    3
# 4    b   10   10
# 5    b    5   10
# 6    b    6   10
# 7    c    7    5
# 8    c    8    5
# 9    c    9    5
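Since the multiplication approach divides by the search value, it would presumably not handle a search value of 0 (in R, 0 * Inf is NaN). As a hedged alternative with the same calling shape, the sketch below swaps in plain logical indexing via replace(); the helper name `replace_in_cols` and the `zeros` data frame are hypothetical, made up for illustration.

```r
# Hypothetical helper: same idea as df.Rep, but using replace() instead of
# multiplication, so it does not depend on dividing by the search value.
replace_in_cols <- function(df, cols, search_value, sub_value) {
  df[cols] <- lapply(df[cols], function(x) {
    # which() drops NA comparisons, so missing values are left as they are.
    replace(x, which(x == search_value), sub_value)
  })
  df
}

# Why the multiplication trick struggles with zero:
0 * (99 / 0)  # NaN

# Illustrative data containing zeros.
zeros <- data.frame(id = c("a", "b", "c"), x = c(0, 1, 0), y = c(2, 0, 3))
fixed <- replace_in_cols(zeros, c("x", "y"), 0, 99)
fixed$x  # 99 1 99
fixed$y  # 2 99 3
```

Like df.Rep, this returns a modified copy, so the result has to be assigned back if the original data frame should change.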

answered Feb 6, 2013 at 20:35

Dinre


1 Comment

Anthony Damico Over a year ago

test should be data, no? :)

agstudy · Answer · 2013-02-06 20:12:44Z

1

Just for continuity

    data[,2:3][ data[,2:3] == 4 ] <- 10

But it looks ugly, so doing it in two steps is better.
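One possible reading of the two-step version, assuming the example data from the question (the name `is_four` is illustrative): compute the logical matrix first, then use it for the assignment.

```r
data <- data.frame(
  name = rep(letters[1:3], each = 3),
  var1 = 1:9,
  var2 = rep(3:5, each = 3)
)

# Step 1: logical matrix marking the cells equal to 4 in columns 2 and 3.
is_four <- data[, 2:3] == 4

# Step 2: overwrite exactly those cells.
data[, 2:3][is_four] <- 10
```

Naming the intermediate matrix also makes it easy to inspect, for example with sum(is_four), before anything is overwritten.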

answered Feb 6, 2013 at 20:12

agstudy


LMc · Answer · 2024-08-07 22:40:40Z

0

Tidyverse

Here is a dplyr solution:

library(dplyr)

data |> 
  mutate(across(var1:var2, \(x) replace(x, x == 4, 10)))
#   name var1 var2
# 1    a    1    3
# 2    a    2    3
# 3    a    3    3
# 4    b   10   10
# 5    b    5   10
# 6    b    6   10
# 7    c    7    5
# 8    c    8    5
# 9    c    9    5

The first argument of across() is the columns you want to modify with a function. There are a number of handy tidy-selection helpers so you can easily pick multiple columns to modify.

Here I used a range from var1 to var2 (which are right next to each other). This could have been written as c(var1, var2) if, for example, these columns were not next to one another.
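If the column names are stored in a character vector, the all_of() selection helper can be used inside across(). A minimal sketch, assuming dplyr 1.0 or later, R 4.1 or later (for |> and the \(x) shorthand), and the example data from the question; `cols` and `result` are illustrative names.

```r
library(dplyr)

data <- data.frame(
  name = rep(letters[1:3], each = 3),
  var1 = 1:9,
  var2 = rep(3:5, each = 3)
)

# Column names held in a character vector (illustrative).
cols <- c("var1", "var2")

# all_of() selects exactly the columns named in the vector.
result <- data |>
  mutate(across(all_of(cols), \(x) replace(x, x == 4, 10)))

result$var1  # 1 2 3 10 5 6 7 8 9
result$var2  # 3 3 3 10 10 10 5 5 5
```

Note that mutate() returns a new data frame; data itself is unchanged unless the result is assigned back to it.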

edited Aug 7, 2024 at 22:40

answered Aug 7, 2024 at 22:34

LMc
