I would like to replace a full row of data in one dataframe with matching rows from another dataframe. I have a reproducible example with only a couple of columns, but in practice I have a dataframe with dozens of columns.

# main dataframe
df1 <- tibble(id = letters[1:5], v1 = seq(1,5), v2 = seq(1,5), v3 = seq(1,5))
>df1
# A tibble: 5 x 4
  id       v1    v2    v3
  <chr> <int> <int> <int>
1 a         1     1     1
2 b         2     2     2
3 c         3     3     3
4 d         4     4     4
5 e         5     5     5
# values to replace
df2 <- tibble(id = letters[3:4], v1 = rep(0,2), v2 = rep(0,2), v3 = rep(0,2))
> df2
# A tibble: 2 x 4
  id       v1    v2    v3
  <chr> <dbl> <dbl> <dbl>
1 c         0     0     0
2 d         0     0     0
# what the final result should look like
result <- tibble(id = c("a", "b", "c", "d", "e"), v1 = c(1, 2, 0, 0, 5), v2 = c(1, 2, 0, 0, 5), v3 = c(1, 2, 0, 0, 5))
>result
# A tibble: 5 x 4
  id       v1    v2    v3
  <chr> <dbl> <dbl> <dbl>
1 a         1     1     1
2 b         2     2     2
3 c         0     0     0
4 d         0     0     0
5 e         5     5     5
Comments
  • df1[match(df2$id, df1$id),] <- df2 Commented Mar 20, 2020 at 0:35
  • Using dplyr: df2 %>% bind_rows(df1) %>% distinct(id, .keep_all = TRUE) %>% arrange(id) Commented Mar 20, 2020 at 0:35
  • @H1 Though this may work on the example I provided, I get an error: Error in `[<-.data.frame`(`*tmp*`, match(df2$id, df1$id), , value = list( : missing values are not allowed in subscripted assignments of data frames. It seems that missing values could break this. Commented Mar 20, 2020 at 1:06

3 Answers

Here is one solution using the tidyverse:

library(tidyverse)
df1 %>%
  # Keep only the rows of df1 whose id does not appear in df2
  filter(!id %in% df2$id) %>%
  # Append the replacement rows from df2
  bind_rows(df2) %>%
  # Restore the original ordering by id
  arrange(id)
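
As an alternative to the filter-and-bind approach above, dplyr (version 1.0.0 or later) provides rows_update(), which overwrites matching rows by key and leaves the row order of the first data frame unchanged. The sketch below rebuilds the question's example data so that it runs on its own; treat it as one possible approach rather than the method used in this answer.

```r
library(dplyr)   # rows_update() requires dplyr >= 1.0.0
library(tibble)

# Example data rebuilt from the question
df1 <- tibble(id = letters[1:5], v1 = 1:5, v2 = 1:5, v3 = 1:5)
df2 <- tibble(id = letters[3:4], v1 = 0, v2 = 0, v3 = 0)

# Overwrite the rows of df1 whose id appears in df2;
# rows that have no match in df2 are left untouched
result <- rows_update(df1, df2, by = "id")
result
```

By default rows_update() raises an error when df2 contains an id that is missing from df1. In dplyr 1.1.0 and later, passing unmatched = "ignore" skips such rows instead.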

Based on your comment, if some ids exist in df2 but not in df1, match() returns NA for them, which causes the error. You can drop those positions and the corresponding df2 rows:

df1[na.omit(match(df2$id, df1$id)),] <- df2[df2$id %in% df1$id,]
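
To make the failure mode concrete, the following base R sketch adds a hypothetical id "z" to df2 that has no counterpart in df1 (that id is an assumption for illustration, not part of the question's data). It shows where the NA comes from and applies the same idea as the one-liner above, written out step by step.

```r
# df2 contains an id ("z") that is absent from df1
df1 <- data.frame(id = letters[1:5], v1 = 1:5, v2 = 1:5, v3 = 1:5,
                  stringsAsFactors = FALSE)
df2 <- data.frame(id = c("c", "z", "d"), v1 = 0, v2 = 0, v3 = 0,
                  stringsAsFactors = FALSE)

idx  <- match(df2$id, df1$id)  # row of df1 for each df2 id; NA where there is no match
keep <- !is.na(idx)            # df2 rows that do have a counterpart in df1

# Replace only the matched rows; idx[keep] and df2[keep, ] stay aligned
df1[idx[keep], ] <- df2[keep, ]
df1
```

Indexing both sides with the same logical vector keeps each replacement row paired with its target position, even when the ids in df2 are not in the same order as in df1.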

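A straightforward solution is to left-join df2 onto df1 and then overwrite the matched rows:

# Requires dplyr; columns coming from df2 get the suffix ".x"
df3 <- left_join(df1, df2, by = "id", suffix = c("", ".x"))
# Where a match exists (v1.x is not NA), copy the df2 values (columns 5:7) over the df1 values (columns 2:4)
df3[!is.na(df3$v1.x), 2:4] <- df3[!is.na(df3$v1.x), 5:7]
# Drop the helper columns
df3[, 5:7] <- NULL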
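
Since the question mentions dozens of columns, hard-coded positions such as 2:4 and 5:7 can become awkward. The sketch below is a name-based variant of the same join idea. The helper names value_cols, new_cols, and matched, and the ".new" suffix, are assumptions introduced here for illustration. It detects matched rows by id rather than by NA in a joined column, so genuine NA values in df2 would still be copied.

```r
library(dplyr)

# Example data rebuilt from the question (plain data frames)
df1 <- data.frame(id = letters[1:5], v1 = 1:5, v2 = 1:5, v3 = 1:5,
                  stringsAsFactors = FALSE)
df2 <- data.frame(id = letters[3:4], v1 = 0, v2 = 0, v3 = 0,
                  stringsAsFactors = FALSE)

value_cols <- setdiff(names(df1), "id")   # every column except the key
new_cols   <- paste0(value_cols, ".new")  # names the df2 columns get after the join

joined  <- left_join(df1, df2, by = "id", suffix = c("", ".new"))
matched <- joined$id %in% df2$id          # rows that have a replacement in df2

# Copy the replacement values over the originals, column group by column group
joined[matched, value_cols] <- joined[matched, new_cols]

# Keep only the original columns, in their original order
result <- joined[names(df1)]
result
```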
