I want to replace the values of one element of a list with the values of a second element of a list. Specifically,

  • I have a list containing multiple data sets.
  • Each data set has 2 variables
  • The variables are factors
  • The n-th element of the second variable (values) in each data set needs to be replaced with the n-th element of the first variable (names1)
  • The n-th element of the first variable (names1) should then be set to "replaced"

    # stringsAsFactors = TRUE makes both columns factors on any R version
    # (it stopped being the default in R 4.0.0)
    dat1 <- data.frame(names1 = c("a", "b", "c", "f", "x"),
                       values = c("val1_1", "val2_1", "val3_1", "val4_1", "val5_1"),
                       stringsAsFactors = TRUE)
    dat2 <- data.frame(names1 = c("a", "b", "f2", "s5", "h"),
                       values = c("val1_2", "val2_2", "val3_2", "val4_2", "val5_2"),
                       stringsAsFactors = TRUE)
    list1 <- list(dat1, dat2)
    

    The result should be the same list, but just with the 5th value replaced.

    [[1]]
         names1  values
    1         a  val1_1
    2         b  val2_1
    3         c  val3_1
    4         f  val4_1
    5  replaced       x
    [[2]]
         names1  values
    1         a  val1_2
    2         b  val2_2
    3        f2  val3_2
    4        s5  val4_2
    5  replaced       h
    
    • This is a simplified example. I have more than 4500 data sets. Commented Feb 3, 2019 at 14:55

    2 Answers

    A base R approach using lapply. Since both columns are factors, we need to add the new levels before assigning the new values; otherwise the assigned values would become NA.

    n <- 5
    
    lapply(list1, function(x) {
       levels(x$values) <- c(levels(x$values), as.character(x$names1[n]))
       x$values[n] <- x$names1[n]
       levels(x$names1) <- c(levels(x$names1), "replaced")
       x$names1[n] <- "replaced"
       x
    })
    
    #[[1]]
    #    names1 values
    #1        a val1_1
    #2        b val2_1
    #3        c val3_1
    #4        f val4_1
    #5 replaced      x
    
    #[[2]]
    #    names1 values
    #1        a val1_2
    #2        b val2_2
    #3       f2 val3_2
    #4       s5 val4_2
    #5 replaced      h
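
To see why the levels have to be extended first, here is a minimal sketch on a toy two-row data frame (the names `dat`, `bad`, and `good` are made up for illustration and are not part of the answer). It assumes both columns are factors, which is forced here with `stringsAsFactors = TRUE`.

```r
# Toy data; both columns are factors (assumption made explicit for R >= 4.0.0)
dat <- data.frame(names1 = c("a", "b"), values = c("v1", "v2"),
                  stringsAsFactors = TRUE)

# Assigning a value that is not an existing level gives NA (plus a warning)
bad <- dat
suppressWarnings(bad$values[2] <- "b")
is.na(bad$values[2])        # TRUE

# Extending the levels first lets the same assignment succeed
good <- dat
levels(good$values) <- c(levels(good$values), "b")
good$values[2] <- "b"
as.character(good$values)   # "v1" "b"
```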
    

    Another approach is to convert both columns to character, replace the values at the required position, and convert them back to factors. However, since each data frame in the list can be huge, converting every value to character and back just to change one value could be computationally expensive.
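
For comparison, the character round-trip described above could look roughly like this. The helper name `replace_nth_via_character` is hypothetical, and the sketch assumes the column names `names1` and `values` from the question.

```r
# Hypothetical helper: convert to character, replace, convert back to factor
replace_nth_via_character <- function(x, n) {
  x$names1 <- as.character(x$names1)
  x$values <- as.character(x$values)
  x$values[n] <- x$names1[n]     # copy the old name into values first
  x$names1[n] <- "replaced"      # then overwrite the name
  x$names1 <- factor(x$names1)   # levels are rebuilt from the current values
  x$values <- factor(x$values)
  x
}

dat1 <- data.frame(names1 = c("a", "b", "c", "f", "x"),
                   values = c("val1_1", "val2_1", "val3_1", "val4_1", "val5_1"),
                   stringsAsFactors = TRUE)
res <- lapply(list(dat1), replace_nth_via_character, n = 5)
```

One difference from the `levels<-` version: because `factor()` rebuilds the levels, the replaced values ("x" in `names1`, "val5_1" in `values`) no longer appear as unused levels.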


    Here is one option with the tidyverse. Loop through the list with map, slice the row of interest (here it is the last row, so n() can be used), mutate the column values, and bind the result back onto the original data without its last row.

    library(tidyverse)
    map(list1, ~ .x %>% 
                   slice(n()) %>%
                   mutate(values = names1, names1 = 'replaced') %>% 
                   bind_rows(.x %>% slice(-n()), .))
    #[[1]]
    #    names1 values
    #1        a val1_1
    #2        b val2_1
    #3        c val3_1
    #4        f val4_1
    #5 replaced      x
    
    #[[2]]
    #    names1 values
    #1        a val1_2
    #2        b val2_2
    #3       f2 val3_2
    #4       s5 val4_2
    #5 replaced      h
    

    Or it can be made more compact with fct_c from forcats, which concatenates factors and combines their levels, for both the 'values' and 'names1' columns.

    library(forcats)
    map(list1, ~ .x %>% 
            mutate(values = fct_c(values[-n()], names1[n()]), 
                   names1 = fct_c(names1[-n()], factor('replaced'))))
    

    Or use a similar approach in base R: loop through the list with lapply, convert each data.frame to a matrix, rbind the matrix without its last row to the replacement row, and convert back to a data.frame. Before R 4.0.0, stringsAsFactors = TRUE was the default, so the columns came back as factors; from R 4.0.0 on, it has to be passed explicitly.

    lapply(list1, function(x) as.data.frame(
      rbind(as.matrix(x)[-5, ], c('replaced', as.character(x$names1[5]))),
      stringsAsFactors = TRUE))
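
The rbind version always puts the replaced row last, so it only fits when the row of interest is the final one. A possible variant for an arbitrary row n edits the character matrix in place. This is a sketch, not the answer's method: the helper name `replace_nth_via_matrix` is hypothetical, and it assumes the columns are ordered `names1`, `values`.

```r
# Hypothetical helper: overwrite row n of the character matrix in place
replace_nth_via_matrix <- function(x, n) {
  m <- as.matrix(x)                          # factor columns become character
  m[n, ] <- c("replaced", m[n, "names1"])    # new names1, old names1 goes to values
  as.data.frame(m, stringsAsFactors = TRUE)  # explicit, needed from R 4.0.0 on
}

dat2 <- data.frame(names1 = c("a", "b", "f2", "s5", "h"),
                   values = c("val1_2", "val2_2", "val3_2", "val4_2", "val5_2"),
                   stringsAsFactors = TRUE)
out <- lapply(list(dat2), replace_nth_via_matrix, n = 3)
```

Like the rbind version, this converts every value to character and back, so the cost concern raised in the other answer applies here as well.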
    

    3 Comments

    How do you get the row number, Akrun? Is n <- 5 taken from the answer above?
    @tobiassch No. Here you have only 5 rows, and n() is the last row. If you have a custom n, then use that in slice(n)
    Yes! I see, will try!
