I have a dataset with variables named as I10AA to I10ZZ and I11AA to I11ZZ. I want to create new variables IAA to IZZ, so that IAA = function(I10AA,I11AA).

Here is a highly simplified example:

library(dplyr)  # provides %>% and mutate()

set.seed(0)

df <- data.frame(I10AA=floor(runif(10,1,5)),I10AB=floor(runif(10,1,5)),
             I11AA=floor(runif(10,1,5)),I11AB=floor(runif(10,1,5)))

fun <- function(x,y) (x+y)

results <- df %>% mutate(IAA = fun(I10AA,I11AA),IAB = fun(I10AB,I11AB))

print(results)

results is the final dataset I want.

Is there a way to do this with tidyverse?

In the original dataset, the variables are arranged as:

 colnames(original_data) = "ID", "I1AA", "I1AB", "I1AC", ..., "I1ZZ", "I2AA", "I2AB", ..., "I2ZZ", ..., "I10AA", ..., "I10ZZ", "I11AA", ..., "I11ZZ"
  • Can you tell us how the columns are arranged in the original dataset? Commented Aug 5, 2018 at 20:06
  • There is no issue with the function, but I do not know how to loop over I10AA to I11ZZ Commented Aug 5, 2018 at 20:08
  • Of course. Edited. Commented Aug 5, 2018 at 20:17
  • Please check the solution posted. It should work Commented Aug 5, 2018 at 20:20
  • I really like the answer Commented Aug 5, 2018 at 20:24

1 Answer

We can loop over the paired column names with map2, use transmute to compute each new column, rename the results using the column names with the digits stripped, and bind them to the original data:

library(tidyverse)
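# Logical index of the I10* columns (anchored so only names starting with "I10" match)
i1 <- grepl("^I10", names(df))
# Target names with the digits stripped, e.g. "I10AA" -> "IAA"
nm1 <- sub("\\d+", "", names(df)[i1])
# Logical index of the I11* columns; equals !i1 in the example, but stays correct when ID or I1*..I9* columns are present
i2 <- grepl("^I11", names(df))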

map2(names(df)[i1], names(df)[i2], ~
        df %>% 
          transmute(fun(!! rlang::sym(.x), !!rlang::sym(.y)))) %>% 
          bind_cols %>% 
          rename_all(., ~ nm1) %>%
  bind_cols(df, .)
#    I10AA I10AB I11AA I11AB IAA IAB
#1      4     1     4     2   8   3
#2      2     1     4     2   6   3
#3      2     1     1     3   3   4
#4      3     3     3     2   6   5
#5      4     2     1     1   5   3
#6      1     4     2     4   3   8
#7      4     2     2     3   6   5
#8      4     3     1     4   5   7
#9      3     4     2     1   5   5
#10     3     2     4     3   7   5

Another option is to place the two subsets of columns in a list and use reduce with + to add them element-wise. This relies on the function working on whole data frames, as + does:

list(df %>% 
        select(names(.)[i1]),
     df %>%
        select(names(.)[i2])) %>% 
  reduce(`+`) %>% 
  rename_all(., ~ nm1) %>% 
  bind_cols(df, .)

An easier option, in base R, is to add the two column blocks directly:

df[nm1] <- df[i1] + df[i2]
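
The one-liner above depends on + accepting whole data frames. For an arbitrary two-argument function, a minimal base R sketch is to pair the columns with Map. The data frame df2, its extra ID and I1AA columns, and the multiplying fun are made up for illustration; they only mimic the layout described in the question.

```r
# Toy data mimicking the question's layout: an ID column and an unrelated I1* column
# (df2 and its values are assumptions for illustration, not the asker's data)
df2 <- data.frame(ID = 1:3, I1AA = c(9, 9, 9),
                  I10AA = c(1, 2, 3), I10AB = c(4, 5, 6),
                  I11AA = c(10, 20, 30), I11AB = c(40, 50, 60))

# Any vectorised two-argument function can stand in here
fun <- function(x, y) x * y

i10 <- grepl("^I10", names(df2))               # I10AA, I10AB
i11 <- grepl("^I11", names(df2))               # I11AA, I11AB
new_names <- sub("\\d+", "", names(df2)[i10])  # "IAA", "IAB"

# Check that the two blocks line up by suffix before pairing them
stopifnot(identical(sub("^I10", "", names(df2)[i10]),
                    sub("^I11", "", names(df2)[i11])))

# Map() calls fun on each I10*/I11* column pair; unname() drops the I10* names
# so the new columns are named by new_names
df2[new_names] <- unname(Map(fun, df2[i10], df2[i11]))
print(df2)
```

The suffix check guards against the two blocks being in a different order, which the positional pairing would otherwise silently get wrong.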