Concatenate column names in one column conditional on using mutate, across and case_when

Question

I would like to:

1. Use across and case_when to check whether each of the columns A1 to A3 equals 1,
2. concatenate the names of the columns where the value is 1, and
3. use mutate to create a new column holding the concatenated column names.

My dataframe:

df <- tribble(
~ID,    ~A1,    ~A2,    ~A3,
1, 0, 1, 1, 
2, 0, 1, 1, 
3, 1, 1, 1, 
4, 1, 0, 1, 
5, 0, 1, 0)

Desired Output:

# A tibble: 5 x 5
     ID    A1    A2    A3 New_Col 
  <dbl> <dbl> <dbl> <dbl> <chr>   
1     1     0     1     1 A2 A3   
2     2     0     1     1 A2 A3   
3     3     1     1     1 A1 A2 A3
4     4     1     0     1 A1 A3   
5     5     0     1     0 A2

So far I have tried:

df %>% 
  rowwise() %>% 
  mutate(New_Col = across(A1:A3, ~ case_when(. == 1 ~ paste0("colnames(.)", collapse = " "))))

Output (not what I want):

     ID    A1    A2    A3 New_Col$A1  $A2         $A3        
  <dbl> <dbl> <dbl> <dbl> <chr>       <chr>       <chr>      
1     1     0     1     1 NA          colnames(.) colnames(.)
2     2     0     1     1 NA          colnames(.) colnames(.)
3     3     1     1     1 colnames(.) colnames(.) colnames(.)
4     4     1     0     1 colnames(.) NA          colnames(.)
5     5     0     1     0 NA          colnames(.) NA

What I want to learn:

Is it possible to use across to check a condition across multiple columns?
If yes, what should the part after ~ in case_when look like to get the specific column names?
How can I get only one column after using mutate, across and case_when, rather than three as here?

I thought I already was able to master this task, but somehow I lost it...

Ronak Shah · Accepted Answer · 2021-05-30 11:12:10Z

11

To use across with case_when you can do -

library(dplyr)
library(tidyr)

df %>% 
  mutate(across(A1:A3, ~case_when(. == 1 ~ cur_column()), .names = 'new_{col}')) %>%
  unite(New_Col, starts_with('new'), na.rm = TRUE, sep = ' ')

#    ID    A1    A2    A3 New_Col 
#  <dbl> <dbl> <dbl> <dbl> <chr>   
#1     1     0     1     1 A2 A3   
#2     2     0     1     1 A2 A3   
#3     3     1     1     1 A1 A2 A3
#4     4     1     0     1 A1 A3   
#5     5     0     1     0 A2

across creates three new columns named new_A1, new_A2 and new_A3, each holding the source column's name where the value is 1 and NA otherwise. unite then combines the three columns into a single New_Col, dropping the NA values because na.rm = TRUE.
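To see what the across step produces before unite collapses it, here is a minimal sketch that runs only the first half of the pipeline on the question's df. The name step1 is introduced here purely for illustration.

```r
library(dplyr)
library(tibble)

# The data frame from the question
df <- tribble(
  ~ID, ~A1, ~A2, ~A3,
  1, 0, 1, 1,
  2, 0, 1, 1,
  3, 1, 1, 1,
  4, 1, 0, 1,
  5, 0, 1, 0
)

# across step only: each new_* column holds its source column's name
# where the value is 1, and NA everywhere else
step1 <- df %>%
  mutate(across(A1:A3, ~ case_when(. == 1 ~ cur_column()), .names = "new_{col}"))

print(step1)
```

The helper columns new_A1, new_A2 and new_A3 are what starts_with('new') picks up in the unite call.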

Also we can use rowwise with c_across -

df %>% 
  rowwise() %>% 
  mutate(New_Col = paste0(names(.[-1])[c_across(A1:A3) == 1], collapse = ' '))

3 Comments

AnilGoyal Over a year ago

Ronak, instead of names() can we use cur_column here somehow directly?

Ronak Shah Over a year ago

You mean in rowwise or group_by ID right? I don't think we can do that since cur_column can be used within across only.

AnilGoyal Over a year ago

Yes, It returns this error only. Thanks for explaining :)

AnilGoyal · Accepted Answer · 2021-05-30 11:17:33Z

7

Without rowwise/across you can also get the same result using cur_data():

df %>% group_by(ID) %>%
  mutate(new_col = paste0(names(df[-1])[as.logical(cur_data())], collapse = ' '))

# A tibble: 5 x 5
# Groups:   ID [5]
     ID    A1    A2    A3 new_col 
  <dbl> <dbl> <dbl> <dbl> <chr>   
1     1     0     1     1 A2 A3   
2     2     0     1     1 A2 A3   
3     3     1     1     1 A1 A2 A3
4     4     1     0     1 A1 A3   
5     5     0     1     0 A2

Using . instead of df inside mutate also works:

df %>% group_by(ID) %>%
  mutate(new_col = paste0(names(.[-1])[as.logical(cur_data())], collapse = ' '))
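In dplyr 1.1.0 and later, cur_data() is deprecated in favour of pick(). A minimal sketch of the same idea with pick() follows, assuming dplyr >= 1.1.0 and, as in the code above, exactly one row per ID. The name res is illustrative only.

```r
library(dplyr)
library(tibble)

# The data frame from the question
df <- tribble(
  ~ID, ~A1, ~A2, ~A3,
  1, 0, 1, 1,
  2, 0, 1, 1,
  3, 1, 1, 1,
  4, 1, 0, 1,
  5, 0, 1, 0
)

res <- df %>%
  group_by(ID) %>%
  # pick(A1:A3) is the current group's one-row tibble of A1..A3;
  # unlist() turns it into a plain numeric vector to compare against 1
  mutate(new_col = paste(names(pick(A1:A3))[unlist(pick(A1:A3)) == 1],
                         collapse = " ")) %>%
  ungroup()

print(res)
```

Selecting A1:A3 explicitly, rather than everything, keeps the result independent of any other columns the data frame may have.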

2 Comments

Karthik S Over a year ago

Awesome Anil ji and Ronak, Have one query, here cur_data is each group, will it work even if there are more than 1 row for each group? Because I tried as.logical(df[-1]) and expecting a DF of TRUE and FALSE but got this error: Error: 'list' object cannot be coerced to type 'logical'. And what's the difference between cur_data and cur_group

AnilGoyal Over a year ago

Hi @KarthikS, you may call me Anil, see some explanation here. cur_data returns the current data (grouped of course) and cur_group represents group keys. So cur_data will return binary values here and cur_group will return ids. Hope this is clear

akrun · Accepted Answer · 2021-05-30 19:57:43Z

5

Using base R

df$New_Col <- apply(df[-1], 1, \(x) paste(names(x)[as.logical(x)], collapse=' '))
df$New_Col
#[1] "A2 A3"    "A2 A3"    "A1 A2 A3" "A1 A3"    "A2"

Or using tidyverse

library(dplyr)
library(purrr)
library(stringr)
df %>%
   mutate(New_Col = across(A1:A3, ~ c('', cur_column())[. + 1] ) %>% 
                       invoke(str_c, .))
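Note that str_c joins its inputs with no separator by default, so as far as I can tell the pipeline above yields values such as "A2A3" rather than "A2 A3". A hedged variant of the same indexing trick that produces space-separated names is sketched below. It swaps in base do.call(paste, ...) for invoke() and uses stringr's str_squish() to tidy the spacing, and res is an illustrative name.

```r
library(dplyr)
library(tibble)
library(stringr)

# The data frame from the question
df <- tribble(
  ~ID, ~A1, ~A2, ~A3,
  1, 0, 1, 1,
  2, 0, 1, 1,
  3, 1, 1, 1,
  4, 1, 0, 1,
  5, 0, 1, 0
)

res <- df %>%
  mutate(New_Col = str_squish(           # trim and collapse repeated spaces
    do.call(paste,                       # paste the three columns row by row, sep = " "
            across(A1:A3, ~ c("", cur_column())[. + 1]))  # 0 -> "", 1 -> column name
  ))

print(res)
```

This assumes the columns contain only 0 and 1, since the values are used directly as positions into a two-element vector.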

edited May 30, 2021 at 19:57

answered May 30, 2021 at 19:43

tmfmnk · Accepted Answer · 2021-05-30 11:18:01Z

3

One option that also involves purrr could be:

library(dplyr)
library(purrr)

df %>%
 mutate(New_Col = pmap_chr(across(-ID), 
                           ~ paste(names(c(...))[which(c(...) == 1)], collapse = " ")))

     ID    A1    A2    A3 New_Col 
  <dbl> <dbl> <dbl> <dbl> <chr>   
1     1     0     1     1 A2 A3   
2     2     0     1     1 A2 A3   
3     3     1     1     1 A1 A2 A3
4     4     1     0     1 A1 A3   
5     5     0     1     0 A2
