How to turn a list of strings into data frame names with for loop in R?

Question

I would like some advice on this problem in R. I have a data frame "my_fruits_data" with many columns, including the index columns listed below in name_cols. I want to filter on those index columns one at a time with a for loop and store the filtered records in separate data frames, named as listed in df_fruits, for post-processing. This doesn't work, apparently because the elements of df_fruits are strings rather than actual data frame names. I've searched and found a few hints, but none of them actually helped.

# column names
name_cols <- c("Index_apple",  
             "Index_pear",
             "Index_orange",  
             "Index_watermelon",
             "Index_strawberry"
         )
# dataframe names for filtered result 
df_fruits <- c("df_apple",  
             "df_pear",
             "df_orange",  
             "df_watermelon",
             "df_strawberry")

for (i in name_cols) 
{  
    df_fruits[i] <- my_fruits_data %>% 
           filter (.data[[name_cols[i]]] ==1) 
    ......
}

Thanks chase77

It helps to have usable data for questions, making it a complete "minimal working example"; please include sample data (reprex) that we can use, preferably with dput(x); see stackoverflow.com/q/5963269, minimal reproducible example, and stackoverflow.com/tags/r/info. Ultimately, I feel a for loop is unlikely to be the preferred method for this; can you show what you intend to have at the end of all of this processing? It's likely R has a more efficient way to approach what you need.
– r2evans, Commented Dec 20, 2021 at 6:24
This is simply data splitting/data grouping. You do not need to use for loops. Give an example of your data and the expected output. Also, what do you mean by further processing? If you are going to do nearly the same post-processing for each fruit dataset, you should group the whole dataset rather than splitting it into separate fruit datasets.
– Onyambu, Commented Dec 20, 2021 at 6:29

kybazzi · Accepted Answer · 2021-12-20 07:24:54Z

1

I understood that you want to split your data based on the type of fruit, which is provided by separate index columns. Here is how to do that with an example dataset.

library(tidyverse)
my_fruits_data = tribble(
  ~ index_apple, ~ index_pear, ~index_banana, ~ x1,
  1, 0, 0, 10,
  1, 0, 0, 11,
  0, 1, 0, 12,
  0, 0, 1, 13,
  0, 0, 1, 14, 
  0, 0, 1, 15
)

The example data:

> my_fruits_data
# A tibble: 6 x 4
  index_apple index_pear index_banana    x1
        <dbl>      <dbl>        <dbl> <dbl>
1           1          0            0    10
2           1          0            0    11
3           0          1            0    12
4           0          0            1    13
5           0          0            1    14
6           0          0            1    15

First you can transform the data to have a single fruit column that mentions the type of fruit:

fruit_data = my_fruits_data %>% 
  pivot_longer(
    cols = starts_with("index_"), 
    names_prefix = "index_", 
    names_to = "fruit",
    values_to = "fruit_ind"
  ) %>% 
  filter(fruit_ind == 1) %>% 
  select(-fruit_ind)

The result:

> fruit_data
# A tibble: 6 x 2
     x1 fruit 
  <dbl> <chr> 
1    10 apple 
2    11 apple 
3    12 pear  
4    13 banana
5    14 banana
6    15 banana

Finally, as @Onyambu mentioned, you could consider grouping this data by our new variable fruit. If you wanted to do different processing for different fruits, you could split() the data to get a list of separate data frames for each fruit:

> split(fruit_data, fruit_data$fruit)
$apple
# A tibble: 2 x 2
     x1 fruit
  <dbl> <chr>
1    10 apple
2    11 apple

$banana
# A tibble: 3 x 2
     x1 fruit 
  <dbl> <chr> 
1    13 banana
2    14 banana
3    15 banana

$pear
# A tibble: 1 x 2
     x1 fruit
  <dbl> <chr>
1    12 pear
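
A minimal sketch of the post-processing step, assuming the long-format example data from above and a hypothetical per-fruit summary (the mean of x1). The names fruit_list and fruit_means are illustrative only. The list returned by split() can be passed to lapply(), so the same code runs once per fruit without creating a separate variable for each one:

```r
library(dplyr)

# Long-format example data, matching fruit_data from the answer above
fruit_data <- tibble(
  x1    = c(10, 11, 12, 13, 14, 15),
  fruit = c("apple", "apple", "pear", "banana", "banana", "banana")
)

# One data frame per fruit, stored in a single named list
fruit_list <- split(fruit_data, fruit_data$fruit)

# Apply the same (hypothetical) processing to every element of the list
fruit_means <- lapply(fruit_list, function(df) mean(df$x1))

# Elements are looked up by name, so a string is enough to reach a data frame
fruit_list[["banana"]]
fruit_means[["banana"]]
```

Because the list elements are addressed by name, a character vector of fruit names can drive any later loop directly.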

6 Comments

chase77 Over a year ago

Thank you so much Kybazzi for the detailed demo to get around the problem, and also to Onyambu and r2evans for the ideas. I'll try it; it should work. But this problem prompted me to search for a way to turn a string into a data frame name, and the only idea I found was the function assign(): assign(string, df_apple %>% filter(.data[[Index_fruits[1]]] ==1)). But this method isn't convenient for my case. I would like some generic ideas for using a string as a data frame name.

kybazzi Over a year ago

I don't think it's a recommended approach to try using assign() in this way. Why do you want to do that instead of something similar to the solution I've shown here?
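
For reference, a sketch of the loop from the question that stores each result in a named list instead of calling assign(). The list name fruit_dfs and the way element names are derived from column names are assumptions made for illustration, and the data is the example tibble from the answer rather than the asker's real data:

```r
library(dplyr)

# Example data from the answer (not the asker's actual data)
my_fruits_data <- tribble(
  ~index_apple, ~index_pear, ~index_banana, ~x1,
  1, 0, 0, 10,
  1, 0, 0, 11,
  0, 1, 0, 12,
  0, 0, 1, 13,
  0, 0, 1, 14,
  0, 0, 1, 15
)

name_cols <- c("index_apple", "index_pear", "index_banana")

# Hypothetical container: one named list instead of many separate variables
fruit_dfs <- list()

for (col in name_cols) {
  # Derive the element name from the column name, e.g. "df_apple"
  df_name <- sub("^index_", "df_", col)
  # col already holds the column name, so it goes straight into .data[[ ]]
  fruit_dfs[[df_name]] <- my_fruits_data %>%
    filter(.data[[col]] == 1)
}

names(fruit_dfs)
nrow(fruit_dfs[["df_banana"]])
```

In the question's loop, i iterates over the column names themselves, so name_cols[i] indexes an unnamed vector by name and returns NA. Using the loop variable directly and writing into a list with [[ ]] avoids both that and the need for assign().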

chase77 Over a year ago

Because there are follow-up analyses, e.g. using summarise(). I don't want to copy the same code multiple times for different fruits (over 50 types in my actual case). That's why I'm trying to use a loop.

kybazzi Over a year ago

In my code, you can summarize results on fruit_data, such as fruit_data %>% group_by(fruit) %>% summarise(x = mean(x1)). I still don't understand why you want to create a large number of variables using assign().
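
A runnable version of the grouped summary mentioned in the comment above, assuming the long-format fruit_data from the answer. The result name fruit_summary is illustrative:

```r
library(dplyr)

# Long-format example data, matching fruit_data from the answer
fruit_data <- tibble(
  x1    = c(10, 11, 12, 13, 14, 15),
  fruit = c("apple", "apple", "pear", "banana", "banana", "banana")
)

# One row per fruit; the summary is computed for all fruits in a single call
fruit_summary <- fruit_data %>%
  group_by(fruit) %>%
  summarise(x = mean(x1))

fruit_summary
```

The same call covers any number of fruit types, so no per-fruit copy of the code is needed.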
