I want to convert a dataframe in this format (tbl) to the following nested list (tbllst):

library(tidyr)

tbl <- tribble(
  ~Col1, ~Col2, ~Col3,
  "Var1", "Var1_1", "Var1_1_1", 
  "Var1", "Var1_1", "Var1_1_2", 
  "Var1", "Var1_2", "Var1_2_1", 
  "Var1", "Var1_2", "Var1_2_2", 
)

tbllst <- list(
  Col1 = list(
    "Var1" = list(
      Col2 = list(
        "Var1_1" = list(
          Col3 = c(
            "Var1_1_1", 
            "Var1_1_2"
          )
        ),
        "Var1_2" = list(
          Col3 = c(
            "Var1_2_1", 
            "Var1_2_2"
          )
        )
      )
    )
  )
)

Is there an automated way of achieving this?

2 Answers

The rrapply() function in the rrapply package has an option how = "unmelt" that converts a melted data.frame to a nested list: each row of the data.frame becomes a node path in the nested list.

To apply this function, we first need to transform the tbl data.frame into the input format required by rrapply():

library(purrr)
library(dplyr)
library(rrapply)

## put data.frame in format for rrapply-function
tbl1 <- imap_dfc(tbl, ~bind_cols(.y, .x)) %>%
  group_by(across(num_range(prefix = "...", range = 1:5))) %>%
  summarize(`...6` = list(c(`...6`)))

tbl1
#> # A tibble: 2 x 6
#> # Groups:   ...1, ...2, ...3, ...4 [2]
#>   ...1  ...2  ...3  ...4   ...5  ...6     
#>   <chr> <chr> <chr> <chr>  <chr> <list>   
#> 1 Col1  Var1  Col2  Var1_1 Col3  <chr [2]>
#> 2 Col1  Var1  Col2  Var1_2 Col3  <chr [2]>

## unmelt to nested list
ls_tbl <- rrapply(tbl1, how = "unmelt")

str(ls_tbl)
#> List of 1
#>  $ Col1:List of 1
#>   ..$ Var1:List of 1
#>   .. ..$ Col2:List of 2
#>   .. .. ..$ Var1_1:List of 1
#>   .. .. .. ..$ Col3: chr [1:2] "Var1_1_1" "Var1_1_2"
#>   .. .. ..$ Var1_2:List of 1
#>   .. .. .. ..$ Col3: chr [1:2] "Var1_2_1" "Var1_2_2"

Note that the group_by() and summarize() steps serve only to collect the multiple Var1_*_* values under a single Col3 node. The following is considerably simpler, but it produces a separate Col3 node for each value instead:

ls_tbl <- rrapply(imap_dfc(tbl, ~bind_cols(.y, .x)), how = "unmelt")

str(ls_tbl)
#> List of 1
#>  $ Col1:List of 1
#>   ..$ Var1:List of 1
#>   .. ..$ Col2:List of 2
#>   .. .. ..$ Var1_1:List of 2
#>   .. .. .. ..$ Col3: chr "Var1_1_1"
#>   .. .. .. ..$ Col3: chr "Var1_1_2"
#>   .. .. ..$ Var1_2:List of 2
#>   .. .. .. ..$ Col3: chr "Var1_2_1"
#>   .. .. .. ..$ Col3: chr "Var1_2_2"
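
For comparison, the same nesting can also be sketched in base R without rrapply. This is only a minimal sketch and not part of the approach above: nest_columns() is a hypothetical helper name, and it assumes the columns are ordered from the outermost to the innermost level. Also note that split() orders the groups by sorted key rather than by row order.

```r
## example data, same content as tbl in the question
df <- data.frame(
  Col1 = c("Var1", "Var1", "Var1", "Var1"),
  Col2 = c("Var1_1", "Var1_1", "Var1_2", "Var1_2"),
  Col3 = c("Var1_1_1", "Var1_1_2", "Var1_2_1", "Var1_2_2"),
  stringsAsFactors = FALSE
)

## hypothetical helper: nest a data.frame column by column
nest_columns <- function(df) {
  col <- names(df)[1]
  if (ncol(df) == 1) {
    ## innermost column: keep its values as a plain vector
    return(setNames(list(df[[1]]), col))
  }
  ## split the remaining columns by the values of the first column
  ## and recurse into each group
  groups <- split(df[-1], df[[1]])
  setNames(list(lapply(groups, nest_columns)), col)
}

nested <- nest_columns(df)
str(nested)
```

Because the recursion handles one column per level, the same function should work for any number of columns, as long as the last column holds the leaf values.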

Here is another option using data.table and rrapply:

library(data.table)
library(rrapply)

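## setDT() converts tbl to a data.table by reference;
## Map() pairs each column name with that column's values
dt <- setDT(tbl)[, Map(function(...) list2DF(.(...)), names(.SD), .SD)]
## group by all columns except the last, wrap the last column in a list, then unmelt
rrapply(dt[, lapply(.SD, list), c(head(names(dt), -1))], how = "unmelt")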

which gives

$Col1
$Col1$Var1
$Col1$Var1$Col2
$Col1$Var1$Col2$Var1_1
$Col1$Var1$Col2$Var1_1$Col3
[1] "Var1_1_1" "Var1_1_2"


$Col1$Var1$Col2$Var1_2
$Col1$Var1$Col2$Var1_2$Col3
[1] "Var1_2_1" "Var1_2_2"
