I need to dynamically change the order of the columns in a data frame from this layout (current_dataframe below)

to this layout (required_dataframe below).

The problem is that I have a lot of NAs in the values and also within the first row of the data. Is it possible to reorder a two-level header in R?

current_dataframe <- data.frame(
  X1 = c(NA, NA, "banana", "strawberry"),
  X2 = c("Supermarkt", "Turnover", 1, 5),
  X3 = c(NA, "Turnover in EUR", 2, 6),
  X4 = c(NA, "Turnover  absolut", 3, 7),
  X5 = c(NA, "Sales Weeks", 4, 8),
  X6 = c("Bakery", "Turnover", 20, 24),
  X7 = c(NA, "Turnover in EUR", 21, 25),
  X8 = c(NA, "Turnover  absolut", 22, 26),
  X9 = c(NA, "Sales Weeks", 23, 27)
)
required_dataframe <- data.frame(
  X1 = c(NA, NA, "banana", "strawberry"),
  X2 = c("Turnover", "Supermarkt", 1, 5),
  X3 = c(NA, "Bakery", 20, 24),
  X4 = c("Turnover in EUR", "Supermarkt", 2, 6),
  X5 = c(NA, "Bakery", 21, 25),
  X6 = c("Turnover  absolut", "Supermarkt", 3, 7),
  X7 = c(NA, "Bakery", 22, 26),
  X8 = c("Sales Weeks", "Supermarkt", 4, 8),
  X9 = c(NA, "Bakery", 23, 27)
)

1 Answer

While this is not a good way to structure your data (the columns end up with mixed types, for example), here is one way to do it:

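d <- current_dataframe
# Column order: group columns by their second-row label, keeping first-seen order
ord <- order(match(b <- unlist(d[2, ]), b))
e <- d[, ord]
# Fill the first-row labels forward, then apply the same column order to them
row1 <- zoo::na.locf0(unlist(d[1, ]))[ord]
# Show each second-row label only once
row2 <- replace(f <- unlist(e[2, ]), duplicated(f), NA)
rbind(row2, row1, e[-(1:2), ])
          X1         X2     X6              X3     X7                X4     X8          X5     X9
1       <NA>   Turnover   <NA> Turnover in EUR   <NA> Turnover  absolut   <NA> Sales Weeks   <NA>
2       <NA> Supermarkt Bakery      Supermarkt Bakery        Supermarkt Bakery  Supermarkt Bakery
3     banana          1     20               2     21                 3     22           4     23
4 strawberry          5     24               6     25                 7     26           8     27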
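
For reuse on a larger dataset, the same steps can be wrapped in a function. The following is only a sketch in base R, not part of the answer above: reorder_header is a hypothetical name, the forward fill is a hand-written stand-in for zoo::na.locf0, and it assumes character columns (the default in R >= 4.0) with the group labels in row 1 and the measure labels in row 2. It coerces the input with as.data.frame and uses drop = FALSE, because an input that is not a plain data frame, or a selection that collapses to a vector, is one possible cause of an "incorrect number of dimensions" error.

```r
# Hypothetical helper: reorder a data frame whose first two rows act as a
# two-level header (row 1 = group label, row 2 = measure label).
reorder_header <- function(d) {
  # Work on a plain data.frame so that d[i, ] and d[, j] behave as expected
  d <- as.data.frame(d, stringsAsFactors = FALSE)

  # Carry the last non-NA group label forward along row 1
  group <- unlist(d[1, ], use.names = FALSE)
  filled <- !is.na(group)
  group <- c(NA, group[filled])[cumsum(filled) + 1L]

  # Order columns by the first appearance of their measure label;
  # columns sharing a label keep their original relative order
  measure <- unlist(d[2, ], use.names = FALSE)
  ord <- order(match(measure, measure))

  e <- d[, ord, drop = FALSE]
  measure <- measure[ord]
  measure[duplicated(measure)] <- NA  # show each measure label only once

  # New header: measure row first, group row second, then the data rows
  rbind(measure, group[ord], e[-(1:2), , drop = FALSE])
}
```

Calling reorder_header(current_dataframe) on the example data should give the values of required_dataframe, with the columns keeping their original names (X1, X2, X6, X3, ...).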

4 Comments

Thank you! That's helpful. After executing the line with row2, I'm getting an error: Error in e[2, ] : incorrect number of dimensions. Do you maybe know how to debug it?
@Fendi When you copy and paste the code exactly as it is, do you get an error? I do not get an error message, so I cannot really tell why you do.
I'm applying the code to my original dataset, which has multiple columns and rows.
@Fendi Try the logic. Note that the structure of the data is incorrect. You'd rather have data with column names separated by _, e.g. Turnover_Supermarket. But you have the names within the data. Not very tidy.
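
The layout suggested in the last comment, with both header levels folded into the column names, could look roughly like the sketch below. It is an illustration only: tidy_header is a hypothetical name, the measure_group naming order is an assumption based on the Turnover_Supermarket example, and the input is assumed to have character columns with group labels in row 1 and measure labels in row 2.

```r
# Hypothetical helper: move the two header rows into the column names so the
# remaining rows contain only data.
tidy_header <- function(d) {
  d <- as.data.frame(d, stringsAsFactors = FALSE)

  # Fill the group labels in row 1 forward
  group <- unlist(d[1, ], use.names = FALSE)
  filled <- !is.na(group)
  group <- c(NA, group[filled])[cumsum(filled) + 1L]

  measure <- unlist(d[2, ], use.names = FALSE)

  # Columns without a measure label (such as the item column) keep their name
  new_names <- ifelse(is.na(measure), names(d), paste(measure, group, sep = "_"))

  out <- d[-(1:2), , drop = FALSE]
  names(out) <- new_names
  rownames(out) <- NULL

  # Turn character columns that hold numbers back into numeric columns
  out[] <- lapply(out, type.convert, as.is = TRUE)

  # Put columns that share a measure label next to each other
  out[, order(match(measure, measure)), drop = FALSE]
}
```

With names such as Turnover_Supermarkt and Turnover_Bakery, each column holds a single type, and reordering becomes an ordinary column selection instead of a manipulation of header rows stored as data.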
