In my current project, I am reading data from a CSV file in R and trying to build a hierarchical JSON structure from it.

Sample data (the dataset has been reduced for simplicity):

    Country   Provider   2 G Data   3 G Data    LTE     FP0   anfang0   2G  3G  FP1 anfang1
     ABC        A1          n          n         n      fp0      j      NA  NA  NA  NA
     ABC        A2          NA         NA        NA      NA      NA      j   j  fp1 n
     ABC        A3          n          n         n       fp0     j      NA  NA  NA  NA
     DEF        A7          j          j         j       fp0     n       j   j  fp1 n

Understanding the data: n means no, j means yes, and NA means the value is missing. FP0 and FP1 describe the same provider in different areas. Each row therefore holds two groups of columns: 2 G Data, 3 G Data, LTE, FP0 and anfang0 belong to the first group, while 2G, 3G, FP1 and anfang1 belong to the second. If every flag in a group is n (no), the corresponding anfang0 or anfang1 value should be reported instead.
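To make the rule above concrete, here is a minimal sketch of it for a single provider in a single area. The helper name `area_entry` and its arguments are hypothetical, and the sketch assumes that flags with the value j are reported as-is and that the anfang value is used only when no flag is j.

```r
# Hypothetical helper: build the record for one provider in one area.
# `flags` is a named character vector, e.g. c(`2G` = "n", `3G` = "n", LTE = "n");
# `anfang` is the matching anfang0/anfang1 value.
area_entry <- function(provider, flags, anfang) {
  yes <- flags[!is.na(flags) & flags == "j"]     # keep only the "yes" flags
  if (length(yes) > 0) {
    c(list(provider = provider), as.list(yes))   # report the flags that are "j"
  } else {
    list(provider = provider, anfang = anfang)   # no "j" flag: fall back to anfang
  }
}

a1 <- area_entry("A1", c(`2G` = "n", `3G` = "n", LTE = "n"), "j")
a7 <- area_entry("A7", c(`2G` = "j", `3G` = "j", LTE = "j"), "n")
str(a1)   # provider and anfang only
str(a7)   # provider plus the three "j" flags
```

Each returned list maps to one JSON object in the desired output.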

The sample output is shown below (based on the above explanation):

    {
      "ABC": {
        "fp0":[
          {
            "provider": "A1",
            "anfrage": "j"
          },
          {
            "provider": "A3",
            "anfrage": "j"
          }
        ],
        "fp1": [
          {
            "provider": "A2",
            "2G": "j",
            "3G": "j"
          }
        ]
      },
      "DEF": {
        "fp1": [
          {
            "provider": "A7",
            "2G": "j",
            "3G": "j"
          }  
        ],
        "fp0": [
          {
            "provider": "A7",
            "2G": "j",
            "3G": "j",
            "LTE": "j"
          }  
        ]       
      }  
    }

In this JSON format, each Country should appear as exactly one block, as shown above. So far I have tried to follow this link but could not find a working solution.
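For reference, the target shape corresponds to a nested R list, sketched here by hand for the ABC block only. This assumes the jsonlite package and uses the key anfang (the column name) rather than the anfrage shown in the sample output.

```r
library(jsonlite)

# Each country is a named list of areas; each area is an *unnamed* list of
# provider records, so it serialises as a JSON array of objects.
target <- list(
  ABC = list(
    fp0 = list(
      list(provider = "A1", anfang = "j"),
      list(provider = "A3", anfang = "j")
    ),
    fp1 = list(
      list(provider = "A2", `2G` = "j", `3G` = "j")
    )
  )
)

# auto_unbox = TRUE writes length-1 vectors as scalars ("j") instead of arrays (["j"]).
json <- toJSON(target, auto_unbox = TRUE, pretty = TRUE)
json
```

Any solution therefore only has to produce a list of this shape, one top-level element per country, before calling `toJSON()` once.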

a <- c()   # collects one JSON string per row
for (i in seq_len(nrow(data))) {
  a <- c(a, jsonlite::toJSON(list(list('fp0' =
    list("provider" = data$Provider[i], "2g" = data$`2 G Data`[i],
         "3g" = data$`3 G Data`[i], "LTE" = data$LTE[i]))), pretty = TRUE))
}
jsonlite::toJSON(a, pretty = TRUE, auto_unbox = TRUE)

Kindly let me know in case you need more clarity.

  • Reproducible data would be nice instead of a picture of a table. Commented Jun 18, 2018 at 9:02
  • @snoram- I have added sample data could you please check. Commented Jun 18, 2018 at 9:15

1 Answer

One approach could be:

library(dplyr)
library(jsonlite)

# Data pre-processing: stack the two areas' columns as rows of one data frame
df1 <- df[, 1:7] %>%                          # columns for the first area, i.e. fp0
  na.omit() %>%
  `colnames<-`(c("country", "provider", "2G", "3G", "LTE", "fp", "anfang")) %>%
  bind_rows(
    df[, c(1:2, 8:ncol(df))] %>%              # columns for the second area, i.e. fp1
      na.omit() %>%
      `colnames<-`(c("country", "provider", "2G", "3G", "fp", "anfang"))
    )
df1[df1 == 'n'] <- NA                         # convert every "n" to NA, since "n" values are not wanted in the final output

# Convert the processed data frame to a nested list: country -> fp -> provider rows
dfList <- lapply(split(df1, df1$country), 
                 function(x) split(x[, c("provider", "2G", "3G", "LTE", "anfang")], x$fp))

# Final result: convert the list to JSON (toJSON omits NA fields from data frame rows by default)
json_out <- toJSON(dfList, auto_unbox = TRUE)

which gives

> json_out
{"ABC":{"fp0":[{"provider":"A1","anfang":"j"},{"provider":"A3","anfang":"j"}],"fp1":[{"provider":"A2","2G":"j","3G":"j"}]},"DEF":{"fp0":[{"provider":"A7","2G":"j","3G":"j","LTE":"j"}],"fp1":[{"provider":"A7","2G":"j","3G":"j"}]}}
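The missing keys in this output (for example, no LTE for A1) come from how jsonlite serialises data frames: by default, NA cells are left out of each row object. A small sketch on a made-up two-row data frame (the name `demo` is just for illustration) shows the effect, together with the `na = "null"` option for keeping the keys:

```r
library(jsonlite)

demo <- data.frame(
  provider = c("A1", "A7"),
  LTE      = c(NA, "j"),
  anfang   = c("j", NA),
  stringsAsFactors = FALSE
)

# Default: NA cells are omitted from each row object.
dropped <- toJSON(demo)
dropped
# [{"provider":"A1","anfang":"j"},{"provider":"A7","LTE":"j"}]

# na = "null" keeps every key and writes an explicit null for NA cells.
kept <- toJSON(demo, na = "null")
kept
```

This is why replacing "n" with NA before the conversion is enough to remove those fields from the result.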


Sample data:

df <- structure(list(Country = c("ABC", "ABC", "ABC", "DEF"), Provider = c("A1", 
"A2", "A3", "A7"), `2 G Data` = c("n", NA, "n", "j"), `3 G Data` = c("n", 
NA, "n", "j"), LTE = c("n", NA, "n", "j"), FP0 = c("fp0", NA, 
"fp0", "fp0"), anfang0 = c("j", NA, "j", "n"), `2G` = c(NA, "j", 
NA, "j"), `3G` = c(NA, "j", NA, "j"), FP1 = c(NA, "fp1", NA, 
"fp1"), anfang1 = c(NA, "n", NA, "n")), .Names = c("Country", 
"Provider", "2 G Data", "3 G Data", "LTE", "FP0", "anfang0", 
"2G", "3G", "FP1", "anfang1"), class = "data.frame", row.names = c(NA, 
-4L))

#  Country Provider 2 G Data 3 G Data  LTE  FP0 anfang0   2G   3G  FP1 anfang1
#1     ABC       A1        n        n    n  fp0       j <NA> <NA> <NA>    <NA>
#2     ABC       A2     <NA>     <NA> <NA> <NA>    <NA>    j    j  fp1       n
#3     ABC       A3        n        n    n  fp0       j <NA> <NA> <NA>    <NA>
#4     DEF       A7        j        j    j  fp0       n    j    j  fp1       n
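If the data comes from a CSV file, as in the question, it could be read roughly as follows. This is only a sketch: the real file path and separator are not given, so it writes a comma-separated stand-in to a temporary file first.

```r
# Stand-in for the real CSV file (path and separator are assumptions).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Country,Provider,2 G Data,3 G Data,LTE,FP0,anfang0,2G,3G,FP1,anfang1",
  "ABC,A1,n,n,n,fp0,j,NA,NA,NA,NA",
  "ABC,A2,NA,NA,NA,NA,NA,j,j,fp1,n",
  "ABC,A3,n,n,n,fp0,j,NA,NA,NA,NA",
  "DEF,A7,j,j,j,fp0,n,j,j,fp1,n"
), csv_path)

df <- read.csv(
  csv_path,
  check.names = FALSE,        # keep headers such as "2 G Data" and "2G" unchanged
  stringsAsFactors = FALSE,   # keep text columns as character, not factor
  na.strings = "NA"           # treat the literal NA as a missing value
)
str(df)
```

Without `check.names = FALSE`, `read.csv()` rewrites headers that are not valid R names, which would change the column names relative to the sample data above.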

2 Comments

Let me look at this and get back to you. Thanks for the effort.
Thanks @Prem for the perfect answer.
