I have a fairly specific question about converting JSON data in R. I am working with data from a reaction time test. The data contains some basic information on id, gender, and age and is in CSV format. However, the data for the reaction task is delivered as a JSON array with the following structure: ["stimulus 1", "stimulus 2", answer chosen, reaction time].

This is an example of what the data looks like, just to give you a basic idea (the JSON array is in fact much longer in the original data):

id     gender   age    reaction_task

HU3    male     34     [["prime1", "target2", 1, 1560], ["prime7", "target6", 2, 1302], ["prime4", "target5", 2, 996]]

I am quite a novice in R and am looking for a way to convert this JSON array into multiple R columns, for instance like this:

trial1_stimulus1     trial1_stimulus2   trial1_answer     trial1_time     trial2_stimulus1    trial2_stimulus2    etc

prime1               target2            1                 1560            prime7              target6

I found out how to separate the values from one another using the following command:


df <- cbind(df, read.table(text = as.character(df$reaction_task), sep = ",", fill=TRUE) )

It worked, but turned out to be quite laborious, as I still had to remove the [] from the data manually. So I was wondering whether there is a smoother way to deal with this task?

I was trying the following code as well, but got an error message:

purrr::map_dfr(sosci$A101oRAW, jsonlite::fromJSON)
Error: parse error: premature EOF
                                       
                     (right here) ------^

Thanks for your help!

Edit: Thanks a lot to Maydin for the answer provided! It works well for the example data, but when the data frame contains more than one person, I get almost the same error as before:

id <- c("HU3", "AB0", "IO9")
gender <- c("male", "female", "male")
age <- c(34, 87, 23)
task <- c("[[\"prime1\", \"target2\", 2, 1529], [\"prime7\", \"target6\", 2, 829], [\"prime4\", \"target5\", 1, 1872]]",
          "[[\"prime1\", \"target2\", 1, 1560], [\"prime7\", \"target6\", 2, 1302], [\"prime4\", \"target5\", 2, 996]]",
          "[[\"prime1\", \"target2\", 1, 679], [\"prime7\", \"target6\", 1, 2090], [\"prime4\", \"target5\", 1, 528]]")
df <- data.frame(id, gender, age, task)

library(jsonlite)
library(dplyr)

df2 <- data.frame(df[,1:3],fromJSON(as.character(df[,"task"])))
parse error: trailing garbage
          rime4", "target5", 1, 1872]] [["prime1", "target2", 1, 1560]
                     (right here) ------^

1 Answer

library(jsonlite)

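# Parse each row's JSON separately; fromJSON() expects a single JSON document
df2 <- lapply(seq_len(nrow(df)), function(x) {
  data.frame(df[x, 1:3],
             fromJSON(as.character(df[x, "task"])),
             row.names = NULL)
})

# Stack the per-person data frames into one long data frame
df2 <- do.call(rbind, df2)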

df2

    
   id gender age     X1      X2 X3   X4
1 HU3   male  34 prime1 target2  2 1529
2 HU3   male  34 prime7 target6  2  829
3 HU3   male  34 prime4 target5  1 1872
4 AB0 female  87 prime1 target2  1 1560
5 AB0 female  87 prime7 target6  2 1302
6 AB0 female  87 prime4 target5  2  996
7 IO9   male  23 prime1 target2  1  679
8 IO9   male  23 prime7 target6  1 2090
9 IO9   male  23 prime4 target5  1  528
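
As far as I can tell, the "trailing garbage" error in the question comes from passing the whole column at once: fromJSON() appears to join a character vector of length greater than one into a single string, so the parser finds a second top-level array right after the first. The sketch below uses a hypothetical two-row vector in the question's format to show the difference between parsing the whole vector and parsing one element at a time.

```r
library(jsonlite)

# Hypothetical two-person example in the same format as the question's data
task <- c('[["prime1", "target2", 2, 1529], ["prime7", "target6", 2, 829]]',
          '[["prime1", "target2", 1, 1560], ["prime7", "target6", 2, 1302]]')

# Passing the whole vector fails, so catch the error instead of stopping
whole <- tryCatch(fromJSON(task), error = function(e) "parse error")

# Parsing one element at a time gives one matrix per person,
# with one row per trial and one column per field
per_person <- lapply(task, fromJSON)
dim(per_person[[1]])
```

Because each inner array mixes strings and numbers, the parsed values come back as character, so the answer and time columns may need as.numeric() afterwards.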

I think the output above is in a nicer format, but if you would like to convert this into columns, you can use:

library(tidyr)

pivot_wider(data = df2, 
            id_cols = c("id","gender","age"), 
            names_from = c("X1","X2","X3","X4"), 
            values_from =c("X1","X2","X3","X4")) %>% as.data.frame()

You can change the names of the columns if you want later on by using colnames() etc.
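
If the goal is the trial1_stimulus1, trial1_stimulus2, ... layout from the question, a base R route that names the columns directly might be easier than renaming afterwards. This is only a sketch: flatten_trials() is a hypothetical helper, the field names are taken from the question's description of the array, and it assumes every person has the same number of trials.

```r
library(jsonlite)

# Small stand-in for the question's data (two people, two trials each)
df <- data.frame(
  id     = c("HU3", "AB0"),
  gender = c("male", "female"),
  age    = c(34, 87),
  task   = c('[["prime1", "target2", 2, 1529], ["prime7", "target6", 2, 829]]',
             '[["prime1", "target2", 1, 1560], ["prime7", "target6", 2, 1302]]'),
  stringsAsFactors = FALSE
)

# Hypothetical helper: turn one JSON string into a one-row data frame
# with columns trial1_stimulus1, trial1_stimulus2, trial1_answer, trial1_time, ...
flatten_trials <- function(json) {
  m <- fromJSON(json)                 # character matrix, one row per trial
  fields <- c("stimulus1", "stimulus2", "answer", "time")
  values <- as.vector(t(m))           # row-major order: trial 1 first, then trial 2
  names(values) <- paste0("trial", rep(seq_len(nrow(m)), each = ncol(m)), "_", fields)
  as.data.frame(as.list(values), stringsAsFactors = FALSE)
}

# One row per person: the id columns plus the flattened trials
wide <- cbind(df[, c("id", "gender", "age")],
              do.call(rbind, lapply(df$task, flatten_trials)))

# Values parsed from the JSON are character, so convert the numeric fields
num_cols <- grep("_(answer|time)$", names(wide))
wide[num_cols] <- lapply(wide[num_cols], as.numeric)
```

With the real data, df$task would be replaced by the actual column name, wrapped in as.character() if the column was read in as a factor.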

Data:

df <- structure(list(id = structure(1L, .Label = "HU3", class = "factor"), 
    gender = structure(1L, .Label = "male", class = "factor"), 
    age = structure(1L, .Label = "34", class = "factor"), reaction_task = structure(1L, .Label = "[[\"prime1\", \"target2\", 1, 1560], [\"prime7\", \"target6\", 2, 1302], [\"prime4\", \"target5\", 2, 996]]", class = "factor")), class = "data.frame", row.names = c(NA, 
-1L))
1 Comment

Thanks a lot for the answer. It worked well for the example. However, when working with the real data frame, it gives me the same error as before. I created an example data frame containing three people instead of only one and added it to my original question.
