I have data in one of my DataFrame's columns with the following schema:

<type 'list'>: [StructField(data,StructType(List(StructField(account,StructType(List(StructField(Id,StringType,true),StructField(Name,StringType,true),StructField(books,ArrayType(StructType(List(StructField(bookTile,StringType,true),StructField(bookId,StringType,true),StructField(bookName,StringType,true))),true),true)))))))]

I want to iterate over the nested fields, extract each value, and create a new DataFrame from them. Does PySpark have built-in functions that support this, or do I have to iterate over them myself? Is there an efficient way to do it?

  • There is an explode function that will put each element of the array on its own row. Is that what you want? Commented Oct 23, 2019 at 9:03
  • I tried it but it gave me "due to data type mismatch: input to function explode should be array or map type, not struct" Commented Oct 23, 2019 at 9:11
  • Ah, I may have misunderstood you. It would be a bit clearer if you could add an example input/expected output dataframe to the question. However, it could be that you are looking for how to expand a struct: stackoverflow.com/questions/38753898/… or maybe this: stackoverflow.com/questions/39275816/… Commented Oct 23, 2019 at 9:32
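
Putting the comments together: the error appears because `explode` was given a struct, while the array in this schema sits at `data.account.books`. A minimal sketch of one way to flatten it is shown below, assuming the column is named `data` exactly as in the schema above. The names `leaf_paths`, `flatten_account`, `BOOK_FIELDS` and the alias `book` are hypothetical and chosen only for illustration.

```python
# Fields of each element of data.account.books, copied from the schema above.
BOOK_FIELDS = ["bookTile", "bookId", "bookName"]


def leaf_paths(field_type, prefix=""):
    """List dotted paths to every leaf field of a schema dict.

    `field_type` is expected to be what `df.schema.jsonValue()` returns.
    Arrays get a trailing "[]" because they have to be exploded before
    their element fields can be selected as ordinary columns.
    """
    if isinstance(field_type, dict) and field_type.get("type") == "struct":
        paths = []
        for field in field_type["fields"]:
            child = prefix + "." + field["name"] if prefix else field["name"]
            paths.extend(leaf_paths(field["type"], child))
        return paths
    if isinstance(field_type, dict) and field_type.get("type") == "array":
        return leaf_paths(field_type["elementType"], prefix + "[]")
    # Primitive types (and anything not handled above) are treated as leaves.
    return [prefix]


def flatten_account(df):
    """Return one row per book, with the account fields repeated on each row."""
    # Imported lazily so that leaf_paths can be used without Spark installed.
    from pyspark.sql import functions as F

    exploded = df.select(
        F.col("data.account.Id").alias("Id"),
        F.col("data.account.Name").alias("Name"),
        # explode needs the array column itself, not the enclosing struct.
        F.explode("data.account.books").alias("book"),
    )
    book_columns = [F.col("book." + name).alias(name) for name in BOOK_FIELDS]
    return exploded.select("Id", "Name", *book_columns)
```

Calling `leaf_paths(df.schema.jsonValue())` should show which paths are plain struct fields and which pass through an array, and `flatten_account(df)` then returns the flat DataFrame without any Python-side iteration over rows. Note that `explode` drops rows whose array is null or empty; `explode_outer` keeps them with null book fields, if that is what you need.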
