How to parse rows of a small DataFrame as json strings?

Question

I have a DataFrame df that is the result of some pre-processing. It has around 10,000 rows. I save this DataFrame as CSV as follows: df.coalesce(1).write.option("sep",";").option("header","true").csv("output/path")

Now I want to save this DataFrame as a text file in which each row is a JSON string. The column names should become the attribute names in the JSON strings.

For example:

df =
  col1   col2   col3
  aa     34     55
  bb     13     77

json_txt =
{"col1": "aa", "col2": "34", "col3": "55"}
{"col1": "bb", "col2": "13", "col3": "77"}

What is the best way to do this?

Yes, of course. Try it, test it, and if it fails, let me know.
– Anahcolus, Commented Jan 26, 2018 at 17:08
@RameshMaharjan: Let me test whether simply using df.coalesce(1).write.json("path") gives me what I want.
– Markus, Commented Jan 26, 2018 at 18:17

Anahcolus · Accepted Answer · 2018-01-27 05:35:53Z

You can use the write.json API to save a DataFrame in JSON format:

df.coalesce(1).write.json("output path of json file")

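The code above writes the rows as JSON, one object per line, into a directory of part files (a single part file here because of coalesce(1)). If you want plain text output (JSON text) instead, you can use the toJSON API: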

df.toJSON.rdd.coalesce(1).saveAsTextFile("output path to text file")
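
To make the two options concrete, here is a minimal sketch that builds the sample DataFrame from the question and writes it both ways. The local SparkSession, the object name, and the output paths (output/json_dir, output/json_txt_dir) are assumptions for illustration and are not part of the original post. The cast to string is optional: it is only there because the expected output in the question shows the numbers quoted, whereas numeric columns would otherwise be written as unquoted JSON numbers.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Hypothetical object name, for illustration only.
object DataFrameToJsonLines {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session; in a real job the session usually already exists.
    val spark = SparkSession.builder()
      .appName("df-to-json-lines")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Sample data mirroring the question.
    val df = Seq(("aa", 34, 55), ("bb", 13, 77)).toDF("col1", "col2", "col3")

    // Optional: cast every column to string so the values are quoted in the JSON,
    // as in the question's expected output.
    val asStrings = df.select(df.columns.map(c => col(c).cast("string").as(c)): _*)

    // Option 1: DataFrameWriter.json writes one JSON object per line
    // into a directory of part files (hypothetical path).
    asStrings.coalesce(1).write.mode("overwrite").json("output/json_dir")

    // Option 2: toJSON turns each row into a JSON string, and saveAsTextFile
    // writes those strings as plain text (hypothetical path).
    // Note: saveAsTextFile fails if the target directory already exists.
    asStrings.toJSON.rdd.coalesce(1).saveAsTextFile("output/json_txt_dir")

    spark.stop()
  }
}
```

Both calls produce a directory rather than a single named file; with coalesce(1) the directory holds one part file containing all rows, which is reasonable for a DataFrame of roughly 10,000 rows.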

I hope the answer is helpful
