
I am trying to import JSON data from a file and export it to a CSV file. Tags like "authors" and "title" work fine with this code, but when I try it for "abstract", every word of the abstract ends up in a new CSV column. Before I added split(), the same thing happened for every character.

Here is my code:

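import csv
import json

filename = "abc.json"

# newline='' keeps the csv module from writing blank rows on Windows
with open('my.csv', 'w', encoding="utf-8", newline='') as csv_file, \
        open(filename, 'r') as f:
    csvwriter = csv.writer(csv_file)
    for line in f:  # the file holds one JSON object per line
        data = json.loads(line)
        if 'abstract' in data:
            # split() returns a list of words, so each word gets its own column
            csvwriter.writerow(data['abstract'].split())
        elif 'authors' in data:
            csvwriter.writerow(data['authors'])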

A sample JSON file can be downloaded from here: http://s000.tinyupload.com/?file_id=28925213311182593120

    Can you copy and paste a sample of the JSON file you're using as text in your question, along with a sample of both the actual and the expected CSV for that JSON? You're asking people to click on a random link on the internet to a file whose contents they can't know. Commented Aug 1, 2018 at 13:42

2 Answers


Like Ben said, it would be great to see a sample from the JSON file, but the issue could be with how you're trying to split your abstract data. With what you're doing now, you're asking it to split at every space. Try something like this if you want to split on commas instead:

if 'abstract' in data:
    csvwriter.writerow(data['abstract'].split(","))

1 Comment

Thanks, but that did half the job. The issue is that the abstract itself contains "," characters in its sentences. This splits the sentence into a new column whenever a "," character comes up.

This happens with abstract because its value is a string (in contrast, the value of authors is a list). writerow expects an iterable, and iterating over a string in Python yields one character at a time.

So before you used split, Python iterated over the string character by character, giving you one character per column. When you used split, you turned the string into a list of words, so iterating over it yields one word at a time.
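The difference can be reproduced without any files. The snippet below is only a small sketch: csv_line is a hypothetical helper and the abstract text is made up, neither comes from the question. It writes one row to an in-memory buffer so the emitted CSV text can be inspected.

```python
import csv
import io


def csv_line(row):
    """Return the text that csv.writer emits for a single writerow(row) call."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(row)
    return buffer.getvalue()


abstract = "Deep nets"  # made-up sample value, not taken from the real file

# A bare string is iterated character by character: one column per character.
per_character = csv_line(abstract)

# split() turns the string into a list of words: one column per word.
per_word = csv_line(abstract.split())

print(repr(per_character))
print(repr(per_word))
```

The first row has nine columns (one per character, including the space), the second has two (one per word).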

If you want to split the abstract on periods, just do the same thing with .split('.').
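As a rough sketch of what that looks like, and of a second option this answer does not spell out: passing a one-element list keeps the whole abstract in a single column. The helper csv_line and the sample text are assumptions made up for illustration. With the csv module's default settings, a field that contains a comma is written in quotes, so commas inside a sentence do not start new columns.

```python
import csv
import io


def csv_line(row):
    """Return the text that csv.writer emits for a single writerow(row) call."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(row)
    return buffer.getvalue()


abstract = "We test a, b and c. Results vary."  # made-up sample value

# Option 1: one column per sentence. Empty pieces (such as the one after the
# final period) are dropped and surrounding whitespace is trimmed.
sentences = [part.strip() for part in abstract.split('.') if part.strip()]
by_sentence = csv_line(sentences)

# Option 2: wrap the string in a list so the whole abstract is one column.
single_cell = csv_line([abstract])

print(by_sentence)
print(single_cell)
```

In the question's loop, the second option would correspond to csvwriter.writerow([data['abstract']]), assuming the goal is one abstract per cell.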

2 Comments

Thanks, but that did half the job. The issue is that the abstract itself contains "," characters in its sentences. This splits the sentence into a new column whenever a "," character comes up.
If you do .split('.'), then the commas won't matter.
