I am using the Python csv module to create a CSV file where some of the values are JSON strings. However, the csv module's quoting is breaking the JSON:

import csv
import json
writer = csv.writer(open('tmp', 'w'))
writer.writerow([json.dumps([{'a' : 'b'}])])

The resulting JSON is broken, as you can see:

cat tmp
> "[{""a"": ""b""}]"

import json
json.loads("[{""a"": ""b""}]")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/json/__init__.py", line 326, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 382, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Expecting property name: line 1 column 2 (char 2)

And the csv module raises an error when I turn quoting off:

import csv
import json
writer = csv.writer(open('tmp', 'w'), quoting=csv.QUOTE_NONE)
writer.writerow([json.dumps([{u'a' : u'b'}])])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_csv.Error: need to escape, but no escapechar set

Has anyone else encountered this? Do JSON and CSV just not play well together? (It's not my idea to store JSON strings in CSV files; it's something I just need to deal with right now.) Unfortunately, the CSV files I am creating contain hash digests and all sorts of other complicated data, so all the sed- or awk-style fixes for the JSON that I've tried have failed or messed up something else.

  • Drop quoting=csv.QUOTE_NONE argument. Commented Sep 27, 2013 at 4:00
  • Right, but then the json is still broken by quoting. Commented Sep 27, 2013 at 4:01
  • Can't you change the quote character to '? Commented Sep 27, 2013 at 4:05
  • @qwwqwwq, you will get correct JSON if you use csv.reader. Commented Sep 27, 2013 at 4:08
  • You're right, the other script that was throwing the exception wasn't using csv.reader. What a mess. Commented Sep 27, 2013 at 4:21

1 Answer

Don't use " as your quote character. Use something else:

with open('tmp', 'w') as fout:
    writer = csv.writer(fout, quotechar="'")

Really, this just tidies things up a bit. When you read the data back in, you first need to "unquote" it by reading it with csv.reader. That should give you back the strings you put in, which are valid JSON.
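To illustrate the round trip described above, here is a minimal sketch. It assumes Python 3 and uses an in-memory io.StringIO buffer as a stand-in for the 'tmp' file; the variable names are only for illustration. It keeps the default " quote character to show that the doubled quotes are ordinary CSV quoting that csv.reader undoes.

```python
import csv
import io
import json

payload = json.dumps([{"a": "b"}])

# Write one row with the default quoting; the inner quotes get doubled.
buffer = io.StringIO()
csv.writer(buffer).writerow([payload])
raw_line = buffer.getvalue()

# Parsing the raw CSV line as JSON fails, because it is CSV-quoted text.
try:
    json.loads(raw_line)
    raw_is_json = True
except ValueError:
    raw_is_json = False

# Reading through csv.reader undoes the quoting and restores the JSON string.
buffer.seek(0)
row = next(csv.reader(buffer))
restored = json.loads(row[0])
```

In this sketch the file on disk looks "broken" only if it is read as raw text; any consumer that goes through csv.reader gets the original JSON string back.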

1 Comment

Setting quotechar to anything other than '"' fixes the issue. My other problem was that the other script that reads this file was NOT using csv.reader; using it also solves this.
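One caveat worth sketching, assuming Python 3 and an in-memory buffer in place of the real file: if a JSON value itself contains the alternative quote character together with a comma, the csv module quotes the field again and doubles the embedded ' characters. The sample payload below is made up for illustration. Reading with csv.reader and the same quotechar still restores the original string.

```python
import csv
import io
import json

# Hypothetical payload containing both an apostrophe and a comma.
payload = json.dumps({"name": "O'Brien", "n": 1})

# Write with ' as the quote character, as suggested in the answer.
buffer = io.StringIO()
csv.writer(buffer, quotechar="'").writerow([payload])
raw_line = buffer.getvalue().strip()

# Read back with the same quotechar so the doubled apostrophe is undone.
buffer.seek(0)
row = next(csv.reader(buffer, quotechar="'"))
restored = json.loads(row[0])
```

So changing quotechar makes the common case easier to read as raw text, but going through csv.reader on the consuming side remains the reliable way to recover the JSON.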
