I am trying to convert a file containing a nested JSON object into CSV. Here is a sample of the JSON:

{
   "total_hosts" : [
      {
         "TYPE" : "AGENT",
         "COUNT" : 6
      }
   ],
   "installed" : [
      {
         "ID" : "admin-4.0",
         "VERSION" : 4,
         "ADDON_NAME" : "Administration"
      },
      {
         "ID" : "admin-2.0",
         "VERSION" : 2,
         "ADDON_NAME" : "Administration"
      },
      {
         "ID" : "ch-5.0",
         "VERSION" : "5",
         "ADDON_NAME" : "Control Host"
      }
   ],
   "virtual_machine" : [
      {
         "COUNT" : 4,
         "TYPE" : "VM"
      }
   ]
}

TYPE, COUNT, ID, VERSION and so on are the columns. The problem is that the arrays do not all hold the same number of objects: some contain one object with these values and some contain more. I am writing the data row by row, so I am trying to write a blank when there is no value for a column.

The code to write it into CSV:

import csv
import json

json_input = open('all.json')
try:
    decoded = json.load(json_input)
    # tell computer where to put CSV
    outfile_path = 'Path to CSV'
    # open it up, the w means we will write to it
    writer = csv.writer(open(outfile_path, 'w'))

    for index in range(len(decoded['installed'])):
        row = []

        if decoded['total_hosts'][index]['TYPE'] is None:
            row.append(str(''))
        else:
            row.append(str(decoded['total_hosts'][index]['TYPE']))
        if decoded['total_hosts'][index]['COUNT'] is None:
            row.append(str(''))
        else:
            row.append(str(decoded['total_hosts'][index]['COUNT']))

        writer.writerow(row)
finally:
    json_input.close()

I am getting an "index out of range" error. I even tried a True/False condition for the if.

Can anyone help me with this?

Updated : Expected Output :

TYPE,COUNT,ID,VERSION,ADDON_NAME,COUNT,TYPE
AGENT,6,admin-4.0,4,Administration,4,VM
 , ,admin-2.0,2,Administration, , 
 , ,ch-5.0,5,Control Host, , 

So basically I need blanks when there is no value for a column.

Question Modified : OUTPUT :

AGENT,6,,,
 , ,admin-4.0,4,Administration
 , ,admin-2.0,2,Administration
 , ,ch-5.0,5,Control Host

Expected OUTPUT :

AGENT,6,admin-4.0,4,Administration
 , ,admin-2.0,2,Administration
 , ,ch-5.0,5,Control Host

Updated : I even tried

            row.append(str(entry.get('TYPE', '')))
            row.append(str(entry.get('COUNT', '')))
            row.append(str(entry.get('ID', '')))
            row.append(str(entry.get('VERSION', '')))
            row.append(str(entry.get('ADDON_NAME', '')))
            writer.writerow(row)

Still got the same output as above. :(

  • Your installed and total_hosts lists don't have the same length; you are looping over range(len(decoded['installed'])) but then use the index in the decoded['total_hosts'] and decoded['_hosts'] lists (the latter is probably a typo). Commented Apr 14, 2014 at 14:34
  • You should include a complete example of input and expected output. Commented Apr 14, 2014 at 14:37
  • Yes, it was a typo :) Actually I want to loop over all the elements in the file, but as they are separate objects/arrays I took the array with the maximum number of elements and looped through it. That's why I put in the IF condition, so that if there is no value it appends a blank and I can maintain the column structure. Commented Apr 14, 2014 at 14:39
  • @user3520135: it's the decoded['total_hosts'][index] operation that throws the exception, not the attempt to access ['TYPE'] (which would throw a KeyError exception instead). Commented Apr 14, 2014 at 14:42

1 Answer

There are two mistakes here:

  1. You use the length of decoded['installed'] to generate an index you then use for the decoded['total_hosts'] list. This will generate index errors because decoded['total_hosts'] doesn't have as many entries.

  2. Accessing a key that doesn't exist will throw a KeyError; use the dict.get() method instead to retrieve a value or a default.
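The difference between the two failure modes can be seen in a small sketch. The data below is a hypothetical cut-down version of the question's JSON, and classify() is a helper invented purely for this illustration:

```python
# Hypothetical sample data mirroring the structure in the question.
decoded = {
    'total_hosts': [{'TYPE': 'AGENT', 'COUNT': 6}],
    'installed': [{'ID': 'admin-4.0'}, {'ID': 'admin-2.0'}],
}


def classify(action):
    """Run action() and report which exception, if any, it raised."""
    try:
        action()
    except (IndexError, KeyError) as exc:
        return type(exc).__name__
    return 'ok'


# Index 1 does not exist in the one-element total_hosts list.
index_result = classify(lambda: decoded['total_hosts'][1]['TYPE'])
# The entry exists, but it has no 'ID' key.
key_result = classify(lambda: decoded['total_hosts'][0]['ID'])
# dict.get() returns the supplied default instead of raising.
safe_value = decoded['total_hosts'][0].get('ID', '')

print(index_result, key_result, repr(safe_value))
```

Note that an `is None` test never gets a chance to run in either case, because the lookup itself raises first.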

It's much simpler to just loop directly over a list, no need to generate an index:

for host in decoded['total_hosts']:
    row = [host.get('TYPE', ''), host.get('COUNT', '')]
    writer.writerow(row)

You can extend this to handle more than one key:

for key in ('total_hosts', 'installed', 'virtual_machine'):
    for entry in decoded[key]:
        row = [entry.get('TYPE', ''), entry.get('COUNT', '')]
        writer.writerow(row)
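As a possible alternative to calling .get() for every column, csv.DictWriter can fill in missing keys itself through its restval argument. This is only a sketch: the column list, the in-memory buffer and the shortened sample data are assumptions made for the example, and it is written for Python 3. Like the loop above, it writes one row per entry rather than merging the lists.

```python
import csv
import io

# Hypothetical, shortened sample data in the shape of the question's JSON.
decoded = {
    'total_hosts': [{'TYPE': 'AGENT', 'COUNT': 6}],
    'installed': [
        {'ID': 'admin-4.0', 'VERSION': 4, 'ADDON_NAME': 'Administration'},
    ],
}

# Column order chosen for this example.
fieldnames = ['TYPE', 'COUNT', 'ID', 'VERSION', 'ADDON_NAME']

# An in-memory buffer stands in for the real output file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval='',
                        lineterminator='\n')
writer.writeheader()
for key in ('total_hosts', 'installed'):
    for entry in decoded[key]:
        # Keys missing from entry are written as restval (an empty field).
        writer.writerow(entry)

csv_text = buffer.getvalue()
print(csv_text)
```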

If you needed to combine the entries of several lists into one row, use itertools.izip_longest() to pair up the lists, using a default value for when a shorter list runs out:

from itertools import izip_longest

for t, i, v in izip_longest(decoded['total_hosts'], decoded['installed'],
                            decoded['virtual_machine'], fillvalue={}):
    row = [t.get('TYPE', ''), t.get('COUNT', ''),
           i.get('ID', ''), i.get('VERSION', ''), i.get('ADDON_NAME', ''),
           v.get('COUNT', ''), v.get('TYPE', '')]
    writer.writerow(row)

This allows for any one of the three lists to be shorter than the others.
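For reference, here is a self-contained sketch of the same pairing approach on Python 3, where izip_longest is named itertools.zip_longest. The data is the sample from the question; the header row and the in-memory buffer (used in place of a real output file) are assumptions made for the example.

```python
import csv
import io
from itertools import zip_longest  # Python 3 name for izip_longest

# Sample data copied from the question.
decoded = {
    'total_hosts': [{'TYPE': 'AGENT', 'COUNT': 6}],
    'installed': [
        {'ID': 'admin-4.0', 'VERSION': 4, 'ADDON_NAME': 'Administration'},
        {'ID': 'admin-2.0', 'VERSION': 2, 'ADDON_NAME': 'Administration'},
        {'ID': 'ch-5.0', 'VERSION': '5', 'ADDON_NAME': 'Control Host'},
    ],
    'virtual_machine': [{'COUNT': 4, 'TYPE': 'VM'}],
}

# An in-memory buffer stands in for the real output file.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator='\n')
writer.writerow(['TYPE', 'COUNT', 'ID', 'VERSION', 'ADDON_NAME',
                 'COUNT', 'TYPE'])

# Shorter lists are padded with empty dicts, so .get() yields ''.
for t, i, v in zip_longest(decoded['total_hosts'], decoded['installed'],
                           decoded['virtual_machine'], fillvalue={}):
    writer.writerow([t.get('TYPE', ''), t.get('COUNT', ''),
                     i.get('ID', ''), i.get('VERSION', ''),
                     i.get('ADDON_NAME', ''),
                     v.get('COUNT', ''), v.get('TYPE', '')])

rows = buffer.getvalue().splitlines()
print('\n'.join(rows))
```

The missing values come out as empty fields (two adjacent commas) rather than as a literal space character.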

For Python versions before 2.6 (which added itertools.izip_longest) you'd have to assume that installed was always longest, and then use:

for i, installed in enumerate(decoded['installed']):
    # fall back to an empty dict when the shorter lists run out
    t = decoded['total_hosts'][i] if i < len(decoded['total_hosts']) else {}
    v = decoded['virtual_machine'][i] if i < len(decoded['virtual_machine']) else {}
    row = [t.get('TYPE', ''), t.get('COUNT', ''),
           installed['ID'], installed['VERSION'], installed['ADDON_NAME'],
           v.get('COUNT', ''), v.get('TYPE', '')]
    writer.writerow(row)

10 Comments

@user3520135: that error usually indicates you forgot a closing ) or ] on the preceding line.
@user3520135: note, the row = [] line is entirely redundant.
Yes, you were right. Thanks again, but there is one small problem.
@user3520135: you didn't specify you wanted to merge the total_hosts and installed lists; what happens if there is more than one total_hosts entry?
Sorry about that. I am trying to get the output in column/row format, so that any additional values get written in their corresponding columns, and where there are none there will be a blank.