
I want to write some data from Python to an xlsx file. It is currently stored as JSON, but the format it leaves Python in doesn't matter. Here's what the JSON for a single article looks like:

{
   "Word Count": 50,
   "Key Words": ["Blah blah blah", "Foo", ... ],
   "Frequency": [9, 12, ... ],
   "Proper Nouns": ["UN", "USA", ... ],
   "Location": "Mordor"
}

I checked out the XlsxWriter module, but I can't figure out how to write hierarchical data whose lists are not necessarily the same length (note the number of proper nouns in each of the two data "objects" in the screenshot).

What I want the data to look like:

Excel screenshot

Any pointers?

  • Could you edit the question to include a sample of the JSON you have? Commented Feb 23, 2016 at 17:36
  • Sure, I just wrote some out on my phone to show the formatting. Commented Feb 24, 2016 at 20:27

1 Answer

As your structures can be arbitrarily nested, I would suggest using recursion to achieve this:

from collections import OrderedDict
import xlsxwriter
import json

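def json_to_excel(ws, data, row=0, col=0):
    """Recursively write data to worksheet ws; return the last row used."""
    if isinstance(data, list):
        # Step back one row so the first item lands on the starting row.
        row -= 1
        for value in data:
            row = json_to_excel(ws, value, row+1, col)
    elif isinstance(data, dict):
        max_row = row
        start_row = row
        # Each key gets its own column: header on start_row, value(s) below.
        for key, value in data.items():  # .iteritems() exists only in Python 2
            row = start_row
            ws.write(row, col, key)
            row = json_to_excel(ws, value, row+1, col)
            max_row = max(max_row, row)
            col += 1
        # Continue below the longest column of this dict.
        row = max_row
    else:
        # Scalar value: write it into a single cell.
        ws.write(row, col, data)

    return row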

text = """
[
    {
        "Source ID": 123,
        "WordCount": 50,
        "Key Words": ["Blah blah blah", "Foo"],
        "Frequency": [9, 12, 1, 2, 3],
        "Proper Nouns": ["UN", "USA"],
        "Location": "Mordor"
    },
    {
        "Source ID": 124,
        "WordCount": 50,
        "Key Words": ["Blah blah blah", "Foo"],
        "Frequency": [9, 12, 1, 2, 3],
        "Proper Nouns": ["UN", "USA"],
        "Location": "Mordor"
    }
]
"""

data = json.loads(text, object_pairs_hook=OrderedDict)
wb = xlsxwriter.Workbook("output.xlsx")
ws = wb.add_worksheet()
json_to_excel(ws, data)
wb.close()  

This would give you an output file looking like:

Excel screenshot
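
To check the layout without opening Excel, the same recursion can be run against a stand-in object that only records write() calls. This is a minimal sketch: RecordingSheet is a hypothetical helper invented for illustration (it is not part of XlsxWriter), and the sample data is made up and kept deliberately small.

```python
class RecordingSheet:
    """Hypothetical stand-in for a worksheet: it only remembers write() calls."""

    def __init__(self):
        self.cells = {}  # maps (row, col) -> value

    def write(self, row, col, value):
        self.cells[(row, col)] = value


def json_to_excel(ws, data, row=0, col=0):
    """Same recursion as in the answer; returns the last row used."""
    if isinstance(data, list):
        row -= 1  # so the first item lands on the starting row
        for value in data:
            row = json_to_excel(ws, value, row + 1, col)
    elif isinstance(data, dict):
        max_row = row
        start_row = row
        for key, value in data.items():
            row = start_row
            ws.write(row, col, key)  # header cell for this key
            row = json_to_excel(ws, value, row + 1, col)
            max_row = max(max_row, row)
            col += 1  # next key goes into the next column
        row = max_row
    else:
        ws.write(row, col, data)  # scalar: one cell
    return row


# Made-up sample: two records, the first with a two-item list.
sheet = RecordingSheet()
last_row = json_to_excel(sheet, [{"a": 1, "b": [2, 3]}, {"a": 4}])

# Print the recorded cells in row/column order.
for (r, c), value in sorted(sheet.cells.items()):
    print(r, c, value)
```

In this trace the second record's headers start on the row directly below the longest list of the first record, which is how records with lists of different lengths avoid overwriting each other.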


4 Comments

Thanks! Why is object_pairs_hook=OrderedDict needed when defining data? My data is already JSON rather than text, so would I even need this?
It is a trick to ensure the dictionary that is returned remains in the same order as the JSON data.
Cool! Learn something new every day. I'll test this out and let you know.
Yep! Winner winner. Thanks for the help!
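
On the object_pairs_hook=OrderedDict point raised in the comments: from Python 3.7 onward a plain dict keeps insertion order, so json.loads preserves the key order of the JSON text even without the hook. The hook still matters on older interpreters. The sketch below assumes Python 3.7 or later and uses a shortened, made-up record.

```python
import json
from collections import OrderedDict

# Shortened sample record, for illustration only.
text = '{"Source ID": 123, "WordCount": 50, "Location": "Mordor"}'

# Parse once into a plain dict and once into an OrderedDict.
plain = json.loads(text)
ordered = json.loads(text, object_pairs_hook=OrderedDict)

# On Python 3.7+ both keep the keys in the order they appear in the text.
plain_keys = list(plain)
ordered_keys = list(ordered)
print(plain_keys == ordered_keys)
```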
