New one for me. I'm assuming this is fairly easy, but I've never worked with arrays before, so I'm curious how it works.

I have a dict of dicts, like so:

{'bob': {'a':1, 'b':2, ...}, 'joe': {'a':2, 'c':3, ...} ...}

I'd like to turn it into an array so I can write it to a CSV and then turn it into a heatmap using R. I was trying to cheat and just write each nested dict to its own row, but of course that won't work because not every key is present in every nested dict. Simple, right?

The desired output would look like this (in tabular form):

,a,b,c
bob,1,2,0
joe,2,0,3
  • You have put a sample of your input. Can you give an example representation of the desired output? Commented Dec 1, 2012 at 22:18
  • Your sample output at present does not explicitly address how to handle the case where a key is absent from one of the nested dictionaries. Do you want us to treat its value as zero? Commented Dec 1, 2012 at 22:22
  • Yes, please. Sorry for the lack of clarity. Commented Dec 1, 2012 at 22:23
  • See the Python csv library. Commented Dec 1, 2012 at 22:25

3 Answers

If your columns are fixed, you could simply do something like:

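import csv

cols = ['a', 'b', 'c']
with open('output.csv', 'w', newline='') as f:  # Python 3; 'output.csv' is a placeholder name
    writer = csv.writer(f)
    writer.writerow([''] + cols)  # header row with an empty corner cell
    for name, values in data.items():  # data is the dict of dicts
        writer.writerow([name] + [values.get(c, 0) for c in cols])  # 0 for missing keys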

2 Comments

+1 - there is also csv.DictWriter from the stdlib
Okay, and I could presumably get the names of all the columns before this step by doing for q, c in dictofdicts.iteritems(): coltitles.extend(c.keys()), then coltitles = list(set(coltitles))
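
The csv.DictWriter mentioned in the first comment can fill in missing keys by itself through its restval parameter. A minimal sketch, assuming Python 3 and the sample data from the question; the variable name data and the in-memory buffer are placeholders for illustration, not part of the original answer:

```python
import csv
import io

# Sample data from the question (the variable name is an assumption).
data = {'bob': {'a': 1, 'b': 2}, 'joe': {'a': 2, 'c': 3}}
cols = ['a', 'b', 'c']

buf = io.StringIO()  # stands in for a real file opened with newline=''

# The first field is named '' so the header matches the desired ",a,b,c".
# restval supplies the value for any field missing from a row dict.
writer = csv.DictWriter(buf, fieldnames=[''] + cols, restval=0,
                        lineterminator='\n')
writer.writeheader()
for name, values in sorted(data.items()):  # sorted only for a stable row order
    row = {'': name}
    row.update(values)
    writer.writerow(row)

csv_text = buf.getvalue()
print(csv_text)
```

With restval=0 there is no need to call get() for each cell; the writer substitutes 0 wherever a row lacks a column.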
Suppose you have three predefined keys. You can use the dict's get method to fetch each value, or a default if the key is not in the dict:

headers = ('a', 'b', 'c')
for key, values in data.items():  # data is the dict of dicts
    # str() is needed because join() only accepts strings
    print(','.join(str(values.get(h, '')) for h in headers))
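
To make the default behaviour of get concrete, here is a small sketch on the sample data from the question. It assumes Python 3, uses 0 as the default (as the asker requested in the comments), and prepends the row name; the variable names data and lines are illustrative only:

```python
# Sample data from the question (the variable name is an assumption).
data = {'bob': {'a': 1, 'b': 2}, 'joe': {'a': 2, 'c': 3}}
headers = ('a', 'b', 'c')

lines = []
for key, values in sorted(data.items()):  # sorted only for a stable row order
    # get() returns the stored value, or the default (0 here) when the key is absent.
    cells = [str(values.get(h, 0)) for h in headers]
    lines.append(','.join([key] + cells))

print('\n'.join(lines))
```

Here bob has no 'c' and joe has no 'b', so those cells come out as 0 instead of raising a KeyError.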

The other answers already cover the printing, but they assume fixed headers. To get the column headers from the dict:

columns = sorted(set(column for subdict in dict_of_dicts.values() for column in subdict))

Or, more verbosely:

column_set = set()
for subdict in dict_of_dicts.values():
    for column in subdict:
        column_set.add(column)
columns = sorted(column_set)

To create the array in one long line (just for fun, not recommended):

array = [[''] + columns] + [[key] + [subdict.get(column, 0) for column in columns] for key, subdict in dict_of_dicts.items()]
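
Putting the pieces together, the following is one possible end-to-end sketch rather than a definitive implementation: it derives the columns, builds the array row by row, and writes it out as CSV. It assumes Python 3 and the sample data from the question; the in-memory buffer and the file name heatmap.csv in the comment are placeholders:

```python
import csv
import io

# Sample data from the question.
dict_of_dicts = {'bob': {'a': 1, 'b': 2}, 'joe': {'a': 2, 'c': 3}}

# Union of all inner keys, sorted for a stable column order.
columns = sorted({column for subdict in dict_of_dicts.values()
                  for column in subdict})

# Header row first, then one row per outer key, with 0 for missing cells.
array = [[''] + columns]
for key in sorted(dict_of_dicts):  # sorted only for a stable row order
    subdict = dict_of_dicts[key]
    array.append([key] + [subdict.get(column, 0) for column in columns])

# Swap the buffer for open('heatmap.csv', 'w', newline='') to write a real file.
buf = io.StringIO()
csv.writer(buf, lineterminator='\n').writerows(array)
csv_text = buf.getvalue()
print(csv_text)
```

Building the rows in an explicit loop does the same work as the one-liner but is easier to read and to adapt, for example to change the fill value.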
