0

I am new to Python and am just starting to experiment with small programs. I came across this problem:

The input is JSON in one of two formats:

'[ ["a","b","c"], [1,2,null], [null,3,4], [5,null,6] ]' 

or :

'[ { "a":1, "b":2 }, { "b":3, "c":4 }, { "c":6, "a":5 } ]'

It should be converted into:

output = '{ "a": [1,null,5], "b": [2,3,null], "c": [null,4,6] }'

So far, all I can think of is checking each element and appending it to the result. Is there an easier or better way to do this in Python? Please enlighten me.

6
  • You mean None instead of null right? Commented Jan 21, 2016 at 7:04
  • @IronFist seems like null is valid in json. If you call json.loads on that string it converts it to None. Presumably it also works the other way round. Commented Jan 21, 2016 at 7:05
  • @PaulRooney sure it does: json.dumps(None) == 'null' Commented Jan 21, 2016 at 7:07
  • @PaulRooney .. correct .. :) ..thought it was in the dictionary not in the json output...didn't pay attention to that Commented Jan 21, 2016 at 7:08
  • There are two tasks involved here: Firstly, read the JSON. Secondly, change the structure of the resulting data to resemble the desired output. Which task is giving you problems? Commented Jan 21, 2016 at 7:24
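
The null/None point raised in the comments above can be checked directly with the standard json module. A minimal sketch (the variable names are only illustrative):

```python
import json

# JSON null becomes Python None when the text is parsed...
parsed = json.loads('[1, 2, null]')
print(parsed)

# ...and None becomes null again when the object is serialized.
text = json.dumps([1, None, 3])
print(text)
```

So the nulls in the input need no special handling on either side of the conversion.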

3 Answers

2

Use defaultdict from the collections module, like this:

>>> import json
>>> s = '[ { "a":1, "b":2 }, { "b":3, "c":4 }, { "c":6, "a":5 } ]'
>>> 
>>> dic = json.loads(s)
>>> dic
[{'a': 1, 'b': 2}, {'b': 3, 'c': 4}, {'a': 5, 'c': 6}]
>>> kys = set(k for sub_d in dic for k in sub_d)  # unique keys across all dictionaries in dic
>>> kys
{'a', 'b', 'c'}
>>>
>>> from collections import defaultdict
>>> my_dict = defaultdict(list)
>>> for d in dic:
...     for k in kys:
...         my_dict[k].append(d.get(k, None))
...
>>> my_dict
defaultdict(<class 'list'>, {'a': [1, None, 5], 'b': [2, 3, None], 'c': [None, 4, 6]})

As for the other situation:

>>> s = '[ ["a","b","c"], [1,2,null], [null,3,4], [5,null,6] ]'
>>> d = json.loads(s)
>>> d
[['a', 'b', 'c'], [1, 2, None], [None, 3, 4], [5, None, 6]]
>>> my_dict = dict(zip(d[0], zip(*d[1:])))
>>> my_dict
{'a': (1, None, 5), 'b': (2, 3, None), 'c': (None, 4, 6)}
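
The one-liner above relies on zip(*rows) transposing rows into columns. A minimal sketch that splits it into steps, assuming the first row is always the header (the names header, body and columns are mine, not from the answer):

```python
rows = [['a', 'b', 'c'], [1, 2, None], [None, 3, 4], [5, None, 6]]
header, body = rows[0], rows[1:]

# Unpacking the data rows into zip() groups the i-th item of every row,
# which turns the rows into columns.
columns = list(zip(*body))
print(columns)

# Pairing each header name with its column gives the final mapping.
result = dict(zip(header, columns))
print(result)
```

Note that zip stops at the shortest row, so a row with missing trailing values would silently shorten every column.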

If you don't want tuples as values, then:

>>> my_dict = defaultdict(list)
>>> for k, v in zip(d[0], zip(*d[1:])):
...     my_dict[k].extend(v)
...

Finally, to group both cases into one function:

import json
from collections import defaultdict

def parse_data(data):
    data = json.loads(data) 
    my_dict = defaultdict(list)
    if isinstance(data[0], list):
        for k,v in zip(data[0], zip(*data[1:])):
            my_dict[k].extend(v)
    elif isinstance(data[0], dict):
        kys = set(k for sub_d in data for k in sub_d)
        for d in data:
            for k in kys:
                my_dict[k].append(d.get(k, None))
    return my_dict

s1 = '[ ["a","b","c"], [1,2,null], [null,3,4], [5,null,6] ]' 
d1 = parse_data(s1)
s2 = '[ { "a":1, "b":2 }, { "b":3, "c":4 }, { "c":6, "a":5 } ]'
d2 = parse_data(s2)
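
The question asks for a JSON string rather than a Python dict, so the result still has to be serialized. A minimal sketch, assuming the defaultdict returned by parse_data looks like the hand-built stand-in below. The sort_keys=True argument is my addition: the dict branch collects keys from a set, whose iteration order is not guaranteed.

```python
import json
from collections import defaultdict

# Stand-in for the defaultdict that parse_data returns (built by hand here
# so the example runs on its own).
result = defaultdict(list)
result['b'].extend([2, 3, None])
result['a'].extend([1, None, 5])
result['c'].extend([None, 4, 6])

# defaultdict is a dict subclass, so json.dumps accepts it directly.
# sort_keys=True makes the key order deterministic.
output = json.dumps(result, sort_keys=True)
print(output)
```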

3 Comments

Can you explain this line: set(k for sub_d in d for k in sub_d)? What is sub_d here?
set returns the unique elements of any iterable. Because d is a list of dictionaries, I passed it a generator expression that reads: for every sub_d (each dictionary in the list), then for every key k in that sub_d, yield k.
@Sarah .. Check my edit as I've grouped both situations into one function.
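
To make the explanation in the comments above concrete, here is the same key collection written both ways. A small sketch using the question's data (keys_short and keys_long are illustrative names):

```python
data = [{'a': 1, 'b': 2}, {'b': 3, 'c': 4}, {'c': 6, 'a': 5}]

# Compact form used in the answer: sub_d is each dictionary in the list,
# and iterating over a dictionary yields its keys.
keys_short = set(k for sub_d in data for k in sub_d)

# The equivalent written as explicit nested loops.
keys_long = set()
for sub_d in data:
    for k in sub_d:
        keys_long.add(k)

print(keys_short == keys_long)
print(sorted(keys_short))
```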
1

Try this with Python 3:

def get_elements(json_txt):
    import json
    arr = json.loads(json_txt)
    new = {}
    list_of_keys = []
    list_of_keys_from_dicts = [list(elem.keys()) for elem in arr]
    # collect every key once, in the order it is first seen
    for keys in list_of_keys_from_dicts:
        for key in keys:
            if key not in list_of_keys:
                list_of_keys.append(key)
    for key in list_of_keys:
        new[key] = []
    # one value per input dict for every key; None where the key is missing
    for element in arr:
        for key in list_of_keys:
            if key in element:
                new[key].append(element[key])
            else:
                new[key].append(None)
    return json.dumps(new)
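
The same first-seen key ordering can be had more compactly. A sketch, not the answerer's code, assuming Python 3.7+ (where dicts preserve insertion order) and input in the list-of-dicts format only. The name columns_from_records is hypothetical:

```python
import json


def columns_from_records(json_txt):
    """Hypothetical helper: turn a JSON list of objects into a dict of columns."""
    records = json.loads(json_txt)
    # dict.fromkeys drops duplicates while keeping first-seen order.
    keys = list(dict.fromkeys(k for rec in records for k in rec))
    # rec.get(k) returns None when a record has no such key.
    return {k: [rec.get(k) for rec in records] for k in keys}


s = '[ { "a":1, "b":2 }, { "b":3, "c":4 }, { "c":6, "a":5 } ]'
result = columns_from_records(s)
print(json.dumps(result))
```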


1

This works for both formats:

import json


def convert_format(json_data):
    convert_to_py_obj = json.loads(json_data)
    new_format = dict()
    if isinstance(convert_to_py_obj, list) and len(convert_to_py_obj) > 1:
        index_0_keys = convert_to_py_obj[0]
        if isinstance(index_0_keys, list):
            for i, key in enumerate(index_0_keys):
                new_format[key] = []
                for sub_list in convert_to_py_obj[1:]:
                    new_format[key].append(sub_list[i])
        elif isinstance(index_0_keys, dict):
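            for idx, sub_dict in enumerate(convert_to_py_obj):
                # items() works in Python 3 (iteritems() exists only in Python 2)
                for key, val in sub_dict.items():
                    if key in new_format:
                        new_format[key].append(val)
                    else:
                        # pad with None for the earlier dicts that lacked this key
                        new_format[key] = [None] * idx + [val]
                # keys seen so far but missing from this dict get a None
                none_keys = set(new_format.keys()) - set(sub_dict.keys())
                for key in none_keys:
                    new_format[key].append(None)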
    return json.dumps(new_format)

