
I have JSON data as shown below.

input_list = [["Richard",[],{"children":"yes","divorced":"no","occupation":"analyst"}],
["Mary",["testing"],{"children":"no","divorced":"yes","occupation":"QA analyst","location":"Seattle"}]]

I have another list that holds the prospective keys:

list_keys = ['name', 'current_project', 'details']

I am trying to create a dict from both lists to make the data usable for metrics.

I have shortened both lists for the question; the real ones are much longer. input_list is a nested list with 500k+ elements, and each of those elements has 70+ elements of its own (except the details one). list_keys also has 70+ elements.

I tried to create a dict using zip, but that is not helping given the size of the data. Also, with zip I am not able to exclude the "details" element.

I am expecting output something like this.

[
  {
    "name": "Richard",
    "current_project": "",
    "children": "yes",
    "divorced": "no",
    "occupation": "analyst"
  },
  {
    "name": "Mary",
    "current_project": "testing",
    "children": "no",
    "divorced": "yes",
    "occupation": "QA analyst",
    "location": "Seattle"
  }
]

I have tried this so far

>>> for line in input_list:
...     zipbObj = zip(list_keys, line)
...     dictOfWords = dict(zipbObj)
...
>>> print dictOfWords
{'current_project': ['testing'], 'name': 'Mary', 'details': {'location': 'Seattle', 'children': 'no', 'divorced': 'yes', 'occupation': 'QA analyst'}}

But with this I am unable to get rid of the nested dict key "details", so I am looking for help with that.

2
  • What does your existing code look like? If it is working but you are looking to improve it, Code Review Stack Exchange may be a better place to ask. Commented Aug 20, 2019 at 23:15
  • @selcuk updated the question. Commented Aug 20, 2019 at 23:21

3 Answers

2

It seems like what you want is a list of dictionaries. Here is something I coded up in the terminal and copied in here. Hope it helps.

>>> list_of_dicts = []
>>> for item in input_list:
...     dict = {}
...     for i in range(0, len(item)-2, 3):
...             dict[list_keys[0]] = item[i]
...             dict[list_keys[1]] = item[i+1]
...             dict.update(item[i+2])
...     list_of_dicts.append(dict)
...
>>> list_of_dicts
[{'name': 'Richard', 'current_project': [], 'children': 'yes', 'divorced': 'no', 'occupation': 'analyst'}, {'name': 'Mary', 'current_project': ['testing'], 'children': 'no', 'divorced': 'yes', 'occupation': 'QA analyst', 'location': 'Seattle'}]

I will mention that this is not an ideal method, since it relies on the items in input_list being perfectly ordered.
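For a key list with 70+ entries, the same approach can be written without hard-coded indexes by pairing keys and values with zip. The following is only a minimal sketch: it assumes every row lists its values in the same order as list_keys and that the value paired with "details" is a dict. The helper name flatten_row and its nested_key parameter are made up for illustration.

```python
input_list = [
    ["Richard", [], {"children": "yes", "divorced": "no", "occupation": "analyst"}],
    ["Mary", ["testing"], {"children": "no", "divorced": "yes",
                           "occupation": "QA analyst", "location": "Seattle"}],
]
list_keys = ["name", "current_project", "details"]


def flatten_row(row, keys, nested_key="details"):
    """Pair each value with its key; merge the nested dict into the top level."""
    record = {}
    for key, value in zip(keys, row):
        if key == nested_key:
            record.update(value)  # assumption: this value is a dict
        else:
            record[key] = value
    return record


list_of_dicts = [flatten_row(row, list_keys) for row in input_list]
print(list_of_dicts)
```

Because the keys come from zip, adding more entries to list_keys does not require touching the loop body.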


2 Comments

Hi @Cavenfish, thanks for checking on this. The code works fine, but I have close to 75 elements, and adding an index for each of them would make the code complex.
Hey @yahoo_it, you can still use this style of code even with a large number of elements in list_keys (so long as the order of input_list is correct). Just nest another loop like this: for key in list_keys: and then the insertion would be dict[key] = item[i] from within the nested loop.
0
people = input_list = [["Richard",[],{"children":"yes","divorced":"no","occupation":"analyst"}],
["Mary",["testing"],{"children":"no","divorced":"yes","occupation":"QA analyst","location":"Seattle"}]]
list_keys = ['name', 'current_project', 'details']
listout = []
for person in people:
    dict_p = {}
    # enumerate gives each key's position without repeated list_keys.index() lookups
    for idx, key in enumerate(list_keys):
        if key != 'details':
            dict_p[key] = person[idx]
        else:
            # merge the nested "details" dict into the top-level record
            subdict = person[idx]
            for subkey in subdict.keys():
                dict_p[subkey] = subdict[subkey]

    listout.append(dict_p)
listout

The issue with using zip is the additional dictionary inside each element of the people list. The code above produces the following output and should work for a larger list of individuals:

[{'name': 'Richard',
  'current_project': [],
  'children': 'yes',
  'divorced': 'no',
  'occupation': 'analyst'},
 {'name': 'Mary',
  'current_project': ['testing'],
  'children': 'no',
  'divorced': 'yes',
  'occupation': 'QA analyst',
  'location': 'Seattle'}]

1 Comment

This worked fine in the terminal, but when I try to run the script with the real data it throws an error: for subkey in subdict.keys(): AttributeError: 'str' object has no attribute 'keys'
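The AttributeError suggests that, in the real data, at least one row carries a string where the "details" dict is expected. One possible guard is to check the type before merging. The sketch below is hedged accordingly: the "Alex" row is a hypothetical example, and keeping a non-dict value under its original key is just one assumed way of handling it, not something derived from the actual data.

```python
list_keys = ["name", "current_project", "details"]
people = [
    ["Richard", [], {"children": "yes", "divorced": "no", "occupation": "analyst"}],
    ["Alex", ["testing"], "n/a"],  # hypothetical row: a string instead of a dict
]

listout = []
for person in people:
    dict_p = {}
    for key, value in zip(list_keys, person):
        if key == "details" and isinstance(value, dict):
            dict_p.update(value)  # flatten the nested dict
        else:
            dict_p[key] = value   # keep anything else as-is, including a non-dict "details"
    listout.append(dict_p)

print(listout)
```

Rows that end up with a "details" key in the output are the ones worth inspecting in the source data.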
0

This script goes through every item of input_list and builds a new list of flat dictionaries whose values contain no lists or dictionaries:

input_list = [
    ["Richard",[],{"children":"yes","divorced":"no","occupation":"analyst"}],
    ["Mary",["testing"],{"children":"no","divorced":"yes","occupation":"QA analyst","location":"Seattle"}]
]

list_keys = ['name', 'current_project', 'details']

out = []
for item in input_list:
    d = {}
    out.append(d)
    for value, keyname in zip(item, list_keys):
        if isinstance(value, dict):
            d.update(value)
        elif isinstance(value, list):
            if value:
                d[keyname] = value[0]
            else:
                d[keyname] = ''
        else:
            d[keyname] = value

from pprint import pprint
pprint(out)

Prints:

[{'children': 'yes',
  'current_project': '',
  'divorced': 'no',
  'name': 'Richard',
  'occupation': 'analyst'},
 {'children': 'no',
  'current_project': 'testing',
  'divorced': 'yes',
  'location': 'Seattle',
  'name': 'Mary',
  'occupation': 'QA analyst'}]
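With 500k+ rows, as mentioned in the question, the same logic could be wrapped in a generator so that records are produced one at a time instead of building the whole list up front. This is only a sketch under the same assumptions as the script above (values line up with the keys, and a list is reduced to its first item); iter_records is a hypothetical name.

```python
def iter_records(rows, keys):
    """Yield one flat dict per row: dicts are merged, lists reduced to their first item."""
    for row in rows:
        record = {}
        for key, value in zip(keys, row):
            if isinstance(value, dict):
                record.update(value)
            elif isinstance(value, list):
                record[key] = value[0] if value else ""
            else:
                record[key] = value
        yield record


rows = [
    ["Richard", [], {"children": "yes", "divorced": "no", "occupation": "analyst"}],
    ["Mary", ["testing"], {"children": "no", "divorced": "yes",
                           "occupation": "QA analyst", "location": "Seattle"}],
]
keys = ["name", "current_project", "details"]

for record in iter_records(rows, keys):
    print(record["name"], record["current_project"])
```

Calling list(iter_records(rows, keys)) gives the full list again when everything is needed in memory at once.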
