0

I have a DataFrame like this:

org.iden.account,org.iden.id,adress.city,adress.country,person.name.fullname,person.gender,person.birthYear,subs.id,subs.subs1.birthday,subs.subs1.org.address.country,subs.subs1.org.address.strret1,subs.org.buyer.email.address,subs.org.buyer.phone.number
account123,id123,riga,latvia,laura,female,1990,subs123,1990-12-14T00:00:00Z,latvia,street 1,[email protected]|[email protected],+371401234567
account123,id000,riga,latvia,laura,female,1990,subs456,1990-12-14T00:00:00Z,latvia,street 1,[email protected],+371401234567
account123,id456,riga,latvia,laura,female,1990,subs789,1990-12-14T00:00:00Z,latvia,street 1,[email protected],+371401234567

I need to convert this into nested JSON, where the nesting follows the dot-separated (.) parts of each column name. For the first row, the expected result is:

{
    "org": {
        "iden": {
            "account": "account123",
            "id": "id123"
        }
    },
    "address": {
        "city": "riga",
        "country": "latvia"
    },
    "person": {
        "name": {
            "fullname": "laura"
        },
        "gender": "female",
        "birthYear": 1990
    },
    "subs": {
        "id": "subs123",
        "subs1": {
            "birthday": "1990-12-14T00:00:00Z",
            "org": {
                "address": {
                    "country": "latvia",
                    "street1": "street 1"
                }
            }
        },
        "org": {
            "buyer": {
                "email": {
                    "address": "[email protected]|[email protected]"
                },
                "phone": {
                    "number": "+371401234567"
                }
            }
        }
    }
}

All the records should then be collected into a list. I tried pandas' plain .to_json(), but it didn't help: the output below keeps the dotted column names as flat keys instead of the nested structure I need.

[{"org.iden.account":"account123","org.iden.id":"id123","adress.city":"riga","adress.country":"latvia","person.name.fullname":"laura","person.gender":"female","person.birthYear":1990,"subs.id":"subs123","subs.subs1.birthday":"1990-12-14T00:00:00Z","subs.subs1.org.address.country":"latvia","subs.subs1.org.address.strret1":"street 1","subs.org.buyer.email.address":"[email protected]|[email protected]","subs.org.buyer.phone.number":371401234567},{"org.iden.account":"account123","org.iden.id":"id000","adress.city":"riga","adress.country":"latvia","person.name.fullname":"laura","person.gender":"female","person.birthYear":1990,"subs.id":"subs456","subs.subs1.birthday":"1990-12-14T00:00:00Z","subs.subs1.org.address.country":"latvia","subs.subs1.org.address.strret1":"street 1","subs.org.buyer.email.address":"[email protected]","subs.org.buyer.phone.number":371407654321},{"org.iden.account":"account123","org.iden.id":"id456","adress.city":"riga","adress.country":"latvia","person.name.fullname":"laura","person.gender":"female","person.birthYear":1990,"subs.id":"subs789","subs.subs1.birthday":"1990-12-14T00:00:00Z","subs.subs1.org.address.country":"latvia","subs.subs1.org.address.strret1":"street 1","subs.org.buyer.email.address":"[email protected]","subs.org.buyer.phone.number":371407654321}]

Any help in this would be highly appreciated!

2
  • Are you ok with a solution that works directly on your json data and not through pandas? Commented Jun 9, 2021 at 16:11
  • @Axe319 yes, of course. pandas provides a lot of flexibility with DataFrames, but other solutions are definitely welcome. Commented Jun 10, 2021 at 7:42

2 Answers

1
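def df_to_json(row):
    # Build one nested dict from a row whose index labels are dotted paths.
    tree = {}
    for item in row.index:
        t = tree
        for part in item.split('.'):
            # Descend one level, creating the child dict if it is missing;
            # prev keeps the parent so the leaf can be overwritten below.
            prev, t = t, t.setdefault(part, {})
        # Replace the placeholder dict at the last part with the cell value.
        prev[part] = row[item]
    return tree

>>> df.apply(df_to_json, axis='columns').tolist()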

[{'org': {'iden': {'account': 'account123', 'id': 'id123'}},
  'adress': {'city': 'riga', 'country': 'latvia'},
  'person': {'name': {'fullname': 'laura'},
   'gender': 'female',
   'birthYear': 1990},
  'subs': {'id': 'subs123',
   'subs1': {'birthday': '1990-12-14T00:00:00Z',
    'org': {'address': {'country': 'latvia', 'strret1': 'street 1'}}},
   'org': {'buyer': {'email': {'address': '[email protected]|[email protected]'},
     'phone': {'number': 371401234567}}}}},
 {'org': {'iden': {'account': 'account123', 'id': 'id000'}},
  'adress': {'city': 'riga', 'country': 'latvia'},
  'person': {'name': {'fullname': 'laura'},
   'gender': 'female',
   'birthYear': 1990},
  'subs': {'id': 'subs456',
   'subs1': {'birthday': '1990-12-14T00:00:00Z',
    'org': {'address': {'country': 'latvia', 'strret1': 'street 1'}}},
   'org': {'buyer': {'email': {'address': '[email protected]'},
     'phone': {'number': 371401234567}}}}},
 {'org': {'iden': {'account': 'account123', 'id': 'id456'}},
  'adress': {'city': 'riga', 'country': 'latvia'},
  'person': {'name': {'fullname': 'laura'},
   'gender': 'female',
   'birthYear': 1990},
  'subs': {'id': 'subs789',
   'subs1': {'birthday': '1990-12-14T00:00:00Z',
    'org': {'address': {'country': 'latvia', 'strret1': 'street 1'}}},
   'org': {'buyer': {'email': {'address': '[email protected]'},
     'phone': {'number': 371401234567}}}}}]
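
The result above is a list of Python dicts rather than JSON text. A minimal sketch of the last step, assuming the rows are already available as flat dicts: the names `nest_record` and `flat_records` are made up for illustration, and the single record is a shortened stand-in for one row. It uses the same `setdefault` walk, then serializes with the standard `json` module.

```python
import json


def nest_record(record):
    """Nest one flat mapping whose keys are dot-separated paths."""
    tree = {}
    for key, value in record.items():
        # Everything before the last dot is a chain of parent objects.
        *parents, leaf = key.split('.')
        node = tree
        for part in parents:
            # Reuse the child dict if it exists, otherwise create it.
            node = node.setdefault(part, {})
        node[leaf] = value
    return tree


# Hypothetical stand-in for the real rows. With pandas, flat dicts of this
# shape can be obtained from df.to_dict(orient='records').
flat_records = [
    {
        "org.iden.account": "account123",
        "org.iden.id": "id123",
        "person.birthYear": 1990,
    },
]

nested_records = [nest_record(r) for r in flat_records]
json_text = json.dumps(nested_records, indent=4)
print(json_text)
```

`json.dumps` returns a string; `json.dump` writes to an open file object instead if the result should go straight to disk.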

0

Assuming your JSON data looks something like this:

json_data = [
    {
        "org.iden.account": "account123",
        "org.iden.id": "id123",
        "adress.city": "riga",
        "adress.country": "latvia",
        "person.name.fullname": "laura",
        "person.gender": "female",
        "person.birthYear": 1990,
        "subs.id": "subs123",
        "subs.subs1.birthday": "1990-12-14T00:00:00Z",
        "subs.subs1.org.address.country": "latvia",
        "subs.subs1.org.address.strret1": "street 1",
        "subs.org.buyer.email.address": "[email protected]|[email protected]",
        "subs.org.buyer.phone.number": 371401234567
    },
    {
        "org.iden.account": "account123",
        "org.iden.id": "id000",
        "adress.city": "riga",
        "adress.country": "latvia",
        "person.name.fullname": "laura",
        "person.gender": "female",
        "person.birthYear": 1990,
        "subs.id": "subs456",
        "subs.subs1.birthday": "1990-12-14T00:00:00Z",
        "subs.subs1.org.address.country": "latvia",
        "subs.subs1.org.address.strret1": "street 1",
        "subs.org.buyer.email.address": "[email protected]",
        "subs.org.buyer.phone.number": 371407654321
    },
    {
        "org.iden.account": "account123",
        "org.iden.id": "id456",
        "adress.city": "riga",
        "adress.country": "latvia",
        "person.name.fullname": "laura",
        "person.gender": "female",
        "person.birthYear": 1990,
        "subs.id": "subs789",
        "subs.subs1.birthday": "1990-12-14T00:00:00Z",
        "subs.subs1.org.address.country": "latvia",
        "subs.subs1.org.address.strret1": "street 1",
        "subs.org.buyer.email.address": "[email protected]",
        "subs.org.buyer.phone.number": 371407654321
    }
]

You could nest it one dict at a time.

def nestify(unnested):
    nested = dict()
    for k, v in unnested.items():
        current_dict = nested
        parts = k.split('.')
        for i in parts[:-1]:
            if i not in current_dict:
                current_dict[i] = dict()
            current_dict = current_dict[i]
        current_dict[parts[-1]] = v
    return nested

This function takes one of the flat dicts, splits each key on '.', creates any missing intermediate dicts along that path, and assigns the value under the final part of the key.

Commented version

def nestify(unnested):
    # this will be our return value
    nested = dict()
    for k, v in unnested.items():
        # current_dict is the dict we're currently operating on;
        # it is reset to the base dict for each unnested key
        current_dict = nested
        parts = k.split('.')
        # only create dicts up to the final period
        # for example, current_dict is the base
        # and creates an empty dict under the org key
        # then current_dict is under the org key
        # and creates an empty dict under the iden key
        # then current_dict is under the iden key
        for i in parts[:-1]:
            # no reason to create an empty dict if it was
            # already created for a prior key
            if i not in current_dict:
                current_dict[i] = dict()
            current_dict = current_dict[i]
        # assign the value from the unnested dict
        # under the last part of the key in the innermost dict
        # for example, the final part of the first key is "account"
        # so rather than assign an empty dict, assign it "account123"
        current_dict[parts[-1]] = v
    return nested

Then you can just call it on each element of the json_data list in a comprehension.

nested = [nestify(i) for i in json_data]
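
One case the sample data does not exercise is a key that is both a value and a prefix of another key, for example "a" together with "a.b". With `nestify` as written, that pair either raises a TypeError (when "a" comes first) or silently replaces the nested dict (when "a.b" comes first). Below is a rough sketch of a stricter variant, under the assumption that such input should be rejected; `nestify_strict` is a hypothetical name, not part of the answer above.

```python
def nestify_strict(unnested):
    """Like nestify, but raise ValueError when two keys conflict."""
    nested = {}
    for key, value in unnested.items():
        current = nested
        parts = key.split('.')
        for part in parts[:-1]:
            child = current.setdefault(part, {})
            if not isinstance(child, dict):
                # An earlier key already stored a plain value at this path.
                raise ValueError(f"{key!r}: {part!r} already holds a value")
            current = child
        leaf = parts[-1]
        if isinstance(current.get(leaf), dict):
            # An earlier, longer key already created an object here.
            raise ValueError(f"{key!r} would overwrite a nested object")
        current[leaf] = value
    return nested


print(nestify_strict({"org.iden.account": "account123", "org.iden.id": "id123"}))
```

Note that this sketch treats any dict found at a leaf position as a conflict, so it assumes the flat values themselves are never dicts.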

Full code:

json_data = [
    {
        "org.iden.account": "account123",
        "org.iden.id": "id123",
        "adress.city": "riga",
        "adress.country": "latvia",
        "person.name.fullname": "laura",
        "person.gender": "female",
        "person.birthYear": 1990,
        "subs.id": "subs123",
        "subs.subs1.birthday": "1990-12-14T00:00:00Z",
        "subs.subs1.org.address.country": "latvia",
        "subs.subs1.org.address.strret1": "street 1",
        "subs.org.buyer.email.address": "[email protected]|[email protected]",
        "subs.org.buyer.phone.number": 371401234567
    },
    {
        "org.iden.account": "account123",
        "org.iden.id": "id000",
        "adress.city": "riga",
        "adress.country": "latvia",
        "person.name.fullname": "laura",
        "person.gender": "female",
        "person.birthYear": 1990,
        "subs.id": "subs456",
        "subs.subs1.birthday": "1990-12-14T00:00:00Z",
        "subs.subs1.org.address.country": "latvia",
        "subs.subs1.org.address.strret1": "street 1",
        "subs.org.buyer.email.address": "[email protected]",
        "subs.org.buyer.phone.number": 371407654321
    },
    {
        "org.iden.account": "account123",
        "org.iden.id": "id456",
        "adress.city": "riga",
        "adress.country": "latvia",
        "person.name.fullname": "laura",
        "person.gender": "female",
        "person.birthYear": 1990,
        "subs.id": "subs789",
        "subs.subs1.birthday": "1990-12-14T00:00:00Z",
        "subs.subs1.org.address.country": "latvia",
        "subs.subs1.org.address.strret1": "street 1",
        "subs.org.buyer.email.address": "[email protected]",
        "subs.org.buyer.phone.number": 371407654321
    }
]


def nestify(unnested):
    nested = dict()
    for k, v in unnested.items():
        current_dict = nested
        parts = k.split('.')
        for i in parts[:-1]:
            if i not in current_dict:
                current_dict[i] = dict()
            current_dict = current_dict[i]
        current_dict[parts[-1]] = v
    return nested

nested = [nestify(i) for i in json_data]
print(nested)

Output:

[
    {
        'adress': {
            'city': 'riga', 
            'country': 'latvia'
        },
        'org': {
            'iden': {
                'account': 'account123', 
                'id': 'id123'
            }
        },
        'person': {
            'birthYear': 1990,
            'gender': 'female',
            'name': {
                'fullname': 'laura'
            }
        },
        'subs': {
            'id': 'subs123',
            'org': {
                'buyer': {
                    'email': {
                        'address': '[email protected]|[email protected]'
                    },
                    'phone': {
                        'number': 371401234567
                    }
                }
            },
            'subs1': {
                'birthday': '1990-12-14T00:00:00Z',
                'org': {
                    'address': {
                        'country': 'latvia',
                        'strret1': 'street 1'
                    }
                }
            }
        }
    },
    {
        'adress': {
            'city': 'riga', 
            'country': 'latvia'
        },
        'org': {
            'iden': {
                'account': 'account123', 
                'id': 'id000'
            }
        },
        'person': {
            'birthYear': 1990,
            'gender': 'female',
            'name': {
                'fullname': 'laura'
            }
        },
        'subs': {
            'id': 'subs456',
            'org': {
                'buyer': {
                    'email': {
                        'address': '[email protected]'
                    },
                    'phone': {
                        'number': 371407654321
                    }
                }
            },
            'subs1': {
                'birthday': '1990-12-14T00:00:00Z',
                'org': {
                    'address': {
                        'country': 'latvia',
                        'strret1': 'street 1'
                    }
                }
            }
        }
    },
    {
        'adress': {
            'city': 'riga', 
            'country': 'latvia'
        },
        'org': {
            'iden': {
                'account': 'account123', 
                'id': 'id456'
            }
        },
        'person': {
            'birthYear': 1990,
            'gender': 'female',
            'name': {
                'fullname': 'laura'
            }
        },
        'subs': {
            'id': 'subs789',
            'org': {
                'buyer': {
                    'email': {
                        'address': '[email protected]'
                    },
                    'phone': {
                        'number': 371407654321
                    }
                }
            },
            'subs1': {
                'birthday': '1990-12-14T00:00:00Z',
                'org': {
                    'address': {
                        'country': 'latvia',
                        'strret1': 'street 1'
                    }
                }
            }
        }
    }
]
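
To sanity-check output like the above, one option is to flatten a nested record back into dotted keys and compare it with the original flat dict. This is only a sketch: `flatten` is a hypothetical helper, and `nested_row` is a shortened copy of the first record.

```python
def flatten(nested, prefix=""):
    """Turn a nested dict back into a flat dict with dot-separated keys."""
    flat = {}
    for key, value in nested.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into child objects, extending the dotted path.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


# Shortened copy of the first nested record, for illustration only.
nested_row = {
    "org": {"iden": {"account": "account123", "id": "id123"}},
    "person": {
        "name": {"fullname": "laura"},
        "gender": "female",
        "birthYear": 1990,
    },
}

flat_row = flatten(nested_row)
print(flat_row)
```

If `flatten(nestify(d)) == d` holds for every input dict `d`, the nesting step neither lost nor renamed any key.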
