With help from Stack Overflow, I've gotten this far. I need some more help converting this JSON to a SQL table. Any help is highly appreciated.

{
    "Volumes": [{
        "AvailabilityZone": "us-east-1a",
        "Attachments": [{
            "AttachTime": "2013-12-18T22:35:00.000Z",
            "InstanceId": "i-1234567890abcdef0",
            "VolumeId": "vol-049df61146c4d7901",
            "State": "attached",
            "DeleteOnTermination": true,
            "Device": "/dev/sda1",

            "Tags": [{
                "Value": "DBJanitor-Private",
                "Key": "Name"
            }, {
                "Value": "DBJanitor",
                "Key": "Owner"
            }, {
                "Value": "Database",
                "Key": "Product"
            }, {
                "Value": "DB Janitor",
                "Key": "Portfolio"
            }, {
                "Value": "DB Service",
                "Key": "Service"
            }]
        }],
        "Ebs": {
            "Status": "attached",
            "DeleteOnTermination": true,
            "VolumeId": "vol-049df61146c4d7901",
            "AttachTime": "2016-09-14T19:49:11.000Z"
        },
        "VolumeType": "standard",
        "VolumeId": "vol-049df61146c4d7901"
    }]
}

With the help of Stack Overflow, I was able to flatten everything up to Tags, but I can't figure out how to handle the Ebs piece. I'm pretty new to coding, and any help is deeply appreciated.

In [1]: import json
   ...: import pandas as pd
   ...: fn = r'D:\temp\.data\40454898.json'

In [2]: with open(fn) as f:
   ...:     data = json.load(f)
   ...:

In [14]: t = pd.json_normalize(data['Volumes'],
    ...:                       ['Attachments','Tags'],
    ...:                       [['Attachments', 'VolumeId'],
    ...:                        ['Attachments', 'InstanceId']])
    ...:

In [15]: t
Out[15]:
         Key              Value Attachments.InstanceId   Attachments.VolumeId
0       Name  DBJanitor-Private    i-1234567890abcdef0  vol-049df61146c4d7901
1      Owner          DBJanitor    i-1234567890abcdef0  vol-049df61146c4d7901
2    Product           Database    i-1234567890abcdef0  vol-049df61146c4d7901
3  Portfolio         DB Janitor    i-1234567890abcdef0  vol-049df61146c4d7901
4    Service         DB Service    i-1234567890abcdef0  vol-049df61146c4d7901

Thanks

1 Answer

json_normalize expects the record path to point to a list of dictionaries, but Ebs is a single dictionary, so we should preprocess the JSON data first:

In [88]: with open(fn) as f:
    ...:     data = json.load(f)
    ...:

In [89]: for r in data['Volumes']:
    ...:     if 'Ebs' not in r: # add an empty 'Ebs' list if the record has none
    ...:         r['Ebs'] = []
    ...:     if not isinstance(r['Ebs'], list): # wrap a single 'Ebs' dict in a list
    ...:         r['Ebs'] = [r['Ebs']]
    ...:

In [90]: data
Out[90]:
{'Volumes': [{'Attachments': [{'AttachTime': '2013-12-18T22:35:00.000Z',
     'DeleteOnTermination': True,
     'Device': '/dev/sda1',
     'InstanceId': 'i-1234567890abcdef0',
     'State': 'attached',
     'Tags': [{'Key': 'Name', 'Value': 'DBJanitor-Private'},
      {'Key': 'Owner', 'Value': 'DBJanitor'},
      {'Key': 'Product', 'Value': 'Database'},
      {'Key': 'Portfolio', 'Value': 'DB Janitor'},
      {'Key': 'Service', 'Value': 'DB Service'}],
     'VolumeId': 'vol-049df61146c4d7901'}],
   'AvailabilityZone': 'us-east-1a',
   'Ebs': [{'AttachTime': '2016-09-14T19:49:11.000Z',
     'DeleteOnTermination': True,
     'Status': 'attached',
     'VolumeId': 'vol-049df61146c4d7901'}],
   'VolumeId': 'vol-049df61146c4d7901',
   'VolumeType': 'standard'}]}

NOTE: 'Ebs': {..} has been replaced with 'Ebs': [{..}]
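
The same wrapping step can be factored into a small helper so it works for any key, not just Ebs. This is only a sketch: `ensure_list` is a hypothetical name, the second sample record is made up to show the missing-key case, and it assumes a pandas version that provides the top-level `pd.json_normalize`.

```python
import pandas as pd


def ensure_list(records, key):
    """Make sure record[key] is a list in every record (hypothetical helper).

    A missing key or None becomes an empty list, a single dict is wrapped in
    a one-element list, and an existing list is left untouched.
    """
    for record in records:
        value = record.get(key)
        if value is None:
            record[key] = []
        elif not isinstance(value, list):
            record[key] = [value]
    return records


# Trimmed copy of the data from the question, plus one made-up record
# that has no 'Ebs' key at all.
volumes = [
    {"VolumeId": "vol-049df61146c4d7901",
     "Ebs": {"Status": "attached", "DeleteOnTermination": True}},
    {"VolumeId": "vol-without-ebs"},
]

ensure_list(volumes, "Ebs")

# Records with an empty 'Ebs' list simply contribute no rows.
ebs = pd.json_normalize(volumes, record_path="Ebs", meta=["VolumeId"],
                        meta_prefix="parent_")
print(ebs)
```

Note that the helper modifies the records in place, just like the loop above.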

In [91]: e = pd.json_normalize(data['Volumes'],
    ...:                       ['Ebs'],
    ...:                       ['VolumeId'],
    ...:                       meta_prefix='parent_')
    ...:


In [92]: e
Out[92]:
                 AttachTime DeleteOnTermination    Status               VolumeId        parent_VolumeId
0  2016-09-14T19:49:11.000Z                True  attached  vol-049df61146c4d7901  vol-049df61146c4d7901
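
Since the goal is a SQL table, the resulting DataFrame can then be written out with `DataFrame.to_sql`. Here is a minimal sketch, assuming SQLite as the target; the in-memory database and the table name `ebs` are placeholders chosen for illustration, and the frame is rebuilt by hand from the output above so the snippet runs on its own.

```python
import sqlite3

import pandas as pd

# One-row frame mirroring the Ebs output above.
ebs = pd.DataFrame([{
    "AttachTime": "2016-09-14T19:49:11.000Z",
    "DeleteOnTermination": True,
    "Status": "attached",
    "VolumeId": "vol-049df61146c4d7901",
    "parent_VolumeId": "vol-049df61146c4d7901",
}])

# In-memory database for illustration; the table name 'ebs' is an assumption.
conn = sqlite3.connect(":memory:")
ebs.to_sql("ebs", conn, index=False, if_exists="replace")

# Read a couple of columns back to confirm the table was created.
rows = pd.read_sql_query("SELECT Status, parent_VolumeId FROM ebs", conn)
print(rows)
conn.close()
```

For another database you would pass a SQLAlchemy engine instead of the sqlite3 connection; the Tags frame from the question can be written to its own table the same way.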