-1

Dealing with a nasty bit of JSON. I am using json.load to read it from a file and have it stored as a dict, printed below. In Python, how would I go about getting a list of just the "dimensions" values starting after "false_value" (as the first dimension value is not actually a value I want)?

I tried a kind of hacky way, but I feel like someone may have a perspective on how to do this in a more elegant fashion.

Goal: make a list of all the dimension values except the first, such as ['100', '121', ...].

{
    "reports": [
        {
            "columnHeader": {
                "dimensions": [
                    "ga:clientId"
                ],
                "metricHeader": {
                    "metricHeaderEntries": [
                        {
                            "name": "blah",
                            "type": "INTEGER"
                        }
                    ]
                }
            },
            "data": {
                "rows": [
                    {
                        "dimensions": [
                            "false_value"
                        ],
                        "metrics": [
                            {
                                "values": [
                                    "2"
                                ]
                            }
                        ]
                    },
                    {
                        "dimensions": [
                            "100"
                        ],
                        "metrics": [
                            {
                                "values": [
                                    "2"
                                ]
                            }
                        ]
                    },
                    {
                        "dimensions": [
                            "121"
                        ],
                        "metrics": [
                            {
                                "values": [
                                    "1"
                                ]
                            }
                        ]
                    },
                    {
                        "dimensions": [
                            "1212"
                        ],
                        "metrics": [
                            {
                                "values": [
                                    "1"
                                ]
                            }
                        ]
                    }, ],
                "totals": [
                    {
                        "values": [
                            "10497"
                        ]
                    }
                ],
                "rowCount": 9028,
                "minimums": [
                    {
                        "values": [
                            "0"
                        ]
                    }
                ],
                "maximums": [
                    {
                        "values": [
                            "9"
                        ]
                    }
                ],
                "isDataGolden": true
            },
            "nextPageToken": "1000"
        }
    ]
}
  • I believe you can iterate through all the keys/subkeys/values and dump them into a list. See here: stackoverflow.com/questions/45974937 Commented Feb 12, 2021 at 2:20
  • Did you mean True rather than true (i.e. True/False for Python Boolean)? Commented Feb 12, 2021 at 2:34
  • @DarrylG this is JSON, not Python code. Commented Feb 12, 2021 at 2:37
  • A simple recursive function that uses isinstance() in conditions should do the trick. Commented Feb 12, 2021 at 2:39
  • @DannyVarod--when OP says "it stored is a dict type , printed below" I assume OP was displaying as a dictionary. If we consider it as a string, then json.loads(...) gives structural errors (which is also elicited by a json lint validator for the string). If you change the true to True, then it works as a dictionary. Commented Feb 12, 2021 at 2:42
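
On the true/True point raised in the comments above, here is a minimal sketch of how json.loads treats the two issues. The strings are made-up stand-ins, not the OP's actual file: JSON true is parsed into Python True, while a trailing comma inside an array (like the "}, ]" in the posted text) is rejected.

```python
import json

# Assumption: a tiny stand-in for the OP's data, not the real file.
valid_text = '{"isDataGolden": true, "rows": [{"dimensions": ["100"]}]}'
parsed = json.loads(valid_text)
print(parsed["isDataGolden"])  # JSON true becomes Python True

# Same shape, but with a trailing comma before the closing bracket.
invalid_text = '{"rows": [{"dimensions": ["100"]}, ]}'
try:
    json.loads(invalid_text)
    trailing_comma_ok = True
except json.JSONDecodeError as err:
    trailing_comma_ok = False
    print("Invalid JSON:", err)
```

So if the data really came out of json.load, it is already a dict with True in it, and the printed text only needs editing if it is pasted back in as a Python literal or re-parsed as JSON.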

3 Answers

2

First, you should put your JSON object in a more readable textual form. Use something like Black to clean up the spacing. Then just traverse the keys until you find the value you need; this post will help you.

You should end up with something like this:

# data is the dict returned by json.load (named data to avoid shadowing the json module)
dimensions = [row["dimensions"][0] for row in data["reports"][0]["data"]["rows"]]

2 Comments

Very cool and neat, ty so much. Any tips on how to ignore the first value?
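
One possible way to ignore the first value, sketched on a trimmed copy of the question's data: slice the rows with [1:] before the comprehension. This assumes the unwanted "false_value" row is always the first row; the variable name report is just a placeholder for the loaded dict.

```python
# Assumption: `report` stands in for the dict loaded from the OP's JSON,
# trimmed down to the keys this example needs.
report = {
    "reports": [{
        "data": {"rows": [
            {"dimensions": ["false_value"], "metrics": [{"values": ["2"]}]},
            {"dimensions": ["100"], "metrics": [{"values": ["2"]}]},
            {"dimensions": ["121"], "metrics": [{"values": ["1"]}]},
            {"dimensions": ["1212"], "metrics": [{"values": ["1"]}]},
        ]}
    }]
}

rows = report["reports"][0]["data"]["rows"]
# rows[1:] skips the first row, which holds "false_value".
dimensions = [row["dimensions"][0] for row in rows[1:]]
print(dimensions)  # ['100', '121', '1212']
```

If the unwanted row is not guaranteed to come first, filtering on the value itself (for example with a condition in the comprehension) would be safer than slicing.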
0

Use a recursive function to find values that satisfy two conditions:

  • The parent key is "dimensions"
  • The value is numeric

Code

def find_dims(d, inside=False, results=None):
    '''
        Recursively walk the structure and collect numeric dimension values.
        inside = True when a parent key was "dimensions"
    '''
    if results is None:
        results = []

    if isinstance(d, dict):
        for k, v in d.items():
            find_dims(v, k == "dimensions" or inside, results)
    elif isinstance(d, list):
        for item in d:
            find_dims(item, inside, results)
    elif inside and isinstance(d, str) and d.isdigit():
        # String of digits found under "dimensions"
        results.append(int(d))

    return results

Test

OP's dictionary (with true changed to True)

d = {
    "reports": [
        {
            "columnHeader": {
                "dimensions": [
                    "ga:clientId"
                ],
                "metricHeader": {
                    "metricHeaderEntries": [
                        {
                            "name": "blah",
                            "type": "INTEGER"
                        }
                    ]
                }
            },
            "data": {
                "rows": [
                    {
                        "dimensions": [
                            "false_value"
                        ],
                        "metrics": [
                            {
                                "values": [
                                    "2"
                                ]
                            }
                        ]
                    },
                    {
                        "dimensions": [
                            "100"
                        ],
                        "metrics": [
                            {
                                "values": [
                                    "2"
                                ]
                            }
                        ]
                    },
                    {
                        "dimensions": [
                            "121"
                        ],
                        "metrics": [
                            {
                                "values": [
                                    "1"
                                ]
                            }
                        ]
                    },
                    {
                        "dimensions": [
                            "1212"
                        ],
                        "metrics": [
                            {
                                "values": [
                                    "1"
                                ]
                            }
                        ]
                    }, ],
                "totals": [
                    {
                        "values": [
                            "10497"
                        ]
                    }
                ],
                "rowCount": 9028,
                "minimums": [
                    {
                        "values": [
                            "0"
                        ]
                    }
                ],
                "maximums": [
                    {
                        "values": [
                            "9"
                        ]
                    }
                ],
                "isDataGolden": True
            },
            "nextPageToken": "1000"
        }
    ]
}

print(find_dims(d)) # Output: [100, 121, 1212]

0

As stated in the comments, you can just use a simple recursive function, for example:

all_dimensions = []
search_key = 'dimensions'

def searchDimensions(data):
    # Walk the structure and collect every value stored under search_key.
    if isinstance(data, dict):
        for key, sub_data in data.items():
            if key == search_key:
                all_dimensions.extend(sub_data)
            else:
                searchDimensions(sub_data)
    elif isinstance(data, list):
        for sub_data in data:
            searchDimensions(sub_data)

searchDimensions(example)  # example is the dict loaded from the JSON
# Keep only the values that come after 'false_value'.
false_value_index = all_dimensions.index('false_value') + 1
output = all_dimensions[false_value_index:]
print(output)  # Output: ['100', '121', '1212']

The function collects every "dimensions" value; afterwards, filter out the ones you don't want (e.g. keep only those after false_value).
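
A variant of the same idea that avoids the module-level list, as a rough sketch: a generator yields the values, and itertools.dropwhile plus islice discard everything up to and including "false_value". The name iter_key_values and the trimmed example dict are made up for illustration.

```python
from itertools import dropwhile, islice

def iter_key_values(node, key="dimensions"):
    """Yield every item of each list stored under `key`, depth-first."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key and isinstance(v, list):
                yield from v
            else:
                yield from iter_key_values(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from iter_key_values(item, key)

# Assumption: a trimmed stand-in for the dict in the question.
example = {
    "reports": [{
        "columnHeader": {"dimensions": ["ga:clientId"]},
        "data": {"rows": [
            {"dimensions": ["false_value"]},
            {"dimensions": ["100"]},
            {"dimensions": ["121"]},
        ]},
    }]
}

values = iter_key_values(example)
# Drop items until "false_value" is reached, then skip the marker itself.
after_marker = dropwhile(lambda v: v != "false_value", values)
output = list(islice(after_marker, 1, None))
print(output)  # ['100', '121']
```

This relies on dicts keeping insertion order (Python 3.7+), so "columnHeader" is visited before "data", as in the original answer.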
