8

I'm trying to parse some data in Python. I have some JSON:

{
    "data sources": [
        "http://www.gcmap.com/"
    ],
    "metros": [
        {
            "code": "SCL",
            "continent": "South America",
            "coordinates": {
                "S": 33,
                "W": 71
            },
            "country": "CL",
            "name": "Santiago",
            "population": 6000000,
            "region": 1,
            "timezone": -4
        },
        {
            "code": "LIM",
            "continent": "South America",
            "coordinates": {
                "S": 12,
                "W": 77
            },
            "country": "PE",
            "name": "Lima",
            "population": 9050000,
            "region": 1,
            "timezone": -5
        }
    ]
}

If I wanted to parse the "metros" array into an array of Python Metro objects, how would I set up the class?

I was thinking:

class Metro(object):
    def __init__(self):
        self.code = 0
        self.name = ""
        self.country = ""
        self.continent = ""
        self.timezone = ""
        self.coordinates = []
        self.population = 0
        self.region = ""

So I want to go through each metro, put the data into a corresponding Metro object, and place that object into a Python list of objects. How can I loop through the JSON metros?

1
  • I do not understand the question. When you have JSON, you have an object, and you can get the metros list from this object. Commented Feb 21, 2013 at 19:20

7 Answers

15

If you always get the same keys, you can use ** to easily construct your instances. Making the Metro a namedtuple will simplify your life if you are using it simply to hold values:

from collections import namedtuple
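# The field list is split into two adjacent string literals, which Python
# joins automatically; a single quoted string cannot span two lines.
Metro = namedtuple('Metro', 'code, name, country, continent, timezone, coordinates, '
                            'population, region')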

then simply

import json
data = json.loads('''...''')
metros = [Metro(**k) for k in data["metros"]]
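
For reference, here is a self-contained sketch of this approach. It assumes the JSON from the question is available as a string; the variable name raw_json is only illustrative, and the input is shortened to a single metro.

```python
import json
from collections import namedtuple

# Field names must match the JSON keys exactly; a missing or extra key
# makes Metro(**k) raise TypeError.
Metro = namedtuple(
    'Metro',
    'code, name, country, continent, timezone, coordinates, population, region',
)

# Illustrative input: one entry copied from the question's JSON.
raw_json = '''
{
    "data sources": ["http://www.gcmap.com/"],
    "metros": [
        {
            "code": "SCL",
            "continent": "South America",
            "coordinates": {"S": 33, "W": 71},
            "country": "CL",
            "name": "Santiago",
            "population": 6000000,
            "region": 1,
            "timezone": -4
        }
    ]
}
'''

data = json.loads(raw_json)  # str -> dict
metros = [Metro(**k) for k in data["metros"]]

print(metros[0].name, metros[0].coordinates["S"])
```

Because a namedtuple is immutable, this fits best when the objects only hold values and are never modified after loading.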

5 Comments

this gives me an error : TypeError: string indices must be integers
metros = [Metro(**k) for k in data["metros"]] for that line
...did you get data from json.loads? Because if data is your raw JSON data then of course this won't work.
i used metros = json.dumps[myData["metros"]] then myMetros = [Metro(**k) for k in metros["metros"]]
...??? dumps gives you a string...if you already have myData["metros"] then why are you mucking with json in the first place?
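
The mix-up in these comments can be reproduced with a short sketch. The names raw_json, data and text are illustrative, not taken from the commenter's code: json.loads parses a string into Python objects, while json.dumps goes the other way and returns a string, which cannot be indexed with a key.

```python
import json

raw_json = '{"metros": [{"code": "SCL", "name": "Santiago"}]}'

data = json.loads(raw_json)         # parse: str -> dict
text = json.dumps(data["metros"])   # serialize: list -> str

print(type(data).__name__)
print(type(text).__name__)

# Indexing a str with a string key is what triggers the TypeError.
error_raised = False
try:
    text["metros"]
except TypeError as exc:
    error_raised = True
    print("TypeError:", exc)
```

So if data already came from json.loads, data["metros"] is the list to iterate over and no further json call is needed.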
6

It's relatively easy to do, since reading the data with json.load() gives you a Python dictionary for each element in "metros". Just walk through that list and create the list of Metro class instances. I modified the calling sequence of the Metro.__init__() method you had to make it easier to pass data to it from the dictionary returned from json.load().

Since each element of the "metros" list in the result is a dictionary, you can just pass that to class Metro's constructor using ** notation to turn it into keyword arguments. The constructor can then just update() its own __dict__ to transfer those values to itself.

The advantage of doing things this way, instead of using something like a collections.namedtuple as just a data container, is that Metro is a custom class, which makes adding any other methods and/or attributes you wish to it trivial.

import json

class Metro(object):
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

    def __str__(self):
        fields = ['    {}={!r}'.format(k,v)
                    for k, v in self.__dict__.items() if not k.startswith('_')]

        return '{}(\n{})'.format(self.__class__.__name__, ',\n'.join(fields))


with open('metros.json') as file:
    json_obj = json.load(file)

metros = [Metro(**metro_dict) for metro_dict in json_obj['metros']]

for metro in metros:
    print('{}\n'.format(metro))

Output:

Metro(
    code='SCL',
    continent='South America',
    coordinates={'S': 33, 'W': 71},
    country='CL',
    name='Santiago',
    population=6000000,
    region=1,
    timezone=-4)

Metro(
    code='LIM',
    continent='South America',
    coordinates={'S': 12, 'W': 77},
    country='PE',
    name='Lima',
    population=9050000,
    region=1,
    timezone=-5)
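
As an illustration of how a method might be added to such a class, here is a minimal sketch. The method signed_coordinates is hypothetical and not part of the answer; it assumes the coordinates use the keys 'N'/'S' and 'E'/'W' (only 'S' and 'W' appear in the question's data) and treats south and west as negative.

```python
class Metro(object):
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

    def signed_coordinates(self):
        """Return (latitude, longitude) in signed degrees.

        Hypothetical helper: assumes keys 'N'/'S' and 'E'/'W';
        a missing key counts as 0.
        """
        coords = self.coordinates
        lat = coords.get('N', 0) - coords.get('S', 0)
        lon = coords.get('E', 0) - coords.get('W', 0)
        return lat, lon


santiago = Metro(code='SCL', name='Santiago', coordinates={'S': 33, 'W': 71})
print(santiago.signed_coordinates())  # prints (-33, -71)
```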

Comments

5

Assuming you are using json to load the data, I would use a list of namedtuples here to store the data under the key 'metros':

>>> from collections import namedtuple
>>> metros = []
>>> for e in data[u'metros']:
    metros.append(namedtuple('metro', e.keys())(*e.values()))


>>> metros
[metro(code=u'SCL', name=u'Santiago', country=u'CL', region=1, coordinates={u'S': 33, u'W': 71}, timezone=-4, continent=u'South America', population=6000000), metro(code=u'LIM', name=u'Lima', country=u'PE', region=1, coordinates={u'S': 12, u'W': 77}, timezone=-5, continent=u'South America', population=9050000)]
>>> 

1 Comment

I would create the namedtuple in advance. namedtuple does an eval on a class definition, so it is quite heavyweight.
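
A sketch of what this comment suggests: build the namedtuple class once and reuse it for every entry. It assumes every metro dictionary has the same keys; the data dictionary below is a shortened stand-in for the parsed JSON.

```python
from collections import namedtuple

# Stand-in for the result of json.load(); fields shortened for brevity.
data = {
    'metros': [
        {'code': 'SCL', 'name': 'Santiago', 'population': 6000000},
        {'code': 'LIM', 'name': 'Lima', 'population': 9050000},
    ]
}

# Create the class a single time, outside the loop.
# Assumption: the first entry's keys are representative of all entries.
Metro = namedtuple('Metro', sorted(data['metros'][0]))

# Passing by keyword avoids relying on dictionary ordering.
metros = [Metro(**entry) for entry in data['metros']]

print(metros[1].name)
```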
2

Use the json module from the standard library (http://docs.python.org/2/library/json.html) to convert the JSON to a Python dictionary.

Comments

1

Maybe something like

import json

# Define the class before it is used below.
class Metro(object):
    def __init__(self, **kwargs):
        self.code = kwargs.get('code', 0)
        self.name = kwargs.get('name', "")
        self.country = kwargs.get('country', "")
        self.continent = kwargs.get('continent', "")
        self.timezone = kwargs.get('timezone', "")
        self.coordinates = kwargs.get('coordinates', [])
        self.population = kwargs.get('population', 0)
        self.region = kwargs.get('region', 0)

# json_string is assumed to hold the raw JSON text.
data = json.loads(json_string)
# json.loads returns a dict, so use key access rather than attribute access.
data["metros"] = [Metro(**m) for m in data["metros"]]

Comments

0
In [17]: def load_flat(data, inst):
   ....:     for key, value in data.items():
   ....:         if not hasattr(inst, key):
   ....:             raise AttributeError(key)
   ....:         else:
   ....:             setattr(inst, key, value)
   ....:             

In [18]: m = Metro()

In [19]: load_flat(data['metros'][0], m)

In [20]: m.__dict__
Out[20]: 
{'code': 'SCL',
 'continent': 'South America',
 'coordinates': {'S': 33, 'W': 71},
 'country': 'CL',
 'name': 'Santiago',
 'population': 6000000,
 'region': 1,
 'timezone': -4}

Not only is it very readable and explicit about what it does, it also provides some basic field validation (raising exceptions on mismatched fields, etc.).
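
To show the validation in action, here is a runnable sketch that combines the Metro class from the question with load_flat. The key 'mayor' is made up purely to trigger the error.

```python
class Metro(object):
    # Defaults as in the question; they also define which keys are accepted.
    def __init__(self):
        self.code = 0
        self.name = ""
        self.country = ""
        self.continent = ""
        self.timezone = ""
        self.coordinates = []
        self.population = 0
        self.region = ""


def load_flat(data, inst):
    """Copy each key of data onto inst, rejecting unknown keys."""
    for key, value in data.items():
        if not hasattr(inst, key):
            raise AttributeError(key)
        setattr(inst, key, value)


m = Metro()
load_flat({'code': 'SCL', 'name': 'Santiago'}, m)
print(m.code, m.name)

# A key that Metro does not declare is rejected.
rejected = None
try:
    load_flat({'mayor': 'unknown'}, Metro())
except AttributeError as exc:
    rejected = str(exc)
print("rejected key:", rejected)
```

Note that this only checks key names; it does not check value types, and keys missing from the input silently keep their defaults.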

Comments

-1

I would try ast. Something like:

import ast

metro = Metro()
metro.__dict__ = ast.literal_eval(a_single_metro_dict_string)

3 Comments

True, but from what I see in OP's question, this will suffice.
Yeah...but if the question says "json" I would stick with a JSON parser.
And if "right" was a comparative adjective, I would say your answer is "righter"
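
A small sketch of why a JSON parser is the safer choice, as these comments argue: ast.literal_eval only understands Python literals, so JSON-specific tokens such as true, false and null make it fail. The keys "active" and "mayor" are invented for the illustration and do not appear in the question's data.

```python
import ast
import json

# Valid JSON, but not a valid Python literal because of true and null.
text = '{"code": "SCL", "active": true, "mayor": null}'

parsed = json.loads(text)
print(parsed["active"], parsed["mayor"])

try:
    ast.literal_eval(text)
    literal_eval_ok = True
except (ValueError, SyntaxError):
    literal_eval_ok = False

print("literal_eval succeeded:", literal_eval_ok)
```

The question's sample data happens to contain only strings and numbers, which is why literal_eval would work there.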
