
I need to get the main keys (devices) from a JSON formatted text with around 70,000 (sub-)keys/objects. It looks like this:

{
   "1":{...........},
   "4":{...........},
   "9":{...........}
}

And I need to get "1", "4" and "9". But the way I do it now, it takes around 2 minutes to parse the text with

data = json.loads(response.text)  # this takes so long!
devices = data.keys()

because I'm running this on a Raspberry Pi!

Is there a better way?

EDIT: I receive the data from a JSON API running on a server with:

http://.../ZWaveAPI/Run/devices #this is an array

EDIT3:

final working code (runs in 2-5 seconds! :)

import ijson.backends.python as ijson
import urllib

parser = ijson.parse(urllib.urlopen("http://.../ZWaveAPI/Run/devices"))
devices = []
for prefix, event, value in parser:
    # a top-level key arrives as a "map_key" event with an empty prefix
    if event == "map_key" and prefix == "":
        devices.append(value)
  • use a database and only query what you need when you need it? Commented Apr 4, 2013 at 20:22
  • I can't change the data I get... I receive a text with many keys and I need to get the main keys... Or is there a possibility in the way I get the data? (see EDIT) Commented Apr 4, 2013 at 20:48

2 Answers


You can do it with a stream-oriented, iterative JSON parser, but you'll need to install it separately. Try out ijson; it emits an event for each JSON structure it encounters:

import ijson

parser = ijson.parse(response)  # response: a file-like object, e.g. from urlopen()
for prefix, event, value in parser:
    if event == 'map_key':
        print value
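If installing a third-party parser isn't an option, the same idea — collecting only the top-level keys while discarding each decoded value immediately — can be sketched with just the standard library. (`top_level_keys` and its scanning loop are illustrative, not part of any library:)

```python
import json

def top_level_keys(text):
    # Sketch: walk a JSON object and collect only its top-level keys.
    # Each value is decoded with raw_decode() and dropped at once, so
    # the full 70,000-entry structure is never held in memory together.
    decoder = json.JSONDecoder()
    keys = []
    i = text.index('{') + 1
    while True:
        while text[i] in ' \t\r\n,':          # skip whitespace/commas
            i += 1
        if text[i] == '}':                    # end of the object
            return keys
        key, i = decoder.raw_decode(text, i)  # decode the key string
        keys.append(key)
        while text[i] in ' \t\r\n:':          # skip the colon
            i += 1
        _, i = decoder.raw_decode(text, i)    # decode and discard the value
```

This avoids keeping the whole decoded tree, though each individual value is still built transiently; a real streaming parser like ijson avoids even that.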

10 Comments

But this will just fire events while it parses; it isn't actually faster, is it?
It will give you access to intermediate results faster since you'll get them before the whole thing is loaded. It will also use a lot less memory since you won't be building up a giant data structure filled with things you aren't going to use. So it should be at least somewhat faster.
@TeNNoX: you have to scan over the intermediary results anyway to get to the keys you are interested in. But with a streaming parser you don't need to create python objects for the whole data set, which speeds things up.
Okay, then I will try that. But the events aren't really necessary, because I need to wait for everything to finish anyway, right?
@TeNNoX: It needs a file-like object; pass in the result of urlopen() without calling .read() yourself.

Have you tried to experiment with getting just a single device? With most RESTful web services, if you see a URL like this:

"http://.../ZWaveAPI/Run/devices"

Chances are, you can GET an individual device by:

"http://.../ZWaveAPI/Run/devices/1"

If it works, it should greatly reduce the amount of data you have to download and parse.
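A small helper could build those per-device URLs before fetching each one. (This is a sketch under the assumption above; `device_url` is a hypothetical name, and the real host is elided in the question:)

```python
def device_url(base, device_id):
    # Hypothetical helper: assumes the API exposes one device at
    # base + "/" + id, e.g. .../ZWaveAPI/Run/devices/1
    return "%s/%s" % (base.rstrip("/"), device_id)

# Each URL could then be fetched with urllib and parsed individually,
# far less data per request than the full 70,000-key document.
```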

1 Comment

Yeah, but I need a valid list of all devices first; I can't try out all the numbers... By the way, I reduced the time to 3 seconds with ijson, as seen in EDIT3.
