45

I'm using the json module in Python 2.6 to load and decode JSON files. However, I'm currently getting slower-than-expected performance: my test case is 6 MB in size, and json.loads() is taking 20 seconds.

I thought the json module had some native code to speed up the decoding?

How do I check if this is being used?

As a comparison, I downloaded and installed the python-cjson module, and cjson.decode() is taking 1 second for the same test case.

I'd rather use the JSON module provided with Python 2.6 so that users of my code aren't required to install additional modules.

(I'm developing on Mac OS X, but I'm getting a similar result on Windows XP.)
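
Roughly, the comparison I'm doing looks like this (a simplified sketch; the file name is just a placeholder for my 6 MB test case, and cjson is only timed if it happens to be installed):

import json
import time

with open("test_case.json", "rb") as f:  # placeholder name for the test file
    raw = f.read()

start = time.time()
json.loads(raw)
print("stdlib json.loads: %.1f s" % (time.time() - start))

try:
    import cjson
    start = time.time()
    cjson.decode(raw)
    print("cjson.decode:      %.1f s" % (time.time() - start))
except ImportError:
    pass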

2 Comments

This is solved in Python 2.7, per the comparison numbers from Tomas, Ivo, and TONy.W below. Tagged this python-2.6.
(Per TONy.W's numbers, the only remaining issue is that stdlib json encoding is still 2x slower in 2.7.)

7 Answers

31

The new Yajl (Yet Another JSON Library) is very fast.

yajl        serialize: 0.180  deserialize: 0.182  total: 0.362
simplejson  serialize: 0.840  deserialize: 0.490  total: 1.331
stdlib json serialize: 2.812  deserialize: 8.725  total: 11.537

You can compare the libraries yourself.

Update: UltraJSON is even faster.
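
To compare the libraries yourself, here is a rough sketch of this kind of serialize/deserialize timing. It assumes each module exposes json-compatible dumps()/loads() functions; only libraries that happen to be installed get timed, and the sample document is synthetic, not the one used for the numbers above:

import time

libs = {}
for name in ("yajl", "ujson", "simplejson", "json"):
    try:
        libs[name] = __import__(name)
    except ImportError:
        pass

# synthetic sample document, roughly a few MB when serialized
doc = dict(("key_%d" % i, [i, str(i), {"nested": i}]) for i in range(50000))

for name in sorted(libs):
    mod = libs[name]
    start = time.time()
    text = mod.dumps(doc)
    ser = time.time() - start
    start = time.time()
    mod.loads(text)
    de = time.time() - start
    print("%-11s serialize: %.3f  deserialize: %.3f  total: %.3f"
          % (name, ser, de, ser + de))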


2 Comments

Thanks for the ujson link ;) It gave me an extra 400 req/sec on my gevent/redis search service.
I tried the test myself and got very competitive results; I didn't see a 20x speedup at all. Using the compare script (from the link provided) and a large JSON file (20 MB), performance was still very comparable.
23

It may vary by platform, but the built-in json module is based on simplejson, just without the C speedups. I've found simplejson to be as fast as python-cjson anyway, so I prefer it, since it obviously has the same interface as the built-in module.

try:
    import simplejson as json
except ImportError:
    import json

Seems to me that's the best idiom for a while: you get the performance when it's available, while staying forward-compatible.
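
If you are relying on this idiom for speed, it can be worth checking that simplejson's C extension actually compiled on the target machine, since the fallback to pure Python is silent. A rough check (c_scanstring is a simplejson internal, so the attribute name may vary between versions):

try:
    import simplejson
    # c_scanstring is None when the _speedups C extension failed to build
    print("simplejson C speedups: %s"
          % (getattr(simplejson.decoder, "c_scanstring", None) is not None))
except ImportError:
    print("simplejson not installed; using stdlib json")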

5 Comments

FWIW, 11 Nov 2009: pypi.python.org/packages/source/s/simplejson/… on Mac 10.4.11 PPC, GCC 4.2.1 => simplejson/_speedups.c:2256: error: redefinition of ‘PyTypeObject PyEncoderType’. WARNING: The C extension could not be compiled, speedups are not enabled.
If py-yajil and ultrajson are faster, what's the advantage of using simplejson apart from looking the same as json and being pure Python?
@Shurane neither py-yajl nor ultrajson supports all the arguments that json does. simplejson is faster than json (due to the C speedups) and a drop-in replacement.
Not anymore. Many new json modules are faster.
Has this been investigated recently? My amateur timing of json decoding in 3.6 suggested that json was faster (with a small file) than either simplejson or ujson (which doesn't seem to be maintained anymore, anyway).
17

I parsed the same file 10 times. The file size was 1,856,944 bytes.

Python 2.6:

yajl        serialize: 0.294  deserialize: 0.334  total: 0.627
cjson       serialize: 0.494  deserialize: 0.276  total: 0.769
simplejson  serialize: 0.554  deserialize: 0.268  total: 0.823
stdlib json serialize: 3.917  deserialize: 17.508 total: 21.425

Python 2.7:

yajl        serialize: 0.289  deserialize: 0.312  total: 0.601
cjson       serialize: 0.232  deserialize: 0.254  total: 0.486
simplejson  serialize: 0.288  deserialize: 0.253  total: 0.540
stdlib json serialize: 0.273  deserialize: 0.256  total: 0.528

I'm not sure why the numbers are so far off from your results. Newer library versions, I guess?

3 Comments

I also noticed a significant performance difference in the stdlib (i.e. built-in) json module between Python 2.6 and 2.7. That is another reason to prefer Python 2.7 over 2.6.
The reason is simple: The C speedup for json was added in Python 2.7. From the release notes: "Updated module: The json module was upgraded to version 2.0.9 of the simplejson package, which includes a C extension that makes encoding and decoding faster." docs.python.org/dev/whatsnew/2.7.html
+1 Extremely valuable numbers Tomas. So 2.7 fixes everything.
15

Take a look at UltraJSON: https://github.com/esnme/ultrajson

Here is my test (code from https://gist.github.com/lightcatcher/1136415):

Platform: OS X 10.8.3, MacBook Pro, 2.2 GHz Intel Core i7

Library versions:

simplejson==3.1.0
python-cjson==1.0.5
jsonlib==1.6.1
ujson==1.30
yajl==0.3.5

JSON Benchmark
2.7.2 (default, Oct 11 2012, 20:14:37)
[GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)]
-----------------------------
ENCODING
simplejson: 0.293394s
cjson: 0.461517s
ujson: 0.222278s
jsonlib: 0.428641s
json: 0.759091s
yajl: 0.388836s

DECODING
simplejson: 0.556367s
cjson: 0.42649s
ujson: 0.212396s
jsonlib: 0.265861s
json: 0.365553s
yajl: 0.361718s

1 Comment

As of Aug 2014, these results are mostly accurate with the latest library versions. We tested against some of our own random data, and ujson slightly (about 5-10%) edges out cjson, which is about 2x simplejson and 3x json. cjson performance is also quite unstable on particular sets of data, so we are sticking with ujson.
3

For those who are parsing output from a request using the requests package, e.g.:

import json
import requests

res = requests.request(...)   # arguments elided
text = json.loads(res.text)

This can be very slow for larger response contents, say ~45 seconds for 6 MB on my 2017 MacBook. It is not caused by a slow JSON parser, but by slow character-set detection in the res.text call.

You can solve this by setting the character set before calling res.text, using the cchardet package:

import cchardet  # third-party package: pip install cchardet

if res.encoding is None:
    res.encoding = cchardet.detect(res.content)['encoding']

This makes parsing the response text as JSON almost instant!


2

Looking at my installation of Python 2.6.1 on Windows, the json package loads the _json module, which is built into the runtime. The C source for the json speedups module is Modules/_json.c in the CPython source tree.

>>> import _json
>>> _json
<module '_json' (built-in)>
>>> print _json.__doc__
json speedups
>>> dir(_json)
['__doc__', '__name__', '__package__', 'encode_basestring_ascii', 'scanstring']
>>> 
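
Note that the presence of _json does not mean the whole decode path is C-accelerated: in 2.6 only scanstring and encode_basestring_ascii come from C (as the dir() output above shows), while the main scan loop is still pure Python. A rough way to see what is actually bound (these are CPython internals, so attribute names differ between 2.6 and 2.7):

import json.decoder
import json.scanner

# True when the C scanstring from _json is in use
print("C scanstring in use:  %s"
      % (json.decoder.scanstring is getattr(json.decoder, "c_scanstring", None)))
# c_make_scanner (the fast C scan loop) only exists from Python 2.7 onwards
print("C scanner available:  %s"
      % (getattr(json.scanner, "c_make_scanner", None) is not None))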

2 Comments

I also assumed that json with _json would be fast. A benchmark proved me wrong.
The json package uses the _json C module under the hood already. There is little use in accessing it directly.
1

Even though _json is available, I've noticed that JSON decoding is very slow on CPython 2.6.6. I haven't compared it with other implementations, but I've switched to string manipulation inside performance-critical loops.
