
Say I have a lot of JSON lines to process, and I only care about specific fields in each line, e.g.:

{blablabla, 'whatICare': 1, blablabla}
{blablabla, 'whatICare': 2, blablabla}
....

Is there any way to extract whatICare from these JSON lines without loading them? Since the JSON lines are very long, it may be slow to build full objects from them.

  • Another option if you have a huge JSON file: store it in a MySQL database, where you can optimize your queries to get only what you care about. However, I am not sure whether it is the best way compared to the options mentioned below. Commented Oct 21, 2014 at 22:41

2 Answers


There is no reliable way to do that without writing your own parsing code.
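To illustrate what such hand-rolled extraction might look like (and why it is unreliable): a sketch that pulls the field with a regex instead of parsing the line. The field name and the assumption that its value is a bare integer in double-quoted JSON are taken from the question; this approach breaks on nested objects, escaped quotes, or non-numeric values.

```python
import re

# Fragile shortcut: grab the value of "whatICare" without parsing the JSON.
# Assumes the value is a bare integer and the key is double-quoted.
pattern = re.compile(r'"whatICare"\s*:\s*(\d+)')

line = '{"other": "lots of data", "whatICare": 2, "more": "data"}'
match = pattern.search(line)
value = int(match.group(1)) if match else None
print(value)  # 2
```

This avoids building a full object per line, but it is exactly the kind of custom parsing code the answer warns about: any change in formatting or value type silently breaks it.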

But check out ujson! It can be up to 10x faster than Python's built-in json library, which is a bit on the slow side.
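Since ujson is largely a drop-in replacement for the standard json module, swapping it in is a one-line change. A minimal sketch (the input line is made up for illustration; ujson itself must be installed separately with `pip install ujson`):

```python
import json

line = '{"other": "data", "whatICare": 1}'

# Standard library:
obj = json.loads(line)
print(obj["whatICare"])  # 1

# With ujson installed, the same call site works unchanged:
# import ujson as json
# obj = json.loads(line)
```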


1 Comment

In my previous experience with large JSON data, decoding was actually fast enough; the bottleneck was still reading the file from disk.

No, you will have to load and parse the JSON before you can know what's inside and filter out the desired elements.

That being said, if you are worried about memory, you could use ijson, which is an iterative parser. Instead of loading all the content at once, it loads only what is necessary for the next iteration. So if your file contains an array of objects, you can load and parse one object at a time, reducing the memory footprint (you only need to keep one object in memory, plus the data you actually care about). But it won't be faster, and it won't magically skip data you are not interested in.
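Since the question's input is one JSON object per line rather than a single large document, the same one-object-at-a-time idea is available in the standard library: parse each line as you read it, so only one decoded object is in memory at once. A sketch with a made-up in-memory "file":

```python
import io
import json

# Hypothetical input: a file-like object with one JSON object per line.
data = io.StringIO(
    '{"other": "x", "whatICare": 1}\n'
    '{"other": "y", "whatICare": 2}\n'
)

# Only one parsed object lives in memory at a time;
# keep just the field you care about.
values = []
for line in data:
    obj = json.loads(line)
    values.append(obj["whatICare"])

print(values)  # [1, 2]
```

For a real file you would replace the StringIO with `open("data.jsonl")`; the memory behavior is the same.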

