[[{"date":"January 2004"},true,false,100,null,null,true],[{"date":"February 2004"},false,false,99,null,null,true]]

I have a long string of data that came from a javascript file, like the above. Is there a short cut or library that would parse this into the appropriate data types?

As you can see, it's a list of lists that contain dictionaries, Boolean values, integers and null values.

I mean, I could do this by hand but I don't think I could do it very quickly or efficiently. There must be a better method.

  • Are there really no closing } brackets for those { brackets? Commented May 23, 2014 at 2:26
  • No, my mistake. I was trying to simplify the full version. I corrected it above. Commented May 23, 2014 at 2:34
  • Isn't this just json? Not sure why it has True/False instead of true/false though. Commented May 23, 2014 at 2:41

2 Answers

5

That's pretty close to valid JSON. The only invalid part is the capitalized booleans: JSON spells them true and false, not True and False. That looked like a transcription error, and it was; the question has since been corrected.


Use json:

import json

x = '[[{"date":"January 2004"},true,false,100,null,null,true],[{"date":"February 2004"},false,false,99,null,null,true]]'

json.loads(x)
Out[20]: 
[[{'date': 'January 2004'}, True, False, 100, None, None, True],
 [{'date': 'February 2004'}, False, False, 99, None, None, True]]
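As a small illustration of how the parsed result maps onto Python types, the sketch below walks the first row. The names rows, first and type_names are only illustrative.

```python
import json

x = '[[{"date":"January 2004"},true,false,100,null,null,true],[{"date":"February 2004"},false,false,99,null,null,true]]'
rows = json.loads(x)

# JSON arrays become lists, objects become dicts, true/false become bool,
# null becomes None, and whole numbers become int.
first = rows[0]
type_names = [type(item).__name__ for item in first]
print(type_names)

# Nested values are reached with ordinary indexing.
print(first[0]["date"])
```

So no hand-written conversion should be needed once the text is valid JSON.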

2 Comments

macdonjo says he's getting the output from JavaScript, so I wonder if he capitalized the Boolean values when he was posting the data...
Yes, you guys got me again, I capitalized them just because I was working in Python and it didn't cross my mind that it would make a difference. Yep, they're lowercase! :)
2

I suggest you take a look at PyParsing.

http://pyparsing.wikispaces.com/

You could also take a look at the Python "scanf" library.

sscanf in Python

If you needed to solve this problem just using Python built-ins, I would recommend using a regular expression with capture groups.

EDIT: Hmm, I took another look at this. You did say it was from JavaScript... this looks to me like a legal JSON array. I tried the json module (specifically, the json.loads() function) but I couldn't get it to parse.
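A failure like this can usually be diagnosed from the exception itself. Here is a minimal sketch, assuming Python 3.5 or later (where json.JSONDecodeError exists). The string bad is a hypothetical stand-in for the capitalized version of the data, not the exact text that was tried.

```python
import json

# Hypothetical input with Python-style capitalized booleans.
bad = '[[{"date":"January 2004"},True,False,100,null,null,true]]'

try:
    json.loads(bad)
    offending = None
except json.JSONDecodeError as exc:
    # exc.pos is the character offset at which the parser gave up.
    offending = bad[exc.pos:exc.pos + 4]
    print("could not parse near:", offending)
```

Looking at the text around exc.pos points straight at the token the parser rejected.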

But! Python syntax is close to JavaScript syntax. Replace a few things and eval() can parse this, or ast.literal_eval(). We need to replace true with True, false with False, and null with None before ast.literal_eval() will accept it.

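import ast
# Mixed input: some booleans are capitalized (Python style), some are not.
s = '[[{"date":"January 2004"},True,False,100,null,null,true],[{"date":"February 2004"},False,False,99,null,null,true]]'
# Convert the JSON keywords to their Python spellings. Caveat: str.replace() is
# purely textual, so it would also rewrite "true", "false" or "null" inside a string value.
s1 = s.replace("true","True").replace("false","False").replace("null","None")
# literal_eval() only accepts Python literals, so it cannot run arbitrary code.
x = ast.literal_eval(s1)
print(x)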

The above will print:

[[{'date': 'January 2004'}, True, False, 100, None, None, True], [{'date': 'February 2004'}, False, False, 99, None, None, True]]

Originally I showed defining variables (like true = True) and using eval() to parse this, but of course eval() is a potential security hole; so if you need to parse text that might come from a web page or any other untrusted source, it's worth the small amount of effort to import ast and use ast.literal_eval() instead.
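To make the difference concrete, here is a small sketch of what ast.literal_eval() accepts and what it refuses. The sample strings are made up for illustration.

```python
import ast

# Plain literals (lists, dicts, strings, numbers, True/False/None) are accepted.
value = ast.literal_eval('[{"date": "January 2004"}, True, None, 100]')

# Anything that is not a literal (here, a function call) is rejected
# with ValueError instead of being executed, as eval() would do.
try:
    ast.literal_eval('__import__("os").getcwd()')
    rejected = False
except ValueError:
    rejected = True

print(value, rejected)
```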

EDIT: Okay, the json module can parse this; the problem was the use of True instead of true and False instead of false. Just use the str.replace() method to fix those, and then json.loads() can parse it.

I was just about to post a code fragment with the .replace() method calls, when the question got updated again, and the capitalized True and False became ordinary legal JSON ones.
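For reference, a fix-up along those lines might look like the sketch below. It is not the fragment I was going to post: instead of chained .replace() calls it uses a regular expression that skips over quoted strings, so a value such as "True story" is left alone. The helper name pythonish_to_json and the sample input are made up for illustration, and handling None as well is an extra assumption.

```python
import json
import re

# Either a complete double-quoted string (matched but left unchanged),
# or a bare Python-style keyword captured in group 1.
_TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|\b(True|False|None)\b')

_JSON_NAMES = {"True": "true", "False": "false", "None": "null"}

def pythonish_to_json(text):
    """Rewrite True/False/None outside of strings to their JSON spellings."""
    def fix(match):
        keyword = match.group(1)
        if keyword is None:
            # A quoted string matched: return it untouched.
            return match.group(0)
        return _JSON_NAMES[keyword]
    return _TOKEN.sub(fix, text)

# Hypothetical input mixing Python and JSON spellings.
raw = '[[{"date":"True story"},True,False,100,null,None,true]]'
data = json.loads(pythonish_to_json(raw))
print(data)
```

With data that is already valid JSON, none of this is needed.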

So my final answer:

s = '[[{"date":"January 2004"},true,false,100,null,null,true],[{"date":"February 2004"},false,false,99,null,null,true]]'

import json

x = json.loads(s)
print(x)

prints (this output is from Python 2, hence the u prefix on each string; Python 3 omits it):

[[{u'date': u'January 2004'}, True, False, 100, None, None, True], [{u'date': u'February 2004'}, False, False, 99, None, None, True]]

5 Comments

You didn't enter the eval argument as a string. Remember, it's a string.
@macdonjo Thanks for pointing that out. When I tested it, it worked, but when I typed it in here I failed to put the string quotes. Usually I copy/paste from my Python session so I'm putting the correct tested code, but I must not have done that this time; I wonder why not.
As for the source of the data, it's from a very popular website and it's not user entered. It's just a big database. So I guess this should be safe?
It's probably safe, but if you were going to do the "eval" trick I would suggest using ast.literal_eval() anyway. I'll modify my example to use that. But you might as well use json.loads() since it really is legal JSON.
Pyparsing is no longer hosted on wikispaces.com. Go to github.com/pyparsing/pyparsing
