
While using the zmdp solver from here, I came across an unusual file format that I haven't seen before: it uses => for assignment. I couldn't identify the format from the package documentation (it calls it a "policy" format, but it must be based on something more generic). An example:

{
  policyType => "MaxPlanesLowerBound",
  numPlanes => 7,
  planes => [
    {
      action => 2,
      numEntries => 3,
      entries => [
        0, 18.7429,
        1, 18.7426,
        2, 21.743
      ]
    },
    ### more entries ###
    {
      action => 3,
      numEntries => 3,
      entries => [
        0, 20.8262,
        1, 20.8261,
        2, 20.8259
      ]
    }
  ]
}

I spent a while researching a straightforward way to parse such files in Python, and also read this blog post, which covers a wide variety of options for lexing and parsing (the tools that looked most promising for my example were parsimonious and parsy).
However, every solution I can think of feels like re-inventing the wheel, and full lexing and parsing seems like overkill for what I'm trying to do.
I also found this Stack Overflow question, which coincidentally also concerns a format that uses =>. However, being lazy and minimalistic when it comes to code, I don't much like the regex solution given there. My gut feeling is that there must be a 3-4 line solution that reads the input file into a Python dict or a similarly useful structure. In particular, I suspect this is already the standard syntax of some format I'm just not aware of (it's obviously not CSV, JSON, YAML or XML).

The question therefore is: Is the above a standard file format, and if yes, what is it?
If not, how do I parse this file elegantly and compactly in Python3, i.e. without regexing for every keyword?

  • Looking at the source, especially the parsing code, it looks very ad hoc and not the "JSON with =>" you might guess from looking at it. (Also, the code writes a big JSON-incompatible comment section at the top that you probably stripped out.) Commented Nov 25, 2018 at 2:16
  • You can probably read the format by stripping out the header, replacing => with :, and stuffing the result into a JSON parser, but writing the format by dumping JSON and replacing : with => won't work. Commented Nov 25, 2018 at 2:19
  • (Also, unquoted keys are not standard JSON, and Python's json module doesn't accept them.) Commented Nov 25, 2018 at 2:27
  • @user2357112 True about the unquoted keys. As for writing, I don't have that requirement at the moment, but it's a good point. Commented Nov 25, 2018 at 2:57
  • For stripping out the comment section, I used policy_str = re.sub(r"[#].*\n?", "", policy_str) Commented Nov 25, 2018 at 3:24
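
To make the point about unquoted keys concrete, here is a small sketch. The `raw` string and its header line are made up for illustration (the real header written by zmdp is not shown in this thread), and `try_json` is a hypothetical helper. It strips comments with the pattern from the last comment, swaps => for :, and checks whether Python's json module accepts the result.

```python
import json
import re


def try_json(text: str) -> bool:
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Hypothetical miniature input; the header line is an assumption.
raw = """### header comment ###
{ numPlanes => 7 }
"""

# Strip '#' comments, then replace '=>' with ':'.
stripped = re.sub(r"[#].*\n?", "", raw).replace("=>", ":")

# The key is still unquoted here, so the json module rejects it.
print(try_json(stripped))
# Once the key is quoted, the same text is accepted.
print(try_json(stripped.replace("numPlanes", '"numPlanes"')))
```

So replacing => alone is not enough; the keys need quoting as well before json.loads will take the text.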

1 Answer


I don't see any differences from JSON here aside from replacing '=>' with ':' and adding a top level key.

import json
filestr = filestr.replace('=>', ':')  # str.replace returns a new string
dictionary = json.loads(filestr)

Edited after seeing comment above.

Unquoted keys are indeed not part of the JSON standard. To address that, you can use a library as described here, or you can quote the keys with a regex.
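
As a rough sketch of the regex route, assuming keys are plain identifiers, that # only ever starts a comment (never appears inside a string value), and that the file otherwise follows JSON rules: the function name `parse_policy` and the helper `entries_as_pairs` are made up here, and reading `entries` as alternating (index, value) pairs is a guess based on numEntries, not something the zmdp documentation confirms in this thread. The sample is the one from the question.

```python
import json
import re

# Sample taken from the question, including its '### more entries ###' marker.
SAMPLE = """{
  policyType => "MaxPlanesLowerBound",
  numPlanes => 7,
  planes => [
    {
      action => 2,
      numEntries => 3,
      entries => [
        0, 18.7429,
        1, 18.7426,
        2, 21.743
      ]
    },
    ### more entries ###
    {
      action => 3,
      numEntries => 3,
      entries => [
        0, 20.8262,
        1, 20.8261,
        2, 20.8259
      ]
    }
  ]
}
"""


def parse_policy(text: str) -> dict:
    """Convert '=>'-style policy text into a Python dict via the json module."""
    # Assumption: '#' never occurs inside a quoted string, so everything
    # from '#' to the end of the line can be dropped as a comment.
    text = re.sub(r"#[^\n]*", "", text)
    # Quote each bare key and turn '=>' into ':' in a single pass,
    # e.g.  numPlanes => 7   becomes   "numPlanes": 7
    text = re.sub(r"(\w+)\s*=>", r'"\1":', text)
    return json.loads(text)


def entries_as_pairs(entries: list) -> dict:
    """Assumption: entries alternate index, value; pair them up as a dict."""
    return dict(zip(entries[0::2], entries[1::2]))


policy = parse_policy(SAMPLE)
print(policy["policyType"], policy["numPlanes"], len(policy["planes"]))
print(entries_as_pairs(policy["planes"][0]["entries"]))
```

This only covers reading. As noted in the comments on the question, writing the format back by dumping JSON and replacing : with => is a separate problem.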


3 Comments

Exactly what I meant by elegant and compact! Great, thank you!
It actually required a little more work, as pointed out by @user2357112: I had to add quotation marks around the keys, so I ended up using policy_str = re.sub(r"\S+ =>", lambda m: "\"" + m.group(0).replace(" =>", "\":"), policy_str) followed by json.loads(policy_str),
but overall it is still pretty compact and less work than writing an entire parser.
