
I'm working on a Bash script. In it, there are a couple of occasions where I need to parse some JSON. My usual approach for that is as follows:

MY_JSON=$(some command that prints JSON to stdout)
RESULT=$(echo "${MY_JSON}" | python -c "import json,sys;data=json.load(sys.stdin); python code here that prints out the value(s) I need")

This tends to work well. However, yesterday I ran into a problem. I had the following code:

MY_JSON=$(command that returns JSON containing an array of IDs)
IDS=$(echo "${MY_JSON}" | python -c "import json,sys;data=json.load(sys.stdin); for a in data['array']: print(a['id'])")

When I ran that code, I got "SyntaxError: invalid syntax", with the caret pointing at the f in for.

In my googling, everything I found indicated that when you get a syntax error on the very first character of a statement, it usually means that you screwed something up in the previous statement. However, if I remove the for loop entirely, I get no syntax error. So, obviously, the problem is with the loop.

What did I do wrong? How can the syntax error be at the first character of a valid keyword?

I ended up finding the answer, which I'll post below to help others who are trying to build Python one-liners involving a for loop -- but I'm hoping someone can chime in with a better answer, perhaps using comprehensions (which I don't fully understand) or something else instead of the for loop so that I can actually accomplish this in a single line. Using a language other than Python would also be acceptable, so long as it's something typically available on a Linux host.

To be clear, I'd be looking for solutions using true JSON parsing, not some approximation using your favorite string manipulation tool (sed, awk, etc) that would be fragile with respect to things like whether the JSON is pretty-printed.


3 Answers

5

Statements in Python's grammar are divided into two groups, simple statements and compound statements:

stmt: simple_stmt | compound_stmt

Only a simple statement can contain ;, and a simple statement is limited to so-called small statements:

simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE

Small statements do not include for loops.

small_stmt: (expr_stmt | del_stmt | pass_stmt | flow_stmt |
             import_stmt | global_stmt | nonlocal_stmt | assert_stmt)

A for loop is, rather, a compound statement:

compound_stmt: if_stmt | while_stmt | for_stmt | try_stmt | with_stmt | funcdef | classdef | decorated | async_stmt
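The effect of that grammar rule can be checked without involving the shell at all. The following is a minimal sketch, not part of the original answer: it feeds two source strings to Python's built-in compile() and reports whether each one parses. The helper name compiles and the sample dictionary are made up for illustration.

```python
# Source text as "python -c" would receive it: everything joined with ';'.
joined_with_semicolons = (
    "import json; data = {'array': []}; for a in data['array']: print(a)"
)

# The same statements, but the for loop starts on its own line.
loop_on_own_line = (
    "import json; data = {'array': []}\nfor a in data['array']: print(a)"
)


def compiles(source: str) -> bool:
    """Return True if the source parses as a Python module (hypothetical helper)."""
    try:
        compile(source, "<string>", "exec")
        return True
    except SyntaxError:
        return False


print(compiles(joined_with_semicolons))  # False: 'for' cannot follow ';'
print(compiles(loop_on_own_line))        # True: the compound statement starts a new line
```

The first string fails at the f in for, which matches the error in the question: after a ; the parser expects another small statement, and for is not one.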

2

This turns out to have been caused by Python's use of semantic whitespace. Python veterans probably knew that immediately, but I only dabble, so this was confusing. I'll explain.

After extensive searching, I landed on the idea that perhaps indentation was the problem. Other people were getting syntax errors that were caused by indentation issues.

Python uses indentation instead of visible characters to determine the boundaries of a semantic block of code (like the body of a loop). In a C-like language (which is where I'm most comfortable -- Java, C#, etc), we use curly braces for this:

for (var i in myArray) {
    i.doSomething();
    printSomething(i);
}

None of the whitespace is important, so it's easy to turn it into a one-liner (although that's not a common practice in C-style languages):

for (var i in myArray) { i.doSomething(); printSomething(i); }

So, the next thing I tried was taking more care with the spaces after my semicolons. My for loop was "indented" one space, whereas the import and json.load lines had no leading spaces. So I took out that space (I'm leaving off some of the surrounding code for brevity):

python -c "import json,sys;data=json.load(sys.stdin);for a in data['array']: print(a['id'])"

This didn't help. My next thought was that perhaps the semicolons are insufficient and that I needed actual line breaks. I iterated a bit and landed on this, which works:

  MY_JSON=$(command that returns JSON containing an array of IDs)
  IDS=$(echo "${MY_JSON}" | python -c "import json,sys;data=json.load(sys.stdin);
for a in data['array']:
  print(a['id'])")

The for statement has to start on its own line, and in my version the loop body sits on the following line, indented more deeply than for. Note that this chunk of Bash script falls within the body of an if, so the Bash lines are indented. In my experimentation, I was unable to find an arrangement that worked where the for loop itself began anywhere but the very beginning of the line.

The end result isn't pretty, but it works.
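For readers who do want a true single line, one option is to avoid the for statement altogether. A generator expression is an expression, not a compound statement, so a print() call wrapping it may legally follow a ;. The sketch below is an assumption-laden illustration rather than the method used above: the sample payload is invented, and it reads from a string instead of stdin so that it runs on its own.

```python
import json

# Hypothetical sample payload standing in for the real command's output.
sample = '{"array": [{"id": 101}, {"id": 102}]}'
data = json.loads(sample)

# The generator expression replaces the for loop. Because the whole line is
# a single expression statement, it can be chained after ';' in "python -c".
one_liner_output = "\n".join(str(a["id"]) for a in data["array"])
print(one_liner_output)

# Shell form of the same idea (untested against the real command):
#   IDS=$(echo "${MY_JSON}" | python -c "import json,sys; data=json.load(sys.stdin); print('\n'.join(str(a['id']) for a in data['array']))")
```

The str() call is there because the join would fail on numeric IDs; it can be dropped if the IDs are already strings.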

5 Comments

The body of the for, when it's a one-liner, should be okay on the same line. It's considered poor style, but it should be legal.
Whitespace isn't really the issue. The problem is that ; cannot join arbitrary statements on a single line. (Though one could argue that the use of indentation is the distinguishing feature between small and compound statements, as described in my answer.)
@chepner newlines are whitespace. The fact is that whitespace (specifically a newline) was required here. The fact that ; can't join arbitrary statements is why a newline was required.
That's the fix, not the source of the problem.
I believe we are agreeing with each other. :)
2

Using a language other than Python would also be acceptable, so long as it's something typically available on a Linux host.

Give jq a try. It's a full-fledged query language with a compact syntax, well suited to command-line JSON parsing and manipulation.

IDS=$(jq -r '.array[].id' <<< "$MY_JSON")

Or, safer if the IDs could contain whitespace or other special characters:

readarray -t IDS < <(jq -r '.array[].id' <<< "$MY_JSON")

If they could contain newlines, this super paranoid version delimits items with \0:

readarray -d $'\0' -t IDS < <(jq -r0 '.array[].id' <<< "$MY_JSON")

Depending on what you're doing with IDS, you may be able to do that in jq as well.

1 Comment

For the record, I am iterating over the IDs and issuing an external command to delete the associated entity for each of them.
