
I'm trying to parse a JSON object within a shell script into an array.

e.g.: [Amanda, 25, http://mywebsite.com]

The JSON looks like:

{
  "name"       : "Amanda", 
  "age"        : "25",
  "websiteurl" : "http://mywebsite.com"
}

I do not want to use any libraries; it would be best if I could use a regular expression or grep. I have done:

grep name myfile.json

This gives me "name" : "Amanda". I could do this in a loop for each line in the file and add the result to an array, but I only need the right-hand side, not the entire line.

  • Use jq for this. Commented Jul 14, 2016 at 2:04
  • Have a look at [this] question and show us some effort on your part to solve this. Commented Jul 14, 2016 at 2:12
  • This cat myfile.json | grep name | cut -d ':' -f2 might help. Commented Jul 14, 2016 at 4:20
  • @sjsam: The accepted answer to the linked question demonstrates jq use well, but uses a misguided approach to reading its output into a shell array (at least as of this writing - comment posted). Commented Jul 14, 2016 at 5:29
  • I'm assuming instead of [Amanda, 25, http://mywebsite.com] you meant ( "Amanda" 25 "http://mywebsite.com"); the latter is what bash's array syntax actually looks like. (Or, as given with declare -p array, this could also be printed as follows: declare -a array='([0]="Amanda" [1]="25" [2]="http://mywebsite.com")') Commented Jul 14, 2016 at 13:19

4 Answers


If you really cannot use a proper JSON parser such as jq[1], try an awk-based solution:

Bash 4.x:

readarray -t values < <(awk -F\" 'NF>=3 {print $4}' myfile.json)

Bash 3.x:

IFS=$'\n' read -d '' -ra values < <(awk -F\" 'NF>=3 {print $4}' myfile.json)

This stores all property values in Bash array ${values[@]}, which you can inspect with
declare -p values.
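
Once the array is populated, its elements can be read by index or iterated over. The following is only a minimal sketch: the literal array assignment is a stand-in for the output of the commands above, and the variable names name, age, and url are illustrative assumptions.

```shell
#!/usr/bin/env bash

# Stand-in for the array that readarray / read -a would populate.
values=( "Amanda" "25" "http://mywebsite.com" )

# Number of elements.
echo "count: ${#values[@]}"

# Access by zero-based index (the names below are illustrative only).
name=${values[0]}
age=${values[1]}
url=${values[2]}
echo "name=$name age=$age url=$url"

# Iterate with quotes so that each element stays intact.
for v in "${values[@]}"; do
  printf '%s\n' "$v"
done
```

Quoting "${values[@]}" matters here: without the quotes, elements containing whitespace or glob characters would be split or expanded.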

These solutions have limitations:

  • each property must be on its own line,
  • all values must be double-quoted,
  • embedded escaped double quotes are not supported.

All these limitations reinforce the recommendation to use a proper JSON parser.
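
To make the limitations concrete, here is a small sketch that feeds the same awk command an input that violates two of them. The sample file and its "motto" property are made up for this demonstration, and readarray requires Bash 4.x.

```shell
#!/usr/bin/env bash

# Hypothetical input: "age" is unquoted, "motto" contains escaped quotes.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
{
  "name"  : "Amanda",
  "age"   : 25,
  "motto" : "say \"hi\""
}
EOF

# With -F\" every line is split on double quotes:
#   $1 = leading whitespace, $2 = key, $3 = separator, $4 = value
readarray -t values < <(awk -F\" 'NF>=3 {print $4}' "$tmp")

# Three elements result: Amanda, an empty string (the unquoted 25 is lost),
# and the truncated text 'say \' (cut off at the first escaped quote).
declare -p values

rm -f "$tmp"
```

A real JSON parser would return 25 and say "hi" for the last two properties.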


Note: The following alternative solutions use the Bash 4.x+ readarray -t values command, but they also work with the Bash 3.x alternative, IFS=$'\n' read -d '' -ra values.

grep + cut combination: A single grep command won't do (unless you use GNU grep - see below), but adding cut helps:

readarray -t values < <(grep '"' myfile.json | cut -d '"' -f4)

GNU grep: Using -P to support PCREs, which support \K to drop everything matched so far (a more flexible alternative to a look-behind assertion) as well as look-ahead assertions ((?=...)):

readarray -t values < <(grep -Po ':\s*"\K.+(?="\s*,?\s*$)' myfile.json)

Finally, here's a pure Bash (3.x+) solution:

What makes this a viable alternative in terms of performance is that no external utilities are called in each loop iteration; however, for larger input files, a solution based on external utilities will be much faster.

#!/usr/bin/env bash

declare -a values # declare the array

# Read each line and use regex parsing (with Bash's `=~` operator)
# to extract the value.
while read -r line; do
  # Extract the value from between the double quotes
  # and add it to the array.
  [[ $line =~ :[[:blank:]]+\"(.*)\" ]] && values+=( "${BASH_REMATCH[1]}" )
done < myfile.json

declare -p values # print the array
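
As a side note, a common precaution with =~ is to keep the pattern in a variable, because the quoting rules for the right-hand side changed between Bash versions. The sketch below applies the same pattern as the loop above to individual lines; the variable name re and the helper function extract are assumptions made for illustration, not part of the script above.

```shell
#!/usr/bin/env bash

# Same pattern as in the loop above, stored in a variable (the name is arbitrary).
re=':[[:blank:]]+"(.*)"'

# Hypothetical helper: prints the captured value, or nothing if there is no match.
extract() {
  if [[ $1 =~ $re ]]; then
    printf '%s\n' "${BASH_REMATCH[1]}"
  fi
}

v1=$(extract '  "name"       : "Amanda", ')
v2=$(extract '  "websiteurl" : "http://mywebsite.com"')
v3=$(extract '{')   # no match, so v3 stays empty

printf '[%s] [%s] [%s]\n' "$v1" "$v2" "$v3"
# prints: [Amanda] [http://mywebsite.com] []
```

Because (.*) is greedy, the capture runs up to the last double quote on the line, so the colon inside the URL does not cut the value short.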

[1] Here's what a robust jq-based solution would look like (Bash 4.x):
readarray -t values < <(jq -r '.[]' myfile.json)


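jq is good enough to solve this problem. The command below assumes that the objects sit in an array under a top-level files key (which the question's JSON does not have) and that YourJsonString is the name of the file containing the JSON: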

paste -s <(jq '.files[].name' YourJsonString) <(jq '.files[].age' YourJsonString) <( jq '.files[].websiteurl' YourJsonString) 

This gives you a table, so you can grep for rows or print columns with awk.

1 Comment

The OP literally said no libraries; there are a million other questions with jq as the answer already.

You can use a sed one-liner to achieve this:

array=( $(sed -n "/{/,/}/{s/[^:]*:[[:blank:]]*//p;}" json ) )

Result:

$ echo ${array[@]}
"Amanda" "25" "http://mywebsite.com"

If you do not need/want the quotation marks then the following sed will do away with them:

array=( $(sed -n '/{/,/}/{s/[^:]*:[^"]*"\([^"]*\).*/\1/p;}' json) )

Result:

$ echo ${array[@]}
Amanda 25 http://mywebsite.com

It will also work if you have multiple entries, like

$ cat json
{
  "name"       : "Amanda" 
  "age"        : "25"
  "websiteurl" : "http://mywebsite.com"
}

{
   "name"       : "samantha"
   "age"        : "31"
   "websiteurl" : "http://anotherwebsite.org"
}

$ echo ${array[@]}
Amanda 25 http://mywebsite.com samantha 31 http://anotherwebsite.org

UPDATE:

As pointed out by mklement0 in the comments, there might be an issue if a value contains embedded whitespace, e.g., "name" : "Amanda lastname". In this case, Amanda and lastname would be read into separate array elements. To avoid this, you can use readarray, e.g.,

readarray -t array < <(sed -n '/{/,/}/{s/[^:]*:[^"]*"\([^"]*\).*/\1/p;}' json2)

This will also take care of any globbing issues, also mentioned in the comments.
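
The difference between the two ways of filling the array can be seen side by side in the following sketch. The scratch directory, the sample.json file with its "note" property, and the extract wrapper are all made up for this demonstration; readarray requires Bash 4.x.

```shell
#!/usr/bin/env bash

# Scratch directory containing only the sample file, so the glob result is predictable.
dir=$(mktemp -d)
cd "$dir" || exit 1
cat > sample.json <<'EOF'
{
  "name" : "Amanda lastname",
  "note" : "*"
}
EOF

# The sed command from this answer, wrapped in a hypothetical helper for reuse.
extract() {
  sed -n '/{/,/}/{s/[^:]*:[^"]*"\([^"]*\).*/\1/p;}' sample.json
}

# Unquoted command substitution: the output is word-split and glob-expanded.
bad=( $(extract) )

# readarray -t: one element per output line, taken literally.
readarray -t good < <(extract)

printf 'bad  (%d):' "${#bad[@]}";  printf ' <%s>' "${bad[@]}";  echo
printf 'good (%d):' "${#good[@]}"; printf ' <%s>' "${good[@]}"; echo
# prints:
#   bad  (3): <Amanda> <lastname> <sample.json>
#   good (2): <Amanda lastname> <*>

cd / && rm -rf "$dir"
```

In the first array the name is split in two and the * has been replaced by the only file name in the directory; in the second, both values arrive unchanged.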

7 Comments

Please don't parse command output into an array with array=( $(...) ) (even though it happens to work with the sample input): it doesn't work as intended with embedded whitespace and can result in accidental globbing.
To see what your approach does to embedded whitespace, examine the array that results from array=( $(echo ' a b ') ); to see the effects of accidental globbing, try array=( $(echo 'a * is born') ).
For simplicity, try "*" as the JSON property value; focusing on the JSON is a distraction, though, as my echo commands are sufficient to demonstrate the problem: the output from the command substitution, whatever the specific command happens to be, is invariably subject to word splitting and globbing. The larger point is: reading items into an array this way is an antipattern that is best avoided altogether. (You could work around the issues with IFS= and set -f, but at that point it's simpler to use readarray.)
Please consider editing your correction to actually flow with the answer rather than being an addendum at the end; otherwise, someone trying to follow this answer is more likely to use the buggy code than not.
(echo ${array[@]} is also bad form -- even if array=( "Hello" "Test * Example" "World" ), it won't print that as three separate elements despite the contents being correctly stored that way. Consider printf '%s\n' "${array[@]}", with the quotes).

Pure Bash 3.x+ without dependencies (such as jq, python, grep, etc.):

source <(curl -s -L -o- https://github.com/lirik90/bashJsonParser/raw/master/jsonParser.sh)
read -d '' JSON << EOF
{
  "name"       : "Amanda", 
  "age"        : "25",
  "websiteurl" : "http://mywebsite.com"
}
EOF

JSON=$(minifyJson "$JSON")
name=$(parseJson "$JSON" name)
age=$(parseJson "$JSON" age)
url=$(parseJson "$JSON" websiteurl)
echo "Result: [$name,$age,$url]"

Output:

Result: [Amanda,25,http://mywebsite.com]

Try it.
