I'm parsing JSON data using JSON.sh. I want to read data from a JSON file (test.json) whose content looks something like this:

{
  "/home/ukrishnan/projects/test.yml": {
    "LOG_DRIVER": "syslog",
    "IMAGE": "mysql:5.6"
  },
  "/home/ukrishnan/projects/mysql/app.xml": {
    "ENV_ACCOUNT_BRIDGE_ENDPOINT": "/u01/src/test/sample.txt"
  }
}

I parse this JSON with JSON.sh like this:

test_parser=`sh ./lib/JSON.sh < test/test.json`
echo $test_parser

It prints,

["/home/ukrishnan/projects/test.yml","LOG_DRIVER"] "syslog" ["/home/ukrishnan/projects/test.yml","IMAGE"] "mysql:5.6" ["/home/ukrishnan/projects/test.yml"] {"LOG_DRIVER":"syslog","IMAGE":"mysql:5.6"} ["/home/ukrishnan/projects/mysql/app.xml","ENV_ACCOUNT_BRIDGE_ENDPOINT"] "/u01/src/test/sample.txt" ["/home/ukrishnan/projects/mysql/app.xml"] {"ENV_ACCOUNT_BRIDGE_ENDPOINT":"/u01/src/test/sample.txt"} [] {"/home/ukrishnan/projects/test.yml":{"LOG_DRIVER":"syslog","IMAGE":"mysql:5.6"},"/home/ukrishnan/projects/mysql/app.xml":{"ENV_ACCOUNT_BRIDGE_ENDPOINT":"/u01/src/test/sample.txt"}}

Whereas if I run the same command (sh ./lib/JSON.sh < test/test.json) directly in the terminal, it prints with line breaks:

["/home/ukrishnan/projects/test.yml","LOG_DRIVER"]  "syslog"
["/home/ukrishnan/projects/test.yml","IMAGE"]   "mysql:5.6"
["/home/ukrishnan/projects/test.yml"]   {"LOG_DRIVER":"syslog","IMAGE":"mysql:5.6"}
["/home/ukrishnan/projects/mysql/app.xml","ENV_ACCOUNT_BRIDGE_ENDPOINT"]    "/u01/src/test/sample.txt"
["/home/ukrishnan/projects/mysql/app.xml"]  {"ENV_ACCOUNT_BRIDGE_ENDPOINT":"/u01/src/test/sample.txt"}
[]  {"/home/ukrishnan/projects/test.yml":{"LOG_DRIVER":"syslog","IMAGE":"mysql:5.6"},"/home/ukrishnan/projects/mysql/app.xml":{"ENV_ACCOUNT_BRIDGE_ENDPOINT":"/u01/src/test/sample.txt"}}

I want to read this output and assign it to bash variables like this:

file_name='/home/ukrishnan/projects/test.yml'
key='LOG_DRIVER'
value='syslog'

As I'm almost completely new to shell scripting, grep, and awk, I don't have much idea how to achieve this. Any help would be greatly appreciated.

  • Try echoing with double quotes: echo "$test_parser" Commented Apr 25, 2016 at 11:58
  • @JoaoMorais That prints the same as running it in the terminal (with line breaks). Commented Apr 25, 2016 at 12:00
  • Use jq. Commented Apr 25, 2016 at 12:14
  • @hek2mgl Sorry, I can't install anything on the server. Commented Apr 25, 2016 at 12:25
  • Then use a programming language capable of parsing JSON, like Python, Perl, or PHP. I'm sure at least one of them is installed on the server. Commented Apr 25, 2016 at 12:25
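
To illustrate the quoting suggestion from the comments: an unquoted $test_parser is split on whitespace by the shell and echo joins the pieces with single spaces, which is why the line breaks disappear. A minimal sketch, using hypothetical sample lines in place of the real JSON.sh output:

```shell
# Hypothetical stand-in for the multi-line output of JSON.sh.
test_parser=$(printf '%s\n' 'line one' 'line two' 'line three')

# Unquoted: the shell splits the value on whitespace (including newlines),
# and echo prints the resulting words separated by single spaces.
unquoted=$(echo $test_parser)

# Quoted: the value is passed to echo as one argument, newlines intact.
quoted=$(echo "$test_parser")

echo "$unquoted"
echo "---"
echo "$quoted"
```

The same rule applies anywhere the variable is used later, for example when piping it into grep or a while read loop.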

2 Answers

I wrote a JSON serializer / deserializer for gawk, if you're interested. Save that script as json.awk and modify it, replacing everything above # === FUNCTIONS === with the following:

#!/usr/bin/gawk -f

# capture JSON string from beginning to end into a scalar variable
{ json = json ORS $0 }

END {
    # objectify JSON string to the multilevel array "obj"
    deserialize(json, obj)

    for (filename in obj) {

        print "file_name=" quote(filename)

        for (key in obj[filename]) {

            # print key="value"
            print key "=" quote(obj[filename][key])
        }
    }
}

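Make it executable with chmod 755 json.awk and run it. The output will resemble this (gawk does not guarantee the iteration order of for (... in ...), so entries may appear in a different order than in the file):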

$ ./json.awk test5.json
file_name="/home/ukrishnan/projects/mysql/app.xml"
ENV_ACCOUNT_BRIDGE_ENDPOINT="/u01/src/test/sample.txt"
file_name="/home/ukrishnan/projects/test.yml"
LOG_DRIVER="syslog"
IMAGE="mysql:5.6"

Hopefully the logic is reasonably easy to follow. If you prefer to output file_name=, key=, and value= on every loop iteration, modify the nested for loops accordingly:

for (filename in obj) {
    for (key in obj[filename]) {
        print "file_name=" quote(filename)
        print "key=" quote(key)
        print "value=" quote(obj[filename][key])
    }
}

That change will result in the following output:

$ ./json.awk test5.json
file_name="/home/ukrishnan/projects/mysql/app.xml"
key="ENV_ACCOUNT_BRIDGE_ENDPOINT"
value="/u01/src/test/sample.txt"
file_name="/home/ukrishnan/projects/test.yml"
key="LOG_DRIVER"
value="syslog"
file_name="/home/ukrishnan/projects/test.yml"
key="IMAGE"
value="mysql:5.6"

Anyway, with that output, you can do something silly in BASH like this to populate and act upon the variables:

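#!/bin/bash

# The pipe runs the loop in a subshell, so the variables are only set inside it.
./json.awk test5.json | while read -r line; do
    # eval executes the line as shell code; only do this with trusted input
    eval "$line"
    # a value= line completes one file_name/key/value triple
    [ "${line/=*/}" = "value" ] && {
        echo "bash: file_name=$file_name"
        echo "bash: key=$key"
        echo "bash: value=$value"
        echo "------"
    }
done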

It'd probably be more graceful just to do all processing within gawk from start to finish and not mess with the polyglot handoff, though.
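
If the handoff to bash is kept, eval can be avoided, since it executes whatever text arrives on the pipe. Below is a minimal sketch of one way to do that. It assumes the simple name="value" line format shown above, with no escaped quotes inside values, and the awk_output variable is a hypothetical stand-in for the real output of ./json.awk test5.json:

```shell
# Hypothetical stand-in for the output of ./json.awk test5.json
awk_output='file_name="/home/ukrishnan/projects/test.yml"
key="LOG_DRIVER"
value="syslog"
file_name="/home/ukrishnan/projects/test.yml"
key="IMAGE"
value="mysql:5.6"'

# Split each line at the first "=": the name goes to $name, the rest to $val.
while IFS='=' read -r name val; do
    # strip one leading and one trailing double quote (assumes no escaped quotes)
    val=${val#\"}
    val=${val%\"}
    case $name in
        file_name) file_name=$val ;;
        key)       key=$val ;;
        value)
            value=$val
            # a value= line completes one triple
            echo "bash: file_name=$file_name key=$key value=$value"
            ;;
    esac
done <<EOF
$awk_output
EOF
```

Feeding the loop from a here-document instead of a pipe keeps it in the current shell, so file_name, key, and value are still set after the loop ends.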

Getting back to json.awk, if you prefer to keep json.awk modular for easy reuse in future projects, you could remove everything above # === FUNCTIONS ===, create a separate main.awk containing the code block at the top of this answer, and @include "json.awk" as a helper library pretty much anywhere outside of END {...} (just below the shebang, for example).

JSON.sh (from http://json.org) offers a bash-friendly way of flattening a JSON file, and you've already shown what its output looks like in your question. The flattened form uses this format:

[node] tab value

To extract the information you want, think in terms of UNIX text processing. Notice that the lines you're interested in follow this pattern:

  • ["filename","key"] tab "value"

In regex notation, we replace:

  • filename with (.*)
  • key with (.*)
  • tab with \t
  • value with (.*)

We can retrieve the first, second and third matching groups with \1, \2, \3 respectively.

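When used in sed, note that its default (basic) regular expressions treat [ as special, so a literal [ must be escaped as \[, while capture groups are written \( and \). The \t escape for tab and the trailing t;d (branch if the substitution succeeded, otherwise delete the line) work in GNU sed; other sed implementations may need adjusting. This results in the following script: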

./lib/JSON.sh < test/test.json | sed 's/\["\(.*\)","\(.*\)\"]\t"\(.*\)"/\1,\2,\3/;t;d'
/home/ukrishnan/projects/test.yml,LOG_DRIVER,syslog
/home/ukrishnan/projects/test.yml,IMAGE,mysql:5.6
/home/ukrishnan/projects/mysql/app.xml,ENV_ACCOUNT_BRIDGE_ENDPOINT,/u01/src/test/sample.txt
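
As a variation on the same idea, the filtering can also be sketched with awk, which can split on the tab directly and does not rely on GNU sed extensions. This is only an illustration: the sample paths are hypothetical, it assumes filenames and keys contain no double quotes, and it emits tab-separated fields rather than commas:

```shell
# Hypothetical sample of JSON.sh output: path, a tab, then the value.
sample=$(printf '%s\t%s\n' \
    '["/tmp/a.yml","LOG_DRIVER"]' '"syslog"' \
    '["/tmp/a.yml"]' '{"LOG_DRIVER":"syslog"}' \
    '[]' '{"/tmp/a.yml":{"LOG_DRIVER":"syslog"}}')

flat=$(printf '%s\n' "$sample" | awk -F'\t' '
    # keep only two-element paths whose value is a quoted string
    $1 ~ /^\["[^"]*","[^"]*"\]$/ && $2 ~ /^".*"$/ {
        path = substr($1, 3, length($1) - 4)   # drop leading [" and trailing "]
        split(path, parts, "\",\"")            # split on the "," between elements
        value = substr($2, 2, length($2) - 2)  # drop the surrounding quotes
        printf "%s\t%s\t%s\n", parts[1], parts[2], value
    }')

printf '%s\n' "$flat"
```

Only the first sample line survives the filter; the object-valued lines are skipped because their path has fewer than two elements.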

Now we put these lines in a loop and, for each line, extract filename, key, and value:

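./lib/JSON.sh < test/test.json |
sed 's/\["\(.*\)","\(.*\)\"]\t"\(.*\)"/\1,\2,\3/;t;d' |
while IFS= read -r line
do
  # read whole lines rather than looping over $(...), which would split on spaces
  # split the comma-separated line into an array (breaks if a field contains a comma)
  IFS="," read -ra arr <<< "$line"
  filename=${arr[0]}
  key=${arr[1]}
  value=${arr[2]}
  cat <<EOF
filename : $filename
key      : $key
value    : $value
EOF
done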

Which outputs:

filename : /home/ukrishnan/projects/test.yml
key      : LOG_DRIVER
value    : syslog
filename : /home/ukrishnan/projects/test.yml
key      : IMAGE
value    : mysql:5.6
filename : /home/ukrishnan/projects/mysql/app.xml
key      : ENV_ACCOUNT_BRIDGE_ENDPOINT
value    : /u01/src/test/sample.txt
