
I have a curl command that generates JSON output. I want to add a few characters to the generated file so that I can process it further.

Command:

curl -sN --negotiate -u foo:bar "http://hostname/db/tbl_name/" >> db.json

This runs inside a for loop, once for each db and tbl_name combination. As a result, it generates a number of JSON outputs (one for each table) concatenated together without any delimiter.

Output looks like :

{"columns":[{"name":"tbl_id","type":"varchar(50)"},{"name":"cret_timestmp","type":"timestamp"},{"name":"updt_timestmp","type":"timestamp"},{"name":"frst_nm","type":"varchar(50)"},{"name":"last_nm","type":"varchar(50)"},{"name":"acct_num","type":"varchar(15)"},{"name":"r_num","type":"varchar(15)"},{"name":"pid","type":"decimal(15,0)"},{"name":"ami_id","type":"varchar(30)"},{"name":"ssn","type":"varchar(9)"},{"name":"client_id","type":"varchar(30)"},{"name":"client_nm","type":"varchar(100)"},{"name":"info","type":"timestamp"},{"name":"rmx","type":"varchar(10)"},{"name":"id","type":"decimal(12,0)"},{"name":"ingest_timestamp","type":"string"},{"name":"incr_ingest_timestamp","type":"string"}],"database":"db_i","table":"db_tbl"}{"columns":[{"name":"key","type":"varchar(15)"},{"name":"foo_cd","type":"varchar(10)"},{"name":"foo_nm","type":"varchar(56)"},{"name":"tmc_regn_cd","type":"varchar(10)"},{"name":"tmc_mrkt_cd","type":"varchar(20)"},{"name":"mrkt_grp","type":"varchar(30)"},{"name":"ingest_timestamp","type":"string"},{"name":"incr_ingest_timestamp","type":"string"}],"database":"db_i","table":"ss_mv"}{"columns":[{"name":"bar_src_name","type":"string"},{"name":"bar_ent_name","type":"string"},{"name":"from_src","type":"string"},{"name":"reload","type":"string"},{"name":"column_mismatch","type":"string"},{"name":"xx_src_name","type":"string"},{"name":"xx_ent_name","type":"string"}],"database":"db_i","table":"test_table"}

The desired output starts with [ and ends with ]. I also want a "," between the end of one JSON object and the start of the next, where the next column list begins.

For example, if the curl command runs against 3 tables as shown above, the three generated JSON outputs should be combined like this:

 [{json1},{json2},{json3}]

Here json1, json2, json3, and so on are the outputs for the different tables that the for loop fetches from a particular db; they should all end up in one file in this format.

Instead, this is what I'm currently getting:

 {json1}{json2}{json3}

In the output pasted above, JSON 1 is :

{"columns":[{"name":"tbl_id","type":"varchar(50)"},{"name":"cret_timestmp","type":"timestamp"},{"name":"updt_timestmp","type":"timestamp"},{"name":"frst_nm","type":"varchar(50)"},{"name":"last_nm","type":"varchar(50)"},{"name":"acct_num","type":"varchar(15)"},{"name":"r_num","type":"varchar(15)"},{"name":"pid","type":"decimal(15,0)"},{"name":"ami_id","type":"varchar(30)"},{"name":"ssn","type":"varchar(9)"},{"name":"client_id","type":"varchar(30)"},{"name":"client_nm","type":"varchar(100)"},{"name":"info","type":"timestamp"},{"name":"rmx","type":"varchar(10)"},{"name":"id","type":"decimal(12,0)"},{"name":"ingest_timestamp","type":"string"}, {"name":"incr_ingest_timestamp","type":"string"}],"database":"db_i","table":"db_tbl"}

JSON 2 is :

{"columns":[{"name":"key","type":"varchar(15)"},{"name":"foo_cd","type":"varchar(10)"},{"name":"foo_nm","type":"varchar(56)"},{"name":"tmc_regn_cd","type":"varchar(10)"},{"name":"tmc_mrkt_cd","type":"varchar(20)"},{"name":"mrkt_grp","type":"varchar(30)"},{"name":"ingest_timestamp","type":"string"},{"name":"incr_ingest_timestamp","type":"string"}],"database":"db_i","table":"ss_mv"}

JSON 3 is :

{"columns":[{"name":"bar_src_name","type":"string"},{"name":"bar_ent_name","type":"string"},{"name":"from_src","type":"string"},{"name":"reload","type":"string"},{"name":"column_mismatch","type":"string"},{"name":"xx_src_name","type":"string"},{"name":"xx_ent_name","type":"string"}],"database":"db_i","table":"test_table"}

I hope the requirement is clear. I'm looking to achieve this in bash. Thanks in advance.

  • What have you tried, and what is not working? What exactly is your programming question? Commented Sep 25, 2018 at 20:50
  • I was trying to use sed to add square brackets at the start and end, and to add a "," two positions before every place a column list is encountered, but I couldn't make it work. Commented Sep 25, 2018 at 21:11
  • For example, I have tried the following: `for i in $(cat list); do curl -sN --negotiate -u foo:bar "http://hostname/db/tbl_name/" >> $list.json; sed 's/$/,/' $list.json; done` (the sed is there to add a comma after each JSON). But if I use sed like this, the command creates a different file for each JSON line. I then run awk on each generated file, like: `awk '{print "["$0"]"}' db.json` Commented Sep 25, 2018 at 21:33
  • sed 's/$/,/' $list.json would append a comma to every EOL in a file. Is that what you want? Commented Sep 26, 2018 at 0:07
  • Does JSON.sh help? That's probably as close as you'll come to "native" json support in bash, but you'll still fall short for pretty-printing existing JSON. Really, you want to use a language or tool that has native support for this format. jq would be perfect, but any of php, python, ruby, go or even perl would suffice. Commented Sep 27, 2018 at 14:58

4 Answers

3

Use jq -s.

--slurp/-s: Instead of running the filter for each JSON object in the input, read the entire input stream into a large array and run the filter just once.

Here's an example:

$ cat file.json
{ "key": "value1" }
{ "key": "value2" }
{ "key":
"value3"}{"key": "value4"}

$ jq -s < file.json
[
  {
    "key": "value1"
  },
  {
    "key": "value2"
  },
  {
    "key": "value3"
  },
  {
    "key": "value4"
  }
]

1 Comment

Thanks for replying, but unfortunately I can't install jq.
1

I'm not sure if I got it correctly, but I think you are looking for something like

 echo "[$(cat *.json | paste -sd ',')]" > result.json

This works by creating a string that starts with [ and ends with ], and in the middle, there are the contents of the json files concatenated (cat) and separated by commas (with the help of paste). That string is echoed and written to a new file.
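One caveat, offered as a hedged note: paste joins lines, so this approach relies on every JSON document sitting on its own line. If the curl responses do not end in a newline (the sample output in the question suggests they may not), the documents would still run together. The sketch below uses a hypothetical helper name, `wrap_json_lines`, and made-up sample documents to show the same idea as a filter that streams its input rather than building one large string.

```shell
# wrap_json_lines is a hypothetical helper name used for illustration.
# It reads newline-delimited JSON on stdin and writes one JSON array on stdout.
wrap_json_lines() {
    printf '['
    # -s joins all input lines into one; -d ',' puts a comma between them.
    # The trailing "-" tells paste to read standard input.
    paste -sd ',' -
    printf ']\n'
}

# Made-up sample input: each document ends with a newline, which is what
# allows paste to place a comma between documents.
printf '%s\n' '{"table":"t1"}' '{"table":"t2"}' | wrap_json_lines
```

The closing bracket lands on its own line because paste ends its output with a newline; that whitespace is still valid JSON. If the responses lack a trailing newline, running `echo` after each curl call inside the loop would be one way to supply it.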

1 Comment

Great answer! Just one objection: you are building a potentially huge (memory hungry) echo command if there are many large JSON files. { echo "["; cat *.json | paste -sd ','; echo "]"; } > result.json does the same and is far more resource-friendly.
1

Presuming input in valid JSONL format (one JSON document per line of input), you can embed a Python script inside your bash script:

slurpjson_py='
import json, sys
json.dump([json.loads(line.strip()) for line in sys.stdin], sys.stdout, indent=4)
sys.stdout.write("\n")
'

slurpjson() { python -c "$slurpjson_py" "$@"; }

If called as:

slurpjson <<EOF
{ "first": "document", "starting": "here" }
{ "second": "document", "ending": "here" }
EOF

...output is correctly:

[
    {
        "starting": "here",
        "first": "document"
    },
    {
        "second": "document",
        "ending": "here"
    }
]
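The script above presumes one document per line, whereas the output in the question has the documents back to back with no newline between them. As a sketch for that case, the same embedding pattern can use `json.JSONDecoder.raw_decode`, which parses one document from a string and reports where it ended. The names `slurpconcat_py` and `slurpconcat` are hypothetical, and the sketch assumes a `python3` executable is on the PATH.

```shell
# slurpconcat_py / slurpconcat are hypothetical names for illustration.
slurpconcat_py='
import json, sys

decoder = json.JSONDecoder()
text = sys.stdin.read()
docs, pos = [], 0
while pos < len(text):
    # raw_decode does not skip leading whitespace, so do it here.
    while pos < len(text) and text[pos].isspace():
        pos += 1
    if pos >= len(text):
        break
    # Parse one document starting at pos; pos becomes the index just past it.
    doc, pos = decoder.raw_decode(text, pos)
    docs.append(doc)
json.dump(docs, sys.stdout)
sys.stdout.write("\n")
'

# Assumes python3 is installed and on the PATH.
slurpconcat() { python3 -c "$slurpconcat_py"; }

# Made-up sample input: two documents with no delimiter between them.
printf '%s' '{"table":"t1"}{"table":"t2"}' | slurpconcat
```

Because the whole input is read into memory before parsing, this is better suited to modest file sizes than to very large dumps.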


0

I managed to achieve this by running the curl command and appending a "," to the end of every line using

sed 's/$/,/'

I then removed the last "," and added the opening and closing [] using:

for i in *; do cat $i | sed '$ s/.$//' | awk '{print "["$0"]"}' > $json_dir/$i; done
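A variation on the same idea, sketched under assumptions, is to emit the separator inside the loop so that no trailing comma has to be stripped afterwards. Here `fetch_table` and `build_json_array` are hypothetical names, and `fetch_table` prints a fixed string in place of the real curl call so that the example runs without a server.

```shell
# fetch_table is a hypothetical stand-in for the real curl call; in practice
# its body would be the curl command from the question.
fetch_table() {
    printf '{"database":"db_i","table":"%s"}' "$1"
}

# Print one JSON array containing the document for every table name given.
build_json_array() {
    sep=''
    printf '['
    for tbl in "$@"; do
        printf '%s' "$sep"   # empty before the first document, "," afterwards
        fetch_table "$tbl"
        sep=','
    done
    printf ']\n'
}

build_json_array db_tbl ss_mv test_table
```

Redirecting the output of `build_json_array` to a file once, outside the loop, would also avoid appending with `>>` on every iteration.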

1 Comment

cat $i is buggy: try it with filenames that contain spaces, or with a filename that starts with a dash. Even when changed to cat -- "$i" to be safe with all possible names, it still carries a needless performance cost, because cat is a separate executable, not part of the shell, and adds an extra program to your pipeline. It is safer and faster to run for i in *; do sed '...' <"$i" | ... > "$json_dir/$i"; done, with double quotes surrounding all expansions.
