1

I'm trying to curl a webpage, do some processing on it, and finally print the result in JSON format (which will actually be used as MongoDB input).

So the input (which is read through curl) is:

Input:

brendan google engineer
stones microsoft chief_engineer
david facebook tester

As part of the processing, I'm assigning values to the variables ($name, $employer, $designation).

My final command, which converts to JSON, is:

echo "[{\"Name\":\"$name\",\"Employer\":\"$employer\",\"Designation\":\"$designation\"}]"

The current output is:

[{"Name":"brendan","Employer":"google","Designation":"engineer"}]
[{"Name":"stones","Employer":"microsoft","Designation":"chief_engineer"}]
[{"Name":"david","Employer":"facebook","Designation":"tester"}]

But I want the output on a single line, with the objects separated by commas and the square brackets only at the start and end (not on every line).

Expected output:

  [{"Name":"brendan","Employer":"google","Designation":"engineer"},{"Name":"stones","Employer":"microsoft","Designation":"chief_engineer"},
    {"Name":"david","Employer":"facebook","Designation":"tester"}]

Any suggestions?

2
  • Why do you care where the linebreaks are if you're generating JSON? You should care about what the parse tree and the document model look like -- if you need to care about what kind of whitespace is between the elements, your consumers (whoever's parsing the code you generate) are Doing It Wrong. Commented Sep 3, 2016 at 15:34
  • ...in particular, MongoDB definitely doesn't care whether you have newlines or spaces outside of syntactically-pertinent locations (such as string contents). Commented Sep 3, 2016 at 15:44

3 Answers

8

Conventional text-processing tools can't do this right for the general case. There are a bunch of corner cases to JSON -- nonprintable and high-Unicode characters (and quotes) need to be escaped, for instance. Use a tool that's actually built for the job, such as jq:


jq -n -R '
[
  inputs |
  split(" ") |
  { "Name": .[0], "Employer": .[1], "Designation": .[2] }
]' <<EOF
brendan google engineer
stones microsoft chief_engineer
david facebook tester
EOF

...emits as output:

[
  {
    "Name": "brendan",
    "Employer": "google",
    "Designation": "engineer"
  },
  {
    "Name": "stones",
    "Employer": "microsoft",
    "Designation": "chief_engineer"
  },
  {
    "Name": "david",
    "Employer": "facebook",
    "Designation": "tester"
  }
]
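
Since the question asked for the whole array on one line, jq's -c (--compact-output) flag may be worth adding to the command above. A minimal sketch, assuming jq 1.5 or later is installed; the function name to_json_array is only illustrative and not part of jq:

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "name employer designation" lines on stdin
# and prints a single compact JSON array. Needs jq 1.5+ (for `inputs`).
to_json_array() {
  jq -c -n -R '
    [ inputs
      | split(" ")
      | { "Name": .[0], "Employer": .[1], "Designation": .[2] } ]'
}

# Sample input taken from the question.
printf '%s\n' \
  'brendan google engineer' \
  'stones microsoft chief_engineer' \
  'david facebook tester' | to_json_array
```

The structure of the result is the same as the pretty-printed output; only the whitespace differs, so any JSON consumer should treat the two identically.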

5 Comments

Perfecto!! If you had to pick another option, what tool would you have suggested?
If installing jq isn't an option, my usual fallback is embedding a snippet of Python, since there's an excellent json module in the standard library there.
I thought about Perl, but jq is becoming more widely available on Unix systems these days, I guess. Anyway, nice answer ++
Very helpful, this is the first time I've heard of jq. Can the input come from curl? I'm doing some processing on the curl output and also adding a few more variables that don't come from curl.
curl writes to standard output, and jq reads from standard input; so yes, just replace the here document with a pipe from curl.
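
For the Python fallback mentioned in the comments, something along these lines might do. It is a sketch only, assuming python3 is on the PATH; the function name to_json_array_py is made up for illustration, and the standard-library json module does the quoting and escaping:

```shell
#!/usr/bin/env bash
# Hypothetical fallback for systems without jq: embed a short Python
# script and let its json module produce the output.
to_json_array_py() {
  python3 -c '
import json
import sys

rows = []
for line in sys.stdin:
    fields = line.split()      # split on any run of whitespace
    if len(fields) < 3:
        continue               # skip blank or short lines
    name, employer, designation = fields[:3]
    rows.append({"Name": name, "Employer": employer, "Designation": designation})

# Compact separators: no spaces after commas or colons.
print(json.dumps(rows, separators=(",", ":")))
'
}

# Sample input from the question; a pipe from curl would replace this printf.
printf '%s\n' \
  'brendan google engineer' \
  'stones microsoft chief_engineer' \
  'david facebook tester' | to_json_array_py
```

Like the jq version, this reads standard input, so it can sit at the end of a pipeline in the same way.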
0

Something like this?

sep='['
curl "...whatever..." |
while read -r name employer designation; do
    printf '%s{"Name": "%s", "Employer": "%s", "Designation": "%s"}' "$sep" "$name" "$employer" "$designation"
    sep=', '
done
printf ']\n'

I do agree that this is brittle and error-prone; if you can use a JSON-aware tool like jq, by all means do that instead.
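
To illustrate the brittleness: a value containing a double quote or a backslash would break the printf output above. A rough sketch of a partial workaround in bash follows; json_escape is a hypothetical helper, and it deliberately covers only those two characters, not control characters, so it is still not a full JSON encoder:

```shell
#!/usr/bin/env bash
# Hypothetical helper: escapes only backslashes and double quotes.
# Newlines, tabs and other control characters are NOT handled.
json_escape() {
  local s=$1
  s=${s//\\/\\\\}   # backslash -> two backslashes (must run first)
  s=${s//\"/\\\"}   # double quote -> backslash + double quote
  printf '%s' "$s"
}

# Each field would be passed through json_escape before the printf in the loop.
json_escape 'say "hi" \o/'; echo
```

Even with this in place, jq or another JSON-aware tool remains the safer choice.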

3 Comments

@CharlesDuffy indeed, typed that without thinking - fixed now, thanks for diagnosing!
@tripleee This actually works out pretty well. Can the same be written as a MongoDB query? It would be like db.testcollection.insert([{"Name": "brendan","Employer": "google","Designation": "engineer"},{"Name": "stones","Employer": "microsoft","Designation": "chief_engineer"}])
No idea where you want that to go. If you just want it added to the text we print, the change should be obvious.
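
If the goal is just to have the insert call in the printed text, one way to read that suggestion is to move the call into the first separator and the closing printf. A sketch based on the loop from this answer; the function name to_mongo_insert is made up, and db.testcollection is simply the collection named in the comment:

```shell
#!/usr/bin/env bash
# Same loop as in the answer, wrapped in a hypothetical function. The opening
# text is emitted with the first record; the closing text after the loop.
to_mongo_insert() {
  local sep='db.testcollection.insert(['
  local name employer designation
  while read -r name employer designation; do
    printf '%s{"Name": "%s", "Employer": "%s", "Designation": "%s"}' \
      "$sep" "$name" "$employer" "$designation"
    sep=', '
  done
  printf '])\n'
}

printf '%s\n' \
  'brendan google engineer' \
  'stones microsoft chief_engineer' | to_mongo_insert
```

Note that with empty input this prints only the closing "])", so the caller would need to check for that case. The same caveats about unescaped values apply.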
0

If you have access to jq 1.5 or later, then you can use inputs and may wish to consider using splits(" +") in case the tokens might be separated by more than one space:

jq -n -R '
  [inputs
   | [splits(" +")]
   | { "Name": .[0], "Employer": .[1], "Designation": .[2] }]'
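
To see why splits(" +") can matter, here is a small sketch comparing the two on a line with repeated spaces. It assumes jq 1.5 or later; the function names are illustrative only:

```shell
#!/usr/bin/env bash
# Hypothetical helpers for comparing the two tokenisers on the same input.
with_split() {
  # split(" ") yields an empty string between consecutive spaces,
  # which shifts the fields out of position.
  jq -c -n -R '[inputs | split(" ")
    | { "Name": .[0], "Employer": .[1], "Designation": .[2] }]'
}

with_splits() {
  # splits(" +") treats a run of spaces as one separator.
  jq -c -n -R '[inputs | [splits(" +")]
    | { "Name": .[0], "Employer": .[1], "Designation": .[2] }]'
}

line='brendan  google engineer'   # two spaces after the name
printf '%s\n' "$line" | with_split
printf '%s\n' "$line" | with_splits
```

Only the second call should assign "google" to Employer; in the first, Employer ends up as an empty string.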

If you do not have ready access to jq 1.5 or later, then please note that the following will work with jq 1.4:

jq -R -s '
  [split("\n")[]
   | select(length>0)
   | split(" ")
   | { "Name": .[0], "Employer": .[1], "Designation": .[2] }]'
