I have files that each contain N JSON objects separated by commas (,):

{"a":1},{"b":2},{"c":3},{"d":2},{"e":1},{"f":2} ...

I would like to convert them into one JSON array with N objects using jq:

[{"a":1},{"b":2},{"c":3},{"d":2},{"e":1},{"f":2} ...]

I tried jq -R 'split(",")' myfile.json, but it gives me an array of N strings rather than N objects:

[
  "{\"a\":1}",
  "{\"b\":2}",
  "{\"a\":1}",
  "{\"b\":2}",
  "{\"a\":1}",
  "{\"b\":2}",
  "{\"a\":1}",
  "{\"b\":2}" ....
]

Any idea?

  • It's probably easiest to just wrap your input in [.....] Commented Apr 4, 2018 at 7:41
  • My file contains millions of JSON objects; maybe reading the whole file is not efficient? Commented Apr 4, 2018 at 7:54
  • Please clarify whether any of the JSON objects might contain more than one key, and whether any of the key names or values might contain a comma. Commented Apr 9, 2018 at 11:59

2 Answers

You are on the right track; you just need to map fromjson over the array so that each string is parsed into an object, e.g.:

jq -Rc 'split(",") | map(fromjson)' myfile.json

Output:

[{"a":1},{"b":2},{"c":3},{"d":2},{"e":1},{"f":2}]

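Note that this splits on every comma, so it only works if no object contains a comma (a second key, or a comma inside a string value); the same holds for the tr approach below. If you are dealing with huge inputs, it is probably better to split the input with a streaming command that emits one object per line rather than building one large array, e.g. with tr: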

<myfile.json tr ',' '\n' | jq -c .

Output:

{"a":1}
{"b":2}
{"c":3}
{"d":2}
{"e":1}
{"f":2}
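To see why commas inside an object matter for the tr-based split above, here is a minimal sketch on a made-up input (the two-key object is an assumption for illustration, not taken from the question's data). It uses only printf and tr, so it runs without jq:

```shell
# Hypothetical input: the first object has two keys, so it contains a comma.
input='{"a":1,"b":2},{"c":3}'

# tr replaces every comma with a newline, including the one inside the object.
pieces=$(printf '%s\n' "$input" | tr ',' '\n')

# Show the resulting lines; the first object is now split across two lines.
printf '%s\n' "$pieces"
```

The two objects come out as three lines, and the first two lines are fragments rather than valid JSON, so a later jq stage would reject them. If your objects can look like this, splitting on commas is not safe.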

Since you have millions of these JSON objects, let me first suggest an efficient way to produce a stream of them in the JSON-Lines format (i.e., with "newline" as the delimiter).

WARNING: THE FOLLOWING ASSUMES THAT THE OBJECTS DO NOT CONTAIN JSON STRINGS WITH COMMAS.

Let's assume the comma-separated objects are in a file named objects.txt. First, create a file, program.jq, with the following jq program:

def one:
  (try input catch null)
  | if . == 0 then empty elif . == null then one else (., one) end;

one

Then assuming your shell allows it, the invocation:

 (cat objects.txt; echo 0) |
   sed $'s/,/,\\\n/g' |
   jq -n -c -f program.jq

will produce the stream, one JSON object per line. This is a very manageable format. For example, to produce an array, you could pipe the above-mentioned stream into jq -s .

However, if the goal is solely to produce a JSON array, then as pointed out elsewhere, the most efficient approach would be to enclose the comma-separated objects in square brackets, along the lines of:

(echo "["; cat objects.txt; echo "]")
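As a rough sketch of the bracket-wrapping approach, the following writes the result to a file. The scratch directory, the three-object sample, and the output name array.json are assumptions made for illustration; only objects.txt comes from the answer.

```shell
# Hypothetical sample data in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' '{"a":1},{"b":2},{"c":3}' > "$workdir/objects.txt"

# Emit "[", then the file contents unchanged, then "]".
# cat streams the file, so nothing is parsed or held in memory.
{ printf '['; cat "$workdir/objects.txt"; printf ']\n'; } > "$workdir/array.json"

cat "$workdir/array.json"
```

The newline that ends objects.txt lands just before the closing bracket, which is insignificant whitespace in JSON. If jq is installed, running jq length on the resulting file is one way to confirm that it parses as a single array.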

So the relevant question here, perhaps, is: what's the real goal? It seems doubtful that an unmanageably large array of small JSON objects is likely to be more useful than either the original comma-separated sequence or a simple stream.
