I am reading this question: "Parse JSON Array and load into hive table".
The nested JSON contains multiple } and { characters, yet the regex pattern (?<=\\}),(?=\\{) still splits it into the top-level JSON elements. Could anyone please explain how this split call works?
select
split(substr('[{"a":{"c":"sss"},"w":123},{"b":2},{"r":{"c":"sss"},"w":555}]',2),'(?<=\\}),(?=\\{)')[0],
split(substr('[{"a":{"c":"sss"},"w":123},{"b":2},{"r":{"c":"sss"},"w":555}]',2),'(?<=\\}),(?=\\{)')[1],
split(substr('[{"a":{"c":"sss"},"w":123},{"b":2},{"r":{"c":"sss"},"w":555}]',2),'(?<=\\}),(?=\\{)')[2]
The result is:
{"a":{"c":"sss"},"w":123} {"b":2} {"r":{"c":"sss"},"w":555}]
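To experiment with the pattern outside Hive, the same split can be reproduced in plain Java. This is only a sketch: it assumes Hive's split uses Java regular-expression semantics, and the names SplitDemo and splitElements are made up for illustration. The pattern matches a comma only when the character before it is } (lookbehind) and the character after it is { (lookahead). The braces themselves are not consumed, so they stay in the pieces.

```java
import java.util.Arrays;

public class SplitDemo {
    // Hypothetical helper mirroring split(substr(json, 2), '(?<=\\}),(?=\\{)').
    static String[] splitElements(String jsonArray) {
        // Hive's substr(str, 2) is 1-based, so it drops the leading '['.
        String body = jsonArray.substring(1);
        // (?<=\}) : the previous character must be '}' (zero-width lookbehind)
        // ,       : the only character actually consumed as the delimiter
        // (?=\{)  : the next character must be '{' (zero-width lookahead)
        return body.split("(?<=\\}),(?=\\{)", -1);
    }

    public static void main(String[] args) {
        String json = "[{\"a\":{\"c\":\"sss\"},\"w\":123},{\"b\":2},{\"r\":{\"c\":\"sss\"},\"w\":555}]";
        // Print each piece on its own line.
        Arrays.stream(splitElements(json)).forEach(System.out::println);
    }
}
```

The comma in "sss"},"w" is preceded by } but followed by ", so the lookahead fails and no split happens there. The pattern does not track nesting: an input with }, followed by { inside a nested array or a string value would also be split at that point.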
Btw, an array without its leading [ is passed to json_tuple, like {"a":1},{"b":2}]. This is not a valid JSON array at all, so why does json_tuple work with it?