4

Is there a way to extract data from a text column containing json arrays with a varying number of json objects into a table?

For example if I...

CREATE TABLE tableWithJsonStr (location TEXT, jsonStr TEXT);

INSERT INTO tableWithJsonStr VALUES 
('Home', '[{"animalId":"1","type":"dog", "color":"white","isPet":"1"},{"animalId":"2","type":"cat", "color":"brown","isPet":"1"}]'),
('Farm', '[{"animalId":"8","type":"cow", "color":"brown","isPet":"0"}, {"animalId":"33","type":"pig", "color":"pink","isPet":"0"}, {"animalId":"22","type":"horse", "color":"black","isPet":"1"}]'),
('Zoo', '[{"animalId":"5","type":"tiger", "color":"stripes","isPet":"0"}]');

and

CREATE TABLE animal (
  location TEXT,
  idx INT,
  animalId INT,
  type TEXT,
  color TEXT,
  isPet BOOLEAN
);

I am able to extract tableWithJsonStr.jsonStr by running:

INSERT INTO animal
SELECT location,
       idx AS id,
       TRIM(BOTH'"' FROM JSON_EXTRACT(jsonStr, CONCAT('$[', idx, '].animalId'))) AS animalId,
       TRIM(BOTH'"' FROM JSON_EXTRACT(jsonStr, CONCAT('$[', idx, '].type'))) AS type,
       TRIM(BOTH'"' FROM JSON_EXTRACT(jsonStr, CONCAT('$[', idx, '].color'))) AS color,
       TRIM(BOTH'"' FROM JSON_EXTRACT(jsonStr, CONCAT('$[', idx, '].isPet'))) AS isPet
FROM tableWithJsonStr
JOIN(
  SELECT 0 AS idx UNION
         SELECT 1 AS idx UNION
         SELECT 2 AS idx UNION
         SELECT 3 AS idx
  ) AS indexes
WHERE JSON_EXTRACT(jsonStr, CONCAT('$[', idx, ']')) IS NOT NULL;

The result of the animal table is:

| location | idx | animalId | type  | color   | isPet |
|==========|=====|==========|=======|=========|=======|
| Farm     |   0 |        8 |  cow  |   brown |     0 |
| Farm     |   1 |       33 |  pig  |    pink |     0 |
| Farm     |   2 |       22 | horse |   black |     1 |
| Home     |   0 |        1 |   dog |   white |     1 |
| Home     |   1 |        2 |   cat |   brown |     1 |
| Zoo      |   0 |        5 | tiger | stripes |     0 |

While the solution works, it is not extensible. If I have more than 3 objects in my json array, they will not be accounted for unless I add another SELECT 4 AS idx in my JOIN. Is there a better way to iterate over the objects in the array that doesn't require foreknowledge of the max number of objects that might be in each array?

2
  • Why are you not using the JSON data type as it is designed for the type of processing you desire? Commented Jan 9, 2019 at 23:29
  • This would be ideal but the table the data is written to is out of my control and contains variable data types. To be honest, the db needs to be completely refactored. The short-term requires me to work with the data stored as text Commented Jan 14, 2019 at 18:54

1 Answer 1

5

If you're using MySQL 8.0, you can use the JSON_TABLE command to extract your data from each row of JSON:

SELECT t1.location, farm.*
FROM tableWithJsonStr t1
JOIN JSON_TABLE(t1.jsonStr,
     '$[*]'
     COLUMNS (idx FOR ORDINALITY,
              animalId INT PATH '$.animalId',
              type TEXT PATH '$.type',
              color TEXT PATH '$.color',
              isPet BOOLEAN PATH '$.isPet')
     ) farm
ORDER BY location, idx

Output:

location    idx     animalId    type    color       isPet
Farm        1       8           cow     brown       0
Farm        2       33          pig     pink        0
Farm        3       22          horse   black       1
Home        1       1           dog     white       1
Home        2       2           cat     brown       1
Zoo         1       5           tiger   stripes     0

Demo on dbfiddle

If you are stuck with MySQL 5.7, you can use a stored procedure to extract the data:

DELIMITER $$
CREATE PROCEDURE extract_animals()
BEGIN
  DECLARE idx INT;
  DECLARE finished INT DEFAULT 0;
  DECLARE loc, json VARCHAR(200);
  DECLARE json_cursor CURSOR FOR SELECT location, jsonStr FROM tableWithJsonStr;
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET finished = 1;
  DROP TABLE IF EXISTS animal;
  CREATE TABLE animal (location TEXT, idx INT, animalId INT, type TEXT, color TEXT, isPet BOOLEAN);
  OPEN json_cursor;
  json_loop: LOOP
    FETCH json_cursor INTO loc, json;
    IF finished = 1 THEN
      LEAVE json_loop;
    END IF;
    SET idx = 0;
    WHILE JSON_CONTAINS_PATH(json, 'one', CONCAT('$[', idx, ']')) DO
      INSERT INTO animal VALUES(loc,
        idx,
        JSON_UNQUOTE(JSON_EXTRACT(json, CONCAT('$[', idx, '].animalId'))),
        JSON_UNQUOTE(JSON_EXTRACT(json, CONCAT('$[', idx, '].type'))),
        JSON_UNQUOTE(JSON_EXTRACT(json, CONCAT('$[', idx, '].color'))),
        JSON_UNQUOTE(JSON_EXTRACT(json, CONCAT('$[', idx, '].isPet'))));
      SET idx = idx + 1;
    END WHILE;
  END LOOP json_loop;
END $$

Output:

location    idx     animalId    type    color       isPet
Home        0       1           dog     white       1
Home        1       2           cat     brown       1
Farm        0       8           cow     brown       0
Farm        1       33          pig     pink        0
Farm        2       22          horse   black       1
Zoo         0       5           tiger   stripes     0

Demo on dbfiddle

Sign up to request clarification or add additional context in comments.

2 Comments

This is great. Unfortunately I am stuck with 5.7 for now. I should have provided that in my question. I have started the ball rolling to get or SRE team to upgrade. This seems like an ideal solution though.
@MichaelBadger thanks for reminding me I was halfway through writing a stored procedure to do the job. I've finished it now and included it in my answer.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.