I have a query that was written a while ago: basically a materialized view that uses the json_table function.

Since we moved to Oracle 19c, that MV sometimes works and other times doesn't. I rewrote the query using the json_value function instead. Looking at the execution plans, the query that uses json_table is much slower, but I don't understand all the data in the plan.

Can someone explain what the Bytes, CPU, Time, etc. columns mean?

This is the version using json_value:

EXPLAIN PLAN for
SELECT
    JSON_VALUE(response, '$.ErrorRecord[0].xNumber')                   xNumber,
    JSON_VALUE(response, '$.ErrorRecord[0]."error field"')             ERROR_FIELD,
    JSON_VALUE(response, '$.ErrorRecord[0]."value of field in error"') VALUE_OF_FIELD_IN_ERROR,
    JSON_VALUE(response, '$.ErrorRecord[0]."error description"')       ERROR_DESCRIPTION,
    JSON_VALUE(request, '$.Status')                                    STATUS,
    sf.sv_code                                                        CENTER,
    TO_CHAR(arr.created_date_time, 'YYYYMMDD' )                        DATE_OCCURANCE
    
FROM
    aud_request_response arr , 
    person p,
    rep_mapper_svc_fco sf,
    rep_mapper_interface_error re
WHERE 
    JSON_VALUE(response, '$.ErrorRecord[0].xNumber') = p.registration_number (+)
    AND arr.response.Status = 'Error'
    AND arr.request.interfaceName = 'CLAIMS'
    AND JSON_VALUE(request, '$.DataRecord[0].ACO') = sf.fco_code(+)
    AND arr.request.interfaceName = re.interface_name 
    AND coalesce(sf.sv_code, 'ATH')
        IN ('XS','YS','XZ','ZS','ASD')
GROUP BY 
    sf.sv_code,
    JSON_VALUE(request, '$.DataRecord[0].ACO'),
    arr.request.interfaceName,
    JSON_VALUE(request, '$.Status'), 
    JSON_VALUE(response, '$.ErrorRecord[0]."error field"'),
    arr.created_date_time,
    arr.updated_date_time,
    JSON_VALUE(response, '$.ErrorRecord[0]."value of field in error"'),
    JSON_VALUE(response, '$.ErrorRecord[0]."error description"'),
    JSON_VALUE(response, '$.ErrorRecord[0].xNumber') ;

SELECT * FROM table(DBMS_XPLAN.DISPLAY);
Plan hash value: 241534218
 
---------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name                       | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |                            |  1094 |   871K|       |  1877K  (1)| 00:01:14 |
|   1 |  HASH GROUP BY           |                            |  1094 |   871K|  4688K|  1877K  (1)| 00:01:14 |
|   2 |   NESTED LOOPS OUTER     |                            |  5259 |  4190K|       |  1877K  (1)| 00:01:14 |
|*  3 |    FILTER                |                            |       |       |       |            |          |
|*  4 |     HASH JOIN RIGHT OUTER|                            |  5259 |  4139K|       |  1866K  (1)| 00:01:13 |
|   5 |      TABLE ACCESS FULL   | REP_MAPPER_SVC_FCO         |    85 |   680 |       |     3   (0)| 00:00:01 |
|*  6 |      HASH JOIN RIGHT SEMI|                            |  5259 |  4098K|       |  1866K  (1)| 00:01:13 |
|*  7 |       TABLE ACCESS FULL  | REP_MAPPER_INTERFACE_ERROR |    33 |   363 |       |     5   (0)| 00:00:01 |
|*  8 |       TABLE ACCESS FULL  | AUD_REQUEST_RESPONSE       |  5259 |  4041K|       |  1866K  (1)| 00:01:13 |
|*  9 |    INDEX UNIQUE SCAN     | PER_ANBR_IDX               |     1 |    10 |       |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------------

This is the version using json_table:

EXPLAIN PLAN for
SELECT
    jtresponse.xNumber as xNumber,
    jtresponse.error_field as ERROR_FIELD,
    replace(jtresponse.value_of_field_in_error, ',interfaceName=INTERFACES','') as VALUE_OF_FIELD_IN_ERROR,
    jtresponse.error_description as ERROR_DESCRIPTION,
    trim(arr.response.Status) as STATUS,
    sf.sv_code  as CENTER,
    TO_CHAR(arr.created_date_time, 'YYYYMMDD' ) as DATE_OCCURANCE
from 
    aud_request_response arr,
    person p,
    rep_mapper_svc_fco sf,
    rep_mapper_interface_error re,
    json_table(response, '$'
        COLUMNS (
        nested path '$.ErrorRecord[*]' columns (
            xNumber path '$.xNumber' null on error,
            error_field path '$."error field"' null on error,
            value_of_field_in_error path '$."value of field in error"' null on error,
            error_description path '$."error description"' null on error
    ))) jtresponse
    ,json_table(request, '$'
        COLUMNS (
        nested path '$.DataRecord[*]' columns (
            fileControl path '$.ACO' null on error
    ))) jtrequest
where  jtrequest.fileControl =sf.fco_code(+)
    and arr.request.interfaceName = 'CLAIMS'
    and arr.request.interfaceName = re.interface_name
    and jtresponse.xNumber = p.registration_number (+)
    and arr.response.Status='Error'
    and coalesce(sf.sv_code,'ATH') in('XS','YS','XZ','ZS','ASD')
GROUP BY 
    sf.sv_code,
    jtrequest.fileControl,
    arr.request.interfaceName,
    arr.response.Status, 
    jtresponse.error_field,
    arr.created_date_time,
    arr.updated_date_time,
    jtresponse.value_of_field_in_error,
    jtresponse.error_description,
    jtresponse.xNumber;

SELECT * FROM table(DBMS_XPLAN.DISPLAY);

Plan hash value: 834586449
 
-----------------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                       | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                            |   350G|   260T|       |  1908M  (2)| 20:42:27 |
|   1 |  HASH GROUP BY             |                            |   350G|   260T|   290T|  1908M  (2)| 20:42:27 |
|   2 |   NESTED LOOPS             |                            |   350G|   260T|       |  1168M  (1)| 12:40:35 |
|*  3 |    FILTER                  |                            |       |       |       |            |          |
|*  4 |     HASH JOIN RIGHT OUTER  |                            |    42M|    32G|       |  2009K  (1)| 00:01:19 |
|   5 |      TABLE ACCESS FULL     | REP_MAPPER_SVC_FCO         |    85 |   680 |       |     3   (0)| 00:00:01 |
|   6 |      NESTED LOOPS          |                            |    42M|    32G|       |  2009K  (1)| 00:01:19 |
|*  7 |       HASH JOIN RIGHT SEMI |                            |  5259 |  4098K|       |  1866K  (1)| 00:01:13 |
|*  8 |        TABLE ACCESS FULL   | REP_MAPPER_INTERFACE_ERROR |    33 |   363 |       |     5   (0)| 00:00:01 |
|*  9 |        TABLE ACCESS FULL   | AUD_REQUEST_RESPONSE       |  5259 |  4041K|       |  1866K  (1)| 00:01:13 |
|  10 |       JSONTABLE EVALUATION |                            |       |       |       |            |          |
|  11 |    JSONTABLE EVALUATION    |                            |       |       |       |            |          |
-----------------------------------------------------------------------------------------------------------------

Thank you!

4 Comments
  • Can you please show the before/after queries? That is really bad. Commented Mar 8, 2022 at 20:31
  • @OldProgrammer I added the queries. Please take a look. Thank you! Commented Mar 8, 2022 at 20:53
  • Please edit the question to include a minimal reproducible example with: the CREATE TABLE statement for the tables; the INSERT statements; annotate the SELECT statement with the appropriate table aliases so we can see which table the columns are from; and the expected output for the sample data. Commented Mar 8, 2022 at 21:07
  • Looking at the query, I'm wondering why you use NESTED PATH in the JSON_TABLE statements? Also, (+) joins? Commented Mar 8, 2022 at 21:10

1 Answer


First up: the two queries are not equivalent!

The json_value query gets only the first entries of the DataRecord and ErrorRecord arrays. With json_table, the database generates a row for every element in each array.
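
If you only ever want the first entry of each array (which is what the json_value version returns), one option is to keep json_table but drop the NESTED PATH and point the row path at element 0, so each json_table produces at most one row per document. An untested sketch based on your column names:

json_table(arr.response, '$.ErrorRecord[0]'
    COLUMNS (
        xNumber path '$.xNumber' null on error,
        error_field path '$."error field"' null on error,
        value_of_field_in_error path '$."value of field in error"' null on error,
        error_description path '$."error description"' null on error
    )) jtresponse,
json_table(arr.request, '$.DataRecord[0]'
    COLUMNS (
        fileControl path '$.ACO' null on error
    )) jtrequest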

I see no join between jtrequest and jtresponse, so the query is generating the Cartesian product of these arrays, i.e. a row for every element of the first array combined with every element of the second, for each document. You can see this explode in the plan: 5259 rows out of AUD_REQUEST_RESPONSE become 42M rows after the first JSONTABLE EVALUATION and 350G after the second.

The Rows/Bytes/Time columns are all estimates: how many rows the optimizer expects each step to return, how much data that is, and how long it thinks it will take, based on the table statistics.

The top line of the plan is the estimate for what the whole query will return. So for the json_table version, the optimizer is estimating:

  • 350G => 350 billion rows
  • 260T => 260 terabytes of data
  • 20:42:27 => 20+ hours of runtime

These figures could be wrong for many reasons, but even if they're overestimated by a factor of 1,000 you're still looking at a huge amount of data.
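
If you want to see how far off these estimates are, you can actually run the query (not just EXPLAIN PLAN it) with the gather_plan_statistics hint and then pull the cursor plan with estimated vs. actual row counts:

SELECT /*+ gather_plan_statistics */ ...rest of the query...;

SELECT * FROM table(DBMS_XPLAN.DISPLAY_CURSOR(format => 'ALLSTATS LAST'));

The E-Rows and A-Rows columns then show which step the optimizer misjudges.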

I think you need to figure out the purpose of the original query - in particular why it's generating the Cartesian product of the two arrays. This quickly increases the data volumes.
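
If it turns out the ErrorRecord and DataRecord entries are supposed to line up by position (I don't know your data, so treat this purely as a sketch; err_pos and rec_pos are just illustrative names), you could expose FOR ORDINALITY columns from each json_table and join on them instead of letting the two arrays multiply:

json_table(arr.response, '$'
    COLUMNS (
        nested path '$.ErrorRecord[*]' columns (
            err_pos for ordinality,  -- position within ErrorRecord
            xNumber path '$.xNumber' null on error,
            error_field path '$."error field"' null on error,
            value_of_field_in_error path '$."value of field in error"' null on error,
            error_description path '$."error description"' null on error
    ))) jtresponse,
json_table(arr.request, '$'
    COLUMNS (
        nested path '$.DataRecord[*]' columns (
            rec_pos for ordinality,  -- position within DataRecord
            fileControl path '$.ACO' null on error
    ))) jtrequest
...
and jtresponse.err_pos = jtrequest.rec_pos

That pairs element 1 with element 1, element 2 with element 2, and so on, instead of every element with every element.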

1 Comment

How do we join jtrequest and jtresponse since there seem to be no common elements between the two?
