Package DBMS_JSON made useful

As of Oracle 21c, there are two ways to map JSON data into SQL data types and structures:

  • Virtual columns through JSON_VALUE function.
  •  Views or materialized views with function JSON_TABLE.

You can do all the typing by yourself or use DBMS_JSON.ADD_VIRTUAL_COLUMNS and DBMS_JSON.CREATE_VIEW procedures to make the computer do it for you. Unfortunately, the DBMS_JSON package uses JSON datatypes, which is suboptimal with SQL. But of course, there’s a way around it.

SETUP

First some table to keep our JSON data

CREATE TABLE transactions (
    payload json NOT NULL,
        id AS ( JSON_VALUE(payload, '$.ID' RETURNING NUMBER(38) ERROR ON ERROR ERROR ON MISMATCH) )
            CONSTRAINT transaction_pk PRIMARY KEY,
        transaction_ts AS ( JSON_VALUE(payload, '$.TRANSACTION_TS.timestamp()' ERROR ON ERROR ERROR ON MISMATCH) )
)
    PARTITION BY RANGE (
        transaction_ts
    ) INTERVAL ( numtodsinterval(1, 'DAY') ) ( PARTITION p_first
        VALUES LESS THAN ( TIMESTAMP '2021-02-26 00:00:00.00' )
    );

I’m using JSON_VALUE to create virtual columns for the primary key and partitioning. I know id’s are integers and transaction_ts are timestamps so that I can add proper data types in JSON_VALUE.

And some nonsense data:

INSERT INTO transactions ( payload )
    WITH gen AS (
        SELECT
            ROWNUM id
        FROM
            dual
        CONNECT BY
            level < 2000
    )
    SELECT
        JSON_OBJECT(*)
    FROM
        (
            SELECT
                ROWNUM                                   id,
                systimestamp - dbms_random.value() * 355 transaction_ts,
                sys_guid()                               external_reference,
                mod(g2.id, 1023)                         customer_id,
                mod(g1.id, 1023)                         account_id,
                mod(g2.id, 121) + round(dbms_random.value(), 2) amount,
                (
                    SELECT
                        JSON_ARRAYAGG(obj)
                    FROM
                        (
                            SELECT
                                    JSON_OBJECT(
                                        'id' VALUE ROWNUM,
                                        'customer_acc' VALUE mod(g1.id, 1023),
                                        'system_acc' VALUE 0,
                                        'amount' VALUE mod(g2.id, 121)
                                    ) obj
                            FROM
                                dual
                            CONNECT BY level < 1 + mod(g1.id, 5)
                        )
                )                                        transaction_legs
            FROM
                     gen g1 CROSS JOIN gen g2
        );

Search index just to create a data guide:

CREATE SEARCH INDEX ix_trans_payload_search ON transactions ( payload ) FOR JSON;

Rest of this blog post will focus on the three properties of the JSON document: ID, AMOUNT and TRANSACTION_TS, this is the relevant part of the hierarchical data guide:

{
    "o:path" : "$.ID",
    "type" : "number",
    "o:length" : 8,
    "o:preferred_column_name" : "PAYLOAD$ID"
  },
  {
    "o:path" : "$.AMOUNT",
    "type" : "number",
    "o:length" : 8,
    "o:preferred_column_name" : "PAYLOAD$AMOUNT"
  },
{
    "o:path" : "$.TRANSACTION_TS",
    "type" : "string",
    "o:length" : 32,
    "o:preferred_column_name" : "PAYLOAD$TRANSACTION_TS"
  },

JSON datatypes for ID and AMOUNT are float, even if they know that ID is an integer and AMOUNT has two digits from the decimal point. On the other hand, TRANSACTION_TS is a string since JSON does not have datetime data types.

DBMS_JSON.ADD_VIRTUAL COLUMNS and CREATE_VIEW procedures

The curious thing about these procedures is that you can’t specify your own data types or conversion methods. They take the data type from the data guide. Let’s create another virtual column for the TS_TRANSACTION property.

DECLARE
    l_dataguide CLOB := dbms_json.get_index_dataguide('TRANSACTIONS', 'PAYLOAD', dbms_json.format_hierarchical);
BEGIN
    dbms_json.add_virtual_columns('TRANSACTIONS', 'PAYLOAD', l_dataguide);
END;

The result is underwhelming. Our timestamp has VARCHAR2 datatype, integers accept floating point numbers, and because of NULL ON ERROR clause, these virtual columns do not enforce the schema.

Introducing the proper data types

JSON data guide is a JSON document, and editing it before passing it to the DBMS_JSON procedures is relatively easy. And as it appears, I can use data types returned by the type() SQL/JSON path expression item method (what a mouthful!). So let’s try it.

DECLARE
    l_dataguide CLOB;
BEGIN
    SELECT
        JSON_TRANSFORM(dbms_json.get_index_dataguide('TRANSACTIONS', 'PAYLOAD', dbms_json.format_hierarchical),
                       SET '$.properties.TRANSACTION_TS.type' = 'timestamp')
    INTO l_dataguide
    FROM
        dual;
    dbms_json.add_virtual_columns('TRANSACTIONS', 'PAYLOAD', l_dataguide);
END;

Magically our PAYLOAD$TRANSACTION_TS has become a TIMESTAMP:

What does not work is providing precision for the data types. For example, adding precision to the TIMESTAMP column will give ORA-40594: invalid data guide: TRANSACTION_TS has an invalid type timestamp(3). Unfortunately, I couldn’t find a way to change NULL ON ERROR clause, so you can’t use those virtual columns to enforce constraints or to partition the table.

Leave a comment