Oracle In-Memory for virtual columns

If you want to materialize the results of a computation in Oracle, function-based indexes and materialized views are the standard answer. However, in-Memory provides the third option: in-memory virtual columns and expressions. While virtual expressions somewhat supersede in-memory virtual columns, they provide a relatively simple introduction to how virtual expression infrastructure works.

As per documentation, every IMCU has an associated In-memory Expression Unit (IMEU). IMEUs are populated with IMCUs; for a virtual column or expression to be loaded to in-memory, the table must be (re)populated.

Setup

Let’s have a look at how this thing works. First, some test data:

INMTEST@localhost/orclpdb1> CREATE TABLE transactions (
  2      id                 INTEGER PRIMARY KEY,
  3      customer_id        INTEGER NOT NULL,
  4      account_id         INTEGER NOT NULL,
  5      transaction_type   NUMBER(2) NOT NULL,
  6      external_reference VARCHAR2(256),
  7      amount             NUMBER(36, 2) NOT NULL,
  8      created_ts         TIMESTAMP NOT NULL,
  9      system_ts          TIMESTAMP
 10  )
 11      PARTITION BY RANGE (
 12          created_ts
 13      ) INTERVAL ( numtodsinterval(1, 'DAY') ) ( PARTITION p_first
 14          VALUES LESS THAN ( TIMESTAMP '2020-03-19 00:00:00.00' )
 15      );

Table created.

Elapsed: 00:00:00.03
INMTEST@localhost/orclpdb1> INSERT /*+ parallel enable_parallel_dml APPEND */ INTO transactions
  2      WITH gen AS (
  3          SELECT
  4              level id
  5          FROM
  6              dual
  7          CONNECT BY
  8              level < 5000
  9      )
 10      SELECT
 11          ROWNUM,
 12          mod(1 + g1.id * g2.id, 36000),
 13          mod(1 + g1.id * g2.id, 36000),
 14          mod(g1.id, 8),
 15          sys_guid(),
 16          mod(1 + g1.id * g2.id, 36000) + round(g1.id /(1 + g2.id), 4),
 17          systimestamp AT TIME ZONE 'CET' - dbms_random.value(0, 366),
 18          systimestamp
 19      FROM
 20               gen g1
 21          CROSS JOIN gen g2;

24990001 rows created.

Elapsed: 00:11:27.71
INMTEST@localhost/orclpdb1 03-04-2023 09:50:46> commit;

Commit complete.

Elapsed: 00:00:00.38

Virtual column disabled for in-memory

To test if in-memory virtual columns have any effect, I’m using a relatively slow, deterministic function, and I will create a virtual column based on that function. My expectation is that once the virtual column is loaded to the in-memory, queries will become faster since function calls are avoided. The second thing that should happen is once the virtual column is enabled for in-memory, repopulating the in-memory area will take longer.

INMTEST@localhost/orclpdb1 03-04-2023 09:39:18> CREATE OR REPLACE FUNCTION slow_round (
  2              n NUMBER, m NUMBER
  3  ) RETURN NUMBER
  4      AUTHID DEFINER
  5      DETERMINISTIC
  6      PARALLEL_ENABLE
  7  IS
  8      PRAGMA udf;
  9      l_dummy PLS_INTEGER;
 10  BEGIN
 11      for i in 0 .. 400 loop
 12          l_dummy := mod(i, 255);
 13      end loop;
 14      RETURN round(n, m);
 15  END slow_round;
 16  /

Function created.

Elapsed: 00:00:00.01
INMTEST@localhost/orclpdb1 03-04-2023 09:50:46> alter table transactions add amount_2      AS (slow_round(amount, 2));

Table altered.

Elapsed: 00:00:00.02
INMTEST@localhost/orclpdb1 03-04-2023 09:50:46> alter table transactions inmemory;

Table altered.

Elapsed: 00:00:00.22
INMTEST@localhost/orclpdb1 03-04-2023 09:50:46> alter table transactions no inmemory (amount_2);

Table altered.

Elapsed: 00:00:00.01

My test environment has only on table marked for in-memory, so I can just reload all the segments:

INMTEST@localhost/orclpdb1 03-04-2023 09:50:47> declare
  2      l_out NUMBER;
  3  begin
  4      l_out := DBMS_INMEMORY_ADMIN.POPULATE_WAIT('NONE', force => true);
  5      dbms_output.put_line('populate_wait -> ' || l_out);
  6  end;
  7  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:56.53

A little check if everything was populated:

INMTEST@localhost/orclpdb1 03-04-2023 09:51:43> SELECT
  2      populate_status,
  3      COUNT(*)
  4  FROM
  5      gv$im_user_segments
  6  WHERE
  7      segment_name = 'TRANSACTIONS'
  8  GROUP BY
  9      populate_status;

POPULATE_STAT   COUNT(*)
------------- ----------
COMPLETED            367

Elapsed: 00:00:00.01
INMTEST@localhost/orclpdb1 03-04-2023 09:51:43>
INMTEST@localhost/orclpdb1 03-04-2023 09:51:43> SELECT
  2      SUM(inmemory_size),
  3      SUM(bytes),
  4      SUM(bytes_not_populated)
  5  FROM
  6      gv$im_user_segments
  7  WHERE
  8      segment_name = 'TRANSACTIONS'
  9  ;

SUM(INMEMORY_SIZE) SUM(BYTES) SUM(BYTES_NOT_POPULATED)
------------------ ---------- ------------------------
        2783707136 3078619136                        0

Elapsed: 00:00:00.01
INMTEST@localhost/orclpdb1 03-04-2023 09:51:43> SELECT
  2      column_name,
  3      obj_num,
  4      segment_column_id,
  5      inmemory_compression
  6  FROM
  7      v$im_column_level
  8  WHERE
  9      table_name = 'TRANSACTIONS';

COLUMN_NAME                                                                                                                          OBJ_NUM SEGMENT_COLUMN_ID INMEMORY_COMPRESSION
--------------------------------------------------------------------------------------------------------------------------------- ---------- ----------------- --------------------------
AMOUNT_2                                                                                                                              100347                0 NO INMEMORY
<snip>

Query to check how slow is our slow rounding function:

INMTEST@localhost/orclpdb1 03-04-2023 09:51:43> set autotrace on
INMTEST@localhost/orclpdb1 03-04-2023 09:51:43> SELECT
  2      SUM(amount_2) amount
  3  FROM
  4      transactions
  5  WHERE
  6      trunc(created_ts) = trunc(systimestamp - 128)
  7  ;

    AMOUNT
----------
1219916975

Elapsed: 00:00:19.07

Execution Plan
----------------------------------------------------------
Plan hash value: 4195887397

-------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |     1 |    24 |  7331  (58)| 00:00:01 |       |       |
|   1 |  SORT AGGREGATE              |              |     1 |    24 |            |          |       |       |
|   2 |   PARTITION RANGE ALL        |              |   249K|  5857K|  7331  (58)| 00:00:01 |     1 |1048575|
|*  3 |    TABLE ACCESS INMEMORY FULL| TRANSACTIONS |   249K|  5857K|  7331  (58)| 00:00:01 |     1 |1048575|
-------------------------------------------------------------------------------------------------------------

Virtual column enabled for in-memory

Next I will add column AMOUNT_2 to the in-memory and repopulate the segments.

INMTEST@localhost/orclpdb1 03-04-2023 09:52:21> alter table transactions inmemory (amount_2);

Table altered.

Elapsed: 00:00:00.01
INMTEST@localhost/orclpdb1 03-04-2023 09:52:21>
INMTEST@localhost/orclpdb1 03-04-2023 09:52:21> set serverout off
INMTEST@localhost/orclpdb1 03-04-2023 09:52:21> declare
  2      l_out NUMBER;
  3  begin
  4      l_out := DBMS_INMEMORY_ADMIN.POPULATE_WAIT('NONE', force => true);
  5      dbms_output.put_line('populate_wait -> ' || l_out);
  6  end;
  7  /

PL/SQL procedure successfully completed.

Elapsed: 00:39:09.75
INMTEST@localhost/orclpdb1 03-04-2023 10:31:31> SELECT
  2      populate_status,
  3      COUNT(*)
  4  FROM
  5      gv$im_user_segments
  6  WHERE
  7      segment_name = 'TRANSACTIONS'
  8  GROUP BY
  9      populate_status;

POPULATE_STAT   COUNT(*)
------------- ----------
COMPLETED            365
STARTED                2

Elapsed: 00:00:00.01

Reloading the segments took whopping 39 minutes, and somehow POPULATE_WAIT gave up before all segments got repopulated. But for our test case this should be ok:

INMTEST@localhost/orclpdb1 03-04-2023 10:31:31> SELECT
  2      SUM(amount_2) amount
  3  FROM
  4      transactions
  5  WHERE
  6      trunc(created_ts) = trunc(systimestamp - 128)
  7  ;

    AMOUNT
----------
1219916975

Elapsed: 00:00:06.85

Execution Plan
----------------------------------------------------------
Plan hash value: 4195887397

-------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |     1 |    24 |  7331  (58)| 00:00:01 |       |       |
|   1 |  SORT AGGREGATE              |              |     1 |    24 |            |          |       |       |
|   2 |   PARTITION RANGE ALL        |              |   249K|  5857K|  7331  (58)| 00:00:01 |     1 |1048575|
|*  3 |    TABLE ACCESS INMEMORY FULL| TRANSACTIONS |   249K|  5857K|  7331  (58)| 00:00:01 |     1 |1048575|
-------------------------------------------------------------------------------------------------------------

Amazingly it works; with the in-memory virtual column, our query took around 7 seconds vs 19 seconds without.

One thought on “Oracle In-Memory for virtual columns

Leave a comment