This blog post is inspired by the misbehaving Kafka sink. When we traced the queries it sends to the database, the results were spectacular. There has to be some serious engineering going on:
Instead of looking at the source of said thing, I created a test case so it is possible to experiment with things in a controlled manner. TL;DR: root cause is described in Oracle documentation, of course, but it takes some fine print reading to figure out what is going wrong here.
Test data
INMTEST@localhost/orclpdb1 22-08-2023 17:03:47> CREATE SEQUENCE seq_json_test START WITH 80000;
Sequence created.
Elapsed: 00:00:00.01
INMTEST@localhost/orclpdb1 22-08-2023 17:03:54> CREATE TABLE json_test (
id INTEGER PRIMARY KEY,
payload CLOB,
CONSTRAINT ck_payload_json CHECK ( payload IS JSON )
);
Table created.
Elapsed: 00:00:00.02
INMTEST@localhost/orclpdb1 22-08-2023 17:04:01> INSERT INTO json_test
SELECT
ROWNUM,
JSON_OBJECT(* RETURNING CLOB)
FROM
all_objects;
66193 rows created.
Elapsed: 00:00:06.30
INMTEST@localhost/orclpdb1 22-08-2023 17:04:16> COMMIT;
Commit complete.
Elapsed: 00:00:00.02
I’m using CLOB for JSON data because character data is easy to spot in the tcpdump output. It’s better to stick to BLOB or JSON data types for all other purposes.
Java code is at the end of this post. Roughly speaking, there are two ways to interact with LOBs, either through streams or reader/writer objects, or as a Clob/Blob. The main difference is that streaming allows to access LOBs without knowing their size or buffering them in the memory. But as it comes out later in the post, specifying the size of the LOB when writing is essential for the performance.
Row prefetch size, and therefore write batch size is set to default (10).
LOB prefetching
Starting with Oracle 11.1, the JDBC driver fetches LOBs with the rest of the row. The parameter oracle.jdbc.defaultLobPrefetchSize controls how many bytes are fetched. The default is 32k, and this is per single LOB. Size -1 turns off LOB metadata and data prefetching, and 0 enables metadata prefetching and turns off data prefetching. Default LOB prefetch size can be set globally, and it is possible to override it for the Statement.
If we set defaultLobPrefetchSize to -1, both metadata operations and reading Clob data create network roundtrip to the server:
Metadata isn’t cached either; each call to Clob.length() triggers a network roundtrip. Another interesting thing here is wait event SQL*Net vector data to client
. According to MOS document 2133462.1, Oracle 11.2 introduced “vectored IO” for bulk data transfers, which ignores SDU.
When defaultLobPrefetchSize is smaller than LOB, the rest of the LOB is fetched from the DB when accessed. This means a network roundtrip for each remaining LOB piece.
Writing LOBs
Good news is that JDBC batching works. Avoiding excessive network roundtrips is much more trickier.
Naive case
As a simplest case, I can write back the same LOB I read from the database:
dummy = rset.getClob(2);
writeSt.setObject(1, dummy, OracleType.CLOB);
writeSt.addBatch();
As a result Oracle JDBC driver sends instead of the data LOB handles back to the database:
Trouble with the LOB size
When LOB size is known, Oracle uses either direct binding or stream binding. From a performance perspective, both are fine: statement batching works, and data goes to the database without extra network roundtrips. The documentation warns about possibly broken statement batching and additional network roundtrips, but I haven’t noticed any.
There are overloads for set*Stream()
functions that omit the length. These functions use LOB binding no matter what the LOB size is. It might be tempting for the developer to ignore how many bytes are in the stream, but the result is that it takes a minimum of 3 network roundtrips to write one LOB. Oh, and it breaks statement batching: in my test code, Statement.addBatch()
call triggers the network roundtrips.
LOBWRITE seems to send a maximum of 32k of data in one roundtrip, so for the large objects, network chatter will proliferate. Extra network roundtrips are not the case with the stream binding.
Note the second LOBWRITE operation in the end:
WAIT #0: nam='asynch descriptor resize' ela= 22 outstanding #aio=0 current aio limit=0 new aio limit=128 obj#=-1 tim=14811764719105
WAIT #0: nam='Disk file operations I/O' ela= 201 FileOperation=2 fileno=203 filetype=6 obj#=-1 tim=14811764719278
LOBTMPCREATE: type=TEMPORARY LOB,bytes=1,c=852,e=1072,p=0,cr=0,cu=2,tim=14811764719541
WAIT #0: nam='SQL*Net message to client' ela= 3 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=14811764719610
WAIT #0: nam='SQL*Net message from client' ela= 872 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=14811764720509
LOBPGSIZE: type=TEMPORARY LOB,bytes=8132,c=28,e=28,p=0,cr=0,cu=0,tim=14811764720591
WAIT #0: nam='SQL*Net message to client' ela= 3 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=14811764720638
WAIT #0: nam='SQL*Net message from client' ela= 13575 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=14811764734232
WAIT #0: nam='SQL*Net more data from client' ela= 58 driver id=1413697536 #bytes=8132 p3=0 obj#=-1 tim=14811764734999
WAIT #0: nam='SQL*Net more data from client' ela= 46 driver id=1413697536 #bytes=8082 p3=0 obj#=-1 tim=14811764735294
WAIT #0: nam='SQL*Net more data from client' ela= 9 driver id=1413697536 #bytes=8032 p3=0 obj#=-1 tim=14811764735386
WAIT #0: nam='SQL*Net more data from client' ela= 38 driver id=1413697536 #bytes=7982 p3=0 obj#=-1 tim=14811764735491
WAIT #0: nam='SQL*Net more data from client' ela= 10 driver id=1413697536 #bytes=39 p3=0 obj#=-1 tim=14811764735619
WAIT #0: nam='SQL*Net more data from client' ela= 4 driver id=1413697536 #bytes=7893 p3=0 obj#=-1 tim=14811764735641
WAIT #0: nam='SQL*Net more data from client' ela= 5 driver id=1413697536 #bytes=7843 p3=0 obj#=-1 tim=14811764735700
WAIT #0: nam='SQL*Net more data from client' ela= 8 driver id=1413697536 #bytes=7793 p3=0 obj#=-1 tim=14811764735809
WAIT #0: nam='SQL*Net more data from client' ela= 24 driver id=1413697536 #bytes=7743 p3=0 obj#=-1 tim=14811764735929
LOBWRITE: type=TEMPORARY LOB,bytes=32528,c=1891,e=1836,p=0,cr=0,cu=28,tim=14811764736184
WAIT #0: nam='SQL*Net message to client' ela= 4 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=14811764736275
WAIT #0: nam='SQL*Net message from client' ela= 10329 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=14811764746630
WAIT #0: nam='SQL*Net more data from client' ela= 96 driver id=1413697536 #bytes=8132 p3=0 obj#=-1 tim=14811764747109
WAIT #0: nam='SQL*Net more data from client' ela= 13 driver id=1413697536 #bytes=8082 p3=0 obj#=-1 tim=14811764747304
WAIT #0: nam='SQL*Net more data from client' ela= 9 driver id=1413697536 #bytes=8032 p3=0 obj#=-1 tim=14811764747398
WAIT #0: nam='SQL*Net more data from client' ela= 9 driver id=1413697536 #bytes=7982 p3=0 obj#=-1 tim=14811764747484
WAIT #0: nam='SQL*Net more data from client' ela= 11 driver id=1413697536 #bytes=39 p3=0 obj#=-1 tim=14811764747617
WAIT #0: nam='SQL*Net more data from client' ela= 4 driver id=1413697536 #bytes=7893 p3=0 obj#=-1 tim=14811764747639
WAIT #0: nam='SQL*Net more data from client' ela= 16 driver id=1413697536 #bytes=7843 p3=0 obj#=-1 tim=14811764748351
WAIT #0: nam='SQL*Net more data from client' ela= 8 driver id=1413697536 #bytes=6873 p3=0 obj#=-1 tim=14811764748486
LOBWRITE: type=TEMPORARY LOB,bytes=28002,c=1512,e=1777,p=0,cr=1,cu=42,tim=14811764748594
WAIT #0: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=14811764748657
WAIT #0: nam='SQL*Net message from client' ela= 2574 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=14811764751255
LOBTMPCREATE: type=TEMPORARY LOB,bytes=1,c=22,e=22,p=0,cr=0,cu=0,tim=14811764751342
Test code
It contains some Spring magic, connection to the database is handled by Spring. I modified fetch size and LOB prefetch size through the connect string, which looks something like this:
jdbc:oracle:thin:@my-dev-env:1521/orclpdb1?defaultRowPrefetch=10&oracle.jdbc.defaultLobPrefetchSize=1000
@Component
@Profile({ProfileNames.LOB2})
public class Lob2 {
@Autowired
DataSource ds;
@Value("${db.checker.lobs.query}")
String query;
@Value("${db.checker.lobs.set_size}")
boolean setSize;
@Value("${db.checker.lobs.naive_write}")
boolean beNaive;
static final String TRACE_ENABLE = "call dbms_monitor.session_trace_enable()";
static final String TRACE_DISABLE = "call dbms_monitor.session_trace_disable()";
static final String TRACE_PAYLOAD = "select payload from V$DIAG_SESS_SQL_TRACE_RECORDS";
static final String WRITEBACK = "insert into json_test(id, payload)"
+ " values(seq_json_test.nextval, :1)";
static final long MAX_LENGTH = 60000;
void callStatement(Connection c, String query) throws SQLException {
Statement stmt = c.createStatement();
stmt.execute(query);
stmt.close();
}
void printTrace(Connection c) throws SQLException {
String dummy;
Statement stmt = c.createStatement();
ResultSet rset = stmt.executeQuery(TRACE_PAYLOAD);
while (rset.next()) {
dummy = rset.getString(1);
System.out.println(dummy);
}
stmt.close();
}
String embiggen(String str) {
StringBuilder sb = new StringBuilder("[");
while (sb.length() < MAX_LENGTH) {
sb.append(str + ",");
}
sb.append("{}]");
return sb.toString();
}
public void doStuff() {
Clob dummy;
long len;
try {
Connection conn = ds.getConnection();
conn.setAutoCommit(false);
callStatement(conn, TRACE_ENABLE);
PreparedStatement readSt = conn.prepareStatement(query);
PreparedStatement writeSt = conn.prepareStatement(WRITEBACK);
//Won't be visible in V$DIAG_SESS_SQL_TRACE_RECORDS
callStatement(conn, "call sys.dbms_system.ksdwrt(1, 'Before executeQuery')");
ResultSet rset = readSt.executeQuery();
callStatement(conn, "call sys.dbms_system.ksdwrt(1, 'Before ResultSet processing')");
while (rset.next()) {
dummy = rset.getClob(2);
len = dummy.length(); // comes from the metadata
if (beNaive) {
writeSt.setObject(1, dummy, OracleType.CLOB);
} else {
String s = embiggen(dummy.getSubString(1, (int) len));
if (setSize) {
writeSt.setCharacterStream(1, new StringReader(s), s.length());
} else {
writeSt.setCharacterStream(1, new StringReader(s));
}
}
writeSt.addBatch();
}
callStatement(conn, "call sys.dbms_system.ksdwrt(1, 'ResultSet done')");
writeSt.executeBatch();
writeSt.close();
readSt.close();
conn.commit();
callStatement(conn, TRACE_DISABLE);
printTrace(conn);
conn.close();
} catch (SQLException e) {
throw new RuntimeException(e);
}
}
}
}