Skip to content

Implement jdbc-fetch-size Across JDBC Connectors #27536

@Dilli-Babu-Godari

Description

@Dilli-Babu-Godari

Implement jdbc-fetch-size support across all JDBC connectors

Context

PR #25355 introduced a jdbc-fetch-size configuration in BaseJdbcConfig to optimize batch fetching from JDBC sources.

Currently, this is only leveraged by the Oracle connector, while other JDBC connectors either:

Do not use it, or
Implement fetch behavior inconsistently

This leads to missed performance optimizations and inconsistent behavior across connectors.

Objective

Enable consistent and configurable fetch size handling across all JDBC connectors using the shared BaseJdbcConfig.

Expected Behavior or Use Case

Presto Component, Service, or Connector

presto-mysql
presto-postgresql
presto-sqlserver
presto-redshift
presto-clickhouse
presto-db2
presto-snowflake
presto-db2fori
presto-teradata
presto-phoenix
presto-informix

Possible Implementation

Each connector should adopt one of the following strategies based on driver behavior:

  1. Standard JDBC

Use:

Statement.setFetchSize(int)
ResultSet.setFetchSize(int)

  1. Driver-Specific Optimization

Use connector-specific APIs where required

Examples:

Oracle → setDefaultRowPrefetch()
PostgreSQL → connection property defaultRowFetchSize
MySQL → requires useCursorFetch=true

  1. Connection-Level Handling

Wrap ConnectionFactory if needed to enforce behavior globally.

Implementation Guidelines

For each connector:

  1. Validate whether the driver actually respects fetch size
  2. Apply fetch size consistently to:
    Data queries
    Metadata queries
  3. Avoid partial or inconsistent usage
  4. Document any driver-specific quirks

Testing

Functional

Verify correctness with varying fetch sizes (1, 1000, 10000)

Performance

Compare default vs configured fetch size
Use large result sets (10k+ to 100k+ rows)

Edge Cases

Minimum fetch size = 1
Very large fetch sizes
Empty result sets

Example Screenshots (if appropriate):

For Oracle Connector:

jdbc-fetch-size=10;

presto> select * from oracle_small_fetch.TM_LAKEHOUSE_ENGINE.TEST_FETCH;
    ID    |  DATA   
------------------+------------
 1211.0000000000 | data_1211  
 1212.0000000000 | data_1212  
 1213.0000000000 | data_1213  
 1214.0000000000 | data_1214  
 1215.0000000000 | data_1215  
 1216.0000000000 | data_1216  
 1217.0000000000 | data_1217  
 1218.0000000000 | data_1218  
 1219.0000000000 | data_1219  
 1220.0000000000 | data_1220
(query aborted by user)

Query 20260407_144910_00006_m2vrb, RUNNING, 1 node
Splits: 17 total, 0 done (0.00%)
[Latency: client-side: 1:15, server-side: 1:11] [0 rows, 0B] [0 rows/s, 0B/s]

jdbc-fetch-size=1000

presto> select * from oracle_large_fetch.TM_LAKEHOUSE_ENGINE.TEST_FETCH;
    ID    |  DATA   
------------------+------------
 1211.0000000000 | data_1211  
 1212.0000000000 | data_1212  
 1213.0000000000 | data_1213  
 1214.0000000000 | data_1214  
 1215.0000000000 | data_1215  
 1216.0000000000 | data_1216  
 1217.0000000000 | data_1217  
 1218.0000000000 | data_1218  
 1219.0000000000 | data_1219  
 1220.0000000000 | data_1220  
 1221.0000000000 | data_1221  
 1222.0000000000 | data_1222
(query aborted by user)

Query 20260407_144746_00005_m2vrb, RUNNING, 1 node
Splits: 17 total, 0 done (0.00%)
[Latency: client-side: 0:11, server-side: 0:06] [0 rows, 0B] [0 rows/s, 0B/s]

cc @agrawalreetika @faizdani-ibm

Metadata

Metadata

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions