enhance: [ExternalTable Part8 - 1/4] cross-bucket, schemaless reader, force nullable #49061
weiliu1031 wants to merge 1 commit into milvus-io:master
Codecov Report
❌ Your patch check has failed because the patch coverage (60.38%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #49061      +/-   ##
==========================================
- Coverage   78.11%   78.02%   -0.09%
==========================================
  Files        2169     2170       +1
  Lines      358449   358360      -89
==========================================
- Hits       279990   279598     -392
- Misses      69879    70168     +289
- Partials     8580     8594      +14
Force-pushed from 4392585 to 5d3478a.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: tedxu, weiliu1031
Force-pushed from 5d3478a to f04e8c0.
Force-pushed from 3f54299 to d05aa88.
/ci-rerun-build
/ci-rerun-ut-go
/ci-rerun-ut-cpp
/ci-rerun-ut-integration
/ci-rerun-e2e-default
/ci-rerun-ut-go
/ci-rerun-ut-cpp
/ci-rerun-integration
/ci-rerun-e2e-default
/ci-rerun-e2e-default
/ci-rerun-ciloop
/ci-rerun-e2e-default
/ci-rerun-e2e
/ci-rerun-e2e-default
/ci-rerun-go-sdk
/ci-rerun-e2e-default
/ci-rerun-go-sdk
/ci-rerun-go-sdk
/ci-rerun-e2e-default
/ci-rerun-e2e-default
Force-pushed from 33e651f to 0d9fdcd.
… force nullable

This is part of a 4-PR series delivering Part 8 of the external table feature:

  1/4 cross-bucket, schemaless reader, force nullable
  2/4 control plane: DDL validation, refresh infrastructure
  3/4 Iceberg external table support
  4/4 segment load: reduce overhead, memory estimation

- Cross-bucket external data source support: detect same-bucket prefix and treat non-matching paths as cross-bucket, routed via extfs.
- Schemaless reader: external collections open the Reader with a nullptr arrow schema plus an explicit needed_columns projection, instead of the full pre-built arrow schema.
- Force all external table user fields to nullable=true so the Arrow buffers emitted by external sources can be consumed uniformly, and populate valid_data accordingly in ArrowToDataArray.
- Replace external_access_mode string with use_take_for_output bool.
- Skip redundant S3 data pull during segment load.
- Introduce pkg/util/externalspec shared parsing helper and paramtable entries for external collection behavior.
- Fix CI compilation and test issues.

This commit intentionally does NOT include the refresh pipeline work (manager/checker/inspector/task, schema DDL updater, refresh task update, and e2e refresh tests); those land in the follow-up commit so each commit stays on a single theme.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
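The cross-bucket rule from the commit message (paths under the same-bucket prefix stay on the regular path, everything else is treated as cross-bucket and routed via extfs) might look roughly like the sketch below. This is an illustration only, not the code in this PR: isCrossBucket, route, the s3:// URI shape, the bucket names, and the "local-bucket" label are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// isCrossBucket reports whether path lies outside the bucket prefix that the
// cluster itself uses. Hypothetical helper; the real detection may differ.
func isCrossBucket(path, sameBucketPrefix string) bool {
	// Normalize to exactly one trailing slash so that "s3://bucket" and
	// "s3://bucket/" behave the same, and "s3://bucket-other/..." is not
	// mistaken for a path inside "s3://bucket".
	prefix := strings.TrimSuffix(sameBucketPrefix, "/") + "/"
	return !strings.HasPrefix(path, prefix)
}

// route returns an illustrative label for the filesystem that would serve path.
func route(path, sameBucketPrefix string) string {
	if isCrossBucket(path, sameBucketPrefix) {
		return "extfs" // non-matching paths are treated as cross-bucket
	}
	return "local-bucket" // assumed name for the same-bucket path
}

func main() {
	prefix := "s3://milvus-bucket" // assumed bucket name
	for _, p := range []string{
		"s3://milvus-bucket/files/a.parquet",
		"s3://milvus-bucket-other/a.parquet",
		"s3://lakehouse/tbl/part-0.parquet",
	} {
		fmt.Printf("%s -> %s\n", p, route(p, prefix))
	}
}
```

Normalizing the prefix to end in a slash is the one design point worth noting: a bare string-prefix check would classify a sibling bucket whose name merely starts with the same characters as same-bucket.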
Force-pushed from 0d9fdcd to 4749d90.
issue: #45881
Summary
Part 1/4 of the External Table Part 8 series. Foundation layer for external collections — delivers data-plane primitives that Parts 2–4 build on.
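- Cross-bucket external data sources: paths that do not match the same-bucket prefix are treated as cross-bucket and routed via extfs.
- Schemaless reader: external collections open the Reader with a nullptr arrow schema + explicit needed_columns projection instead of the full pre-built arrow schema.
- Force all external table user fields to nullable=true so Arrow buffers from external sources are consumed uniformly (valid_data populated in ArrowToDataArray).
- Replace the external_access_mode string with the use_take_for_output bool.
- New pkg/util/externalspec parsing package + paramtable entries (100% test coverage).

Related to #45881 (External Collection with Lakehouse Integration tracking).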
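To make the force-nullable item concrete, here is a small sketch of the two halves of the idea: marking every user field nullable, and expanding an Arrow-style validity bitmap into per-row validity flags. It is not the PR's ArrowToDataArray code (which lives on the C++ side); the Field type, its System flag, and the forceNullable and validData helpers are hypothetical names for illustration. The only external fact relied on is that Arrow validity bitmaps are least-significant-bit first, with a set bit meaning the row is valid, and may be absent when there are no nulls.

```go
package main

import "fmt"

// Field is a hypothetical, simplified stand-in for a collection field schema.
type Field struct {
	Name     string
	Nullable bool
	System   bool // assumed marker for non-user (internal) fields
}

// forceNullable marks every user field nullable and leaves system fields alone.
func forceNullable(fields []Field) {
	for i := range fields {
		if !fields[i].System {
			fields[i].Nullable = true
		}
	}
}

// validData expands an Arrow-style validity bitmap (LSB-first, 1 = valid)
// into one bool per row. A nil bitmap means the source reported no nulls,
// so every row is valid.
func validData(bitmap []byte, rows int) []bool {
	valid := make([]bool, rows)
	for i := range valid {
		valid[i] = bitmap == nil || bitmap[i/8]&(1<<(i%8)) != 0
	}
	return valid
}

func main() {
	fields := []Field{{Name: "title"}, {Name: "price"}, {Name: "row_id", System: true}}
	forceNullable(fields)
	fmt.Println(fields)

	// 0x05 is binary 00000101: rows 0 and 2 are valid, row 1 is null.
	fmt.Println(validData([]byte{0x05}, 3))
	fmt.Println(validData(nil, 3))
}
```

Because every user field is nullable, the consumer can always produce validity flags, whether or not the external source actually shipped a bitmap; that is the "consumed uniformly" property the summary refers to.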
Why split this way
This PR intentionally excludes the refresh pipeline (manager/checker/inspector/task, schema DDL updater, refresh task update, e2e refresh tests) — those land in Part 2/4. Each PR stays on a single theme for reviewability.
Test plan