Fix CApi tests on S390x by AlekseiNikiforovIBM · Pull Request #28074 · microsoft/onnxruntime

AlekseiNikiforovIBM · 2026-04-15T08:37:39Z

Description

When loading tensor data from memory buffers that hold the contents of external files, byteswap it if necessary.

Also fix the deleter used when byteswapping: keep a copy of the AllocatorPtr instead of a reference.
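
The deleter fix can be illustrated with a minimal, self-contained sketch. `DemoAllocator`, `DemoAllocatorPtr`, and `MakeOwnedBuffer` are hypothetical names made up for this illustration and are not the onnxruntime types; the only point is that a lambda capturing the `shared_ptr` by value keeps the allocator alive for as long as the buffer exists, whereas capturing by reference could dangle once the original variable goes out of scope.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>

// Hypothetical stand-in for an allocator; NOT the onnxruntime IAllocator.
struct DemoAllocator {
  void* Alloc(std::size_t size) {
    ++live_blocks;
    return std::malloc(size);
  }
  void Free(void* ptr) {
    --live_blocks;
    std::free(ptr);
  }
  int live_blocks = 0;  // number of blocks handed out and not yet freed
};

using DemoAllocatorPtr = std::shared_ptr<DemoAllocator>;

// Returns a buffer whose deleter owns its own copy of the allocator pointer.
inline auto MakeOwnedBuffer(DemoAllocatorPtr allocator, std::size_t size_in_bytes) {
  // Capture by value: the closure stores a shared_ptr copy, so the allocator
  // outlives this function. Capturing by reference ([&allocator]) would leave
  // the deleter pointing at a destroyed local once this function returns.
  auto deleter = [allocator](uint8_t* ptr) { allocator->Free(ptr); };
  return std::unique_ptr<uint8_t[], decltype(deleter)>{
      static_cast<uint8_t*>(allocator->Alloc(size_in_bytes)), deleter};
}
```

With this shape, a caller can drop its own `DemoAllocatorPtr` while the buffer is still alive; the copy stored in the deleter keeps the allocator valid until the buffer is freed.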

Motivation and Context

While trying to set up a local s390x CI, I found 4 more tests that fail on s390x:

CApiTest.TestLoadModelFromArrayWithExternalInitializerFromFileArray
CApiTest.TestLoadModelFromArrayWithExternalInitializersFromFileArray
CApiTest.TestLoadModelFromArrayWithExternalInitializersFromFileArrayPathRobust
CApiTest.TestLoadModelFromArrayWithExternalInitializersFromFileMmap

…rom external files

This change fixes the following tests:

CApiTest.TestLoadModelFromArrayWithExternalInitializerFromFileArray
CApiTest.TestLoadModelFromArrayWithExternalInitializersFromFileArray
CApiTest.TestLoadModelFromArrayWithExternalInitializersFromFileArrayPathRobust
CApiTest.TestLoadModelFromArrayWithExternalInitializersFromFileMmap

tianleiwu

Thanks for fixing the big-endian external-initializer path. I found one blocking correctness issue before this lands: the new conversion buffer size is based on logical element count, which can over-read packed sub-byte tensor payloads. Please reuse the already-computed TensorProto storage byte size for the allocation and spans.

tianleiwu · 2026-04-15T21:23:14Z

+        auto allocator = CPUAllocator::DefaultInstance();
+
+        auto deleter = [allocator](uint8_t* ptr) { allocator->Free(ptr); };
+        std::unique_ptr<uint8_t[], decltype(deleter)> native_data{reinterpret_cast<uint8_t*>(allocator->Alloc(element_size * element_count)), deleter};


This should use the TensorProto storage byte size that was already computed in tensor_byte_size, not element_size * tensor_shape.Size(). For packed sub-byte types such as INT4/UINT4/INT2/UINT2/FLOAT4E2M1, tensor_shape.Size() is the logical element count while the external payload has fewer storage bytes. For example, 3 INT4 values occupy 2 bytes, but this code would allocate/span 3 bytes and read one byte past the provided external initializer buffer on big-endian builds.
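
The arithmetic in the comment above can be sketched as follows. `PackedStorageBytes` and `NaiveBytes` are hypothetical helpers written for this illustration (they are not onnxruntime functions), and the sketch assumes sub-byte elements are packed tightly and rounded up to a whole byte.

```cpp
#include <cstddef>

// Hypothetical helper: bytes needed to store `element_count` values of
// `bits_per_element` bits each, assuming tight packing rounded up to a byte.
constexpr std::size_t PackedStorageBytes(std::size_t element_count,
                                         std::size_t bits_per_element) {
  return (element_count * bits_per_element + 7) / 8;
}

// Hypothetical helper: the size obtained by multiplying a per-element byte
// size with the logical element count, as in the reviewed line.
constexpr std::size_t NaiveBytes(std::size_t element_count,
                                 std::size_t element_size_in_bytes) {
  return element_count * element_size_in_bytes;
}

// 3 INT4 values: 12 bits of payload, stored in 2 bytes.
static_assert(PackedStorageBytes(3, 4) == 2, "3 x 4-bit values fit in 2 bytes");
// Counting one byte per logical element gives 3, one more than is stored.
static_assert(NaiveBytes(3, 1) == 3, "logical count overestimates storage");
```

The one-byte difference between the two results is the over-read described in the review.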

Please size native_data, src_span, and dst_span from tensor_byte_size (or Tensor::CalculateTensorStorageSize) and use utils::GetElementSizeOfTensor(old_initializer.data_type()) only as the byteswap granularity.
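
A rough sketch of that separation, using only the standard library: the total size comes from the storage bytes actually present, and the element size is used only to decide how many bytes to reverse at a time. `CopyWithSwappedByteOrder` is a hypothetical helper for illustration, not the onnxruntime byteswap utility, and leaving a trailing partial group untouched is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: copies `src` and reverses the bytes of every
// `element_size`-byte group. The buffer length is taken from src.size()
// (the storage byte count), never from a logical element count.
inline std::vector<uint8_t> CopyWithSwappedByteOrder(const std::vector<uint8_t>& src,
                                                     std::size_t element_size) {
  std::vector<uint8_t> dst(src);  // destination sized from storage bytes
  if (element_size <= 1) {
    return dst;  // single-byte elements have no byte order to swap
  }
  // Assumption of this sketch: a trailing partial group is left unchanged.
  const std::size_t whole_bytes = dst.size() / element_size * element_size;
  for (std::size_t offset = 0; offset < whole_bytes; offset += element_size) {
    auto first = dst.begin() + static_cast<std::ptrdiff_t>(offset);
    std::reverse(first, first + static_cast<std::ptrdiff_t>(element_size));
  }
  return dst;
}
```

For example, an 8-byte payload holding two 4-byte elements is swapped as two groups of four bytes, and a packed sub-byte payload with byteswap granularity 1 is copied through unchanged without reading past its end.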

AlekseiNikiforovIBM added 2 commits April 15, 2026 08:44

Deleter: capture copy of variable instead of reference

64994e5

AlekseiNikiforovIBM changed the title ~~S390x capi tests~~ Fix CApi tests on S390x Apr 15, 2026

tianleiwu requested changes Apr 15, 2026

AlekseiNikiforovIBM wants to merge 2 commits into microsoft:main from AlekseiNikiforovIBM:s390x_capi_tests
