Fix CApi tests on S390x#28074
Conversation
…rom external files This change fixes following tests: CApiTest.TestLoadModelFromArrayWithExternalInitializerFromFileArray CApiTest.TestLoadModelFromArrayWithExternalInitializersFromFileArray CApiTest.TestLoadModelFromArrayWithExternalInitializersFromFileArrayPathRobust CApiTest.TestLoadModelFromArrayWithExternalInitializersFromFileMmap
tianleiwu
left a comment
There was a problem hiding this comment.
Thanks for fixing the big-endian external-initializer path. I found one blocking correctness issue before this lands: the new conversion buffer size is based on logical element count, which can over-read packed sub-byte tensor payloads. Please reuse the already-computed TensorProto storage byte size for the allocation and spans.
| auto allocator = CPUAllocator::DefaultInstance(); | ||
|
|
||
| auto deleter = [allocator](uint8_t* ptr) { allocator->Free(ptr); }; | ||
| std::unique_ptr<uint8_t[], decltype(deleter)> native_data{reinterpret_cast<uint8_t*>(allocator->Alloc(element_size * element_count)), deleter}; |
There was a problem hiding this comment.
This should use the TensorProto storage byte size that was already computed in tensor_byte_size, not element_size * tensor_shape.Size(). For packed sub-byte types such as INT4/UINT4/INT2/UINT2/FLOAT4E2M1, tensor_shape.Size() is the logical element count while the external payload has fewer storage bytes. For example, 3 INT4 values occupy 2 bytes, but this code would allocate/span 3 bytes and read one byte past the provided external initializer buffer on big-endian builds.
Please size native_data, src_span, and dst_span from tensor_byte_size (or Tensor::CalculateTensorStorageSize) and use utils::GetElementSizeOfTensor(old_initializer.data_type()) only as the byteswap granularity.
Description
When loading data into tensors from memory buffers from external files, byteswap it if necessary.
Also add a fix for deleter when byteswapping: keep copy of AllocatorPtr instead of reference.
Motivation and Context
While trying to setup local s390x CI, I've found 4 more tests that fail on s390x:
CApiTest.TestLoadModelFromArrayWithExternalInitializerFromFileArray
CApiTest.TestLoadModelFromArrayWithExternalInitializersFromFileArray
CApiTest.TestLoadModelFromArrayWithExternalInitializersFromFileArrayPathRobust
CApiTest.TestLoadModelFromArrayWithExternalInitializersFromFileMmap