Commit 0e2ea06

VULKAN: Decline SW framebuffer when linear image has padded rowPitch
When the linear-tiled VkImage backing the per-frame swapchain texture ends up with a row pitch wider than width*bpp, vulkan_get_current_sw_framebuffer now returns false instead of handing the core a directly mapped pointer. The core falls back to its own tightly packed buffer, which vulkan_frame() already uploads row-by-row via the existing slow path.

Why
---
RetroArch's pattern (host-write into a HOST_VISIBLE linear-tiled image at vkGetImageSubresourceLayout's rowPitch, transition PREINITIALIZED -> GENERAL, sample) is fully spec-correct and works on every conformant Vulkan driver. On MoltenVK it doesn't always work. Linear images are backed by a buffer-backed MTLTexture, and Apple requires bytesPerRow alignment of 64 bytes on Apple GPUs (256 on the simulator). For widths whose tight pitch isn't already aligned (e.g. the 2048 core at 376x444 XRGB8888, where 376*4 = 1504 gets padded up to 1536), the host writes and GPU sampling go through different paths in MoltenVK's MVKImage and produce a diagonal shear.

The check
---
A pure runtime test: 'is rowPitch wider than width*bpp?'. On Mesa, NVIDIA, AMD, ARM Mali, Qualcomm Adreno etc., linear images with sampled+transfer_src usage at retro-friendly widths report rowPitch == width*bpp, so the test is false and the existing direct-write fast path is taken unchanged. Only MoltenVK at awkward widths takes the fallback, paying one extra row-by-row memcpy per frame (~40 MB/s at 60 fps for 376x444x4 - negligible).

Fixes 2048 core rendering on iOS Vulkan.
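The arithmetic in the message can be checked with a small sketch. The helper names below (align_up, row_pitch_is_padded, upload_bytes_per_second) are hypothetical and are not part of RetroArch, Vulkan, or MoltenVK; the 64-byte and 256-byte alignments are simply the values quoted above, and the rounding rule is an assumption about how a driver might pad a row.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: round a tight pitch up to a power-of-two
 * alignment, as a driver might do when it pads each row. */
static size_t align_up(size_t value, size_t alignment)
{
   /* The bit trick below is only valid for power-of-two alignments. */
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Same comparison as the commit's test: true when the reported row
 * pitch differs from the tightly packed width*bpp. */
static bool row_pitch_is_padded(size_t row_pitch, unsigned width, unsigned bpp)
{
   return row_pitch != (size_t)width * bpp;
}

/* Bytes copied per second by one extra tightly packed upload per frame. */
static size_t upload_bytes_per_second(unsigned width, unsigned height,
      unsigned bpp, unsigned fps)
{
   return (size_t)width * height * bpp * fps;
}
```

With these helpers, align_up(376 * 4, 64) gives 1536, so row_pitch_is_padded(1536, 376, 4) is true and the fallback would be taken, while a 320-pixel-wide XRGB8888 frame has a tight pitch of 1280 that is already 64-byte aligned. upload_bytes_per_second(376, 444, 4, 60) gives 40066560, which matches the ~40 MB/s figure.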
1 parent 6a9b02d commit 0e2ea06

1 file changed

Lines changed: 20 additions & 0 deletions

gfx/drivers/vulkan.c

@@ -7340,6 +7340,26 @@ static bool vulkan_get_current_sw_framebuffer(void *data,
       }
    }
 
+   /* If the driver picked a row pitch wider than width*bpp for this linear
+    * image (i.e. the image has trailing per-row padding), decline the
+    * direct-write SW framebuffer and let the core fall back to its own
+    * tightly packed buffer. vulkan_frame() will then upload it row-by-row
+    * via the slow path. We hit this on MoltenVK / Apple GPUs, where
+    * buffer-backed MTLTextures require bytesPerRow to be aligned to 64 (or
+    * 256 in the simulator), so vkGetImageSubresourceLayout reports a
+    * padded rowPitch for "awkward" widths. Spec-correct host writes at the
+    * reported rowPitch *should* be readable by the GPU sampler at the same
+    * stride on any conformant driver, but in practice this is fragile on
+    * Apple platforms and produces sheared output. The check is a pure
+    * runtime test, so non-Apple drivers that report rowPitch == width*bpp
+    * (the overwhelming majority for retro-friendly widths) keep the
+    * direct-write fast path unchanged. */
+   {
+      unsigned bpp = vulkan_format_to_bpp(chain->texture.format);
+      if (chain->texture.stride != (size_t)framebuffer->width * bpp)
+         return false;
+   }
+
    framebuffer->data = chain->texture.mapped;
    framebuffer->pitch = chain->texture.stride;
    framebuffer->format = vk->video.rgb32
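
For readers unfamiliar with the row-by-row slow path that the comment refers to, the idea can be sketched as follows. This is a minimal illustration, not RetroArch's actual upload code in vulkan_frame(); the function name copy_rows and its parameters are hypothetical. It copies a tightly packed source into a destination whose pitch may include trailing padding.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy `height` rows of `width * bpp` bytes from a
 * source with pitch `src_pitch` to a destination with pitch `dst_pitch`.
 * Padding bytes at the end of each destination row are left untouched. */
static void copy_rows(uint8_t *dst, size_t dst_pitch,
      const uint8_t *src, size_t src_pitch,
      unsigned width, unsigned height, unsigned bpp)
{
   size_t row_bytes = (size_t)width * bpp;
   unsigned y;

   /* Both sides tightly packed: a single memcpy covers the whole frame. */
   if (dst_pitch == row_bytes && src_pitch == row_bytes)
   {
      memcpy(dst, src, row_bytes * height);
      return;
   }

   /* Otherwise step each side by its own pitch, one row at a time. */
   for (y = 0; y < height; y++)
      memcpy(dst + (size_t)y * dst_pitch,
            src + (size_t)y * src_pitch, row_bytes);
}
```

For the 376x444 XRGB8888 case this would mean 444 copies of 1504 bytes each into rows spaced 1536 bytes apart.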
