Fix Linux madvise() advice values incorrectly combined via bitwise OR in GCToOSInterface::VirtualReset#126966
Fix Linux madvise() advice values incorrectly combined via bitwise OR in GCToOSInterface::VirtualReset#126966
Conversation
…bitwise OR Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/0cc0572d-b79f-46a0-bae2-6159c6d9ca19 Co-authored-by: janvorli <10758568+janvorli@users.noreply.github.com>
…bitwise OR Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/0cc0572d-b79f-46a0-bae2-6159c6d9ca19 Co-authored-by: janvorli <10758568+janvorli@users.noreply.github.com>
Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/0cc0572d-b79f-46a0-bae2-6159c6d9ca19 Co-authored-by: janvorli <10758568+janvorli@users.noreply.github.com>
|
Tagging subscribers to this area: @agocke |
…is not available Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/d656ac8b-2ae6-4625-a13b-ae63c5075c53 Co-authored-by: janvorli <10758568+janvorli@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Fixes Linux GCToOSInterface::VirtualReset to avoid passing an invalid/incorrectly-combined advice value to madvise(), ensuring the GC’s “reset pages” hint behaves correctly across kernel versions.
Changes:
- Replace the single
madvise()call (with OR-combined advice values) with separate calls forMADV_DONTDUMPand the reclaim hint (MADV_FREEor POSIX fallback). - Restructure the POSIX fallback as a proper
#elifso it is only used when theMADV_FREEpath is unavailable.
| #ifdef HAVE_MADV_FREE | ||
| // Tell the kernel that the application doesn't need the pages in the range. | ||
| // Freeing the pages can be delayed until a memory pressure occurs. | ||
| madviseFlags |= MADV_FREE; | ||
| #endif | ||
|
|
||
| st = madvise(address, size, madviseFlags); | ||
|
|
||
| #endif // defined(MADV_DONTDUMP) || defined(HAVE_MADV_FREE) | ||
|
|
||
| #if defined(HAVE_POSIX_MADVISE) && !defined(MADV_DONTDUMP) | ||
| st = madvise(address, size, MADV_FREE); | ||
| #elif defined(HAVE_POSIX_MADVISE) | ||
| // DONTNEED is the nearest posix equivalent of FREE. |
There was a problem hiding this comment.
HAVE_MADV_FREE doesn't appear to be defined by the GC unix CMake configure step (it isn't in config.gc.h.in, and gc/unix/configure.cmake doesn't check for MADV_FREE). As a result, the MADV_FREE path may be compiled out on all platforms, and this change may not actually exercise the intended madvise(..., MADV_FREE) behavior. Consider either (1) keying this branch off MADV_FREE directly, or (2) adding a configure check and #cmakedefine01 HAVE_MADV_FREE entry so the macro is reliably set when available.
There was a problem hiding this comment.
Yes, apparently this was never defined :-(. @copilot can you please replace HAVE_MADV_FREE by MADV_FREE?
Description
VirtualResetwas combiningMADV_FREE(8) andMADV_DONTDUMP(16) via bitwise OR and passing the result (24) to a singlemadvise()call.madvise()takes a single advice constant — not a bitmask — so this was either rejected asEINVAL(kernels < 5.18) or silently matchedMADV_DONTNEED_LOCKED(kernels ≥ 5.18), a completely different operation.Fix: Issue two separate
madvise()calls, and useposix_madvise(POSIX_MADV_DONTNEED)as a proper fallback only whenMADV_FREEis not available:The
posix_madvise(POSIX_MADV_DONTNEED)path is now an#elifchained to#ifdef HAVE_MADV_FREE, making it a proper fallback that only executes whenMADV_FREEis unavailable. The previous#if defined(HAVE_POSIX_MADVISE) && !defined(MADV_DONTDUMP)condition was semantically unrelated to that fallback decision. This matches the pattern already used byVirtualReserveInnerandVirtualCommitInnerin the same file. Introduced by #95643.Customer Impact
On all Linux builds where both
MADV_DONTDUMPandMADV_FREEare defined (i.e., all modern glibc/x86-64 Linux):MADV_DONTDUMPis never applied (reset memory appears in coredumps).MADV_DONTNEED_LOCKEDis invoked instead — immediately discarding pages including locked pages, which is semantically incorrect and potentially dangerous.Regression
Yes — introduced in .NET 9 by #95643 (an optimization to reduce syscall count).
Testing
The bug is a straightforward misuse of a syscall interface. The fix has been manually verified against the Linux
madvise(2)man page and kernel headers. No existing automated test infrastructure coversmadvise()advice values directly.Risk
Low. The change is a minimal, targeted fix: replace one combined
madvise()call with two separate calls, and restructure theposix_madvisefallback as a proper#elif. No logic changes, no new APIs, no structural changes.Package authoring no longer needed in .NET 9
IMPORTANT: Starting with .NET 9, you no longer need to edit a NuGet package's csproj to enable building and bump the version.
Keep in mind that we still need package authoring in .NET 8 and older versions.