Commit 9dcfaa3

pythongh-149202: Fix frame pointer unwinding on s390x and ARM
-fno-omit-frame-pointer is not enough to make every target walkable by the simple manual frame pointer unwinder.

On s390x, GCC does not emit a usable backchain unless -mbackchain is also enabled. Without it, the unwinder stops at the current C frame and the test reports no Python frames. s390x also keeps the return address in the caller-provided register save area rather than at fp[1], so read it from the architecture frame layout.

On 32-bit ARM, GCC defaults to Thumb mode on common armhf toolchains. The Thumb prologue keeps the saved frame pointer and link register at offsets that depend on the generated frame, which breaks the fp[0]/fp[1] walk used by the helper. Use -marm when it is supported for frame-pointer builds, and teach the helper the GCC ARM-mode saved-fp and saved-lr slots.
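The slot differences described above can be illustrated with a small simulation. This is only a sketch, not the code from this commit: `frame_layout`, `walk_frames`, `build_fake_stack`, and the `LAYOUT_*` constants are hypothetical names, the "stack" is an ordinary array, and a zero link is assumed to terminate the chain (the real helper compares against a minimum address instead). The slot numbers are taken from the commit message and the helper's comments.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of where a frame record keeps its two links,
   in words relative to the frame pointer. */
typedef struct {
    ptrdiff_t next_fp_slot;   /* slot holding the caller's frame pointer */
    ptrdiff_t return_slot;    /* slot holding the return address */
} frame_layout;

static const frame_layout LAYOUT_GENERIC = {0, 1};    /* fp[0], fp[1] */
static const frame_layout LAYOUT_GCC_ARM = {-1, 0};   /* fp[-1], fp[0] */
static const frame_layout LAYOUT_S390X   = {0, 14};   /* backchain, fp[14] */

/* Walk a downward-growing stack. Stores up to max_out return addresses
   in out and returns how many were stored. */
static size_t
walk_frames(const uintptr_t *fp, frame_layout layout,
            uintptr_t *out, size_t max_out)
{
    size_t count = 0;
    while (fp != NULL && count < max_out) {
        out[count++] = fp[layout.return_slot];
        uintptr_t next = fp[layout.next_fp_slot];
        /* Stop on a null, misaligned, or non-ascending link. */
        if (next == 0 || next % sizeof(uintptr_t) != 0
            || next <= (uintptr_t)fp) {
            break;
        }
        fp = (const uintptr_t *)next;
    }
    return count;
}

#define FAKE_STACK_WORDS 64

/* Fill an array with three fake frames using the given layout and
   return the innermost frame pointer. Return addresses are the dummy
   values 0x1000, 0x2000, 0x3000. */
static const uintptr_t *
build_fake_stack(uintptr_t stack[FAKE_STACK_WORDS], frame_layout layout)
{
    static const size_t frames[] = {4, 20, 40};
    for (size_t i = 0; i < FAKE_STACK_WORDS; i++) {
        stack[i] = 0;
    }
    for (size_t i = 0; i < 3; i++) {
        uintptr_t *fp = &stack[frames[i]];
        fp[layout.return_slot] = (uintptr_t)0x1000 * (i + 1);
        fp[layout.next_fp_slot] =
            (i + 1 < 3) ? (uintptr_t)&stack[frames[i + 1]] : 0;
    }
    return &stack[frames[0]];
}
```

Building the fake stack and walking it with the same layout recovers all three dummy return addresses. Walking a stack built with `LAYOUT_GENERIC` using `LAYOUT_GCC_ARM` stops after the first frame in this simulation, which mirrors the early-stop symptom the commit message describes when the unwinder assumes the wrong slots.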
1 parent 3efd2f4 commit 9dcfaa3

7 files changed

Lines changed: 171 additions & 23 deletions

Doc/howto/perf_profiling.rst

Lines changed: 3 additions & 1 deletion
@@ -219,7 +219,9 @@ How to obtain the best results
 
 For best results, keep frame pointers enabled. On supported GCC-compatible
 toolchains, CPython builds itself with ``-fno-omit-frame-pointer`` and, when
-available, ``-mno-omit-leaf-frame-pointer`` by default. These flags allow
+available, ``-mno-omit-leaf-frame-pointer`` by default. On 32-bit ARM,
+CPython also adds ``-marm`` when supported. On s390 platforms, CPython also
+adds ``-mbackchain`` when supported. These flags allow
 profilers to unwind using only the frame pointer and not on DWARF debug
 information. This is because as the code that is interposed to allow ``perf``
 support is dynamically generated it doesn't have any DWARF debugging information

Doc/using/configure.rst

Lines changed: 5 additions & 3 deletions
@@ -784,9 +784,11 @@ also be used to improve performance.
 
 Disable frame pointers, which are enabled by default (see :pep:`831`).
 
-By default, the build appends ``-fno-omit-frame-pointer`` (and
-``-mno-omit-leaf-frame-pointer`` when the compiler supports it) to
-``BASECFLAGS`` so profilers, debuggers, and system tracing tools
+By default, the build appends ``-fno-omit-frame-pointer``,
+``-mno-omit-leaf-frame-pointer`` when the compiler supports it,
+``-marm`` on 32-bit ARM when supported, and ``-mbackchain`` on s390
+platforms when supported, to ``BASECFLAGS`` so
+profilers, debuggers, and system tracing tools
 (``perf``, ``eBPF``, ``dtrace``, ``gdb``) can walk the C call stack
 without DWARF metadata. The flags propagate to third-party C
 extensions through :mod:`sysconfig`. On compilers that do not

Doc/whatsnew/3.15.rst

Lines changed: 3 additions & 2 deletions
@@ -2305,8 +2305,9 @@ Build changes
 (:pep:`831`). Pass :option:`--without-frame-pointers` to opt out.
 Authors of C extensions and native libraries built with custom build
 systems should add ``-fno-omit-frame-pointer`` and
-``-mno-omit-leaf-frame-pointer`` to their own ``CFLAGS`` to keep the
-unwind chain intact.
+``-mno-omit-leaf-frame-pointer`` to their own ``CFLAGS``,
+``-marm`` on 32-bit ARM, and ``-mbackchain`` on s390 platforms,
+to keep the unwind chain intact.
 (Contributed by Pablo Galindo Salgado and Savannah Ostrowski in :gh:`149201`.)
 
 .. _whatsnew315-windows-tail-calling-interpreter:

Lines changed: 3 additions & 2 deletions
@@ -1,4 +1,5 @@
 Enable frame pointers by default for GCC-compatible CPython builds, including
-``-mno-omit-leaf-frame-pointer`` when the compiler supports it, so profilers
-and debuggers can unwind native interpreter frames more reliably. Users can pass
+``-mno-omit-leaf-frame-pointer``, ``-marm`` on 32-bit ARM, and ``-mbackchain``
+on s390 platforms when the compiler supports them, so profilers and debuggers
+can unwind native interpreter frames more reliably. Users can pass
 ``--without-frame-pointers`` to opt out.

Modules/_testinternalcapi.c

Lines changed: 47 additions & 15 deletions
@@ -325,6 +325,49 @@ get_jit_backend(PyObject *self, PyObject *Py_UNUSED(args))
 #endif
 }
 
+static int
+next_frame_pointer_is_valid(uintptr_t *frame_pointer, uintptr_t *next_fp,
+                            int stack_grows_down)
+{
+    uintptr_t fp_addr = (uintptr_t)frame_pointer;
+    uintptr_t next_addr = (uintptr_t)next_fp;
+    if (next_addr < min_frame_pointer_addr) {
+        return 0;
+    }
+    if ((next_addr % sizeof(uintptr_t)) != 0) {
+        return 0;
+    }
+    if (stack_grows_down) {
+        return next_addr > fp_addr;
+    }
+    return next_addr < fp_addr;
+}
+
+static uintptr_t *
+next_frame_pointer(uintptr_t *frame_pointer)
+{
+#if defined(__arm__) && !defined(__thumb__) && !defined(__clang__)
+    // GCC ARM mode keeps the saved LR at fp[0] and the caller's fp at fp[-1].
+    return (uintptr_t *)frame_pointer[-1];
+#else
+    return (uintptr_t *)frame_pointer[0];
+#endif
+}
+
+static uintptr_t
+frame_return_address(uintptr_t *frame_pointer)
+{
+#ifdef __s390x__
+    // zSeries stores the return address in the caller-provided register save
+    // area, 14 words above the frame address.
+    return *(uintptr_t *)((char *)frame_pointer + 14 * sizeof(uintptr_t));
+#elif defined(__arm__) && !defined(__thumb__) && !defined(__clang__)
+    return frame_pointer[0];
+#else
+    return frame_pointer[1];
+#endif
+}
+
 static PyObject *
 manual_unwind_from_fp(uintptr_t *frame_pointer)
 {
@@ -348,7 +391,8 @@ manual_unwind_from_fp(uintptr_t *frame_pointer)
         if ((fp_addr % sizeof(uintptr_t)) != 0) {
             break;
         }
-        uintptr_t return_addr = frame_pointer[1];
+        uintptr_t *next_fp = next_frame_pointer(frame_pointer);
+        uintptr_t return_addr = frame_return_address(frame_pointer);
 
         PyObject *addr_obj = PyLong_FromUnsignedLongLong(return_addr);
         if (addr_obj == NULL) {
@@ -362,22 +406,10 @@ manual_unwind_from_fp(uintptr_t *frame_pointer)
         }
         Py_DECREF(addr_obj);
 
-        uintptr_t *next_fp = (uintptr_t *)frame_pointer[0];
-        // Stop if the frame pointer is extremely low.
-        if ((uintptr_t)next_fp < min_frame_pointer_addr) {
+        if (!next_frame_pointer_is_valid(frame_pointer, next_fp,
+                                         stack_grows_down)) {
             break;
         }
-        uintptr_t next_addr = (uintptr_t)next_fp;
-        if (stack_grows_down) {
-            if (next_addr <= fp_addr) {
-                break;
-            }
-        }
-        else {
-            if (next_addr >= fp_addr) {
-                break;
-            }
-        }
         frame_pointer = next_fp;
     }
 

configure

Lines changed: 100 additions & 0 deletions
Some generated files are not rendered by default.

configure.ac

Lines changed: 10 additions & 0 deletions
@@ -2548,6 +2548,16 @@ AS_VAR_IF([ac_cv_gcc_compat], [yes], [
     AX_CHECK_COMPILE_FLAG([-mno-omit-leaf-frame-pointer], [
       frame_pointer_cflags="$frame_pointer_cflags -mno-omit-leaf-frame-pointer"
     ], [], [-Werror])
+    AS_CASE([$host_cpu], [arm|armv*], [
+      AX_CHECK_COMPILE_FLAG([-marm], [
+        frame_pointer_cflags="$frame_pointer_cflags -marm"
+      ], [], [-Werror])
+    ])
+    AS_CASE([$host_cpu], [s390*], [
+      AX_CHECK_COMPILE_FLAG([-mbackchain], [
+        frame_pointer_cflags="$frame_pointer_cflags -mbackchain"
+      ], [], [-Werror])
+    ])
   ], [], [-Werror])
   if test -n "$frame_pointer_cflags" && test "x$with_frame_pointers" != xno; then
     BASECFLAGS="$frame_pointer_cflags $BASECFLAGS"
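To check which unwinder path a given set of compiler flags selects, the preprocessor conditions used by the helper in `Modules/_testinternalcapi.c` can be mirrored in a small probe. This is a sketch only: `frame_layout_name` is a hypothetical function that is not part of the commit, and the separate Thumb branch is added here for illustration (the helper itself falls through to the generic case for Thumb code).

```c
/* Hypothetical probe: report which frame layout the preprocessor
   conditions from the helper would select for this compilation. */
static const char *
frame_layout_name(void)
{
#if defined(__s390x__)
    return "s390x: next fp at fp[0], return address at fp[14]";
#elif defined(__arm__) && !defined(__thumb__) && !defined(__clang__)
    return "GCC ARM mode: next fp at fp[-1], return address at fp[0]";
#elif defined(__arm__) && defined(__thumb__)
    /* Illustration only: the helper treats this as the generic case. */
    return "ARM Thumb: generic fp[0]/fp[1] walk, slots may not match";
#else
    return "generic: next fp at fp[0], return address at fp[1]";
#endif
}
```

Printing the result, for example with `puts(frame_layout_name())`, once with and once without `-marm` on a 32-bit ARM GCC toolchain should show whether the flag moved the build out of Thumb mode. No predefined macro is assumed here for `-mbackchain`, so this probe cannot confirm that flag.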
