Skip to content

UPSTREAM PR #27111: feat: enable generic access to message full names#160

Open
loci-dev wants to merge 1 commit intomainfrom
loci/pr-27111-yordis-rust-fullname
Open

UPSTREAM PR #27111: feat: enable generic access to message full names#160
loci-dev wants to merge 1 commit intomainfrom
loci/pr-27111-yordis-rust-fullname

Conversation

@loci-dev
Copy link
Copy Markdown

Note

Source pull request: protocolbuffers/protobuf#27111

  • Generic Rust code needs a stable way to refer to protobuf message full names without relying on unstable descriptor or reflection APIs.
  • Message full names are known at code generation time, so exposing them as compile-time metadata avoids adding runtime lookup costs or kernel-specific reflection behavior.
  • Keeping the API narrow and metadata-specific makes the new surface easier to reason about while covering both direct and generic Rust use cases.

Signed-off-by: Yordis Prieto <yordis.prieto@gmail.com>
@loci-review
Copy link
Copy Markdown

loci-review Bot commented Apr 27, 2026

Overview

Analysis of commit 7186ebc ("feat: enable generic access to message full names") shows net positive performance impact across 10,161 total functions (36 modified, 0 new, 0 removed, 10,125 unchanged).

Binary: build.protoc-stable
Power Consumption: 588,875.69 nJ → 588,683.55 nJ (-0.033%)

Function Analysis

MpRepeatedVarintT (TcParser parsing hot path):

  • Response time: 42,559,584 ns → 10,600,380 ns (-75.09%, -32.0 ms)
  • Throughput time: 448.84 ns → 445.01 ns (-0.85%)
  • Impact: Dramatic improvement in performance-critical parsing function. The function's own logic remained unchanged (stable 445ns throughput); the 32ms reduction stems entirely from optimized arena allocator paths (Grow → AllocateAlignedFallback → GetSerialArenaSlow → AllocateBlock). Benefits all repeated field parsing operations.

AllocateOptionsImpl (descriptor building):

  • Response time: 14,911,770 ns → 9,907,052 ns (-33.56%, -5.0 ms)
  • Throughput time: 689.87 ns → 556.77 ns (-19.29%)
  • Impact: Compiler introduced unconditional early exit path optimizing the common case (descriptors without options). Reduces protoc compilation time for enum-heavy proto files.

ValidateFeatureSupport (compile-time validation):

  • Throughput time: 201.24 ns → 121.80 ns (-39.48%)
  • Response time: 631.67 ns → 534.23 ns (-15.43%)
  • Impact: Fast-path optimization bypasses stack canary verification for successful validations, reducing success path from ~10 to ~6 basic blocks.

Code generation functions (default_value, GenerateMemberConstexprConstructor):

  • Throughput improvements: 19-24% (64-86ns reductions)
  • Response times dominated by I/O (minimal change)
  • Impact: Stack canary refactoring and optimized logging infrastructure improve code generation efficiency.

Minor regressions in non-critical paths: call_once (+16.3% throughput, +48ns), InvokeObject (+58.4% throughput, +24ns), and security hardening overhead (+20-32ns) in FindAllFileNames and GetLocationPath. All acceptable trade-offs for initialization/error paths.

Flame Graph Comparison

MpRepeatedVarintT — illustrates the 75% response time improvement driven by arena allocator optimization:

Base version:
Flame Graph: build.protoc-stable::_ZN6google8protobuf8internal8TcParser17MpRepeatedVarintTILb0EjLt1024EEEPKcPNS0_11MessageLiteES5_PNS1_12ParseContextENS1_11TcFieldDataEPKNS1_16TcParseTableBaseEm

Target version:
Flame Graph: build.protoc-stable::_ZN6google8protobuf8internal8TcParser17MpRepeatedVarintTILb0EjLt1024EEEPKcPNS0_11MessageLiteES5_PNS1_12ParseContextENS1_11TcFieldDataEPKNS1_16TcParseTableBaseEm

The base version shows significantly deeper call stacks through memory allocation paths (42.6ms total), while the target version exhibits more efficient allocation patterns with reduced overhead in GetSerialArenaSlow and related functions (10.6ms total).

Additional Findings

No source code changes were detected in the analyzed functions. Performance improvements stem from compiler optimizations enabled by type consistency improvements in the commit. The 75% improvement in MpRepeatedVarintT—a TcParser hot path function—provides the most significant benefit to protobuf's performance-critical parsing infrastructure.

💬 Questions? Tag @loci-dev

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants