perf: eliminate closure allocations in evaluator hot paths by He-Pin · Pull Request #775 · databricks/sjsonnet

He-Pin · 2026-04-12T20:43:57Z

Summary

Replace .map()/.filter()/.foreach() calls with explicit while loops in the evaluator to eliminate per-call closure allocations on hot paths.

Note: Array.map already creates the target array directly (no "intermediate" array). The saved allocation is the closure/lambda passed to these methods, plus the method call overhead of the Scala collections layer.
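To make the note above concrete, here is a minimal sketch of the before/after shape. The types `Expr`, `Scope`, and `Lazy`, and the two-argument `visitAsLazy`, are hypothetical stand-ins and not sjsonnet's real definitions. Both versions build exactly one result array; the difference is the function object that the `.map` version passes through the collections layer (typically allocated per call, because the lambda captures `scope`).

```scala
object MapToWhileSketch {
  // Hypothetical stand-ins for illustration; NOT sjsonnet's real types.
  final case class Expr(n: Int)
  final class Scope(val offset: Int)
  final class Lazy(e: Expr, s: Scope) { lazy val force: Int = e.n + s.offset }

  def visitAsLazy(e: Expr, scope: Scope): Lazy = new Lazy(e, scope)

  // Shared result for empty inputs (the "empty array shortcut").
  private val emptyLazy = new Array[Lazy](0)

  // Before: the lambda captures `scope`, so a function object is
  // typically created on each call and handed to ArrayOps.map.
  def viaMap(exprs: Array[Expr], scope: Scope): Array[Lazy] =
    exprs.map(e => visitAsLazy(e, scope))

  // After: explicit loop, no function object, no allocation for empty input.
  def viaWhile(exprs: Array[Expr], scope: Scope): Array[Lazy] = {
    val n = exprs.length
    if (n == 0) emptyLazy
    else {
      val out = new Array[Lazy](n)
      var i = 0
      while (i < n) {
        out(i) = visitAsLazy(exprs(i), scope)
        i += 1
      }
      out
    }
  }
}
```

Whether the JVM actually removes the closure allocation in the `.map` version depends on inlining and escape analysis, so the sketch shows the shape of the change rather than a guaranteed saving.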

Changes:

- visitArr: replace .map(visitAsLazy) with while loop + empty array shortcut
- visitApply: extract evalArgsToArray helper to replace 2× .map calls (tailstrict and non-tailstrict paths)
- visitExprWithTailCallSupport: reuse evalArgsToArray for tail-position Apply
- visitImportBin: replace .map with while loop for raw bytes → Array[Eval] conversion
- visitComp IfSpec: replace .filter with manual ArrayBuilder while loop
- visitMemberList: replace fields.foreach with while loop (eliminates per-object construction closure)
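The `.filter` replacement in the list above could look roughly like the following sketch (Scala 2.13+/3). `Scope`, `cond`, and `limit` are hypothetical names introduced only for illustration; the real IfSpec code evaluates a Jsonnet condition rather than an integer comparison.

```scala
import scala.collection.mutable.ArrayBuilder

object FilterToWhileSketch {
  // Hypothetical stand-in; NOT sjsonnet's real scope type.
  final class Scope(val x: Int)

  // Hypothetical predicate standing in for the `if` condition of a comprehension.
  def cond(s: Scope, limit: Int): Boolean = s.x < limit

  // Before: the predicate lambda captures `limit`.
  def viaFilter(scopes: Array[Scope], limit: Int): Array[Scope] =
    scopes.filter(s => cond(s, limit))

  // After: manual loop that appends matching elements to an ArrayBuilder.
  def viaWhile(scopes: Array[Scope], limit: Int): Array[Scope] = {
    val b = ArrayBuilder.make[Scope]
    var i = 0
    while (i < scopes.length) {
      val s = scopes(i)
      if (cond(s, limit)) b += s
      i += 1
    }
    b.result()
  }
}
```

The builder still allocates its backing storage, so the saving here is the predicate closure and the collections-layer call, not the result array.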

Benchmark Results

Hyperfine (20 runs, JVM, M4 Max macOS):

Benchmark	Before	After	Δ (improvement)
realistic2	418.9ms	397.5ms	+5.1%
bench.02	336.5ms	322.8ms	+4.1%
realistic1	292.5ms	287.8ms	+1.6%
bench.08	250.0ms	249.4ms	flat

Test plan

./mill 'sjsonnet.jvm[3.3.7]'.test — all 4 test suites pass
./mill __.checkFormat — formatting clean

Notes

Low-risk, localized changes (1 file, 70 insertions, 37 deletions). Each change replaces a .map/.filter/.foreach call with an equivalent while loop (visitApply replaces two .map calls through the shared helper). This is the same pattern already used in visitLocalExpr, visitApplyBuiltin, and several stdlib functions (std.map, std.filter, SetModule, etc.).

The evalArgsToArray helper is reused by both visitApply and visitExprWithTailCallSupport, keeping the code DRY.
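As a rough illustration of how one helper can serve both call sites, here is a sketch under stated assumptions. Only the name `evalArgsToArray` comes from this PR; its signature, the `Expr`/`Scope`/`Eval` types, and the two caller bodies are hypothetical and much simpler than the real evaluator.

```scala
object EvalArgsSketch {
  // Hypothetical stand-ins; NOT sjsonnet's real types or signatures.
  final case class Expr(n: Int)
  final class Scope(val offset: Int)
  trait Eval { def value: Int }
  final class LazyEval(e: Expr, s: Scope) extends Eval {
    lazy val value: Int = e.n + s.offset
  }

  private val noArgs = new Array[Eval](0)

  // Shared helper: one while loop instead of a .map call at each call site.
  def evalArgsToArray(args: Array[Expr], scope: Scope): Array[Eval] = {
    val n = args.length
    if (n == 0) noArgs
    else {
      val out = new Array[Eval](n)
      var i = 0
      while (i < n) {
        out(i) = new LazyEval(args(i), scope)
        i += 1
      }
      out
    }
  }

  // Stand-in for applying a function to already-wrapped arguments.
  private def sum(args: Array[Eval]): Int = {
    var total = 0
    var i = 0
    while (i < args.length) { total += args(i).value; i += 1 }
    total
  }

  // Two hypothetical call sites that reuse the helper.
  def visitApply(args: Array[Expr], scope: Scope): Int =
    sum(evalArgsToArray(args, scope))

  def visitTailApply(args: Array[Expr], scope: Scope): Int =
    sum(evalArgsToArray(args, scope))
}
```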

Why the JIT doesn't fully optimize these away: Array.map goes through scala.collection.ArrayOps (via an implicit conversion), which adds an indirection layer. Lambda bodies with pattern matching (e.g., fields.foreach { case ... }) also produce larger bytecode that C2 may decide not to inline.
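The pattern-matching case can be sketched as follows. `Member`, `Field`, and `Assertion` are hypothetical types, not sjsonnet's AST. In the `foreach` version the `case` clauses are compiled into a separate function body that is invoked once per element through the collections layer; in the `while` version the match stays inside the enclosing method.

```scala
import scala.collection.mutable

object ForeachToWhileSketch {
  // Hypothetical member types for illustration only.
  sealed trait Member
  final case class Field(name: String, value: Int) extends Member
  final case class Assertion(message: String) extends Member

  // Before: the pattern-matching lambda captures `out` and `bias`.
  def viaForeach(members: Array[Member], bias: Int): mutable.LinkedHashMap[String, Int] = {
    val out = mutable.LinkedHashMap.empty[String, Int]
    members.foreach {
      case Field(name, value) => out.update(name, value + bias)
      case _                  => ()
    }
    out
  }

  // After: the same match, written inline in an explicit loop.
  def viaWhile(members: Array[Member], bias: Int): mutable.LinkedHashMap[String, Int] = {
    val out = mutable.LinkedHashMap.empty[String, Int]
    var i = 0
    while (i < members.length) {
      members(i) match {
        case Field(name, value) => out.update(name, value + bias)
        case _                  => ()
      }
      i += 1
    }
    out
  }
}
```

Whether C2 inlines the lambda in the first version depends on the JVM version, its inlining thresholds, and the call-site profile, so any difference between the two should be confirmed with measurements such as the ones reported above.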

Replace .map() and .filter() calls with explicit while loops in the evaluator to eliminate intermediate Array allocations that increase GC pressure in hot paths.

Changes:
- visitArr: replace .map(visitAsLazy) with while loop
- visitApply: extract evalArgsToArray helper to replace .map calls
- visitExprWithTailCallSupport: reuse evalArgsToArray for tail Apply
- visitImportBin: replace .map with while loop
- visitComp IfSpec: replace .filter with manual filtered array builder

Benchmark results (hyperfine, 20 runs, M4 Max macOS):
realistic2: 418.9ms -> 397.5ms (+5.1%)
bench.02: 336.5ms -> 322.8ms (+4.1%)
realistic1: 292.5ms -> 287.8ms (+1.6%)
bench.08: 250.0ms -> 249.4ms (flat)

🤖 Generated with [Qoder][https://qoder.com]

Replace .map()/.filter()/.foreach() calls with explicit while loops in the evaluator to eliminate per-call closure allocations on hot paths.

Note: Array.map already creates the target array directly (no "intermediate" array). The saved allocation is the closure/lambda passed to these methods, plus the method call overhead of the Scala collections layer.

Changes:
- visitArr: replace .map(visitAsLazy) with while loop + empty array shortcut
- visitApply: extract evalArgsToArray helper to replace 2x .map calls
- visitExprWithTailCallSupport: reuse evalArgsToArray for tail Apply
- visitImportBin: replace .map with while loop for raw bytes conversion
- visitComp IfSpec: replace .filter with manual ArrayBuilder while loop
- visitMemberList: replace fields.foreach with while loop (per-object closure)

Benchmark results (hyperfine, 20 runs, JVM, M4 Max macOS):
realistic2: 418.9ms -> 397.5ms (+5.1%)
bench.02: 336.5ms -> 322.8ms (+4.1%)
realistic1: 292.5ms -> 287.8ms (+1.6%)
bench.08: 250.0ms -> 249.4ms (flat)

🤖 Generated with [Qoder][https://qoder.com]

He-Pin added 2 commits April 13, 2026 04:11

He-Pin marked this pull request as draft April 12, 2026 21:28

He-Pin wants to merge 2 commits into databricks:master from He-Pin:perf/closure-elimination
