perf: eliminate closure allocations in evaluator hot paths#775
Draft
He-Pin wants to merge 2 commits intodatabricks:masterfrom
Draft
perf: eliminate closure allocations in evaluator hot paths#775He-Pin wants to merge 2 commits intodatabricks:masterfrom
He-Pin wants to merge 2 commits intodatabricks:masterfrom
Conversation
Replace .map() and .filter() calls with explicit while loops in the evaluator to eliminate intermediate Array allocations that increase GC pressure in hot paths. Changes: - visitArr: replace .map(visitAsLazy) with while loop - visitApply: extract evalArgsToArray helper to replace .map calls - visitExprWithTailCallSupport: reuse evalArgsToArray for tail Apply - visitImportBin: replace .map with while loop - visitComp IfSpec: replace .filter with manual filtered array builder Benchmark results (hyperfine, 20 runs, M4 Max macOS): realistic2: 418.9ms -> 397.5ms (+5.1%) bench.02: 336.5ms -> 322.8ms (+4.1%) realistic1: 292.5ms -> 287.8ms (+1.6%) bench.08: 250.0ms -> 249.4ms (flat) 🤖 Generated with [Qoder][https://qoder.com]
Replace .map()/.filter()/.foreach() calls with explicit while loops in the evaluator to eliminate per-call closure allocations on hot paths. Note: Array.map already creates the target array directly (no "intermediate" array). The saved allocation is the closure/lambda passed to these methods, plus the method call overhead of the Scala collections layer. Changes: - visitArr: replace .map(visitAsLazy) with while loop + empty array shortcut - visitApply: extract evalArgsToArray helper to replace 2x .map calls - visitExprWithTailCallSupport: reuse evalArgsToArray for tail Apply - visitImportBin: replace .map with while loop for raw bytes conversion - visitComp IfSpec: replace .filter with manual ArrayBuilder while loop - visitMemberList: replace fields.foreach with while loop (per-object closure) Benchmark results (hyperfine, 20 runs, JVM, M4 Max macOS): realistic2: 418.9ms -> 397.5ms (+5.1%) bench.02: 336.5ms -> 322.8ms (+4.1%) realistic1: 292.5ms -> 287.8ms (+1.6%) bench.08: 250.0ms -> 249.4ms (flat) 🤖 Generated with [Qoder][https://qoder.com]
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Replace
.map()/.filter()/.foreach()calls with explicitwhileloops in the evaluator to eliminate per-call closure allocations on hot paths.Note:
Array.mapalready creates the target array directly (no "intermediate" array). The saved allocation is the closure/lambda passed to these methods, plus the method call overhead of the Scala collections layer.Changes:
visitArr: replace.map(visitAsLazy)with while loop + empty array shortcutvisitApply: extractevalArgsToArrayhelper to replace 2×.mapcalls (tailstrict and non-tailstrict paths)visitExprWithTailCallSupport: reuseevalArgsToArrayfor tail-position ApplyvisitImportBin: replace.mapwith while loop for raw bytes →Array[Eval]conversionvisitCompIfSpec: replace.filterwith manualArrayBuilderwhile loopvisitMemberList: replacefields.foreachwith while loop (eliminates per-object construction closure)Benchmark Results
Hyperfine (20 runs, JVM, M4 Max macOS):
Test plan
./mill 'sjsonnet.jvm[3.3.7]'.test— all 4 test suites pass./mill __.checkFormat— formatting cleanNotes
Low-risk, localized changes (1 file, 70 insertions, 37 deletions). Each method replaces a single
.map/.filter/.foreachcall with an equivalentwhileloop — the same pattern already used invisitLocalExpr,visitApplyBuiltin, and several stdlib functions (std.map,std.filter,SetModule, etc.).The
evalArgsToArrayhelper is reused by bothvisitApplyandvisitExprWithTailCallSupport, keeping the code DRY.Why JIT doesn't fully optimize these away:
Array.mapgoes throughscala.collection.ArrayOps(implicit conversion), adding an indirection layer. Lambda bodies with pattern matching (e.g.,fields.foreach { case ... }) produce larger bytecode that C2 may decide not to inline.