Commit 4d16e17
perf: add canDirectIterate fast path to fused ByteRenderer materializer (#773)
## Motivation
The fused ByteRenderer (`materializeDirect`) path used by Scala Native
bypasses the upickle Visitor interface for direct byte-level JSON
rendering. However, it was missing the `canDirectIterate` optimization
already present in `Materializer.materializeRecursiveObj`, so every
object paid for a `visibleKeyNames` allocation and every field for a
`value()` HashMap lookup, even for simple inline objects.
Profiling showed **realistic2 spends 99% of its time in
materialization** (155ms materialize vs 1.7ms eval), processing ~125K
objects and producing 28MB JSON output. This makes materializer
optimization the highest-leverage target.
## Key Design Decision
Mirror the `canDirectIterate` fast path from `Materializer.scala` into
`ByteRenderer.scala`, splitting `materializeDirectObj` into three
specialized methods that avoid HashMap lookup for the common case of
inline objects.
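The contrast between the two paths can be sketched as follows. This is a simplified model, not sjsonnet's code: `Obj`, `Member`, `hasSuperOrExcludes`, and `render` are hypothetical stand-ins, and only the names `canDirectIterate`, `inlineFieldKeys`, `inlineFieldMembers`, `visibleKeyNames`, and `value()` come from the description above.

```scala
import scala.collection.mutable

object DirectIterateSketch {
  // Hypothetical stand-in for a field member: a thunk yielding rendered JSON.
  type Member = () => String

  // Simplified model of an object; the real class has far more state.
  final class Obj(
      val inlineFieldKeys: Array[String],
      val inlineFieldMembers: Array[Member],
      val hasSuperOrExcludes: Boolean // assumption: what disables the fast path
  ) {
    def canDirectIterate: Boolean = !hasSuperOrExcludes

    // Generic path, step 1: a fresh key array is allocated on every call.
    def visibleKeyNames: Array[String] = {
      val buf = mutable.ArrayBuffer.empty[String]
      buf ++= inlineFieldKeys
      buf.toArray
    }

    // Generic path, step 2: every field is resolved through a hash lookup.
    private lazy val byName: mutable.HashMap[String, Member] = {
      val m = mutable.HashMap.empty[String, Member]
      var i = 0
      while (i < inlineFieldKeys.length) {
        m.put(inlineFieldKeys(i), inlineFieldMembers(i))
        i += 1
      }
      m
    }
    def value(key: String): String = byName(key)()
  }

  def render(obj: Obj): String = {
    val sb = new StringBuilder("{")
    if (obj.canDirectIterate) {
      // Fast path: walk the raw arrays and invoke members by index.
      val keys = obj.inlineFieldKeys
      val members = obj.inlineFieldMembers
      var i = 0
      while (i < keys.length) {
        if (i > 0) sb.append(',')
        sb.append('"').append(keys(i)).append("\":").append(members(i)())
        i += 1
      }
    } else {
      // Fallback: allocate the key array, then look each field up by name.
      val keys = obj.visibleKeyNames
      var i = 0
      while (i < keys.length) {
        if (i > 0) sb.append(',')
        sb.append('"').append(keys(i)).append("\":").append(obj.value(keys(i)))
        i += 1
      }
    }
    sb.append('}').toString
  }
}
```

Both branches produce the same bytes; the fast path simply skips the per-object allocation and the per-field lookup.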
## Modification
Split `ByteRenderer.materializeDirectObj` into:
- **`materializeDirectInlineObj`**: Iterates raw
`inlineFieldKeys`/`inlineFieldMembers` arrays directly, invoking members
by index. Handles both multi-field and single-field objects.
- **`materializeDirectSortedInlineObj`**: Uses the cached
`_sortedInlineOrder` sort order (shared across all objects from the same
MemberList) for sorted output.
- **`materializeDirectGenericObj`**: Fallback to `visibleKeyNames` +
`value()` for complex objects with super chains or excludedKeys.
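The shared sort order used by the sorted variant can be illustrated with a minimal sketch. It assumes the cache is a permutation of field indices stored on the member list; `MemberList`, `Obj`, `sortedOrder`, and `renderSorted` here are hypothetical simplifications, and plain `String` ordering stands in for whatever key ordering the real code uses.

```scala
object SortedOrderSketch {
  // Hypothetical stand-in for the member list shared by many objects.
  final class MemberList(val keys: Array[String]) {
    // Index permutation that visits keys in sorted order. It is computed once
    // (lazily) and reused by every object built from this member list.
    lazy val sortedOrder: Array[Int] =
      keys.indices.sortBy(i => keys(i)).toArray
  }

  // Each object carries its own values but shares keys and sort order.
  final class Obj(val members: MemberList, val values: Array[String])

  def renderSorted(obj: Obj): String = {
    val order = obj.members.sortedOrder
    val keys = obj.members.keys
    val sb = new StringBuilder("{")
    var n = 0
    while (n < order.length) {
      val i = order(n) // field index in declaration order
      if (n > 0) sb.append(',')
      sb.append('"').append(keys(i)).append("\":").append(obj.values(i))
      n += 1
    }
    sb.append('}').toString
  }
}
```

With this layout, sorting costs one pass per member list rather than one per object, which matters when many objects share the same shape.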
## Benchmark Results
### JMH (JVM, Scala 3.3.7)
Baseline: master @ `0d13274`
| Benchmark | Before (ms) | After (ms) | Change |
|-----------|-------------|-----------|--------|
| **realistic2** | **57.541** | **49.391** | **-14.2%** ✅ |
| comparison2 | 18.681 | 17.606 | -5.8% ✅ |
| base64Decode | 0.123 | 0.118 | -4.1% |
| bench.02 | 35.401 | 32.904 | -7.1% |
| reverse | 6.717 | 6.883 | +2.5% (noise) |
### Scala Native (hyperfine, 15 runs, 5 warmup)
| Binary | realistic2 (ms) | Relative |
|--------|----------------|----------|
| **sjsonnet (this PR)** | **96.1 ± 2.4** | **1.00x** ✅ |
| jrsonnet (Rust) | 112.8 ± 5.5 | 1.17x slower |
| sjsonnet (master) | 171.8 ± 5.6 | 1.79x slower |
**sjsonnet now beats jrsonnet on realistic2: jrsonnet takes 1.17x as long, so sjsonnet finishes in about 15% less time.**
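The relative figures can be rechecked from the mean times in the table. The snippet below is only a small arithmetic sketch; the helper names `slowdown` and `timeSaved` are made up for illustration, and the ± ranges from hyperfine are ignored.

```scala
object RelativeTiming {
  // Mean wall-clock times in ms, copied from the hyperfine table.
  val thisPr = 96.1
  val jrsonnet = 112.8
  val master = 171.8

  // How many times longer `other` takes compared with `base`.
  def slowdown(other: Double, base: Double): Double = other / base

  // Fraction of `other`'s run time that `base` saves.
  def timeSaved(other: Double, base: Double): Double = 1.0 - base / other

  def main(args: Array[String]): Unit = {
    println(f"jrsonnet / this PR: ${slowdown(jrsonnet, thisPr)}%.2fx")
    println(f"master / this PR:   ${slowdown(master, thisPr)}%.2fx")
    println(f"time saved vs jrsonnet: ${timeSaved(jrsonnet, thisPr) * 100}%.1f%%")
    println(f"time saved vs master:   ${timeSaved(master, thisPr) * 100}%.1f%%")
  }
}
```

The two readings differ: a 1.17x ratio corresponds to roughly a 15% reduction in run time, not 17%.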
## Analysis
The `canDirectIterate` fast path eliminates:
1. **`visibleKeyNames` allocation**: No more `ArrayBuffer` → `Array`
creation per object
2. **`value()` HashMap lookup**: No more key-based cache lookup per
field (replaced by direct index invocation)
3. **Validation checks**: Inline fields skip the `value()` validation
path
For realistic2 with 125K objects, this removes ~125K HashMap lookups and
~125K array allocations from the hot materialization loop.
## References
- Mirrors `Materializer.materializeInlineObj` /
`materializeSortedInlineObj` logic
- Related profiling: `sjsonnet --debug-stats
bench/resources/cpp_suite/realistic2.jsonnet`
## Result
All 420 tests pass across JVM/JS/WASM/Native × Scala 2.12/2.13/3.3.7.

1 parent 0020cec commit 4d16e17
1 file changed
Lines changed: 139 additions & 34 deletions