perf(compile): build dispatch handler context as a stable-shape literal #52
Open
ChristianMurphy wants to merge 1 commit intosyntax-tree:mainfrom
Open
perf(compile): build dispatch handler context as a stable-shape literal #52ChristianMurphy wants to merge 1 commit intosyntax-tree:mainfrom
ChristianMurphy wants to merge 1 commit intosyntax-tree:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR optimizes the markdown-to-mdast compile dispatch loop by reducing per-event allocations in the hot path, aiming to improve performance on large event streams.
Changes:
- Pre-allocates a single merged handler
this-binding object (callContext) and mutates onlysliceSerializeper event. - Refactors the loop to reuse
eventlocal variable for cleaner access and slightly less indexing.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+250
to
+260
| // The handler this-binding receives the same fields as `context` plus a | ||
| // per-event `sliceSerialize`. Pre-allocate the merged object once outside | ||
| // the loop and mutate sliceSerialize per event instead of allocating a | ||
| // fresh merged object per event. Stack/tokenStack/data references stay | ||
| // shared with `context`, so mutations inside handlers remain visible to | ||
| // the post-loop code via `context.tokenStack`. | ||
| const callContext = /** @type {CompileContext} */ ( | ||
| /** @type {unknown} */ ( | ||
| Object.assign({sliceSerialize: undefined}, context) | ||
| ) | ||
| ) |
Member
Author
There was a problem hiding this comment.
@wooorm is this an intentional contract?
I assume it is, but can't find documentation.
42ca832 to
a01d72a
Compare
The compile loop built each handler's this-binding with
Object.assign({sliceSerialize: ...}, context). Object.assign goes
through a generic copy-loop over the source's enumerable properties;
V8 cannot give that loop a stable hidden class for the merged object
because the source's shape is a runtime variable. Each event paid the
cost of a copy-iteration plus a hidden-class transition.
The fix replaces the Object.assign with an inline object literal that
lists every field explicitly. The literal has a fixed compile-time
shape so V8 keeps a single hidden class for the merged object across
every event, allocates it through a fast path, and can inline the
construction at the call site. The merged object is still allocated
per event, so the prior contract (handlers may capture `this` or
reassign top-level fields without leaking into subsequent events) is
preserved. The dispatch is otherwise unchanged.
Inputs that benefit, with multi-run median-of-medians vs the baseline
(spread in parentheses):
10,000 short backtick code spans -36.1% (5.3%)
1,000 inline links in a paragraph -31.0% (2.0%)
10,000 character entity references -29.9% (2.2%)
one CommonMark example -13.8% (5.5%)
CommonMark spec * 35 (~564 KB) -10.6% (3.6%)
CommonMark spec * 7 (~113 KB) -7.8% (18.3%)
full CommonMark spec (~16 KB) -7.4% (43.5%, NOISY)
Every input that drives a non-trivial number of events through the
dispatch loop benefits, because Object.assign was the cost being
eliminated. Inputs with high node-count-per-byte (many small inline
tokens) gain the most.
Trade-offs and inputs where the gain is small or absent:
A 10,000-unmatched-asterisks input and a 256 KB Unicode-heavy input
both moved within +/- 1% of baseline. Their event count per byte is
low, so the per-event Object.assign was not a hotspot.
The pure emphasis stress inputs ('a**b' repeated 10,000 times and
similar) reported large deltas in single runs but their cross-run
spread is 44 to 53% on the baseline alone. The input shape (mostly
attentionSequence events that mostly do not match a handler) means
the per-event merge was not the cost driver. The deltas there are
noise.
Tests pass: dev + prod 1448/1448, 100% coverage maintained,
mdast-util-gfm 54/54, mdast-util-mdx 11/13. The two failing mdx tests
reproduce on upstream/main and are not introduced by this branch.
a01d72a to
5deffad
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 1 out of 2 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+255
to
+257
| // less to optimize. A fresh object per event preserves the previous | ||
| // contract that handlers may capture `this` or reassign top-level fields | ||
| // without leaking into subsequent events. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Initial checklist
Description of changes
The compile loop built each handler's this-binding with
Object.assign({sliceSerialize: ...}, context). Object.assign goes
through a generic copy-loop over the source's enumerable properties;
V8 cannot give that loop a stable hidden class for the merged object
because the source's shape is a runtime variable. Each event paid the
cost of a copy-iteration plus a hidden-class transition.
The fix replaces the Object.assign with an inline object literal that
lists every field explicitly. The literal has a fixed compile-time
shape so V8 keeps a single hidden class for the merged object across
every event, allocates it through a fast path, and can inline the
construction at the call site. The merged object is still allocated
per event, so the prior contract (handlers may capture
thisorreassign top-level fields without leaking into subsequent events) is
preserved. The dispatch is otherwise unchanged.
Inputs that benefit, with multi-run median-of-medians vs the baseline
(spread in parentheses):
10,000 short backtick code spans -36.1% (5.3%)
1,000 inline links in a paragraph -31.0% (2.0%)
10,000 character entity references -29.9% (2.2%)
one CommonMark example -13.8% (5.5%)
CommonMark spec * 35 (~564 KB) -10.6% (3.6%)
CommonMark spec * 7 (~113 KB) -7.8% (18.3%)
full CommonMark spec (~16 KB) -7.4% (43.5%, NOISY)
Every input that drives a non-trivial number of events through the
dispatch loop benefits, because Object.assign was the cost being
eliminated. Inputs with high node-count-per-byte (many small inline
tokens) gain the most.
Trade-offs and inputs where the gain is small or absent:
A 10,000-unmatched-asterisks input and a 256 KB Unicode-heavy input
both moved within +/- 1% of baseline. Their event count per byte is
low, so the per-event Object.assign was not a hotspot.
The pure emphasis stress inputs ('a**b' repeated 10,000 times and
similar) reported large deltas in single runs but their cross-run
spread is 44 to 53% on the baseline alone. The input shape (mostly
attentionSequence events that mostly do not match a handler) means
the per-event merge was not the cost driver. The deltas there are
noise.
Tests pass: dev + prod 1448/1448, 100% coverage maintained,
mdast-util-gfm 54/54, mdast-util-mdx 11/13. The two failing mdx tests
reproduce on upstream/main and are not introduced by this branch.