16 changes: 8 additions & 8 deletions lib/handle/emphasis.js
@@ -3,21 +3,21 @@
* @import {Emphasis, Parents} from 'mdast'
*/

-import {checkEmphasis} from '../util/check-emphasis.js'
+import {emphasisMarker} from '../util/emphasis-marker.js'
import {encodeCharacterReference} from '../util/encode-character-reference.js'
import {encodeInfo} from '../util/encode-info.js'

emphasis.peek = emphasisPeek

/**
* @param {Emphasis} node
- * @param {Parents | undefined} _
+ * @param {Parents | undefined} parent
* @param {State} state
* @param {Info} info
* @returns {string}
*/
-export function emphasis(node, _, state, info) {
-const marker = checkEmphasis(state)
+export function emphasis(node, parent, state, info) {
+const marker = emphasisMarker(node, parent, state, info)
const exit = state.enter('emphasis')
const tracker = state.createTracker(info)
const before = tracker.move(marker)
@@ -59,11 +59,11 @@ export function emphasis(node, _, state, info) {
}

/**
- * @param {Emphasis} _
- * @param {Parents | undefined} _1
+ * @param {Emphasis} node
+ * @param {Parents | undefined} parent
* @param {State} state
* @returns {string}
*/
-function emphasisPeek(_, _1, state) {
-return state.options.emphasis || '*'
+function emphasisPeek(node, parent, state) {
+return emphasisMarker(node, parent, state, {before: '', after: ''})
}
77 changes: 77 additions & 0 deletions lib/util/emphasis-marker.js
@@ -0,0 +1,77 @@
/**
* @import {Emphasis, Parents} from 'mdast'
* @import {State} from 'mdast-util-to-markdown'
*/

import {checkEmphasis} from './check-emphasis.js'

/**
* Pick the marker to use for an emphasis node, flipping from the configured
* marker to its opposite when the configured marker would fuse with an
* adjacent attention delimiter and re-parse as a different construct.
*
* Only emphasis gets the flip. Strong already round-trips through the
* spec's attention algorithm because a run of 4 asterisks pairs as two
* strong delimiters, and a run of 6 as three, and so on. Nested emphasis
* is the asymmetric case: a run of 2 asterisks pairs as one strong, not as
* two nested emphases, so without a flip `emphasis > emphasis > text`
* round-trips as `strong > text`.
*
* Two situations drive a flip, both narrowly scoped to avoid disturbing
* shapes the serializer already handles via fusion:
*
* 1. The emphasis is an only child of an attention parent (emphasis or
* strong), and both its opening and closing markers would be adjacent
* to the parent's primary marker. Using the opposite marker (for
* example, `*_a_*` for `emphasis > emphasis > text` with primary
* `*`) breaks the fusion.
*
* 2. The emphasis sits at the top of a strict same-type chain of depth at
* least 2 (each link has exactly one emphasis child), with primary
* `*`. Three-deep emphasis collapses under rule 17 unless the
* outermost marker is `_`, because `_`'s flanking rules are stricter
* than `*`'s. The check is asymmetric by design: when the configured
* marker is already `_`, the adjacency flip in rule 1 alone is enough.
*
* @param {Emphasis} node
* @param {Parents | undefined} parent
* @param {State} state
* @param {{before: string, after: string}} info
* Only the `before` and `after` fields are read.
* @returns {'*' | '_'}
*/
export function emphasisMarker(node, parent, state, info) {
const primary = checkEmphasis(state)
const other = primary === '*' ? '_' : '*'

if (
parent &&
(parent.type === 'emphasis' || parent.type === 'strong') &&
'children' in parent &&
parent.children.length === 1 &&
info.before.charAt(info.before.length - 1) === primary &&
info.after.charAt(0) === primary
) {
return other
}

if (primary === '*' && strictChainDepth(node) >= 2) return other

return primary
}

/**
Comment on lines +58 to +63
Member Author

The observation is correct: `strictChainDepth(node) >= 2` does match multiple nodes in a strict chain of depth 4 or deeper, which I'll grant wasn't by explicit design. But applying the proposed gating is a regression, not an improvement. Comparison across depths:

| depth | current output | current reparses as | copilot output | copilot reparses as |
| --- | --- | --- | --- | --- |
| 2 | `*_a_*` | `em > em > text("a")` | `*_a_*` | `em > em > text("a")` |
| 3 | `_*_a_*_` | `em > em > em > text("a")` | `_*_a_*_` | same ✅ |
| 4 | `__*_a_*__` | `strong > em > em > text("a")` | `_*_*a*_*_` | `[em(text("*")), em("a"), em(text("*"))]` |
| 5 | `___*_a_*___` | `em > strong > em > em > text("a")` | `_*_*_a_*_*_` | 4 flat em with literal `*` text |
| 6 | `____*_a_*____` | `strong > strong > em > em > text("a")` | `_*_*_*a*_*_*_` | 4 flat em with literal `*` text |

Both approaches drift at depth 4+: CommonMark has no round-tripping representation for a strict 4-deep emphasis chain. But the current output is strictly more faithful to the input tree:

  • The text content ("a") is preserved.
  • Attention nesting is preserved (the inner em > em survives).
  • Only the outermost element type changes (em → strong).

The proposed fix:

  • Flattens the nesting entirely into sibling em nodes.
  • Injects literal `*` or `_` characters into text nodes that weren't in the original tree (content corruption).

So the __...__ at the outer boundary looks alarming, but it's producing the least-bad degradation at depth 4+. I'd rather keep the accidental over-match and document it than trade it for a drift that corrupts content.

* Count the depth of a strict single-child emphasis chain descending from
* `node`. A chain is strict when every link has exactly one child and that
* child is also `emphasis`.
*
* @param {Emphasis} node
* @returns {number}
*/
function strictChainDepth(node) {
const children = node.children
if (!children || children.length !== 1) return 0
const only = children[0]
if (only.type !== 'emphasis') return 0
return 1 + strictChainDepth(only)
}
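
To make the "current output" column of the review comment's table easier to check by hand, here is a minimal standalone sketch of the marker selection. It is not the library's serializer: `pickMarker`, `serialize`, and `chain` are hypothetical helpers, the primary marker is fixed to `*` instead of being read from `state`, only `emphasis` and `text` nodes are handled, and each child is assumed to see its parent's marker as both `before` and `after`.

```javascript
// Simplified stand-in for the PR's logic; not the real serializer.
// Assumption: the configured (primary) emphasis marker is '*'.
const primary = '*'
const other = '_'

// Same recursion as `strictChainDepth` in the diff.
function strictChainDepth(node) {
  const children = node.children
  if (!children || children.length !== 1) return 0
  const only = children[0]
  if (only.type !== 'emphasis') return 0
  return 1 + strictChainDepth(only)
}

// Mirrors `emphasisMarker`, with `primary` fixed instead of read from state.
function pickMarker(node, parent, info) {
  if (
    parent &&
    (parent.type === 'emphasis' || parent.type === 'strong') &&
    parent.children.length === 1 &&
    info.before.charAt(info.before.length - 1) === primary &&
    info.after.charAt(0) === primary
  ) {
    return other
  }
  if (strictChainDepth(node) >= 2) return other
  return primary
}

// Hypothetical mini serializer: assumes every child sees its parent's
// marker on both sides, and does no escaping.
function serialize(node, parent, info) {
  if (node.type === 'text') return node.value
  const marker = pickMarker(node, parent, info)
  const inner = node.children
    .map((child) => serialize(child, node, {before: marker, after: marker}))
    .join('')
  return marker + inner + marker
}

// Build `levels` nested emphasis nodes around the text "a".
function chain(levels) {
  let node = {type: 'text', value: 'a'}
  for (let i = 0; i < levels; i++) node = {type: 'emphasis', children: [node]}
  return node
}

for (const levels of [2, 3, 4, 5, 6]) {
  console.log(levels, serialize(chain(levels), undefined, {before: '', after: ''}))
}
```

Under these assumptions the loop prints the strings from the table's "current output" column, including the doubled `__` at depth 4 that comes from `strictChainDepth(node) >= 2` matching more than one node in the chain. How those strings reparse still has to be checked with a CommonMark parser; the sketch only covers marker choice.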