spec: machine-readable export for agent conversations (#10112) #10500

Open
lonexreb wants to merge 3 commits into warpdotdev:master from lonexreb:spec/10112-agent-conversation-export

Conversation

@lonexreb lonexreb commented May 8, 2026

Spec-only PR for #10112 — JSON export option for agent conversations.

Summary

Warp can already export agent conversations to Markdown, but Markdown is for human reading: tool calls, plan steps, reasoning blocks, and structured tool I/O collapse into prose. Scripts and tools that want to consume agent transcripts have to parse rendered Markdown, which is brittle. This spec adds a versioned JSON export option to the existing Export Conversation flow so transcripts are losslessly machine-readable.

Scope (V1)

  • New "JSON" option in the existing Export Conversation UI alongside "Markdown".
  • Versioned schema (schema_version semver, V1.0.0) with typed content blocks: text, reasoning, tool_call, plan_step, code_diff, image.
  • Full round-trip fidelity for messages, tool calls (input + output), plan steps with status, reasoning with duration, code diffs with per-file structure, and inline images as data URIs.
  • Redaction-aware: secrets/PII already stripped from the rendered conversation stay stripped in JSON.
  • Optional truncation for large tool outputs (off by default, with truncated/original_size_bytes flags so consumers can detect).
  • CLI parity: warp export-conversation <id> --format json --output path (or stdout via -).

Non-Goals

  • Not a sync, replication, or live-streaming API — one-way local file export.
  • Not a server-hosted endpoint.
  • No re-import / restore-from-export UI in V1 (the schema enables it).
  • No bulk-export-all-conversations in V1.

Schema versioning

schema_version follows semver. Decoders MUST accept the current major, SHOULD reject unknown majors, and SHOULD pass through unknown minor-version fields and content-block types unchanged. This lets future Warp versions evolve the schema while older tools remain able to parse exports.
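
The decoder rules above could be sketched for an external consumer roughly as follows. This is a minimal illustration, not part of the spec: `SUPPORTED_MAJOR`, `check_schema_version`, and `split_blocks` are hypothetical names, and the sketch assumes a plain `MAJOR.MINOR.PATCH` string with no pre-release suffix.

```python
# Hypothetical consumer-side helpers; none of these names come from the spec.
SUPPORTED_MAJOR = 1  # assumption: this consumer was written against 1.x exports

# Block types listed in the PR description for V1.
KNOWN_BLOCK_TYPES = {"text", "reasoning", "tool_call", "plan_step", "code_diff", "image"}


def check_schema_version(version: str) -> tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' and reject majors this consumer does not know."""
    parts = version.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {version!r}")
    major, minor, patch = (int(p) for p in parts)
    if major != SUPPORTED_MAJOR:
        raise ValueError(f"unsupported schema major version {major}")
    return major, minor, patch


def split_blocks(content: list[dict]) -> tuple[list[dict], list[dict]]:
    """Separate understood blocks from unknown ones, which are kept untouched."""
    known = [b for b in content if b.get("type") in KNOWN_BLOCK_TYPES]
    unknown = [b for b in content if b.get("type") not in KNOWN_BLOCK_TYPES]
    return known, unknown
```

A newer minor version such as `1.4.2` is accepted, while `2.0.0` raises. Unknown block types are returned as-is rather than dropped, so a tool that rewrites the file can pass them through unchanged.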

Spec document

specs/GH10112/SPEC.md — full Behavior Contract (B1–B8 with the typed content-block schema), Acceptance Criteria (A1–A10), Implementation Pointers verified against the codebase (app/src/ai/agent/conversation.rs, app/src/ai/agent/conversation_yaml.rs, app/src/ai/agent_conversations_model.rs, app/src/ai/agent/api/convert_conversation.rs), Tests (T1–T11), Open Questions, and Telemetry notes.

Labels

ready-to-spec

@cla-bot cla-bot Bot added the cla-signed label May 8, 2026
@github-actions github-actions Bot added the external-contributor label (indicates that a PR has been opened by someone outside the Warp team) May 8, 2026

oz-for-oss Bot commented May 8, 2026

@lonexreb

I'm starting a first review of this spec-only pull request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

@oz-for-oss oz-for-oss Bot left a comment

Overview

This spec adds a versioned JSON export for agent conversations. The direction is clear, but several requirements are ambiguous or internally inconsistent enough to risk divergent or unsafe implementation.

Concerns

  • The export setting is named two different ways.
  • The tool_call schema constrains input/output as objects even though lossless tool I/O may be scalar or text.
  • CLI export is promised, but redaction is defined only via the rendered UI tree.
  • Sensitive reasoning export is left as an open question while the behavior defaults to including it.
  • A decoder round-trip test conflicts with the V1 non-goal of no re-import/restore path.

Security

  • CLI redaction needs an explicit redacted-source contract so headless exports cannot accidentally read raw content.
  • Reasoning blocks need an explicit include/exclude policy and controls before implementation.

Verdict

Found: 0 critical, 5 important, 0 suggestions

Request changes

Comment thread specs/GH10112/SPEC.md Outdated

### B1. UI entry point

The existing "Export Conversation" UI gains a format selector with two options: `Markdown` (existing, default behavior preserved) and `JSON` (new). Settings → Agents → Conversation Export adds `default_export_format` (enum, default `Markdown`) which seeds that selector.

⚠️ [IMPORTANT] default_export_format conflicts with agent.conversation_export.default_format later in the spec; choose one canonical setting key so UI, CLI, migration, and tests implement the same API.

Comment thread specs/GH10112/SPEC.md Outdated

- `{ "type": "text", "text": "..." }`
- `{ "type": "reasoning", "text": "...", "duration_ms": <int> }`
- `{ "type": "tool_call", "tool": "<name>", "input": { ... }, "output": { ... }, "duration_ms": <int>, "status": "ok" | "error" }`

⚠️ [IMPORTANT] The schema restricts input and output to JSON objects, but lossless tool I/O also needs strings, arrays, numbers/null, and possibly structured stdout/stderr; define these fields as arbitrary JSON values plus truncation metadata, or add a wrapper shape.

Comment thread specs/GH10112/SPEC.md

### B4. Redaction is preserved

Any content already redacted in the displayed conversation MUST appear redacted in the JSON export — typically as a `"[REDACTED]"` placeholder string in the relevant `text` / `input` / `output` field. The export pipeline reuses the redaction state from the rendered conversation tree; it never re-reads the raw underlying source. This guarantees secrets that were stripped from the UI cannot reappear in an export.

⚠️ [IMPORTANT] [SECURITY] Redaction is defined as reusing the rendered conversation tree, but CLI export may run without a rendered UI tree; specify the redacted source or pipeline CLI must use so headless export cannot read raw secrets.

Comment thread specs/GH10112/SPEC.md Outdated
- T8. Integration: CLI `--format json --output -` and UI export produce byte-identical (or at minimum JSON-structurally-identical) output for the same conversation snapshot.
- T9. Unit: filename slug matches `<conversation_title-or-id>-<timestamp>.warp-export.json` for several title shapes including titles with spaces, slashes, and unicode.
- T10. Unit: a conversation with an inline image serializes as a `data_uri` content block.
- T11. Unit: unknown content-block type encountered during decode round-trips unchanged (forward-compat probe).

⚠️ [IMPORTANT] T11 requires decode round-tripping unknown content-block types, but V1 explicitly has no re-import/restore path; either define an in-product decoder/parser for this test or move this to external consumer guidance.

Comment thread specs/GH10112/SPEC.md Outdated

- Should we ALSO ship YAML and JSONL? Suggest V1 = JSON only, V1.5 = JSONL (one message per line) for streaming consumption. Defer YAML unless concrete demand surfaces.
- Should image blocks support an external-file mode (`{"type":"image","path":"..."}`) for very large images, with the JSON exported alongside an `assets/` sibling directory? Suggest defer to V1.5.
- Should there be an explicit "include reasoning blocks" toggle? Reasoning may be sensitive in some workflows. Default include; offer toggle if early users ask.

⚠️ [IMPORTANT] [SECURITY] Reasoning blocks are sensitive and currently included by default while the control is left as an open question; resolve this in the behavior contract with an explicit include/exclude policy and UI/CLI setting before implementation.

lonexreb commented May 8, 2026

Round-1 fixes for the oz-for-oss review:

  1. Export setting unified as agent.conversation_export.default_format everywhere.
  2. tool_call input/output types now use a discriminated wrapper accepting object/array/string/number/boolean/null variants, lossless for any JSON-representable tool I/O.
  3. CLI redaction contract: headless exports MUST source from the RedactedConversationView interface; the CLI is a thin wrapper over the same export pipeline as the UI; T_cli_redaction_parity added.
  4. Reasoning include/exclude resolved: V1 includes by default; new agent.conversation_export.include_reasoning toggle (bool, default true); UI surface added.
  5. Renamed 'round-trip test' to 'schema decoder validation' (JSON Schema validation, not re-import), which aligns with the V1 non-goal.
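
For readers unfamiliar with the pattern, the discriminated wrapper described in the second fix above might look like this in a consumer or test fixture. It is only a sketch: the `{"type", "value"}` envelope follows the `output.type` / `value` wording used in later review comments, and `to_typed_value` is a hypothetical helper, not a function from the Warp codebase.

```python
def to_typed_value(value):
    """Wrap any JSON-representable value in a {"type", "value"} envelope."""
    # bool must be checked before int/float because bool is a subclass of int.
    if value is None:
        kind = "null"
    elif isinstance(value, bool):
        kind = "boolean"
    elif isinstance(value, (int, float)):
        kind = "number"
    elif isinstance(value, str):
        kind = "string"
    elif isinstance(value, list):
        kind = "array"
    elif isinstance(value, dict):
        kind = "object"
    else:
        raise TypeError(f"not JSON-representable: {type(value).__name__}")
    return {"type": kind, "value": value}
```

Because the discriminator is explicit, a scalar tool output such as an exit code or a raw stdout string keeps its original type instead of being forced into an object.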

lonexreb commented May 8, 2026

/oz-review

oz-for-oss Bot commented May 8, 2026

@lonexreb

I'm re-reviewing this spec-only pull request in response to a review request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

@oz-for-oss oz-for-oss Bot left a comment

Overview

This spec defines a JSON export format for agent conversations with UI and CLI parity, redaction preservation, typed content blocks, truncation controls, and telemetry updates.

Concerns

  • The spec requires validation against a documented JSON Schema, but the schema itself is not included or located, so implementers and tests do not have a concrete contract to validate against.
  • Several typed block shapes remain placeholders, which undermines the promised round-trip fidelity for plan steps and code diffs.
  • Tool-output truncation is ambiguous now that outputs are discriminated TypedValues that may be strings, arrays, objects, numbers, booleans, or null.

Security

  • Reasoning is included by default even though the spec acknowledges users may not want it in shared exports; the export action needs an explicit per-export disclosure or override so users do not accidentally include sensitive reasoning content.

Verdict

Found: 0 critical, 3 important, 1 suggestion

Request changes

Comment thread specs/GH10112/SPEC.md Outdated
- `{ "type": "reasoning", "text": "...", "duration_ms": <int> }`
- `{ "type": "tool_call", "tool": "<name>", "input": <TypedValue>, "output": <TypedValue>, "duration_ms": <int>, "status": "ok" | "error" }`
- `{ "type": "plan_step", "title": "...", "status": "pending" | "in_progress" | "complete", "items": [ ... ] }`
- `{ "type": "code_diff", "files": [ { "path": "...", "before": "...", "after": "...", "hunks": [ ... ] } ] }`

⚠️ [IMPORTANT] plan_step.items and code_diff.hunks are still placeholders, so the spec does not define enough structure to deliver the promised round-trip fidelity; define the item and hunk schemas, including line ranges and old/new content semantics.
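
To illustrate the kind of line-range semantics a hunk schema would need to pin down, here is a small consistency check written the way unified diffs usually count lines. The hunk field names (`old_lines`, `new_lines`, `lines`) and line kinds (`context`, `added`, `removed`, `no_newline`) match those the author adopts in the follow-up revision. The per-line `kind` key and the function name are assumptions made for this sketch.

```python
from collections import Counter


def hunk_is_consistent(hunk: dict) -> bool:
    """Check that a hunk's declared line counts match its line entries."""
    kinds = Counter(line["kind"] for line in hunk["lines"])
    # Context lines exist on both sides; removed lines only on the old side,
    # added lines only on the new side. "no_newline" markers count on neither.
    old_count = kinds["context"] + kinds["removed"]
    new_count = kinds["context"] + kinds["added"]
    return hunk["old_lines"] == old_count and hunk["new_lines"] == new_count
```

A check like this belongs in tests rather than in the JSON Schema, since it relates a count field to the contents of an array.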

Comment thread specs/GH10112/SPEC.md Outdated

Default: large tool outputs are included in full.

If the user enables `agent.conversation_export.truncate_large_outputs`, each tool-call `output` exceeding `agent.conversation_export.large_output_limit_kb` (default 64 KiB) is truncated to that limit. The block additionally carries `"truncated": true` and `"original_size_bytes": <int>` so consumers can detect and re-fetch if needed.

⚠️ [IMPORTANT] Truncation is underspecified for non-string TypedValue outputs; state whether the byte limit applies to serialized JSON, nested strings, or only string outputs, and whether the original type must be preserved after truncation.

Comment thread specs/GH10112/SPEC.md
- A5. With `truncate_large_outputs = true` and limit `64 KiB`, the same tool call's `output` is truncated to 64 KiB and the block carries `"truncated": true` and `"original_size_bytes": 204800`.
- A6. The CLI `--format json` flag produces output identical to the UI export for the same conversation. The CLI consumes `RedactedConversationView` (B4.1) and runs the same exporter the UI runs.
- A6a. CLI export of a conversation containing redacted secrets produces `[REDACTED]` placeholders in exactly the same fields the UI export does (CLI redaction parity).
- A7. The exported JSON validates against the documented JSON Schema for `schema_version 1.0.0`. (Schema-decoder validation; this is NOT a re-import test — V1 has no re-import path. See Non-Goals.)

⚠️ [IMPORTANT] A7 requires validation against the documented JSON Schema, but this PR does not include a schema or a path to one; add the schema artifact or include the full schema so implementation and tests have a single source of truth.

Comment thread specs/GH10112/SPEC.md

### B4.2 Reasoning blocks: include/exclude policy

V1 INCLUDES reasoning blocks (`{ "type": "reasoning", ... }`) by default. Some users will not want reasoning surfaced in shared exports, so V1 ships an explicit toggle:

💡 [SUGGESTION] [SECURITY] Reasoning is exported by default despite being sensitive in shared transcripts; require the export UI to show the include-reasoning state and allow a per-export override or confirmation before writing reasoning blocks.

lonexreb commented May 8, 2026

Addressed all 3 important + 1 suggestion + 1 security concern from the latest oz-for-oss review:

  1. JSON Schema artifact (new B9): Located the schema in-tree at app/src/ai/agent/export_schema/v1.0.0/schema.json (JSON Schema draft 2020-12), with canonical $id, generated from the Rust types in export_schema.rs via build-time generator with a CI drift check. Tests load the artifact rather than hand-rolling fixtures. Added A_schema_in_tree and T_schema_artifact_in_tree.
  2. Plan-step and code-diff placeholders (new B2.1.2, B2.1.3): Replaced ... with full concrete schemas. Plan steps: id/title/status required, fixed status enum (added blocked and cancelled), nullable optional fields explicitly present as null, recursive subitems, blocked_reason invariant. Code diffs: structured hunks with old_start/old_lines/new_start/new_lines/lines[], fixed line-kind enum (context/added/removed/no_newline), binary and rename handling. New A_-criteria and T-tests.
  3. Tool-output truncation per TypedValue (rewritten B5, new B5.1/B5.2/B5.3): Defined sizing as canonical UTF-8 JSON of inner value with lex-sorted keys; per-type rules — string truncates at codepoint boundary; array drops trailing elements; object drops trailing keys in lexicographic order; number/boolean/null NEVER truncate (and the truncation flags MUST NOT appear). Added truncation_strategy field. Five new acceptance criteria and tests cover each path.
  4. Reasoning per-export disclosure (new B4.2.1, security): Added a per-export disclosure: when reasoning blocks exist AND effective include_reasoning=true, the Export dialog shows an inline notice and a per-export checkbox pre-seeded from the user setting (toggling does NOT write the setting). CLI counterpart emits a single warning line to stderr (not stdout, so --output - pipelines stay clean). New A_-criteria and T-tests for both UI and CLI paths, including the "no disclosure when no reasoning" case.

Commit: a7b9853
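
As a rough illustration of the sizing and array rules in item 3, the following sketch measures a value as compact UTF-8 JSON with sorted keys and drops trailing array elements until it fits. It is one possible reading, not the exporter's code: `canonical_size` and `truncate_array` are hypothetical names, and compact separators plus Python's default key ordering are assumptions about what "canonical" means here.

```python
import json


def canonical_size(value) -> int:
    """Byte length of value as compact UTF-8 JSON with sorted object keys."""
    text = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return len(text.encode("utf-8"))


def truncate_array(items: list, limit_bytes: int) -> tuple[list, bool, int]:
    """Drop trailing elements until the array fits within limit_bytes.

    Returns (kept_items, truncated, original_size_bytes).
    """
    original = canonical_size(items)
    if original <= limit_bytes:
        return list(items), False, original
    kept = list(items)
    while kept and canonical_size(kept) > limit_bytes:
        kept.pop()
    return kept, True, original
```

For example, `truncate_array([1, 2, 3, 4, 5], 7)` keeps `[1, 2, 3]` and reports an original size of 11 bytes. Numbers, booleans, and null never go through this path, which matches the rule that they are never truncated.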

lonexreb commented May 8, 2026

/oz-review

oz-for-oss Bot commented May 8, 2026

@lonexreb

I'm re-reviewing this spec-only pull request in response to a review request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

@oz-for-oss oz-for-oss Bot left a comment

Overview

The spec defines a new machine-readable JSON export for agent conversations, including UI/CLI surfaces, schema versioning, redaction expectations, truncation behavior, and validation tests. The overall structure is complete, but several requirements are internally inconsistent or infeasible as written.

Concerns

  • Reasoning export is default-on even though the spec explicitly acknowledges reasoning can contain sensitive draft thoughts or API keys.
  • String truncation mixes raw UTF-8 byte limits with canonical JSON byte sizing, so implementations cannot satisfy both requirements consistently.
  • The JSON Schema requirements ask the schema to enforce runtime byte-size conditions that JSON Schema cannot know.
  • The test plan requires decode round-tripping despite V1 declaring no re-import/decode path.
  • Plan-step status coverage is ambiguous between top-level plan_step.status and nested items[].status.

Security

  • Reasoning inclusion should use a safe default or explicit per-export opt-in before sensitive reasoning is serialized.

Verdict

Found: 0 critical, 5 important, 1 suggestion

Request changes

Comment thread specs/GH10112/SPEC.md

V1 INCLUDES reasoning blocks (`{ "type": "reasoning", ... }`) by default. Some users will not want reasoning surfaced in shared exports, so V1 ships an explicit toggle:

- Setting: `agent.conversation_export.include_reasoning` — bool, default `true`.

⚠️ [IMPORTANT] [SECURITY] The spec defaults reasoning export to true even though B4.2.1 states reasoning may contain sensitive draft thoughts or API keys. Make the safe default false, or require explicit per-export opt-in for UI and CLI before any reasoning is serialized.
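
Whichever default is chosen, the omission behavior the spec's test plan describes (reasoning blocks removed entirely, with surrounding blocks kept in order) is simple to express. The sketch below is illustrative only: `strip_reasoning` is a hypothetical name, and the caller is assumed to pass the effective per-export value of `include_reasoning`.

```python
def strip_reasoning(content: list[dict], include_reasoning: bool) -> list[dict]:
    """Return content blocks, dropping reasoning blocks when not included.

    No placeholder or "[REDACTED]" stub is left behind; order is preserved.
    """
    if include_reasoning:
        return list(content)
    return [block for block in content if block.get("type") != "reasoning"]
```

Making the flag a required argument rather than a defaulted one forces every call site, UI or CLI, to state explicitly whether reasoning is being serialized.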

Comment thread specs/GH10112/SPEC.md

| `output.type` | Truncation behavior when `original_size_bytes > limit_bytes` |
|---|---|
| `string` | Replace `value` with the first `limit_bytes` bytes of the original UTF-8 string, snapped backward to the nearest valid codepoint boundary so the result is well-formed UTF-8. |

⚠️ [IMPORTANT] B5.1 measures original_size_bytes using canonical JSON bytes, but this rule truncates string values to raw UTF-8 bytes; the serialized value can still exceed limit_bytes because JSON adds quotes/escapes. Define string truncation against canonical serialized size or change B5.1 to measure raw string bytes.
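
One way to resolve the mismatch is to truncate against the serialized size directly. The sketch below finds the longest prefix whose JSON form (quotes and escapes included) fits the limit. `serialized_size` and `truncate_string` are hypothetical names, and Python's `json.dumps` with `ensure_ascii=False` stands in for whatever canonical encoder the exporter uses.

```python
import json


def serialized_size(s: str) -> int:
    """UTF-8 byte length of s as a JSON string, including quotes and escapes."""
    return len(json.dumps(s, ensure_ascii=False).encode("utf-8"))


def truncate_string(s: str, limit_bytes: int) -> str:
    """Longest prefix of s whose JSON-serialized form fits in limit_bytes.

    Slicing a Python str cuts on code points, so the result is always
    well-formed. If limit_bytes < 2, even "" (two quote bytes) cannot fit
    and the empty string is returned anyway.
    """
    if serialized_size(s) <= limit_bytes:
        return s
    lo, hi = 0, len(s)
    # Serialized size grows with prefix length, so binary search applies.
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if serialized_size(s[:mid]) <= limit_bytes:
            lo = mid
        else:
            hi = mid - 1
    return s[:lo]
```

With a 5-byte limit, `"héllo"` truncates to `"hé"` under this rule. Cutting at 5 raw UTF-8 bytes would instead keep `"héll"`, which serializes to 7 bytes and overshoots the limit.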

Comment thread specs/GH10112/SPEC.md
`TypedValue` variants in B2.1.1), `plan_step` (B2.1.2),
`code_diff` (B2.1.3), `image`. Required vs optional fields and
enum values match this spec exactly.
- **Truncation flags**: the schema constrains `truncated`,

⚠️ [IMPORTANT] JSON Schema cannot enforce the runtime iff rule that truncation fields appear only when original_size_bytes > limit_bytes, because the configured limit and canonical byte size are not available to the schema. Restrict this section to schema-enforceable type/field constraints and leave size-dependent checks to tests.

Comment thread specs/GH10112/SPEC.md
- T_reasoning_omitted. Unit: with `include_reasoning = false`, reasoning blocks are omitted entirely from `content` arrays — no empty marker, no `[REDACTED]` stub — and surrounding text blocks remain in their original order. Cover the `[text, reasoning, text]` -> `[text, text]` case explicitly.
- T9. Unit: filename slug matches `<conversation_title-or-id>-<timestamp>.warp-export.json` for several title shapes including titles with spaces, slashes, and unicode.
- T10. Unit: a conversation with an inline image serializes as a `data_uri` content block.
- T11. Unit: unknown content-block `type` encountered during decode round-trips unchanged (forward-compat probe).

⚠️ [IMPORTANT] This requires a decode/round-trip path even though V1 explicitly has no re-import path and Implementation Pointers only define an exporter/schema. Either scope a decoder into V1 or replace this with a schema forward-compat test that does not require decoding unknown block types.

Comment thread specs/GH10112/SPEC.md
- T10. Unit: a conversation with an inline image serializes as a `data_uri` content block.
- T11. Unit: unknown content-block `type` encountered during decode round-trips unchanged (forward-compat probe).
- T_schema_artifact_in_tree. Unit: `app/src/ai/agent/export_schema/v1.0.0/schema.json` exists, parses as JSON Schema draft 2020-12, and matches the generator output of `export_schema.rs` byte-for-byte.
- T_plan_step_items_concrete. Unit: a plan step exercising every `status` value (including `blocked` with a non-null `blocked_reason` and `cancelled`), with mixed `subitems` depths, round-trips losslessly and validates against the schema.

⚠️ [IMPORTANT] plan_step status is defined as only pending | in_progress | complete, while blocked/cancelled live on items[]; this test reads as requiring top-level plan steps to exercise blocked and cancelled. Reword it to plan_step.items or extend the top-level plan_step.status enum.

Comment thread specs/GH10112/SPEC.md
contain those fields — they only appear when an actual truncation
occurred.

#### B5.3 Worked example (object)

💡 [SUGGESTION] The heading says object, but the example is a string output; rename it so implementers do not look for an object truncation fixture.

Suggested change
- #### B5.3 Worked example (object)
+ #### B5.3 Worked example (string)
