.github/workflows/aw-failure-investigator.md
Lines changed: 17 additions & 13 deletions
@@ -1,5 +1,5 @@
 ---
-description: Investigates [aw] failures from the last 6 hours, correlates with open agentic-workflows issues, and opens a parent report with fix sub-issues
+description: Investigates [aw] failures from the last 6 hours, correlates with open agentic-workflows issues, closes fixed issues, and opens focused fix sub-issues when needed
 on:
   schedule:
     - cron: "every 6h"
@@ -22,10 +22,13 @@ safe-outputs:
     expires: 7d
     title-prefix: "[aw-failures] "
     labels: [agentic-workflows, automation, cookie]
-    max: 8
+    max: 2
     group: true
+  update-issue:
+    target: "*"
+    max: 10
   link-sub-issue:
-    max: 20
+    max: 10
   noop:
 timeout-minutes: 60
 imports:
@@ -49,7 +52,7 @@ Investigate agentic workflow failures from the last 6 hours and produce actionab
 1. Find recent failures from agentic workflows in the last 6 hours.
 2. Correlate findings with currently open `agentic-workflows` issues.
-4. Create one parent report issue and linked sub-issues proposing concrete fixes.
+4. Close fixed/stale issues first, then create only the minimum necessary linked fix sub-issues.

 ## Required Investigation Steps

@@ -91,16 +94,15 @@ Use `agentic-workflows` MCP `audit-diff` to compare:

 Identify regressions and deltas (metrics/tooling/firewall/MCP behavior) that support fix recommendations.

-### 5) Create parent report issue + sub-issues
+### 5) Close fixed issues first, then add focused sub-issues

-Create a **single parent report issue** with a temporary ID (format `aw_` + 3-8 alphanumeric characters) summarizing:
-- observed failure clusters in last 6h
-- links to analyzed run IDs
-- evidence from logs/audit/audit-diff
-- mapping to existing open issues (duplicate / related / new)
-- prioritized fix plan
+First, identify currently open `agentic-workflows` issues that are now fixed, stale, or no longer actionable based on fresh evidence, and close them using `update-issue`.

-Then create **sub-issues** (linked to the parent) for concrete fixes. Each sub-issue must include:
+Then, if new uncovered work remains, add **sub-issues** for concrete fixes to the **most recent open parent report issue** instead of creating a new parent by default.
+
+Only create a new parent report issue (temporary ID format `aw_` + 3-8 alphanumeric characters) when **P0 failures have no existing tracking coverage**.
+
+Each new sub-issue must include:
 - clear problem statement
 - affected workflows and run IDs
 - probable root cause
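
The temporary ID format quoted in this hunk (`aw_` + 3-8 alphanumeric characters) can be checked with a short regular expression. The sketch below is only an illustration: the names `TEMP_ID_PATTERN` and `is_valid_temp_id` are hypothetical, and reading "alphanumeric" as ASCII letters and digits is an assumption, not something the workflow file states.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits only.
# The pattern is the literal prefix "aw_" followed by 3 to 8 such characters.
TEMP_ID_PATTERN = re.compile(r"aw_[A-Za-z0-9]{3,8}")


def is_valid_temp_id(value: str) -> bool:
    """Return True if value matches the assumed temporary ID format."""
    # fullmatch requires the whole string to match, so no anchors are needed.
    return TEMP_ID_PATTERN.fullmatch(value) is not None
```

A check like this could run before a parent report issue is emitted, so that a malformed temporary ID is caught early rather than when sub-issues try to link to it.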
@@ -128,7 +130,9 @@ Include these sections:
 ## Decision Rules

 - If there are **no failures** in the last 6h, or no actionable delta vs existing issues, call `noop` with a concise reason.
-- If failures exist but are already fully tracked, update by creating a minimal parent report that links to existing issues and only create new sub-issues for uncovered gaps.
+- If failures exist but are already fully tracked, prefer closing stale/fixed issues and avoid creating new issues.
+- Only create a new parent report issue when P0 failures have no existing tracking coverage.
+- Prefer closing stale/fixed issues over creating new issues when issue volume is high.
 - Always be explicit about confidence and unknowns.

 **Important**: If no action is needed after completing your analysis, you **MUST** call the `noop` safe-output tool with a brief explanation.
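
The revised decision rules can be read as a small decision procedure. The following is a minimal sketch of one possible reading, not the agent's actual logic: `Findings`, `Action`, and `decide` are hypothetical names, and treating zero failures as an immediate `noop` (even when closable issues exist) follows the first rule literally, which is an assumption about intent.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Action(Enum):
    NOOP = "noop"
    CLOSE_ISSUES = "close fixed/stale issues via update-issue"
    ADD_SUB_ISSUES = "add sub-issues to the most recent open parent"
    CREATE_PARENT = "create a new parent report issue"


@dataclass
class Findings:
    failures: int          # failures observed in the last 6h
    closable_issues: int   # open issues that now look fixed or stale
    uncovered_gaps: int    # failures with no matching open issue
    untracked_p0: bool     # P0 failures with no existing tracking coverage


def decide(f: Findings) -> List[Action]:
    """Map findings to an ordered list of actions under the assumed reading."""
    if f.failures == 0:
        # First rule, read literally: no failures means noop.
        return [Action.NOOP]
    actions: List[Action] = []
    if f.closable_issues > 0:
        # Closing comes first, before any new issue is considered.
        actions.append(Action.CLOSE_ISSUES)
    if f.untracked_p0:
        # The only case in which a new parent report is created.
        actions.append(Action.CREATE_PARENT)
    elif f.uncovered_gaps > 0:
        actions.append(Action.ADD_SUB_ISSUES)
    # No actionable delta vs existing issues also ends in noop.
    return actions or [Action.NOOP]
```

Under this reading, failures that are already fully tracked produce at most a `CLOSE_ISSUES` action, which matches the diff's intent of reducing issue volume rather than opening a new parent on every run.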