Commit 00c0215

AGENTS: document learnings from split-index + fsmonitor investigation

While investigating a CI failure in the `linux-TEST-vars` job caused by
the interaction between the `pt/fsmonitor-linux` and
`hn/git-checkout-m-with-stash` topics in `seen`, several debugging
techniques proved essential and were not previously documented.

The investigation required bisecting the first-parent history of `seen`
while temporarily merging the fsmonitor topic at each step. This revealed
that `GIT_TEST_SPLIT_INDEX=yes` corrupts the bisect machinery's own index
operations unless it is unset before cleanup checkouts. It also revealed
that `fprintf(stderr, ...)` instrumentation in Git's C code is swallowed
by the test framework, making Trace2 the correct instrumentation approach.

A key insight was that the bug appeared Linux-specific only because
`linux-TEST-vars` is the sole CI job setting `GIT_TEST_SPLIT_INDEX=yes`;
there is no macOS or Windows equivalent. The actual root cause (the
`index.skipHash=true` + split-index interaction producing a null
`base_oid` in the shared index) is platform-independent.

Add four documentation sections capturing these learnings: bisecting
`seen` interactions, reproducing with exact CI variables, verifying CI
platform coverage before concluding platform-specificity, and using
Trace2 for instrumentation inside the test framework.

Assisted-by: Claude Opus 4.6
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

1 parent c1e4522 commit 00c0215

1 file changed: +67 -0 lines changed

AGENTS.md

Lines changed: 67 additions & 0 deletions
@@ -454,6 +454,30 @@ Key trace messages:
- `gh_client__get_immediate: <oid>` - Object fetched immediately
- `gh-client/objects/post` - Batch POST request region

### Instrumenting Git Internals During Tests

When adding debug output to Git's C code during test investigation,
`fprintf(stderr, ...)` from git subprocesses spawned by the test framework
is typically swallowed (redirected or discarded by the test harness). Use
Trace2 instead:

```c
trace2_data_intmax("index", NULL, "my_debug/cache_nr", istate->cache_nr);
trace2_data_string("index", NULL, "my_debug/state", some_string);
```

Then run the test with `GIT_TRACE2_EVENT` or `GIT_TRACE2_PERF` pointing to
a file, and grep the output. This integrates with Git's existing tracing
infrastructure and survives the test framework's output management.
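
One way to pull such data events back out of the trace is sketched below. It
assumes the `GIT_TRACE2_EVENT` target is a file with one compact JSON object
per line and that the instrumentation used the illustrative `my_debug/` key
prefix; the helper name `trace2_debug_events` and the test script name in the
comment are hypothetical.

```bash
# Print the Trace2 data events whose key starts with the given prefix.
# Usage: trace2_debug_events <event-file> [<key-prefix>]
trace2_debug_events () {
	# Each event is assumed to be a single JSON object on its own line,
	# so a fixed-string grep on the "key" field is enough for debugging.
	grep -F "\"key\":\"${2:-my_debug/}" "$1"
}

# Hypothetical invocation from the t/ directory (script name is a placeholder;
# the trace target is given as an absolute path):
#   GIT_TRACE2_EVENT="$PWD/trace.json" ./t1234-example.sh -i -v
#   trace2_debug_events trace.json
```

Filtering on a dedicated key prefix keeps the temporary instrumentation apart
from the events Git emits on its own.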

As a last resort (e.g. when Trace2 is not initialized yet at the point you
need to instrument), write to a fixed file path:

```c
FILE *f = fopen("/tmp/debug.log", "a");
if (f) { fprintf(f, "state: %u\n", value); fclose(f); }
```

### Comparing Branches After Rebase

```bash
@@ -488,6 +512,32 @@ away from the original branch point the commit is cherry-picked to, so it
often makes sense to squash both old and new downstream changes, and then
to "interpolate" between them when encountering merge conflicts).

### Bisecting Failures in `seen`

When a topic passes on its own but fails after being merged to `seen`, the
failure is likely caused by interaction with another in-flight topic. To
identify the culprit:

1. Fetch the exact `seen` commit from the failing CI run (get the SHA from
   the workflow run metadata via the GitHub API).
2. Use a worktree checked out at that `seen` commit.
3. Bisect the first-parent history between `upstream/master` and `seen~1`
   (excluding the topic's own merge). At each bisection step, merge the
   topic in temporarily, build, run the test, then undo the merge.
4. Write a `git bisect run` script that automates this. Key pitfalls:
   - The script must `unset` test environment variables (especially
     `GIT_TEST_SPLIT_INDEX`) before cleanup operations like
     `git checkout -f`, otherwise the worktree's own index can get
     corrupted.
   - Use `git checkout -f "$ORIG"` (not `git reset --hard`) to undo the
     temporary merge, since `reset --hard` under split-index can corrupt
     the index.
   - Save the current commit OID at the start (`ORIG=$(git rev-parse HEAD)`)
     because `ORIG_HEAD` is unreliable during bisect.
   - On merge conflict, run `git merge --abort` and return 125 (skip).
5. Store the command for running with the full set of CI test variables as
   a repository-local alias (to avoid repeating the long export list and to
   allow the user to approve the tool call once).
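
The steps and pitfalls above might be combined into a helper for a
`git bisect run` script along these lines. This is a minimal sketch under
stated assumptions: the function name `bisect_step` and its three arguments
(topic ref, build command, test command) are placeholders, not part of Git.

```bash
# bisect_step <topic> <build-command> <test-command>
# Returns 0 (good), 1 (bad), 125 (skip) or 128 (abort the bisection).
bisect_step () {
	topic=$1 build_cmd=$2 test_cmd=$3

	# Remember where bisect put us; ORIG_HEAD is unreliable during bisect.
	orig=$(git rev-parse HEAD) || return 128

	# Temporarily merge the topic; skip this commit if the merge fails.
	if ! git merge --no-edit "$topic" >/dev/null 2>&1
	then
		unset GIT_TEST_SPLIT_INDEX
		git merge --abort
		return 125
	fi

	# Build and test in subshells; an unbuildable commit is skipped.
	if ! (eval "$build_cmd")
	then
		result=125
	elif (eval "$test_cmd")
	then
		result=0
	else
		result=1
	fi

	# Drop test variables before touching the worktree's own index,
	# then undo the temporary merge with checkout -f (not reset --hard).
	unset GIT_TEST_SPLIT_INDEX
	git checkout -f "$orig" >/dev/null 2>&1 || return 128
	return $result
}
```

A wrapper script that defines this function and ends with a call such as
`bisect_step <topic> 'make' '<test command>'` could then be handed to
`git bisect run`; exit codes above 127 make `git bisect run` stop instead of
marking the commit.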

### CI/Workflow Failure Investigation

When a CI workflow fails, the debugging process has a high cost per iteration.
@@ -519,6 +569,23 @@ locally with faster turnaround:
- For build failures: replicate the build environment and commands.
- For macOS issues: if you lack a Mac, at least trace the Makefile logic
  to understand what flags should be set and why.
- For test failures that only appear in specific CI jobs (like
  `linux-TEST-vars`): reproduce with the _exact_ set of environment
  variables that job sets. Check `ci/run-build-and-tests.sh` for the
  job's variable block. Do not assume a single variable (e.g.
  `GIT_TEST_SPLIT_INDEX`) is sufficient; other variables may contribute
  to the failure path.
- When a test fails in `seen` but not on the topic branch alone, check
  out the exact `seen` commit from the failing CI run (get the SHA from
  the workflow run metadata) and reproduce against that. The interaction
  with other in-flight topics is the likely cause.
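
Keeping the job's variables in a small file makes it harder to drop one by
accident. The sketch below assumes a hand-maintained file of `NAME=value`
lines copied from the job's block in `ci/run-build-and-tests.sh`; the helper
name `run_with_vars`, the file path and the test script name are hypothetical,
and only `GIT_TEST_SPLIT_INDEX=yes` is taken from the text above.

```bash
# run_with_vars <vars-file> <command>...
# Run the command with every NAME=value assignment from the file exported.
# The function body is a subshell, so the caller's environment is untouched.
run_with_vars () (
	vars_file=$1
	shift
	set -a            # auto-export everything assigned while sourcing
	. "$vars_file"    # give a path with a slash so `.` does not search $PATH
	set +a
	"$@"
)

# Hypothetical usage from the t/ directory:
#   printf 'GIT_TEST_SPLIT_INDEX=yes\n' >/tmp/ci-vars  # plus the job's other variables
#   run_with_vars /tmp/ci-vars ./t1234-example.sh -i -v
```

Because nothing is exported into the calling shell, later cleanup commands in
the same session do not inherit the test variables.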

**5. Do not assume CI coverage from platform support.** When asking "why
does platform X not see this bug?", verify whether CI actually tests that
combination on that platform. For example, `GIT_TEST_SPLIT_INDEX=yes` is
only set by `linux-TEST-vars`; there is no equivalent `osx-TEST-vars` or
`windows-TEST-vars` job. A bug that only manifests under split-index
testing may be present on all platforms but only caught on Linux.
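
A quick way to check this is to search the CI definitions for the variable in
question. The sketch assumes the job definitions live under `ci/` and
`.github/workflows/` (adjust to the tree at hand); the helper name
`files_mentioning` is hypothetical.

```bash
# List the files below the given paths that mention a variable name.
# Usage: files_mentioning <variable> <path>...
files_mentioning () {
	var=$1
	shift
	# -r: recurse, -l: print file names only, -F: match the name literally
	grep -rlF -e "$var" -- "$@" | sort
}

# e.g., from the top of the worktree:
#   files_mentioning GIT_TEST_SPLIT_INDEX ci .github/workflows
```

If the variable is only set inside one job's block, the gap in coverage on
other platforms is a property of CI, not of the bug.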

**5. Add comprehensive diagnostics on first attempt.** If you must push to
CI to test, make that push count:
