docs(readme): reconcile platform table + Testing section to actual reality

normand-marineau · normand-marineau · commit 957cc63201f8 · 2026-05-03T07:37:03.000-04:00
Reconciles the README's stale platform-compatibility claims and test-suite description to ground truth, addressing Reviewer 2's concern #4 directly ("README identifies macOS/Linux as 'In progress,' but the paper claims they are supported. This must be fixed.") and extending the same README/reality-alignment principle to the immediately adjacent Testing section.

EVIDENCE: four consecutive 16-cell green CI matrix runs since Conv 7 / Commit 1 added .github/workflows/test.yml:

  CI #1 (cf42ec6) - 16 of 16 green
  CI #2 (8e7d521) - 16 of 16 green
  CI #3 (dce0352) - 16 of 16 green
  CI #4 (81d4e0c) - 16 of 16 green

The matrix covers {ubuntu-22.04, ubuntu-24.04, macos-14, windows-latest} x {Python 3.10, 3.11, 3.12, 3.13} which substantially exceeds the §10 acceptance gate ("Windows + ≥2 of {Ubuntu 22.04, Ubuntu 24.04, macOS 14}").

CHANGES:

Edit 1 - Platform compatibility table (lines 229-237 pre-edit):

* macOS row: "12+ - In progress; testing underway" replaced by "14+ (Apple Silicon) - Verified by CI; macos-14 runner, Python 3.10-3.13".
* Windows row: "Developed and tested - Primary development platform" replaced by "Verified by CI; windows-latest (Windows Server 2022 runner), Python 3.10-3.13" - same operational claim now grounded in CI evidence rather than developer attestation.
* Linux row: "Ubuntu 24.04 - Tested (Docker, headless); Test suite executes via Dockerfile in repo" replaced by "Ubuntu 22.04 / 24.04 - Verified by CI; ubuntu-22.04 and ubuntu-24.04 LTS runners, Python 3.10-3.13" - both LTSes are tested in CI; Docker is no longer the only verification path.
* Closing paragraph: removed "Cross-platform validation is currently being conducted and will be documented here upon completion" (no longer true) and replaced with a direct link to the live GitHub Actions matrix.
Edit 2 - Testing section prose + per-module breakdown (lines 241-258 pre-edit):

* Prose count: "73 automated tests" -> "104 automated tests" with expanded scope statement covering all the categories actually exercised (deterministic filters, evidence gating, plugin imports, bundle integrity, repo metadata, per-stage regression goldens).
* Per-module breakdown table rebuilt: 5 stale rows (summing to 63 listed; prose was inconsistent with the table by 10) -> 7 accurate rows summing to 104, with a Total row added for at-a-glance verification. Hybrid design consolidates 5 small per-stage regression files (1+1+3+3+1=9 tests) into one row "Per-stage regression suites" rather than inflating the table to 11 individual rows.

Counts verified by `pytest --co -q` against the post-C4 sandbox:

  test_bundle_integrity.py:       12
  test_criteria_parser.py:        16
  test_deterministic_filters.py:  15
  test_evidence_gating.py:        23
  test_imports.py:                27
  test_metadata.py:                2
  Per-stage regression:            9 (eh:1, ih:1, el:3, il:3, harmoniser:1)
  Total:                         104 (= 103 passed + 1 xfailed)

SCOPE NOTE: CONV7_handoff.md sec 4 specifies the platform-table reconciliation explicitly. The Testing-section update extends the same README/reality-alignment principle to staleness encountered in the immediately adjacent prose - a credibility precondition rather than a feature. The §12 anti-pattern that warns against scope creep specifically targets *features* (time estimation O8, confidence scoring O9), not documentation accuracy.

NO test changes. NO code changes. NO plugin changes. Test count unchanged: 103 passed, 1 xfailed.

Spec: see CONV7_handoff.md sec 4 (platform-table directive), sec 5.2 (acceptable CI failure modes / honest documentation), and sec 10 acceptance gate items 2 and 3.
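
The arithmetic quoted in this message (16 matrix cells, 104 tests in the rebuilt table, 63 in the stale table) can be rechecked with a short sketch. The variable and function names below are hypothetical and not part of the repository; the numbers are copied from the message above and from the removed table rows in the diff.

```python
from itertools import product

# Runner labels and Python versions as listed in the commit message.
RUNNERS = ["ubuntu-22.04", "ubuntu-24.04", "macos-14", "windows-latest"]
PYTHONS = ["3.10", "3.11", "3.12", "3.13"]

# Per-file counts as reported in the commit message (`pytest --co -q`).
NEW_COUNTS = {
    "test_bundle_integrity.py": 12,
    "test_criteria_parser.py": 16,
    "test_deterministic_filters.py": 15,
    "test_evidence_gating.py": 23,
    "test_imports.py": 27,
    "test_metadata.py": 2,
}

# Per-stage regression files, consolidated into a single README row.
REGRESSION_COUNTS = {"eh": 1, "ih": 1, "el": 3, "il": 3, "harmoniser": 1}

# Counts from the five table rows removed by the diff.
OLD_COUNTS = {
    "test_criteria_parser.py": 16,
    "test_deterministic_filters.py": 11,
    "test_evidence_gating.py": 17,
    "test_bundle_integrity.py": 10,
    "test_imports.py": 9,
}
OLD_PROSE_COUNT = 73  # "73 automated tests" in the pre-edit README prose


def matrix_cells():
    """Return every (runner, python) pair in the CI matrix."""
    return list(product(RUNNERS, PYTHONS))


def new_total():
    """Sum the rebuilt table, including the consolidated regression row."""
    return sum(NEW_COUNTS.values()) + sum(REGRESSION_COUNTS.values())


def old_table_gap():
    """Difference between the old prose count and the old table sum."""
    return OLD_PROSE_COUNT - sum(OLD_COUNTS.values())


print("matrix cells:", len(matrix_cells()))
print("new table total:", new_total())
print("old table sum:", sum(OLD_COUNTS.values()))
print("old prose/table gap:", old_table_gap())
```

This only checks that the quoted figures are internally consistent; it does not collect tests or query CI, so the authoritative numbers remain the `pytest --co -q` output and the GitHub Actions runs.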
diff --git a/README.md b/README.md
@@ -229,32 +229,35 @@ All dependencies are listed in `requirements.txt`.
 
 | Platform | Status | Notes |
 |----------|--------|-------|
-| Windows 10+ | ✅ Developed and tested | Primary development platform |
-| macOS 12+ | 🔄 In progress | Tkinter is included with Python on macOS; testing underway |
-| Linux (Ubuntu 24.04) | ✅ Tested (Docker, headless) | Test suite executes via Dockerfile in repo |
+| Windows 10+ | ✅ Verified by CI | `windows-latest` (Windows Server 2022 runner), Python 3.10–3.13 |
+| macOS 14+ (Apple Silicon) | ✅ Verified by CI | `macos-14` runner, Python 3.10–3.13 |
+| Linux (Ubuntu 22.04 / 24.04) | ✅ Verified by CI | `ubuntu-22.04` and `ubuntu-24.04` LTS runners, Python 3.10–3.13 |
 
-The application is pure Python with no compiled extensions. It is expected to work on any platform supporting Python 3.10+ and Tkinter. Cross-platform validation is currently being conducted and will be documented here upon completion.
+The application is pure Python with no compiled extensions and runs on any platform supporting Python 3.10+ and Tkinter. Cross-platform compatibility is continuously verified by the GitHub Actions matrix on every push; see the [live CI status](https://github.com/lars-ulaval/metaScreener/actions/workflows/test.yml) for current run results.
 
 ---
 
 ## Testing
 
-The project includes 73 automated tests covering the deterministic components of the pipeline. No OpenAI API key, network access, or graphical display server is required.
+The project includes 104 automated tests covering the deterministic components of the pipeline as well as quote-based evidence gating, plugin imports, bundle integrity, repo metadata consistency, and per-stage regression goldens. No OpenAI API key, network access, or graphical display server is required.
 
 ```bash
 pip install pytest
 python -m pytest tests/ -v
 ```
 
-The test suite covers four areas:
+The test suite covers seven areas:
 
 | Module | Tests | Coverage |
 |--------|-------|----------|
-| `test_criteria_parser.py` | 16 | Free-text parsing, operator/stage inference |
-| `test_deterministic_filters.py` | 11 | EH/IH `_eval_criterion` for all operator types |
-| `test_evidence_gating.py` | 17 | Quote validation, SHA-256 hashing, cache key construction |
-| `test_bundle_integrity.py` | 10 | Bundle ZIP structure, manifest schema, hash verification |
-| `test_imports.py` | 9 | Module import smoke tests, plugin_manager sanitizer |
+| `test_criteria_parser.py` | 16 | Free-text criteria parsing, operator/stage inference |
+| `test_deterministic_filters.py` | 15 | EH/IH `_eval_criterion` for all operator types |
+| `test_evidence_gating.py` | 23 | Quote validation, SHA-256 hashing, cache key construction |
+| `test_bundle_integrity.py` | 12 | Bundle ZIP structure, manifest schema, hash verification |
+| `test_imports.py` | 27 | Module imports, plugin shim regression, cache-key invariants |
+| `test_metadata.py` | 2 | Repo metadata consistency (version match, README CI badge) |
+| Per-stage regression suites | 9 | Byte-identity goldens for the EH, IH, EL, IL, and Harmoniser plugins (one file per stage) |
+| **Total** | **104** | |
 
 ### Refactoring safety: static import audit