Commit 957cc63

docs(readme): reconcile platform table + Testing section to actual reality
Reconciles the README's stale platform-compatibility claims and test-suite description to ground truth, addressing Reviewer 2's concern #4 directly ("README identifies macOS/Linux as 'In progress,' but the paper claims they are supported. This must be fixed.") and extending the same README/reality-alignment principle to the immediately adjacent Testing section.

EVIDENCE: four consecutive 16-cell green CI matrix runs since Conv 7 / Commit 1 added .github/workflows/test.yml:

  CI #1 (cf42ec6) - 16 of 16 green
  CI #2 (8e7d521) - 16 of 16 green
  CI #3 (dce0352) - 16 of 16 green
  CI #4 (81d4e0c) - 16 of 16 green

The matrix covers {ubuntu-22.04, ubuntu-24.04, macos-14, windows-latest} x {Python 3.10, 3.11, 3.12, 3.13}, which substantially exceeds the §10 acceptance gate ("Windows + ≥2 of {Ubuntu 22.04, Ubuntu 24.04, macOS 14}").

CHANGES:

Edit 1 - Platform compatibility table (lines 229-237 pre-edit):

* macOS row: "12+ - In progress; testing underway" replaced by "14+ (Apple Silicon) - Verified by CI; macos-14 runner, Python 3.10-3.13".
* Windows row: "Developed and tested - Primary development platform" replaced by "Verified by CI; windows-latest (Windows Server 2022 runner), Python 3.10-3.13" - same operational claim, now grounded in CI evidence rather than developer attestation.
* Linux row: "Ubuntu 24.04 - Tested (Docker, headless); Test suite executes via Dockerfile in repo" replaced by "Ubuntu 22.04 / 24.04 - Verified by CI; ubuntu-22.04 and ubuntu-24.04 LTS runners, Python 3.10-3.13" - both LTSes are tested in CI; Docker is no longer the only verification path.
* Closing paragraph: removed "Cross-platform validation is currently being conducted and will be documented here upon completion" (no longer true) and replaced with a direct link to the live GitHub Actions matrix.
Edit 2 - Testing section prose + per-module breakdown (lines 241-258 pre-edit):

* Prose count: "73 automated tests" -> "104 automated tests" with expanded scope statement covering all the categories actually exercised (deterministic filters, evidence gating, plugin imports, bundle integrity, repo metadata, per-stage regression goldens).
* Per-module breakdown table rebuilt: 5 stale rows (summing to 63 listed; prose was inconsistent with the table by 10) -> 7 accurate rows summing to 104, with a Total row added for at-a-glance verification. Hybrid design consolidates 5 small per-stage regression files (1+1+3+3+1=9 tests) into one row "Per-stage regression suites" rather than inflating the table to 11 individual rows.

Counts verified by `pytest --co -q` against the post-C4 sandbox:

  test_bundle_integrity.py:       12
  test_criteria_parser.py:        16
  test_deterministic_filters.py:  15
  test_evidence_gating.py:        23
  test_imports.py:                27
  test_metadata.py:                2
  Per-stage regression:            9 (eh:1, ih:1, el:3, il:3, harmoniser:1)
  Total:                         104 (= 103 passed + 1 xfailed)

SCOPE NOTE: CONV7_handoff.md sec 4 specifies the platform-table reconciliation explicitly. The Testing-section update extends the same README/reality-alignment principle to staleness encountered in the immediately adjacent prose - a credibility precondition rather than a feature. The §12 anti-pattern that warns against scope creep specifically targets *features* (time estimation O8, confidence scoring O9), not documentation accuracy.

NO test changes. NO code changes. NO plugin changes. Test count unchanged: 103 passed, 1 xfailed.

Spec: see CONV7_handoff.md sec 4 (platform-table directive), sec 5.2 (acceptable CI failure modes / honest documentation), and sec 10 acceptance gate items 2 and 3.
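
The "16-cell" figure and the §10 acceptance gate quoted above can be checked with a few lines of Python. This is only an illustrative sketch, not code from the repository: the function names `matrix_cells` and `meets_acceptance_gate` are hypothetical, and mapping the gate's platform names ("Windows", "Ubuntu 22.04", "Ubuntu 24.04", "macOS 14") onto the runner labels used in the matrix is an assumption.

```python
from itertools import product

# Matrix axes exactly as listed in the commit message.
OS_RUNNERS = ["ubuntu-22.04", "ubuntu-24.04", "macos-14", "windows-latest"]
PYTHON_VERSIONS = ["3.10", "3.11", "3.12", "3.13"]


def matrix_cells(runners, versions):
    """Return every (runner, python_version) pair in the CI matrix."""
    return list(product(runners, versions))


def meets_acceptance_gate(green_runners):
    """Gate: Windows plus at least 2 of {Ubuntu 22.04, Ubuntu 24.04, macOS 14}.

    Assumption: each gate platform corresponds to the runner label below.
    """
    green = set(green_runners)
    non_windows = {"ubuntu-22.04", "ubuntu-24.04", "macos-14"}
    return "windows-latest" in green and len(green & non_windows) >= 2


cells = matrix_cells(OS_RUNNERS, PYTHON_VERSIONS)
print(len(cells))                         # 16
print(meets_acceptance_gate(OS_RUNNERS))  # True
```

With all four runners green the gate is met with one platform to spare, which is the sense in which the matrix "exceeds" it; a run with only Windows and one other platform green would not pass.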
1 parent 81d4e0c commit 957cc63

1 file changed

Lines changed: 14 additions & 11 deletions

README.md

@@ -229,32 +229,35 @@ All dependencies are listed in `requirements.txt`.
 
 | Platform | Status | Notes |
 |----------|--------|-------|
-| Windows 10+ |Developed and tested | Primary development platform |
-| macOS 12+ | 🔄 In progress | Tkinter is included with Python on macOS; testing underway |
-| Linux (Ubuntu 24.04) |Tested (Docker, headless) | Test suite executes via Dockerfile in repo |
+| Windows 10+ |Verified by CI | `windows-latest` (Windows Server 2022 runner), Python 3.10–3.13 |
+| macOS 14+ (Apple Silicon) | ✅ Verified by CI | `macos-14` runner, Python 3.10–3.13 |
+| Linux (Ubuntu 22.04 / 24.04) |Verified by CI | `ubuntu-22.04` and `ubuntu-24.04` LTS runners, Python 3.10–3.13 |
 
-The application is pure Python with no compiled extensions. It is expected to work on any platform supporting Python 3.10+ and Tkinter. Cross-platform validation is currently being conducted and will be documented here upon completion.
+The application is pure Python with no compiled extensions and runs on any platform supporting Python 3.10+ and Tkinter. Cross-platform compatibility is continuously verified by the GitHub Actions matrix on every push; see the [live CI status](https://github.com/lars-ulaval/metaScreener/actions/workflows/test.yml) for current run results.
 
 ---
 
 ## Testing
 
-The project includes 73 automated tests covering the deterministic components of the pipeline. No OpenAI API key, network access, or graphical display server is required.
+The project includes 104 automated tests covering the deterministic components of the pipeline as well as quote-based evidence gating, plugin imports, bundle integrity, repo metadata consistency, and per-stage regression goldens. No OpenAI API key, network access, or graphical display server is required.
 
 ```bash
 pip install pytest
 python -m pytest tests/ -v
 ```
 
-The test suite covers four areas:
+The test suite covers seven areas:
 
 | Module | Tests | Coverage |
 |--------|-------|----------|
-| `test_criteria_parser.py` | 16 | Free-text parsing, operator/stage inference |
-| `test_deterministic_filters.py` | 11 | EH/IH `_eval_criterion` for all operator types |
-| `test_evidence_gating.py` | 17 | Quote validation, SHA-256 hashing, cache key construction |
-| `test_bundle_integrity.py` | 10 | Bundle ZIP structure, manifest schema, hash verification |
-| `test_imports.py` | 9 | Module import smoke tests, plugin_manager sanitizer |
+| `test_criteria_parser.py` | 16 | Free-text criteria parsing, operator/stage inference |
+| `test_deterministic_filters.py` | 15 | EH/IH `_eval_criterion` for all operator types |
+| `test_evidence_gating.py` | 23 | Quote validation, SHA-256 hashing, cache key construction |
+| `test_bundle_integrity.py` | 12 | Bundle ZIP structure, manifest schema, hash verification |
+| `test_imports.py` | 27 | Module imports, plugin shim regression, cache-key invariants |
+| `test_metadata.py` | 2 | Repo metadata consistency (version match, README CI badge) |
+| Per-stage regression suites | 9 | Byte-identity goldens for the EH, IH, EL, IL, and Harmoniser plugins (one file per stage) |
+| **Total** | **104** | |
 
 ### Refactoring safety: static import audit
 
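
The row counts in the old and new Testing tables can be re-added mechanically, which also reproduces the 10-test gap between the old prose ("73") and the old table. The sketch below is illustrative only: the dictionaries copy the numbers from this commit, while `count_by_file` is a hypothetical helper that assumes `pytest --co -q` prints one `path::test_name` node ID per line, and the node IDs in `sample` are made up.

```python
from collections import Counter

# Test counts copied from the removed (old) and added (new) table rows.
OLD_TABLE = {
    "test_criteria_parser.py": 16,
    "test_deterministic_filters.py": 11,
    "test_evidence_gating.py": 17,
    "test_bundle_integrity.py": 10,
    "test_imports.py": 9,
}
NEW_TABLE = {
    "test_criteria_parser.py": 16,
    "test_deterministic_filters.py": 15,
    "test_evidence_gating.py": 23,
    "test_bundle_integrity.py": 12,
    "test_imports.py": 27,
    "test_metadata.py": 2,
    "Per-stage regression suites": 9,
}

old_total = sum(OLD_TABLE.values())
new_total = sum(NEW_TABLE.values())
print(old_total, 73 - old_total)  # 63 10
print(new_total)                  # 104


def count_by_file(collect_output: str) -> Counter:
    """Tally collected tests per file from `pytest --co -q` style output.

    Assumption: each collected test appears as a `path::name` node ID;
    lines without `::` (such as the summary line) are ignored.
    """
    counts = Counter()
    for line in collect_output.splitlines():
        if "::" in line:
            counts[line.split("::", 1)[0]] += 1
    return counts


# Hypothetical node IDs, for illustration only.
sample = """tests/test_metadata.py::test_a
tests/test_metadata.py::test_b
tests/test_criteria_parser.py::test_c

3 tests collected in 0.01s
"""
print(count_by_file(sample)["tests/test_metadata.py"])  # 2
```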
