
Commit 81d4e0c

docs: add local-model providers section to README
Promotes the existing `### LLM endpoint compatibility` subsection from inside `## Configuration` to a top-level `## Using local LLM providers` section, expanded with per-provider paragraphs covering Ollama, llama.cpp, and vLLM/DeepSeek per CONV7_handoff.md sec 5.4. This is the documentation half of Reviewer 2's optional recommendation O7 ("API Flexibility: Support for local models... will further improve metaScreener"). The technical capability has existed since the project's first release - any OpenAI-compatible endpoint just works once OPENAI_BASE_URL is set - but it was previously buried in a single-paragraph subsection. The new structure gives the topic visibility appropriate to its scope and provides concrete, copy-paste-ready environment variable settings for each common provider.

WHAT THE NEW SECTION COVERS:

* Opening paragraph stating that metaScreener targets any OpenAI-compatible endpoint, with a bulleted summary distinguishing hosted commercial APIs (Azure OpenAI, DeepSeek) from locally hosted models (Ollama, llama.cpp, vLLM).
* The OPENAI_BASE_URL / OPENAI_API_KEY / Model field contract, explained once at section level so per-provider paragraphs don't need to repeat it.
* ### Ollama subsection: endpoint URL, install/pull workflow, Model field guidance.
* ### llama.cpp subsection: llama-server invocation, endpoint URL, note that the Model field is informational when running llama.cpp directly (the server uses whichever model is currently loaded).
* ### vLLM and DeepSeek subsection: vLLM as a high-throughput self-hosted alternative; DeepSeek as a hosted alternative with larger context windows than GPT-4o-mini.
* Closing evidence-gating caveat (preserved VERBATIM from the previous subsection per sec 5.4: open-weight model compatibility with the evidence-gating protocol has not been formally tested; users testing local models are invited to file feedback).

WHAT WAS REMOVED:

* The previous `### LLM endpoint compatibility` subsection inside `## Configuration` (8 lines). Its content is fully absorbed into the new top-level section, with the bullet list of compatible backends restructured and expanded. The verbatim caveat is preserved word-for-word as the closing note. `## Configuration` retains its `### Environment variables` subsection unchanged; only the LLM-endpoint subsection is moved out.

INVARIANTS PRESERVED:

* Test count unchanged (no test changes): 103 passed, 1 xfailed.
* The README badge regression test added in C0 (test_readme_tested_on_badge_lists_actual_ci_platforms) still passes via the GitHub Actions CI badge present from C1.
* No code changes; no plugin changes; no test changes.

Spec: see CONV7_handoff.md sec 4 ("Add README section on local-model providers") and sec 5.4 (where it goes; verbatim caveat directive).
1 parent dce0352 commit 81d4e0c

1 file changed

Lines changed: 19 additions & 5 deletions


README.md

@@ -290,12 +290,26 @@ Tested on Windows 10 and Ubuntu 24.04 (headless, via WSL/Docker).
 
 Copy `.env.example` to `.env` and set your API key. The application will prompt for confirmation on each launch.
 
-### LLM endpoint compatibility
+## Using local LLM providers
 
-metaScreener targets any **OpenAI-compatible API endpoint**. This includes:
-- OpenAI (GPT-4o, GPT-4o-mini, etc.)
-- Azure OpenAI
-- Locally hosted models via compatible inference frameworks (e.g., Ollama, LM Studio, vLLM)
+metaScreener targets any **OpenAI-compatible API endpoint**. The default backend is OpenAI's hosted API, but the same Python client transparently supports:
+
+- **Hosted commercial APIs** — Azure OpenAI, DeepSeek, and others that mirror OpenAI's chat completions schema.
+- **Locally hosted models** — open-weight models served via compatible inference frameworks such as Ollama, llama.cpp, and vLLM.
+
+Switching providers requires no code change: set the `OPENAI_BASE_URL` environment variable to the target endpoint and ensure `OPENAI_API_KEY` is non-empty (most local servers ignore the key value but require it to be set). The **Model** field in metaScreener's EL/IL Settings panels then selects which backend model to use. Three commonly used local-model paths are described below.
+
+### Ollama
+
+[Ollama](https://ollama.com/) exposes an OpenAI-compatible chat completions endpoint at `http://localhost:11434/v1`. After installing Ollama and pulling a model (e.g., `ollama pull llama3.1`), set `OPENAI_BASE_URL=http://localhost:11434/v1` and `OPENAI_API_KEY=ollama` (or any non-empty placeholder). In the EL/IL Settings panels, set **Model** to the local model name (e.g., `llama3.1`).
+
+### llama.cpp
+
+[llama.cpp](https://github.com/ggerganov/llama.cpp)'s `llama-server` binary exposes an OpenAI-compatible endpoint at `http://localhost:8080/v1` by default. Start the server with `./llama-server --model your-model.gguf` and set `OPENAI_BASE_URL=http://localhost:8080/v1` with `OPENAI_API_KEY=llama-cpp` (or any non-empty placeholder). The **Model** field can be set to any value when running llama.cpp directly, since the server uses whichever model is currently loaded.
+
+### vLLM and DeepSeek
+
+For higher-throughput self-hosted inference, [vLLM](https://github.com/vllm-project/vllm) exposes an OpenAI-compatible API tuned for batched GPU workloads; consult the vLLM documentation for the deployment-specific `OPENAI_BASE_URL`. As a hosted alternative, [DeepSeek](https://platform.deepseek.com/) provides an OpenAI-compatible endpoint at `https://api.deepseek.com/v1` with substantially larger context windows than GPT-4o-mini, useful when working with very long records. Use your DeepSeek API key as `OPENAI_API_KEY` for the hosted route.
 
 > **Note**: open-weight model compatibility with the evidence gating protocol (which requires models to produce verbatim substring quotations) has not been formally tested. If you test with a local model, we welcome your feedback via the issue tracker.
 
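As a companion to the Ollama paragraph in the committed section, the following is a minimal sketch of the environment-variable contract expressed with the standard `openai` Python package, assuming a local Ollama server with `llama3.1` already pulled; the prompt and model name are illustrative, and the snippet is not part of the committed README text.

```python
# Sketch: point an OpenAI-compatible client at a local Ollama server.
# Assumes `pip install openai` and `ollama pull llama3.1` have been run.
import os
from openai import OpenAI

# The same two variables the README section describes; local servers
# typically ignore the key's value but still require it to be non-empty.
os.environ.setdefault("OPENAI_BASE_URL", "http://localhost:11434/v1")
os.environ.setdefault("OPENAI_API_KEY", "ollama")

client = OpenAI(
    base_url=os.environ["OPENAI_BASE_URL"],
    api_key=os.environ["OPENAI_API_KEY"],
)

# "model" plays the role of the Model field in the EL/IL Settings panels.
reply = client.chat.completions.create(
    model="llama3.1",
    messages=[{"role": "user", "content": "Reply with the single word: ready"}],
)
print(reply.choices[0].message.content)
```

If the script prints a reply, the endpoint is reachable and the Model value resolves to a model the server actually serves.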
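For the llama.cpp route, where the Model field is informational, one way to see what the running server reports is to query the model listing route. The sketch below assumes `llama-server` is already running on its default port and exposes the OpenAI-compatible `/v1/models` listing (vLLM deployments can be checked the same way by swapping in the deployment's base URL); it is an illustration, not metaScreener code.

```python
# Sketch: ask a local llama.cpp (or vLLM) server which model(s) it is serving.
# Assumes the server was started separately, e.g.:
#   ./llama-server --model your-model.gguf   # default: http://localhost:8080
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # swap for your vLLM deployment URL if applicable
    api_key="llama-cpp",                  # any non-empty placeholder
)

# Whatever IDs the server reports here are what a Model value is matched against;
# for llama.cpp this reflects the model file currently loaded.
for model in client.models.list():
    print(model.id)
```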
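For the hosted DeepSeek route, only the values change. The sketch below assumes a DeepSeek API key is available in a `DEEPSEEK_API_KEY` shell variable (a hypothetical name used here for illustration), and `deepseek-chat` is an example model name to be checked against DeepSeek's current documentation; in metaScreener itself these values would go into `OPENAI_BASE_URL` and `OPENAI_API_KEY` in `.env` rather than being hard-coded.

```python
# Sketch: the same client pointed at DeepSeek's hosted OpenAI-compatible API.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.deepseek.com/v1",
    api_key=os.environ["DEEPSEEK_API_KEY"],  # a real DeepSeek key, unlike the local placeholders
)

reply = client.chat.completions.create(
    model="deepseek-chat",  # illustrative; consult DeepSeek's docs for available models
    messages=[{"role": "user", "content": "Reply with the single word: ready"}],
)
print(reply.choices[0].message.content)
```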