| 1 | +# Codex Guidelines for Langfuse Python |
| 2 | + |
| 3 | +This is the canonical root agent guide for the repo. The root `AGENTS.md` |
| 4 | +should remain only as a discovery symlink so tools that require that filename |
| 5 | +continue to work while `.agents/` stays the source of truth. |
| 6 | + |
| 7 | +Langfuse Python SDK guidance for fast, safe code changes. |
| 8 | + |
| 9 | +## Maintenance Contract |
| 10 | + |
| 11 | +- `AGENTS.md` is a living document. |
| 12 | +- Update this file in the same PR when repo-level architecture, workflows, |
| 13 | + verification requirements, release processes, or agent setup conventions |
| 14 | + materially change. |
| 15 | +- Update this file when user feedback adds a durable repo-level instruction that |
| 16 | + future agents should follow. |
| 17 | +- Keep root guidance concise, specific, and easy to verify. If repo-wide |
| 18 | + guidance grows large or becomes task-specific, move that detail into shared |
| 19 | + skills or future nested `AGENTS.md` files closer to the relevant code. |
| 20 | +- If no durable guidance changed, do not edit AGENTS files. |
| 21 | + |
| 22 | +## Project Overview |
| 23 | + |
| 24 | +This repository contains the Langfuse Python SDK, a client library for |
| 25 | +accessing the Langfuse observability platform. The SDK integrates with |
| 26 | +OpenTelemetry for tracing, provides automatic instrumentation for popular LLM |
| 27 | +frameworks, and exposes a generated API client for the Langfuse platform. |
| 28 | + |
| 29 | +## Project Structure |
| 30 | + |
| 31 | +```text |
| 32 | +langfuse-python/ |
| 33 | +├─ langfuse/_client/ # Core SDK implementation built on OpenTelemetry |
| 34 | +├─ langfuse/api/ # Generated Fern API client (do not hand-edit) |
| 35 | +├─ langfuse/_task_manager/ # Background upload and ingestion helpers |
| 36 | +├─ langfuse/langchain/ # LangChain integration |
| 37 | +├─ tests/ # Test suite |
| 38 | +├─ static/ # Test fixtures and sample content |
| 39 | +├─ scripts/ # Repo scripts |
| 40 | +└─ .agents/ # Canonical shared agent instructions and config |
| 41 | +``` |
| 42 | + |
| 43 | +High-signal entry points: |
| 44 | + |
| 45 | +- `langfuse/_client/client.py`: core Langfuse client with OTel integration |
| 46 | +- `langfuse/_client/span.py`: observation/span abstractions |
| 47 | +- `langfuse/_client/observe.py`: decorator-based instrumentation |
| 48 | +- `langfuse/openai.py`: OpenAI instrumentation helpers |
| 49 | +- `langfuse/langchain/CallbackHandler.py`: LangChain integration |
| 50 | +- `langfuse/api/`: generated API surface copied from the main Langfuse repo |
| 51 | + |
| 52 | +## Instruction Design |
| 53 | + |
| 54 | +- Root `AGENTS.md` should cover durable repo-wide expectations only: setup, |
| 55 | + verification, architecture, security, generated files, and handoff rules. |
| 56 | +- Prefer concrete, testable instructions over vague phrasing. Name the exact |
| 57 | + command, path, module, or condition whenever possible. |
| 58 | +- Keep stable tone/role guidance separate from task-specific examples. For |
| 59 | + complex prompts or reusable workflows, place examples in skills or referenced |
| 60 | + docs instead of bloating the root guide. |
| 61 | +- Add nearby nested guidance only when a subdirectory truly needs different |
| 62 | + rules. Put the override as close as possible to the specialized code. |
| 63 | +- Use shared skills for recurring task-specific workflows that should not be |
| 64 | + loaded into context on every task. |
| 65 | + |
| 66 | +## Build, Test, and Development Commands |
| 67 | + |
| 68 | +- Agent environment bootstrap: `bash scripts/codex/setup.sh` |
| 69 | +- Install dependencies: `poetry install --all-extras` |
| 70 | +- Sync generated agent shims: `python3 scripts/agents/sync-agent-shims.py` |
| 71 | +- Verify generated agent shims: `python3 scripts/agents/sync-agent-shims.py --check` |
| 72 | +- Install pre-commit hooks: `poetry run pre-commit install` |
| 73 | +- Run all tests: `poetry run pytest -s -v --log-cli-level=INFO` |
| 74 | +- Run tests in parallel: `poetry run pytest -s -v --log-cli-level=INFO -n auto` |
| 75 | +- Run one test: `poetry run pytest -s -v --log-cli-level=INFO tests/test_core_sdk.py::test_flush` |
| 76 | +- Format code: `poetry run ruff format .` |
| 77 | +- Lint code: `poetry run ruff check .` |
| 78 | +- Type-check: `poetry run mypy langfuse --no-error-summary` |
| 79 | +- Run pre-commit across the repo: `poetry run pre-commit run --all-files` |
| 80 | +- Build package: `poetry build` |
| 81 | +- Generate docs: `poetry run pdoc -o docs/ --docformat google --logo "https://langfuse.com/langfuse_logo.svg" langfuse` |
| 82 | + |
| 83 | +Minimum verification matrix: |
| 84 | + |
| 85 | +| Change scope | Minimum verification | |
| 86 | +| --- | --- | |
| 87 | +| `langfuse/_client/**` | `poetry run ruff check .` + `poetry run mypy langfuse --no-error-summary` + targeted pytest coverage | |
| 88 | +| `langfuse/api/**` | verify source update path from main repo + `poetry run ruff format .` + targeted API tests | |
| 89 | +| Integration modules (`langfuse/openai.py`, `langfuse/langchain/**`) | targeted tests for the touched integration + lint + latest official provider docs review if behavior or API usage changed | |
| 90 | +| Test-only changes | targeted pytest coverage for the updated tests | |
| 91 | +| Agent setup files (`.agents/**`, `scripts/agents/**`, `scripts/codex/**`) | `python3 scripts/agents/sync-agent-shims.py` + `python3 scripts/agents/sync-agent-shims.py --check` + `poetry run pytest tests/test_sync_agent_shims.py` | |
| 92 | + |
| 93 | +CI notes: |
| 94 | + |
| 95 | +- Linting runs via `astral-sh/ruff-action`. |
| 96 | +- Type checking runs on Python 3.13 with Poetry, `.venv` caching, and the agent |
| 97 | + shim sync/check step. |
| 98 | +- The main test matrix runs on Python 3.10 through 3.14. |
| 99 | +- Integration CI clones the main `langfuse/langfuse` repo, boots Dockerized |
| 100 | + services, seeds the server with `pnpm`, and then runs this SDK's pytest suite |
| 101 | + against that local server. |
| 102 | +- If a change plausibly depends on server behavior, call out whether it was only |
| 103 | + covered by unit tests locally and whether full CI is the real end-to-end |
| 104 | + verification path. |
| 105 | + |
| 106 | +## Architecture |
| 107 | + |
| 108 | +### Core Components |
| 109 | + |
| 110 | +- `langfuse/_client/`: main SDK implementation built on OpenTelemetry |
| 111 | + - `client.py`: core Langfuse client |
| 112 | + - `span.py`: span, generation, and event classes |
| 113 | + - `observe.py`: decorator for automatic instrumentation |
| 114 | + - `datasets.py`: dataset management functionality |
| 115 | +- `langfuse/api/`: auto-generated Fern API client |
| 116 | +- `langfuse/_task_manager/`: background processing for uploads and ingestion |
| 117 | +- `langfuse/openai.py`: OpenAI instrumentation |
| 118 | +- `langfuse/langchain/`: LangChain integration |
| 119 | + |
| 120 | +### Key Design Patterns |
| 121 | + |
| 122 | +- The SDK is built on OpenTelemetry for observability. |
| 123 | +- Spans are the core tracing primitive. |
| 124 | +- Attributes carry trace metadata. See `LangfuseOtelSpanAttributes`. |
| 125 | +- The client batches work and flushes asynchronously to the Langfuse API. |
| 126 | + |
| 127 | +## Generated Files |
| 128 | + |
| 129 | +- `langfuse/api/**` is generated from the main Langfuse repo. Do not edit it by |
| 130 | + hand unless the task is explicitly about generated client updates. |
| 131 | +- `docs/` output from `pdoc` is generated. Regenerate it instead of editing |
| 132 | + rendered output directly. |
| 133 | +- Agent/tool shims at `.mcp.json`, `.claude/settings.json`, `.claude/skills/*`, |
| 134 | + `.cursor/mcp.json`, `.cursor/environment.json`, `.vscode/mcp.json`, |
| 135 | + `.codex/config.toml`, and `.codex/environments/environment.toml` are local |
| 136 | + generated artifacts. Update `.agents/config.json` or `.agents/skills/**` |
| 137 | + instead of editing them by hand. |
| 138 | +- `AGENTS.md` and `CLAUDE.md` at the repo root are compatibility symlinks. Edit |
| 139 | + `.agents/AGENTS.md`, not the root symlink paths. |
| 140 | + |
| 141 | +## Configuration |
| 142 | + |
| 143 | +Environment variables are defined in |
| 144 | +`langfuse/_client/environment_variables.py`. |
| 145 | + |
| 146 | +Common ones: |
| 147 | + |
| 148 | +- `LANGFUSE_PUBLIC_KEY` / `LANGFUSE_SECRET_KEY`: API credentials |
| 149 | +- `LANGFUSE_HOST`: API endpoint, defaults to `https://cloud.langfuse.com` |
| 150 | +- `LANGFUSE_DEBUG`: enable debug logging |
| 151 | +- `LANGFUSE_TRACING_ENABLED`: enable or disable tracing |
| 152 | +- `LANGFUSE_SAMPLE_RATE`: sampling rate for traces |
| 153 | + |
| 154 | +Security/config notes: |
| 155 | + |
| 156 | +- Keep credentials and machine-specific secrets in environment variables or |
| 157 | + local untracked files, never in committed agent config. |
| 158 | +- The shared Claude settings intentionally deny reading `./.env` and |
| 159 | + `./.env.*`. If a task genuinely requires inspecting local env overrides, get |
| 160 | + explicit user approval first instead of weakening the default policy. |
| 161 | +- For authenticated MCP servers or provider-specific config additions, prefer |
| 162 | + secret injection via environment variables rather than committed inline |
| 163 | + tokens. |
| 164 | + |
| 165 | +## Testing Guidelines |
| 166 | + |
| 167 | +- Keep tests independent and parallel-safe. |
| 168 | +- Do not weaken or delete meaningful assertions just to make tests pass. |
| 169 | +- When fixing a bug, write or update the regression test first when feasible. |
| 170 | +- E2E tests involving external APIs are often skipped in CI. Document when |
| 171 | + manual coverage is still needed. |
| 172 | +- Use `respx` and `pytest-httpserver` for HTTP mocking when possible. |
| 173 | +- Prefer the narrowest useful test invocation first, then widen coverage when a |
| 174 | + change touches shared tracing, batching, or provider integrations. |
| 175 | + |
| 176 | +## API Generation |
| 177 | + |
| 178 | +The `langfuse/api/` directory is generated from the Langfuse OpenAPI |
| 179 | +specification via Fern. |
| 180 | + |
| 181 | +Update flow: |
| 182 | + |
| 183 | +1. Generate the Python SDK in the main `langfuse/langfuse` repo. |
| 184 | +2. Copy the generated files from `generated/python` into `langfuse/api/`. |
| 185 | +3. Run `poetry run ruff format .`. |
| 186 | +4. Run targeted verification for any touched endpoints or types. |
| 187 | + |
| 188 | +## Release Guidelines |
| 189 | + |
| 190 | +- Releases are automated via GitHub Actions. |
| 191 | +- The release workflow updates `pyproject.toml` and `langfuse/version.py`, |
| 192 | + builds the package, publishes to PyPI, and creates a GitHub release. |
| 193 | +- Do not change release/versioning flow without updating this file and |
| 194 | + `CONTRIBUTING.md`. |
| 195 | + |
| 196 | +## Agent-specific Notes |
| 197 | + |
| 198 | +- `.agents/AGENTS.md` is the canonical root guide. |
| 199 | +- Root `AGENTS.md` is a symlink to `.agents/AGENTS.md`. |
| 200 | +- Root `CLAUDE.md` is a compatibility symlink to `AGENTS.md`. |
| 201 | +- Shared agent/tool config lives in `.agents/config.json`. |
| 202 | +- Shared agent setup documentation lives in `.agents/README.md`. |
| 203 | +- Shared skills live under `.agents/skills/`. |
| 204 | +- `python3 scripts/agents/sync-agent-shims.py` regenerates tool-specific config |
| 205 | + shims for Claude, Cursor, VS Code, Codex, and shared MCP discovery files. |
| 206 | +- Tool-specific directories such as `.claude/`, `.cursor/`, `.codex/`, and |
| 207 | + `.vscode/` remain because those tools discover project settings from fixed |
| 208 | + paths. |
| 209 | +- Cursor discovery should continue to work through the generated |
| 210 | + `.cursor/environment.json` and `.cursor/mcp.json` shims plus the root |
| 211 | + `AGENTS.md` symlink. Do not hand-edit those generated files. |
| 212 | +- This file should stay concise. Anthropic recommends keeping persistent project |
| 213 | + memory under roughly 200 lines, and both Anthropic and OpenAI guidance favor |
| 214 | + specific, well-structured instructions over long prose. |
| 215 | +- If future `.cursor/rules/*.mdc` files are added, keep them as thin wrappers |
| 216 | + around shared `AGENTS.md` guidance or shared skills instead of making them the |
| 217 | + only source of durable repo guidance. |
| 218 | +- Shared skill index: [`skills/README.md`](skills/README.md) |
| 219 | +- When changing OpenAI or Anthropic integrations, prompts, or documented usage: |
| 220 | + check the latest official provider docs first, keep prompts simple and direct, |
| 221 | + preserve clear separation between stable instructions and task-specific |
| 222 | + examples, and mention any provider-facing verification you did not run. |
| 223 | + |
| 224 | +Official references to start from: |
| 225 | + |
| 226 | +- OpenAI AGENTS guide: <https://developers.openai.com/codex/guides/agents-md> |
| 227 | +- OpenAI prompting guide: <https://developers.openai.com/api/docs/guides/prompting> |
| 228 | +- OpenAI reasoning best practices: <https://developers.openai.com/api/docs/guides/reasoning-best-practices> |
| 229 | +- Anthropic Claude Code memory guide: <https://docs.anthropic.com/en/docs/claude-code/memory> |
| 230 | +- Anthropic Claude Code MCP guide: <https://docs.anthropic.com/en/docs/claude-code/mcp> |
| 231 | +- Anthropic prompting best practices: <https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices> |
| 232 | + |
| 233 | +## Git Notes |
| 234 | + |
| 235 | +- Do not use destructive git commands such as `reset --hard` unless explicitly |
| 236 | + requested. |
| 237 | +- Do not revert unrelated working tree changes. |
| 238 | +- Keep commits focused and atomic. |
| 239 | + |
| 240 | +## Python Code Rules |
| 241 | + |
| 242 | +- Do not pass an f-string literal directly to the exception in a `raise` |
| 243 | + statement. Assign the formatted message to a variable first. |
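A minimal sketch of the exception-message rule above. The names `ConfigError` and `require_positive` are hypothetical and exist only for this illustration; they are not part of the SDK. The wording of the rule resembles Ruff's `EM102` check, but the guide does not name a rule, so treat that mapping as an assumption.

```python
class ConfigError(ValueError):
    """Hypothetical exception type used only for this illustration."""


def require_positive(name: str, value: float) -> float:
    """Return value unchanged, or raise ConfigError if it is not positive."""
    if value <= 0:
        # Discouraged by the rule: raise ConfigError(f"{name} must be ...")
        # Preferred: build the message first, then pass the variable.
        message = f"{name} must be positive, got {value}"
        raise ConfigError(message)
    return value
```

One practical effect of this pattern is that the traceback shows the line `raise ConfigError(message)` rather than repeating the full formatted string in the source line next to the error text.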