Commit 5ef17a0

chore: improve agent development setup (#1642)
1 parent 4412567 commit 5ef17a0

10 files changed

Lines changed: 249 additions & 24 deletions

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+version = 1
+name = "langfuse-python"
+
+[setup]
+script = "bash scripts/codex/setup.sh"

.github/PULL_REQUEST_TEMPLATE.md

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+## What does this PR do?
+
+> PR title must follow Conventional Commits, for example `feat: add dataset scoring helper` or `fix(openai): preserve trace context`.
+
+Fixes #
+
+## Type of change
+
+- [ ] Bug fix
+- [ ] New feature
+- [ ] Breaking change
+- [ ] Refactor
+- [ ] Documentation update
+- [ ] Tooling, CI, or repo maintenance
+
+## Verification
+
+List the main commands you ran:
+
+```bash
+
+```
+
+## Checklist
+
+- [ ] I self-reviewed the diff using `code_review.md`.
+- [ ] I added or updated tests for behavior changes.
+- [ ] I updated docs, examples, or `.env.template` if needed.
+- [ ] I did not hand-edit generated files; if generated files changed, I used the upstream regeneration path.
+- [ ] I did not commit secrets or credentials.
Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+---
+name: "Validate PR Title"
+
+on:
+  pull_request:
+    branches:
+      - "**"
+    types:
+      - opened
+      - edited
+      - synchronize
+      - reopened
+
+permissions: {}
+
+jobs:
+  validate-pr-title:
+    runs-on: ubuntu-latest
+    permissions:
+      statuses: write
+      pull-requests: read
+    steps:
+      - name: Validate PR title follows conventional commits
+        uses: amannn/action-semantic-pull-request@48f256284bd46cdaab1048c3721360e808335d50 # v6.1.1
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+        with:
+          types: |
+            feat
+            fix
+            docs
+            style
+            refactor
+            perf
+            test
+            build
+            ci
+            chore
+            revert
+            security
+          requireScope: false
+          validateSingleCommit: false
+          ignoreLabels: |
+            bot
+            ignore-semantic-pull-request
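
The title rule this workflow enforces can be approximated locally before opening a PR. The following is a minimal Python sketch, not the action's implementation: the regular expression and the `is_valid_pr_title` helper are assumptions made for illustration, and only the list of types is taken from the workflow above. The action's own parser may accept or reject titles that this sketch treats differently.

```python
import re

# Types copied from the workflow's `types` input.
ALLOWED_TYPES = (
    "feat", "fix", "docs", "style", "refactor", "perf",
    "test", "build", "ci", "chore", "revert", "security",
)

# Assumed shape: type, optional "(scope)", optional "!", then ": " and a
# non-empty description. This is an approximation of Conventional Commits.
TITLE_PATTERN = re.compile(
    r"^(?P<type>[a-z]+)"
    r"(?:\((?P<scope>[^()]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>\S.*)$"
)


def is_valid_pr_title(title: str) -> bool:
    """Return True if the title matches the assumed pattern and uses an allowed type."""
    match = TITLE_PATTERN.match(title)
    return match is not None and match.group("type") in ALLOWED_TYPES
```

Under these assumptions, `is_valid_pr_title("fix(openai): preserve trace context")` returns `True`, while `is_valid_pr_title("Update README")` returns `False`. The scope is optional here, mirroring `requireScope: false`.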

.gitignore

Lines changed: 5 additions & 0 deletions
@@ -33,3 +33,8 @@ docs
 tests/mocks/llama-index-storage
 
 *.local.*
+
+# Codex local runtime state
+.codex/log/
+.codex/sessions/
+.codex/tmp/

AGENTS.md

Lines changed: 22 additions & 2 deletions
@@ -25,6 +25,7 @@ This repository contains the Langfuse Python SDK.
 - `tests/live_provider/`: live OpenAI / LangChain provider tests
 - `tests/support/`: shared helpers for e2e tests
 - `scripts/select_e2e_shard.py`: CI shard selector for `tests/e2e`
+- `scripts/codex/`: Codex cloud/worktree bootstrap and shared quick checks
 
 ## Working Style
 
@@ -34,6 +35,8 @@ This repository contains the Langfuse Python SDK.
 - Keep repo-shared instructions here. Keep personal or machine-specific notes out of version control.
 - Keep tests independent and parallel-safe by default.
 - For bug fixes, prefer writing or identifying the failing test first, confirm the failure, then implement the fix.
+- For complex or ambiguous tasks, plan first, identify the likely verification path, then implement.
+- Before final handoff, review the diff for correctness, regressions, missing tests, and accidental generated-file edits.
 
 ## Setup And Quality Commands
 
@@ -43,6 +46,7 @@ uv run pre-commit install
 uv run --frozen ruff check .
 uv run --frozen ruff format .
 uv run --frozen mypy langfuse --no-error-summary
+bash scripts/codex/quick-check.sh
 ```
 
 ## Test Commands
@@ -66,6 +70,18 @@ uv run --frozen pytest -n 4 --dist worksteal tests/live_provider -m "live_provid
 uv run --frozen pytest tests/unit/test_resource_manager.py::test_pause_signals_score_consumer_shutdown
 ```
 
+Minimum verification matrix:
+
+| Change scope | Minimum verification |
+| --- | --- |
+| Docs or comments only | `uv run --frozen ruff format --check .` if Python files changed |
+| Python source only | `uv run --frozen ruff check .` + `uv run --frozen mypy langfuse --no-error-summary` + targeted unit tests |
+| Unit-test-only change | targeted `uv run --frozen pytest ...` for the changed tests |
+| Shutdown, flushing, worker-thread, or OTEL-heavy change | targeted resource-manager/OTEL tests plus affected integration tests when relevant |
+| OpenAI or LangChain instrumentation | targeted unit tests using exporter-local assertions; add e2e/live-provider coverage only when unit tests cannot cover behavior |
+| Generated API client or public API contract | upstream Fern/OpenAPI regeneration path plus targeted SDK serialization/deserialization tests |
+| CI, sharding, or bootstrap | relevant script test plus CI workflow review against this file's CI contract |
+
 ## Test Topology
 
 ### `tests/unit`
@@ -96,6 +112,7 @@ The main CI workflow currently runs:
 - `tests/unit` on a Python 3.10-3.14 matrix
 - `tests/e2e` in 2 mechanical shards plus a serial subset inside each shard
 - `tests/live_provider` as one always-on suite
+- PR title validation for Conventional Commits
 
 If you change the e2e split:
 
@@ -113,16 +130,19 @@ If you change CI bootstrap:
 - Keep changes scoped. Avoid unrelated refactors.
 - Prefer `LANGFUSE_BASE_URL`; `LANGFUSE_HOST` is deprecated and is only kept for compatibility tests.
 - If you touch `langfuse/api/`, regenerate it from the upstream Fern/OpenAPI source instead of hand-editing files.
+- If you change public SDK behavior, update examples, README snippets, or generated reference docs when they would otherwise become stale.
 - If you touch shutdown, flushing, or worker-thread behavior, run the relevant resource-manager and OTEL-heavy tests.
 - If you change OpenAI or LangChain instrumentation, keep as much coverage as possible in `tests/unit` using exporter-local assertions, and leave only the minimal necessary coverage in `tests/e2e` / `tests/live_provider`.
 - Never commit secrets or credentials.
 - Keep `.env.template` in sync with required local-development environment variables.
 
 ## Commit And PR Rules
 
-- Commit messages and PR titles should follow Conventional Commits: `type(scope): description` or `type: description`.
+- Commit messages and PR titles must follow Conventional Commits: `type(scope): description` or `type: description`.
+- Allowed common types include `feat`, `fix`, `docs`, `style`, `refactor`, `perf`, `test`, `build`, `ci`, `chore`, `revert`, and `security`.
 - Keep commits focused and atomic.
-- In PR descriptions, list the main verification commands you ran.
+- Before opening a PR, self-review the diff and check `code_review.md` for the repo-specific review checklist.
+- In PR descriptions, list the main verification commands you ran and call out any skipped checks with the reason.
 
 ## Python-Specific Notes
 

CONTRIBUTING.md

Lines changed: 74 additions & 22 deletions
@@ -4,48 +4,100 @@
 
 ### Install dependencies
 
-```
-uv sync
+```bash
+uv sync --locked
 ```
 
-### Add Pre-commit
+### Add pre-commit
 
-```
+```bash
 uv run pre-commit install
 ```
 
-### Type Checking
+### Quality checks
 
-To run type checking on the langfuse package, run:
-```sh
-uv run mypy langfuse --no-error-summary
+```bash
+uv run --frozen ruff check .
+uv run --frozen ruff format .
+uv run --frozen mypy langfuse --no-error-summary
+```
+
+For a broad local confidence check, run:
+
+```bash
+bash scripts/codex/quick-check.sh
 ```
 
 ### Tests
 
-#### Setup
+Unit tests do not require a running Langfuse server:
 
-- Add .env based on .env.template
+```bash
+uv run --frozen pytest -n auto --dist worksteal tests/unit
+```
 
-#### Run
+E2E tests require a running Langfuse server and environment variables based on `.env.template`:
 
-- Run all
+```bash
+uv run --frozen pytest -n 4 --dist worksteal tests/e2e -m "not serial_e2e"
+uv run --frozen pytest tests/e2e -m "serial_e2e"
+```
+
+Live-provider tests make real provider calls and require provider API keys:
+
+```bash
+uv run --frozen pytest -n 4 --dist worksteal tests/live_provider -m "live_provider"
+```
+
+Run a specific test with:
 
-```
-uv run --env-file .env pytest -s -v --log-cli-level=INFO
-```
+```bash
+uv run --frozen pytest tests/unit/test_resource_manager.py::test_pause_signals_score_consumer_shutdown
+```
+
+## Codex Cloud Setup
+
+This repository includes repo-owned Codex setup so agents can start from a reproducible environment.
+
+Recommended Codex UI configuration:
+
+1. Create a Codex cloud environment for this repository.
+2. Set the setup script to:
+
+```bash
+bash scripts/codex/setup.sh
+```
+
+3. Set the maintenance script to:
+
+```bash
+bash scripts/codex/maintenance.sh
+```
+
+4. Keep agent internet access disabled by default, or allow only the domains required for the task.
+5. Add secrets and environment variables in the Codex UI instead of committing them.
+
+## Pull Requests
+
+PR titles and commit messages must follow Conventional Commits:
+
+```text
+type(scope): description
+type: description
+```
 
-- Run a specific test
+Common types include `feat`, `fix`, `docs`, `style`, `refactor`, `perf`, `test`, `build`, `ci`, `chore`, `revert`, and `security`.
 
-```
-uv run --env-file .env pytest -s -v --log-cli-level=INFO tests/test_core_sdk.py::test_flush
-```
+Before opening a PR:
 
-- E2E tests involving OpenAI and Serp API are usually skipped, remove skip decorators in [tests/test_langchain.py](tests/test_langchain.py) to run them.
+- Self-review the diff and use `code_review.md` for the repo-specific checklist.
+- Keep changes focused and avoid unrelated refactors.
+- Add or update tests for behavior changes.
+- List the verification commands you ran in the PR description.
 
-### Update openapi spec
+### Update OpenAPI spec
 
-A PR with the changes is automatically created upon changing the Spec in the langfuse repo.
+The generated API client in `langfuse/api/` must not be hand-edited. Regenerate it from the upstream Fern/OpenAPI source.
 
 ### Publish release
 

code_review.md

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+# Langfuse Python SDK Review Checklist
+
+Use this checklist for `/review`, PR review, or self-review before handoff.
+
+## Priorities
+
+- Findings first: correctness bugs, regressions, security/privacy risks, performance issues with real impact, and missing tests for risky behavior.
+- Keep line references tight and actionable.
+- If no findings, say so explicitly and mention any residual risk or unrun verification.
+
+## SDK Correctness
+
+- Public SDK behavior should remain backwards compatible unless the PR is explicitly breaking.
+- Prefer `LANGFUSE_BASE_URL`; `LANGFUSE_HOST` is deprecated and should only appear in compatibility paths or tests.
+- Check shutdown, flushing, background task, and resource-manager changes for races, dropped events/scores/media, daemon-thread leaks, and hanging interpreter shutdown.
+- OpenTelemetry changes should preserve context propagation, span parenting, exporter-local testability, and idempotent instrumentation setup.
+- OpenAI and LangChain instrumentation should avoid brittle assertions on provider internals; prefer stable exporter-local behavior in unit tests.
+
+## API And Generated Code
+
+- Do not hand-edit `langfuse/api/`; regenerate it from the upstream Fern/OpenAPI source.
+- Public API or serialization changes should include tests for request shape, response shape, and backwards-compatible aliases when relevant.
+- Update README examples, `.env.template`, or generated reference docs when changed behavior would make them stale.
+
+## Tests And CI
+
+- Unit tests must not require a running Langfuse server.
+- E2E tests should use bounded polling helpers from `tests/support/`, not raw `sleep()`.
+- New e2e files must be named `tests/e2e/test_*.py` so mechanical CI sharding includes them.
+- Use `serial_e2e` only for tests that are unsafe with shared-server concurrency.
+- Live-provider tests should assert stable provider-facing behavior, not exact observation counts unless counts are the behavior under test.
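
The bounded-polling guideline in the list above can be illustrated with a small sketch. The actual helpers in `tests/support/` are not shown in this diff, so `poll_until`, its parameters, and its default values are hypothetical names chosen for illustration only.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll_until(
    fetch: Callable[[], Optional[T]],
    timeout_s: float = 10.0,
    interval_s: float = 0.2,
) -> T:
    """Call `fetch` until it returns a non-None value or the deadline passes.

    Hypothetical helper: the name and signature are assumptions, not the
    repository's API.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            # Message is built in a variable first, per the style notes.
            message = "condition not met within " + str(timeout_s) + " seconds"
            raise TimeoutError(message)
        time.sleep(interval_s)
```

Compared with a fixed `sleep()`, a loop like this returns as soon as the server has the data and fails with a clear error when it never arrives, which is likely why the checklist prefers it for shared-server e2e runs.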
+
+## Python Style
+
+- Exception messages should not inline f-string literals in `raise` statements; build the message in a variable first.
+- Keep edits ASCII-only unless the file already uses Unicode or Unicode is clearly required.
+- Keep changes scoped; avoid opportunistic refactors.
+- Never commit secrets or credentials.
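
The exception-message rule in the Python Style list can be shown with a short sketch. The `require_positive` function and its message text are made up for illustration and do not come from the SDK.

```python
def require_positive(name: str, value: int) -> int:
    """Return `value` if it is positive, otherwise raise ValueError.

    Hypothetical example function used only to show the message style.
    """
    if value <= 0:
        # Build the message in a variable first instead of writing
        # `raise ValueError(f"...")` with the f-string inline.
        message = f"{name} must be positive, got {value}"
        raise ValueError(message)
    return value
```

Keeping the f-string out of the `raise` statement means the traceback shows `raise ValueError(message)` followed by the formatted text once, instead of repeating the literal in both the source line and the error.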

scripts/codex/maintenance.sh

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+repo_root="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
+cd "$repo_root"
+
+uv sync --locked
+uv cache prune --ci >/dev/null 2>&1 || true

scripts/codex/quick-check.sh

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+repo_root="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
+cd "$repo_root"
+
+uv run --frozen ruff check .
+uv run --frozen mypy langfuse --no-error-summary
+uv run --frozen pytest -n auto --dist worksteal tests/unit

scripts/codex/setup.sh

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+repo_root="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
+cd "$repo_root"
+
+if ! command -v uv >/dev/null 2>&1; then
+  python3 -m pip install --user "uv==0.11.2"
+  export PATH="$HOME/.local/bin:$PATH"
+fi
+
+uv sync --locked
+uv run --frozen python --version
