What's Changed
New Features
- feat(telemetry): latency histograms for LLM request duration and TTFB (#463) by @ajbozarth in #782
- feat: rename generative slots -> generative stubs by @jakelorocco in #801
- feat: (m-decompose) Module Prompt V3 by @csbobby in #770
- feat: simplify plugin tests; fix plugin resetting by @jakelorocco in #819
- feat: add examples and tooling tests to run_tests_with_ollama_and_vllm by @jakelorocco in #821
- feat: add return types to invoke_hook by @jakelorocco in #707
- feat: separate out remaining dependencies and improve tests by @jakelorocco in #789
- feat: add error counter metrics categorized by semantic type (#465) by @ajbozarth in #856
- refactor: improve fancylogger implementation by @AngeloDanducci in #792
- refactor: add otel tracing filter to logging by @AngeloDanducci in #859
- feat: streaming support in m serve OpenAI API server by @markstur in #823
- feat: first pass at carrying contextvars through async flows by @AngeloDanducci in #878
- refactor: add print statements to show code flow in mify example by @code4days in #870
- feat: add pricing registry and cost metrics (#464) by @ajbozarth in #882
- feat: add operational counters for sampling, requirements, and tools (#467) by @ajbozarth in #883
- feat: add --skip-resource-checks flag to bypass hardware capability g… by @ajbozarth in #889
- refactor!: partition ModelOutputThunk execution metadata into Generat… by @ajbozarth in #908
- feat: add additional logging handlers by @AngeloDanducci in #907
- feat(core): add PartialValidationResult with tri-state semantics by @planetf1 in #924
- feat(stdlib): add ChunkingStrategy ABC and built-in chunkers by @planetf1 in #923
- feat: add prompt cache token support to cost telemetry by @ajbozarth in #936
- feat: add stream_validate() hook to Requirement (#900) by @planetf1 in #925
- feat(examples): add extra_requirements param to IVR qiskit validation by @ajbozarth in #955
- feat: add embedded adapters (granite switch) to openai backend by @jakelorocco in #881
- refactor(telemetry): replace builtin_pricing.json with litellm pricing API by @ajbozarth in #956
- feat: simplify intrinsics (code and examples) by @jakelorocco in #946
- feat: granite4.1 by @avinash2692 in #964
- feat: allow `name` field in intrinsics io.yaml by @ink-pad in #980
- feat: handle message docs correctly by @jakelorocco in #975
- feat: update granite library examples to use Granite 4.1 3B adapters. by @nrfulton in #981
Bug Fixes
- fix: restore example collection during directory traversal (#794) by @planetf1 in #795
- fix: redirect /how-to/safety-guardrails to existing security page (#788) by @planetf1 in #803
- fix(cli): handle sync/async serve functions in m serve by @markstur in #784
- fix: evict Ollama models between test modules to prevent memory starvation by @planetf1 in #804
- fix: sofai graph coloring example — broken model and incorrect problem #806 by @planetf1 in #807
- fix: flush MPS cache in alora test GPU cleanup (#790) by @planetf1 in #800
- fix(test): widen hallucination detection tolerance (#809) by @planetf1 in #810
- fix: reload module for telemetry testing so all tests can run by @jakelorocco in #805
- fix: handle stale .vllm-venv in test runner by @planetf1 in #829
- fix: remove all mentions to RITS by @guicho271828 in #868
- fix: granite33 response_end span uses sentence length not full respon… by @planetf1 in #845
- fix: run zizmor checker for github actions to ensure security by @jakelorocco in #854
- fix: render Click \b verbatim blocks in CLI reference docs (#866) by @planetf1 in #867
- fix: fixes invalid workflow file by @markstur in #877
- fix: granite33 citation spans wrong for duplicate sentences (#851) by @planetf1 in #872
- fix: fixing test bugs with xfail by @avinash2692 in #886
- fix: handle nested JSON in parse_judge_output via raw_decode by @sjoerdvink99 in #875
- fix: disable OCR in RichDocument CI test to avoid modelscope.cn download by @ajbozarth in #888
- fix: update hallucination_detection fixture for upstream NA enum addition by @ajbozarth in #918
- fix: remove wall time checks from tracing_backend tests by @jakelorocco in #927
- fix: add missing nav and fix cli ref by @AngeloDanducci in #922
- fix: add vllm pytest marker back by @jakelorocco in #933
- fix: raise ValueError on duplicate subtask tags in reorder_subtasks by @sjoerdvink99 in #874
- fix: replace asyncio.sleep FAF guards with deterministic awaits by @ajbozarth in #919
- fix: removing ollama hardcoding in examples, guardian, and test by @avinash2692 in #912
- fix: pin uncertainty and context-attribution revisions and update uncertai… by @AngeloDanducci in #970
- fix: swap python decompose example model by @AngeloDanducci in #968
- fix: model options with intrinsics by @jakelorocco in #972
- fix: add guardian intrinsic document by @subhajitchaudhury in #966
- fix: key in json object returned by policy_guardrails intrinsic by @monindersingh in #979
- fix: default intrinsic adapter types by @jakelorocco in #994
- fix: issues introduced by intrinsic changes by @jakelorocco in #986
- fix: update model ids and documentation links for switch by @jakelorocco in #997
- fix: move test_huggingface.py to granite4.1; and small rag intrinsic … by @jakelorocco in #1008
- fix: prevent major releases by @jakelorocco in #1016
Documentation
- docs: add redirects for former pages by @psschwei in #846
- docs: add CLI reference page and remove CLI from API docs (#704) by @planetf1 in #852
- docs: add AI attribution policy by @ajbozarth in #848
- docs: consolidate how-to section by @psschwei in #893
- docs: add generation_error hook to plugins page, remove stale plan doc by @ajbozarth in #887
- docs: fix 'convienance' -> 'convenience' (5 occurrences) by @MukundaKatta in #894
- docs: move glossary to reference section by @psschwei in #892
- docs: document two session creation patterns by @akihikokuroda in #906
- docs: add backend selection lookup table by @akihikokuroda in #905
- docs: restructure sidebar — split Observability from Evaluation, move LLM-as-a-Judge to How-To by @ajbozarth in #895
- docs: add metadata to code block by @akihikokuroda in #917
- docs: test based eval documentation by @seirasto in #916
- docs: fix link to CONTRIBUTING guide by @seirasto in #960
- docs: add expected output blocks and update quickstart examples by @AngeloDanducci in #957
- docs: add architecture diagram for intrinsics by @jakelorocco in #998
Other Changes
- chore: update governance by @psschwei in #799
- test: add unit tests for stdlib/requirements (#814) by @planetf1 in #820
- test: add tool_arg_validator edge case test, fix typo (#826) by @planetf1 in #831
- test: add unit tests for helpers (#815) by @planetf1 in #847
- test: add unit tests for granite formatters (#812) by @planetf1 in #818
- test: unit tests for backend pure logic (cache, catalog, bedrock) by @planetf1 in #832
- chore: add info for working with intrinsics to AGENTS.md by @psschwei in #768
- test: add unit and integration tests for stdlib components (#817) by @planetf1 in #830
- test: unit tests for CLI decompose and eval pure-logic helpers (#861) by @planetf1 in #863
- test: pure-logic unit tests for stdlib, core, backends, telemetry (#860) by @planetf1 in #862
- ci: add actionlint to validate workflow files on PRs by @planetf1 in #880
- chore: Update expected test outputs to reflect upstream config changes by @frreiss in #897
- chore: removing some comments by @avinash2692 in #978
- test: add tests for new intrinsic field name by @jakelorocco in #988
- release: bump minor version by @jakelorocco in #977
- ci: add action for holding PRs (preventing merge) by @psschwei in #1014
New Contributors
- @sjoerdvink99 made their first contribution in #875
- @MukundaKatta made their first contribution in #894
- @seirasto made their first contribution in #916
- @subhajitchaudhury made their first contribution in #966
- @monindersingh made their first contribution in #979
Full Changelog: v0.4.2...v0.5.0