2+ # LLM Output Evaluator
3+
4+ This script evaluates the outputs of Large Language Models (LLMs) and estimates the associated token usage and cost.
5+
6+ It supports batch evaluation via a configuration CSV and produces a detailed metrics report in CSV format.
7+
8+ ## Usage
9+
10+ Ensure the `lighteval` library and any model SDKs you need (e.g., OpenAI, Anthropic) are installed and configured properly.
11+
12+
13+ ``` bash
14+ python evals.py --config path/to/config.csv --reference path/to/reference.csv --output path/to/results.csv
15+ ```
16+
17+ The arguments to the script are:
18+
19+ - `--config`: Path to the config CSV file (model, query, context)
20+ - `--reference`: Path to the reference CSV file
21+ - `--output`: Path where the evaluation results will be saved
22+
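As a rough illustration of how these three arguments could be parsed, here is a minimal `argparse` sketch. The flag names come from the usage command above. The `build_parser` helper, the help strings, and the choice to make every flag required are assumptions and may not match what `evals.py` actually does.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the three flags shown in the usage command.

    Hypothetical helper: marking every flag as required is an assumption.
    """
    parser = argparse.ArgumentParser(
        description="Evaluate LLM outputs and estimate token usage and cost."
    )
    parser.add_argument("--config", required=True,
                        help="Path to the config CSV file (model, query, context)")
    parser.add_argument("--reference", required=True,
                        help="Path to the reference CSV file")
    parser.add_argument("--output", required=True,
                        help="Path where the evaluation results will be saved")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(
    ["--config", "config.csv", "--reference", "reference.csv", "--output", "results.csv"]
)
print(args.config, args.reference, args.output)
```

Calling `parse_args()` with no arguments would read from `sys.argv` instead, which is what a command-line script would normally do.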
23+
24+ The script outputs a CSV report with the following metrics for each evaluated LLM output:
25+
26+ * Extractiveness metrics:
27+
28+ * Extractiveness Coverage
29+ * Extractiveness Density
30+ * Extractiveness Compression
31+
32+ * Usage metrics:
33+
34+ * Token usage (input/output)
35+ * Estimated cost in USD
36+ * Duration (in seconds)
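To give a feel for what the three extractiveness numbers mean, the sketch below computes them from scratch using the commonly cited extractive-fragment definitions: coverage is the fraction of output tokens that fall inside fragments copied from the source, density is the average squared fragment length per output token, and compression is the ratio of source length to output length. This is an illustration only. The function names and whitespace tokenization are assumptions, and the script itself may compute these values differently (for example, through `lighteval`).

```python
def extractive_fragments(source, output):
    """Greedily collect the longest source-matching token runs in the output."""
    fragments = []
    i = 0
    while i < len(output):
        best = 0
        # Find the longest run starting at output[i] that appears in the source.
        for j in range(len(source)):
            k = 0
            while (i + k < len(output) and j + k < len(source)
                   and output[i + k] == source[j + k]):
                k += 1
            best = max(best, k)
        if best > 0:
            fragments.append(output[i:i + best])
            i += best
        else:
            i += 1  # token not found in the source; move on
    return fragments


def extractiveness(source_text, output_text):
    """Return coverage, density, and compression for one source/output pair."""
    source = source_text.split()  # assumption: simple whitespace tokenization
    output = output_text.split()
    if not output:
        return {"coverage": 0.0, "density": 0.0, "compression": 0.0}
    fragments = extractive_fragments(source, output)
    return {
        "coverage": sum(len(f) for f in fragments) / len(output),
        "density": sum(len(f) ** 2 for f in fragments) / len(output),
        "compression": len(source) / len(output),
    }


scores = extractiveness("the cat sat on the mat", "the cat on the mat")
print(scores)
```

In this toy pair every output token is copied from the source, so coverage is 1.0, while density rewards the longer copied run ("on the mat") more than the shorter one ("the cat").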