From natural language queries to SQL insights with 6 layers of context and automatic learning.
About · Architecture · 6 Layers · Getting Started · Tech Stack · Demo
Recall is a production-ready Self-Learning MCP Data Agent that delivers insights, not just SQL results. Inspired by OpenAI's internal data agent, Recall grounds every query in 6 layers of context and improves automatically with every interaction.
Traditional SQL agents fail repeatedly on the same errors. They return raw SQL output that users still need to interpret. They lack observability into their reasoning process.
Recall is different. It treats errors as learning opportunities, synthesizes natural language insights from query results, and exposes every step of its pipeline for full transparency.
```mermaid
flowchart LR
    subgraph before [Traditional]
        SQL[Raw SQL Results]
        Static[Static Knowledge]
        Opaque[Opaque Process]
        Error[Error-Prone]
    end
    subgraph after [Recall]
        Insights[Natural Language Insights]
        SelfLearn[Curated + Discovered]
        Observable[7-Step Visualization]
        SelfCorrect[Self-Correcting]
    end
    SQL -->|"SQL to Insights"| Insights
    Static -->|"Static to Self-Learning"| SelfLearn
    Opaque -->|"Opaque to Observable"| Observable
    Error -->|"Error-Prone to Self-Correcting"| SelfCorrect
    style SQL fill:#546e7a,stroke:#90a4ae,stroke-width:2px,color:#eceff1
    style Static fill:#546e7a,stroke:#90a4ae,stroke-width:2px,color:#eceff1
    style Opaque fill:#546e7a,stroke:#90a4ae,stroke-width:2px,color:#eceff1
    style Error fill:#546e7a,stroke:#90a4ae,stroke-width:2px,color:#eceff1
    style Insights fill:#1b5e20,stroke:#66bb6a,stroke-width:2px,color:#f1f8e9
    style SelfLearn fill:#1b5e20,stroke:#66bb6a,stroke-width:2px,color:#f1f8e9
    style Observable fill:#1b5e20,stroke:#66bb6a,stroke-width:2px,color:#f1f8e9
    style SelfCorrect fill:#1b5e20,stroke:#66bb6a,stroke-width:2px,color:#f1f8e9
```
Built for Hackathons, Ready for Production. Recall was built for the "2 Fast 2 MCP" hackathon with OpenTelemetry traces, Prometheus metrics, Docker + Kubernetes deployment, and a ChatGPT-style UI with process visualization.
Recall follows a context-grounded agent architecture where every SQL query is enriched with 6 layers of contextual knowledge before execution.
```mermaid
graph TB
    User[User Query] -->|Natural Language| UI[React Frontend]
    UI -->|POST Request| API[FastAPI Server]
    API -->|MCP Protocol| Agent[Recall Agent]
    Agent -->|Step 1| Parse[Parse Query]
    Parse -->|Step 2| Knowledge[Search Knowledge Base]
    Knowledge -->|Step 3| Learnings[Retrieve Learnings]
    Learnings -->|Step 4| Schema[Introspect Schema]
    Schema -->|Step 5| SQLGen[Generate SQL]
    SQLGen -->|Step 6| Execute[Execute Query]
    Execute -->|Step 7| Format[Format Insights]
    Knowledge -->|Retrieve| VectorDB[(pgvector)]
    Learnings -->|Retrieve| LearningDB[(Learning Machine)]
    Execute -->|Query| PostgreSQL[(PostgreSQL)]
    SQLGen -->|LLM Call| Gemini[Google Gemini]
    Format -->|LLM Call| Gemini
    Format -->|Natural Language Answer| UI
    UI -->|Display with Process Viz| User
    API -->|Traces| OTEL[OpenTelemetry]
    OTEL -->|Metrics| Prometheus[Prometheus]
    style User fill:#546e7a,stroke:#90a4ae,stroke-width:2px,color:#eceff1
    style UI fill:#1565c0,stroke:#4a9eff,stroke-width:2px,color:#e3f2fd
    style API fill:#0d47a1,stroke:#42a5f5,stroke-width:2px,color:#e3f2fd
    style Agent fill:#1b5e20,stroke:#66bb6a,stroke-width:3px,color:#f1f8e9
    style Parse fill:#1a3d2b,stroke:#4caf7d,stroke-width:2px,color:#e8f5e9
    style Knowledge fill:#1a3d2b,stroke:#4caf7d,stroke-width:2px,color:#e8f5e9
    style Learnings fill:#1a3d2b,stroke:#4caf7d,stroke-width:2px,color:#e8f5e9
    style Schema fill:#1a3d2b,stroke:#4caf7d,stroke-width:2px,color:#e8f5e9
    style SQLGen fill:#1a3d2b,stroke:#4caf7d,stroke-width:2px,color:#e8f5e9
    style Execute fill:#1a3d2b,stroke:#4caf7d,stroke-width:2px,color:#e8f5e9
    style Format fill:#1a3d2b,stroke:#4caf7d,stroke-width:2px,color:#e8f5e9
    style VectorDB fill:#006064,stroke:#00e5ff,stroke-width:2px,color:#e0f7fa
    style LearningDB fill:#006064,stroke:#00e5ff,stroke-width:2px,color:#e0f7fa
    style PostgreSQL fill:#bf360c,stroke:#ff8a65,stroke-width:2px,color:#fbe9e7
    style Gemini fill:#4a148c,stroke:#ce93d8,stroke-width:2px,color:#f3e5f5
    style OTEL fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#e0f2f1
    style Prometheus fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#e0f2f1
```
Most SQL agents use 1-2 layers of context. Recall uses 6 distinct layers that are semantically retrieved and combined at query time.
```mermaid
flowchart TB
    L1[Layer 1: Table Usage Patterns<br/>tables/*.json]
    L2[Layer 2: Business Rules & Metrics<br/>business/*.json]
    L3[Layer 3: Query Patterns<br/>queries/*.sql]
    L4[Layer 4: Institutional Knowledge<br/>Exa MCP]
    L5[Layer 5: Dynamic Learnings<br/>Learning Machine]
    L6[Layer 6: Runtime Context<br/>introspect_schema]
    L1 --> L2 --> L3 --> L4 --> L5 --> L6
    style L1 fill:#1565c0,stroke:#4a9eff,stroke-width:2px,color:#e3f2fd
    style L2 fill:#4a148c,stroke:#ce93d8,stroke-width:2px,color:#f3e5f5
    style L3 fill:#1b5e20,stroke:#66bb6a,stroke-width:2px,color:#f1f8e9
    style L4 fill:#bf360c,stroke:#ff8a65,stroke-width:2px,color:#fbe9e7
    style L5 fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#f3e5f5
    style L6 fill:#006064,stroke:#00e5ff,stroke-width:2px,color:#e0f7fa
```
Example (Layer 1):

```json
{
  "table": "race_wins",
  "columns": ["driver_name", "race_name", "date"],
  "common_joins": ["drivers_championship"],
  "date_format": "DD Mon YYYY"
}
```

Click the thumbnail above to watch the full project demo on YouTube.
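Each knowledge layer follows a similar JSON convention. For illustration only, a hypothetical Layer 2 business-rules entry (`business/*.json`) could look like the following — the field names here are assumptions, not the project's actual schema:

```json
{
  "metric": "championship_points",
  "definition": "SUM(points) per driver per season",
  "rule": "Only classified finishes earn championship points"
}
```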
```mermaid
flowchart LR
    P1[1. Parse] --> P2[2. Knowledge]
    P2 --> P3[3. Learnings]
    P3 --> P4[4. Schema]
    P4 --> P5[5. SQL]
    P5 --> P6[6. Execute]
    P6 --> P7[7. Format]
    style P1 fill:#1565c0,stroke:#4a9eff,stroke-width:2px,color:#e3f2fd
    style P2 fill:#4a148c,stroke:#ce93d8,stroke-width:2px,color:#f3e5f5
    style P3 fill:#1b5e20,stroke:#66bb6a,stroke-width:2px,color:#f1f8e9
    style P4 fill:#bf360c,stroke:#ff8a65,stroke-width:2px,color:#fbe9e7
    style P5 fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#f3e5f5
    style P6 fill:#006064,stroke:#00e5ff,stroke-width:2px,color:#e0f7fa
    style P7 fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#e0f2f1
```
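The seven pipeline steps above run strictly in sequence, each feeding the next. A minimal sketch of that flow (every function name here is an illustrative stub, not the actual Recall API):

```python
# Illustrative stubs standing in for the real pipeline steps.
def parse_query(q):           return {"question": q}
def search_knowledge(p):      return ["race_wins usage notes"]     # pgvector lookup
def search_learnings(p):      return ["use season, not year"]      # past fixes
def introspect_schema(p):     return {"race_wins": ["driver_name", "date"]}
def generate_sql(p, k, l, s): return "SELECT driver_name FROM race_wins"
def execute_query(sql):       return [("Lewis Hamilton",)]
def format_insight(q, rows):  return f"Answer to {q!r}: {rows[0][0]}"

def run_pipeline(question: str) -> str:
    """Run a question through the seven steps in order."""
    parsed = parse_query(question)                            # 1. Parse
    knowledge = search_knowledge(parsed)                      # 2. Knowledge
    learnings = search_learnings(parsed)                      # 3. Learnings
    schema = introspect_schema(parsed)                        # 4. Schema
    sql = generate_sql(parsed, knowledge, learnings, schema)  # 5. SQL
    rows = execute_query(sql)                                 # 6. Execute
    return format_insight(question, rows)                     # 7. Format
```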
ChatGPT-style UI with dark theme, expandable data views per step, natural language responses, and copy-to-clipboard for SQL.
```mermaid
flowchart TB
    subgraph core [Core Intelligence]
        Agno[Agno MCP]
        MCP[MCP Protocol]
        Gemini[Gemini]
        Ollama[Ollama / nomic-embed]
    end
    subgraph backend [Backend & API]
        FastAPI[FastAPI]
        Python[Python 3.12]
        Pydantic[Pydantic v2]
        PG[(PostgreSQL + pgvector)]
    end
    subgraph knowledge [Knowledge]
        pgvec[pgvector]
        JSON[JSON Knowledge]
        Learning[Learning Machine]
    end
    subgraph frontend [Frontend]
        React[React 18 + Vite]
        Tailwind[Tailwind CSS]
        Inter[Inter Font]
    end
    subgraph infra [Infrastructure]
        Docker[Docker + K8s]
        OTEL[OpenTelemetry]
        Prom[Prometheus + Grafana]
    end
    style Agno fill:#1b5e20,stroke:#66bb6a,stroke-width:2px,color:#f1f8e9
    style MCP fill:#1b5e20,stroke:#66bb6a,stroke-width:2px,color:#f1f8e9
    style Gemini fill:#4a148c,stroke:#ce93d8,stroke-width:2px,color:#f3e5f5
    style Ollama fill:#4a148c,stroke:#ce93d8,stroke-width:2px,color:#f3e5f5
    style FastAPI fill:#1a3d2b,stroke:#4caf7d,stroke-width:2px,color:#e8f5e9
    style Python fill:#1a3d2b,stroke:#4caf7d,stroke-width:2px,color:#e8f5e9
    style Pydantic fill:#1a3d2b,stroke:#4caf7d,stroke-width:2px,color:#e8f5e9
    style PG fill:#006064,stroke:#00e5ff,stroke-width:2px,color:#e0f7fa
    style pgvec fill:#006064,stroke:#00e5ff,stroke-width:2px,color:#e0f7fa
    style JSON fill:#bf360c,stroke:#ff8a65,stroke-width:2px,color:#fbe9e7
    style Learning fill:#006064,stroke:#00e5ff,stroke-width:2px,color:#e0f7fa
    style React fill:#1565c0,stroke:#4a9eff,stroke-width:2px,color:#e3f2fd
    style Tailwind fill:#1565c0,stroke:#4a9eff,stroke-width:2px,color:#e3f2fd
    style Inter fill:#1565c0,stroke:#4a9eff,stroke-width:2px,color:#e3f2fd
    style Docker fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#e0f2f1
    style OTEL fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#e0f2f1
    style Prom fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#e0f2f1
```
- Docker & Docker Compose (for simplest setup)
- Gemini API Key from Google AI Studio
- Ollama installed on the host machine with the `nomic-embed-text:latest` model (embeddings only)
- Node.js 18+ (for frontend development)
- Python 3.12+ (for backend development)
```mermaid
flowchart TB
    S1[1. Clone] --> S2[2. Config]
    S2 --> S3[3. Ollama]
    S3 --> S4[4. Docker Up]
    S4 --> S5[5. Load Data]
    S5 --> S6[6. Load Knowledge]
    S6 --> S7[7. Frontend]
    S7 --> S8[8. Open Browser]
    style S1 fill:#1565c0,stroke:#4a9eff,stroke-width:2px,color:#e3f2fd
    style S2 fill:#4a148c,stroke:#ce93d8,stroke-width:2px,color:#f3e5f5
    style S3 fill:#1b5e20,stroke:#66bb6a,stroke-width:2px,color:#f1f8e9
    style S4 fill:#bf360c,stroke:#ff8a65,stroke-width:2px,color:#fbe9e7
    style S5 fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#f3e5f5
    style S6 fill:#006064,stroke:#00e5ff,stroke-width:2px,color:#e0f7fa
    style S7 fill:#1565c0,stroke:#4a9eff,stroke-width:2px,color:#e3f2fd
    style S8 fill:#1b5e20,stroke:#66bb6a,stroke-width:2px,color:#f1f8e9
```
```shell
git clone https://github.com/Keerthivasan-Venkitajalam/Recall.git && cd Recall
cp example.env .env   # Add GEMINI_API_KEY
ollama pull nomic-embed-text:latest
docker compose up -d --build
docker exec recall-api python -m recall.scripts.load_data
docker exec recall-api python -m recall.scripts.load_knowledge
cd web && npm install && npm run dev
# Open http://localhost:5174/ and http://localhost:8000/docs
```

Verify the stack is healthy:

```shell
curl http://localhost:8000/health/dependencies | jq
```

Expected response:
```json
{
  "status": "healthy",
  "checks": {
    "database": { "status": "healthy" },
    "vector_db_knowledge": { "status": "healthy", "sample_count": 8 },
    "vector_db_learnings": { "status": "healthy" }
  }
}
```

Recall is configured via environment variables in `.env`:
| Variable | Description | Default |
|---|---|---|
| `GEMINI_API_KEY` | Google Gemini API key | (required) |
| `MODEL_PROVIDER` | LLM provider (not used with Gemini) | - |
| `EMBEDDER_PROVIDER` | Embedding provider (nomic via Ollama) | `nomic` |
| `OLLAMA_BASE_URL` | Ollama API endpoint for embeddings | `http://host.docker.internal:11434` |
| `DB_HOST` | PostgreSQL host | `recall-db` |
| `DB_PORT` | PostgreSQL port | `5432` |
| `DB_NAME` | Database name | `ai` |
| `DB_USER` | Database user | `keerthi` |
| `DB_PASSWORD` | Database password | ***** |
| `LOG_LEVEL` | Logging level | `INFO` |
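A minimal sketch of how a settings table like this can be loaded, using only the standard library — the real project presumably uses Pydantic Settings, so `load_settings` and `DEFAULTS` here are illustrative names:

```python
import os

# Defaults taken from the configuration table above.
DEFAULTS = {
    "EMBEDDER_PROVIDER": "nomic",
    "OLLAMA_BASE_URL": "http://host.docker.internal:11434",
    "DB_HOST": "recall-db",
    "DB_PORT": "5432",
    "DB_NAME": "ai",
    "LOG_LEVEL": "INFO",
}

def load_settings(env=None) -> dict:
    """Merge environment overrides onto defaults; the API key is required."""
    env = dict(os.environ if env is None else env)
    settings = {k: env.get(k, v) for k, v in DEFAULTS.items()}
    if "GEMINI_API_KEY" not in env:
        raise RuntimeError("GEMINI_API_KEY is required")
    settings["GEMINI_API_KEY"] = env["GEMINI_API_KEY"]
    return settings
```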
The `compose.yaml` includes:
- `recall-api`: FastAPI backend with Agno MCP framework and Gemini
- `recall-db`: PostgreSQL 18 with pgvector extension
- Health checks on both services
- Persistent volumes for the database
- Host network access via `extra_hosts` for Ollama embeddings
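The `extra_hosts` entry is the standard Docker Compose pattern for letting a container reach a service on the host (here, Ollama on port 11434). A minimal illustrative fragment — not necessarily the project's exact `compose.yaml` — looks like:

```yaml
services:
  recall-api:
    extra_hosts:
      - "host.docker.internal:host-gateway"  # lets the container reach host Ollama
```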
```mermaid
flowchart TB
    User[User Query] --> Agent[Recall Agent]
    Agent --> Static[Static Knowledge<br/>recall/knowledge/]
    Agent --> Dynamic[Dynamic Learnings<br/>Learning Machine]
    Static --> VectorDB[(pgvector)]
    Dynamic --> VectorDB
    VectorDB --> Future[Future Queries]
    subgraph curated [Curated by You]
        Static
    end
    subgraph discovered [Auto-Discovered]
        Dynamic
    end
    style User fill:#546e7a,stroke:#90a4ae,stroke-width:2px,color:#eceff1
    style Agent fill:#1b5e20,stroke:#66bb6a,stroke-width:2px,color:#f1f8e9
    style Static fill:#1565c0,stroke:#4a9eff,stroke-width:2px,color:#e3f2fd
    style Dynamic fill:#4a148c,stroke:#ce93d8,stroke-width:2px,color:#f3e5f5
    style VectorDB fill:#006064,stroke:#00e5ff,stroke-width:2px,color:#e0f7fa
    style Future fill:#1b5e20,stroke:#66bb6a,stroke-width:2px,color:#f1f8e9
```
Example learning (auto-captured):
```json
{
  "pattern": "date_parsing_f1",
  "context": "F1 race_wins table has dates as 'DD Mon YYYY' text format",
  "solution": "Use TO_DATE(date, 'DD Mon YYYY') for year extraction"
}
```

```text
recall/
├── agents.py             # Recall and Reasoning agents
├── paths.py              # Path constants
├── context/              # Context layer implementations
│   ├── semantic_model.py # Layer 1: Table usage
│   └── business_rules.py # Layer 2: Business rules
├── tools/                # Agent tools
│   ├── introspect.py     # Layer 6: Runtime context
│   └── save_query.py     # Save validated queries
├── scripts/
│   ├── load_data.py      # Load F1 sample data
│   └── load_knowledge.py # Load knowledge files
├── knowledge/            # Static knowledge files
│   ├── tables/           # Layer 1 schemas
│   ├── queries/          # Layer 3 patterns
│   └── business/         # Layer 2 rules
├── evals/
│   ├── test_cases.py     # Test cases with golden SQL
│   ├── grader.py         # LLM-based response grader
│   └── run_evals.py      # Run evaluations
web/
├── src/
│   ├── App.jsx           # Main React component
│   ├── main.jsx          # React entry point
│   └── styles.css        # Tailwind + custom styles
├── index.html            # HTML template
├── vite.config.js        # Vite configuration
├── tailwind.config.cjs   # Tailwind configuration
└── package.json          # Frontend dependencies
db/
├── session.py            # PostgreSQL session factory
└── url.py                # Database URL builder
k8s/
└── base/                 # Kubernetes manifests
```
```shell
# Backend tests
python -m recall.evals.run_evals           # All evals (string matching)
python -m recall.evals.run_evals -c basic  # Specific category
python -m recall.evals.run_evals -v        # Verbose mode
python -m recall.evals.run_evals -g        # Use LLM grader
python -m recall.evals.run_evals -r        # Compare against golden SQL results
python -m recall.evals.run_evals -g -r -v  # All modes combined

# Frontend (in web/ directory)
npm test       # Run tests
npm run build  # Production build
```

```shell
# Backend
./scripts/format.sh    # Format code (black, isort)
./scripts/validate.sh  # Lint + type check (ruff, mypy)

# Frontend
cd web
npm run lint  # ESLint
```

```shell
# Build and deploy
docker compose -f compose.yaml up -d --build

# View logs
docker logs -f recall-api
docker logs -f recall-db

# Health check
curl http://localhost:8000/health/dependencies
```

```shell
# Apply configurations
kubectl apply -k k8s/base/

# Check status
kubectl get pods -n recall
kubectl logs -n recall deployment/recall-api

# Port forward for local access
kubectl port-forward -n recall svc/recall-api 8000:8000
```

1. Complex Data Analysis — "Who won the most races in 2019?"
```mermaid
flowchart LR
    Parse[Parse] --> Know[Knowledge]
    Know --> Learn[Learnings]
    Learn --> Schema[Schema]
    Schema --> SQL[SQL Gen]
    SQL --> Exec[Execute]
    Exec --> Format[Format]
    Format --> Result["Lewis Hamilton, 11 wins"]
    style Parse fill:#1565c0,stroke:#4a9eff,color:#e3f2fd
    style Know fill:#4a148c,stroke:#ce93d8,color:#f3e5f5
    style Learn fill:#1b5e20,stroke:#66bb6a,color:#f1f8e9
    style Schema fill:#bf360c,stroke:#ff8a65,color:#fbe9e7
    style SQL fill:#6a1b9a,stroke:#ba68c8,color:#f3e5f5
    style Exec fill:#006064,stroke:#00e5ff,color:#e0f7fa
    style Format fill:#004d40,stroke:#4db6ac,color:#e0f2f1
    style Result fill:#1b5e20,stroke:#66bb6a,color:#f1f8e9
```
2. Self-Correction on Error — "Constructor standings for 2020"
```mermaid
flowchart LR
    Try1[year = 2020] --> Err[Error]
    Err --> Capture[Learning: use season]
    Capture --> Retry[season = 2020]
    Retry --> Success[Success]
    Success --> Save[Saved for future]
    style Try1 fill:#546e7a,stroke:#90a4ae,color:#eceff1
    style Err fill:#b71c1c,stroke:#ef5350,color:#ffebee
    style Capture fill:#bf360c,stroke:#ff8a65,color:#fbe9e7
    style Retry fill:#1565c0,stroke:#4a9eff,color:#e3f2fd
    style Success fill:#1b5e20,stroke:#66bb6a,color:#f1f8e9
    style Save fill:#1b5e20,stroke:#66bb6a,color:#f1f8e9
```
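The error-to-learning-to-retry loop in this scenario can be sketched as follows. All names here are hypothetical stand-ins; the real Learning Machine persists its learnings in pgvector rather than an in-memory list:

```python
# In-memory stand-in for the pgvector-backed learning store.
learnings: list[dict] = []

def execute(sql: str):
    """Fake executor: the standings table has a `season` column, not `year`."""
    if "year" in sql:
        raise ValueError('column "year" does not exist')
    return [("Mercedes", 573)]

def run_with_correction(sql: str):
    """Try the query; on failure, capture a learning and retry with the fix."""
    try:
        return execute(sql)
    except ValueError as err:
        learnings.append({"pattern": str(err), "solution": "use season"})
        return execute(sql.replace("year", "season"))

rows = run_with_correction(
    "SELECT team, points FROM standings WHERE year = 2020"
)
```

On the next query, the saved learning would be retrieved up front (Step 3 of the pipeline), so the corrected column is used on the first attempt.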
3. Cross-Table Insights — "Fastest lap times in Monaco 2019"
```mermaid
flowchart TB
    L1[Layer 1: fastest_laps + race_results]
    L2[Layer 2: Monaco = race_name]
    L3[Layer 3: Query pattern]
    L4[Layer 4: Web context]
    L1 --> Join[Complex Join]
    L2 --> Join
    L3 --> Join
    L4 --> Join
    Join --> FirstTry[First-try success]
    style L1 fill:#1565c0,stroke:#4a9eff,color:#e3f2fd
    style L2 fill:#4a148c,stroke:#ce93d8,color:#f3e5f5
    style L3 fill:#1b5e20,stroke:#66bb6a,color:#f1f8e9
    style L4 fill:#bf360c,stroke:#ff8a65,color:#fbe9e7
    style Join fill:#6a1b9a,stroke:#ba68c8,color:#f3e5f5
    style FirstTry fill:#1b5e20,stroke:#66bb6a,color:#f1f8e9
```
```mermaid
flowchart LR
    Req[Request] --> Parse[parse_query]
    Parse --> Know[search_knowledge]
    Know --> Learn[search_learnings]
    Learn --> Schema[introspect_schema]
    Schema --> SQL[generate_sql]
    SQL --> Exec[execute_query]
    Exec --> Format[format_insight]
    Format --> OTEL[OpenTelemetry]
    OTEL --> Prom[Prometheus]
    Prom --> Grafana[Grafana]
    style Req fill:#546e7a,stroke:#90a4ae,color:#eceff1
    style Parse fill:#1a3d2b,stroke:#4caf7d,color:#e8f5e9
    style Know fill:#1a3d2b,stroke:#4caf7d,color:#e8f5e9
    style Learn fill:#1a3d2b,stroke:#4caf7d,color:#e8f5e9
    style Schema fill:#1a3d2b,stroke:#4caf7d,color:#e8f5e9
    style SQL fill:#1a3d2b,stroke:#4caf7d,color:#e8f5e9
    style Exec fill:#1a3d2b,stroke:#4caf7d,color:#e8f5e9
    style Format fill:#1a3d2b,stroke:#4caf7d,color:#e8f5e9
    style OTEL fill:#004d40,stroke:#4db6ac,color:#e0f2f1
    style Prom fill:#bf360c,stroke:#ff8a65,color:#fbe9e7
    style Grafana fill:#4a148c,stroke:#ce93d8,color:#f3e5f5
```
Prometheus metrics (at `/metrics`):

```text
recall_queries_total{status="success"} 127
recall_query_duration_seconds_bucket{le="5.0"} 120
recall_knowledge_searches_total{layer="table_usage"} 127
recall_llm_tokens_total{model="gemini"} 52341
```
Grafana dashboard: query throughput, latency percentiles (p50/p95/p99), token usage, learning rate.
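The metrics above follow the Prometheus text exposition format (`name{labels} value`). As a stdlib-only illustration of producing that format — the service itself would use a proper Prometheus client library rather than this hand-rolled counter:

```python
from collections import Counter

# Minimal, illustrative query counter rendered in Prometheus text format.
queries: Counter = Counter()

def record_query(status: str) -> None:
    """Increment the counter for one labelled outcome."""
    queries[status] += 1

def render_metrics() -> str:
    """Render counts as `name{labels} value` exposition lines."""
    return "\n".join(
        f'recall_queries_total{{status="{s}"}} {n}'
        for s, n in sorted(queries.items())
    )

record_query("success")
record_query("success")
record_query("error")
```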
Common issues and quick fixes. Ensure `GEMINI_API_KEY` is set, Ollama (`nomic-embed-text`) is running, and Docker is configured.
| Issue | Solution |
|---|---|
| "Gemini API error" | Verify GEMINI_API_KEY in .env is valid. Check API quota at Google AI Studio. |
| "Knowledge base empty" | Run docker exec recall-api python -m recall.scripts.load_knowledge to index documents. |
| "Frontend can't connect to API" | Check VITE_API_URL in web/.env. Ensure backend is running on port 8000. |
| "Embedding model error" | Ensure Ollama is running and nomic-embed-text is pulled: ollama list. |
| "Database connection failed" | Check DB_* environment variables. Ensure recall-db container is healthy: docker ps. |
| "Process monitor not expanding" | Click on completed steps (green checkmark). Only completed steps have data. |
Contributions are welcome! Whether you're fixing bugs, adding new context layers, improving the UI, or enhancing documentation, your help is appreciated.
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add some amazing feature'`)
- Add tests for new functionality
- Run the eval suite (`python -m recall.evals.run_evals`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- Follow PEP 8 style guidelines
- Write type hints for all functions
- Add docstrings for public APIs
- Update knowledge files when schema changes
- Keep commits atomic and well-described
- Test both backend and frontend changes
| Dev | GitHub |
|---|---|
| Keerthivasan S V | Keerthivasan-Venkitajalam |
| Sri Krishna Vundavalli | Sri-Krishna-V |
| Kavinesh | Kavinesh11 |
| Sai Nivedh | SaiNivedh26 |
Apache License 2.0. See LICENSE for details.
Built with Agno MCP Framework. Inspired by OpenAI's in-house data agent. Learning Machine powered by pgvector. Built for the 2 Fast 2 MCP Hackathon.

