API Providers
This guide covers all supported AI providers, their features, and how to configure them.
- Overview
- Ollama (Local)
- Groq
- HuggingFace
- OpenRouter
- OpenAI
- Provider Comparison
- Adding Custom Providers
Chat Linux Client supports multiple AI providers, giving you flexibility in choosing the best model for your needs:
| Provider | Type | Cost | Speed | Quality | Offline |
|---|---|---|---|---|---|
| Ollama | Local | Free | Variable | Good | Yes |
| Groq | Cloud | Free tier | Very Fast | Good | No |
| HuggingFace | Cloud | Free tier | Medium | Variable | No |
| OpenRouter | Cloud | Pay-per-use | Fast | Excellent | No |
| OpenAI | Cloud | Pay-per-use | Fast | Excellent | No |
Ollama provides local AI models that run entirely on your machine.
Pros:
- Free: No API costs
- Privacy: Data never leaves your machine
- Offline: Works without internet
- No Rate Limits: Use as much as you want

Cons:
- Hardware: Requires capable CPU/GPU
- Model Size: Models take disk space (1-5GB each)
- Speed: Slower than cloud for large models
- Model Selection: Limited to available models
Install Ollama:
`curl -fsSL https://ollama.ai/install.sh | sh`

Pull one or more models, for example:
- `ollama pull llama3.2:1b` - Lightweight (1.3GB)
- `ollama pull qwen2.5:3b` - Balanced (1.9GB)
- `ollama pull phi3.5:3.8b` - Capable (2.2GB)
- `ollama pull mistral:7b` - Large (4.4GB)

Ollama is detected automatically. Ensure the server is running with `ollama serve` (see the request sketch after the list below).

Best for:
- Privacy-sensitive conversations
- Offline work
- Cost-sensitive users
- Development and testing
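To confirm the local server is reachable before launching the client, you can query Ollama's REST API directly. This is only a sketch: the port (11434) and model name are Ollama defaults, and the client's own detection logic may differ.

```python
import requests

OLLAMA_URL = "http://localhost:11434"  # Ollama's default port


def ollama_available() -> bool:
    """Return True if a local Ollama server answers on the default port."""
    try:
        return requests.get(f"{OLLAMA_URL}/api/tags", timeout=2).ok
    except requests.RequestException:
        return False


def ollama_chat(prompt: str, model: str = "llama3.2:1b") -> str:
    """Send one non-streaming chat request to Ollama's REST API."""
    resp = requests.post(
        f"{OLLAMA_URL}/api/chat",
        json={
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "stream": False,
        },
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["message"]["content"]


if __name__ == "__main__":
    if ollama_available():
        print(ollama_chat("Say hello in one sentence."))
    else:
        print("Ollama is not running; start it with `ollama serve`.")
```

With `"stream": True`, Ollama instead returns one JSON object per line, which is how interactive clients render partial responses.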
Groq provides ultra-low latency inference using their LPU (Language Processing Unit).
Pros:
- Speed: Fastest inference available
- Free Tier: Generous free usage
- Quality: Good model selection
- Latency: Sub-100ms response times

Cons:
- Rate Limits: Free tier has limits
- Internet Required: Cloud-based
- Privacy: Data sent to Groq servers
- Visit https://console.groq.com/
- Sign up or log in
- Navigate to API Keys section
- Create a new API key
Add to .env file or settings:
`GROQ_API_KEY=gsk_your_actual_api_key_here`

Available models:
- `llama-3.1-8b-instant` - Fast, balanced
- `llama-3.1-70b-versatile` - Capable
- `mixtral-8x7b-32768` - Large context
Best for:
- Real-time applications
- Speed-critical tasks
- Free tier usage
- Interactive conversations
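Outside the client, the same key can be exercised against Groq's OpenAI-compatible endpoint. A minimal sketch, assuming `GROQ_API_KEY` is set as shown above:

```python
import os

import requests


def groq_chat(prompt: str, model: str = "llama-3.1-8b-instant") -> str:
    """Send one chat completion request to Groq's OpenAI-compatible API."""
    resp = requests.post(
        "https://api.groq.com/openai/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['GROQ_API_KEY']}"},
        json={
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]


print(groq_chat("Explain what an LPU is in one sentence."))
```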
HuggingFace provides access to thousands of open-source models.
Pros:
- Variety: Thousands of models available
- Free Tier: Many models are free
- Open Source: Community-driven models
- Customization: Can use custom models

Cons:
- Variable Quality: Quality varies by model
- Speed: Slower than dedicated providers
- Complexity: More configuration options
- Rate Limits: Free tier has limits
- Visit https://huggingface.co/settings/tokens
- Sign up or log in
- Create a new token
- Copy the token
Add to .env file or settings:
`HUGGINGFACE_API_KEY=hf_your_actual_api_key_here`

Popular models include:
- `meta-llama/Llama-2-7b-chat-hf`
- `mistralai/Mistral-7B-Instruct-v0.2`
- `google/gemma-7b`

Best for:
- Experimenting with different models
- Using specialized models
- Open-source preference
- Custom model deployment
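A quick way to test a token is to call Hugging Face's serverless Inference API directly. This sketch assumes the `HUGGINGFACE_API_KEY` from above; the client's own `core/huggingface_client.py` may use a different endpoint or payload format.

```python
import os

import requests


def hf_generate(prompt: str,
                model: str = "mistralai/Mistral-7B-Instruct-v0.2") -> str:
    """Run text generation via the serverless Inference API."""
    resp = requests.post(
        f"https://api-inference.huggingface.co/models/{model}",
        headers={"Authorization": f"Bearer {os.environ['HUGGINGFACE_API_KEY']}"},
        json={"inputs": prompt, "parameters": {"max_new_tokens": 200}},
        timeout=60,
    )
    resp.raise_for_status()
    # Text-generation models return a list of {"generated_text": ...} items.
    return resp.json()[0]["generated_text"]


print(hf_generate("Write a haiku about open-source models."))
```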
OpenRouter provides access to multiple models from various providers through a single API.
Pros:
- Variety: Access to many models
- Unified API: Single key for multiple models
- Comparison: Easy to compare models
- Flexible: Pay-per-use pricing

Cons:
- Cost: Pay-per-use (no free tier)
- Complexity: Many options to choose from
- Internet Required: Cloud-based
- Privacy: Data sent to OpenRouter
- Visit https://openrouter.ai/keys
- Sign up or log in
- Add credits to your account
- Create an API key
Add to .env file or settings:
`OPENROUTER_API_KEY=sk-or-your_actual_api_key_here`

Available models include:
- `anthropic/claude-3-opus`
- `openai/gpt-4-turbo`
- `google/gemini-pro`
- And many more

Best for:
- Accessing premium models
- Comparing different models
- Production use
- Flexible model selection
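Because OpenRouter's endpoint is OpenAI-compatible, switching between models is just a matter of changing the model string. A minimal sketch, assuming `OPENROUTER_API_KEY` is set as above:

```python
import os

import requests


def openrouter_chat(prompt: str, model: str = "openai/gpt-4-turbo") -> str:
    """Send one chat completion request through OpenRouter."""
    resp = requests.post(
        "https://openrouter.ai/api/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
        json={
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]


# Comparing models is a one-line change:
print(openrouter_chat("Say hello.", model="anthropic/claude-3-opus"))
```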
OpenAI provides state-of-the-art GPT models.
Pros:
- Quality: Best-in-class models
- Reliability: Highly reliable service
- Documentation: Excellent documentation
- Ecosystem: Large ecosystem of tools

Cons:
- Cost: Most expensive option
- Rate Limits: Strict rate limits
- Internet Required: Cloud-based
- Privacy: Data sent to OpenAI
- Visit https://platform.openai.com/account/api-keys
- Sign up or log in
- Create a new API key
- Add credits to your account
Add to .env file or settings:
`OPENAI_API_KEY=sk-your_actual_api_key_here`

Available models:
- `gpt-4o` - Latest, most capable
- `gpt-4-turbo` - High quality
- `gpt-3.5-turbo` - Cost-effective

Best for:
- Highest quality requirements
- Professional use
- Complex tasks
- Production applications
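A minimal sketch using the official `openai` Python package (one option; the client may call the REST endpoint directly), assuming `OPENAI_API_KEY` is set as above:

```python
from openai import OpenAI

# The client reads OPENAI_API_KEY from the environment by default.
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize GPT-4o in one sentence."}],
)
print(response.choices[0].message.content)
```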
Speed:
- Groq - Fastest (sub-100ms)
- OpenAI - Fast (200-500ms)
- OpenRouter - Fast (200-500ms)
- HuggingFace - Medium (500-1000ms)
- Ollama - Variable (depends on hardware)

Cost:
- Ollama - Free (hardware cost only)
- Groq - Free tier available
- HuggingFace - Free tier available
- OpenRouter - Pay-per-use (moderate)
- OpenAI - Pay-per-use (expensive)

Quality:
- OpenAI - Best overall
- OpenRouter - Excellent (depends on model)
- Groq - Good
- HuggingFace - Variable
- Ollama - Good (depends on model)
Chat Linux Client has an extensible architecture for adding custom providers.
- Create a new client file in `core/` (e.g., `custom_provider.py`)
- Inherit from the `APIClient` base class
- Implement the required methods: `chat_completion()`, `chat_completion_stream()`, `test_connection()` (see the sketch after this list)
- Add provider configuration to `core/settings.py`
- Register it in `core/provider_router.py`
- Add tests in `tests/`
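As a rough illustration of those steps, here is what a `core/custom_provider.py` could look like. Every signature below is an assumption (the real `APIClient` ships with the project and is only stubbed here to keep the example self-contained), so mirror an existing client rather than copying this verbatim.

```python
from abc import ABC, abstractmethod

import requests


class APIClient(ABC):
    """Stand-in for the project's real base class in core/."""

    @abstractmethod
    def chat_completion(self, messages, model): ...

    @abstractmethod
    def chat_completion_stream(self, messages, model): ...

    @abstractmethod
    def test_connection(self): ...


class CustomProviderClient(APIClient):
    """Hypothetical client for a provider with an OpenAI-style endpoint."""

    BASE_URL = "https://api.example-provider.com/v1"  # hypothetical endpoint

    def __init__(self, api_key: str):
        self.api_key = api_key

    def _headers(self) -> dict:
        return {"Authorization": f"Bearer {self.api_key}"}

    def chat_completion(self, messages: list, model: str) -> str:
        """Return the assistant reply for a list of chat messages."""
        resp = requests.post(
            f"{self.BASE_URL}/chat/completions",
            headers=self._headers(),
            json={"model": model, "messages": messages},
            timeout=60,
        )
        resp.raise_for_status()
        return resp.json()["choices"][0]["message"]["content"]

    def chat_completion_stream(self, messages: list, model: str):
        """Yield the reply in chunks; a single chunk here for brevity."""
        yield self.chat_completion(messages, model)

    def test_connection(self) -> bool:
        """Cheap reachability/credentials check."""
        try:
            return requests.get(
                f"{self.BASE_URL}/models",
                headers=self._headers(),
                timeout=5,
            ).ok
        except requests.RequestException:
            return False
```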
See existing providers in `core/` for reference:
- `core/groq_client.py`
- `core/huggingface_client.py`
- `core/openrouter_client.py`
- `core/openai_client.py`