Single API. 16 models. 10-20x faster than typical direct endpoints.
Sixfinger is a production-ready AI gateway that routes requests to the best available model family with streaming, plan-aware access control, and multilingual performance built in.
| Provider | Speed |
|---|---|
| Sixfinger API | ~1,100 char/s |
| Claude-class APIs | ~80–120 char/s |
| Typical GPT APIs | ~50–100 char/s |
One key. One endpoint. No provider switching.
```bash
curl -X POST https://sfapi.pythonanywhere.com/api/v1/chat \
  -H "X-API-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello!", "model": "gpt54-nano", "stream": false}'
```

Response:

```json
{
  "response": "Hello! How can I help you?",
  "model_key": "gpt54-nano",
  "usage": { "total_tokens": 12 }
}
```

Streaming (SSE):
```python
import requests

url = "https://sfapi.pythonanywhere.com/api/v1/chat"
headers = {"X-API-Key": "YOUR_KEY", "Content-Type": "application/json"}
body = {"message": "Tell me a story", "model": "gpt54-nano", "stream": True}

with requests.post(url, headers=headers, json=body, stream=True) as r:
    for chunk in r.iter_content(chunk_size=None):
        print(chunk.decode(), end="", flush=True)
```

| Key | Model | Size | Language | Plan |
|---|---|---|---|---|
| `llama-8b-instant` | Llama 3.1 8B Instant | 8B | Multilingual | Free+ |
| `allam-2-7b` | Allam 2 7B | 7B | Turkish / Arabic | Free+ |
| `gpt54-nano` | GPT-5.4 Nano | Nano | Multilingual | Free+ |
| `step-3.5-flash` | Step 3.5 Flash | — | Multilingual | Free+ |
| `nemotron-3-super-120b-a12b` | Nemotron 3 Super 120B A12B | 120B | Multilingual | Free+ |
| `gpt4-nano` | GPT-4.1 Nano | Nano | Multilingual | Starter+ |
| `qwen3-32b` | Qwen3 32B ⭐ | 32B | Turkish / Chinese | Starter+ |
| `llama-70b` | Llama 3.3 70B | 70B | Multilingual | Starter+ |
| `llama-maverick-17b` | Llama Maverick 17B | 17B | Multilingual | Starter+ |
| `llama-scout-17b` | Llama Scout 17B | 17B | Multilingual | Starter+ |
| `gpt-oss-20b` | GPT-OSS 20B | 20B | Multilingual | Starter+ |
| `glm-4.5-air` | GLM 4.5 Air | — | Multilingual | Starter+ |
| `qwen3-coder` | Qwen3 Coder | — | Multilingual | Starter+ |
| `lfm-2.5-1.2b-thinking` | LFM 2.5 1.2B Thinking | 1.2B | Multilingual | Starter+ |
| `gpt-oss-120b` | GPT-OSS 120B | 120B | Multilingual | Pro+ |
| `kimi-k2` | Kimi K2 | — | Chinese | Pro+ |
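The streaming example above prints raw SSE bytes as they arrive. If the gateway follows standard Server-Sent Events framing (an assumption — check the docs for the exact event shape), the buffered text can be split into `data:` payloads with a small parser:

```python
def parse_sse(buffer: str) -> list[str]:
    """Extract the payload of each `data:` line from an SSE text buffer.

    Assumes standard SSE framing (payload lines prefixed with "data:",
    events separated by blank lines); the gateway's actual event shape
    may differ.
    """
    payloads = []
    for line in buffer.splitlines():
        if line.startswith("data:"):
            payloads.append(line[len("data:"):].strip())
    return payloads

# Synthetic buffer, not real gateway output:
sample = "data: Hello\n\ndata: world\n\ndata: [DONE]\n\n"
print(parse_sse(sample))  # ['Hello', 'world', '[DONE]']
```

Feed each decoded chunk into a running buffer and parse on blank-line boundaries if events can be split across chunks.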
| Plan | Price | Requests/mo | Tokens/mo | RPM | RPH |
|---|---|---|---|---|---|
| Free | 0 USD | 200 | 20,000 | 3 | 60 |
| Starter | 5 USD | 3,000 | 300,000 | 15 | 300 |
| Pro | 15 USD | 75,000 | 7,500,000 | 50 | 1,500 |
| Plus | 39 USD | 500,000 | 50,000,000 | 150 | 5,000 |
All plans include streaming and monthly quota tracking without daily caps. Start free →
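The per-minute caps in the table can be honored client-side by spacing requests. A minimal sketch (the throttle class is illustrative, not part of the API; the limit value comes from the Free plan row above):

```python
import time

class RpmThrottle:
    """Client-side helper that spaces calls to stay under a requests-per-minute cap."""

    def __init__(self, rpm: int):
        self.min_interval = 60.0 / rpm  # seconds between consecutive requests
        self.last_call = None           # monotonic timestamp of the previous call

    def wait(self) -> float:
        """Sleep just long enough to honor the cap; return the delay applied."""
        now = time.monotonic()
        delay = 0.0
        if self.last_call is not None:
            delay = max(0.0, self.last_call + self.min_interval - now)
            if delay:
                time.sleep(delay)
        self.last_call = time.monotonic()
        return delay

# Free plan allows 3 RPM, i.e. at most one request every 20 seconds:
throttle = RpmThrottle(rpm=3)
print(throttle.min_interval)  # 20.0
```

Call `throttle.wait()` immediately before each request; the first call returns without sleeping.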
- Sign up at sfapi.pythonanywhere.com
- Verify your email
- Grab your API key from the dashboard
- Support bots - low-latency streaming responses with plan-safe quotas
- Coding assistants - route heavy tasks to GPT-OSS or coding-focused models
- Multilingual apps - Turkish-focused and multilingual models under one key
- Content pipelines - scale generation with usage analytics and upgrade paths
- Internal automation - summarization, tagging, and classification bots
Full API docs at sfapi.pythonanywhere.com/docs
Endpoints:
- `POST /api/v1/chat` — Chat (stream or sync)
- `GET /api/v1/stats` — Usage stats
- `GET /health` — Health check
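The three endpoints can be wrapped in a small client. A sketch assuming the request and response shapes shown earlier (the class and method names are illustrative, not an official SDK):

```python
import requests

class SixfingerClient:
    """Thin wrapper over the three documented endpoints (illustrative, not official)."""

    def __init__(self, api_key: str, base_url: str = "https://sfapi.pythonanywhere.com"):
        self.base_url = base_url.rstrip("/")
        self.headers = {"X-API-Key": api_key, "Content-Type": "application/json"}

    def chat(self, message: str, model: str = "gpt54-nano") -> dict:
        # Synchronous chat; for streaming, pass stream=True and iterate the response.
        r = requests.post(f"{self.base_url}/api/v1/chat", headers=self.headers,
                          json={"message": message, "model": model, "stream": False})
        r.raise_for_status()
        return r.json()

    def stats(self) -> dict:
        r = requests.get(f"{self.base_url}/api/v1/stats", headers=self.headers)
        r.raise_for_status()
        return r.json()

    def health(self) -> bool:
        return requests.get(f"{self.base_url}/health").status_code == 200

client = SixfingerClient("YOUR_KEY")
print(client.base_url)  # https://sfapi.pythonanywhere.com
```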
sixfingerdev@gmail.com · Built by @sixfingerdev
⭐ If this saved you time, a star helps a lot!
