
Streaming tool call input parameters dropped when translating Gemini → Anthropic Messages API format #25836

@dkindlund

Description


When a client uses the Anthropic Messages API with stream: true to call a Gemini model through LiteLLM, tool call input parameters are silently dropped. The content_block_start event has input: {} (correct per spec), but no input_json_delta events follow to deliver the actual arguments. The tool name is preserved — only the parameters are lost.

Non-streaming requests to the same model with the same tools work correctly.

Steps to reproduce

Non-streaming (works):

curl -X POST http://localhost:4000/v1/messages \
  -H "Authorization: Bearer sk-key" \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "gemini-2.5-pro",
    "max_tokens": 200,
    "tools": [{
      "name": "add_numbers",
      "description": "Add two numbers together.",
      "input_schema": {
        "type": "object",
        "properties": {
          "a": {"type": "number"},
          "b": {"type": "number"}
        },
        "required": ["a", "b"]
      }
    }],
    "messages": [{"role": "user", "content": "Use the add_numbers tool to compute 17 + 25."}]
  }'

Response correctly includes:

{"type": "tool_use", "name": "add_numbers", "input": {"a": 17, "b": 25}}

Streaming (broken):

curl -X POST http://localhost:4000/v1/messages \
  -H "Authorization: Bearer sk-key" \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "gemini-2.5-pro",
    "max_tokens": 200,
    "stream": true,
    "tools": [{
      "name": "add_numbers",
      "description": "Add two numbers together.",
      "input_schema": {
        "type": "object",
        "properties": {
          "a": {"type": "number"},
          "b": {"type": "number"}
        },
        "required": ["a", "b"]
      }
    }],
    "messages": [{"role": "user", "content": "Use the add_numbers tool to compute 17 + 25."}]
  }'

Streaming response shows:

event: content_block_start
data: {"type": "content_block_start", "index": 1, "content_block": {"type": "tool_use", "id": "...", "name": "add_numbers", "input": {}}}

event: content_block_stop
data: {"type": "content_block_stop", "index": 1}

No content_block_delta events with input_json_delta are emitted between content_block_start and content_block_stop. The tool arguments are completely lost.
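For comparison, per the Anthropic streaming spec a conformant stream should carry the arguments in one or more content_block_delta events between those two, along these lines (values illustrative):

```
event: content_block_delta
data: {"type": "content_block_delta", "index": 1, "delta": {"type": "input_json_delta", "partial_json": "{\"a\": 17, \"b\": 25}"}}
```

The client accumulates the partial_json fragments and parses the result when content_block_stop arrives; with no deltas emitted, that accumulated string is empty.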

Root cause analysis

The streaming translation path is: Gemini functionCall.args → OpenAI tool_calls[].function.arguments → Anthropic input_json_delta events.
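A minimal sketch of that three-hop translation for a single streamed chunk (helper names and dict shapes are illustrative, not LiteLLM's actual internals):

```python
import json

def gemini_to_openai_tool_delta(gemini_part: dict) -> dict:
    """Gemini functionCall part -> OpenAI-style tool_calls delta."""
    fc = gemini_part["functionCall"]
    return {
        "index": 0,
        "function": {
            "name": fc["name"],
            # args arrive as a dict and are JSON-stringified at this hop
            "arguments": json.dumps(fc["args"]),
        },
    }

def openai_to_anthropic_input_delta(tool_delta: dict, block_index: int) -> dict:
    """OpenAI arguments string -> Anthropic input_json_delta event payload."""
    return {
        "type": "content_block_delta",
        "index": block_index,
        "delta": {
            "type": "input_json_delta",
            "partial_json": tool_delta["function"]["arguments"],
        },
    }

chunk = {"functionCall": {"name": "add_numbers", "args": {"a": 17, "b": 25}}}
openai_delta = gemini_to_openai_tool_delta(chunk)
anthropic_event = openai_to_anthropic_input_delta(openai_delta, block_index=1)
```

Each hop is lossless in isolation, which is consistent with the per-file checks below all passing.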

Gemini → OpenAI translation (vertex_and_google_ai_studio_gemini.py lines 1440-1446): correctly extracts functionCall.args and json-stringifies them into function.arguments. ✓

OpenAI → Anthropic content_block_start (transformation.py lines 1213-1218): creates ToolUseBlock(input={}) — correct per the Anthropic streaming spec, since arguments should follow as deltas. ✓

OpenAI → Anthropic input_json_delta (transformation.py lines 1263-1302): code exists to extract tool.function.arguments and emit input_json_delta events. ✓

The gap: The code to generate input_json_delta events exists, but the events never reach the client. When Gemini sends the function call name and arguments in the same streaming chunk (rather than incrementally), the chunk is consumed by _translate_streaming_openai_chunk_to_anthropic_content_block() to create the content_block_start event, but the arguments from that same chunk are not re-processed to generate input_json_delta events.

The streaming iterator (streaming_iterator.py lines 121-148) detects a new content block, queues content_block_start, but the arguments that arrived in the same chunk are lost — the chunk is consumed and subsequent processing doesn't see them.
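One possible shape of a fix, sketched below with illustrative names (not LiteLLM's actual iterator code): when the chunk that opens a tool_use block already carries arguments, emit both the content_block_start and an input_json_delta from that same chunk instead of discarding the arguments.

```python
def events_for_tool_chunk(tool_call: dict, block_index: int) -> list:
    """Translate one OpenAI-style tool_calls delta that carries both the
    function name and its arguments into the Anthropic event(s) it implies."""
    events = [{
        "type": "content_block_start",
        "index": block_index,
        "content_block": {
            "type": "tool_use",
            "id": tool_call.get("id", ""),
            "name": tool_call["function"]["name"],
            "input": {},  # per spec: always empty here; arguments follow as deltas
        },
    }]
    args = tool_call["function"].get("arguments")
    if args:
        # Don't drop arguments that arrived in the same chunk as the name:
        # surface them as an input_json_delta before the block is closed.
        events.append({
            "type": "content_block_delta",
            "index": block_index,
            "delta": {"type": "input_json_delta", "partial_json": args},
        })
    return events

tool_call = {
    "id": "toolu_01",
    "function": {"name": "add_numbers", "arguments": "{\"a\": 17, \"b\": 25}"},
}
events = events_for_tool_chunk(tool_call, block_index=1)
```

The key point is only that the arguments must be read out of the chunk before it is consumed; whether that happens in the iterator or in the transformation helper is an implementation choice.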

Impact

  • All MCP tool-based workflows are broken for Gemini models through the Anthropic Messages API streaming path
  • The Claude Agent SDK (@anthropic-ai/claude-agent-sdk) uses streaming by default, making all Gemini tool calls fail
  • The model correctly understands it should call tools but can never pass parameters, causing infinite retry loops until max_turns is exceeded
  • Affects gemini-2.5-pro and likely all Gemini models with tool use

Environment

  • LiteLLM v1.83.3-stable
  • Gemini models via Vertex AI
  • Client: Anthropic Messages API with stream: true
  • Same tools work correctly with Claude models through the same proxy
  • Same tools work correctly with Gemini in non-streaming mode

Workarounds

  1. Use Claude models instead of Gemini for tool-calling agent workflows
  2. Use the Claude Agent SDK's single message (non-streaming) input mode instead of streaming input mode — but this loses image upload, hooks, and interruption support
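Until the proxy is fixed, a client can at least detect when it has hit this bug by tracking whether any input_json_delta arrived for each tool_use block, and falling back to a non-streaming request when one closed empty. A hypothetical detector (not part of any SDK):

```python
def find_empty_tool_blocks(events: list) -> list:
    """Return indices of tool_use blocks that closed without any
    input_json_delta, i.e. tool calls whose arguments were dropped."""
    open_tools = {}  # block index -> whether any input_json_delta was seen
    empty = []
    for ev in events:
        if (ev["type"] == "content_block_start"
                and ev["content_block"]["type"] == "tool_use"):
            open_tools[ev["index"]] = False
        elif (ev["type"] == "content_block_delta"
                and ev["delta"]["type"] == "input_json_delta"
                and ev["index"] in open_tools):
            open_tools[ev["index"]] = True
        elif ev["type"] == "content_block_stop" and ev["index"] in open_tools:
            if not open_tools.pop(ev["index"]):
                empty.append(ev["index"])
    return empty

# The broken sequence from this report: start then stop, no deltas.
broken = [
    {"type": "content_block_start", "index": 1,
     "content_block": {"type": "tool_use", "id": "x",
                       "name": "add_numbers", "input": {}}},
    {"type": "content_block_stop", "index": 1},
]
flagged = find_empty_tool_blocks(broken)
```

This is a diagnostic aid only; it cannot recover the lost arguments.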
