Skip to main content

v1.80.7-stable - RAG API

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaff
CTO, LiteLLM

Deploy this version​

docker run litellm
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.80.7

Key Highlights​


Organization Usage​

Users can now filter usage statistics by organization, providing the same granular filtering capabilities available for teams.

Details:

  • Filter usage analytics, spend logs, and activity metrics by organization ID
  • View organization-level breakdowns alongside existing team and user-level filters
  • Consistent filtering experience across all usage and analytics views

PR #16560, PR #17181


New Providers and Endpoints​

New Providers​

ProviderSupported EndpointsDescription
Public AIChat completionsSupport for publicai.co provider
Eleven LabsText-to-speechText-to-speech provider integration

New LLM API Endpoints​

EndpointMethodDescriptionDocumentation
/v1/skillsPOSTAnthropic Skills API for extended context tool callingSkills API
/rag/ingestPOSTUnified RAG API with Vertex AI RAG and Vector StoresRAG API

New Models / Updated Models​

New Model Support​

ProviderModelContext WindowInput ($/1M tokens)Output ($/1M tokens)Features
Anthropicclaude-opus-4-5-20251101200K$5.00$25.00Chat, reasoning, vision, function calling, prompt caching
Bedrockanthropic.claude-opus-4-5-20251101-v1:0200K$5.00$25.00Chat, reasoning, vision, function calling, prompt caching
Bedrockus.anthropic.claude-opus-4-5-20251101-v1:0200K$5.00$25.00Chat, reasoning, vision, function calling, prompt caching
Bedrockamazon.nova-canvas-v1:0--$0.06/imageImage generation
OpenRouteropenrouter/anthropic/claude-opus-4.5200K$5.00$25.00Chat, reasoning, vision, function calling, prompt caching
Vertex AIvertex_ai/claude-opus-4-5200K$5.00$25.00Chat, reasoning, vision, function calling, prompt caching
Vertex AIvertex_ai/claude-opus-4-5@20251101200K$5.00$25.00Chat, reasoning, vision, function calling, prompt caching
Azureazure_ai/claude-opus-4-1200K$15.00$75.00Chat, reasoning, vision, function calling, prompt caching
Azureazure_ai/claude-sonnet-4-5200K$3.00$15.00Chat, reasoning, vision, function calling, prompt caching
Azureazure_ai/claude-haiku-4-5200K$1.00$5.00Chat, reasoning, vision, function calling, prompt caching
Fireworks AIfireworks_ai/accounts/fireworks/models/glm-4p6202K$0.55$2.19Chat, function calling
Public AIpublicai/swiss-ai/apertus-8b-instruct8KFreeFreeChat, function calling
Public AIpublicai/swiss-ai/apertus-70b-instruct8KFreeFreeChat, function calling
Public AIpublicai/aisingapore/Gemma-SEA-LION-v4-27B-IT8KFreeFreeChat, function calling
Public AIpublicai/BSC-LT/salamandra-7b-instruct-tools-16k16KFreeFreeChat, function calling
Public AIpublicai/BSC-LT/ALIA-40b-instruct_Q8_08KFreeFreeChat, function calling
Public AIpublicai/allenai/Olmo-3-7B-Instruct32KFreeFreeChat, function calling
Public AIpublicai/aisingapore/Qwen-SEA-LION-v4-32B-IT32KFreeFreeChat, function calling
Public AIpublicai/allenai/Olmo-3-7B-Think32KFreeFreeChat, function calling, reasoning
Public AIpublicai/allenai/Olmo-3-32B-Think32KFreeFreeChat, function calling, reasoning
Cohereembed-multilingual-light-v3.01K$0.10-Embeddings, supports images
WatsonXwatsonx/whisper-large-v3-turbo-$0.0001/sec-Audio transcription

Features​

  • Anthropic

    • Add claude opus 4.5 model support - PR #17043
    • Add day 0 support for anthropic Tool Search, Programmatic Tool Calling, Input Examples, Effort Parameter - PR #17091, Docs
    • Add Anthropic Effort Parameter support - PR #17091
  • Bedrock

    • Fix bedrock claude opus 4.5 inference profile - only global currently - PR #17101
    • Add OpenAI compatible bedrock imported models (qwen etc) - PR #17097
    • Fix bedrock passthrough auth issue - PR #16879
    • Make Bedrock image generation more consistent - PR #17021
  • Azure

    • Add support for azure anthropic models via chat completion - PR #16886
    • Fix the azure auth format for videos - PR #17009
    • Fix reasoning_effort="none" not working on Azure for GPT-5.1 - PR #17071
    • Add GA protocol as configurable parameter for azure openai realtime api - PR #17096
  • OpenRouter

  • Fireworks AI

    • Add fireworks_ai/accounts/fireworks/models/glm-4p6 - PR #17154
  • Vertex AI

    • Add vertex ai image gen support for both gemini and imagen models - PR #17070
    • Handle global location in context caching - PR #16997
    • Fix CreateCachedContentRequest enum error - PR #16965
    • Use the correct domain for the global location when counting tokens - PR #17116
    • Support Vertex AI batch listing in LiteLLM proxy - PR #17079
    • Fix default sample count for image generation - PR #16403
  • Gemini

    • Add gemini file search support - PR #17124
    • Add gemini-3-pro-image-preview model support for imageSize parameter - PR #17019
    • Handle None or empty contents in Gemini token counter - PR #17020
    • Skip thinking config for image models - PR #17027
  • WatsonX

    • Add audio transcriptions for WatsonX - PR #17160
  • OpenAI

    • Fix gpt-5.1 temperature support when reasoning_effort is "none" or not specified - PR #17011
  • Public AI

  • Cohere

    • Add cost tracking for cohere embed passthrough endpoint - PR #17029
  • Eleven Labs

    • Integrate eleven labs text-to-speech - PR #16573

Bug Fixes​

  • OCI
    • Fix pydantic validation errors during tool call with streaming - PR #16899

LLM API Endpoints​

Features​

  • Skills API (Anthropic)

    • New API - Claude Skills API. Create, List, Delete, Update Claude Skills - PR #17042, Docs
  • RAG API

    • New RAG API on LiteLLM AI Gateway (use with OpenAI Vector Store, Bedrock Knowledge Bases, Vertex AI RAG Engine) - PR #17109
    • Add support for Vertex RAG engine - PR #17117
    • Allow internal user keys to access api, allow using litellm credentials with API - PR #17169
  • Search API

    • Add search API logging and cost tracking in LiteLLM Proxy - PR #17078
  • Responses API

    • Fix prevent duplicate spend logs in Responses API for non-OpenAI providers - PR #16992
    • Support response_format parameter in completion -> responses bridge - PR #16844
    • Fix mcp tool call response logging + remove unmapped param error mid-stream - allows gpt-5 web search to work via responses api - PR #16946
    • Add header passing support for MCP tools in Responses API - PR #16877
  • Image Edits API

  • Audio Transcription API

    • Add transcription exception handling for /audio/transcriptions - PR #16791
    • Fix 401 when audio/transcriptions - PR #17023
  • Embeddings API

    • Add header forwarding in embeddings - PR #16869
  • Passthrough Endpoints

    • Add cost tracking for streaming in vertex ai passthrough - PR #16874
    • Add cost tracking for cohere embed passthrough endpoint - PR #17029
  • Vector Stores

    • Add method for extracting vector store ids from path params - PR #16566
  • General

    • Fix propagate x-litellm-model-id in responses - PR #16986
    • Preserve content field even if null - PR #16988
    • Include server_tool_use in streaming usage - PR #16826
    • Fix Thinking may not be enabled when tool_choice forces tool use - PR #17129
    • Add missing standard logging object fields - PR #17135

Bugs​

  • General
    • Fix vector Store List Endpoint Returns 404 - PR #17229
    • Fix Videos lint errors - PR #17125
    • Do not include plaintext message in exception - PR #17216

Management Endpoints / UI​

Features​

  • Virtual Keys

  • Models + Endpoints

    • Allow adding Bedrock API Key when adding models - PR #17153
    • Add aws_bedrock_runtime_endpoint into Credential Types - PR #17053
    • Change provider create fields to JSON - PR #16985
    • Change model_hub_table to call getUiConfig before Fetching Public Data - PR #17166
    • Improve Wording for Config Models in Model Table - PR #17100
  • Teams & Users

    • Deleting a User From Team Deletes key User Created for Team - PR #17057
    • Hide Default Team Settings From Proxy Admin Viewers - PR #16900
    • Add No Default Models for Team and User Settings - PR #17037
    • User Table Sort by All - PR #17108
    • Org Admin Team Permissions Fix - PR #17110
    • Better Loading State for Internal User Page - PR #17168
  • Permission Management

    • Add reject_metadata_tags to prevent users from sending metadata.tags directly in requests - PR #17088
    • Disable global guardrails by key/team - PR #16983
    • Tool permission argument check - PR #16982
    • Add UI support for configuring tool permission guardrails - PR #17050
  • MCP Gateway

    • Add backend support for OAuth2 auth_type registration via UI - PR #17006
    • Add UI support for registering MCP OAuth2 auth_type - PR #17007
  • General UI Improvements

    • Ensure Unique Keys in Navbar Menu Items - PR #16987
    • Minor Cosmetic Changes for Buttons, Add Notification for Delete Team - PR #16984
    • Change Delete Modals to Common Component - PR #17068
    • Disable edit, delete, info for dynamically generated spend tags - PR #17098
    • Migrate modelInfoCall to ReactQuery - PR #17123
    • Migrate Provider Fields to React Query - PR #17177
    • Fix Flaky Test - PR #17161
    • Change Add Fallback Modal to use Antd Select - PR #17223
  • Infrastructure

    • Non Root Docker Build - PR #17060
    • Add nodejs and npm to docker image for prisma generate - PR #16903
    • Bump: version 0.4.8 → 0.4.9 - PR #17163
  • Helm

    • Enhancement: ServiceMonitor template rendering - PR #17038

Bugs​

  • Database
    • Distinguish permission errors from idempotent errors in Prisma migrations - PR #17064

AI Integrations​

Logging​

  • General
    • Model Armor - Logging guardrail response on llm responses - PR #16977
    • Add missing standard logging object fields - PR #17135
    • Add cost tracking for cohere embed passthrough endpoint - PR #17029
    • Add cost tracking for streaming in vertex ai passthrough - PR #16874

Guardrails​

  • Presidio

    • Add presidio pii masking tutorial with litellm - PR #16969
  • General

    • Prompt security litellm - PR #16365
    • Add guardrails for pass through endpoints - PR #17221
    • Allow adding pass through guardrails through UI - PR #17226

Prompt Management​

  • General
    • AI gateway prompt management documentation - PR #16990

MCP Gateway​

  • OAuth 2.0

    • Add backend support for OAuth2 auth_type registration via UI - PR #17006
    • Add UI support for registering MCP OAuth2 auth_type - PR #17007
  • Tool Permissions

    • Tool permission argument check - PR #16982
    • Add UI support for configuring tool permission guardrails - PR #17050
  • Configuration

    • Remove unused MCP_PROTOCOL_VERSION_HEADER_NAME constant - PR #17008
    • Add header passing support for MCP tools in Responses API - PR #16877
    • Fix missing await - PR #17103

Performance / Loadbalancing / Reliability improvements​

  • Memory Optimization

    • Lazy-load cost_calculator & logging to reduce memory + import time - PR #17089
  • Dependency Management

  • Database Performance

    • Optimize date filtering for spend logs queries - PR #17073
  • Request Handling

    • Add automatic LiteLLM context headers (Pillar integration) - PR #17076
  • Generic API Support

    • Make generic api OSS + support multiple generic API's - PR #17152

Documentation Updates​

  • Provider Documentation

  • General Documentation

    • AI gateway prompt management - PR #16990
    • Cleanup README and improve agent guides - PR #17003
    • Update broken documentation links in README - PR #17002
    • Update version and add preview tag - PR #17032
    • Document model pricing contribution process - PR #17031
    • Document event hook usage - PR #17035
    • Link to logging spec in callback docs - PR #17049
    • Add OpenAI Agents SDK to projects - PR #17203
    • Fix unspecified issue - PR #17034

New Contributors​

  • @prawaan made their first contribution in PR #16997
  • @lior-ps made their first contribution in PR #16365
  • @HaiyiMei made their first contribution in PR #17020
  • @yuya2017 made their first contribution in PR #17064
  • @saar-win made their first contribution in PR #17038
  • @sdip15fa made their first contribution in PR #16965
  • @KeremTurgutlu made their first contribution in PR #16826
  • @choigawoon made their first contribution in PR #17019
  • @SamAcctX made their first contribution in PR #17144
  • @naaa760 made their first contribution in PR #17079
  • @abi-jey made their first contribution in PR #17096
  • @hxyannay made their first contribution in PR #16734

Full Changelog​