# Bifrost

[![Go Report Card](https://goreportcard.com/badge/github.com/maximhq/bifrost/core)](https://goreportcard.com/report/github.com/maximhq/bifrost/core) [![Discord badge](https://dcbadge.limes.pink/api/server/https://discord.gg/exN5KAydbU?style=flat)](https://discord.gg/exN5KAydbU) [![Known Vulnerabilities](https://snyk.io/test/github/maximhq/bifrost/badge.svg)](https://snyk.io/test/github/maximhq/bifrost) [![codecov](https://codecov.io/gh/maximhq/bifrost/branch/main/graph/badge.svg)](https://codecov.io/gh/maximhq/bifrost) ![Docker Pulls](https://img.shields.io/docker/pulls/maximhq/bifrost) [<img src="https://run.pstmn.io/button.svg" alt="Run In Postman" style="width: 95px; height: 21px;">](https://app.getpostman.com/run-collection/31642484-2ba0e658-4dcd-49f4-845a-0c7ed745b916?action=collection%2Ffork&source=rip_markdown&collection-url=entityId%3D31642484-2ba0e658-4dcd-49f4-845a-0c7ed745b916%26entityType%3Dcollection%26workspaceId%3D63e853c8-9aec-477f-909c-7f02f543150e) [![Artifact Hub](https://img.shields.io/endpoint?url=https://artifacthub.io/badge/repository/bifrost)](https://artifacthub.io/packages/search?repo=bifrost) [![License](https://img.shields.io/github/license/maximhq/bifrost)](LICENSE)

The fastest LLM gateway (50x faster than LiteLLM), with an adaptive load balancer, cluster mode, guardrails, support for 1,000+ models, and <100 µs overhead at 5k RPS.

## The fastest way to build AI applications that never go down

Bifrost is a high-performance AI gateway that unifies access to 15+ providers (OpenAI, Anthropic, AWS Bedrock, Google Vertex, and more) through a single OpenAI-compatible API. Deploy in seconds with zero configuration and get automatic failover, load balancing, semantic caching, and enterprise-grade features.

## Quick Start

![Get started](./docs/media/getting-started.png)

**Go from zero to production-ready AI gateway in under a minute.**

**Step 1:** Start the Bifrost Gateway

```bash
# Install and run locally
npx -y @maximhq/bifrost

# Or use Docker
docker run -p 8080:8080 maximhq/bifrost
```

**Step 2:** Configure via the Web UI

```bash
# Open the built-in web interface
open http://localhost:8080
```

**Step 3:** Make your first API call

```bash
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello, Bifrost!"}]
  }'
```

**That's it!** Your AI gateway is running, with a web interface for visual configuration, real-time monitoring, and analytics.
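Because the endpoint is OpenAI-compatible, streaming works the same way. A minimal sketch, assuming the standard OpenAI `stream` flag and the same provider-prefixed model name as above:

```bash
# Stream the response as server-sent events (-N disables curl's output buffering)
curl -N -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "stream": true,
    "messages": [{"role": "user", "content": "Write a haiku about gateways."}]
  }'
```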
**Complete Setup Guides:**

- [Gateway Setup](https://docs.getbifrost.ai/quickstart/gateway/setting-up) - HTTP API deployment
- [Go SDK Setup](https://docs.getbifrost.ai/quickstart/go-sdk/setting-up) - Direct integration

---

## Key Features

### Core Infrastructure

- **[Unified Interface](https://docs.getbifrost.ai/features/unified-interface)** - Single OpenAI-compatible API for all providers (see the request sketch after this feature list)
- **[Multi-Provider Support](https://docs.getbifrost.ai/quickstart/gateway/provider-configuration)** - OpenAI, Anthropic, AWS Bedrock, Google Vertex, Azure, Cerebras, Cohere, Mistral, Ollama, Groq, and more
- **[Automatic Fallbacks](https://docs.getbifrost.ai/features/fallbacks)** - Seamless failover between providers and models with zero downtime
- **[Load Balancing](https://docs.getbifrost.ai/features/fallbacks)** - Intelligent request distribution across multiple API keys and providers

### Advanced Features

- **[Model Context Protocol (MCP)](https://docs.getbifrost.ai/features/mcp)** - Enable AI models to use external tools (filesystem, web search, databases)
- **[Semantic Caching](https://docs.getbifrost.ai/features/semantic-caching)** - Intelligent response caching based on semantic similarity to reduce costs and latency
- **[Multimodal Support](https://docs.getbifrost.ai/quickstart/gateway/streaming)** - Text, images, audio, and streaming, all behind a common interface
- **[Custom Plugins](https://docs.getbifrost.ai/enterprise/custom-plugins)** - Extensible middleware architecture for analytics, monitoring, and custom logic
- **[Governance](https://docs.getbifrost.ai/features/governance)** - Usage tracking, rate limiting, and fine-grained access control

### Enterprise & Security

- **[Budget Management](https://docs.getbifrost.ai/features/governance)** - Hierarchical cost control with virtual keys, teams, and customer budgets
- **[SSO Integration](https://docs.getbifrost.ai/features/sso-with-google-github)** - Google and GitHub authentication support
- **[Observability](https://docs.getbifrost.ai/features/observability)** - Native Prometheus metrics, distributed tracing, and comprehensive logging
- **[Vault Support](https://docs.getbifrost.ai/enterprise/vault-support)** - Secure API key management with HashiCorp Vault integration

### Developer Experience

- **[Zero-Config Startup](https://docs.getbifrost.ai/quickstart/gateway/setting-up)** - Start immediately with dynamic provider configuration
- **[Drop-in Replacement](https://docs.getbifrost.ai/features/drop-in-replacement)** - Replace OpenAI/Anthropic/GenAI APIs with one line of code
- **[SDK Integrations](https://docs.getbifrost.ai/integrations/what-is-an-integration)** - Native support for popular AI SDKs with zero code changes
- **[Configuration Flexibility](https://docs.getbifrost.ai/quickstart/gateway/provider-configuration)** - Web UI, API-driven, or file-based configuration options
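To show what the unified interface means in practice, the same request shape targets a different provider just by changing the provider-prefixed model string. A hedged sketch; the Anthropic model identifier below is only an illustrative example:

```bash
# Same OpenAI-compatible request; only the "provider/model" string changes
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-3-5-sonnet-20240620",
    "messages": [{"role": "user", "content": "Hello from the same API shape!"}]
  }'
```

Fallbacks, load balancing, and caching sit behind this same endpoint, so callers never need provider-specific branching.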
---

## Repository Structure

Bifrost uses a modular architecture for maximum flexibility:

```text
bifrost/
├── npx/                 # NPX script for easy installation
├── core/                # Core functionality and shared components
│   ├── providers/       # Provider-specific implementations (OpenAI, Anthropic, etc.)
│   ├── schemas/         # Interfaces and structs used throughout Bifrost
│   └── bifrost.go       # Main Bifrost implementation
├── framework/           # Framework components for data persistence
│   ├── configstore/     # Configuration storage backends
│   ├── logstore/        # Request log storage backends
│   └── vectorstore/     # Vector storage backends
├── transports/          # HTTP gateway and other interface layers
│   └── bifrost-http/    # HTTP transport implementation
├── ui/                  # Web interface for HTTP gateway
├── plugins/             # Extensible plugin system
│   ├── governance/      # Budget management and access control
│   ├── jsonparser/      # JSON parsing and manipulation utilities
│   ├── logging/         # Request logging and analytics
│   ├── maxim/           # Maxim's observability integration
│   ├── mocker/          # Mock responses for testing and development
│   ├── semanticcache/   # Intelligent response caching
│   └── telemetry/       # Monitoring and observability
├── docs/                # Documentation and guides
└── tests/               # Comprehensive test suites
```

---

## Getting Started Options

Choose the deployment method that fits your needs:

### 1. Gateway (HTTP API)

**Best for:** Language-agnostic integration, microservices, and production deployments

```bash
# NPX - Get started in 30 seconds
npx -y @maximhq/bifrost

# Docker - Production ready
docker run -p 8080:8080 -v $(pwd)/data:/app/data maximhq/bifrost
```

**Features:** Web UI, real-time monitoring, multi-provider management, zero-config startup

**Learn More:** [Gateway Setup Guide](https://docs.getbifrost.ai/quickstart/gateway/setting-up)

### 2. Go SDK

**Best for:** Direct Go integration with maximum performance and control

```bash
go get github.com/maximhq/bifrost/core
```

**Features:** Native Go APIs, embedded deployment, custom middleware integration

**Learn More:** [Go SDK Guide](https://docs.getbifrost.ai/quickstart/go-sdk/setting-up)

### 3. Drop-in Replacement

**Best for:** Migrating existing applications with zero code changes

```diff
# OpenAI SDK
- base_url = "https://api.openai.com"
+ base_url = "http://localhost:8080/openai"

# Anthropic SDK
- base_url = "https://api.anthropic.com"
+ base_url = "http://localhost:8080/anthropic"

# Google GenAI SDK
- api_endpoint = "https://generativelanguage.googleapis.com"
+ api_endpoint = "http://localhost:8080/genai"
```

**Learn More:** [Integration Guides](https://docs.getbifrost.ai/integrations/what-is-an-integration)
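For SDKs that read their base URL from the environment, the switch can happen without touching application code at all. A minimal sketch, assuming the official OpenAI Python SDK (which reads `OPENAI_BASE_URL`) and that provider keys are already configured inside Bifrost; `app.py` is a hypothetical entry point:

```bash
# Route an existing OpenAI-SDK application through Bifrost via environment variables
export OPENAI_BASE_URL="http://localhost:8080/openai"   # path taken from the diff above
export OPENAI_API_KEY="bifrost-placeholder"             # real provider keys live in Bifrost's config
python app.py
```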
---

## Performance

Bifrost adds virtually zero overhead to your AI requests. In sustained 5,000 RPS benchmarks, the gateway added only **11 µs** of overhead per request.

| Metric                                | t3.medium | t3.xlarge   | Improvement        |
| ------------------------------------- | --------- | ----------- | ------------------ |
| Added latency (Bifrost overhead)      | 59 µs     | **11 µs**   | **-81%**           |
| Success rate @ 5k RPS                 | 100%      | 100%        | No failed requests |
| Avg. queue wait time                  | 47 µs     | **1.67 µs** | **-96%**           |
| Avg. request latency (incl. provider) | 2.12 s    | **1.61 s**  | **-24%**           |

**Key Performance Highlights:**

- **Perfect Success Rate** - 100% request success rate even at 5k RPS
- **Minimal Overhead** - Less than 15 µs additional latency per request
- **Efficient Queuing** - Sub-microsecond average wait times
- **Fast Key Selection** - ~10 ns to pick weighted API keys

**Complete Benchmarks:** [Performance Analysis](https://docs.getbifrost.ai/benchmarking/getting-started)

---

## Documentation

**Complete Documentation:** [https://docs.getbifrost.ai](https://docs.getbifrost.ai)

### Quick Start

- [Gateway Setup](https://docs.getbifrost.ai/quickstart/gateway/setting-up) - HTTP API deployment in 30 seconds
- [Go SDK Setup](https://docs.getbifrost.ai/quickstart/go-sdk/setting-up) - Direct Go integration
- [Provider Configuration](https://docs.getbifrost.ai/quickstart/gateway/provider-configuration) - Multi-provider setup

### Features

- [Multi-Provider Support](https://docs.getbifrost.ai/features/unified-interface) - Single API for all providers
- [MCP Integration](https://docs.getbifrost.ai/features/mcp) - External tool calling
- [Semantic Caching](https://docs.getbifrost.ai/features/semantic-caching) - Intelligent response caching
- [Fallbacks & Load Balancing](https://docs.getbifrost.ai/features/fallbacks) - Reliability features
- [Budget Management](https://docs.getbifrost.ai/features/governance) - Cost control and governance

### Integrations

- [OpenAI SDK](https://docs.getbifrost.ai/integrations/openai-sdk) - Drop-in OpenAI replacement
- [Anthropic SDK](https://docs.getbifrost.ai/integrations/anthropic-sdk) - Drop-in Anthropic replacement
- [AWS Bedrock SDK](https://docs.getbifrost.ai/integrations/bedrock-sdk) - AWS Bedrock integration
- [Google GenAI SDK](https://docs.getbifrost.ai/integrations/genai-sdk) - Drop-in GenAI replacement
- [LiteLLM SDK](https://docs.getbifrost.ai/integrations/litellm-sdk) - LiteLLM integration
- [Langchain SDK](https://docs.getbifrost.ai/integrations/langchain-sdk) - Langchain integration

### Enterprise

- [Custom Plugins](https://docs.getbifrost.ai/enterprise/custom-plugins) - Extend functionality
- [Clustering](https://docs.getbifrost.ai/enterprise/clustering) - Multi-node deployment
- [Vault Support](https://docs.getbifrost.ai/enterprise/vault-support) - Secure key management
- [Production Deployment](https://docs.getbifrost.ai/deployment/docker-setup) - Scaling and monitoring

---

## Need Help?

**[Join our Discord](https://discord.gg/exN5KAydbU)** for community support and discussions. Get help with:

- Quick setup assistance and troubleshooting
- Best practices and configuration tips
- Community discussions and support
- Real-time help with integrations

---

## Contributing

We welcome contributions of all kinds! See our [Contributing Guide](https://docs.getbifrost.ai/contributing/setting-up-repo) for:

- Setting up the development environment
- Code conventions and best practices
- How to submit pull requests
- Building and testing locally

For development requirements and build instructions, see our [Development Setup Guide](https://docs.getbifrost.ai/contributing/building-a-plugins).
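If you just want to build and test from source, here is a rough local sketch, assuming a recent Go toolchain; the module layout is an assumption, so treat the paths as illustrative and follow the guides above for anything authoritative:

```bash
# Clone the monorepo and build/test the core module
git clone https://github.com/maximhq/bifrost.git
cd bifrost/core    # assumes each top-level directory is its own Go module
go build ./...     # compile everything in this module
go test ./...      # some tests may expect provider API keys to be set
```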
---

## License

This project is licensed under the Apache 2.0 License - see the [LICENSE](LICENSE) file for details.

---

Built with ❤️ by [Maxim](https://github.com/maximhq)