Trendshift - Ask AI

Based on: A powerful framework for building realtime voice AI agents 🤖🎙️📹

<picture>
  <source media="(prefers-color-scheme: dark)" srcset="/.github/banner_dark.png">
  <source media="(prefers-color-scheme: light)" srcset="/.github/banner_light.png">
  <img style="width:100%;" alt="The LiveKit icon, the name of the repository and some sample code in the background." src="https://raw.githubusercontent.com/livekit/agents/main/.github/banner_light.png">
</picture>

<br />

![PyPI - Version](https://img.shields.io/pypi/v/livekit-agents)
[![PyPI Downloads](https://static.pepy.tech/badge/livekit-agents/month)](https://pepy.tech/projects/livekit-agents)
[![Slack community](https://img.shields.io/endpoint?url=https%3A%2F%2Flivekit.io%2Fbadges%2Fslack)](https://livekit.io/join-slack)
[![Twitter Follow](https://img.shields.io/twitter/follow/livekit)](https://twitter.com/livekit)
[![Ask DeepWiki for understanding the codebase](https://deepwiki.com/badge.svg)](https://deepwiki.com/livekit/agents)
[![License](https://img.shields.io/github/license/livekit/livekit)](https://github.com/livekit/livekit/blob/master/LICENSE)

<br />

Looking for the JS/TS library? Check out [AgentsJS](https://github.com/livekit/agents-js).

## ✨ 1.0 release ✨

This README reflects the 1.0 release. For documentation on the previous 0.x release, see the [0.x branch](https://github.com/livekit/agents/tree/0.x).

## What is Agents?

The **Agents framework** enables you to build voice AI agents that can see, hear, and speak in realtime. It provides a fully open-source platform for creating server-side agentic applications.

## Features

- **Flexible integrations**: A comprehensive ecosystem to mix and match the right STT, LLM, TTS, and Realtime API to suit your use case.
- **Integrated job scheduling**: Built-in task scheduling and distribution with [dispatch APIs](https://docs.livekit.io/agents/build/dispatch/) to connect end users to agents.
- **Extensive WebRTC clients**: Build client applications using LiveKit's open-source SDK ecosystem, supporting nearly all major platforms.
- **Telephony integration**: Works seamlessly with LiveKit's [telephony stack](https://docs.livekit.io/sip/), allowing your agent to make calls to or receive calls from phones.
- **Exchange data with clients**: Use [RPCs](https://docs.livekit.io/home/client/data/rpc/) and other [Data APIs](https://docs.livekit.io/home/client/data/) to seamlessly exchange data with clients.
- **Semantic turn detection**: Uses a transformer model to detect when a user is done with their turn, which helps reduce interruptions.
- **MCP support**: Native support for MCP. Integrate tools provided by MCP servers with a single line of code.
- **Open-source**: Fully open-source, allowing you to run the entire stack on your own servers, including [LiveKit server](https://github.com/livekit/livekit), one of the most widely used WebRTC media servers.

## Installation

To install the core Agents library, along with plugins for popular model providers:

```bash
pip install "livekit-agents[openai,silero,deepgram,cartesia,turn-detector]~=1.0"
```

## Docs and guides

Documentation on the framework and how to use it can be found [here](https://docs.livekit.io/agents/).

## Core concepts

- Agent: An LLM-based application with defined instructions.
- AgentSession: A container for agents that manages interactions with end users.
- entrypoint: The starting point for an interactive session, similar to a request handler in a web server.
- Worker: The main process that coordinates job scheduling and launches agents for user sessions.
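
To make the relationship between these four concepts concrete, here is a minimal, standard-library-only sketch. It does not use the LiveKit API: every name in it (`ToyAgent`, `ToySession`, `ToyJobContext`, `ToyWorker`, `run_demo`) is hypothetical and exists only for illustration. It assumes a worker simply calls the entrypoint once per job, the way a web server calls a request handler once per request.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable, List, Optional


@dataclass
class ToyJobContext:
    """Hypothetical per-job context handed to the entrypoint."""

    room: str
    log: List[str] = field(default_factory=list)


@dataclass
class ToyAgent:
    """Hypothetical agent: nothing more than its instructions."""

    instructions: str


class ToySession:
    """Hypothetical session: binds one agent to one job."""

    def __init__(self) -> None:
        self.agent: Optional[ToyAgent] = None

    async def start(self, agent: ToyAgent, ctx: ToyJobContext) -> None:
        self.agent = agent
        ctx.log.append(f"session started in {ctx.room}: {agent.instructions}")


Entrypoint = Callable[[ToyJobContext], Awaitable[None]]


class ToyWorker:
    """Hypothetical worker: runs the entrypoint once for every job."""

    def __init__(self, entrypoint: Entrypoint) -> None:
        self.entrypoint = entrypoint

    async def dispatch(self, rooms: List[str]) -> List[ToyJobContext]:
        contexts = [ToyJobContext(room=room) for room in rooms]
        # Each job gets its own context and its own entrypoint call.
        await asyncio.gather(*(self.entrypoint(ctx) for ctx in contexts))
        return contexts


async def entrypoint(ctx: ToyJobContext) -> None:
    # The entrypoint creates the session and agent for this one job.
    session = ToySession()
    await session.start(ToyAgent("Be a friendly assistant."), ctx)


def run_demo() -> List[str]:
    contexts = asyncio.run(ToyWorker(entrypoint).dispatch(["room-a", "room-b"]))
    return [line for ctx in contexts for line in ctx.log]


if __name__ == "__main__":
    for line in run_demo():
        print(line)
```

The real framework adds job scheduling, media transport, and model plugins on top of this shape; the sketch only shows who creates what.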
## Usage

### Simple voice agent

---

```python
from livekit.agents import (
    Agent,
    AgentSession,
    JobContext,
    RunContext,
    WorkerOptions,
    cli,
    function_tool,
)
from livekit.plugins import deepgram, elevenlabs, openai, silero


@function_tool
async def lookup_weather(
    context: RunContext,
    location: str,
):
    """Used to look up weather information."""

    return {"weather": "sunny", "temperature": 70}


async def entrypoint(ctx: JobContext):
    await ctx.connect()

    agent = Agent(
        instructions="You are a friendly voice assistant built by LiveKit.",
        tools=[lookup_weather],
    )
    session = AgentSession(
        vad=silero.VAD.load(),
        # any combination of STT, LLM, TTS, or realtime API can be used
        stt=deepgram.STT(model="nova-3"),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=elevenlabs.TTS(),
    )

    await session.start(agent=agent, room=ctx.room)
    await session.generate_reply(instructions="greet the user and ask about their day")


if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
```

You'll need the following environment variables for this example:

- DEEPGRAM_API_KEY
- OPENAI_API_KEY
- ELEVEN_API_KEY (for the ElevenLabs TTS plugin used above)

### Multi-agent handoff

---

This code snippet is abbreviated. For the full example, see [multi_agent.py](examples/voice_agents/multi_agent.py)

```python
...
class IntroAgent(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions="You are a storyteller. Your goal is to gather a few pieces of information from the user to make the story personalized and engaging. "
            "Ask the user for their name and where they are from"
        )

    async def on_enter(self):
        self.session.generate_reply(instructions="greet the user and gather information")

    @function_tool
    async def information_gathered(
        self,
        context: RunContext,
        name: str,
        location: str,
    ):
        """Called when the user has provided the information needed to make the story personalized and engaging.

        Args:
            name: The name of the user
            location: The location of the user
        """

        context.userdata.name = name
        context.userdata.location = location

        # returning an agent from a tool hands the session over to that agent
        story_agent = StoryAgent(name, location)
        return story_agent, "Let's start the story!"


class StoryAgent(Agent):
    def __init__(self, name: str, location: str, chat_ctx=None) -> None:
        super().__init__(
            instructions="You are a storyteller. Use the user's information in order to make the story personalized. "
            f"The user's name is {name}, from {location}",
            # override the default model, switching to Realtime API from standard LLMs
            llm=openai.realtime.RealtimeModel(voice="echo"),
            # optional: pass the previous agent's chat context to carry the conversation over
            chat_ctx=chat_ctx,
        )

    async def on_enter(self):
        self.session.generate_reply()


async def entrypoint(ctx: JobContext):
    await ctx.connect()

    userdata = StoryData()
    session = AgentSession[StoryData](
        vad=silero.VAD.load(),
        stt=deepgram.STT(model="nova-3"),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=openai.TTS(voice="echo"),
        userdata=userdata,
    )

    await session.start(
        agent=IntroAgent(),
        room=ctx.room,
    )
...
```

## Examples

<table>
<tr>
<td width="50%">
<h3>🎙️ Starter Agent</h3>
<p>A starter agent optimized for voice conversations.</p>
<p>
<a href="examples/voice_agents/basic_agent.py">Code</a>
</p>
</td>
<td width="50%">
<h3>🔄 Multi-user push to talk</h3>
<p>Responds to multiple users in the room via push-to-talk.</p>
<p>
<a href="examples/voice_agents/push_to_talk.py">Code</a>
</p>
</td>
</tr>
<tr>
<td width="50%">
<h3>🎵 Background audio</h3>
<p>Background ambient and thinking audio to improve realism.</p>
<p>
<a href="examples/voice_agents/background_audio.py">Code</a>
</p>
</td>
<td width="50%">
<h3>🛠️ Dynamic tool creation</h3>
<p>Creating function tools dynamically.</p>
<p>
<a href="examples/voice_agents/dynamic_tool_creation.py">Code</a>
</p>
</td>
</tr>
<tr>
<td width="50%">
<h3>☎️ Outbound caller</h3>
<p>Agent that makes outbound phone calls</p>
<p>
<a href="https://github.com/livekit-examples/outbound-caller-python">Code</a>
</p>
</td>
<td width="50%">
<h3>📋 Structured output</h3>
<p>Using structured output from LLM to guide TTS tone.</p>
<p>
<a href="examples/voice_agents/structured_output.py">Code</a>
</p>
</td>
</tr>
<tr>
<td width="50%">
<h3>🔌 MCP support</h3>
<p>Use tools from MCP servers</p>
<p>
<a href="examples/voice_agents/mcp">Code</a>
</p>
</td>
<td width="50%">
<h3>💬 Text-only agent</h3>
<p>Skip voice altogether and use the same code for text-only integrations</p>
<p>
<a href="examples/other/text_only.py">Code</a>
</p>
</td>
</tr>
<tr>
<td width="50%">
<h3>📝 Multi-user transcriber</h3>
<p>Produce transcriptions from all users in the room</p>
<p>
<a href="examples/other/transcription/multi-user-transcriber.py">Code</a>
</p>
</td>
<td width="50%">
<h3>🎥 Video avatars</h3>
<p>Add an AI avatar with Tavus, Beyond Presence, and Bithuman</p>
<p>
<a href="examples/avatar_agents/">Code</a>
</p>
</td>
</tr>
<tr>
<td width="50%">
<h3>🍽️ Restaurant ordering and reservations</h3>
<p>Full example of an agent that handles calls for a restaurant.</p>
<p>
<a href="examples/voice_agents/restaurant_agent.py">Code</a>
</p>
</td>
<td width="50%">
<h3>👁️ Gemini Live vision</h3>
<p>Full example (including iOS app) of Gemini Live agent that can see.</p>
<p>
<a href="https://github.com/livekit-examples/vision-demo">Code</a>
</p>
</td>
</tr>
</table>

## Running your agent

### Testing in terminal

```shell
python myagent.py console
```

Runs your agent in terminal mode, enabling local audio input and output for testing. This mode doesn't require external servers or dependencies and is useful for quickly validating behavior.

### Developing with LiveKit clients

```shell
python myagent.py dev
```

Starts the agent server and enables hot reloading when files change. This mode allows each process to host multiple concurrent agents efficiently.

The agent connects to LiveKit Cloud or your self-hosted server.
Set the following environment variables:

- LIVEKIT_URL
- LIVEKIT_API_KEY
- LIVEKIT_API_SECRET

You can connect using any LiveKit client SDK or telephony integration. To get started quickly, try the [Agents Playground](https://agents-playground.livekit.io/).

### Running for production

```shell
python myagent.py start
```

Runs the agent with production-ready optimizations.

## Contributing

The Agents framework is under active development in a rapidly evolving field. We welcome and appreciate contributions of any kind, be it feedback, bugfixes, features, new plugins and tools, or better documentation. You can file issues under this repo, open a PR, or chat with us in LiveKit's [Slack community](https://livekit.io/join-slack).

<br/>
<table>
<thead><tr><th colspan="2">LiveKit Ecosystem</th></tr></thead>
<tbody>
<tr><td>LiveKit SDKs</td><td><a href="https://github.com/livekit/client-sdk-js">Browser</a> · <a href="https://github.com/livekit/client-sdk-swift">iOS/macOS/visionOS</a> · <a href="https://github.com/livekit/client-sdk-android">Android</a> · <a href="https://github.com/livekit/client-sdk-flutter">Flutter</a> · <a href="https://github.com/livekit/client-sdk-react-native">React Native</a> · <a href="https://github.com/livekit/rust-sdks">Rust</a> · <a href="https://github.com/livekit/node-sdks">Node.js</a> · <a href="https://github.com/livekit/python-sdks">Python</a> · <a href="https://github.com/livekit/client-sdk-unity">Unity</a> · <a href="https://github.com/livekit/client-sdk-unity-web">Unity (WebGL)</a></td></tr><tr></tr>
<tr><td>Server APIs</td><td><a href="https://github.com/livekit/node-sdks">Node.js</a> · <a href="https://github.com/livekit/server-sdk-go">Golang</a> · <a href="https://github.com/livekit/server-sdk-ruby">Ruby</a> · <a href="https://github.com/livekit/server-sdk-kotlin">Java/Kotlin</a> · <a href="https://github.com/livekit/python-sdks">Python</a> · <a href="https://github.com/livekit/rust-sdks">Rust</a> · <a href="https://github.com/agence104/livekit-server-sdk-php">PHP (community)</a> · <a href="https://github.com/pabloFuente/livekit-server-sdk-dotnet">.NET (community)</a></td></tr><tr></tr>
<tr><td>UI Components</td><td><a href="https://github.com/livekit/components-js">React</a> · <a href="https://github.com/livekit/components-android">Android Compose</a> · <a href="https://github.com/livekit/components-swift">SwiftUI</a></td></tr><tr></tr>
<tr><td>Agents Frameworks</td><td><b>Python</b> · <a href="https://github.com/livekit/agents-js">Node.js</a> · <a href="https://github.com/livekit/agent-playground">Playground</a></td></tr><tr></tr>
<tr><td>Services</td><td><a href="https://github.com/livekit/livekit">LiveKit server</a> · <a href="https://github.com/livekit/egress">Egress</a> · <a href="https://github.com/livekit/ingress">Ingress</a> · <a href="https://github.com/livekit/sip">SIP</a></td></tr><tr></tr>
<tr><td>Resources</td><td><a href="https://docs.livekit.io">Docs</a> · <a href="https://github.com/livekit-examples">Example apps</a> · <a href="https://livekit.io/cloud">Cloud</a> · <a href="https://docs.livekit.io/home/self-hosting/deployment">Self-hosting</a> · <a href="https://github.com/livekit/livekit-cli">CLI</a></td></tr>
</tbody>
</table>

", Assign "at most 3 tags" to the expected json: {"id":"10193","tags":[]} "only from the tags list I provide: 
[{"id":77,"name":"3d"},{"id":89,"name":"agent"},{"id":17,"name":"ai"},{"id":54,"name":"algorithm"},{"id":24,"name":"api"},{"id":44,"name":"authentication"},{"id":3,"name":"aws"},{"id":27,"name":"backend"},{"id":60,"name":"benchmark"},{"id":72,"name":"best-practices"},{"id":39,"name":"bitcoin"},{"id":37,"name":"blockchain"},{"id":1,"name":"blog"},{"id":45,"name":"bundler"},{"id":58,"name":"cache"},{"id":21,"name":"chat"},{"id":49,"name":"cicd"},{"id":4,"name":"cli"},{"id":64,"name":"cloud-native"},{"id":48,"name":"cms"},{"id":61,"name":"compiler"},{"id":68,"name":"containerization"},{"id":92,"name":"crm"},{"id":34,"name":"data"},{"id":47,"name":"database"},{"id":8,"name":"declarative-gui"},{"id":9,"name":"deploy-tool"},{"id":53,"name":"desktop-app"},{"id":6,"name":"dev-exp-lib"},{"id":59,"name":"dev-tool"},{"id":13,"name":"ecommerce"},{"id":26,"name":"editor"},{"id":66,"name":"emulator"},{"id":62,"name":"filesystem"},{"id":80,"name":"finance"},{"id":15,"name":"firmware"},{"id":73,"name":"for-fun"},{"id":2,"name":"framework"},{"id":11,"name":"frontend"},{"id":22,"name":"game"},{"id":81,"name":"game-engine"},{"id":23,"name":"graphql"},{"id":84,"name":"gui"},{"id":91,"name":"http"},{"id":5,"name":"http-client"},{"id":51,"name":"iac"},{"id":30,"name":"ide"},{"id":78,"name":"iot"},{"id":40,"name":"json"},{"id":83,"name":"julian"},{"id":38,"name":"k8s"},{"id":31,"name":"language"},{"id":10,"name":"learning-resource"},{"id":33,"name":"lib"},{"id":41,"name":"linter"},{"id":28,"name":"lms"},{"id":16,"name":"logging"},{"id":76,"name":"low-code"},{"id":90,"name":"message-queue"},{"id":42,"name":"mobile-app"},{"id":18,"name":"monitoring"},{"id":36,"name":"networking"},{"id":7,"name":"node-version"},{"id":55,"name":"nosql"},{"id":57,"name":"observability"},{"id":46,"name":"orm"},{"id":52,"name":"os"},{"id":14,"name":"parser"},{"id":74,"name":"react"},{"id":82,"name":"real-time"},{"id":56,"name":"robot"},{"id":65,"name":"runtime"},{"id":32,"name":"sdk"},{"id":71,"name":"search"},{"id":63,"name":"secrets"},{"id":25,"name":"security"},{"id":85,"name":"server"},{"id":86,"name":"serverless"},{"id":70,"name":"storage"},{"id":75,"name":"system-design"},{"id":79,"name":"terminal"},{"id":29,"name":"testing"},{"id":12,"name":"ui"},{"id":50,"name":"ux"},{"id":88,"name":"video"},{"id":20,"name":"web-app"},{"id":35,"name":"web-server"},{"id":43,"name":"webassembly"},{"id":69,"name":"workflow"},{"id":87,"name":"yaml"}]" returns me the "expected json"
