User-facing interaction contracts—streaming, approvals, and UI affordances for agentic apps (PDF 249–252).
● Secure collaboration
● Task and state management
● Capability discovery
● Agents from different frameworks working together (LlamaIndex, CrewAI, etc.)
Additionally, it can integrate with MCP. Standardizing Agent-to-Agent collaboration is valuable in the same way that MCP standardizes Agent-to-Tool interaction.
Agent-User Interaction Protocol (AG-UI)
In the realm of Agents:
● MCP standardized Agent-to-Tool communication.
● Agent2Agent protocol standardized Agent-to-Agent communication.
But there’s one piece still missing… And that’s a protocol for Agent-to-User communication:
Let’s understand why this is important.
The problem
Today, you can build powerful multi-step agentic workflows using a toolkit like LangGraph, CrewAI, Mastra, etc.
But the moment you try to bring that Agent into a real-world app, things fall apart:
● You want to stream LLM responses token by token, without building a custom WebSocket server.
● You want to display tool execution progress as it happens and pause for human feedback, without blocking or losing context.
● You want to sync large, changing objects (like code or tables) without re-sending everything to the UI.
● You want to let users interrupt, cancel, or reply mid-agent run, without losing context.
And here’s another issue: every Agent backend has its own mechanisms for tool calling, ReAct-style planning, state diffs, and output formats. So if you use LangGraph, the frontend ends up implementing custom WebSocket logic, messy JSON formats, and UI adapters specific to LangGraph. If you then migrate to CrewAI, all of that has to be adapted again. This doesn’t scale.
The solution: AG-UI
AG-UI (Agent-User Interaction Protocol) is an open-source protocol by CopilotKit that solves this. It standardizes the interaction layer between backend agents and frontend UIs (the green layer below).
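To make two of the needs listed above concrete (token-by-token streaming and syncing a changing object without re-sending all of it), here is a minimal Python sketch. It is illustrative only: the event type names `text_delta` and `state_delta`, the `set`/`remove` delta shape, and every function name are assumptions made for this example, not AG-UI's actual event schema.

```python
from typing import Any, Iterable, Iterator

_MISSING = object()  # sentinel so a value of None still counts as a change


def diff_state(old: dict[str, Any], new: dict[str, Any]) -> dict[str, Any]:
    """Return only the top-level keys that changed or were removed."""
    changed = {k: v for k, v in new.items() if old.get(k, _MISSING) != v}
    removed = [k for k in old if k not in new]
    return {"set": changed, "remove": removed}


def apply_delta(state: dict[str, Any], delta: dict[str, Any]) -> dict[str, Any]:
    """Apply a delta produced by diff_state without mutating the input."""
    merged = {k: v for k, v in state.items() if k not in delta["remove"]}
    merged.update(delta["set"])
    return merged


def agent_events(
    tokens: Iterable[str], old: dict[str, Any], new: dict[str, Any]
) -> Iterator[dict[str, Any]]:
    """Hypothetical backend: emit one event per token, then one state delta."""
    for token in tokens:
        yield {"type": "text_delta", "delta": token}
    yield {"type": "state_delta", "delta": diff_state(old, new)}


def render(
    events: Iterable[dict[str, Any]], state: dict[str, Any]
) -> tuple[str, dict[str, Any]]:
    """Hypothetical frontend reducer: fold events into text and UI state."""
    text = ""
    for event in events:
        if event["type"] == "text_delta":
            text += event["delta"]
        elif event["type"] == "state_delta":
            state = apply_delta(state, event["delta"])
    return text, state
```

Because the frontend only depends on the event shapes, the same `render` function would work with any backend that emits them, which is the portability argument made above. Only the changed keys cross the wire, so a large unchanged field is never re-sent.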
Think of it this way:
● Just like REST is the standard for client-to-server requests…
● AG-UI is the standard for streaming real-time agent updates back to the UI.
Technically speaking…
It uses Server-Sent Events (SSE) to stream structured JSON events to the frontend. Each event carries an explicit payload (much like the keys of a Python dictionary), for example: