MCP — Model Context Protocol

What is MCP?

Model Context Protocol (MCP) is an open standard that standardizes how AI applications (clients) connect to external data sources and tools (servers). Instead of every agent framework inventing its own plugin format, MCP defines JSON-RPC messages, capability discovery, and transport layers so a GitHub server written once can plug into Claude Desktop, Cursor, or your custom host.

Before MCP, tool integration looked like this: copy-paste OpenAI function schemas into your repo, wrap each SaaS API in Python, redeploy when Slack changes scopes, and hope the next model version still calls tools correctly. MCP inverts ownership: the server team publishes a versioned capability surface; the host discovers and allowlists tools at runtime.

MCP vs ad-hoc tool wrappers

Concern	Ad-hoc tools in repo	MCP server
Discovery	Hard-coded list in agent code	tools/list at connect time
Schema drift	Agent redeploy on every API change	Server version bump; host picks compatible semver
Ownership	Platform team owns all integrations	Domain teams ship their own servers
Transport	In-process function calls	stdio (local) or HTTP/SSE (remote)
Auth	Secrets in agent env	OAuth on transport; scoped tokens per server

MCP is not a replacement for your agent loop—it is the wire format between host and capability providers. Your orchestration (LangGraph, Spring AI, custom) still decides when to call tools, how many steps, and what guardrails apply.

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def discover_tools(server_cmd: list[str]) -> list[dict]:
    params = StdioServerParameters(command=server_cmd[0], args=server_cmd[1:])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.list_tools()
            return [{"name": t.name, "description": t.description,
                     "input_schema": t.inputSchema} for t in result.tools]

@Service
public class McpToolDiscovery {
    public List discover(McpServerConfig cfg) {
        try (McpSyncClient client = mcpClientFactory.connectStdio(cfg)) {
            client.initialize();
            return client.listTools().tools().stream()
                .map(t -> new McpToolDescriptor(t.name(), t.description(), t.inputSchema()))
                .toList();
        }
    }
}

🔬 Under the Hood

MCP messages follow JSON-RPC 2.0. Core methods include initialize, tools/list, tools/call, resources/list, resources/read, and prompts/list. The spec evolves on modelcontextprotocol.io; pin SDK versions in production hosts.

⚖️ Trade-off

MCP adds process boundary overhead (stdio spawn or HTTP round trip). For sub-millisecond in-memory calls, keep hot-path tools in-process and expose only cross-team or high-churn integrations via MCP.

📦 Real World

Cursor, Claude Desktop, and VS Code Copilot extensions use MCP for filesystem, GitHub, and database servers. Enterprise teams mirror the pattern: internal MCP registry with signed server bundles per business unit.

🎯 Interview Tip

When asked about extensible agent platforms, mention discovery + allowlisting + transport auth—not just "we expose REST APIs." MCP is the standard answer for pluggable tool ecosystems in 2025–2026.

Architecture — Host, Client, Server, transport

An MCP deployment has three roles: the Host (your AI application), one or more Clients (connectors inside the host), and Servers (processes exposing tools, resources, and prompts). Transport choice—stdio vs SSE—determines latency, security boundary, and ops model.

End-to-end request flow

User sends message to host application.
Host loads MCP manifest; ensures server processes are running (stdio pool or SSE connections).
Host calls tools/list (cached with TTL) and builds merged tool registry.
LLM receives user message + allowlisted tools; may emit tool call.
Host validates tool name and arguments; forwards tools/call to correct client.
Server executes with its credentials; returns content blocks.
Host normalizes result into model API format; continues loop until final answer.

Debugging MCP issues usually means tracing which hop failed: manifest misconfig, allowlist drop, schema mapping bug, server exception, or model choosing wrong tool. Log all six stages with correlation IDs.

Debug signal	Likely cause
Tool not visible to model	Allowlist filter or mapping error
initialize timeout	Server crash on boot; check stderr
Invalid arguments	Schema mismatch between bridge and server
Intermittent 401 on SSE	Expired OAuth token; refresh logic missing

Role diagram

Role	Examples	Responsibilities
Host	Cursor, Claude Desktop, your FastAPI agent service	UI, model calls, policy, aggregates MCP clients
Client	SDK session inside host	One server connection; routes JSON-RPC
Server	GitHub MCP, Postgres MCP, internal ticket MCP	Implements tools/resources; holds domain credentials

stdio vs SSE transport

Transport	Mechanism	Best for	Ops notes
stdio	Host spawns subprocess; stdin/stdout JSON-RPC	Local dev, sidecar on same node, CLI tools	Process per server; restart on crash; no network exposure
SSE (HTTP)	Server-Sent Events + POST for client→server	Remote shared services, multi-tenant registry	TLS, OAuth, rate limits, horizontal scale
Streamable HTTP	Unified HTTP streaming (newer spec)	Cloud-native MCP gateways	Check SDK support before adopting

A single host often runs multiple clients—one per configured server. The agent sees a flattened tool list (with namespaced prefixes like github__create_issue) after your registry merges and allowlists.

# mcp_manifest.yaml
servers:
  - name: acme-support
    transport: stdio
    command: python
    args: ["servers/acme_support_mcp.py"]
    allowed_tools: [lookup_order, search_policy]
  - name: github
    transport: sse
    url: https://mcp.internal.example/github/sse
    allowed_tools: [search_code, get_file]
    oauth:
      provider: github
      scopes: [repo:read]

@ConfigurationProperties(prefix = "mcp")
public record McpManifest(List servers) {
    public record ServerEntry(
        String name, String transport, String command,
        List args, String url, List allowedTools) {}
}

💡 Pro Tip

Namespace tool names when merging servers: {server}__{tool} prevents collisions and makes audit logs readable.

⚠️ Pitfall

Spawning ten stdio servers per request exhausts file descriptors. Pool long-lived server processes and reuse sessions across agent turns within a conversation.

🔒 Security

stdio servers inherit the host OS user. Run them as dedicated service accounts with minimal filesystem and network access—never as root.

Primitives — tools, resources, prompts, sampling

MCP exposes four capability families. Tools are model-invokable functions with JSON schemas. Resources are readable data URIs (files, DB rows, config). Prompts are reusable template slots the host can fetch. Sampling lets a server ask the host's LLM to complete text—a reverse call path for nested agents.

Sampling (server → host LLM)

Sampling is the least-used primitive but important for advanced servers: a tool can request the host to run a completion (e.g., summarize a large file server-side without shipping API keys to the server process). The host policy layer must approve sampling requests—rate limit and cap tokens per server identity.

🔒 Security

Sampling inverts control flow: untrusted servers could burn token budgets. Disable sampling by default; enable per trusted server with daily quotas.

Capability comparison

Primitive	Who initiates	Typical use	Maps to OpenAI
Tools	Model (via host)	API calls, DB queries, ticket updates	tools / function calling
Resources	Host (often pre-model)	Inject file contents, schema snapshots	System message / file attachments
Prompts	Host or user slash command	Standard review templates, runbooks	Stored prompt partials
Sampling	Server → Host LLM	Server-side summarization without owning API keys	N/A (server uses host model)

Tool lifecycle

Host connects and calls initialize with protocol version.
Host calls tools/list; server returns name, description, input JSON Schema.
Host maps schemas to model-native tool format (OpenAI, Anthropic, Gemini).
Model emits tool call; host forwards tools/call with arguments.
Server returns structured content (text, image, resource refs); host feeds result to model.

Resources vs tools

Use resources when data is large, static-ish, or should not be invoked blindly by the model—think "attach the OpenAPI spec" rather than "call search_openapi." The host reads resources and decides what enters context. Tools are for actions with side effects: create ticket, run query, post message.

def mcp_tool_to_openai(mcp_tool: dict, prefix: str = "") -> dict:
    name = f"{prefix}{mcp_tool['name']}" if prefix else mcp_tool["name"]
    return {
        "type": "function",
        "function": {
            "name": name[:64],
            "description": mcp_tool.get("description", "")[:1024],
            "parameters": mcp_tool.get("input_schema", {"type": "object", "properties": {}}),
        },
    }

def merge_allowlisted(tools: list[dict], allowed: set[str], prefix: str) -> list[dict]:
    return [mcp_tool_to_openai(t, prefix=f"{prefix}__")
            for t in tools if t["name"] in allowed]

public OpenAiToolDefinition toOpenAi(McpToolDescriptor tool, String prefix) {
    String name = prefix.isEmpty() ? tool.name() : prefix + "__" + tool.name();
    return OpenAiToolDefinition.builder()
        .name(name.substring(0, Math.min(64, name.length())))
        .description(truncate(tool.description(), 1024))
        .parameters(tool.inputSchema())
        .build();
}

💰 Cost

Every tool call is still a model round trip plus server execution. Resources preloaded into context consume tokens—prefer resource subscriptions with ETags and read only changed URIs.

🔬 Under the Hood

Tool results can include multiple content blocks: text, images (base64), and embedded resource links. Hosts must normalize these into what the active model API accepts—Anthropic tool_result blocks differ from OpenAI tool role messages.

⚠️ Pitfall

Exposing 200 discovered tools to the model degrades routing accuracy. Allowlist 5–15 tools per agent persona; rotate sets by workflow stage.

Python server — FastMCP

The official Python SDK ships FastMCP, a decorator-driven server builder. Annotate functions with @mcp.tool() and @mcp.resource(); run with stdio for local sidecars or configure SSE for remote deployment.

Project layout

File	Purpose
servers/support_mcp.py	FastMCP tool + resource definitions
host/manifest.yaml	Which servers to spawn + allowlists
host/schema_bridge.py	MCP → OpenAI / Anthropic mapping
host/agent_loop.py	Existing tool loop; swap local tools for MCP calls

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("acme-support")

POLICIES = {"refund": "Annual plans: pro-rata credit within 30 days."}

@mcp.resource("policy://{topic}")
def get_policy(topic: str) -> str:
    """Return policy text for a topic slug."""
    return POLICIES.get(topic.lower(), "Policy not found.")

@mcp.tool()
def lookup_order(order_id: str) -> dict:
    """Fetch order status. IDs must start with ORD-."""
    if not order_id.upper().startswith("ORD-"):
        return {"error": "invalid_order_id"}
    return {"order_id": order_id.upper(), "status": "shipped", "eta_days": 2}

@mcp.tool()
def search_policy(query: str, limit: int = 5) -> list[dict]:
    """Keyword search over policy snippets."""
    hits = [k for k in POLICIES if query.lower() in k.lower()]
    return [{"topic": h, "text": POLICIES[h]} for h in hits[:limit]]

if __name__ == "__main__":
    mcp.run()  # stdio transport by default

// Python FastMCP — use Java Spring AI section for JVM servers

async def run_tool(session: ClientSession, name: str, args: dict) -> str:
    result = await session.call_tool(name, arguments=args)
    parts = []
    for block in result.content:
        if block.type == "text":
            parts.append(block.text)
    return "\n".join(parts)

async def agent_turn(session, messages: list, openai_tools: list):
    response = client.chat.completions.create(
        model="gpt-4o", messages=messages, tools=openai_tools)
    msg = response.choices[0].message
    if not msg.tool_calls:
        return msg.content
    for tc in msg.tool_calls:
        out = await run_tool(session, tc.function.name.split("__")[-1],
                             json.loads(tc.function.arguments))
        messages.append({"role": "tool", "tool_call_id": tc.id, "content": out})
    return await agent_turn(session, messages, openai_tools)

public String agentTurn(McpSyncClient client, List messages) {
    ChatResponse response = chatClient.prompt().messages(messages).tools(openAiTools).call();
    if (!response.hasToolCalls()) return response.getResult().getOutput().getContent();
    for (ToolCall call : response.getToolCalls()) {
        String result = client.callTool(call.name(), call.arguments());
        messages.add(ToolResponseMessage.from(call.id(), result));
    }
    return agentTurn(client, messages);
}

💡 Pro Tip

Docstrings on FastMCP functions become tool descriptions seen by the model—write them like API docs: constraints, enums, error shapes.

📦 Real World

Pin mcp package version in both host and server repos. Protocol mismatches surface as silent initialize failures—add integration tests that list tools in CI.

🎯 Interview Tip

Demonstrate you know the boundary: FastMCP server owns business logic; host owns model choice, allowlists, and logging.

Java server — Spring AI @McpTool

Spring AI 1.0+ integrates MCP servers into Spring Boot applications. Annotate service methods with @McpTool and expose them over stdio or WebMvc SSE endpoints—ideal when your enterprise stack is already JVM-centric.

Spring AI MCP auto-generates JSON Schema from method signatures and JavaDoc. Controllers stay thin; domain services implement the actual ticket lookup or policy search. Deploy as a standalone jar sidecar or as a microservice behind your MCP gateway.

Dependencies (Maven)

Artifact	Role
spring-ai-starter-mcp-server	MCP server autoconfiguration
spring-ai-starter-mcp-server-webmvc	SSE/HTTP transport
spring-boot-starter-validation	Argument validation on tools

@Service
public class SupportMcpTools {
    private final OrderService orders;
    private final PolicyService policies;

    @McpTool(name = "lookup_order",
             description = "Fetch order status. orderId must start with ORD-")
    public OrderStatus lookupOrder(
            @McpToolParam(description = "Order id like ORD-1042") String orderId) {
        if (!orderId.toUpperCase().startsWith("ORD-")) {
            throw new IllegalArgumentException("invalid_order_id");
        }
        return orders.getStatus(orderId.toUpperCase());
    }

    @McpTool(name = "search_policy",
             description = "Search refund and shipping policy snippets")
    public List searchPolicy(
            @McpToolParam(description = "Search query") String query,
            @McpToolParam(description = "Max results", required = false) Integer limit) {
        return policies.search(query, limit != null ? limit : 5);
    }
}

@SpringBootApplication
public class AcmeMcpServerApplication {
    public static void main(String[] args) {
        SpringApplication.run(AcmeMcpServerApplication.class, args);
    }
}

// application.yml
// spring.ai.mcp.server.name: acme-support
// spring.ai.mcp.server.stdio.enabled: true

@Service
public class McpAgentBridge {
    private final ChatClient chatClient;
    private final List mcpClients;

    public String run(String userMessage) {
        List tools = mcpClients.stream()
            .flatMap(c -> c.listTools().tools().stream())
            .filter(t -> allowlist.contains(t.name()))
            .map(this::toToolCallback)
            .toList();

        return chatClient.prompt()
            .user(userMessage)
            .toolCallbacks(tools)
            .call()
            .content();
    }
}

@Component
public class McpToolCallback implements ToolCallback {
    private final McpSyncClient client;
    private final McpToolDescriptor descriptor;

    @Override
    public ToolDefinition getToolDefinition() { return map(descriptor); }

    @Override
    public String call(String argumentsJson) {
        return client.callTool(descriptor.name(), parseJson(argumentsJson));
    }
}

⚖️ Trade-off

JVM cold start (2–8 s) hurts stdio spawn-per-request patterns. Keep MCP server processes warm or use SSE deployment on Kubernetes with min replicas ≥ 1.

🔬 Under the Hood

Spring AI maps @McpToolParam to JSON Schema properties with required flags. Nullable types and validation annotations propagate to the schema the model sees.

💡 Pro Tip

Share DTO records between REST controllers and MCP tools so OpenAPI and MCP schemas stay aligned via one source of truth.

Security — poisoning, injection, OAuth, least privilege

MCP moves trust boundaries: servers run with credentials, models choose tools, and tool descriptions are prompt surface. Production deployments need allowlists, description auditing, OAuth-scoped transports, and least-privilege server accounts.

Threat model

Threat	Mechanism	Mitigation
Tool poisoning	Malicious server embeds hidden instructions in tool descriptions	Sign servers; review descriptions; strip HTML; human approve new tools
Resource injection	Resource URI returns attacker-controlled text into context	Host validates URI schemes; sandbox reads; max byte limits
Over-privileged tools	Model calls delete_database when read was enough	Separate read/write servers; per-persona allowlists
Token theft	Compromised host leaks OAuth refresh tokens	Short-lived tokens; vault sidecars; no tokens in prompts
Prompt injection via tool output	Ticket body contains "ignore previous instructions"	Output sanitization; structured JSON; secondary policy model

OAuth for remote MCP (SSE)

User authorizes host app via standard OAuth (PKCE for public clients).
Host stores refresh token in vault; never passes to model.
MCP client attaches access token to SSE connection per server registration.
Server validates token scopes before executing tools.
Rotate credentials; audit tool calls with user + token subject id.

OAuth scope	Tools enabled
kb:read	search_policy, fetch_chunk
orders:read	lookup_order only
orders:write	create_refund (human approval gate)

Map OAuth scopes to MCP tool allowlists in the host manifest—never expose write tools when the token is read-only.

Refresh tokens on a background schedule before expiry so SSE sessions do not fail mid-conversation.

TRUSTED_SERVERS = {"acme-support", "github-readonly"}

def audit_tool_description(name: str, desc: str) -> bool:
    blocked = ["ignore previous", "system prompt", "override", "secret"]
    lower = desc.lower()
    if any(b in lower for b in blocked):
        log_security_event("tool_description_flag", tool=name)
        return False
    if len(desc) > 2000:
        return False  # excessively long descriptions are suspicious
    return True

def filter_tools(server_name: str, tools: list, allowed: set) -> list:
    if server_name not in TRUSTED_SERVERS:
        raise SecurityError(f"untrusted server: {server_name}")
    return [t for t in tools if t["name"] in allowed and audit_tool_description(t["name"], t.get("description", ""))]

public List filterTools(String serverName, List tools) {
    if (!trustedServers.contains(serverName)) {
        throw new SecurityException("untrusted server: " + serverName);
    }
    return tools.stream()
        .filter(t -> allowlist.contains(t.name()))
        .filter(t -> descriptionAuditor.isSafe(t.description()))
        .toList();
}

🔒 Security

Treat MCP tool descriptions like user input—they enter the model context. Scan for injection patterns in CI when servers update.

⚠️ Pitfall

Connecting to community MCP servers without code review is equivalent to installing unaudited browser extensions with DB credentials.

🎯 Interview Tip

For security questions: least privilege allowlists, signed server artifacts, OAuth on SSE, output sanitization, and logging every tool call with arguments redacted.

📦 Real World

Enterprise MCP registries require SBOM + security review before a server enters the catalog. Dev servers run in isolated namespaces with synthetic data only.

What this looks like in production

Production MCP is a registry + gateway + observability problem—not just a Python script. Remote SSE servers, semver contracts, staged rollouts, and unified tool audit logs let platform teams scale agents without N× custom integrations.

Production architecture

Component	Function
Registry	Catalog of approved servers, versions, owners, allowed scopes
Gateway	TLS termination, OAuth, rate limits, routing to server pools
Host runtime	Agent service; merges tools; enforces per-tenant allowlists
Observability	Tool latency, error rates, argument hashes (redacted), server version

Versioning strategy

Servers publish semver; breaking schema changes bump major.
Hosts pin server_version in manifest; auto-upgrade only patch.
Contract tests: golden tools/list snapshots in CI.
Blue/green server deploys; hosts drain old connections gracefully.

Track 4 production checklist

Signed server bundles in internal registry
Per-agent allowlist ≤ 15 tools
OAuth or mTLS on all remote transports
Tool description audit in CI
Structured logs: server, tool, latency, user, tenant
Fallback when server unavailable (degraded mode message)
Load test: concurrent SSE sessions at peak QPS

@dataclass
class ProductionMcpConfig:
    registry_url: str
    server_pins: dict[str, str]  # name -> semver
    allowed_tools: dict[str, set[str]]
    oauth_vault_path: str
    max_tool_calls_per_turn: int = 8

async def load_production_servers(cfg: ProductionMcpConfig) -> list:
    catalog = await fetch_registry(cfg.registry_url)
    sessions = []
    for entry in catalog:
        if entry.version != cfg.server_pins.get(entry.name):
            raise ConfigError(f"version pin mismatch: {entry.name}")
        session = await connect_sse(entry.url, vault_token(cfg.oauth_vault_path, entry.name))
        tools = filter_tools(entry.name, await session.list_tools(), cfg.allowed_tools[entry.name])
        sessions.append((session, tools))
    return sessions

@Scheduled(fixedRate = 60_000)
public void healthCheckMcpServers() {
    for (McpServerRegistration reg : registry.active()) {
        HealthStatus status = gateway.ping(reg);
        metrics.counter("mcp.server.health", "name", reg.name()).increment(status.isUp() ? 1 : 0);
        if (!status.isUp()) notificationService.alert(reg);
    }
}

💰 Cost

Remote MCP adds network hop (10–80 ms) per tool call. Batch read-only operations in single tools where possible; avoid chatty micro-tool designs.

📦 Real World

Platform teams expose one internal MCP gateway URL; product teams register servers via GitOps PR to the registry repo—mirroring internal API catalog patterns.

Latency budget per MCP tool call

Transport	Connect	tools/list	tools/call p95
stdio (warm process)	0 ms (pooled)	5–20 ms	20–200 ms
SSE remote	30–100 ms	40–120 ms	80–500 ms

Budget MCP like external APIs: set per-tool timeouts, circuit-break after N failures, and return structured {"error":"server_unavailable"} to the model instead of hanging the agent loop.

Mapping MCP to Anthropic and Bedrock

OpenAI uses tools with function schemas. Anthropic uses tool_use blocks with input objects. Bedrock Converse API unifies tool config across model vendors. Your schema bridge should emit provider-specific payloads from one canonical MCP descriptor list—do not fork business logic per provider.

def to_provider_tools(mcp_tools: list[dict], provider: str, prefix: str) -> list[dict]:
    if provider == "openai":
        return [mcp_tool_to_openai(t, prefix) for t in mcp_tools]
    if provider == "anthropic":
        return [{"name": f"{prefix}__{t['name']}", "description": t["description"],
                 "input_schema": t["input_schema"]} for t in mcp_tools]
    raise ValueError(f"unsupported provider: {provider}")

public List<ToolDefinition> toProviderTools(
        List<McpToolDescriptor> tools, Provider provider, String prefix) {
    return tools.stream().map(t -> switch (provider) {
        case OPENAI -> openAiMapper.map(t, prefix);
        case ANTHROPIC -> anthropicMapper.map(t, prefix);
        case BEDROCK -> bedrockMapper.map(t, prefix);
    }).toList();
}

Related Track 4 guides

Agents explained — tool loops and when agents beat chains
Tool calling — function schemas before MCP wiring
Multi-agent orchestration — supervisors calling MCP specialists
Agentic RAG — retrieval tools exposed via MCP servers

🎯 Interview Tip

Contrast MCP with OpenAPI: OpenAPI describes HTTP APIs for humans and codegen; MCP describes model-invokable capabilities with discovery and bidirectional sampling. Both can coexist—MCP server wraps internal REST.

Local development checklist

Run mcp_inspect.py to list tools without invoking the LLM
Snapshot golden tools/list JSON in unit tests
Test allowlist drops: ensure removed tools never appear in OpenAI payload
Simulate server crash mid-session; host should reconnect or degrade gracefully
Record one full trace (initialize → list → call) for onboarding docs

Spec resources

The MCP specification lives at modelcontextprotocol.io. SDKs for Python and TypeScript track spec versions—when Anthropic or the steering committee publishes transport updates, upgrade host and server SDKs in lockstep. Internal wrappers should not reimplement JSON-RPC; use official clients to avoid subtle protocol bugs.