Lang
API

MCP — Model Context Protocol

Tools used to live entirely inside your agent repository—every Slack, GitHub, and internal API wired by hand. Model Context Protocol (MCP) is the open standard that standardizes how a host (your agent runtime) talks to a server (a process exposing tools, resources, and prompts over JSON-RPC). This guide covers architecture, primitives, FastMCP and Spring AI servers, security threats, and production registry patterns.

After reading, you should be able to: explain MCP host/client/server roles; choose stdio vs SSE transport; implement tools and resources in FastMCP and Spring AI; map MCP schemas to OpenAI function calling; mitigate tool poisoning and resource injection; and operate remote MCP servers with versioning and allowlists.

developer platform architect Track 4 MCP FastMCP Spring AI stdio + SSE OAuth

What is MCP?

Model Context Protocol (MCP) is an open standard that standardizes how AI applications (clients) connect to external data sources and tools (servers). Instead of every agent framework inventing its own plugin format, MCP defines JSON-RPC messages, capability discovery, and transport layers so a GitHub server written once can plug into Claude Desktop, Cursor, or your custom host.

Before MCP, tool integration looked like this: copy-paste OpenAI function schemas into your repo, wrap each SaaS API in Python, redeploy when Slack changes scopes, and hope the next model version still calls tools correctly. MCP inverts ownership: the server team publishes a versioned capability surface; the host discovers and allowlists tools at runtime.

MCP vs ad-hoc tool wrappers

ConcernAd-hoc tools in repoMCP server
DiscoveryHard-coded list in agent codetools/list at connect time
Schema driftAgent redeploy on every API changeServer version bump; host picks compatible semver
OwnershipPlatform team owns all integrationsDomain teams ship their own servers
TransportIn-process function callsstdio (local) or HTTP/SSE (remote)
AuthSecrets in agent envOAuth on transport; scoped tokens per server

MCP is not a replacement for your agent loop—it is the wire format between host and capability providers. Your orchestration (LangGraph, Spring AI, custom) still decides when to call tools, how many steps, and what guardrails apply.

MCP host discovers tools at startup
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def discover_tools(server_cmd: list[str]) -> list[dict]:
    params = StdioServerParameters(command=server_cmd[0], args=server_cmd[1:])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.list_tools()
            return [{"name": t.name, "description": t.description,
                     "input_schema": t.inputSchema} for t in result.tools]
@Service
public class McpToolDiscovery {
    public List discover(McpServerConfig cfg) {
        try (McpSyncClient client = mcpClientFactory.connectStdio(cfg)) {
            client.initialize();
            return client.listTools().tools().stream()
                .map(t -> new McpToolDescriptor(t.name(), t.description(), t.inputSchema()))
                .toList();
        }
    }
}
🔬 Under the Hood

MCP messages follow JSON-RPC 2.0. Core methods include initialize, tools/list, tools/call, resources/list, resources/read, and prompts/list. The spec evolves on modelcontextprotocol.io; pin SDK versions in production hosts.

⚖️ Trade-off

MCP adds process boundary overhead (stdio spawn or HTTP round trip). For sub-millisecond in-memory calls, keep hot-path tools in-process and expose only cross-team or high-churn integrations via MCP.

📦 Real World

Cursor, Claude Desktop, and VS Code Copilot extensions use MCP for filesystem, GitHub, and database servers. Enterprise teams mirror the pattern: internal MCP registry with signed server bundles per business unit.

🎯 Interview Tip

When asked about extensible agent platforms, mention discovery + allowlisting + transport auth—not just "we expose REST APIs." MCP is the standard answer for pluggable tool ecosystems in 2025–2026.

Architecture — Host, Client, Server, transport

An MCP deployment has three roles: the Host (your AI application), one or more Clients (connectors inside the host), and Servers (processes exposing tools, resources, and prompts). Transport choice—stdio vs SSE—determines latency, security boundary, and ops model.

End-to-end request flow

  1. User sends message to host application.
  2. Host loads MCP manifest; ensures server processes are running (stdio pool or SSE connections).
  3. Host calls tools/list (cached with TTL) and builds merged tool registry.
  4. LLM receives user message + allowlisted tools; may emit tool call.
  5. Host validates tool name and arguments; forwards tools/call to correct client.
  6. Server executes with its credentials; returns content blocks.
  7. Host normalizes result into model API format; continues loop until final answer.

Debugging MCP issues usually means tracing which hop failed: manifest misconfig, allowlist drop, schema mapping bug, server exception, or model choosing wrong tool. Log all six stages with correlation IDs.

Debug signalLikely cause
Tool not visible to modelAllowlist filter or mapping error
initialize timeoutServer crash on boot; check stderr
Invalid argumentsSchema mismatch between bridge and server
Intermittent 401 on SSEExpired OAuth token; refresh logic missing

Role diagram

RoleExamplesResponsibilities
HostCursor, Claude Desktop, your FastAPI agent serviceUI, model calls, policy, aggregates MCP clients
ClientSDK session inside hostOne server connection; routes JSON-RPC
ServerGitHub MCP, Postgres MCP, internal ticket MCPImplements tools/resources; holds domain credentials

stdio vs SSE transport

TransportMechanismBest forOps notes
stdioHost spawns subprocess; stdin/stdout JSON-RPCLocal dev, sidecar on same node, CLI toolsProcess per server; restart on crash; no network exposure
SSE (HTTP)Server-Sent Events + POST for client→serverRemote shared services, multi-tenant registryTLS, OAuth, rate limits, horizontal scale
Streamable HTTPUnified HTTP streaming (newer spec)Cloud-native MCP gatewaysCheck SDK support before adopting

A single host often runs multiple clients—one per configured server. The agent sees a flattened tool list (with namespaced prefixes like github__create_issue) after your registry merges and allowlists.

Host manifest — multiple MCP servers
# mcp_manifest.yaml
servers:
  - name: acme-support
    transport: stdio
    command: python
    args: ["servers/acme_support_mcp.py"]
    allowed_tools: [lookup_order, search_policy]
  - name: github
    transport: sse
    url: https://mcp.internal.example/github/sse
    allowed_tools: [search_code, get_file]
    oauth:
      provider: github
      scopes: [repo:read]
@ConfigurationProperties(prefix = "mcp")
public record McpManifest(List servers) {
    public record ServerEntry(
        String name, String transport, String command,
        List args, String url, List allowedTools) {}
}
💡 Pro Tip

Namespace tool names when merging servers: {server}__{tool} prevents collisions and makes audit logs readable.

⚠️ Pitfall

Spawning ten stdio servers per request exhausts file descriptors. Pool long-lived server processes and reuse sessions across agent turns within a conversation.

🔒 Security

stdio servers inherit the host OS user. Run them as dedicated service accounts with minimal filesystem and network access—never as root.

Primitives — tools, resources, prompts, sampling

MCP exposes four capability families. Tools are model-invokable functions with JSON schemas. Resources are readable data URIs (files, DB rows, config). Prompts are reusable template slots the host can fetch. Sampling lets a server ask the host's LLM to complete text—a reverse call path for nested agents.

Sampling (server → host LLM)

Sampling is the least-used primitive but important for advanced servers: a tool can request the host to run a completion (e.g., summarize a large file server-side without shipping API keys to the server process). The host policy layer must approve sampling requests—rate limit and cap tokens per server identity.

🔒 Security

Sampling inverts control flow: untrusted servers could burn token budgets. Disable sampling by default; enable per trusted server with daily quotas.

Capability comparison

PrimitiveWho initiatesTypical useMaps to OpenAI
ToolsModel (via host)API calls, DB queries, ticket updatestools / function calling
ResourcesHost (often pre-model)Inject file contents, schema snapshotsSystem message / file attachments
PromptsHost or user slash commandStandard review templates, runbooksStored prompt partials
SamplingServer → Host LLMServer-side summarization without owning API keysN/A (server uses host model)

Tool lifecycle

  1. Host connects and calls initialize with protocol version.
  2. Host calls tools/list; server returns name, description, input JSON Schema.
  3. Host maps schemas to model-native tool format (OpenAI, Anthropic, Gemini).
  4. Model emits tool call; host forwards tools/call with arguments.
  5. Server returns structured content (text, image, resource refs); host feeds result to model.

Resources vs tools

Use resources when data is large, static-ish, or should not be invoked blindly by the model—think "attach the OpenAPI spec" rather than "call search_openapi." The host reads resources and decides what enters context. Tools are for actions with side effects: create ticket, run query, post message.

Map MCP tools to OpenAI function schema
def mcp_tool_to_openai(mcp_tool: dict, prefix: str = "") -> dict:
    name = f"{prefix}{mcp_tool['name']}" if prefix else mcp_tool["name"]
    return {
        "type": "function",
        "function": {
            "name": name[:64],
            "description": mcp_tool.get("description", "")[:1024],
            "parameters": mcp_tool.get("input_schema", {"type": "object", "properties": {}}),
        },
    }

def merge_allowlisted(tools: list[dict], allowed: set[str], prefix: str) -> list[dict]:
    return [mcp_tool_to_openai(t, prefix=f"{prefix}__")
            for t in tools if t["name"] in allowed]
public OpenAiToolDefinition toOpenAi(McpToolDescriptor tool, String prefix) {
    String name = prefix.isEmpty() ? tool.name() : prefix + "__" + tool.name();
    return OpenAiToolDefinition.builder()
        .name(name.substring(0, Math.min(64, name.length())))
        .description(truncate(tool.description(), 1024))
        .parameters(tool.inputSchema())
        .build();
}
💰 Cost

Every tool call is still a model round trip plus server execution. Resources preloaded into context consume tokens—prefer resource subscriptions with ETags and read only changed URIs.

🔬 Under the Hood

Tool results can include multiple content blocks: text, images (base64), and embedded resource links. Hosts must normalize these into what the active model API accepts—Anthropic tool_result blocks differ from OpenAI tool role messages.

⚠️ Pitfall

Exposing 200 discovered tools to the model degrades routing accuracy. Allowlist 5–15 tools per agent persona; rotate sets by workflow stage.

Python server — FastMCP

The official Python SDK ships FastMCP, a decorator-driven server builder. Annotate functions with @mcp.tool() and @mcp.resource(); run with stdio for local sidecars or configure SSE for remote deployment.

Project layout

FilePurpose
servers/support_mcp.pyFastMCP tool + resource definitions
host/manifest.yamlWhich servers to spawn + allowlists
host/schema_bridge.pyMCP → OpenAI / Anthropic mapping
host/agent_loop.pyExisting tool loop; swap local tools for MCP calls
FastMCP server with tool and resource
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("acme-support")

POLICIES = {"refund": "Annual plans: pro-rata credit within 30 days."}

@mcp.resource("policy://{topic}")
def get_policy(topic: str) -> str:
    """Return policy text for a topic slug."""
    return POLICIES.get(topic.lower(), "Policy not found.")

@mcp.tool()
def lookup_order(order_id: str) -> dict:
    """Fetch order status. IDs must start with ORD-."""
    if not order_id.upper().startswith("ORD-"):
        return {"error": "invalid_order_id"}
    return {"order_id": order_id.upper(), "status": "shipped", "eta_days": 2}

@mcp.tool()
def search_policy(query: str, limit: int = 5) -> list[dict]:
    """Keyword search over policy snippets."""
    hits = [k for k in POLICIES if query.lower() in k.lower()]
    return [{"topic": h, "text": POLICIES[h]} for h in hits[:limit]]

if __name__ == "__main__":
    mcp.run()  # stdio transport by default
// Python FastMCP — use Java Spring AI section for JVM servers
Host agent loop calling MCP tool
async def run_tool(session: ClientSession, name: str, args: dict) -> str:
    result = await session.call_tool(name, arguments=args)
    parts = []
    for block in result.content:
        if block.type == "text":
            parts.append(block.text)
    return "\n".join(parts)

async def agent_turn(session, messages: list, openai_tools: list):
    response = client.chat.completions.create(
        model="gpt-4o", messages=messages, tools=openai_tools)
    msg = response.choices[0].message
    if not msg.tool_calls:
        return msg.content
    for tc in msg.tool_calls:
        out = await run_tool(session, tc.function.name.split("__")[-1],
                             json.loads(tc.function.arguments))
        messages.append({"role": "tool", "tool_call_id": tc.id, "content": out})
    return await agent_turn(session, messages, openai_tools)
public String agentTurn(McpSyncClient client, List messages) {
    ChatResponse response = chatClient.prompt().messages(messages).tools(openAiTools).call();
    if (!response.hasToolCalls()) return response.getResult().getOutput().getContent();
    for (ToolCall call : response.getToolCalls()) {
        String result = client.callTool(call.name(), call.arguments());
        messages.add(ToolResponseMessage.from(call.id(), result));
    }
    return agentTurn(client, messages);
}
💡 Pro Tip

Docstrings on FastMCP functions become tool descriptions seen by the model—write them like API docs: constraints, enums, error shapes.

📦 Real World

Pin mcp package version in both host and server repos. Protocol mismatches surface as silent initialize failures—add integration tests that list tools in CI.

🎯 Interview Tip

Demonstrate you know the boundary: FastMCP server owns business logic; host owns model choice, allowlists, and logging.

Java server — Spring AI @McpTool

Spring AI 1.0+ integrates MCP servers into Spring Boot applications. Annotate service methods with @McpTool and expose them over stdio or WebMvc SSE endpoints—ideal when your enterprise stack is already JVM-centric.

Spring AI MCP auto-generates JSON Schema from method signatures and JavaDoc. Controllers stay thin; domain services implement the actual ticket lookup or policy search. Deploy as a standalone jar sidecar or as a microservice behind your MCP gateway.

Dependencies (Maven)

ArtifactRole
spring-ai-starter-mcp-serverMCP server autoconfiguration
spring-ai-starter-mcp-server-webmvcSSE/HTTP transport
spring-boot-starter-validationArgument validation on tools
Spring AI MCP tool server
@Service
public class SupportMcpTools {
    private final OrderService orders;
    private final PolicyService policies;

    @McpTool(name = "lookup_order",
             description = "Fetch order status. orderId must start with ORD-")
    public OrderStatus lookupOrder(
            @McpToolParam(description = "Order id like ORD-1042") String orderId) {
        if (!orderId.toUpperCase().startsWith("ORD-")) {
            throw new IllegalArgumentException("invalid_order_id");
        }
        return orders.getStatus(orderId.toUpperCase());
    }

    @McpTool(name = "search_policy",
             description = "Search refund and shipping policy snippets")
    public List searchPolicy(
            @McpToolParam(description = "Search query") String query,
            @McpToolParam(description = "Max results", required = false) Integer limit) {
        return policies.search(query, limit != null ? limit : 5);
    }
}
@SpringBootApplication
public class AcmeMcpServerApplication {
    public static void main(String[] args) {
        SpringApplication.run(AcmeMcpServerApplication.class, args);
    }
}

// application.yml
// spring.ai.mcp.server.name: acme-support
// spring.ai.mcp.server.stdio.enabled: true
Spring AI MCP client in host agent
@Service
public class McpAgentBridge {
    private final ChatClient chatClient;
    private final List mcpClients;

    public String run(String userMessage) {
        List tools = mcpClients.stream()
            .flatMap(c -> c.listTools().tools().stream())
            .filter(t -> allowlist.contains(t.name()))
            .map(this::toToolCallback)
            .toList();

        return chatClient.prompt()
            .user(userMessage)
            .toolCallbacks(tools)
            .call()
            .content();
    }
}
@Component
public class McpToolCallback implements ToolCallback {
    private final McpSyncClient client;
    private final McpToolDescriptor descriptor;

    @Override
    public ToolDefinition getToolDefinition() { return map(descriptor); }

    @Override
    public String call(String argumentsJson) {
        return client.callTool(descriptor.name(), parseJson(argumentsJson));
    }
}
⚖️ Trade-off

JVM cold start (2–8 s) hurts stdio spawn-per-request patterns. Keep MCP server processes warm or use SSE deployment on Kubernetes with min replicas ≥ 1.

🔬 Under the Hood

Spring AI maps @McpToolParam to JSON Schema properties with required flags. Nullable types and validation annotations propagate to the schema the model sees.

💡 Pro Tip

Share DTO records between REST controllers and MCP tools so OpenAPI and MCP schemas stay aligned via one source of truth.

Security — poisoning, injection, OAuth, least privilege

MCP moves trust boundaries: servers run with credentials, models choose tools, and tool descriptions are prompt surface. Production deployments need allowlists, description auditing, OAuth-scoped transports, and least-privilege server accounts.

Threat model

ThreatMechanismMitigation
Tool poisoningMalicious server embeds hidden instructions in tool descriptionsSign servers; review descriptions; strip HTML; human approve new tools
Resource injectionResource URI returns attacker-controlled text into contextHost validates URI schemes; sandbox reads; max byte limits
Over-privileged toolsModel calls delete_database when read was enoughSeparate read/write servers; per-persona allowlists
Token theftCompromised host leaks OAuth refresh tokensShort-lived tokens; vault sidecars; no tokens in prompts
Prompt injection via tool outputTicket body contains "ignore previous instructions"Output sanitization; structured JSON; secondary policy model

OAuth for remote MCP (SSE)

  1. User authorizes host app via standard OAuth (PKCE for public clients).
  2. Host stores refresh token in vault; never passes to model.
  3. MCP client attaches access token to SSE connection per server registration.
  4. Server validates token scopes before executing tools.
  5. Rotate credentials; audit tool calls with user + token subject id.
OAuth scopeTools enabled
kb:readsearch_policy, fetch_chunk
orders:readlookup_order only
orders:writecreate_refund (human approval gate)

Map OAuth scopes to MCP tool allowlists in the host manifest—never expose write tools when the token is read-only.

Refresh tokens on a background schedule before expiry so SSE sessions do not fail mid-conversation.

Allowlist and description audit
TRUSTED_SERVERS = {"acme-support", "github-readonly"}

def audit_tool_description(name: str, desc: str) -> bool:
    blocked = ["ignore previous", "system prompt", "override", "secret"]
    lower = desc.lower()
    if any(b in lower for b in blocked):
        log_security_event("tool_description_flag", tool=name)
        return False
    if len(desc) > 2000:
        return False  # excessively long descriptions are suspicious
    return True

def filter_tools(server_name: str, tools: list, allowed: set) -> list:
    if server_name not in TRUSTED_SERVERS:
        raise SecurityError(f"untrusted server: {server_name}")
    return [t for t in tools if t["name"] in allowed and audit_tool_description(t["name"], t.get("description", ""))]
public List filterTools(String serverName, List tools) {
    if (!trustedServers.contains(serverName)) {
        throw new SecurityException("untrusted server: " + serverName);
    }
    return tools.stream()
        .filter(t -> allowlist.contains(t.name()))
        .filter(t -> descriptionAuditor.isSafe(t.description()))
        .toList();
}
🔒 Security

Treat MCP tool descriptions like user input—they enter the model context. Scan for injection patterns in CI when servers update.

⚠️ Pitfall

Connecting to community MCP servers without code review is equivalent to installing unaudited browser extensions with DB credentials.

🎯 Interview Tip

For security questions: least privilege allowlists, signed server artifacts, OAuth on SSE, output sanitization, and logging every tool call with arguments redacted.

📦 Real World

Enterprise MCP registries require SBOM + security review before a server enters the catalog. Dev servers run in isolated namespaces with synthetic data only.

What this looks like in production

Production MCP is a registry + gateway + observability problem—not just a Python script. Remote SSE servers, semver contracts, staged rollouts, and unified tool audit logs let platform teams scale agents without N× custom integrations.

Production architecture

ComponentFunction
RegistryCatalog of approved servers, versions, owners, allowed scopes
GatewayTLS termination, OAuth, rate limits, routing to server pools
Host runtimeAgent service; merges tools; enforces per-tenant allowlists
ObservabilityTool latency, error rates, argument hashes (redacted), server version

Versioning strategy

  • Servers publish semver; breaking schema changes bump major.
  • Hosts pin server_version in manifest; auto-upgrade only patch.
  • Contract tests: golden tools/list snapshots in CI.
  • Blue/green server deploys; hosts drain old connections gracefully.

Track 4 production checklist

  • Signed server bundles in internal registry
  • Per-agent allowlist ≤ 15 tools
  • OAuth or mTLS on all remote transports
  • Tool description audit in CI
  • Structured logs: server, tool, latency, user, tenant
  • Fallback when server unavailable (degraded mode message)
  • Load test: concurrent SSE sessions at peak QPS
Production MCP host config
@dataclass
class ProductionMcpConfig:
    registry_url: str
    server_pins: dict[str, str]  # name -> semver
    allowed_tools: dict[str, set[str]]
    oauth_vault_path: str
    max_tool_calls_per_turn: int = 8

async def load_production_servers(cfg: ProductionMcpConfig) -> list:
    catalog = await fetch_registry(cfg.registry_url)
    sessions = []
    for entry in catalog:
        if entry.version != cfg.server_pins.get(entry.name):
            raise ConfigError(f"version pin mismatch: {entry.name}")
        session = await connect_sse(entry.url, vault_token(cfg.oauth_vault_path, entry.name))
        tools = filter_tools(entry.name, await session.list_tools(), cfg.allowed_tools[entry.name])
        sessions.append((session, tools))
    return sessions
@Scheduled(fixedRate = 60_000)
public void healthCheckMcpServers() {
    for (McpServerRegistration reg : registry.active()) {
        HealthStatus status = gateway.ping(reg);
        metrics.counter("mcp.server.health", "name", reg.name()).increment(status.isUp() ? 1 : 0);
        if (!status.isUp()) notificationService.alert(reg);
    }
}
💰 Cost

Remote MCP adds network hop (10–80 ms) per tool call. Batch read-only operations in single tools where possible; avoid chatty micro-tool designs.

📦 Real World

Platform teams expose one internal MCP gateway URL; product teams register servers via GitOps PR to the registry repo—mirroring internal API catalog patterns.

Latency budget per MCP tool call

TransportConnecttools/listtools/call p95
stdio (warm process)0 ms (pooled)5–20 ms20–200 ms
SSE remote30–100 ms40–120 ms80–500 ms

Budget MCP like external APIs: set per-tool timeouts, circuit-break after N failures, and return structured {"error":"server_unavailable"} to the model instead of hanging the agent loop.

Mapping MCP to Anthropic and Bedrock

OpenAI uses tools with function schemas. Anthropic uses tool_use blocks with input objects. Bedrock Converse API unifies tool config across model vendors. Your schema bridge should emit provider-specific payloads from one canonical MCP descriptor list—do not fork business logic per provider.

Provider-aware schema bridge
def to_provider_tools(mcp_tools: list[dict], provider: str, prefix: str) -> list[dict]:
    if provider == "openai":
        return [mcp_tool_to_openai(t, prefix) for t in mcp_tools]
    if provider == "anthropic":
        return [{"name": f"{prefix}__{t['name']}", "description": t["description"],
                 "input_schema": t["input_schema"]} for t in mcp_tools]
    raise ValueError(f"unsupported provider: {provider}")
public List<ToolDefinition> toProviderTools(
        List<McpToolDescriptor> tools, Provider provider, String prefix) {
    return tools.stream().map(t -> switch (provider) {
        case OPENAI -> openAiMapper.map(t, prefix);
        case ANTHROPIC -> anthropicMapper.map(t, prefix);
        case BEDROCK -> bedrockMapper.map(t, prefix);
    }).toList();
}

Related Track 4 guides

🎯 Interview Tip

Contrast MCP with OpenAPI: OpenAPI describes HTTP APIs for humans and codegen; MCP describes model-invokable capabilities with discovery and bidirectional sampling. Both can coexist—MCP server wraps internal REST.

Local development checklist

  • Run mcp_inspect.py to list tools without invoking the LLM
  • Snapshot golden tools/list JSON in unit tests
  • Test allowlist drops: ensure removed tools never appear in OpenAI payload
  • Simulate server crash mid-session; host should reconnect or degrade gracefully
  • Record one full trace (initialize → list → call) for onboarding docs

Spec resources

The MCP specification lives at modelcontextprotocol.io. SDKs for Python and TypeScript track spec versions—when Anthropic or the steering committee publishes transport updates, upgrade host and server SDKs in lockstep. Internal wrappers should not reimplement JSON-RPC; use official clients to avoid subtle protocol bugs.