
How the Model Context Protocol Actually Works: Transport, Negotiation, and Interoperability

MCP is more than a plugin system. Understanding its JSON-RPC transport layer and capability negotiation reveals why it's becoming the connective tissue for AI tool ecosystems.

March 28, 2026 · Basel Ismail
Tags: mcp, architecture, protocol

The Problem MCP Is Solving

Before MCP, every AI assistant that wanted to call an external tool had to implement its own integration layer. OpenAI had function calling. Anthropic had tool use. LangChain had its own abstraction. Each one required the tool developer to write a separate adapter, and each one made slightly different assumptions about schemas, error handling, and context passing.

The Model Context Protocol, published by Anthropic in late 2024, is an attempt to standardize that interface. Not just for Claude, but for any AI agent that wants to consume external capabilities. The goal is a world where a tool built once can work across any compliant host, without rewriting the integration for each one.

That's a reasonable ambition. Whether it succeeds depends entirely on the protocol's technical design, and that design is worth understanding in detail.

The Client-Server Architecture

MCP uses a clear client-server split. The host is the AI application, something like Claude Desktop, Cursor, or a custom agent you've built. The client is a component inside that host that manages the connection to an MCP server. The server is the external process that exposes tools, resources, or prompts.

This separation matters because it means the host application doesn't talk directly to your database or API. It talks to a local or remote MCP server, which acts as a controlled intermediary. The server defines exactly what it exposes, and the client negotiates what it actually uses.

One host can connect to multiple MCP servers simultaneously. A developer using Cursor might have one MCP server for their GitHub integration, another for their internal documentation system, and a third for a Postgres query tool, all running at the same time, all managed through the same protocol layer.

JSON-RPC as the Transport Foundation

The wire format for MCP is JSON-RPC 2.0. If you've worked with language servers (the Language Server Protocol that powers IDE features like go-to-definition), this will feel familiar. JSON-RPC is a lightweight remote procedure call protocol that encodes method calls and responses as JSON objects.

A basic MCP request looks like this:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "query_database",
    "arguments": {
      "sql": "SELECT * FROM orders WHERE status = 'pending'"
    }
  }
}

The response follows the same envelope structure, with either a result field on success or an error field with a code and message on failure. The id field ties requests to responses, which is how the protocol handles concurrent calls without mixing up results.
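For the request above, a successful response carries the tool's output in the result field. The content shape follows the spec's tool-result format; the row data here is invented for the example:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "[{\"order_id\": 1042, \"status\": \"pending\"}]"
      }
    ],
    "isError": false
  }
}

A protocol-level failure returns the error envelope instead:

{
  "jsonrpc": "2.0",
  "id": 1,
  "error": { "code": -32602, "message": "Invalid params" }
}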

MCP supports two primary transport mechanisms. stdio transport runs the MCP server as a subprocess and communicates over standard input and output. This is the default for local tools and is what most MCP servers you'll find on registries use. HTTP with Server-Sent Events (SSE) is the remote transport option, where the client sends requests over HTTP POST and receives streaming responses via SSE. A third transport, Streamable HTTP, was introduced in the March 2025 spec revision to consolidate some of the SSE complexity.

The stdio approach is simple and secure for local use. The HTTP approach enables hosted MCP servers, which is where things get more interesting for enterprise deployments.
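To make the stdio case concrete: hosts like Claude Desktop launch local servers from a JSON configuration that names the command to spawn. A sketch of that configuration, with an illustrative server package and connection string:

{
  "mcpServers": {
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres", "postgresql://localhost/mydb"]
    }
  }
}

The host starts the process, wires up stdin and stdout, and runs the handshake described next.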

The Initialization Handshake and Capability Negotiation

When a client connects to an MCP server, the first thing that happens is a structured handshake. This isn't optional ceremony; it's how the two sides agree on what they can actually do together.

The client sends an initialize request that includes its protocol version and a capabilities object declaring what features it supports:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2024-11-05",
    "capabilities": {
      "roots": { "listChanged": true },
      "sampling": {}
    },
    "clientInfo": {
      "name": "my-agent",
      "version": "1.0.0"
    }
  }
}

The server responds with its own capability declaration, listing what it supports. Tools, resources, prompts, logging, and experimental features all have their own capability flags. If the server supports tools, it says so. If it supports resource subscriptions (where the client can watch a resource for changes), it declares that too.

After the server responds, the client sends an initialized notification to confirm the handshake is complete. Only after this three-step sequence does the session become active. This design means a client can gracefully handle servers that don't support certain features, rather than failing at runtime when it tries to call something that doesn't exist.
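Concretely, the server's reply mirrors the client's initialize message, and the confirming notification carries no id because it expects no response. The server name and capability flags here are illustrative:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "protocolVersion": "2024-11-05",
    "capabilities": {
      "tools": { "listChanged": true },
      "resources": { "subscribe": true, "listChanged": true }
    },
    "serverInfo": { "name": "example-server", "version": "1.0.0" }
  }
}

{
  "jsonrpc": "2.0",
  "method": "notifications/initialized"
}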

Protocol versioning is handled through this same handshake. The 2024-11-05 revision remains the most widely implemented, with the 2025-03-26 revision available for implementations that want the newer transport options. If a client and server can't agree on a compatible version, the connection fails cleanly at initialization rather than producing undefined behavior later.

The Three Primitive Types: Tools, Resources, and Prompts

MCP servers expose capabilities through three distinct primitives, and understanding the difference between them is important for both building and evaluating MCP servers.

Tools are callable functions. They have a name, a description, and an input schema defined in JSON Schema format. When an AI model decides to use a tool, the host calls tools/call with the tool name and arguments. The server executes the function and returns a result. This is the most common primitive and the one most people think of when they think about MCP.
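A tool definition, as returned by tools/list, bundles those three pieces. This one matches the query_database call shown earlier; the description text is invented for the example:

{
  "name": "query_database",
  "description": "Run a read-only SQL query against the orders database",
  "inputSchema": {
    "type": "object",
    "properties": {
      "sql": { "type": "string", "description": "The SQL statement to execute" }
    },
    "required": ["sql"]
  }
}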

Resources are data sources that the server exposes for the model to read. A resource has a URI, a MIME type, and content. Resources can be static (a configuration file) or dynamic (a live database record). Clients discover available resources via resources/list and fetch them with resources/read. Some servers support resources/subscribe, which pushes updates to the client when a resource changes.
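A resources/read exchange is similarly compact: the client asks for a URI and gets back typed content. The URI and payload here are made up for illustration:

{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "resources/read",
  "params": { "uri": "file:///app/config.json" }
}

{
  "jsonrpc": "2.0",
  "id": 2,
  "result": {
    "contents": [
      {
        "uri": "file:///app/config.json",
        "mimeType": "application/json",
        "text": "{ \"retries\": 3 }"
      }
    ]
  }
}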

Prompts are reusable prompt templates that servers can expose. This is the least-used primitive today, but it's useful when a server wants to provide standardized ways to invoke its capabilities: pre-built instructions for common workflows that the host can surface to users.

A single MCP server can expose all three types simultaneously. A GitHub MCP server might expose tools for creating issues and PRs, resources for reading file contents and repository metadata, and prompts for common code review workflows.

Sampling: When the Server Talks Back to the Model

One of the more architecturally interesting features in MCP is sampling. Most people assume the data flow is one-directional: the model calls tools, tools return data. Sampling inverts this.

With sampling, an MCP server can request that the host ask the language model to generate something. The server sends a sampling/createMessage request to the client; the client passes it to the model and returns the model's response to the server. This enables agentic patterns where the tool itself needs LLM reasoning as part of its execution.

Imagine an MCP server that processes documents. It might extract raw text, then use sampling to ask the model to classify the document type before deciding how to structure the output. The server orchestrates the model call without the host application needing to know about this intermediate step.
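A sampling request from that document-processing server might look like the following. The params shape (messages plus maxTokens) follows the spec; the prompt text is invented for the example:

{
  "jsonrpc": "2.0",
  "id": 7,
  "method": "sampling/createMessage",
  "params": {
    "messages": [
      {
        "role": "user",
        "content": {
          "type": "text",
          "text": "Classify this document as an invoice, contract, or memo:\n\n<extracted text>"
        }
      }
    ],
    "maxTokens": 50
  }
}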

Sampling is a capability that the client declares during initialization. If a client doesn't support it, servers that require it will know upfront and can either degrade gracefully or refuse the connection.

Why This Architecture Matters for Interoperability

The practical value of MCP's design becomes clear when you look at adoption numbers. As of mid-2025, Skillful.sh indexes over 137,000 MCP servers across 50+ directories. That scale is only possible because the protocol gives tool developers a single target to build against.

Before MCP, a developer building a Notion integration would write one adapter for Claude, another for GPT-4 function calling, another for Gemini, and so on. With MCP, they write one server and it works with any compliant host. Claude Desktop, Cursor, Zed, Continue, and a growing list of custom agents all speak the same protocol.

The security implications of this architecture are also significant. Because MCP servers are separate processes with explicit capability declarations, it's possible to analyze what a server actually does before connecting to it. Static analysis can check whether a server's declared tools match its actual code behavior. Prompt injection risks, where a malicious server tries to hijack the model's behavior through tool responses, can be detected by examining response patterns. This is the basis for security scoring systems that grade MCP servers on their actual risk profile rather than just their documentation.

The JSON-RPC foundation also means MCP is inspectable. You can log every message between client and server, replay sessions for debugging, and validate protocol compliance with standard tooling. This is a meaningful advantage over opaque SDK-based integrations where the communication layer is buried inside library code.

Where the Current Spec Has Rough Edges

MCP is still maturing, and a few areas show the seams. Authentication is one of them. The 2025-03-26 spec added OAuth 2.1 support for remote servers, but local stdio servers have no standardized auth mechanism. For enterprise deployments where you're running MCP servers that access sensitive internal systems, this means rolling your own access control at the server level.

The resource subscription model is powerful but not universally implemented. Many MCP servers that claim resource support only implement the basic read operations, not subscriptions. Clients that rely on push updates for real-time data will find inconsistent behavior across servers.

Tool schema validation is also left largely to the server. The protocol requires servers to declare JSON Schema for tool inputs, but doesn't mandate strict validation of arguments before execution. A poorly written server might accept malformed input and produce confusing errors rather than clean validation failures. When evaluating MCP servers for production use, it's worth testing edge cases in tool argument handling explicitly.
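One quick probe: send a deliberately malformed call and see what comes back. A well-behaved server rejects it at the protocol level with a clean Invalid params error rather than attempting execution; the exchange below is illustrative:

{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "query_database",
    "arguments": { "sql": 42 }
  }
}

{
  "jsonrpc": "2.0",
  "id": 3,
  "error": { "code": -32602, "message": "Invalid params: 'sql' must be a string" }
}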

Building on MCP: What to Know Before You Start

If you're building an MCP server, the official SDKs for TypeScript and Python handle the protocol mechanics, so you're mostly writing the business logic for your tools and resources. The TypeScript SDK in particular has good ergonomics for defining tool schemas with Zod and handling the initialization lifecycle.
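A minimal server with the TypeScript SDK looks roughly like this. It's a sketch of the SDK's documented tool-registration pattern; the tool itself is invented for the example:

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// Identity sent to the client during the initialize handshake.
const server = new McpServer({ name: "orders-server", version: "1.0.0" });

// The Zod shape becomes the JSON Schema advertised via tools/list,
// and arguments are validated against it before the handler runs.
server.tool(
  "count_pending_orders",
  "Count orders with a given status",
  { status: z.string() },
  async ({ status }) => ({
    content: [{ type: "text", text: `42 orders with status ${status}` }],
  })
);

// stdio transport: the host spawns this process and speaks
// JSON-RPC 2.0 over stdin/stdout.
await server.connect(new StdioServerTransport());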

If you're building a host application that consumes MCP servers, the client SDK manages connection state, capability negotiation, and request routing. The main decision is whether you're connecting to local stdio servers, remote HTTP servers, or both, since the transport configuration differs.
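On the consuming side, a stdio connection with the same SDK is a few lines. Again a sketch, assuming the server above has been built to server.js:

import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// The transport spawns the server as a subprocess.
const transport = new StdioClientTransport({
  command: "node",
  args: ["server.js"],
});

const client = new Client({ name: "my-host", version: "1.0.0" });
await client.connect(transport); // runs the initialize handshake

// Discover and call tools over the negotiated session.
const { tools } = await client.listTools();
const result = await client.callTool({
  name: "count_pending_orders",
  arguments: { status: "pending" },
});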

For evaluating existing MCP servers before integrating them, the capability negotiation design means you can inspect what a server declares versus what it actually does. Servers that declare minimal capabilities but have broad filesystem or network access in their implementation code are a red flag worth investigating. Cross-referencing a server's adoption metrics, directory presence, and security score gives you a reasonable signal on whether it's worth the integration effort.

The protocol is specific enough to be useful and flexible enough to accommodate a wide range of tool types. That balance is what makes it a credible foundation for AI tool interoperability: not a guarantee, but a solid starting point for the ecosystem that's building on top of it.

