MCP servers and agentic workflows
A technical deep-dive into the Model Context Protocol: building persistent memory, tool servers, and multi-agent pipelines.
---
The gap between a language model and a useful agent is tools. A model that can only read its input and produce text is a very sophisticated autocomplete system. A model that can read files, query databases, call APIs, and remember things across sessions is something qualitatively different. The Model Context Protocol is the layer that makes that second thing possible in a structured, composable way.
I have been building with MCP since early 2026, using it across most of the production applications in this portfolio. What follows is a practical account of how it works, where it is genuinely useful, and where the friction still lives.
---
What MCP actually is
MCP is a protocol: a standardised way for servers to expose tools and resources to an AI agent running in a host application. The host is configured with one or more servers. Each server declares what it can do. The agent calls those capabilities and gets structured results back.
The key insight is that this is not prompt engineering. You are not describing capabilities in natural language and hoping the model interprets them correctly. You are defining them formally, with typed inputs, typed outputs, and explicit error handling. The model calls a tool the way code calls a function. The result is predictable and inspectable.
This matters because one of the main failure modes of naive AI integrations is opacity. Something goes wrong and you cannot tell whether the model misunderstood the task, the tool produced wrong output, or the tool was called with wrong parameters. MCP makes each of those failure points visible and separate.
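One way to keep those three failure points apart is to report the stage at which a call failed. The following is a minimal sketch in plain Python, not the MCP SDK: `ToolSpec`, `call_tool`, and `word_count` are hypothetical names, and the parameter map is a simplified stand-in for the schema a real server would declare.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical tool declaration: a name, typed parameters, and a handler."""
    name: str
    params: dict[str, type]          # parameter name -> required Python type
    handler: Callable[..., Any]


def call_tool(registry: dict[str, ToolSpec], name: str, args: dict[str, Any]) -> dict[str, Any]:
    """Run a tool call and report which stage failed, if any."""
    spec = registry.get(name)
    if spec is None:
        # The caller asked for a capability that was never declared.
        return {"ok": False, "stage": "lookup", "error": f"unknown tool: {name}"}

    missing = [p for p in spec.params if p not in args]
    extra = [a for a in args if a not in spec.params]
    wrong = [p for p, t in spec.params.items() if p in args and not isinstance(args[p], t)]
    if missing or extra or wrong:
        # The tool exists but was called with wrong parameters.
        return {"ok": False, "stage": "validation",
                "error": f"missing={missing} extra={extra} wrong_type={wrong}"}

    try:
        return {"ok": True, "result": spec.handler(**args)}
    except Exception as exc:
        # The call was well-formed; the tool itself failed.
        return {"ok": False, "stage": "execution", "error": str(exc)}


def word_count(text: str) -> int:
    """Example handler: count whitespace-separated words."""
    return len(text.split())


REGISTRY = {"word_count": ToolSpec("word_count", {"text": str}, word_count)}
```

With this shape, a failed call says whether the tool was unknown, the arguments were wrong, or the tool itself raised, so none of the three has to be inferred from the model's output.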
---
Persistent memory as a first-class capability
The most immediately useful thing I built with MCP was a memory system. Language models have no persistent state between conversations by default. Every session starts cold. This is fine for one-off tasks but becomes a significant constraint when you are doing ongoing work: building a product, managing a programme, maintaining a codebase.
The pattern I settled on: a file-based memory store, exposed through MCP, with typed memory categories (user context, feedback, project state, references). The agent reads from this store at session start, writes to it when something worth remembering happens, and the next session begins with genuine context rather than a blank slate.
The important design decision is granularity. A memory store that holds everything becomes noise. The discipline is deciding what is worth persisting: not raw conversation history, which is voluminous and mostly redundant, but the distilled facts that change how the next session should go. A preference about code style. A decision about architecture. A constraint that would otherwise be re-explained from scratch.
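A store along these lines can be sketched in a few dozen lines of Python. This is an illustration under assumptions, not the actual implementation: the `MemoryStore` class, the one-JSON-file-per-category layout, and the key names are all hypothetical. Only the four categories come from the description above, and the MCP wiring that would expose `read` and `remember` as tools is omitted.

```python
import json
import tempfile
from pathlib import Path

# Categories taken from the text; the file layout is an assumption.
CATEGORIES = ("user_context", "feedback", "project_state", "references")


class MemoryStore:
    """Hypothetical file-based store: one JSON file per memory category."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, category: str) -> Path:
        if category not in CATEGORIES:
            raise ValueError(f"unknown memory category: {category}")
        return self.root / f"{category}.json"

    def read(self, category: str) -> dict[str, str]:
        path = self._path(category)
        if not path.exists():
            return {}
        return json.loads(path.read_text(encoding="utf-8"))

    def remember(self, category: str, key: str, fact: str) -> None:
        """Store one distilled fact; writing the same key again replaces it."""
        entries = self.read(category)
        entries[key] = fact
        self._path(category).write_text(json.dumps(entries, indent=2), encoding="utf-8")

    def load_all(self) -> dict[str, dict[str, str]]:
        """Everything a new session would read at start."""
        return {category: self.read(category) for category in CATEGORIES}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = MemoryStore(Path(tmp) / "memory")
        store.remember("feedback", "code_style", "Prefer small functions with type hints.")
        # A later session builds a new store over the same directory.
        print(MemoryStore(Path(tmp) / "memory").load_all()["feedback"])
```

Keying each fact means a changed decision overwrites the old one instead of accumulating next to it, which is one simple way to keep the store from turning into noise.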
---
Tool servers in practice
Beyond memory, I have built MCP servers for file system access, database queries, deployment pipelines, and cross-project search. The pattern is consistent: identify a capability the agent needs, define it as a typed tool, implement it as a small server, and expose it through the MCP host configuration.
The discipline that matters here is interface design, not implementation complexity. A well-designed tool is narrow: it does one thing, takes clear inputs, and returns a clean result. A poorly designed tool tries to be too flexible, ends up with ambiguous parameters, and produces inconsistent results that the model has to interpret.
The analogy to API design is exact. Everything you know about building a good REST API applies to building a good MCP tool. Simple, consistent, predictable. The model calling your tool is the client. Design for the client.
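As an illustration of a narrow tool, here is a hedged sketch of a text search function. The `search_text` and `Match` names are hypothetical, and the `files` mapping stands in for real file system access so the example stays self-contained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    path: str
    line_number: int
    line: str


def search_text(files: dict[str, str], query: str, max_results: int = 20) -> list[Match]:
    """Narrow tool: case-insensitive substring search, nothing else.

    `files` maps a path to file contents; it stands in for real file access.
    """
    if not query:
        raise ValueError("query must be a non-empty string")
    if max_results < 1:
        raise ValueError("max_results must be at least 1")
    needle = query.lower()
    matches: list[Match] = []
    for path in sorted(files):  # sorted so results come back in a stable order
        for number, line in enumerate(files[path].splitlines(), start=1):
            if needle in line.lower():
                matches.append(Match(path, number, line.strip()))
                if len(matches) == max_results:
                    return matches
    return matches


# The shape to avoid: one entry point whose meaning depends on a mode string
# and a free-form options bag, e.g. file_tool(action: str, target: str, options: dict).
```

The result is always a list of the same record type, so the caller never has to guess what shape came back.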
---
Multi-agent pipelines
The more interesting architecture is when agents compose. One agent orchestrates. Others specialise: one for research, one for code generation, one for review, one for verification. The orchestrator breaks the problem down, delegates to specialists, and assembles the result.
I use this pattern for complex builds where a single context window is insufficient or where parallel work is possible. A research agent explores the codebase. A planning agent designs the approach. An implementation agent writes the code. A review agent checks it. These run with different context, different tools, and different prompts, but share state through the memory layer.
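The control flow of such a pipeline can be sketched without any model calls. In the sketch below every agent is a stub function and the shared `state` dictionary stands in for the memory layer. All names are hypothetical, and the stages run sequentially for simplicity, whereas a real pipeline might run independent stages in parallel.

```python
from typing import Callable

# Shared state stands in for the memory layer; each stage reads it and adds to it.
State = dict[str, str]
Agent = Callable[[State], str]


def research(state: State) -> str:
    return f"notes on: {state['task']}"


def plan(state: State) -> str:
    return f"plan based on [{state['research']}]"


def implement(state: State) -> str:
    return f"code following [{state['plan']}]"


def review(state: State) -> str:
    return "approved" if state["implementation"] else "rejected"


# Order matters: each specialist depends on what earlier ones wrote.
PIPELINE: list[tuple[str, Agent]] = [
    ("research", research),
    ("plan", plan),
    ("implementation", implement),
    ("review", review),
]


def orchestrate(task: str) -> State:
    """Run each specialist in turn, recording its output under its own key."""
    state: State = {"task": task}
    for key, agent in PIPELINE:
        state[key] = agent(state)
    return state
```

Each specialist sees only what earlier stages wrote into the shared state, which mirrors the idea of separate contexts joined through memory rather than one long conversation.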
The failure mode to watch for is coordination overhead. Multi-agent pipelines are slower and more expensive than single-agent sessions. They are worth the cost when the problem genuinely requires parallel exploration or when specialisation produces meaningfully better output. They are not worth it as a default pattern. The right tool for most tasks is still a single agent with good tools and clear instructions.
---
Where the friction lives
The tooling is immature. MCP servers need to be configured manually, the debugging experience when something goes wrong is rough, and the ecosystem of pre-built servers is still thin compared to what it will eventually be.
The deeper friction is conceptual. Building with MCP well requires thinking clearly about what capabilities an agent actually needs, how those capabilities should be structured, and what the failure modes are. Most of the time I have spent on MCP infrastructure has not been writing server code. It has been thinking through interface design, deciding what to make a tool versus what to handle in the prompt, and debugging subtle mismatches between what I thought a tool did and what it actually did.
That thinking is worth doing. The agents that work well in my workflow are the ones backed by carefully designed tools. The ones that don't are the ones where I took shortcuts on the interface.
---
The protocol will evolve. The tooling will improve. But the underlying principle (structured, typed, composable capabilities as the foundation for useful agents) is sound and worth understanding now, before it becomes invisible infrastructure that everyone uses without thinking about.