Thoughts & Ideas

May 1, 2025
MCP Servers: Why you need to give your AI agents access to your infra

If you’re reading this, you’ve probably tried using AI to help with incidents, infra costs, or DevOps automation — only to be disappointed.

Most tools are either too shallow (RAG spam) or too manual (internal portals).

That’s where MCP comes in.

It’s a new protocol from Anthropic (released in late 2024) that gives LLMs structured access to your tools so they can reason, act, and hallucinate less.

MCP is to AI tools what HTTP is to the web: a shared language and structure that lets clients (like an LLM agent) talk to servers (your tools, APIs, and data sources) in a predictable way.

And as LLMs are embedded into developer tools, monitoring stacks, and cloud infrastructure, prompting alone isn’t enough. You need a way for the model to act in your context and on your systems. That’s what MCP enables: it gives the AI structured access to tools, resources, and prompts it can use to reason about the problem. And there are already quite a few MCP servers available for use.

In this short guide, we’ll break down in a little more detail:

  • prompting in 2025

  • how MCP fits in

  • why everyone’s talking about it

So without further ado, a brief foray into prompting 😄

Part 1: LLM prompting patterns and techniques

Prompting isn't just “talking to an AI.” It’s an exercise in encoding context, intent, and constraints into a structured message that LLMs can reason about and act on. And as LLMs grow more agentic, prompting becomes the bridge between natural language and real-world operations.

There are more than a few prompt engineering techniques that help engineers get better results from a model for different tasks. Because prompting is ultimately about getting the right context in front of the model, these techniques are massively important when it comes to designing an AI tool that is actually useful. Here are a few common approaches:

  • Retrieval Augmented Generation (RAG):

    • How it works: User asks a question ➝ relevant documents or data retrieved ➝ everything passed to the LLM ➝ model generates an answer.

    • Why it’s tricky: RAG quality depends on retrieval relevance, document chunking, and embedding accuracy.


  • Few-shot prompting:

    • How it works: User shows the model what a good output looks like. It’s basically mini fine-tuning scoped to a single prompt.

    • Why it’s tricky: Increases token usage, reduces space for novel input, may degrade performance if overused.


  • ReAct (Reasoning + Acting):

    • How it works: The model reasons through a problem and chooses actions (i.e., tools) to take. So, the process would look like: model asks “what do I need to solve this problem?” → calls a tool (like an API or function) → analyzes the results → continues to call other tools or returns an answer to the user.
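
To make that loop concrete, here is a minimal ReAct-style sketch in Python. The `call_llm` placeholder and the `get_cpu_usage` tool are invented for illustration; in practice the host application (Claude Desktop, Cursor, etc.) runs this loop for you.

```python
import json

def get_cpu_usage(instance_id: str) -> float:
    """Hypothetical tool: average CPU % for an instance (hard-coded for this sketch)."""
    return {"i-abc123": 4.2, "i-def456": 71.0}.get(instance_id, 0.0)

TOOLS = {"get_cpu_usage": get_cpu_usage}

def call_llm(messages: list[dict]) -> dict:
    """Placeholder for a real model call. Returns either
    {"type": "tool", "tool": ..., "arguments": {...}} or {"type": "final", "content": ...}."""
    raise NotImplementedError

def react_loop(question: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        step = call_llm(messages)                 # model reasons: "what do I need next?"
        if step["type"] == "final":
            return step["content"]                # enough information: answer the user
        result = TOOLS[step["tool"]](**step["arguments"])   # run the chosen tool
        messages.append({"role": "tool", "content": json.dumps(result)})  # feed result back
    return "Stopped after too many steps"
```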

These techniques help — but only go so far without structured access to the right data. That’s where MCP comes in.

Part 2: What is MCP?

Like I mentioned previously, MCP provides us with a structured way to give AI access to your stack. Before MCP, every tool integration with an LLM was bespoke: you’d write your own function registry, define schemas, and prompt the model to use them correctly.

Now, you can expose your tool via an MCP server so that any compatible AI client can use it.

💡 Note: Before MCP, if you wanted to connect an LLM to your tools, you had to write the agent. That's no longer true: now you can write just the tools and connect an agent like Claude Desktop or Cursor to them.

With MCP:

  • Any tool or service can expose its functionality in a standard way

  • Any LLM-compatible agent with MCP support (e.g. Claude, Cursor, Windsurf) can discover and use that functionality

It enables:

  • Structured access to resources

  • Consistent tool invocation

  • Interoperability across hosts and clients

Anatomy of MCP

There are three main components of MCP:

  • Host: the application that contains the agent (e.g., Claude Desktop)

  • MCP Client: a library that speaks the MCP protocol and can handle connections to the servers (your host application typically provides this)

  • MCP Server: a lightweight interface that exposes tools, resources, and prompts in a standardized way

The MCP server is where all the fun stuff happens. It can expose:

  • Tools: perform operations with side effects (e.g., deploy a service, fetch metrics)

  • Resources: retrieve raw data (e.g., list EC2 instances, query a DB)

  • Prompts: prompt templates the AI can use, predefined by the server owner

💡 The distinction between TOOLS and RESOURCES can be a little fuzzy, because some "resources" (like querying a DB) behave like tools.

A helpful rule of thumb is:

  • Resources = fetching data (GET)

  • Tools = doing things (POST)

Keep in mind that many of the most popular MCP clients do not actually support resources and/or prompts. (Here’s a table that shows which ones do and don’t.) This has led a lot of MCP server authors to implement things that should be resources as tools instead, since tools are supported almost everywhere.
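
To make the three primitives concrete, here’s a minimal sketch using the FastMCP helper from the official MCP Python SDK (the `mcp` package). The server name and functions are invented for illustration; only the decorators and `mcp.run()` come from the SDK.

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("infra-demo")

@mcp.tool()
def restart_service(service_name: str) -> str:
    """Tool (side effects): restart a service. Implementation elided in this sketch."""
    return f"restarted {service_name}"

@mcp.resource("instances://ec2")
def list_ec2_instances() -> str:
    """Resource (GET-like): raw data the client can read into context."""
    return "i-abc123, i-def456"

@mcp.prompt()
def incident_summary(alert: str) -> str:
    """Prompt: a reusable template predefined by the server owner."""
    return f"Summarize this alert for an on-call engineer and suggest next steps:\n\n{alert}"

if __name__ == "__main__":
    mcp.run()  # defaults to the stdio transport, which hosts like Claude Desktop expect
```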

Part 3: Prompting + MCP = Real AI Agents

MCP doesn’t replace prompting; it supercharges it.

A prompt might say something like, “find all databases with CPU usage < 10%”, but the model needs access to tools to fetch and filter that data. That’s where an MCP server comes in.

💡 Note: a model could filter the data for you without a tool, but it would come with some important tradeoffs:

  • It would mean that the model needs to pull all of the data into its context window (so if there’s a lot of data, you’ll quickly blow out the context window completely)

  • You’re spending a lot of money (comparatively) on something you could easily do with a couple lines of code

  • You don’t have control over what “< 10%” actually means: what if the average across your 12 cores is under 10%, but one of your cores is pinned at 90%? The LLM may decide that’s not important. (The sketch below makes that check explicit.)
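
Here’s a hedged sketch of what that “couple lines of code” looks like as an MCP tool: the threshold semantics (average and per-core peak) are explicit, and only matching rows ever reach the model’s context window. The metrics backend and database names are stand-ins.

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("db-metrics")

def fetch_cpu_samples(db_id: str) -> list[float]:
    """Stand-in for a real metrics backend (CloudWatch, Datadog, etc.)."""
    return {"prod-db-1": [3.0, 4.5, 2.1], "prod-db-2": [8.0, 92.0, 7.5]}.get(db_id, [])

@mcp.tool()
def underutilized_databases(threshold_pct: float = 10.0) -> list[dict]:
    """Databases whose average AND per-core peak CPU are both below the threshold."""
    results = []
    for db_id in ["prod-db-1", "prod-db-2"]:               # hypothetical inventory
        samples = fetch_cpu_samples(db_id)
        if not samples:
            continue
        avg, peak = sum(samples) / len(samples), max(samples)
        if avg < threshold_pct and peak < threshold_pct:   # prod-db-2's pinned core excludes it
            results.append({"db": db_id, "avg_cpu": avg, "max_cpu": peak})
    return results
```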

Here’s a simple way to frame it:

  • RAG

    • Prompting behavior: Retrieve external data → inject into context → generate answer

    • How MCP enhances it: MCP can expose Resources that the LLM retrieves via standardized GET-like calls


  • Few-shot

    • Prompting behavior: Show a few examples of ideal input/output to guide generation

    • How MCP enhances it: MCP can expose Prompts (reusable templates with slots) that serve as few-shot examples


  • ReAct

    • Prompting behavior: LLM iteratively reasons and chooses tools to interact with the environment

    • How MCP enhances it: MCP exposes Tools that the LLM can call, observe results from, and act again if needed
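
As an example of the middle row, here’s a sketch of an MCP Prompt acting as a few-shot template, again via the Python SDK’s FastMCP helper. The labels and example errors are invented for illustration.

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("error-triage")

@mcp.prompt()
def classification_prompt(error_message: str) -> str:
    """Few-shot template: labeled examples plus a slot for the new error message."""
    return (
        "Classify each error message as TIMEOUT, AUTH, or BUG.\n\n"
        "Error: 'connection reset by peer after 30s' -> TIMEOUT\n"
        "Error: 'invalid signature on JWT' -> AUTH\n"
        "Error: 'KeyError: user_id in handler.py' -> BUG\n\n"
        f"Error: '{error_message}' ->"
    )
```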

Here are several hypothetical use cases:

  • RAG + MCP:

    Prompt: “Why is my app’s error rate high?”

    MCP Tool: getMetrics(app_id) → inject into context

    ✅ The LLM interprets context-rich metrics to explain issues.


  • Few-shot + MCP:

    Prompt: “Classify this error message into categories”

    MCP Prompt: classificationPrompt with labeled examples

    ✅ LLM mimics example structure using a stable template.

  • ReAct + MCP:

    Prompt: “Spin down all EC2 instances under 10% CPU”

    Tool Chain: listInstances → getCPUUsage → terminateInstance (sketched below)

    ✅ LLM reasons through conditionals and triggers tool calls step-by-step.
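
Here’s a hedged sketch of what those three tools might look like on the server side, exposed with FastMCP and backed by boto3/CloudWatch. The names mirror the chain above; the 24-hour window and terminate-on-sight behavior are illustrative, not a recommended policy, and the code assumes AWS credentials are already configured.

```python
from datetime import datetime, timedelta, timezone

import boto3
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("ec2-ops")
ec2 = boto3.client("ec2")             # assumes credentials/region are configured
cloudwatch = boto3.client("cloudwatch")

@mcp.tool()
def list_instances() -> list[str]:
    """IDs of all running EC2 instances."""
    pages = ec2.get_paginator("describe_instances").paginate(
        Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
    )
    return [
        inst["InstanceId"]
        for page in pages
        for res in page["Reservations"]
        for inst in res["Instances"]
    ]

@mcp.tool()
def get_cpu_usage(instance_id: str) -> float:
    """Average CPUUtilization (%) over the last 24 hours."""
    now = datetime.now(timezone.utc)
    stats = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        StartTime=now - timedelta(hours=24),
        EndTime=now,
        Period=3600,
        Statistics=["Average"],
    )
    points = stats["Datapoints"]
    return sum(p["Average"] for p in points) / len(points) if points else 0.0

@mcp.tool()
def terminate_instance(instance_id: str) -> dict:
    """Destructive: terminate an instance. Gate this behind human confirmation in practice."""
    resp = ec2.terminate_instances(InstanceIds=[instance_id])
    return {"instance_id": instance_id,
            "state": resp["TerminatingInstances"][0]["CurrentState"]["Name"]}

if __name__ == "__main__":
    mcp.run()
```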

Part 4: How we built an MCP server at Aptible

Not too long ago, we built a prototype using Python, with Steampipe to query AWS APIs and a set of tool definitions that returned things like RDS utilization.

So, the flow looked like:

  • User prompt: "Which RDS instances are under 10% CPU?"

  • Tool call: Get all RDS databases

  • Tool call: Get metrics for each

  • LLM summarizes: "These three are underutilized"

Here’s a quick demo where you can see:

  • A lightweight MCP server was built on top of Steampipe, exposing AWS RDS usage data

  • Tools like getDatabases() and getCPUUsage(db_id) were defined

  • The model (via Cursor) reasoned through the data to find underutilized RDS instances
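
Here’s a hedged reconstruction of those two tools, not the exact implementation from the demo. It assumes Steampipe’s AWS plugin is installed and shells out to `steampipe query`, using the plugin’s aws_rds_db_instance and aws_rds_db_instance_metric_cpu_utilization_daily tables.

```python
import json
import subprocess

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("rds-usage")

def run_steampipe(sql: str) -> list[dict]:
    """Shell out to `steampipe query` and return rows as dicts."""
    out = subprocess.run(
        ["steampipe", "query", sql, "--output", "json"],
        capture_output=True, text=True, check=True,
    )
    data = json.loads(out.stdout)
    return data["rows"] if isinstance(data, dict) else data  # output shape varies by version

@mcp.tool()
def get_databases() -> list[dict]:
    """List RDS instances with their identifier, class, and engine."""
    return run_steampipe(
        "select db_instance_identifier, class, engine from aws_rds_db_instance"
    )

@mcp.tool()
def get_cpu_usage(db_id: str) -> list[dict]:
    """Daily average CPU utilization (%) for one RDS instance over the last week."""
    # Interpolating db_id directly is fine for a local sketch; parameterize if
    # this ever touches untrusted input.
    return run_steampipe(
        "select timestamp, average from aws_rds_db_instance_metric_cpu_utilization_daily "
        f"where db_instance_identifier = '{db_id}' "
        "and timestamp > now() - interval '7 days' order by timestamp"
    )

if __name__ == "__main__":
    mcp.run()
```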



💡 Pro tip: tool names, schemas, and resource descriptions are part of your prompt. You're designing the agent experience through your MCP API surface.
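
For instance (a hypothetical signature; the names and docstring are invented), the signature and docstring below are exactly what the agent reads when deciding whether and how to call the tool:

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("rds-usage")

@mcp.tool()
def get_cpu_usage(db_instance_identifier: str, window_hours: int = 24) -> dict:
    """Return average and peak CPU utilization (percent) for one RDS instance.

    Call get_databases() first to discover identifiers; window_hours controls
    how far back metrics are aggregated. Read-only: makes no changes to infra.
    """
    ...  # implementation elided; the signature and docstring above are the "prompt"
```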

Final thoughts about MCP servers for builders

If you’re building with LLMs:

  • Stop reinventing integration logic—use MCP

  • Treat tools & schemas as part of your prompt engineering surface

  • Start with read-only Resources before letting agents take action

If you’re building infra:

  • MCP lets your stack become agent-readable & agent-operable

  • You can expose logs, metrics, databases, and eventually actions

Used together, prompting and MCP are not just how AI understands your world — they’re how AI acts in it.

As LLMs become more embedded in production workflows, the combination of thoughtful prompting and structured tool access (via MCP servers) will be the foundation for AI-native systems. And if you’re building any kind of AI system that goes beyond chat — anything that touches infra, retrieves state, or executes decisions — MCP is going to play heavily into your process.

🚨 Coming soon: an open source MCP server

Very soon, we’ll be dropping an open source MCP server that:

  • Auto-generates a software catalog of your infra

  • Connects to AWS, GCP, GitHub, Datadog, and more

  • Powers AI agents to understand not just what's running, but why

  • Plugs into any tool that speaks MCP (CLI, IDE, Cursor, Claude, etc.)

And you’ll be able to install it with a single Homebrew command.

Want to be the first to try it?

✉️ Drop your email below to join the early access list, and we’ll let you know as soon as it’s available to use.

Eric Abruzzese

Software Engineer