What Cloud MCP Servers Actually Do
Cloud infrastructure MCP servers act as structured bridges between AI agents and your cloud provider APIs. Instead of an agent trying to construct raw API calls or shell out to the AWS CLI, it talks to an MCP server that exposes discrete, typed tools: list_ec2_instances, describe_gke_cluster, get_pod_logs. The agent gets a clean interface; the MCP server handles authentication, pagination, and error normalization.
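To make "discrete, typed tools" concrete, here is a minimal sketch of what one such tool declaration might look like. The `inputSchema` shape follows MCP's JSON Schema convention for tool inputs, but this exact structure and the `validate_call` helper are illustrative, not taken from any specific server:

```python
# Hypothetical sketch of a typed tool a cloud MCP server might expose.
# The JSON Schema input shape follows MCP conventions; the specifics here
# are illustrative, not a real server's code.
LIST_EC2_INSTANCES = {
    "name": "list_ec2_instances",
    "description": "List EC2 instances in a region; pagination handled server-side.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "region": {"type": "string", "description": "AWS region, e.g. us-east-1"},
            "state": {
                "type": "string",
                "enum": ["pending", "running", "stopped", "terminated"],
                "description": "Optional instance state filter",
            },
        },
        "required": ["region"],
    },
}

def validate_call(tool: dict, args: dict) -> list[str]:
    """Return the required arguments missing from a proposed tool call."""
    schema = tool["inputSchema"]
    return [k for k in schema.get("required", []) if k not in args]
```

The point of the typed schema is that the server can reject a malformed call before any API request is made, so the agent gets a structured error instead of a raw SDK exception.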
This matters because cloud APIs are sprawling and inconsistent. The AWS SDK alone covers 300+ services. MCP servers scope that surface area down to what's actually useful for a given workflow, which also reduces the attack surface for prompt injection and scope creep.
AWS: What's Available and What It Covers
The most widely adopted AWS MCP servers on Skillful.sh currently cover EC2, S3, RDS, Lambda, IAM, CloudWatch, and ECS. A few mature ones, like the aws-mcp-server project with 800+ GitHub stars, support cross-service workflows: you can ask an agent to find all Lambda functions in a region, check their CloudWatch error rates for the past 24 hours, and return the top five by error count.
Supported operations typically fall into three buckets. Read operations include listing resources, describing configurations, fetching logs, and querying CloudWatch metrics. Write operations cover things like starting and stopping EC2 instances, updating Lambda environment variables, and modifying Auto Scaling group sizes. Destructive operations, like terminating instances or deleting S3 objects, are usually gated behind explicit confirmation steps or disabled by default.
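The gating pattern for destructive operations can be sketched in a few lines. The tool names and the `confirmed` flag below are illustrative; real servers implement this in varying ways, but the shape is the same:

```python
# Illustrative sketch of confirmation gating: destructive tools refuse to
# run unless the caller explicitly confirms. Tool names are examples.
DESTRUCTIVE_TOOLS = {"terminate_instance", "delete_s3_object"}

class ConfirmationRequired(Exception):
    pass

def dispatch(tool_name: str, args: dict, handlers: dict, confirmed: bool = False):
    """Route a tool call to its handler, gating destructive tools."""
    if tool_name in DESTRUCTIVE_TOOLS and not confirmed:
        raise ConfirmationRequired(
            f"{tool_name} is destructive; re-invoke with confirmed=True"
        )
    return handlers[tool_name](args)
```

The agent's first attempt at a destructive call fails with a structured error, which typically surfaces to the human operator as an approval prompt before the retry.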
The permission model almost universally relies on IAM roles. You attach a least-privilege IAM policy to the role the MCP server assumes, and that policy defines exactly what the server can touch. A read-only deployment might use ReadOnlyAccess plus specific CloudWatch permissions. A more capable server for incident response might include ec2:StopInstances and lambda:UpdateFunctionConfiguration. The MCP server itself doesn't manage credentials; it inherits them from the execution environment, whether that's an EC2 instance profile, an ECS task role, or a local ~/.aws/credentials file.
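As a sketch of what "least privilege" looks like in practice, here is an example policy document for the incident-response case. The action names are real IAM actions; the resource ARN patterns are placeholders you would scope to your own account (note that `ec2:DescribeInstances` does not support resource-level restrictions, so it must use `*`):

```python
import json

# Example least-privilege policy for the role an incident-response MCP
# server assumes. Action names are real IAM actions; ARNs are placeholders.
INCIDENT_RESPONSE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Describe actions don't support resource-level permissions.
            "Effect": "Allow",
            "Action": ["ec2:DescribeInstances"],
            "Resource": "*",
        },
        {
            "Effect": "Allow",
            "Action": ["ec2:StopInstances"],
            "Resource": "arn:aws:ec2:*:*:instance/*",
        },
        {
            "Effect": "Allow",
            "Action": [
                "lambda:GetFunctionConfiguration",
                "lambda:UpdateFunctionConfiguration",
            ],
            "Resource": "arn:aws:lambda:*:*:function:*",
        },
    ],
}

print(json.dumps(INCIDENT_RESPONSE_POLICY, indent=2))
```

Everything the server can do is enumerated here; adding a capability means adding a statement, which makes permission changes reviewable in the same way as code.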
GCP: A Different Permission Surface
GCP MCP servers are less numerous than AWS ones right now, but the quality is solid. The main ones cover Compute Engine, Cloud Storage, BigQuery, GKE, Cloud Run, and Pub/Sub. GCP's IAM model is more granular than AWS in some ways: roles are applied at the project, folder, or organization level, and predefined roles like roles/compute.viewer map cleanly to read-only MCP server deployments.
One area where GCP MCP servers shine is BigQuery integration. An agent can run parameterized queries, fetch schema information, and return results in structured format, all without the agent needing to know anything about the Jobs API or how to handle async query execution. The MCP server handles polling the job status and returning results when ready. For data engineering workflows, this is genuinely useful.
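The polling the server does on the agent's behalf follows a standard pattern. The `FakeJob` below stands in for a BigQuery job handle so the loop is visible without GCP credentials; a real implementation would call the BigQuery client library instead:

```python
import time

# Sketch of the async-query handling a BigQuery MCP server does internally:
# submit a job, poll until done, return rows. FakeJob simulates a job handle
# so this runs without GCP credentials.
class FakeJob:
    def __init__(self, rows, polls_until_done=2):
        self._rows = rows
        self._polls_left = polls_until_done

    def done(self) -> bool:
        self._polls_left -= 1
        return self._polls_left <= 0

    def rows(self):
        return self._rows

def wait_for_results(job, poll_interval=0.01, timeout=5.0):
    """Poll a job until completion or timeout, then return its rows."""
    deadline = time.monotonic() + timeout
    while not job.done():
        if time.monotonic() > deadline:
            raise TimeoutError("query did not finish in time")
        time.sleep(poll_interval)
    return job.rows()
```

The agent sees only the final structured rows; the submit-poll-fetch cycle, including timeouts, stays inside the server.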
Service account key management is the main operational concern. Most GCP MCP servers support Application Default Credentials, so you can run them on a GCE instance with a service account attached and avoid storing key files entirely. If you're running locally or in a non-GCP environment, you'll need a service account JSON key, which introduces the usual secret management considerations.
Kubernetes: The Most Operationally Dense Integration
Kubernetes MCP servers are where things get interesting from a complexity standpoint. The operations they support map closely to kubectl verbs: get, list, describe, apply, delete, exec, logs, port-forward. Some servers expose higher-level abstractions, like a rollout_restart tool that handles the deployment restart sequence correctly rather than requiring the agent to patch the deployment spec directly.
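What a `rollout_restart` tool does under the hood is small but easy to get wrong by hand: kubectl's own `rollout restart` sets the `kubectl.kubernetes.io/restartedAt` annotation on the pod template, which triggers a fresh rolling update. A sketch of the patch body such a tool would submit:

```python
from datetime import datetime, timezone

# The strategic-merge patch behind a rollout restart: bumping the
# kubectl.kubernetes.io/restartedAt annotation on the pod template forces
# a new rolling update without touching the rest of the spec.
def rollout_restart_patch(now=None) -> dict:
    ts = (now or datetime.now(timezone.utc)).isoformat()
    return {
        "spec": {
            "template": {
                "metadata": {
                    "annotations": {"kubectl.kubernetes.io/restartedAt": ts}
                }
            }
        }
    }
```

Exposing this as one tool means the agent never constructs the patch itself, which removes a whole class of malformed-spec failures.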
The permission model here is RBAC. You create a ServiceAccount, bind it to a ClusterRole or Role with specific resource permissions, and the MCP server runs with that identity. A typical read-only setup grants get, list, and watch on pods, deployments, services, and events. An ops-capable setup adds update on deployments and create on the pods/exec subresource for exec sessions.
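The read-only setup described above, expressed as the manifests a Kubernetes client would submit (shown here as Python dicts; names and namespace are placeholders):

```python
# Read-only RBAC for an MCP server's ServiceAccount. The "mcp-readonly"
# names and "mcp-system" namespace are placeholders.
READ_ONLY_ROLE = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "ClusterRole",
    "metadata": {"name": "mcp-readonly"},
    "rules": [
        {
            "apiGroups": ["", "apps"],  # core group for pods/services/events, apps for deployments
            "resources": ["pods", "deployments", "services", "events"],
            "verbs": ["get", "list", "watch"],
        }
    ],
}

BINDING = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "ClusterRoleBinding",
    "metadata": {"name": "mcp-readonly-binding"},
    "subjects": [
        {"kind": "ServiceAccount", "name": "mcp-server", "namespace": "mcp-system"}
    ],
    "roleRef": {
        "apiGroup": "rbac.authorization.k8s.io",
        "kind": "ClusterRole",
        "name": "mcp-readonly",
    },
}
```

Because the server inherits this identity, expanding its capabilities later is a reviewable RBAC change rather than a configuration flag buried in the server itself.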
One thing worth noting: Kubernetes MCP servers that support exec into pods carry meaningful security implications. If an agent can exec into a pod, it can potentially read secrets mounted as environment variables or access internal network endpoints. Skillful.sh's security scoring flags this specifically; servers with exec support and no additional access controls tend to score in the C-D range on the security rubric.
Real Scenarios Where These Save Time
Incident response is the clearest use case. When something is on fire at 2am, an agent with access to a well-scoped AWS or Kubernetes MCP server can pull CloudWatch logs, describe the affected resources, check recent deployments, and surface the relevant context in under a minute. The engineer still makes the call, but they're not spending 10 minutes running kubectl describe commands and grepping log output.
Cost auditing is another one. An agent connected to an AWS MCP server can enumerate all EC2 instances across regions, filter for ones that have been stopped for more than 30 days, cross-reference with EBS volumes attached to those instances, and produce a report. Doing this manually involves the AWS console, Cost Explorer, and a spreadsheet. With an MCP server, it's a single agent task.
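The filtering step of that audit is simple enough to sketch. The data below is shaped roughly like normalized DescribeInstances output; the field names are simplified, and a real server would produce something like this from the raw SDK response first:

```python
from datetime import datetime, timedelta, timezone

# The cost-audit filter: instances stopped longer than a cutoff, with their
# attached EBS capacity summed. Field names are simplified stand-ins for
# normalized DescribeInstances/DescribeVolumes output.
def stale_stopped_instances(instances, volumes, now=None, days=30):
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    report = []
    for inst in instances:
        if inst["state"] == "stopped" and inst["stopped_at"] < cutoff:
            attached = [v for v in volumes if v["instance_id"] == inst["id"]]
            report.append({
                "instance_id": inst["id"],
                "stopped_at": inst["stopped_at"].isoformat(),
                "ebs_gib": sum(v["size_gib"] for v in attached),
            })
    return report
```

The agent's job is orchestration, enumerating regions and fetching the data; logic like this lives comfortably on either side of the MCP boundary.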
Environment provisioning for staging is a third scenario that comes up frequently. A developer asks an agent to spin up a staging environment matching production configuration. The agent uses the MCP server to read the production ECS task definitions, create equivalent staging services with reduced resource limits, and return the endpoints. What used to take 30-45 minutes of a DevOps engineer's time becomes a self-service operation.
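The "production task definition to staging copy" transform at the heart of that workflow might look like this. The `family`, `cpu`, and `memory` fields match the ECS task-definition shape (cpu and memory are strings), but the halving policy and the `-staging` naming convention are illustrative choices, not a standard:

```python
import copy

# Sketch of deriving a staging task definition from production. Field names
# match ECS's task-definition shape; the scaling and naming policy here are
# illustrative assumptions.
def to_staging(task_def: dict, scale: float = 0.5) -> dict:
    staging = copy.deepcopy(task_def)
    staging["family"] = f"{task_def['family']}-staging"
    # ECS expresses cpu/memory as strings; floor them at Fargate's minimums.
    staging["cpu"] = str(max(256, int(int(task_def["cpu"]) * scale)))
    staging["memory"] = str(max(512, int(int(task_def["memory"]) * scale)))
    return staging
```

For example, `to_staging({"family": "api", "cpu": "1024", "memory": "2048"})` yields a task definition with family `api-staging`, cpu `"512"`, and memory `"1024"`.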
Evaluating Security Before You Deploy
Before connecting any cloud MCP server to a production environment, check a few things. Look at what tools the server exposes and whether destructive operations are disabled or require confirmation. Check whether the server logs tool invocations, because audit trails matter when an agent is making changes to infrastructure. Review the dependency tree for known CVEs; cloud SDK libraries update frequently and older MCP servers sometimes lag behind.
On Skillful.sh, the security scores for cloud infrastructure MCP servers vary considerably. Servers that scope permissions tightly, avoid exec-style operations, and maintain active dependency updates tend to score A or B. Servers with broad wildcard tool exposure or stale dependencies often land in the C-F range. Cross-referencing a server's directory presence count with its security score gives you a reasonable signal on whether adoption is outpacing scrutiny.
The practical advice: start with a read-only deployment, validate the agent's behavior against non-production resources, then incrementally expand permissions as you build confidence in the specific workflows you're automating.
Related Reading
- What the Model Context Protocol Actually Does
- How MCP Servers Differ from Traditional APIs
- MCP vs Function Calling: Understanding the Tradeoffs
- Why Open Source MCP Servers Dominate the Ecosystem
Browse MCP servers and 137,000+ other AI tools on Skillful.sh.