Building Custom MCP Servers: When Off-the-Shelf Integrations Aren't Enough

Learn how to design, build, and deploy custom MCP servers for proprietary tools. A technical guide for AI teams needing integrations beyond standard APIs.

The Padiso Team
16-minute read

Why Custom MCP Servers Matter for Production AI Agents

You've deployed an AI agent team. It's running 24/7 on Padiso's agent orchestration platform, handling tasks your engineering team used to do manually. But then you hit a wall: your proprietary internal tool, the one that powers your core workflow, doesn't have an off-the-shelf integration. The vendor hasn't built an API wrapper. The SaaS platform you're using isn't in any standard integration library.

This is where building a custom MCP server becomes essential.

The Model Context Protocol (MCP) is the standardized way AI agents communicate with external tools and data sources. Rather than agents calling APIs directly with brittle, one-off code, MCP servers act as a translation layer. They define a contract: here's what tools your agent can use, here's the schema for inputs and outputs, here's how to handle errors and timeouts.

When you're running a headless company (one where AI agents handle operations, sourcing, portfolio management, or internal automation), you can't afford to wait for vendors to build integrations. You need to build them yourself. And you need to build them right: secure, testable, monitored, and maintainable.

This guide walks you through designing, shipping, and securing a custom MCP server for the tools that matter to your business. We'll cover the fundamentals, show you real patterns, and explain how to integrate it with Padiso's platform so your agents can use it in production.

Understanding the MCP Architecture

Before writing code, you need to understand what an MCP server actually is and how it fits into your agent infrastructure.

An MCP server is a lightweight process that exposes a set of tools and resources to an AI model through a standardized protocol. Think of it as a reverse API: instead of your application calling an external service, the agent's MCP client calls your server, and your server returns structured data or performs actions on the agent's behalf.

The architecture has three key layers:

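The Protocol Layer: This is the communication mechanism between the agent's MCP client and your MCP server. Messages are JSON-RPC 2.0, carried over stdio for local servers or HTTP for remote ones, depending on your deployment model. A connection starts with an initialization handshake in which client and server negotiate capabilities, but each tool call still carries everything the server needs to fulfill it: the tool name and its arguments.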

The Tool Layer: Tools are the discrete actions your MCP server exposes. A tool has a name, a description, input parameters (defined as a JSON schema), and a handler function that executes when the agent calls it. If your proprietary system has a database of customer records, you might define a tool called query_customer_by_id that accepts a customer ID and returns relevant fields.

The Resource Layer: Resources are read-only data that your MCP server can expose. Unlike tools, which perform actions, resources provide context. You might expose a resource that lists all available customer segments, which the agent can read to understand what queries are possible.
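To make the protocol and tool layers concrete, here is a rough sketch of what a single tool call looks like on the wire. The `tools/call` method and the `content` result shape follow the MCP specification; the tool name, arguments, and payload values are illustrative only.

```python
import json

# A JSON-RPC 2.0 request that an MCP client sends to invoke a tool.
# The tool name and arguments here are illustrative.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "query_customer_by_id",
        "arguments": {"customer_id": "cust_123"},
    },
}

# A successful result: tool output comes back as a list of content items.
response = {
    "jsonrpc": "2.0",
    "id": 1,  # echoes the request id so the client can match the two
    "result": {
        "content": [{"type": "text", "text": json.dumps({"name": "Alice"})}],
        "isError": False,
    },
}

print(json.dumps(request, indent=2))
```

In practice a framework builds and parses these messages for you; seeing the shape mainly helps when you are reading logs or debugging a misbehaving client.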

According to the official Model Context Protocol documentation from Anthropic, servers are designed to be easy to build and highly composable. This simplicity is intentional: each server focuses on a narrow set of capabilities, so you can build a custom MCP server without taking on complex state management or distributed systems concerns.

The key insight is that MCP servers are not monolithic. You can run multiple MCP servers, each handling a specific domain. One server talks to your billing system, another to your internal wiki, another to your proprietary ML model. Your agent team can use all of them simultaneously, and Padiso's orchestration layer manages the connections, routing, and lifecycle.

When to Build vs. When to Use Off-the-Shelf Integrations

Not every tool needs a custom MCP server. Before you commit to building one, you should understand when custom development is actually justified.

Use off-the-shelf integrations when:

  • The tool has an official API and a community-maintained MCP server already exists
  • The integration requirements are simple (read-only access to public data)
  • The tool is a commodity service (Slack, GitHub, Jira, etc.) where the integration is stable and unlikely to change
  • You don't need real-time bidirectional communication

Build a custom MCP server when:

  • Your tool is proprietary or internal-only (no public API)
  • You need to expose custom business logic that doesn't map cleanly to existing APIs
  • The off-the-shelf integration doesn't support the specific operations your agents need
  • You need to add authentication, rate limiting, or caching logic specific to your infrastructure
  • You're integrating multiple internal systems and want a unified interface for your agents
  • You need to implement security policies (field-level access control, audit logging) that the generic integration doesn't support

The decision ultimately comes down to this: if your agent team needs to operate autonomously, and they need access to a tool that doesn't have a standard integration, you build the MCP server. The cost of building is lower than the cost of not automating.

Designing Your MCP Server: Schema and Tool Definition

Good MCP server design starts with clear, well-defined tools. This is not the time to be vague.

Every tool you expose needs a schema. The schema defines what inputs the agent can pass and what outputs it will receive. This schema is not just documentation; it's a contract that your MCP server enforces, and it's what allows the AI model to use your tool correctly.

Let's say you're building an MCP server for an internal customer management system. You might define a tool like this:

Tool Name: get_customer_details
Description: Retrieve detailed information about a customer by ID. Includes contact info, account status, and recent transaction history.
Input Schema:
  - customer_id (string, required): The unique identifier for the customer
  - include_transactions (boolean, optional, default=false): Whether to include transaction history
Output Schema:
  - customer_id (string)
  - name (string)
  - email (string)
  - status (enum: active, inactive, suspended)
  - created_at (ISO 8601 timestamp)
  - transactions (array of objects, only if include_transactions=true):
    - transaction_id (string)
    - amount (number)
    - timestamp (ISO 8601 timestamp)
Error Cases:
  - 404: Customer not found
  - 403: Insufficient permissions
  - 500: Database error

Notice the specificity. The agent knows exactly what to expect. It knows that customer_id is required. It knows that status is an enum, not a free-form string. It knows what error codes are possible.

This specificity matters because it allows the AI model to reason about what the tool can and cannot do. If the agent needs to find a customer by email address, and your schema doesn't include an email parameter, the agent can see that the capability is missing and won't waste tokens trying.
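As a sketch, the input side of the get_customer_details definition could be written as JSON Schema and enforced before the handler runs. The schema mirrors the definition above; `validate_arguments` is a hypothetical hand-rolled helper that covers only the schema features used here, and a real server would more likely rely on its framework or a schema validation library.

```python
# Input schema for get_customer_details, expressed as JSON Schema.
GET_CUSTOMER_DETAILS_INPUT = {
    "type": "object",
    "properties": {
        "customer_id": {
            "type": "string",
            "description": "The unique identifier for the customer",
        },
        "include_transactions": {"type": "boolean", "default": False},
    },
    "required": ["customer_id"],
    "additionalProperties": False,
}

# Maps the JSON Schema type names used above to Python types.
_JSON_TYPES = {"string": str, "boolean": bool}


def validate_arguments(schema: dict, arguments: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments are valid.

    Hypothetical helper: handles only required fields, unknown fields,
    and the two primitive types used in the schema above.
    """
    problems = []
    for name in schema.get("required", []):
        if name not in arguments:
            problems.append(f"missing required field: {name}")
    for name, value in arguments.items():
        spec = schema["properties"].get(name)
        if spec is None:
            problems.append(f"unknown field: {name}")
        elif not isinstance(value, _JSON_TYPES[spec["type"]]):
            problems.append(f"{name} must be of type {spec['type']}")
    return problems


print(validate_arguments(GET_CUSTOMER_DETAILS_INPUT, {"email": "a@example.com"}))
```

Rejecting unknown fields is what tells an agent quickly that lookup by email is not supported, rather than letting the call silently ignore the argument.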

According to best practices outlined by Snyk's guide to MCP server development, tool naming should be consistent, descriptive, and follow a predictable pattern. Use underscores for multi-word tool names. Avoid generic names like query or get; be specific about what the tool does.

When designing tools, also think about idempotency and side effects. Some tools should be safe to call multiple times (reading data). Others have side effects (creating records, sending messages). Document this clearly. If a tool creates a record and returns an ID, make sure calling it twice with the same inputs doesn't create two records.

Building Your MCP Server: From Framework Selection to First Deploy

Once you've defined your tools, it's time to build. The good news: you don't need to build the MCP protocol handling from scratch.

Several frameworks handle the protocol layer for you. The most mature options are:

FastMCP (Python): A lightweight framework with a decorator-based API reminiscent of FastAPI. You define tools as Python functions and decorate them with @server.tool(), where server is your FastMCP instance. The framework handles all protocol details. As detailed in the Clarifai guide to building custom MCP servers, FastMCP is ideal for rapid development and works well with existing Python infrastructure.

TypeScript/Node.js SDK: Anthropic provides an official SDK for building MCP servers in JavaScript/TypeScript. If your team already uses Node.js and TypeScript, this is the natural choice.

Go: For high-performance, low-latency servers, Go is a solid option. It compiles to a single binary and has minimal runtime overhead.

The choice depends on your team's language preferences and existing infrastructure. For most teams, Python with FastMCP is the fastest path to production.

Here's a minimal example of an MCP server in Python:

import os

import httpx
from fastmcp import FastMCP

server = FastMCP("customer-service")


@server.tool()
async def get_customer(customer_id: str) -> dict:
    """Retrieve customer details by ID."""
    # Read the backend credential from the environment; this fails fast
    # with a KeyError if API_KEY is not set.
    api_key = os.environ["API_KEY"]
    async with httpx.AsyncClient() as client:
        response = await client.get(
            f"https://internal-api.company.com/customers/{customer_id}",
            headers={"Authorization": f"Bearer {api_key}"},
        )
        # Raise httpx.HTTPStatusError on 4xx and 5xx responses.
        response.raise_for_status()
        return response.json()


if __name__ == "__main__":
    server.run()

This server exposes one tool: get_customer. When an AI agent calls this tool, the MCP protocol handler receives the request, the function executes, and the result is returned to the agent.

The real work, of course, is in the implementation. Your function needs to:

  1. Validate inputs: Check that customer_id is in the right format before calling the backend API
  2. Handle authentication: Use API keys, OAuth tokens, or mutual TLS to authenticate with your internal service
  3. Handle errors gracefully: If the customer doesn't exist, return a clear error. If the API is down, retry or return a timeout error
  4. Transform data: Your internal API might return fields your agent doesn't need. Transform the response to match your tool's output schema
  5. Log and monitor: Every call should be logged so you can debug issues and track usage
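Steps 1, 3, and 4 above can be sketched without any framework code. Everything named here is an assumption for illustration: the ID format in `CUSTOMER_ID_PATTERN`, the `ToolError` exception, and the field list in `OUTPUT_FIELDS` (taken from the get_customer_details output schema earlier in this guide).

```python
import re

# Assumed ID format; adjust to whatever your backend actually issues.
CUSTOMER_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")

# Fields the tool promises in its output schema.
OUTPUT_FIELDS = ("customer_id", "name", "email", "status", "created_at")


class ToolError(Exception):
    """Error whose message is safe to show to the agent (hypothetical)."""


def validate_customer_id(customer_id: str) -> str:
    """Reject malformed IDs before they are interpolated into a backend URL."""
    if not CUSTOMER_ID_PATTERN.fullmatch(customer_id):
        raise ToolError("customer_id must be 1-64 letters, digits, '_' or '-'")
    return customer_id


def to_tool_output(backend_record: dict) -> dict:
    """Keep only the fields named in the output schema; drop everything else."""
    return {field: backend_record.get(field) for field in OUTPUT_FIELDS}


record = {"customer_id": "cust_1", "name": "Alice", "internal_notes": "VIP"}
print(to_tool_output(record))
```

Validating the ID also closes off path tricks such as `../admin`, which matters when the value ends up inside a URL as it does in the example server above.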

Once your server is working locally, you need to deploy it. For Padiso, you have several options:

Docker Container: Package your MCP server as a Docker image. This is the most portable approach and works with any deployment infrastructure.

Standalone Binary: Compile your server to a binary and run it as a systemd service or in a container orchestrator like Kubernetes.

Serverless Function: For simple, low-traffic MCP servers, you can deploy as a Lambda or Cloud Function. The cold start latency might be noticeable, but it's cost-effective.

The Docker best practices guide emphasizes containerization as the standard approach: it ensures consistency across environments, makes versioning explicit, and simplifies deployment to Padiso's orchestration platform.

Security: Authentication, Authorization, and Data Protection

Security is not an afterthought when building MCP servers. Your server is a direct line into your internal systems. If it's compromised, your agents can compromise your business.

Authentication: Your MCP server must authenticate to your internal services. Use API keys, OAuth tokens, or mutual TLS. Never hardcode credentials in your code; use environment variables or a secrets management system.

When your agent calls your MCP server, the server itself authenticates to your backend. The agent doesn't see the credentials. This is important: it means you can rotate credentials without redeploying agents.

Authorization: Just because an agent can call your MCP server doesn't mean it should be able to read all customer data. Implement authorization checks:

  • Can this agent read customer data? (Yes, if it's the support agent; no, if it's the marketing agent)
  • Can this agent write to the database? (Only if it's the provisioning agent)
  • Can this agent access sensitive fields like credit card numbers? (Probably not)

Authorization is often context-specific. If your agent is handling a support request for customer X, it should only be able to access customer X's data, not all customers.
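A minimal sketch of both checks, tool-level and context-level, might look like the following. The role names, the `TOOL_PERMISSIONS` table, and the `authorize` function are hypothetical; how the agent's role and current ticket reach your server (for example, as verified token claims) depends on your deployment and is not shown.

```python
# Hypothetical policy: which agent roles may call which tools.
TOOL_PERMISSIONS = {
    "support": {"get_customer_details"},
    "provisioning": {"get_customer_details", "update_customer"},
    "marketing": set(),
}


def authorize(agent_role: str, tool_name: str,
              requested_customer_id: str, ticket_customer_id: str) -> None:
    """Raise PermissionError unless this role may call this tool for this customer."""
    # Tool-level check: is the role allowed to use the tool at all?
    if tool_name not in TOOL_PERMISSIONS.get(agent_role, set()):
        raise PermissionError(f"role {agent_role!r} may not call {tool_name}")
    # Context-level check: only the customer tied to the current request.
    if requested_customer_id != ticket_customer_id:
        raise PermissionError("agent may only access the customer on its current ticket")


authorize("support", "get_customer_details", "cust_1", "cust_1")  # allowed, returns None
```

Unknown roles fall through to an empty permission set, so the default is to deny.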

Data Protection: Your MCP server is a conduit for data. Ensure that:

  • Data in transit is encrypted (use HTTPS)
  • Data at rest is encrypted (in your backend database)
  • Sensitive data (passwords, API keys, PII) is not logged
  • Access is audited (log every read and write)
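The last two points pull in opposite directions: you want every call logged, but not the sensitive values inside it. One way to reconcile them is to redact before logging, as in this sketch. The field names in `SENSITIVE_KEYS` and the `audit` helper are assumptions; your own list will depend on what your backend returns.

```python
import logging

# Assumed names of fields that must never appear in logs.
SENSITIVE_KEYS = {"password", "api_key", "email", "card_number"}

logger = logging.getLogger("mcp.audit")


def redact(value):
    """Return a copy of value with sensitive fields masked, recursing into nested data."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if key.lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


def audit(agent_id: str, tool_name: str, arguments: dict) -> None:
    """Log a tool call with sensitive argument values masked."""
    logger.info("tool_call agent=%s tool=%s args=%s", agent_id, tool_name, redact(arguments))


print(redact({"name": "Alice", "email": "alice@example.com"}))
```

Key-based redaction only catches fields you have named; free-text fields that may contain PII need separate handling.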

According to Snyk's best practices, you should also keep third-party dependencies free of known vulnerabilities. Run pip-audit or npm audit regularly. Pin dependency versions and review updates before deploying.

On the Padiso security page, you'll find details on how the platform itself handles security. Your custom MCP server inherits some of that security (network isolation, encrypted communication channels), but you're responsible for the security of your own code and integrations.

Testing and Monitoring Your MCP Server

You wouldn't deploy an API without tests. Don't deploy an MCP server without them either.

Unit Tests: Test each tool function in isolation. Mock the backend API. Verify that inputs are validated, outputs match the schema, and errors are handled correctly.

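import pytest
from unittest.mock import AsyncMock, MagicMock, patch

# Note: if your FastMCP version wraps decorated functions in a tool object,
# import and test the underlying undecorated function instead.
from your_server import get_customer


def fake_response(payload=None, error=None):
    # httpx.Response methods are synchronous, so a plain MagicMock is enough.
    response = MagicMock()
    response.json.return_value = payload
    response.raise_for_status.side_effect = error
    return response


@pytest.mark.asyncio
async def test_get_customer_success(monkeypatch):
    monkeypatch.setenv("API_KEY", "test-key")
    payload = {"id": "123", "name": "Alice", "email": "alice@example.com"}
    # AsyncClient.get is awaited, so it is replaced with an AsyncMock.
    mock_get = AsyncMock(return_value=fake_response(payload))
    with patch("httpx.AsyncClient.get", new=mock_get):
        result = await get_customer("123")
    assert result["name"] == "Alice"


@pytest.mark.asyncio
async def test_get_customer_not_found(monkeypatch):
    monkeypatch.setenv("API_KEY", "test-key")
    mock_get = AsyncMock(return_value=fake_response(error=Exception("Not found")))
    with patch("httpx.AsyncClient.get", new=mock_get):
        with pytest.raises(Exception, match="Not found"):
            await get_customer("nonexistent")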

Integration Tests: Test the full MCP server, including the protocol layer. Call it as if you were an AI agent. Verify that responses match the schema.

Load Tests: Your MCP server will be called by agents running 24/7. Test it under load. How many concurrent requests can it handle? What's the latency at the 99th percentile?

Monitoring: Once deployed, monitor your MCP server continuously:

  • Request rate: How many times per day is each tool called?
  • Error rate: What percentage of requests fail? Why?
  • Latency: How long does each tool take to respond?
  • Dependency health: Is your backend API up? Is the database responding?

Set up alerts. If error rate spikes, you want to know immediately. If latency increases, you want to investigate. If a tool hasn't been called in a week, you want to know if it's dead code.
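Before wiring up a full metrics stack, the request rate, error rate, and latency figures listed above can be tracked with a small in-process recorder. `ToolMetrics` is a hypothetical class written for this guide, and it uses the nearest-rank method for percentiles; a production setup would normally export these numbers to a system such as Prometheus instead of holding them in memory.

```python
import math
from collections import defaultdict


class ToolMetrics:
    """Per-tool call counts, error counts, and latency samples (in-memory sketch)."""

    def __init__(self):
        self.calls = defaultdict(int)
        self.errors = defaultdict(int)
        self.latencies = defaultdict(list)

    def record(self, tool: str, latency: float, ok: bool) -> None:
        self.calls[tool] += 1
        if not ok:
            self.errors[tool] += 1
        self.latencies[tool].append(latency)

    def error_rate(self, tool: str) -> float:
        """Fraction of recorded calls that failed; 0.0 if there were no calls."""
        total = self.calls[tool]
        return self.errors[tool] / total if total else 0.0

    def percentile(self, tool: str, pct: int) -> float:
        """Nearest-rank percentile of the recorded latencies."""
        samples = sorted(self.latencies[tool])
        if not samples:
            return 0.0
        rank = max(1, math.ceil(pct * len(samples) / 100))
        return samples[rank - 1]


metrics = ToolMetrics()
metrics.record("get_customer", 0.120, ok=True)
metrics.record("get_customer", 0.450, ok=False)
print(metrics.error_rate("get_customer"), metrics.percentile("get_customer", 99))
```

An alert rule is then a comparison against these values, for example paging when `error_rate` for any tool exceeds a threshold you choose.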

Tools like Prometheus and Grafana work well for monitoring. Log all requests and responses (with PII redacted) to a centralized logging system like ELK or Datadog.

On Padiso's blog, you'll find posts on monitoring agent behavior and debugging integration issues. Many of those principles apply to MCP servers as well.

Integrating Your Custom MCP Server with Padiso

Once your MCP server is built, tested, and deployed, you need to connect it to your agent team.

Padiso's integration system supports custom MCP servers. The process is straightforward:

  1. Register the server: Provide the endpoint (HTTP URL or stdio command) and any authentication credentials
  2. Discover tools: Padiso automatically discovers all tools your MCP server exposes
  3. Assign to agents: Specify which agents can use which tools
  4. Deploy: Your agents now have access to your custom integration

The beauty of this approach is that your agents don't need to know about the MCP protocol. From their perspective, they just have a new tool available. They call it, get a result, and move on.

When you need to update your MCP server (fix a bug, add a new tool, change authentication), you can do it without redeploying your agents. The Padiso documentation walks through the integration process step by step.

Real-World Patterns: Common Use Cases for Custom MCP Servers

Let's look at some concrete examples where teams are building custom MCP servers.

Internal Database Access: A venture capital firm is running agents that research portfolio companies. The agents need to query the firm's internal database of company information, cap tables, and investment history. They build an MCP server that exposes tools like search_companies, get_cap_table, and list_investments. The server authenticates to the database, enforces access control (only certain agents can see certain deals), and logs all queries for audit purposes.

Proprietary ML Model: A fintech company has a custom credit scoring model that runs internally. They build an MCP server that exposes a single tool: score_applicant. The agent passes in applicant data, the server runs the model, and returns a score. The server handles versioning (the model is updated weekly), caching (avoid rescoring the same applicant twice), and fallback behavior (if the model is down, return a default score).

Multi-System Orchestration: A private equity firm's portfolio companies use different systems. Company A uses Salesforce, Company B uses a custom CRM, Company C uses a spreadsheet. Instead of building three separate integrations, the firm builds one MCP server that abstracts away the differences. Tools like get_customer_record and update_customer_record work the same way regardless of the backend system. The MCP server handles routing and transformation.

Compliance and Audit: A regulated financial services company needs to ensure that every agent action is auditable. They build an MCP server that wraps their core business logic. Every tool call is logged with context (which agent, which customer, which data was accessed, what changed). This creates an audit trail that satisfies compliance requirements.

These patterns have something in common: they're solving problems that can't be solved with off-the-shelf integrations. They're leveraging custom MCP servers to give their agent teams superpowers.

Deployment Strategies: From Development to Production

Deploying an MCP server to production requires careful planning. You can't just push code and hope it works.

Development Environment: Start locally. Run your MCP server on your machine, connect it to your local test database, and test with mock agents. Use the Builder.io tutorial on MCP servers as a reference for setting up a local development environment.

Staging Environment: Before production, deploy to a staging environment that mirrors production. Use real data (or realistic test data). Run your full agent suite against staging. Verify that everything works end-to-end.

Production Deployment: When deploying to production, consider:

  • High availability: Run multiple instances of your MCP server behind a load balancer. If one instance goes down, traffic routes to the others.
  • Graceful shutdown: When you deploy a new version, finish processing existing requests before shutting down the old instance.
  • Versioning: Include a version number in your MCP server. If you need to roll back, you know exactly which version you're rolling back to.
  • Canary deployments: Deploy the new version to a small percentage of traffic first. Monitor for errors. If everything looks good, roll out to 100%.
  • Secrets management: Store API keys, database credentials, and other secrets in a secrets manager (AWS Secrets Manager, HashiCorp Vault, etc.). Never commit secrets to version control.

On Padiso's pricing page, you'll see that the platform supports unlimited integrations and MCP servers. This means you can deploy as many custom servers as you need without hitting limits.

Troubleshooting Common Issues

Even with careful planning, things go wrong. Here are common issues and how to fix them.

Issue: Agent can't find the tool

Cause: The MCP server didn't register the tool correctly, or Padiso didn't discover it.

Fix: Check that your tool is decorated with the @server.tool() decorator. Verify that the tool name follows the naming convention (lowercase, underscores). Restart the MCP server and re-sync with Padiso.

Issue: Tool returns an error that the agent doesn't understand

Cause: The error message is unclear or the error code isn't documented.

Fix: Review your error handling. Return structured error responses with a clear error_code and error_message. Document what each error code means. Test error scenarios explicitly.
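A structured error can be as simple as a dictionary with stable keys. The sketch below assumes the `error_code` and `error_message` fields named in the fix above; the `ok` and `retryable` fields, the code `CUSTOMER_NOT_FOUND`, and both function names are hypothetical additions for illustration.

```python
def error_response(error_code: str, error_message: str, retryable: bool = False) -> dict:
    """Build an error payload with stable, documented keys."""
    return {
        "ok": False,
        "error_code": error_code,
        "error_message": error_message,
        "retryable": retryable,  # hints whether the agent should try again
    }


def lookup_customer(customer_id: str, customers: dict) -> dict:
    """Return the customer record, or a structured error if it does not exist."""
    record = customers.get(customer_id)
    if record is None:
        return error_response("CUSTOMER_NOT_FOUND", f"No customer with ID {customer_id!r}")
    return {"ok": True, "customer": record}


print(lookup_customer("cust_9", {"cust_1": {"name": "Alice"}}))
```

A machine-readable code plus a plain-language message gives the agent something to branch on and something to report back to a human.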

Issue: Tool is slow

Cause: The backend API is slow, or your MCP server is doing unnecessary work.

Fix: Profile your code. Use a tool like py-spy to see where time is spent. Add caching for expensive operations. Consider async/await to handle concurrent requests. If the backend API is slow, consider implementing a queue or batch processing.

Issue: Tool returns inconsistent results

Cause: Your backend data is changing, or your tool isn't deterministic.

Fix: Ensure that your tool is deterministic (same input always produces the same output). If your backend data is changing, that's expected; document it. Consider adding a timestamp to responses so agents know how fresh the data is.

Advanced Topics: Caching, Rate Limiting, and Optimization

Once your MCP server is working, you can optimize it for production scale.

Caching: If your backend API is expensive to call, cache results. Use Redis or an in-memory cache. Be careful about cache invalidation: stale data can cause agents to make wrong decisions. Set reasonable TTLs (time-to-live) for cached data.
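For a single-process server, an in-memory cache with a TTL can be sketched in a few lines. `TTLCache` is a hypothetical class, not a library API; the injectable `clock` parameter exists only so the expiry logic can be tested without sleeping.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds (sketch)."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key):
        """Return the cached value, or None if it is missing or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # drop the stale entry
            return None
        return value

    def set(self, key, value) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)


cache = TTLCache(ttl_seconds=30)
cache.set("cust_1", {"name": "Alice"})
print(cache.get("cust_1"))
```

Because `get` uses None to signal a miss, this sketch cannot cache a legitimate None result. If you run several server instances, a shared store such as Redis avoids each instance holding a different view of the data.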

Rate Limiting: If your backend API has rate limits, implement rate limiting in your MCP server. Queue excess requests and retry them later. Log when rate limits are hit so you know if you need to upgrade your backend service.
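A token bucket is one common way to implement this. The sketch below is an assumption about how you might do it rather than a prescribed design: `TokenBucket` is a hypothetical class, and it only answers whether a call may proceed now, leaving queuing and retry to the caller.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate_per_second` (sketch)."""

    def __init__(self, rate_per_second: float, capacity: int, clock=time.monotonic):
        self._rate = rate_per_second
        self._capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)  # start with a full bucket
        self._updated = clock()

    def try_acquire(self) -> bool:
        """Consume one token if available; return False if the caller must wait."""
        now = self._clock()
        # Refill in proportion to the time elapsed since the last call.
        elapsed = now - self._updated
        self._tokens = min(self._capacity, self._tokens + elapsed * self._rate)
        self._updated = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


bucket = TokenBucket(rate_per_second=5, capacity=10)
print(bucket.try_acquire())
```

Set the rate slightly below the backend's published limit so that clock drift and other clients do not push you over it.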

Batching: If your tool is called with many different inputs, consider batching. Instead of calling the backend API 100 times for 100 different customer IDs, batch them into 10 requests with 10 IDs each.
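The arithmetic in that example (100 IDs, 10 requests of 10) comes down to splitting a list into fixed-size chunks. `chunked` below is a hypothetical helper; it assumes your backend offers some endpoint that accepts several IDs at once, which is not shown.

```python
def chunked(items: list, size: int) -> list[list]:
    """Split items into consecutive batches of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


customer_ids = [f"cust_{n}" for n in range(100)]
batches = chunked(customer_ids, 10)
print(len(batches), len(batches[0]))
```

The last batch may be shorter than the rest when the total is not a multiple of the batch size.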

Async/Await: Use async/await to handle concurrent requests efficiently. If your tool makes HTTP calls, use an async HTTP client like httpx or aiohttp. This allows your server to handle multiple agent requests simultaneously without blocking.
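As a sketch of the pattern, the following runs several lookups concurrently while capping how many are in flight at once. `fetch_customer` is a stand-in that sleeps instead of making a real HTTP call, and `fetch_many` and its `max_concurrency` parameter are hypothetical names.

```python
import asyncio


async def fetch_customer(customer_id: str) -> dict:
    """Stand-in for an async HTTP call to the backend."""
    await asyncio.sleep(0.01)
    return {"customer_id": customer_id}


async def fetch_many(customer_ids: list[str], max_concurrency: int = 5) -> list[dict]:
    """Fetch all customers concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(customer_id: str) -> dict:
        async with semaphore:
            return await fetch_customer(customer_id)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(bounded(cid) for cid in customer_ids))


results = asyncio.run(fetch_many(["a", "b", "c"]))
print(results)
```

The semaphore keeps a burst of agent requests from turning into an equally large burst against your backend.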

Connection Pooling: If your tool connects to a database, use connection pooling. Don't create a new database connection for every request; reuse connections from a pool.

These optimizations matter when your agents are calling your MCP server thousands of times per day. They're the difference between a server that can handle load and one that falls over.

The Future: Scaling Agent Teams with Custom Integrations

As you scale your agent team, you'll likely need more custom integrations. Your first MCP server handles customer data. Your second handles internal systems. Your third handles external APIs that don't have standard integrations.

The question is: how do you manage this complexity?

This is where Padiso's orchestration capabilities shine. Instead of each agent team managing its own integrations, Padiso provides a central hub. You define integrations once, and all your agents can use them. You monitor integrations centrally. You update integrations in one place.

For founders building headless companies (companies where AI agents handle operations), custom MCP servers are essential infrastructure. They're how you give agents access to proprietary business logic. They're how you scale without hiring.

For engineering leaders deploying production agents, custom MCP servers are a way to solve integration problems that off-the-shelf solutions can't handle. They're a way to move fast without sacrificing control or security.

The technical bar for building an MCP server is low. The business value is high. If you're running agents in production, and you have proprietary systems that agents need to access, building a custom MCP server is not optional; it's essential.

Getting Started: Your First Custom MCP Server

If you're ready to build, here's where to start:

  1. Identify a tool your agents need: What internal system do your agents need to access? What operation do they need to perform?

  2. Define the schema: What are the inputs? What are the outputs? What errors are possible?

  3. Choose a framework: Python with FastMCP is a good default. If your team prefers another language, use that.

  4. Build and test locally: Write the code. Write tests. Get it working on your machine.

  5. Deploy to staging: Package as a Docker container. Deploy to a staging environment. Test end-to-end with your agents.

  6. Deploy to production: Once staging is solid, deploy to production. Monitor closely for the first few days.

  7. Integrate with Padiso: Connect your MCP server to Padiso. Assign it to agents. Watch it work.

The Padiso documentation has detailed guides for each step. The Padiso team is available to help if you get stuck.

Building custom MCP servers is how modern teams give their agents superpowers. It's how they scale operations without scaling headcount. It's how they compete in a world where AI is increasingly central to business.