Building a Hosted MCP Server for an SMB CRM (the Same Week Salesforce Shipped Headless 360)
How OneHub360 ships an agent-accessible MCP server alongside an existing REST API. The architecture, the security model, the gotchas, and the work that lives outside the protocol itself.
On April 23, Salesforce announced Headless 360, their public push to make Sales Cloud, Service Cloud, and Marketing Cloud agent-accessible over MCP. Their press release walked through the protocol, the auth model, the tenant isolation story. Big launch. Coordinated with the Anthropic Dev Day push.
That same week, I shipped MCP for OneHub360. I am one person. The repo is 110 lines for the route, ~430 for the tool registry. There are 28 tools live: 19 read, 8 write, 1 destructive. It works in Claude Desktop, Claude Code, ChatGPT's MCP client, and anywhere else that speaks the protocol.
I want to walk through exactly what it took, because the architecture is small enough to fit in your head, and the gotchas are not in any docs. If you run a SaaS and you have not started thinking about agent access, you are about to be late.
Why bother for SMB SaaS
The pitch for MCP at the enterprise tier is obvious: Fortune 500 buys CRM, Fortune 500 buys agents, agents need to talk to CRM. Salesforce had to ship this or watch the market route around them.
The pitch for SMB SaaS is less obvious and more important. My customers are plumbers, accountants, real estate agents, small agencies. They are not building agent platforms. But every one of them already lives inside Claude or ChatGPT for an hour a day, drafting emails, summarizing calls, asking questions about their business. The friction is that the chatbot has no idea who their customers are.
The unlock is not "agents replace your CRM." It is "the LLM you already use can answer questions about your pipeline without you copy-pasting CSVs into the chat." That is a 10-minute install for the user and an enormous retention multiplier for me. The bar to ship MCP is much lower than the bar to ship a great UI for any single workflow.
The architecture, in one route
The whole hosted MCP server lives in app/api/mcp/route.ts. Stateless. No session storage. No long-lived process. Each customer request authenticates fresh, builds a fresh server instance, and dies.
The shape:
```
POST /api/mcp
Authorization: Bearer nhk_<their_key>

route.ts:
  1. validateApiKey() -> tenantId, businessId
  2. tenant.plan -> planTier -> 'starter' | 'growth' | 'pro'
  3. new McpServer({...}) -> fresh per request
  4. registerAllTools(server, { apiKey, baseUrl, defaultBusinessId, planTier })
  5. WebStandardStreamableHTTPServerTransport (stateless, JSON response)
  6. transport.handleRequest(request)
```

The transport is the new WebStandard variant from @modelcontextprotocol/sdk 1.29.0. It speaks the 2025-03-26 protocol revision (the one that introduced Streamable HTTP) and accepts a Web standard `Request` directly, which means I do not have to bridge Next.js's `NextRequest` through any custom adapter. `sessionIdGenerator: undefined` plus `enableJsonResponse: true` gives me stateless JSON-over-HTTP, which is the only sane mode for a serverless route handler.
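Fleshed out, the handler is close to this. A sketch, not the verbatim file: the transport's import path is my guess at the SDK layout, and `validateApiKey`/`getPlanTier` are this codebase's own helpers.

```ts
// app/api/mcp/route.ts -- minimal sketch of the stateless handler
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js'
// Import path assumed; the WebStandard transport is as described above.
import { WebStandardStreamableHTTPServerTransport } from '@modelcontextprotocol/sdk/server/webStandardStreamableHttp.js'
import { validateApiKey, getPlanTier } from '@/lib/api-auth' // app helpers
import { registerAllTools } from '@/lib/mcp-tools'

export async function POST(request: Request): Promise<Response> {
  // Steps 1-2: authenticate fresh on every request; no session state anywhere
  const { businessId, apiKey, tenant } = await validateApiKey(request)
  const planTier = getPlanTier(tenant) // 'starter' | 'growth' | 'pro'

  // Steps 3-4: a fresh server per request, registering only plan-allowed tools
  const server = new McpServer({ name: 'onehub360', version: '1.0.0' })
  registerAllTools(server, {
    apiKey,
    baseUrl: process.env.BASE_URL!,
    defaultBusinessId: businessId,
    planTier,
  })

  // Steps 5-6: stateless JSON-over-HTTP; everything dies after the response
  const transport = new WebStandardStreamableHTTPServerTransport({
    sessionIdGenerator: undefined,
    enableJsonResponse: true,
  })
  await server.connect(transport)
  return transport.handleRequest(request)
}
```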
Tool handlers do not call my database directly. They loop back to my own REST endpoints at /api/v1/contacts, /api/v1/deals, /api/v1/tasks with the same Bearer token plus an x-mcp-tool header so the audit log can attribute the call. More on why that is the wrong choice later.
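The loopback is nothing fancy; every handler funnels through one small fetch wrapper. A sketch (the helper name is mine; the headers and context fields match the registration context above):

```ts
// Hypothetical helper the tool handlers use for the REST loopback.
async function callRest(
  ctx: { apiKey: string; baseUrl: string },
  toolName: string,
  path: string,
  init: RequestInit = {}
): Promise<Response> {
  return fetch(`${ctx.baseUrl}${path}`, {
    ...init,
    headers: {
      ...(init.headers as Record<string, string>),
      Authorization: `Bearer ${ctx.apiKey}`,
      'x-mcp-tool': toolName, // lets the audit log attribute the REST call
      'Content-Type': 'application/json',
    },
  })
}
```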
The plan-tier filter trick
Three tiers: Starter $97 (read-only), Growth $197 (read + write), Pro $397 (full, including destructive). The interesting part is how that gets enforced.
Because the McpServer is built fresh per request, I can register only the tools the calling key's plan allows. Inside lib/mcp-tools.ts:
```ts
export function registerAllTools(server: McpServer, ctx: ToolContext) {
  // Always-on read tools
  registerReadTools(server, ctx)

  if (ctx.planTier === 'growth' || ctx.planTier === 'pro') {
    registerWriteTools(server, ctx)
  }
  if (ctx.planTier === 'pro') {
    registerDestructiveTools(server, ctx)
  }
}
```

When the agent calls `tools/list`, the SDK only reports the tools that were registered. A Starter customer literally cannot see `oh360_create_contact` in their tool inventory. They never get tempted to ask the agent to create a contact, because as far as the agent knows, no such tool exists.
This is sometimes called "capability discovery" in MCP docs. It is much cleaner than passing a giant permitted_actions: string[] on every call.
Confirm-flag for destructive actions
The one destructive tool is oh360_delete_contact. Pro tier only. But plan tier is not enough. Agents are confidently wrong all the time. I want a hard wall between "the LLM decided to delete something" and "the contact is actually gone."
The Zod schema:
```ts
server.tool(
  'oh360_delete_contact',
  'Permanently delete a contact. Requires confirm:true.',
  {
    contactId: z.string(),
    confirm: z.literal(true),
  },
  async ({ contactId, confirm }) => {
    if (confirm !== true) {
      return errorResponse(
        'Refusing to delete without confirm:true. ' +
          'Acknowledge to the user that this is destructive.'
      )
    }
    // ... actual delete via /api/v1/contacts/:id DELETE
  }
)
```

`z.literal(true)` means the schema will reject any call that does not include exactly `confirm: true`. The MCP client validates against the schema before dispatching, so most bad calls do not even reach my server. The early return inside the handler is the second wall: defense in depth, in case the client's schema enforcement is lax.
The handler text is also instructional. When Claude reads the tool description, it sees that this is destructive and that confirmation is required, which influences how it phrases the action to the user. Free safety rail.
Idempotency-Key, because agents retry
Agents retry. They time out, they get rate-limited, they get interrupted mid-tool-call by the user closing the chat. If a create-contact call retries after the original already succeeded, you do not want two contacts.
I implemented idempotency at the REST layer in lib/api-idempotency.ts, and the MCP write handlers mint a key per call. The contract:
- Client sends `Idempotency-Key: <uuid>` on any mutation.
- Server stores the request body hash + the response under that key for 24 hours in the `ApiIdempotencyKey` table.
- Same key with same body hash inside the window: replay the cached response.
- Same key with a different body hash: `409 Conflict` with an error explaining the body changed.
- After 24h: cache evicted, key reusable.
The body-hash check is the part most people skip. Without it, a retry that mutated the request payload would silently get the old response. With it, you catch the bug instead of papering over it.
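The whole contract fits in one wrapper function. A minimal sketch, assuming a shared Prisma client export and illustrative field names on the `ApiIdempotencyKey` table (a unique constraint on `key` would guard the lookup/insert race):

```ts
import { createHash } from 'node:crypto'
import { prisma } from '@/lib/db' // assumed client export

const WINDOW_MS = 24 * 60 * 60 * 1000 // the 24-hour replay window

export async function withIdempotency(
  key: string,
  body: unknown,
  handler: () => Promise<Response>
): Promise<Response> {
  const bodyHash = createHash('sha256')
    .update(JSON.stringify(body))
    .digest('hex')

  const existing = await prisma.apiIdempotencyKey.findUnique({ where: { key } })
  if (existing && existing.createdAt.getTime() > Date.now() - WINDOW_MS) {
    if (existing.bodyHash !== bodyHash) {
      // Same key, different payload: catch the bug, do not paper over it.
      return Response.json(
        { error: 'Idempotency-Key reused with a different request body' },
        { status: 409 }
      )
    }
    // Same key, same payload: replay the cached response.
    return new Response(existing.response, {
      status: existing.status,
      headers: { 'Content-Type': 'application/json' },
    })
  }

  const res = await handler()
  const bodyText = await res.clone().text()
  await prisma.apiIdempotencyKey.upsert({
    where: { key },
    create: { key, bodyHash, status: res.status, response: bodyText },
    update: { bodyHash, status: res.status, response: bodyText, createdAt: new Date() },
  })
  return res
}
```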
The audit log
Every MCP call goes through lib/api-audit.ts and lands in the ApiAuditEvent Prisma model. Schema:
```prisma
model ApiAuditEvent {
  id          String   @id @default(cuid())
  apiKeyId    String
  toolName    String   // e.g. 'oh360_create_contact'
  endpoint    String   // e.g. 'POST /api/v1/contacts'
  status      Int      // HTTP status
  durationMs  Int
  errorCode   String?  // P2002, P2025, etc.
  requestHash String   // for correlation, not the body itself
  createdAt   DateTime @default(now())

  apiKey ApiKey @relation(fields: [apiKeyId], references: [id])

  @@index([apiKeyId, createdAt])
  @@index([toolName, createdAt])
}
```

This gives the customer a settings page where they can see exactly what their agent has been doing. Critical for trust. The first question every developer customer asks is "how do I see what Claude actually did," and the answer is a SQL query I can render in 50 lines of React.
Three real gotchas
1. The SDK serializes integers as strings
MCP, like most JSON-RPC frameworks, has a serialization layer between the model's tool call and your handler. I had a tool accepting a `limit: z.number()` argument. In testing, everything worked. In production, every other call returned a Zod parse error: "Expected number, received string."
Some clients serialize integer args as strings on the wire. The fix is one small change per arg:
```ts
// before
limit: z.number().optional()

// after
limit: z.coerce.number().optional()
```
z.coerce.number() accepts both real numbers and numeric strings and casts them. Use this on every numeric arg in a tool schema. There is a recent commit in this repo (z.number() -> z.coerce.number() in hosted MCP tool schemas) that does exactly this for all 28 tools. Just do it from the start.
2. Plan-tier filtering is not security
Filtering tools out of tools/list is a UX layer, not a security layer. A misbehaving or malicious client can call any tool name it wants. tools/call with oh360_create_contact arrives at your endpoint regardless of whether you advertised it.
Every write handler also re-checks the plan tier from the context, and every destructive handler re-checks again, and the REST endpoints they loop back to enforce it a third time. If someone bypasses the filter, they get a 403 from the REST layer. Belt, suspenders, and a backup pair of pants.
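Concretely, every write handler opens with the same check before doing any work. A sketch, with a hypothetical `planAllows` helper; `errorResponse` is the same helper the delete tool uses above:

```ts
type PlanTier = 'starter' | 'growth' | 'pro'

function planAllows(planTier: PlanTier, allowed: PlanTier[]): boolean {
  return allowed.includes(planTier)
}

// Wall two, at the top of every write handler:
async function handleCreateContact(
  ctx: { planTier: PlanTier },
  input: { name: string; email: string }
) {
  if (!planAllows(ctx.planTier, ['growth', 'pro'])) {
    return errorResponse("This API key's plan does not include write tools.")
  }
  // ... wall three: the REST endpoint this loops back to checks the tier
  // again and answers 403 if the tools/list filter was somehow bypassed
}
```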
3. SQLite Prisma rejects mode:'insensitive'
OneHub360 ships SQLite to make self-hosting trivial. Prisma's TypeScript types let you write { name: { contains: q, mode: 'insensitive' } }, which works great on Postgres. On SQLite, it crashes at runtime with a Prisma validation error. The TypeScript compiler will not catch this for you because the type is generic across providers.
For my oh360_search_contacts tool I had to drop mode: 'insensitive' entirely and rely on SQLite's default LIKE behavior, which is ASCII-case-insensitive but not Unicode-case-insensitive. Acceptable for a CRM. If you need real Unicode case folding on SQLite, you have to do it in application code.
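Concretely, the SQLite-safe query drops the option entirely. A sketch (model and field names are illustrative, the prisma import is assumed):

```ts
import { prisma } from '@/lib/db' // assumed client export

// SQLite-safe search: plain `contains`, no mode option. LIKE gives you
// ASCII case-insensitivity for free; Unicode folding needs app code.
export async function searchContacts(q: string, limit = 20) {
  return prisma.contact.findMany({
    where: {
      OR: [
        { name: { contains: q } },  // adding mode: 'insensitive' crashes on SQLite
        { email: { contains: q } },
      ],
    },
    take: limit,
  })
}
```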
The 90/10 rule
If you took the route file and the tool registry as a single unit, you are looking at maybe 600 lines of TypeScript. The protocol plumbing itself is the small part of the job.
The rest of the work, which actually shipped a product:
- Cleaning up the existing REST API. My `/api/v1/*` endpoints were originally written for my own dashboard. They had inconsistent error envelopes, mixed field naming, and one endpoint that quietly returned different shapes depending on a query param. Agents do not tolerate that. I had to normalize everything before the MCP layer was useful.
- The customer UX for connecting Claude. A settings page that generates an API key, shows the ready-to-paste config block for Claude Desktop and Claude Code, includes a copy button, and explains what each tool does. That page took longer than the route file.
- Error mapping. Prisma throws codes like `P2002` (unique constraint), `P2025` (record not found), `P2003` (foreign key). The agent does not know what those mean. I wrote a translation layer in `lib/api-mutation.ts` that turns each into a sentence the model can act on: "A contact with that email already exists," "The contact you tried to update no longer exists," etc. (Sketched after this list.)
- The marketing. A landing page at `/developers/mcp`. A use-case page for each customer persona. A comparison page against doing it the hard way. SEO structured data, OG cards, all of it.
- Launch coordination. Posting on HN, replying to every comment, writing this post.
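The error-mapping layer is small enough to show. A sketch of what `lib/api-mutation.ts` does (message wording and the exported name are illustrative; `Prisma.PrismaClientKnownRequestError` and its `code` field are real Prisma API):

```ts
import { Prisma } from '@prisma/client'

// One agent-actionable sentence per Prisma error code.
const PRISMA_ERROR_MESSAGES: Record<string, string> = {
  P2002: 'A record with that unique value already exists, e.g. a contact with that email.',
  P2025: 'The record you tried to change no longer exists.',
  P2003: 'A related record is missing, so this reference cannot be saved.',
}

export function explainPrismaError(err: unknown): string {
  if (err instanceof Prisma.PrismaClientKnownRequestError) {
    return PRISMA_ERROR_MESSAGES[err.code] ?? `Database error (${err.code}).`
  }
  return 'Something unexpected went wrong. Try the call again.'
}
```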
The protocol implementation was 10% of the work. The other 90% was making it a product instead of a demo.
What I would do differently
The biggest mistake in my architecture is the HTTP loopback. My tool handlers send a Bearer-authed POST to http://127.0.0.1:3801/api/v1/..., which goes back through Next.js's router, hits the same auth middleware again, and finally calls the database. For every MCP call I get two audit log rows: the MCP entry and the underlying REST entry.
The right move was to extract the handlers into shared service functions:
// services/contacts.ts
export async function createContact(
ctx: AuthCtx,
input: CreateContactInput
): Promise<Contact> { ... }
// app/api/v1/contacts/route.ts
import { createContact } from '@/services/contacts'
export async function POST(req) {
const ctx = await authenticate(req)
const input = await req.json()
return NextResponse.json(await createContact(ctx, input))
}
// lib/mcp-tools.ts (handler)
async ({ name, email }) => {
const contact = await createContact(ctx, { name, email })
return successResponse(contact)
}One service function, two callers. Faster (no HTTP roundtrip), one audit row per logical operation, easier to test, easier to mock. I built the loopback because it was a 30-minute path to working code. I will pay it back over the next few weeks.
The other thing: I should have shipped the audit log UI and the rate-limit headers on day one. Customers asked for both within the first 48 hours. Trust beats features.
Where this goes
Salesforce shipping Headless 360 is the canary. By the end of 2026, every CRM will have an MCP endpoint. By 2027, every SaaS in a serious vertical will. The early movers will land integration slots in agent platforms before the slots are crowded. The stragglers will look like the companies that did not ship a REST API in 2014.
If you run a SaaS, the question is not "should I do this." The question is "is my API clean enough that an MCP layer on top of it would not embarrass me." If the answer is no, fix the API first. The protocol layer is the small part. The work that earns the right to ship a hosted MCP is everything around it: stable contracts, proper auth, plan-tier discipline, audit accountability, customer onboarding that does not require a partner. That is what the rest of this post was about.
OneHub360's endpoint is at https://onehub360.com/api/mcp. The full guide is at /developers/mcp. If you are a customer with a Growth plan or higher, you already have an API key — point Claude at the endpoint and start asking it about your pipeline.
If you are another founder shipping MCP, my DMs are open. Compare notes. We are early.