Show HN: Spanly - See what AI agents do inside your MCP server

2 days ago

Hi HN, I'm Tim, solo founder of Spanly (https://spanly.com). I spent the last year working on MCP gateways, and debugging them with nothing better than HTTP-level APM was painful. Spanly's own MCP server is monitored by Spanly, so I've been the first user of everything below.

Spanly is observability for MCP servers. MCP (Model Context Protocol) is the protocol that lets agents like Claude and Cursor call your product's tools. Spanly runs as a sidecar proxy, so it works with any language and requires zero code changes.

Live demo (no signup): https://app.spanly.com/share/demo

Sidecar source (Apache-2.0): https://github.com/spanlyhq/spanly

Here's the problem. If your product ships an MCP server (the "Stripe added an MCP" pattern), generic APM sees HTTP requests to /mcp and not much else. An agent calls your tool, gets an error, retries twice and gives up; in your APM that's three healthy POSTs.
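To make the "three healthy POSTs" point concrete, here is a minimal Python sketch of the classification an HTTP-level APM never does. It is an illustration, not Spanly's code: the function name and the returned labels are made up, and it assumes a plain JSON response body rather than an SSE stream. It relies only on two things from the protocol: JSON-RPC failures arrive as an `error` object, and MCP tool failures arrive as a normal result with `isError` set to true.

```python
import json


def classify_mcp_response(http_status: int, body: str) -> str:
    """Label one HTTP response carrying a JSON-RPC message (hypothetical labels)."""
    if http_status >= 400:
        return "http_error"          # the only case generic APM flags
    msg = json.loads(body)
    if "error" in msg:
        return "protocol_error"      # JSON-RPC error object, still HTTP 200
    result = msg.get("result")
    if isinstance(result, dict) and result.get("isError") is True:
        return "tool_error"          # tool ran and failed, still HTTP 200
    return "ok"
```

Both failure cases come back with status 200, which is why the retries look like healthy traffic.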

Sentry and New Relic both recently added MCP monitoring, which helps if you're on their stack: both are in-process SDKs, TypeScript and Python only, mapping MCP onto their existing span model. If your server is in Go or Rust, or you didn't write it, there's nothing to attach an SDK to. And notifications and MCP logging don't fit a span model at all.

Sentry wrote candidly about hitting this with their own MCP server: it grew to 50M requests/month, and the way they learned requests were silently dying (no result, no error) was users reaching out, because everything looked clean on their side. (https://blog.sentry.io/introducing-mcp-server-monitoring/)

How it works: a Go sidecar proxy sits in front of your MCP server and speaks the protocol itself. Any language, any framework, including servers you don't have source for. It wraps stdio servers too, not just HTTP. Telemetry shipping is async and lossy by design: if Spanly's backend is down, your server doesn't feel it, and drops are counted in the proxy's Prometheus metrics.
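For anyone wondering what "async and lossy by design" looks like in practice, here is a rough sketch of the pattern in Python (the real sidecar is Go, and this is not its code). The class name, the queue capacity, and the `send` callback are all assumptions for illustration. The idea is that the request path only ever does a non-blocking enqueue, and anything that doesn't fit or fails to send is counted rather than retried.

```python
import queue
import threading


class LossyShipper:
    """Bounded, non-blocking telemetry buffer that counts what it drops."""

    def __init__(self, send, capacity=1024):
        self._q = queue.Queue(maxsize=capacity)
        self._send = send            # hypothetical callable that ships one event
        self._lock = threading.Lock()
        self.dropped = 0             # would be exported as a metric

    def offer(self, event) -> bool:
        """Called on the request path: never blocks, drops when full."""
        try:
            self._q.put_nowait(event)
            return True
        except queue.Full:
            with self._lock:
                self.dropped += 1
            return False

    def drain_once(self) -> int:
        """Ship everything queued right now; a failed send is a drop, not a retry."""
        sent = 0
        while True:
            try:
                event = self._q.get_nowait()
            except queue.Empty:
                return sent
            try:
                self._send(event)
                sent += 1
            except Exception:
                with self._lock:
                    self.dropped += 1
```

In a real proxy `drain_once` would run in a background worker on a timer, so a slow or unreachable backend only ever grows the `dropped` counter.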

It captures every message type, not just tool-call spans, with client identification (Claude, Cursor, Copilot, ChatGPT, Windsurf, Cline, Zed, custom agents) and per-request latency and error breakdowns.
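Client identification is possible because the MCP `initialize` request carries a `clientInfo` object with a `name` and `version`. The sketch below shows one way a proxy could bucket that into a client family. The substring table and the "custom" fallback are assumptions for illustration; the names real clients report vary, and this is not how Spanly necessarily does it.

```python
# Hypothetical family table; real clients may report different names.
KNOWN_CLIENTS = ("claude", "cursor", "copilot", "chatgpt", "windsurf", "cline", "zed")


def identify_client(initialize_request: dict) -> tuple[str, str]:
    """Return (family, version) from an MCP initialize request."""
    info = initialize_request.get("params", {}).get("clientInfo", {})
    name = str(info.get("name", ""))
    version = str(info.get("version", ""))
    lowered = name.lower()
    for family in KNOWN_CLIENTS:
        if family in lowered:
            return family, version
    return "custom", version         # anything unrecognized or missing
```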

The part I haven't seen elsewhere: a product view next to the engineering view. The engineering side is traces, errors, latency percentiles. The product side answers what teams ask the week after shipping an MCP: which clients are connecting, which tools are growing, how adoption is trending, and how much agents actually do per session.

It cooperates with the APM you already have: W3C traceparent passes through untouched, so MCP traces cross-link with your Datadog or Sentry traces. Credential-bearing headers (Authorization, cookies, API keys) are redacted before telemetry ever leaves the sidecar.
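As a rough illustration of those two behaviours, here is a Python sketch that redacts credential headers in the telemetry copy and reads the trace ID out of a W3C `traceparent` header for cross-linking. The header list, the `[REDACTED]` placeholder, and the function names are assumptions, not Spanly's actual configuration; the `traceparent` layout (version, 32-hex trace ID, 16-hex parent ID, flags) comes from the W3C Trace Context spec.

```python
import re

# Hypothetical deny-list; a real one would likely be configurable.
SENSITIVE = {"authorization", "proxy-authorization", "cookie", "set-cookie", "x-api-key"}
TRACEPARENT_RE = re.compile(r"^[0-9a-f]{2}-([0-9a-f]{32})-[0-9a-f]{16}-[0-9a-f]{2}$")


def redact_headers(headers: dict) -> dict:
    """Copy headers for telemetry with credential values masked."""
    return {k: ("[REDACTED]" if k.lower() in SENSITIVE else v) for k, v in headers.items()}


def trace_id(headers: dict):
    """Return the 32-hex trace ID from traceparent, or None if absent or invalid."""
    for key, value in headers.items():
        if key.lower() == "traceparent":
            match = TRACEPARENT_RE.match(value.strip())
            if match and match.group(1) != "0" * 32:   # all-zero trace ID is invalid
                return match.group(1)
    return None
```

Only the telemetry copy is touched here; the request forwarded upstream keeps its original headers, including `traceparent`.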

Open core: the sidecar and optional TypeScript/Python SDKs are Apache-2.0; the hosted cloud is the paid part, and the backend isn't open source or self-hostable today. US and EU regions, with telemetry staying in-region.

Pricing: free tier with 100k requests/mo, paid from $49. We meter requests (a request and its response count as one); notifications and MCP log messages aren't billed. The free tier degrades to 10% sampling instead of billing you when you go over.
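The metering rule maps onto JSON-RPC message shapes fairly directly. A minimal sketch, assuming the billable unit is "any message with both a `method` and an `id`" (the function names are hypothetical, and this is not the actual billing code): requests have both, responses have an `id` but no `method`, and notifications (including MCP log messages, which are sent as `notifications/message`) have a `method` but no `id`.

```python
def is_billable(msg: dict) -> bool:
    """True only for JSON-RPC requests; responses and notifications are free."""
    return "method" in msg and msg.get("id") is not None


def count_billable(messages) -> int:
    """Count metered requests in a stream of decoded JSON-RPC messages."""
    return sum(1 for m in messages if is_billable(m))
```

Counting only the request side is what makes a request and its response come out as one unit.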

I've been heads-down on this solo for a while and I'm excited to finally show it. Would love feedback, especially from anyone running an MCP server in production: what do you wish you could see that you currently can't?