We spent the last year deploying AI agents for teams in large enterprises. The agents themselves worked fine. The problem was managing them. You've got Claude Code in a terminal, a research agent in a browser tab, a Slack bot somewhere else, a scheduling assistant in yet another window. It's chaotic. There's no single place to see what's running, who's doing what, or how things connect. As the number of agents grows, this gets unmanageable fast.
Mercury is a canvas where you bring your agents and humans together in one place. You draw connections between them to form agent teams. An edge between Agent A and Agent B means A can delegate work to B — B processes the request, calls its tools, and replies. Agents can delegate further down the graph. The canvas becomes a living map of how your team operates.
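To make the edge semantics concrete, here is a rough Python sketch of the canvas as a directed graph. It is an illustration only, not Mercury's actual code: the `Canvas` class, its method names, and the agent names are all made up for this example.

```python
from collections import deque


class Canvas:
    """Hypothetical model: agents are nodes, delegation rights are directed edges."""

    def __init__(self):
        self.edges = {}  # agent name -> set of agents it may delegate to

    def connect(self, src, dst):
        # An edge src -> dst means src can delegate work to dst.
        self.edges.setdefault(src, set()).add(dst)

    def can_delegate(self, src, dst):
        # Only a direct edge grants delegation; direction matters.
        return dst in self.edges.get(src, set())

    def reachable(self, src):
        # Every agent that work starting at src could end up with,
        # following delegation edges breadth-first down the graph.
        seen = set()
        queue = deque([src])
        while queue:
            node = queue.popleft()
            for nxt in self.edges.get(node, set()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen


canvas = Canvas()
canvas.connect("triage", "research")
canvas.connect("research", "scheduler")
```

In this sketch `triage` can hand work to `research` but not the other way around, and `reachable("triage")` includes `scheduler` because delegation can continue further down the graph.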
Some details on the stack:
- Delegation as a primitive: when Agent A messages Agent B, it creates a persistent task. Tasks survive across activations so multi-step work doesn't lose context.
- Agent types: native Mercury agents (Anthropic SDK / Claude), plus adapters for Claude Code, Devin, Manus, OpenClaw, Gumloop, and any MCP-compatible agent. Mix them on the same canvas.
- 800+ tool integrations via Composio (Gmail, Calendar, Slack, Linear, Notion, etc.) with per-agent OAuth.
- Channels: agents reach humans via the web UI, iMessage, or Slack. External triggers (new email, calendar event) can wake agents and kick off workflows.
- Human-in-the-loop by default: agents need approval before real-world actions. You relax controls per-agent as trust builds.
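
For readers who think in code, here is a minimal Python sketch of how persistent tasks and per-agent approval could fit together. Everything in it is an assumption made for illustration (`Task`, `TaskStore`, `propose_action`, `approve`, and the dict standing in for durable storage); it is not Mercury's real schema or API.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Task:
    """Hypothetical record created when one agent messages another."""
    task_id: str
    sender: str
    recipient: str
    request: str
    actions: list = field(default_factory=list)


class TaskStore:
    """In-memory stand-in for durable storage (assumption: a real system
    would use a database so tasks survive across activations)."""

    def __init__(self):
        self._rows = {}

    def save(self, task):
        self._rows[task.task_id] = json.dumps(asdict(task))

    def load(self, task_id):
        return Task(**json.loads(self._rows[task_id]))


def propose_action(task, name, trusted_agents):
    # Human-in-the-loop by default: only agents explicitly marked as
    # trusted may act without waiting for approval.
    state = "executed" if task.recipient in trusted_agents else "awaiting_approval"
    task.actions.append({"name": name, "state": state})
    return state


def approve(task, name):
    # A human approves a pending action; returns False if nothing was pending.
    for action in task.actions:
        if action["name"] == name and action["state"] == "awaiting_approval":
            action["state"] = "executed"
            return True
    return False


store = TaskStore()
task = Task("t1", sender="agent_a", recipient="agent_b", request="Book a room")
first_state = propose_action(task, "send_invite", trusted_agents=set())
store.save(task)

# A later activation reloads the task with its pending action intact.
resumed = store.load("t1")
```

The point of the sketch is that the pending approval lives on the task rather than in the agent's working memory, so a later activation can pick up where the earlier one stopped. Relaxing controls for an agent amounts to adding it to `trusted_agents`.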
We've been running Mercury on Mercury for months — 30+ agents supporting a 3-person team. Scheduling, sales ops, engineering triage, finance. The thing that surprised us most: the hard part was never getting a single agent to be smart. It was getting five of them to not duplicate work, not contradict each other, and not spam the same human with redundant questions.
We've raised $1.5M from a16z, with investors from OpenAI, Cognition, and others. We're opening up alpha access — if you're interested: mercury.build
One question we'd love the community's take on: where should memory live — in the orchestration layer or the agent layer? We started by managing memory at the org level, exposing it as tools or injecting it into agent context. But not all agents are created equal. Some handle memory well on their own, and we need to pick our battles. The role of the orchestration layer in memory management is one of the harder design decisions we're wrestling with, and we'd genuinely love to hear how others think about it.
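To ground the question, here is a toy Python sketch of the two exposure styles mentioned above: memory offered as a tool the agent calls, versus memory injected into the agent's context before it runs. The `OrgMemory` class and its method names are hypothetical and exist only to frame the trade-off.

```python
class OrgMemory:
    """Hypothetical org-level memory owned by the orchestration layer."""

    def __init__(self):
        self._facts = {}

    def remember(self, key, value):
        self._facts[key] = value

    def recall(self, key):
        # Tool-style access: the agent decides when to look something up.
        return self._facts.get(key)

    def inject(self, prompt, keys):
        # Context-style access: the orchestrator decides what the agent sees.
        known = [f"{k}: {self._facts[k]}" for k in keys if k in self._facts]
        if not known:
            return prompt
        return "Known context:\n" + "\n".join(known) + "\n\n" + prompt


memory = OrgMemory()
memory.remember("timezone", "US/Pacific")
injected = memory.inject("Book a meeting with the design team.", ["timezone"])
```

With `recall`, an agent that already handles memory well can pull what it needs on its own schedule. With `inject`, the orchestration layer takes on that decision for agents that do not.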
We put together a short walkthrough if you want to see the UX before signing up: https://youtu.be/jpKvbyjkXMU
Happy to answer anything about the stack, the agent communication protocol, or the dumb mistakes we made along the way.