# Lessons from shipping agent-facing MCP tools

> Five lessons that would have saved time on a multi-tool MCP build: test the tool the way the agent calls it, not just its inner function; the value you want to return is often already in hand; make implicit behaviour explicit; the build gate catches what unit tests don't; and a busy branch means rebasing.

URL: https://biloh.com.au/docs/engineering-notes/lessons-shipping-agent-tools
Category: Engineering notes | Audience: builder | Updated: 2026-06-25

A run of agent-facing tools surfaced the same handful of lessons more than once. None are exotic; each cost real time the first time. Here they are as patterns to copy.

## 1. Test the tool the way the agent calls it, not just its inner function

A tool's underlying function returned exactly the right value. Its unit test passed. The agent still got nothing — because the thin MCP wrapper between them **cherry-picked the response and dropped the new field.** The fix was an *integration* test that calls the tool through its wrapper and the response envelope, the way an agent actually does. It failed immediately, for the right reason.

> The unit test proves the logic. The integration test proves the *consumer* sees the logic. Ship both.

## 2. The value you want to return is often already in hand

A request to "return the bumped lock version after an update" looked like it needed a second read. It didn't. The database trigger that increments the version runs **before** the row is written, so the row already returned by the update carried the new value — it was just buried in a nested field the agent didn't know to read. The fix was to surface it explicitly, with no extra round-trip.

**Read the existing behaviour before adding a query.** A surprising amount of "we need to fetch X" is "X is already on the object."

## 3. Make implicit behaviour explicit, in-band

Two small changes removed real friction:

- An opaque field name (`divergenceFlags`) became a self-describing one (`visits_per_year_mismatches`) at the response layer — without touching the value the UI consumes.
- A tool that silently applied financial defaults began returning an `applied_defaults` block, so the agent could *see* what it had been given rather than infer it.

An agent acts on what the response shows it. Surfacing the implicit is often higher-leverage than new logic.

## 4. The build gate catches what unit tests don't — run it

The version changelog lives in a **single-quoted** string. An apostrophe in a description (`tool's`) closed the string and broke the build — twice. The test runner tolerated it; the production build (type-check plus lint) did not.

Run the real build before every push, and judge it by its exit code or full output — never by the tail of the log, because lint errors appear early and a `tail` hides them.

## 5. On a busy main branch, expect to rebase — and preserve the other work

When other commits land between your work and your push, the changelog and version files collide. The recovery pattern:

1. Rebase onto their commit.
2. Take their version of the changelog files, then **re-apply** your entry on top and bump your version *past* the collision.
3. Keep their changelog entry. Don't clobber it.

Done that way, both histories survive and the version stays monotonic. A merge that overwrites someone else's changelog entry is a silent data loss.

## Next steps

- The safety model these tools ship inside: [Making a multi-connector MCP setup safe to act on](/docs/engineering-notes/multi-tenant-mcp-safety).
- How agents find these tools at all: [Tool discoverability for agents](/docs/engineering-notes/tool-discoverability-for-agents).
