# Biloh — full documentation corpus

> Every published page at https://biloh.com.au/docs, concatenated for AI agents.
> See https://llmstxt.org/ for the llms.txt convention.
>
> Biloh is the AI-native operating system for service businesses, with Model
> Context Protocol integration at every tier. MCP endpoint:
> https://app.biloh.com.au/api/mcp

# The gap between a green build and a live deploy

URL: https://biloh.com.au/docs/engineering-notes/green-build-vs-live-deploy
Category: Engineering notes | Audience: builder | Updated: 2026-06-25

> A passing local build proves your code compiles — nothing more. The expensive failures in building this documentation corpus all lived in the gap between "it compiles" and "it's live": a deploy-time security gate, a missing shell variable that silently broke two tools, a popular framework that did not fit the stack, and a serverless function that could not read its own files. The defense is to verify across every boundary you cross.

A passing local build tells you your code compiles and passes its static checks. It does **not** tell you the thing will deploy, run, or be reachable. The most expensive hours of building this very documentation system were all spent in the gap between *"it compiles"* and *"it's live"*, and every one of them had the same shape: an assumption that held in one environment and quietly broke in the next.

## A green build is a claim about syntax, not about shipping

A local production build passed cleanly. The deploy was then rejected — by a security gate that refused a known-vulnerable transitive dependency (a CVE in a markdown library), something the compiler never checks. The fix was a one-line patch bump; the lesson is durable: **"build passed" and "deploy succeeded" are independent claims**, and only the live deploy is authoritative. If your workflow lands on a single branch and every push deploys, treat the green deployment — not the green build — as your definition of done.

## When a tool goes silent, suspect the shell before your code

Adding one dependency failed with a cryptic argument-type error, and the test runner produced *no output at all*. It looked like a corrupted install, and an hour disappeared into deleting and reinstalling things. The actual cause was a single missing environment variable — the path to the system command interpreter. Without it, both the package manager and the test runner could not spawn the child processes they rely on, and both failed in ways that mimicked broken code.

The principle: **when several unrelated tools fail to *spawn* subprocesses, the common cause is the shell environment, not any one tool.** Check the environment before you start deleting your dependencies.
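
A preflight check along these lines can rule out the shell before any dependencies are deleted. It is a minimal Python sketch, not the project's tooling. The variable names (`ComSpec` on Windows, `SHELL` and `PATH` elsewhere) are assumptions, because the note above does not name the variable that was missing.

```python
import os
import sys


def missing_shell_vars(env, platform):
    """Return the names of shell-related variables that are unset or empty.

    Assumption: ComSpec (Windows) or SHELL (POSIX) locates the command
    interpreter, and PATH is needed to find any child executable at all.
    """
    if platform.startswith("win"):
        # Windows environment names are case-insensitive, so compare in lower case.
        present = {key.lower() for key, value in env.items() if value}
        return [name for name in ("PATH", "ComSpec") if name.lower() not in present]
    return [name for name in ("PATH", "SHELL") if not env.get(name)]


if __name__ == "__main__":
    missing = missing_shell_vars(dict(os.environ), sys.platform)
    if missing:
        print("shell environment incomplete, missing:", ", ".join(missing))
    else:
        print("shell environment looks complete")
```

Some minimal containers run fine without `SHELL`, so a report from this check is a prompt to look at the environment, not proof of the cause.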

## A "production" install quietly removes your build tools

Installing one new package in a shell that defaulted to production mode pruned the project's dev dependencies, including the CSS toolchain the build needs, and the next build failed with *module not found* for things that had been fine an hour earlier. Installs are environment-sensitive: when you add a package in a context that might be production-mode, force dev dependencies in (or set the environment explicitly) so the install doesn't prune the tools the build depends on.
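
The guard can be sketched as a small helper that decides whether an install needs the dev-dependency flag. This assumes npm is the package manager (the note does not say which one was involved) and relies on npm's documented behaviour that `omit` defaults to `dev` when `NODE_ENV` is `production`. The function names are hypothetical.

```python
def install_omits_dev(env):
    """Guess whether `npm install` in this environment would skip devDependencies.

    Assumption: npm is the package manager. npm defaults `omit` to "dev" when
    NODE_ENV is "production"; npm_config_omit mirrors the --omit CLI flag.
    """
    if env.get("NODE_ENV") == "production":
        return True
    return env.get("npm_config_omit") == "dev"


def safe_install_command(package, env):
    """Build an install command that keeps dev dependencies when they are at risk."""
    command = ["npm", "install", package]
    if install_omits_dev(env):
        # --include=dev wins over omit, so the build toolchain stays installed.
        command.append("--include=dev")
    return command
```

Passing the environment in as an argument keeps the decision testable without touching the real shell.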

## Pick the engine for the stack you have, not the one the field recommends

The most-recommended documentation framework required a major version of the web framework, the UI library, *and* the CSS toolchain that the project didn't run. Adopting it meant a high-risk upgrade of three foundations to ship a docs page. The right call was the boring one: a small, native library that matched the existing stack exactly. **Verify peer-dependency compatibility before you adopt the popular tool** — "best in class" is relative to your constraints, and the migration you avoid is the bug you don't ship.
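
A rough version of that compatibility check compares the major version each peer range asks for with the major version installed. The sketch below is deliberately simplified: it only understands exact versions and single-major ranges such as `^4.0.0`, and the package names in the usage are placeholders, not the project's real dependencies.

```python
import re


def major(version):
    """Extract the leading major number from a version or simple range string."""
    match = re.search(r"\d+", version)
    if match is None:
        raise ValueError(f"no version number in {version!r}")
    return int(match.group())


def incompatible_peers(peer_ranges, installed):
    """List peers whose required major differs from the installed major.

    Simplification: compound ranges (||, >=, x-ranges) need a real semver parser.
    """
    problems = []
    for name, wanted in peer_ranges.items():
        have = installed.get(name)
        if have is None:
            problems.append(f"{name}: not installed, needs {wanted}")
        elif major(have) != major(wanted):
            problems.append(f"{name}: installed {have}, needs {wanted}")
    return problems
```

For example, `incompatible_peers({"ui-library": "^19.0.0"}, {"ui-library": "18.3.1"})` reports one mismatch, which is the signal to stop before adopting the tool.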

## A serverless function can't read a file you didn't bundle

A request-time route that read content files off disk worked locally and returned a 500 in production: the platform's output file-tracing can't follow a path computed at runtime, so the files were never shipped into the function. Two fixes work — declare the files for explicit inclusion, or make the route **static** so it reads at build time and serves a cached result. Runtime filesystem access is a deployment concern, not just a code concern.

## The data you parse is rarely the type you assumed

An author wrote a date the natural way — unquoted — and the YAML parser handed back a `Date` object where the schema expected a string, failing the build. The fix was not to scold the author; it was to **coerce at the schema boundary**. Normalize inputs where they enter your system and be generous about the formats real authors will actually use. A validator that rejects the obvious, correct thing is a validator that will be worked around.
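
Coercion at the boundary can be as small as one function. This is a Python analogue rather than the project's schema code: `datetime.date` stands in for the `Date` object the parser returned, and `coerce_date_string` is a hypothetical name.

```python
from datetime import date, datetime


def coerce_date_string(value):
    """Normalise a front-matter date to an ISO 8601 'YYYY-MM-DD' string.

    Accepts what a YAML parser may hand back for an unquoted date (a date or
    datetime object) as well as an already-quoted string.
    """
    if isinstance(value, datetime):  # check datetime first: it subclasses date
        return value.date().isoformat()
    if isinstance(value, date):
        return value.isoformat()
    if isinstance(value, str):
        # Round-trip through the parser so malformed strings are still rejected.
        return date.fromisoformat(value.strip()).isoformat()
    raise TypeError(f"unsupported date value: {value!r}")
```

The function is generous about form (object or string) but still strict about content: a string that is not a valid ISO date raises `ValueError` instead of passing through.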

## The thread that ties them together

Every one of these was a *boundary* failure: code → shell, local → deploy, build-time → run-time, parser → schema. A green check on one side of a boundary is not a green check on the other.

The cheap, durable defense is two-part. First, **verify across the boundary you're about to cross** — run the thing in the environment that will actually run it, and treat the live deployment as the only completion signal. Second, **make the crossing safe**: a pre-push hook that runs the full build means a broken build is rejected at the push and never reaches production; a schema that coerces author-friendly input means a real author's first draft doesn't bounce; and a one-command check (here, `npm run docs:check`) that validates the cheap things — schema, links — in seconds means the slow boundary, the full build and deploy, only runs on changes already likely to pass.
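
One of the cheap checks, link validation, might look like the sketch below. What `npm run docs:check` actually runs is not shown in this note, so treat this as a hypothetical illustration of the idea: every internal `/docs/...` link must point at a page that exists.

```python
import re

# Matches the target of a markdown link that points under /docs/.
LINK = re.compile(r"\]\((/docs/[^)\s#]+)")


def broken_links(pages):
    """Return (page, target) pairs for internal links that match no known page.

    `pages` maps each page's URL path to its markdown body.
    """
    known = set(pages)
    return [
        (path, target)
        for path, body in sorted(pages.items())
        for target in LINK.findall(body)
        if target not in known
    ]
```

A check like this runs in milliseconds over a whole corpus, which is what makes it worth running before the slow build.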

None of this is exotic. It's the same discipline as a good test suite, applied to the seams between systems rather than the inside of one.

## Related

- [Lessons from shipping agent-facing MCP tools](/docs/engineering-notes/lessons-shipping-agent-tools)
- [Making a multi-connector MCP setup safe to act on](/docs/engineering-notes/multi-tenant-mcp-safety)
- [What is Biloh?](/docs/getting-started/what-is-biloh)

---

# Lessons from shipping agent-facing MCP tools

URL: https://biloh.com.au/docs/engineering-notes/lessons-shipping-agent-tools
Category: Engineering notes | Audience: builder | Updated: 2026-06-25

> Five lessons that would have saved time on a multi-tool MCP build: test the tool the way the agent calls it, not just its inner function; the value you want to return is often already in hand; make implicit behaviour explicit; the build gate catches what unit tests don't; and a busy branch means rebasing.

A run of agent-facing tools surfaced the same handful of lessons more than once. None are exotic; each cost real time the first time. Here they are as patterns to copy.

## 1. Test the tool the way the agent calls it, not just its inner function

A tool's underlying function returned exactly the right value. Its unit test passed. The agent still got nothing — because the thin MCP wrapper between them **cherry-picked the response and dropped the new field.** The fix was an *integration* test that calls the tool through its wrapper and the response envelope, the way an agent actually does. It failed immediately, for the right reason.

> The unit test proves the logic. The integration test proves the *consumer* sees the logic. Ship both.
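
The failure mode is easy to reproduce in miniature. In this sketch every name is hypothetical: `update_job` stands in for the inner function, and the two wrappers show the difference between copying named fields and forwarding the result.

```python
def update_job(job_id):
    """Inner function (hypothetical): returns every field, including the new one."""
    return {"id": job_id, "status": "scheduled", "lock_version": 4}


def wrap_cherry_picked(result):
    """A wrapper that copies named fields; anything newer is silently dropped."""
    return {"ok": True, "data": {"id": result["id"], "status": result["status"]}}


def wrap_passthrough(result):
    """A wrapper that forwards the whole result inside the envelope."""
    return {"ok": True, "data": dict(result)}


def call_tool(wrapper, job_id):
    """Call the tool the way an agent does: through the wrapper and envelope."""
    return wrapper(update_job(job_id))
```

A unit test on `update_job` passes in both cases. Only an assertion on `call_tool(...)["data"]` distinguishes the wrapper that drops `lock_version` from the one that keeps it.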

## 2. The value you want to return is often already in hand

A request to "return the bumped lock version after an update" looked like it needed a second read. It didn't. The database trigger that increments the version runs **before** the row is written, so the row already returned by the update carried the new value — it was just buried in a nested field the agent didn't know to read. The fix was to surface it explicitly, with no extra round-trip.

**Read the existing behaviour before adding a query.** A surprising amount of "we need to fetch X" is "X is already on the object."

## 3. Make implicit behaviour explicit, in-band

Two small changes removed real friction:

- An opaque field name (`divergenceFlags`) became a self-describing one (`visits_per_year_mismatches`) at the response layer — without touching the value the UI consumes.
- A tool that silently applied financial defaults began returning an `applied_defaults` block, so the agent could *see* what it had been given rather than infer it.

An agent acts on what the response shows it. Surfacing the implicit is often higher-leverage than new logic.
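
The second change can be sketched as follows. The field name `applied_defaults` comes from the text above; the default keys and values are invented for illustration.

```python
DEFAULTS = {"payment_terms_days": 14, "tax_rate_percent": 10}  # hypothetical values


def with_applied_defaults(params):
    """Fill in missing financial fields and report which ones were defaulted."""
    resolved = dict(params)
    applied = {}
    for key, value in DEFAULTS.items():
        if resolved.get(key) is None:
            resolved[key] = value
            applied[key] = value
    return {"values": resolved, "applied_defaults": applied}
```

An empty `applied_defaults` block is informative too: it tells the agent that every value in play was one it supplied.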

## 4. The build gate catches what unit tests don't — run it

The version changelog lives in a **single-quoted** string. An apostrophe in a description (`tool's`) closed the string and broke the build — twice. The test runner tolerated it; the production build (type-check plus lint) did not.

Run the real build before every push, and judge it by its exit code or full output — never by the tail of the log, because lint errors appear early and a `tail` hides them.
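
A small simulation shows why the tail misleads. The log contents and the exit code here are made up; the point is only that the verdict comes from the exit code.

```python
def tail(log, lines=20):
    """Return the last `lines` lines of a log, as `tail -n` would."""
    return "\n".join(log.splitlines()[-lines:])


def build_passed(exit_code):
    """The exit code is the verdict; the log is only the explanation."""
    return exit_code == 0


# A simulated failing build: one lint error early, then a long run of routine output.
log = "\n".join(
    ["lint: error: unterminated string literal"]
    + [f"compiled module {n}" for n in range(200)]
)
exit_code = 1

print("error visible in tail:", "error" in tail(log))
print("build passed:", build_passed(exit_code))
```

Both lines print `False`: the tail looks clean, and the build still failed.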

## 5. On a busy main branch, expect to rebase — and preserve the other work

When other commits land between your work and your push, the changelog and version files collide. The recovery pattern:

1. Rebase onto their commit.
2. Take their version of the changelog files, then **re-apply** your entry on top and bump your version *past* the collision.
3. Keep their changelog entry. Don't clobber it.

Done that way, both histories survive and the version stays monotonic. A merge that overwrites someone else's changelog entry is silent data loss.
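
The version and changelog parts of that recovery can be sketched like this. It assumes `MAJOR.MINOR.PATCH` version strings and a newest-first changelog list, neither of which the note specifies, and the function names are hypothetical.

```python
def parse(version):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def bump_past(theirs, mine):
    """Pick a patch version strictly greater than both sides of the collision."""
    major, minor, patch = max(parse(theirs), parse(mine))
    return f"{major}.{minor}.{patch + 1}"


def merge_changelog(theirs, my_entry):
    """Keep every entry from their changelog and put the re-applied entry first."""
    return [my_entry] + [entry for entry in theirs if entry != my_entry]
```

If both sides claimed `1.4.7`, `bump_past("1.4.7", "1.4.7")` gives `1.4.8`, and their `1.4.7` entry stays in the list underneath the re-applied one.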

## Next steps

- The safety model these tools ship inside: [Making a multi-connector MCP setup safe to act on](/docs/engineering-notes/multi-tenant-mcp-safety).
- How agents find these tools at all: [Tool discoverability for agents](/docs/engineering-notes/tool-discoverability-for-agents).

---

# Making a multi-connector MCP setup safe to act on

URL: https://biloh.com.au/docs/engineering-notes/multi-tenant-mcp-safety
Category: Engineering notes | Audience: builder | Updated: 2026-06-25

> When an agent holds several tenant connectors at once, the risk isn't data leakage (that's sealed server-side) — it's acting on the wrong tenant by mistake. The fix is layered: make the tenant visible, make it assertable, and gate the irreversible actions. Visibility alone is not enough.

An AI agent can hold **several Biloh connectors at once** — one per tenant, plus a platform-admin connector — and they look almost identical in the tool list. The danger is not that one connector reads another's data (that is sealed by a tenant-bound token, row-level security, and write gateways, independent of the agent). The danger is **intent**: calling the right tool on the *wrong* connector.

Biloh solves this in three layers, cheapest first: **make the tenant visible, make it assertable, and gate the actions you can't undo.** Visibility alone — the most common instinct — is not enough.

## Layer 1 — make the tenant visible

Two signals, on by default:

- **Connector identity.** Each connection advertises itself as `biloh · <Tenant> (<MODE>)`, where mode is `LIVE` (a real tenant), `TEST` (a tenant flagged as test data), or `PLATFORM` (the cross-tenant admin surface). Two real tenants are *both* `LIVE` — so the **name** distinguishes them and the **mode** separates a tenant from the platform.
- **A per-response stamp.** Every tool result carries `meta.tenant { id, name, mode }`. Because it is on *every* call, the last result an agent received always tells it which tenant it just touched.

A subtle implementation note: the MCP handler is built **once per process**, so the server's identifying info is otherwise static across connections. Per-connection identity is therefore applied by rewriting the protocol `initialize` *response* — fail-safe, so any anomaly returns the original handshake untouched rather than risking the connection.
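
The fail-safe rewrite might be sketched as below. `serverInfo.name` is where the MCP `initialize` result carries the server's name, but whether Biloh places the connector identity in that field is an assumption, as are the function names and the tenant used in the example.

```python
import copy


def connector_name(tenant, mode):
    """Format the identity string described above: biloh · <Tenant> (<MODE>)."""
    if mode not in {"LIVE", "TEST", "PLATFORM"}:
        raise ValueError(f"unknown mode: {mode!r}")
    return f"biloh · {tenant} ({mode})"


def rewrite_initialize(response, tenant, mode):
    """Return a copy of the initialize response with a per-connection name.

    Fail-safe: on any anomaly the original response is returned untouched.
    """
    try:
        server_info = response["result"]["serverInfo"]
        if not isinstance(server_info.get("name"), str):
            return response
        rewritten = copy.deepcopy(response)
        rewritten["result"]["serverInfo"]["name"] = connector_name(tenant, mode)
        return rewritten
    except (KeyError, TypeError, AttributeError, ValueError):
        return response
```

Working on a deep copy matters here: because the handler is shared across connections, mutating the original response in place could leak one connection's identity into another.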

## Layer 2 — make the tenant assertable (a misroute should *fail*)

Visibility helps a careful reader. It does nothing for a mistake already in flight. So a write can carry an optional `expected_tenant`: the tenant you *intend* to act on, by name or id.

If it doesn't match the connector you're on, the call returns `expected_tenant_mismatch` and the handler **never runs**. A misroute stops instead of silently going through. The argument rides the same central schema-augmentation as cross-tenant routing and is enforced once, at the single wrapper every tool call passes through — so it covers the whole tool surface without per-tool code.
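
A single-wrapper guard of this kind can be sketched in a few lines. The argument name and error code come from the text above; the response envelope, the handler, and the tenant values are hypothetical.

```python
def guard_expected_tenant(handler, tenant):
    """Wrap a tool handler so a wrong `expected_tenant` fails before it runs.

    `tenant` is the connector's bound tenant, e.g. {"id": "t_1", "name": "Acme"}.
    """
    def wrapped(args):
        args = dict(args)
        expected = args.pop("expected_tenant", None)  # optional, opt-in
        if expected is not None and expected not in (tenant["id"], tenant["name"]):
            # Stop here: the handler is never invoked on a mismatch.
            return {"error": "expected_tenant_mismatch", "meta": {"tenant": tenant}}
        return {"result": handler(args), "meta": {"tenant": tenant}}
    return wrapped
```

Because the check lives in the wrapper, the handler never sees `expected_tenant` at all, which is why no per-tool code is needed.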

> The principle: **data isolation and intent safety are different problems.** One is enforced by the database; the other has to be enforced by affordances the caller opts into.

## Layer 3 — gate what you can't undo

Two-phase confirm-gates protect the irreversible actions. The first call returns a **preview** plus a deterministic `confirm_token` and mutates nothing; only a second call carrying the matching token executes.
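
One way to get a deterministic token is to derive it from the operation and its exact arguments with a keyed hash. The sketch below assumes that approach; how Biloh actually derives `confirm_token` is not described here, and the secret, the store, and the `confirm_token_mismatch` error code are placeholders.

```python
import hashlib
import hmac
import json

SECRET = b"demo-only-secret"  # assumption: a server-side key; never hard-code one


def confirm_token(operation, args):
    """Derive a deterministic token from the operation and its exact arguments."""
    payload = json.dumps({"op": operation, "args": args}, sort_keys=True).encode()
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()


def archive_tenant(store, tenant_id, token=None):
    """Two-phase gate: preview without a token, execute only with the matching one."""
    expected = confirm_token("archive_tenant", {"tenant_id": tenant_id})
    if token is None:
        return {"preview": f"would archive {tenant_id}", "confirm_token": expected}
    if not hmac.compare_digest(token, expected):
        return {"error": "confirm_token_mismatch"}
    store[tenant_id] = "archived"
    return {"archived": tenant_id}
```

Binding the token to the arguments means a token previewed for one tenant cannot confirm the archive of another.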

The scoping rule matters more than the mechanism: **gate by operation, not by keyword.** Archiving a tenant is irreversible — gated. Creating a tenant is gated too (it completes the lifecycle pair). Minting an access token is *revocable* — so it is deliberately **left ungated**. "Sounds dangerous" is not the test; "can't be undone" is.

A gate is only safe to add after an **audit** confirms nothing automated calls the tool — otherwise you break provisioning. In this build the audit showed zero code-level callers, so the gate had zero blast radius.

## What each layer is locked by

Every layer ships with a spec test written *before* the code, so the guarantee can't quietly regress:

- The mismatch guard has a test asserting a wrong `expected_tenant` returns the error code and the handler does not run.
- Each confirm-gate has a test asserting the no-token call previews and mutates nothing, a wrong token errors, and the matching token executes.
- The identity rewrite is fail-safe by construction and tested against a malformed handshake.

## Next steps

- The connection basics: [Connecting Biloh over MCP](/docs/reference/mcp-overview).
- How an agent finds the right tool in the first place: [Tool discoverability for agents](/docs/engineering-notes/tool-discoverability-for-agents).

---

# Tool discoverability for agents

URL: https://biloh.com.au/docs/engineering-notes/tool-discoverability-for-agents
Category: Engineering notes | Audience: builder | Updated: 2026-06-25

> A catalogue of ~230 tools is unusable by name-scanning. The path is: orient with a categorised map, search by intent, then measure the one honest signal — the rate at which a search returns nothing. A miss is a catalogue gap, and it should be turned into a request for the missing tool.

A Biloh tenant exposes on the order of **230 MCP tools**. No agent should find the right one by reading names. The working pattern is three moves — **orient, search, measure** — and the measurement is the part people skip.

## How does an agent orient?

The first call on a fresh session is `get_session_context`. Beyond confirming the tenant and persona, it returns a **`tool_map`**: every tool the persona can use, grouped by category (clients, contractors, invoices, jobs, proposals, …), each with a one-line summary. That is the floor-plan — enough to know *where* a capability lives before narrowing.

The authoritative flat list stays available as `mcp_health.tool_names`. The map is for orientation; the flat list is the source of truth.

## How does an agent find a specific tool?

`search_tools("onboard a client for a recurring service")` returns a ranked, persona-scoped shortlist — name, summary, category. The agent searches by **what it wants to do**, not by guessing a name.

One non-obvious discipline kept the ranker honest: **it carries no tool-name literals.** Ranking is pure lexical overlap (weighted name > category > description, with a small domain synonym map). A structural test asserts the ranker source contains no tool names, so it can't be quietly "tuned" by hard-coding the answers to the test queries — it has to generalise. The spec proves this on a *held-out* intent set the ranker never sees.
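
A ranker of that shape fits in a short function. The weights, synonym map, and stop-word list below are illustrative guesses, not Biloh's actual values; note that the function body mentions no tool names, only field names.

```python
import re

WEIGHTS = {"name": 3, "category": 2, "description": 1}  # name > category > description
SYNONYMS = {"customer": "client", "quote": "proposal"}  # hypothetical domain map
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to"}


def tokens(text):
    """Lower-case word tokens, stop words removed, synonyms folded together."""
    words = re.findall(r"[a-z]+", text.lower())
    return {SYNONYMS.get(word, word) for word in words if word not in STOPWORDS}


def search_tools(query, catalogue):
    """Rank tools by weighted lexical overlap; tools with no overlap are dropped."""
    wanted = tokens(query)
    scored = []
    for tool in catalogue:
        score = sum(
            weight * len(wanted & tokens(tool[field]))
            for field, weight in WEIGHTS.items()
        )
        if score > 0:
            scored.append((score, tool["name"]))
    # Highest score first; the name breaks ties so the order is deterministic.
    return [name for score, name in sorted(scored, key=lambda item: (-item[0], item[1]))]
```

The catalogue is passed in as data, so the structural test described above reduces to checking that no catalogue name appears in this source.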

## What metric proves discoverability actually improved?

The honest signal is the **zero-result rate**. If `search_tools` returns nothing, the catalogue or the ranker failed to serve the agent's intent — that is the gap to close, and it is the one thing worth counting.

So a miss does three things: it sets an explicit `zero_results` flag on the response, it records a structured `tool_search` metric for aggregation, and it points the agent at the **request-a-tool** flow. That last move closes the loop:

> search miss → tool request → catalogue grows → fewer misses.

A high zero-result rate is not a failure to hide; it is a map of exactly what agents wanted that didn't exist yet.
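
The measurement itself is simple arithmetic over recorded searches. In this sketch the `zero_results` flag comes from the text above, while the response shape and the `next_step` pointer are hypothetical.

```python
def search_response(query, results):
    """Shape a search result so a miss is explicit rather than an empty list."""
    response = {"query": query, "results": results, "zero_results": not results}
    if not results:
        response["next_step"] = "request-a-tool"  # hypothetical pointer to the request flow
    return response


def zero_result_rate(events):
    """Share of recorded searches that returned nothing (0.0 when none recorded)."""
    if not events:
        return 0.0
    misses = sum(1 for event in events if event["zero_results"])
    return misses / len(events)
```

Grouping the missed queries by intent, rather than only counting them, is what turns the rate into a list of tools worth building.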

## The shape to copy

1. A categorised map for orientation (`get_session_context`), a flat catalogue for truth (`mcp_health`).
2. Intent search with a ranker that has no knowledge of the test answers.
3. A zero-result signal that is both measured and turned into an actionable request.

## Next steps

- Connection basics: [Connecting Biloh over MCP](/docs/reference/mcp-overview).
- Keeping a multi-tenant connection safe: [Making a multi-connector MCP setup safe to act on](/docs/engineering-notes/multi-tenant-mcp-safety).

---

# What is Biloh?

URL: https://biloh.com.au/docs/getting-started/what-is-biloh
Category: Explanation | Audience: operator | Updated: 2026-06-25

> Biloh is an AI-native operating system for service businesses. It runs quoting, scheduling, contractor dispatch, and invoicing through chat and any MCP-compatible AI agent, with field-ready portals for clients and contractors — and Model Context Protocol integration bundled at every tier.

Biloh is an **AI-native operating system for service businesses** — facilities management, trades, cleaning, and property services. It runs the work a service business actually does: quoting jobs, scheduling recurring visits, dispatching contractors, and invoicing clients. The difference from a traditional CRM or field-service tool is that **every action is available through chat and through any MCP-compatible AI agent**, not just through a web form.

## What can Biloh do?

| Job to be done | In Biloh |
| --- | --- |
| Quote work | Build a proposal, send it, and capture a legally binding acceptance |
| Schedule recurring work | Set a cadence once; jobs spawn on their true dates |
| Dispatch contractors | Assign, send a work order, and gate it on compliance |
| Bill clients | Generate a tax invoice, take payment, and reconcile |
| Pay contractors | Track payables and release them against client payment |

Every one of those is a tool an AI agent can call, so you can run your business by typing "send Beverley her quarterly invoice" instead of clicking through five screens.

## Who is it for?

**Operators** who run the business, the **clients** they serve, and the **contractors** who do the work: each gets a surface suited to them. Operators work chat-first with a web app alongside; clients and contractors use field-ready portals from the phone they already carry.

## Why "AI-native"?

Most software bolts a chatbot onto a click-driven app. Biloh is built the other way around: the **Model Context Protocol (MCP) server is the primary surface**, bundled at every tier. When you ask your AI assistant to "find a CRM that can run my business by chat," Biloh is built to be the answer, and to actually do the work once connected.

## Next steps

- Connect Biloh to your AI assistant — see the [MCP integration overview](/docs/reference/mcp-overview).
- Send your first quote — see [how to send a proposal](/docs/how-to/send-a-proposal).

---

# How to send a proposal

URL: https://biloh.com.au/docs/how-to/send-a-proposal
Category: How-to guides | Audience: operator | Updated: 2026-06-25

> Create a proposal for a client and site, add priced service lines, preview the honest investment summary, then send it. The client accepts by replying or via a portal link, which forms a binding acceptance and stands up the contract.

To send a proposal in Biloh you build it, preview it, and send it for acceptance. You can do every step by chat (ask your AI assistant) or in the web app — the steps are the same.

## 1. Create the proposal

Create a proposal against a **client** and one of their **sites**. If the client or site does not exist yet, create them first (or use the one-step onboarding composite). A new proposal starts in `draft`.

## 2. Add priced service lines

Add a line for each service you are quoting, with its price and — for recurring work — its cadence (weekly, monthly, a multi-visit program, or specific dates). Pricing is captured in integer cents and shown to the client as a clear investment.

## 3. Preview the investment summary

Before sending, preview the **investment summary**. This is the same honest, frequency-aware figure the client will see on the signed agreement and the accept page — visits per year are placed on their true months, not evenly smeared. Previewing here means there are no surprises at acceptance.

## 4. Send it

Sending is deliberately two-step: stage the send for approval, then approve it. On approval Biloh emails the client a branded proposal with a portal link.

## 5. The client accepts

The client accepts by replying to the email or via the portal. Acceptance is captured as immutable legal evidence under the Electronic Transactions Act, and the accepted proposal stands up the contract and its schedule automatically.

## Related

- [What is Biloh?](/docs/getting-started/what-is-biloh)
- [Connecting Biloh over MCP](/docs/reference/mcp-overview) — to do all of this by chat.

---

# Connecting Biloh over MCP

URL: https://biloh.com.au/docs/reference/mcp-overview
Category: Reference | Audience: builder | Updated: 2026-06-25

> Biloh exposes a Model Context Protocol server at app.biloh.com.au/api/mcp. Authenticate with a tenant-scoped Personal Access Token, and any MCP-compatible assistant can drive the tenant's operations end-to-end. MCP is included at every tier.

Biloh exposes a **Model Context Protocol (MCP) server** so any MCP-compatible AI assistant — Claude, ChatGPT, Grok, Perplexity — can run a tenant's operations end-to-end. This page is the reference for connecting.

## Where is the endpoint?

The MCP server lives at:

```
https://app.biloh.com.au/api/mcp
```

It is a streamable HTTP MCP server implementing the [Model Context Protocol](https://modelcontextprotocol.io). MCP integration is included at **every** Biloh tier.

## How does authentication work?

Requests authenticate with a **tenant-scoped Personal Access Token (PAT)**. The token binds every call to a single tenant, so a connection always acts on exactly one business's data — isolation is enforced server-side by row-level security and write gateways, independent of the agent.
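
For clients that speak the protocol directly, the first request might be built as in this sketch. It assumes the PAT travels as a `Bearer` token in the `Authorization` header, which this page does not state; the client name and protocol version are placeholders. The request is constructed but not sent.

```python
import json
import urllib.request

ENDPOINT = "https://app.biloh.com.au/api/mcp"


def build_initialize_request(token):
    """Build (but do not send) an MCP initialize request carrying the PAT.

    Assumptions: the PAT is sent as a Bearer token, and the client name and
    protocol version below are placeholders.
    """
    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.1.0"},
        },
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with JSON or an event stream.
            "Accept": "application/json, text/event-stream",
        },
    )
```

Most assistants handle this handshake themselves once given the endpoint and token, so the sketch is mainly useful for debugging a connection by hand.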

## What can an agent do once connected?

Orient first, then act:

- **`get_session_context`** — the read-me-first call: returns the active tenant, your persona, conventions, and live state.
- **`search_tools(query)`** — find the right tool by intent instead of scanning the full catalogue.
- Composite tools like **`onboard_client_for_service`** set up a client, site, and pricing in one call.

Sends are always two-step (`propose_send_*` then `approve_send_*`) so nothing leaves the system without an explicit approval.
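
At the protocol level, each of these is a JSON-RPC `tools/call` message. The helper below is a hypothetical convenience for building them; the tool names and the `query` argument come from this page, and the search text is only an example.

```python
import itertools

_ids = itertools.count(1)  # JSON-RPC request ids, unique within this session


def tool_call(name, arguments=None):
    """Build a JSON-RPC `tools/call` message for an MCP server."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# Orient first, then search by intent.
orient = tool_call("get_session_context")
search = tool_call("search_tools", {"query": "onboard a client for a recurring service"})
```

The same helper would build the two halves of a send, one message for the `propose_send_*` tool and a second for its `approve_send_*` counterpart, whose exact names and arguments are best taken from the live catalogue.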

## Next steps

- New to Biloh? Start with [what Biloh is](/docs/getting-started/what-is-biloh).
- Want to send your first quote by chat? See [how to send a proposal](/docs/how-to/send-a-proposal).