ADR-001: Dynamic Tool Loading

Why tools are loaded dynamically based on configured settings and message keywords.

Status: Accepted
Date: 2025-03-15
Applies to: apps/core/src/ai/tool-registry.ts, apps/core/src/ai/agent.ts

Context

Talome's AI assistant has 219 tools across 17 domains. Loading all of them into every dashboard chat conversation degrades tool selection accuracy: language models struggle to choose the right tool when presented with too many options. In testing, once the tool set grew beyond ~50 tools, the model increasingly selected the wrong tool or hallucinated tool names.

At the same time, every tool must remain available to the MCP server (used by Claude Code and other MCP clients), where models work with much larger context windows and select tools more reliably at scale.

The core tension is: dashboard chat needs a small, focused tool set for accurate selection, while MCP needs the complete tool set for maximum capability.

Decision

Tools are organized into domains. Each domain declares settingsKeys -- the settings that indicate the corresponding app is configured. The system loads tools differently depending on the consumer:

Dashboard chat uses a two-layer filtering strategy:

Settings-based domain activation: only domains whose settingsKeys have values in the database are considered active. A user without Sonarr configured never sees arr tools. Domains with empty settingsKeys arrays (core, mdns) are always active.
Keyword-based message routing: within active domains, the system matches the user's message against domain-specific keywords defined in DOMAIN_KEYWORDS. A message mentioning "torrent" loads core + qBittorrent tools, not the full active set. If no optional domain keywords match, the system falls back to loading all active tools as a catch-all for ambiguous messages.
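
The two layers can be sketched as follows. This is a minimal illustration, not the code in tool-registry.ts: the ToolDomain shape, the sample domains, the settings key names, the tool names, and the selectToolsForMessage helper are all hypothetical. It also assumes a domain counts as active when at least one of its settingsKeys has a non-empty value, which this ADR does not spell out.

```typescript
// Hypothetical shapes and sample data; the real types in tool-registry.ts are not shown in this ADR.
interface ToolDomain {
  name: string;
  settingsKeys: string[]; // empty array: domain is always active
  tools: string[]; // tool names stand in for real tool objects
}

const DOMAINS: ToolDomain[] = [
  { name: "core", settingsKeys: [], tools: ["core_tool_a", "core_tool_b"] },
  { name: "arr", settingsKeys: ["sonarr_url"], tools: ["arr_tool"] },
  { name: "qbittorrent", settingsKeys: ["qbittorrent_url"], tools: ["qbittorrent_tool"] },
];

const DOMAIN_KEYWORDS: Record<string, string[]> = {
  arr: ["sonarr", "series"],
  qbittorrent: ["torrent"],
};

// Layer 1: a domain is active when it has no settingsKeys or
// (assumption) at least one of its keys has a non-empty value.
function getActiveDomains(settings: Map<string, string>): ToolDomain[] {
  return DOMAINS.filter(
    (d) =>
      d.settingsKeys.length === 0 ||
      d.settingsKeys.some((key) => (settings.get(key) ?? "") !== "")
  );
}

// Layer 2: keep always-active domains plus active domains whose keywords
// appear in the message.
function selectToolsForMessage(message: string, settings: Map<string, string>): string[] {
  const active = getActiveDomains(settings);
  const text = message.toLowerCase();
  const matched = active.filter((d) =>
    (DOMAIN_KEYWORDS[d.name] ?? []).some((kw) => text.includes(kw))
  );
  // Catch-all: no optional domain matched, so load every active domain.
  const selected =
    matched.length === 0
      ? active
      : active.filter((d) => d.settingsKeys.length === 0 || matched.indexOf(d) !== -1);
  const tools: string[] = [];
  for (const d of selected) tools.push(...d.tools);
  return tools;
}
```

With both sonarr_url and qbittorrent_url set in this sketch, "pause my torrent" selects the core and qBittorrent tools only, while "hello" matches nothing and falls back to all three domains.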

MCP server uses getAllRegisteredTools() which returns every tool from every domain, regardless of settings. Each tool's execute function handles the "not configured" case gracefully.
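
As an illustration of a tool handling the "not configured" case itself, here is a minimal sketch. The ToolResult shape, the makeSonarrStatusTool factory, the tool name, and the sonarr_url key are all assumptions made for the example, and execute is kept synchronous for brevity; the real tools may differ on every one of these points.

```typescript
// Hypothetical result shape; the ADR does not specify what tools return.
type ToolResult = { ok: true; data: unknown } | { ok: false; error: string };

// Hypothetical tool factory. getSetting stands in for whatever settings
// lookup the real tools use.
function makeSonarrStatusTool(getSetting: (key: string) => string | undefined) {
  return {
    name: "sonarr_status",
    execute(): ToolResult {
      const url = getSetting("sonarr_url");
      if (!url) {
        // Return a readable message instead of throwing, so an MCP client
        // that calls an unconfigured tool gets an actionable answer.
        return { ok: false, error: "Sonarr is not configured: sonarr_url is missing." };
      }
      // A real tool would call the Sonarr API here; the sketch echoes the URL.
      return { ok: true, data: { url } };
    },
  };
}
```

Because the check lives inside execute, the MCP path can expose the tool unconditionally without a settings filter in front of it.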

Implementation

The registry is in tool-registry.ts:

registerDomain(domain)          -- registers a domain with tools, settings keys, and tiers
getActiveRegisteredTools()      -- returns tools from settings-active domains
getToolsForMessage(message)     -- returns keyword-filtered subset for a specific message
getAllRegisteredTools()          -- returns all tools from all domains (for MCP)
getActiveDomainNames()          -- returns the set of currently active domain names

Settings are cached for 10 seconds (SETTINGS_CACHE_TTL_MS) to avoid N database queries per message. The cache is explicitly invalidated when settings change via invalidateSettingsCache().
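
A TTL cache with explicit invalidation can be sketched like this. SETTINGS_CACHE_TTL_MS comes from the text above; createSettingsCache, the load callback, and the injectable now clock are hypothetical names introduced for the example, and the real module may structure this differently.

```typescript
const SETTINGS_CACHE_TTL_MS = 10_000;

type Settings = Map<string, string>;

// load: reads all settings in one query (hypothetical callback).
// now: injectable clock so the expiry logic can be tested without waiting.
function createSettingsCache(load: () => Settings, now: () => number = Date.now) {
  let entry: { value: Settings; loadedAt: number } | null = null;
  return {
    get(): Settings {
      // Reload when nothing is cached or the cached entry is older than the TTL.
      if (entry === null || now() - entry.loadedAt >= SETTINGS_CACHE_TTL_MS) {
        entry = { value: load(), loadedAt: now() };
      }
      return entry.value;
    },
    // Drop the cached entry so the next get() reloads immediately.
    invalidate(): void {
      entry = null;
    },
  };
}
```

In this sketch, any number of get() calls inside one 10-second window trigger a single load, and calling invalidate() after a settings write makes the next message see the new values at once.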

Domain keywords are defined in a DOMAIN_KEYWORDS map. Each domain maps to an array of lowercase strings matched case-insensitively against the user's message. The matching uses String.prototype.includes(), i.e. plain substring matching, for simplicity and speed.
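
The matching behavior described above can be illustrated with a short sketch. The keyword entries and the matchDomains helper are hypothetical, and lowercasing the message before the comparison is an assumption about how the case-insensitive match is achieved.

```typescript
// Hypothetical sample entries; the real DOMAIN_KEYWORDS map is larger.
const DOMAIN_KEYWORDS: Record<string, string[]> = {
  qbittorrent: ["torrent", "qbittorrent"],
  homeassistant: ["light", "thermostat"],
};

// Returns the names of domains with at least one keyword contained in the message.
function matchDomains(message: string): string[] {
  // Keywords are stored lowercase, so lowercasing the message once
  // gives a case-insensitive comparison.
  const text = message.toLowerCase();
  return Object.keys(DOMAIN_KEYWORDS).filter((domain) =>
    (DOMAIN_KEYWORDS[domain] ?? []).some((kw) => text.includes(kw))
  );
}
```

Since includes() tests for substrings rather than whole words, matchDomains("Pause the Torrent") returns ["qbittorrent"], but matchDomains("highlight the largest file") returns ["homeassistant"] because "highlight" contains "light".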

Tool Count by Layer

Layer	Typical Tool Count	When
All tools	219	MCP server, always
Active domains	80-150	Dashboard chat, settings-filtered
Keyword-routed	50-80	Dashboard chat, message-filtered
Core only	~117	Dashboard with no integrations configured

Consequences

Benefits:

Dashboard chat typically sees 50-80 tools instead of 219, significantly improving tool selection accuracy
Zero-config activation: configure Sonarr's URL and arr tools appear automatically on the next message
MCP server gets the full tool set without any filtering, keeping Claude Code fully capable
Keyword routing reduces tool count further for focused messages
Settings cache prevents database query overhead (single query per 10-second window instead of per tool per message)
New domains with no keywords fall through to the full active set, so nothing breaks

Tradeoffs:

Adding a new domain requires updating agent.ts (registration) and optionally tool-registry.ts (keywords)
The keyword list is manually maintained. When a message matches no keywords because the relevant one is missing, the fallback loads the full active set, which wastes context but does not break functionality.
The 10-second cache means a settings change that does not go through invalidateSettingsCache() can take up to 10 seconds to activate new tools
Keyword matching is substring-based and may match unintended words (e.g., "light" matching Home Assistant when the user means something else). False positives load extra tools but don't cause incorrect behavior.

Alternatives Considered

Embedding-based tool selection: pre-compute embeddings for tool descriptions and retrieve the top-K most relevant tools per message. Rejected because it adds complexity (embedding model dependency, startup latency for embedding computation) and the keyword approach is simpler, faster, and sufficient for the current scale.
Always load all tools: let the model figure it out. Rejected because testing showed unacceptable tool selection accuracy degradation beyond ~50 tools in the dashboard chat context window.
User-selectable tool profiles: let users manually choose which domains to load. Rejected because it requires manual configuration, creates a barrier for new users, and the settings-based approach achieves the same result automatically.
LLM-based tool routing: use a fast model to pre-classify the message and select tools. Rejected because it adds a round-trip to an LLM for every message, increasing latency and cost for marginal accuracy improvement.