AI Assistant
How Talome's agentic AI works — architecture, tools, memory, security, and conversation patterns.
This is not a chatbot. Talome's AI assistant is an autonomous agent with 230+ purpose-built tools that plans multi-step operations, selects the right tools for each step, executes against your real infrastructure, and reasons about failures to recover. When you say "set up a media stack," it doesn't hand you a checklist — it installs five apps, wires their APIs together, syncs indexers, verifies health, and reports back with URLs.
The difference between a chatbot and an agent: a chatbot answers questions. An agent does work.
Watch: The AI diagnosing and fixing a container issue in real-time
Architecture
Every message you send passes through a precise pipeline before a single tool executes.
1. System Prompt (Cached)
The system prompt is approximately 5,000 tokens and is cached using Anthropic's prompt caching. It contains the AI's identity, every behavioral rule, response formatting patterns, app-specific instructions (media volume conventions, Audiobookshelf library setup flow, arr wiring sequences), and the zero-config execution policy. Because it's cached, the first tokens of every conversation cost near zero — you only pay for new messages and tool results.
2. Dynamic Context Injection
Before the AI sees your message, Talome injects live context into the conversation:
- Memories — the top 10 most relevant memories about you, ranked by a composite score of recency, access frequency, and confidence
- Page context — which dashboard page you're on (the AI knows if you're looking at containers, media, or settings)
- Stack status — which feature stacks are installed and their current health
- Configured apps — which integrations have URLs and API keys set, so the AI knows what tools are available
3. Tool Selection (Keyword Routing + Domain Filtering)
Not all 230+ tools load for every message. The tool registry uses a two-layer filter:
Layer 1: Domain activation. Each tool belongs to a domain. A domain's tools only load when the corresponding app is configured in Settings. If you haven't set up Jellyfin, the 6 Jellyfin tools don't exist in the AI's context.
Layer 2: Message keyword routing. Even among active domains, tools are further filtered by scanning your message for domain-relevant keywords. If your message mentions "torrent" or "download speed," the qBittorrent domain loads. If it mentions "indexer" or "quality profile," the arr domain loads. If no optional domain matches (ambiguous message), all active tools load as a fallback.
This keeps the per-message tool count low — typically 80-120 tools instead of 230 — which directly improves the AI's tool selection accuracy.
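The two-layer filter can be sketched as a small function. This is an illustrative stand-in, not Talome's actual registry; the domain names, keyword lists, and the `selectTools` helper are hypothetical:

```typescript
// Hypothetical sketch of the two-layer tool filter described above.
type Domain = {
  name: string;
  configured: boolean; // Layer 1: is the app set up in Settings?
  keywords: string[];  // Layer 2: message terms that activate this domain
  tools: string[];
};

function selectTools(message: string, domains: Domain[]): string[] {
  // Layer 1: domain activation — unconfigured apps' tools never load.
  const active = domains.filter((d) => d.configured);

  // Layer 2: keyword routing — keep optional domains mentioned in the message.
  const lower = message.toLowerCase();
  const matched = active.filter(
    (d) => d.name === 'core' || d.keywords.some((k) => lower.includes(k)),
  );

  // Fallback: if no optional domain matched (ambiguous message), load all active tools.
  const chosen = matched.some((d) => d.name !== 'core') ? matched : active;
  return chosen.flatMap((d) => d.tools);
}
```

A message mentioning "torrent" would route in the qBittorrent tools while leaving an unconfigured Jellyfin domain invisible to the model.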
4. Security Gateway
Every tool execution passes through the security gateway, which enforces one of three modes:
| Mode | Read Tools | Modify Tools | Destructive Tools |
|---|---|---|---|
| Permissive | Execute freely | Execute freely | Execute freely |
| Cautious (default) | Execute freely | Execute freely | Require confirmed: true — the AI asks you first |
| Locked | Execute freely | Blocked | Blocked |
You can change the security mode in Settings > Security. The default "cautious" mode means the AI can read anything and modify anything, but destructive operations like uninstalling apps, deleting containers, or pruning Docker resources require your explicit confirmation.
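The mode table above reduces to a small decision function. A minimal sketch, assuming a `gate` helper and tier/mode names matching the table (the real gateway's implementation details are internal to Talome):

```typescript
// Minimal sketch of the security gateway's decision table.
type Tier = 'read' | 'modify' | 'destructive';
type Mode = 'permissive' | 'cautious' | 'locked';
type Decision = 'allow' | 'confirm' | 'block';

function gate(tier: Tier, mode: Mode, confirmed = false): Decision {
  if (tier === 'read') return 'allow';       // read tools run in every mode
  if (mode === 'locked') return 'block';     // locked blocks all writes
  if (mode === 'permissive') return 'allow'; // permissive allows everything
  // cautious: modify runs freely, destructive needs confirmed: true
  if (tier === 'modify') return 'allow';
  return confirmed ? 'allow' : 'confirm';
}
```

In the default cautious mode, `gate('destructive', 'cautious')` returns `'confirm'`, which is why the AI stops and asks before an uninstall or prune.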
5. Streaming Execution
The AI streams responses using the Vercel AI SDK's streamText() with a maximum of 10 tool calls per turn (stepCountIs(10)). Each tool call appears as a real-time execution card in the chat UI showing the tool name, parameters, and result. The AI processes tool results and continues reasoning — it can chain multiple tool calls in a single response to complete multi-step tasks.
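The step-limited chaining can be sketched in isolation. This is a simplified stand-in for the SDK's agentic loop, not the actual `streamText()` internals; the `runTurn`, `nextAction`, and `execute` shapes are illustrative:

```typescript
// Illustrative step-limited agent loop, mirroring the stepCountIs(10) cap:
// the model keeps requesting tools until it answers or hits the limit.
type Step = { tool: string; result: string };

async function runTurn(
  nextAction: (history: Step[]) => Promise<{ tool: string } | { done: string }>,
  execute: (tool: string) => Promise<string>,
  maxSteps = 10,
): Promise<{ steps: Step[]; reply: string }> {
  const steps: Step[] = [];
  while (steps.length < maxSteps) {
    const action = await nextAction(steps);
    if ('done' in action) return { steps, reply: action.done };
    // Each tool result is fed back so the model can chain further calls.
    steps.push({ tool: action.tool, result: await execute(action.tool) });
  }
  return { steps, reply: 'step limit reached; continuing next turn' };
}
```

Each iteration corresponds to one execution card in the chat UI; the cap is why long operations span multiple turns.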
Tool Domains
All 230+ tools are organized into 16 domains. Each domain activates when you configure the corresponding app in Settings:
| Domain | Tools | Activates When | What It Does |
|---|---|---|---|
| Core | ~113 | Always | Docker, system, apps, filesystem, groups, backups, memory, widgets, automations, self-improvement, compose, config files, notifications, settings, updates, shell, storage, GPU |
| Media | 5 | sonarr_url or radarr_url | Library browsing, search, downloads, calendar, media requests |
| Optimization | 9 | sonarr_url or radarr_url | Media analysis, transcoding, library health, batch optimization |
| Arr | 27 | sonarr_url, radarr_url, readarr_url, or prowlarr_url | Root folders, download clients, indexers, quality profiles, queue management, release search, monitoring |
| qBittorrent | 6 | qbittorrent_url | Version, preferences, torrents, download paths, speed limits |
| Jellyfin | 6 | jellyfin_url | Status, libraries, scanning, stats, API keys |
| Audiobookshelf | 13 | audiobookshelf_url | Libraries, items, search, progress tracking, scanning, audiobook search, download, and request |
| Overseerr | 7 | overseerr_url | Status, Jellyfin/Sonarr/Radarr configuration, requests, approval/decline |
| Plex | 5 | plex_url | Status, on deck, recently watched, mark watched/unwatched |
| Home Assistant | 5 | homeassistant_url | Status, entities, service calls, history, automations |
| Pi-hole | 5 | pihole_url | Stats, enable/disable, whitelist/blacklist |
| Vaultwarden | 4 | vaultwarden_url | Status, users, invitations, signup toggle |
| Proxy | 5 | proxy_enabled | Routes, TLS, reload |
| Tailscale | 3 | tailscale_auth_key | Setup, status, stop |
| mDNS | 4 | Always | Local DNS status, enable/disable, refresh |
| Ollama | 5 | ollama_url | Models, pull/delete, model info, running processes |
The MCP server always loads all tools regardless of configuration — each tool fails gracefully with a clear error if its app isn't set up.
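That graceful-failure pattern can be sketched as a guard around each tool body. The `guardConfigured` helper, the settings shape, and the error wording are illustrative assumptions, not Talome's source:

```typescript
// Sketch of the MCP-side graceful failure: every tool loads, but tools for
// unconfigured apps return a clear error instead of crashing.
type ToolResult = { ok: true; data: unknown } | { ok: false; error: string };

function guardConfigured(
  app: string,
  settings: Record<string, string | undefined>,
  run: (baseUrl: string) => unknown,
): ToolResult {
  const baseUrl = settings[`${app}_url`];
  if (!baseUrl) {
    return { ok: false, error: `${app} is not configured; set ${app}_url in Settings first.` };
  }
  return { ok: true, data: run(baseUrl) };
}
```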
Tool Tiers
Every tool has an assigned security tier that determines how the security gateway handles it:
Read Tier
Always execute, in any security mode. These tools only observe — they never change state.
Examples: list_containers, get_system_stats, search_apps, recall, arr_get_status, jellyfin_list_libraries, get_disk_usage, diagnose_app
Modify Tier
Execute freely in permissive and cautious modes. Blocked in locked mode. These tools change state but are generally reversible.
Examples: install_app, start_container, restart_app, wire_apps, set_app_env, remember, arr_add_download_client, create_automation
Destructive Tier
Execute freely in permissive mode. Require explicit confirmation in cautious mode (the AI asks before proceeding). Blocked in locked mode. These operations are difficult or impossible to reverse.
Examples: uninstall_app, prune_resources, cleanup_docker, delete_group, apply_change, remove_network, delete_file
Memory System
The AI remembers what you tell it — permanently. Memories persist in SQLite across conversations, sessions, and restarts.
Memory Types
| Type | Purpose | Example |
|---|---|---|
| Preference | How you like things configured | "I prefer Jellyfin over Plex" |
| Fact | Concrete details about your setup | "Media drive is mounted at /mnt/nas" |
| Context | Situational information | "Setting up a home lab for the first time" |
| Correction | Things the AI got wrong that it shouldn't repeat | "Port 8080 is taken by Apache" |
How Ranking Works
Each conversation turn, the top 10 memories are injected into the AI's context. The ranking algorithm scores every memory on three factors:
- Recency — more recently created or accessed memories score higher
- Access frequency (accessCount) — memories the AI references often score higher
- Confidence — memories with higher confidence scores (set at creation, boosted by confirmations) score higher
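A composite of those three factors might look like the following. The weights, the decay curve, and the `rankMemories` helper are assumptions for illustration; Talome's actual scoring function is internal:

```typescript
// Illustrative composite score over recency, access frequency, and confidence.
type Memory = { content: string; createdAt: number; accessCount: number; confidence: number };

function rankMemories(memories: Memory[], now: number, topN = 10): Memory[] {
  const score = (m: Memory) => {
    const ageDays = (now - m.createdAt) / 86_400_000;
    const recency = 1 / (1 + ageDays);           // newer memories score higher
    const frequency = Math.log1p(m.accessCount); // diminishing returns on repeat access
    return recency + frequency + m.confidence;   // equal weights (assumption)
  };
  return [...memories].sort((a, b) => score(b) - score(a)).slice(0, topN);
}
```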
When the AI stores a new memory, it checks existing memories for bigram similarity. If a new memory is more than 80% similar to an existing one, the duplicate is rejected — preventing the same fact from being stored repeatedly across conversations.
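Bigram similarity with an 80% cutoff can be sketched with a Dice coefficient over character pairs. The helper names here are hypothetical; the threshold matches the doc:

```typescript
// Hand-rolled bigram similarity (Dice coefficient over unique character pairs),
// sketching the 80% duplicate-rejection threshold.
function bigrams(s: string): Set<string> {
  const t = s.toLowerCase().replace(/\s+/g, ' ');
  const out = new Set<string>();
  for (let i = 0; i < t.length - 1; i++) out.add(t.slice(i, i + 2));
  return out;
}

function bigramSimilarity(a: string, b: string): number {
  const A = bigrams(a), B = bigrams(b);
  let shared = 0;
  for (const g of A) if (B.has(g)) shared++;
  return (2 * shared) / (A.size + B.size);
}

function isDuplicate(candidate: string, existing: string[], threshold = 0.8): boolean {
  return existing.some((e) => bigramSimilarity(candidate, e) > threshold);
}
```

A near-verbatim restatement of a stored fact scores close to 1.0 and is rejected, while an unrelated memory scores far below the cutoff.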
Memory Tools
| Tool | What It Does |
|---|---|
| remember | Store a new memory with type and confidence |
| recall | Search memories by keyword (semantic-ish, bigram matching) |
| forget | Delete a specific memory by ID |
| update_memory | Modify an existing memory's content or confidence |
| list_memories | Show all stored memories with metadata |
Memory in Practice
Memories are stored implicitly when you share information, and explicitly when you ask:
You: My media is stored at /Volumes/Media Vault/Media
AI: Got it. I'll remember that your media is at /Volumes/Media Vault/Media.
[remember: type=fact, content="Media files stored at /Volumes/Media Vault/Media"]
--- later, in a different conversation ---
You: Install Jellyfin
AI: [recall: "media"]
Installing Jellyfin with your media directory at /Volumes/Media Vault/Media.
[install_app: jellyfin, volumeMounts: { media: "/Volumes/Media Vault/Media" }]
The AI didn't ask where your media was the second time. It remembered.
Managing Memories
You: What do you remember about me?
AI: [list_memories]
Here's what I know:
1. Media files stored at /Volumes/Media Vault/Media (fact, high confidence)
2. Prefers Jellyfin over Plex (preference)
3. Port 8080 is used by Apache (correction)
4. DNS is handled by Pi-hole at 192.168.1.53 (fact)
...
You: Forget the thing about port 8080 — I freed that port.
AI: [forget: memory_id]
Done. I've forgotten that port 8080 was taken.
The Zero-Config Philosophy
Traditional self-hosted apps require you to install an app, then open its web UI, navigate to settings, find the right field, paste a URL or API key, save, restart, and repeat for every connected service. Talome's AI eliminates this entirely.
The system prompt contains a strict Config-First Execution Policy:
- Use dedicated configuration tools first (overseerr_configure_sonarr, arr_add_download_client, etc.)
- If a dedicated tool fails, fall back to config-file automation (read_app_config_file / write_app_config_file)
- Only ask the user for values that are truly unavailable (API keys they haven't provided)
- Never output "manual configuration required" if a tool or config-file path exists
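The policy is essentially a fallback chain. A minimal sketch, assuming a hypothetical `configFirst` orchestrator (the attempt functions stand in for dedicated tools and config-file writes):

```typescript
// Sketch of the Config-First Execution Policy as a fallback chain.
type Attempt = () => Promise<boolean>; // true = configuration succeeded

async function configFirst(dedicatedTool: Attempt, configFileFallback: Attempt): Promise<string> {
  if (await dedicatedTool()) return 'configured via dedicated tool';
  if (await configFileFallback()) return 'configured via config file';
  // Only now does the AI ask the user, and only for truly missing values.
  return 'need a value from the user (e.g. an API key)';
}
```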
This means the AI will never tell you to "go to Sonarr's settings page and add qBittorrent as a download client." It will call arr_add_download_client and do it.
Conversation Patterns
1. Single App Install
Found Pi-hole in the Talome store. Installing now.
pihole is running and healthy. Web UI at http://localhost:8089/admin, DNS on port 53 (TCP/UDP). Your Pi-hole admin password was auto-generated. You can find it in the container logs or reset it with "Show me the Pi-hole admin password." Want me to configure your server to use Pi-hole as its DNS resolver?
2. Multi-App Stack Orchestration
I will install Jellyfin, Sonarr, Radarr, Prowlarr, and qBittorrent, then wire them together.
All five apps are running and wired together. Jellyfin at :8096, Sonarr at :8989, Radarr at :7878, Prowlarr at :9696, qBittorrent at :8080. Sonarr and Radarr are configured with qBittorrent as download client, root folders at /mnt/media/TV and /mnt/media/Movies, and indexers synced from Prowlarr. Try: "Search for The Bear"
3. Diagnostics
Found the issue. Sonarr has 0 indexers configured, even though Prowlarr has 3 active indexers with healthy stats. The indexer sync is not set up. Fixing now.
Done. Sonarr now has 3 indexers synced from Prowlarr: NZBgeek (Usenet, 142 queries today), 1337x (Torrent, 89 queries), RARBG (Torrent, 67 queries). Try searching for a show now — it should find releases immediately.
4. System Monitoring
You: What's eating all my resources?
AI: [get_system_stats]
[get_container_stats]
Your server is under moderate load:
- CPU: 67% (4 cores)
- Memory: 5.2 GB / 8 GB (65%)
- Disk: 412 GB / 500 GB (82%)
Top containers by memory:
| Container | CPU | Memory |
|-----------|-----|--------|
| `jellyfin` | 34% | 1.8 GB |
| `plex` | 12% | 920 MB |
| `sonarr` | 8% | 340 MB |
| `radarr` | 6% | 310 MB |
| `postgres` | 4% | 280 MB |
`jellyfin` is the heaviest — 34% CPU and 1.8 GB RAM. That's
typical during active transcoding. If nobody's streaming right now,
it might be running a library scan.
Want me to check Jellyfin's active sessions or set resource limits on it?
5. App Creation
You: Create a recipe manager with ingredient search and meal planning
AI: [design_app_blueprint: "Recipe manager with ingredient search,
meal planning, PostgreSQL backend, clean dark UI"]
Blueprint generated:
- **Mealie** pattern adapted — two services (web + postgres)
- Port 9925 (auto-selected, no conflicts with your running apps)
- Volumes: ./data (recipes), ./postgres (database)
- Healthcheck: HTTP GET /api/health
Scaffolding the app now with Claude Code...
[scaffold generation runs in background]
Your app "Recipe Manager" is ready in Your Creations.
- 2 services: web app (Node.js 20) + PostgreSQL 16
- Features: ingredient search, weekly meal planner, recipe import
- Dark theme matching Talome's design system
Want me to install it?
6. Configuration
You: Change Sonarr's quality profile to prefer 1080p Bluray
AI: [arr_list_quality_profiles: sonarr]
Sonarr has 4 quality profiles. Here's what they allow:
| Profile | Top Quality | Cutoff |
|---------|------------|--------|
| Any | Remux-2160p | HDTV-720p |
| HD-1080p | Remux-1080p | Bluray-1080p |
| Ultra-HD | Remux-2160p | WEBDL-2160p |
| SD | DVD | SDTV |
"HD-1080p" with a Bluray-1080p cutoff matches what you want.
Applying it now:
[arr_apply_quality_profile: sonarr, profileName: "HD-1080p"]
Done. All new series in Sonarr will use the HD-1080p profile.
Existing series keep their current profiles — want me to change those too?
7. Cross-Integration Wiring
You: Connect Overseerr to everything
AI: [overseerr_get_status]
[list_configured_apps]
[arr_get_status: sonarr]
[arr_get_status: radarr]
[jellyfin_get_status]
Overseerr is running. I'll connect it to your configured apps.
[overseerr_configure_jellyfin]
[overseerr_configure_sonarr]
[overseerr_configure_radarr]
Overseerr is now connected to:
- **Jellyfin** — for user authentication and media availability
- **Sonarr** — for TV show requests (HD-1080p profile, /mnt/media/TV)
- **Radarr** — for movie requests (HD-1080p profile, /mnt/media/Movies)
Users can now request media at http://localhost:5055.
Requests will auto-route to the right *arr app and start downloading.
Want me to set up auto-approval so requests don't need manual review?
8. Self-Improvement
I found the issue. The container list page calls inspect_container for each container sequentially to get detailed stats. With 20+ containers, that is 20+ serial Docker API calls. Proposed fix: use list_containers with stats: true to get basic stats in a single API call, only call inspect_container for the detail view, and add a 5-second cache. This should reduce API calls from N to 1 and cut load time from ~4 seconds to under 200ms. Should I apply this change?
Change applied. Modified apps/core/src/routes/containers.ts and apps/dashboard/src/components/containers/container-list.tsx. The container list now fetches all stats in a single call. Refresh the containers page to see the difference.
Multi-Provider Support
Talome supports three AI providers:
| Provider | Default Model | Best For |
|---|---|---|
| Anthropic | Claude Haiku 4.5 | Best tool use, recommended default |
| OpenAI | GPT-5.3 | Alternative cloud option |
| Ollama | Local models | Fully offline, privacy-first |
Configure your provider in Settings > AI Provider. The system prompt and tool architecture work identically across all providers — only the model's reasoning quality varies.
What the AI Can Create
Beyond managing existing infrastructure, the AI assistant can create new resources through conversation:
- Custom tools -- define new tools at runtime via create_tool. Describe a capability (e.g., "check my UPS battery status via NUT") and the AI generates a working tool with Zod schema, execution function, and keyword triggers.
- Dashboard widgets -- create and configure widgets programmatically via create_widget_manifest. Describe what metrics or data you want to see and the AI builds a widget that appears on your dashboard.
- Automations -- describe a workflow in plain English and the AI generates the full automation with triggers, conditions, and steps via create_automation.
- App blueprints -- describe an app you want to build and the AI generates a complete Docker Compose stack with manifest, health checks, and optionally a full web UI via design_app_blueprint.
- Integration wiring -- the AI can configure cross-app connections (download clients, indexer sync, media server linking) through tools like wire_apps, overseerr_configure_sonarr, and arr_add_download_client.
- Codebase changes -- the AI can modify Talome's own TypeScript source code via the self-improvement pipeline (plan_change / apply_change), with automatic TypeScript validation and rollback.
Everything is conversational. You describe the intent, the AI builds the implementation.
Custom System Prompts
You can append custom instructions to the system prompt in Settings > AI. Common uses:
- Set a preferred language: "Always respond in Spanish"
- Define naming conventions: "Use kebab-case for all container names"
- Add domain knowledge: "My NAS is a Synology DS920+ at 192.168.1.100"
Custom prompts are appended after the default system prompt and do not override it.
Limitations
- No internet access — the AI works with your local system. It cannot browse the web, access external APIs, or download files from the internet (unless you've configured a web search tool).
- Destructive operations need confirmation — in the default "cautious" mode, operations like uninstalling apps, deleting files, or pruning Docker resources require you to confirm.
- 10 tool calls per turn — the AI can execute up to 10 tool calls in a single response. For operations requiring more steps, it completes across multiple turns.
- Zod validation on all inputs — every tool parameter is validated with a Zod schema. The AI cannot pass malformed data to any tool.
- No eval() — the AI never executes arbitrary code strings. All operations go through typed, validated tool definitions.
- Shell commands are restricted — run_shell only executes commands you explicitly request and logs every execution to the audit trail.
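The "validate before execute" guarantee can be illustrated without the Zod library itself. This hand-rolled validator is a stand-in for what a Zod schema enforces at each tool boundary; the `validateParams` helper and spec shape are hypothetical:

```typescript
// Hand-rolled stand-in for a Zod schema: reject malformed params
// before the tool body ever runs.
type ParamSpec = { type: 'string' | 'number' | 'boolean'; required: boolean };

function validateParams(
  spec: Record<string, ParamSpec>,
  input: Record<string, unknown>,
): { valid: true } | { valid: false; error: string } {
  for (const [key, rule] of Object.entries(spec)) {
    const value = input[key];
    if (value === undefined) {
      if (rule.required) return { valid: false, error: `missing required param: ${key}` };
      continue;
    }
    if (typeof value !== rule.type) {
      return { valid: false, error: `param ${key} must be a ${rule.type}` };
    }
  }
  return { valid: true };
}
```

In the real pipeline, a failed validation returns a typed error to the model instead of reaching Docker or the filesystem.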
Configuration
Configure AI providers, integrations, security modes, notifications, and system settings. Two paths — environment variables for initial setup, Settings UI for everything after.
Telegram & Discord Bots
Talk to your Talome server from Telegram or Discord -- full AI assistant access from your phone or desktop, no dashboard needed.