Migration design · research artifact

From Plans to Agents.
A unified AI runtime for Loquent.

Loquent already has most of an agent system; it's just spread across eight modules under different names. This document proposes a single AiAgent abstraction that owns every AI operation in the platform: text suggestions, transcription, call analysis, contact memory, reports, dashboard, task extraction, autonomous campaigns. Personas, memory, goals, rules, tools, access. Inspired by Hermes Agent. Built to learn.

Audience · You + future coding agents
Status · Research; awaiting your direction on §13
First PR · Phase 0 only (ships in a week)

TL;DR — the nine decisions that shape everything

Build a new generic AiAgent runtime in src/mods/ai_agent/. Don't rename Plan in place.

The voice agent at mods::agent already owns the name. Use AiAgent internally; show "Agents" in UI.

Coexist with feature flags. Plans keep running. Each retargeting rolls out per org with a 30-day cleanup tail.

Phase 0 ships schema + types + admin endpoint. Phase 1 pilots Dashboard Briefing (lowest blast radius).

6 dimensions per agent: Persona, Memory, Goals, Rules, Tools, Access. Hermes-style profile isolation.

Self-evolution blends Generative Agents reflection + Reflexion lessons + Voyager skill library.

Observability follows OpenTelemetry GenAI semantic conventions on ai_agent_log. OTel-ready day one.

3-tier rules: Platform > Org > Agent. Platform rules are immutable; ship via code-reviewed migration.

Reflection is opt-in per org (cost discipline). Defaults stay on cheap models with hard budgets.

§1 Context: why we're doing this

Loquent's AI capabilities live in eight disconnected places. Each has its own prompt-building, model resolution, tool list, logging shape, notification path, and permission story. There is no shared identity an end user can name, configure, or watch evolve.

8 AI modules with their own flow

26 AiUsageFeature variants

71 files reference mods::plan

54 files reference mods::agent (voice)

files reference mods::text_agent

unified persona or memory model

What an org owner experiences today

They see a dashboard briefing, a daily report email, three reply suggestions on every inbound message, and (if they're using campaigns) a Plan timeline. None of those four AI surfaces are aware of each other. The dashboard briefing doesn't know the campaign agent has been pestering a contact about something specific. The text-reply suggestions don't know what the daily report told the owner this morning.

What we want them to experience

One mental model: "These are my agents. They have personalities, they follow rules I taught them, and they're learning from how I use them." Concretely:

One identity per agent (org-scoped, kind-tagged, persona on file)
One timeline showing every operation, with cost and latency per step
One rule system where platform rules sit beneath org rules, which sit beneath per-agent rules
One memory attribution: every contact note knows which agent and which run wrote it
One learning loop: daily reflections → weekly skills → monthly distillation

§2 Research summary

2.1 Hermes Agent — the source of inspiration

You mentioned Hermes Agent as the model. The reference is Nous Research's Hermes Agent — a self-improving multi-agent framework where every agent is a profile: a fully isolated unit with its own configuration, persona (SOUL.md), memory database, tool list, and cron jobs. Profiles connect independently to Telegram, Discord, Slack, WhatsApp, Signal, Email. Each is a complete CLI entity with its own personality.

What we steal from Hermes

The SOUL.md pattern — codifying persona as a first-class, shareable record. The isolated-profile concept — each agent is a complete identity, not a function call. Loquent's adaptation: multi-tenant SaaS where org boundaries matter and agents share infrastructure.

2.2 Agent anatomy across modern frameworks

Dimension	Hermes	CrewAI	LangGraph	Letta (MemGPT)
Persona	`SOUL.md` file	`role`, `goal`, `backstory`	System prompt in node	Agent identity + persistent memory
Memory	File DB + session history	Short-term + opt-in long-term	Explicit state object	3-tier: Core / Recall / Archival
Goals	SOUL.md + cron	Tasks in a Crew	Graph nodes (implicit)	Memory-driven
Rules	SOUL.md + env	System prompt + tools	Node LLM instruction	Constitutional layer (WIP)
Tools	Function-calling + skill library	Per-agent tool list	Tools as graph nodes	Tool calls within memory
Access	Gateway channel + env	Task-scoped	Node-level (implicit)	Function signatures

None of these frameworks has built-in multi-tenant access control or organization-scoped rules. Loquent has to design this layer itself.

2.3 Self-evolution — three mechanisms we'll blend

Method	Trigger	Storage	How Loquent uses it
Generative Agents (Park et al., 2023)	Importance threshold (~2–3×/day)	Time-indexed text + embeddings	Daily/weekly/monthly `ai_agent_reflection` rows; meta-LLM digests the log
Reflexion (Shinn et al., 2023)	External feedback / failure	Episodic memory buffer	`ai_agent_lesson` rows: one-line verbal learnings from failed runs + user feedback
Voyager (Wang et al., 2023)	Task success	Indexed code/prompts in vector DB	`ai_agent_skill` rows: extracted prompt fragments after N successes; retrieved into future runs

2.4 Observability — OpenTelemetry GenAI conventions

Since 2024, OpenTelemetry has defined semantic conventions for GenAI tracing. It is an emerging industry standard that tools such as Langfuse, LangSmith, Helicone, and Traceloop can share. We adopt it in ai_agent_log from day one so a future export to any tracing backend is a config change, not a refactor.

Loquent field	OTel attribute	Example
`ai_agent_run.trace_id`	trace ID (span context)	uuid
`ai_agent_log.span_id`	`gen_ai.*` span id	uuid
`ai_agent_log.parent_span_id`	parent span ID (span context)	uuid
`ai_agent.kind`	custom: `agent.kind`	text_reply_agent
`ai_usage_log.model`	`gen_ai.request.model`	claude-opus-4-7
`ai_usage_log.input_tokens`	`gen_ai.usage.input_tokens`	1500
`ai_usage_log.cached_tokens`	`gen_ai.usage.cache_read_tokens`	200
computed	custom: `cost_cents`	0.18
`ai_agent_log.duration_ms`	span duration	850
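
To make the mapping concrete, here is a minimal sketch of how one usage row could be turned into attribute pairs for export. The struct and function names (`UsageRow`, `to_otel_attributes`) are hypothetical, the attribute keys are copied from the table above, and the real schema may well differ.

```rust
/// Hypothetical in-memory view of one `ai_usage_log` row (names assumed for illustration).
pub struct UsageRow {
    pub model: String,
    pub input_tokens: u64,
    pub cached_tokens: u64,
    pub agent_kind: String,
}

/// Map a usage row to (attribute key, value) pairs using the keys from the table above.
pub fn to_otel_attributes(row: &UsageRow) -> Vec<(&'static str, String)> {
    vec![
        ("gen_ai.request.model", row.model.clone()),
        ("gen_ai.usage.input_tokens", row.input_tokens.to_string()),
        ("gen_ai.usage.cache_read_tokens", row.cached_tokens.to_string()),
        // Custom (non-standard) attribute, as marked in the table.
        ("agent.kind", row.agent_kind.clone()),
    ]
}
```

Keeping the keys in one function means a later rename in the conventions touches a single place.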

§3 Current state in Loquent

Three modules already do most of what we need. They just don't know about each other.

The Plan system — already 90% an agent runtime

src/mods/plan/ (71 file references) is a sophisticated autonomous-execution framework. It has a typed tool-call log, an 8-state JSONB state machine, a cron-polled executor, contact assignments, autopilot gating, approval gates, re-enrollment policy, per-template model override. The hard problems are solved here.

The 8 plan states (we reuse this exact shape for `AiAgentRunState`)

PendingReview: user must approve before running
StandBy { next_execution_at }: waiting for cron to pick up
AwaitingInput: paused on AskUser / approval
Executing: LLM loop is active
Paused: manually paused
Completed { completed_at }
Failed { error_message }
Stopped: user-cancelled
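
As a rough sketch of what reusing this shape could look like, the enum below mirrors the eight states. The timestamp type (plain Unix seconds) and the two helper methods are assumptions for illustration, not the existing `PlanState` code.

```rust
/// Sketch of `AiAgentRunState`, mirroring the eight plan states listed above.
/// Timestamps are plain Unix seconds here; the real column types are an assumption.
#[derive(Debug, Clone, PartialEq)]
pub enum AiAgentRunState {
    PendingReview,
    StandBy { next_execution_at: u64 },
    AwaitingInput,
    Executing,
    Paused,
    Completed { completed_at: u64 },
    Failed { error_message: String },
    Stopped,
}

impl AiAgentRunState {
    /// Terminal states never transition again (hypothetical helper).
    pub fn is_terminal(&self) -> bool {
        matches!(
            self,
            Self::Completed { .. } | Self::Failed { .. } | Self::Stopped
        )
    }

    /// Only StandBy runs whose scheduled time has arrived are eligible for the cron poller.
    pub fn is_due(&self, now: u64) -> bool {
        matches!(self, Self::StandBy { next_execution_at } if *next_execution_at <= now)
    }
}
```

Carrying data inside the variants (rather than in sibling columns) keeps impossible combinations, such as a `Completed` run without a completion time, unrepresentable.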

The 13 typed tool variants (we keep this typed-per-kind pattern)

SendEmail, SendSms, ListPlanContacts, GetContactDetails, GetContactNotes, WriteInteractionNote, UpdateSystemNote, GetConversationHistory, UpdateContact, AskUser, CompletePlan, FailPlan, ScheduleNextExecution

The Text Agent — already a separate module

src/mods/text_agent/ generates 3 reply suggestions (high, medium, and low confidence) per inbound message. It has its own text_agent table (purpose, tier, model, escalation_instructions, restricted_topics, temperature, knowledge_base_ids) and a text_agent_suggestion table. Critical-path code: generate_text_agent_suggestions_service.rs:11.

The Voice Agent — naming collision risk

Critical: mods::agent is already taken

The voice/realtime agent module owns the Agent type, AgentResource permission, the phone_number.agent_id FK, and the agent table — 54 file references. Any new "Agent" abstraction that reuses the bare name will produce ambiguous imports, compile errors, and confused code reviews. We use AiAgent internally and rename voice → VoiceAgent as a late, optional phase once the new runtime is stable.

The AI infrastructure — already well-organized

src/mods/ai/ wraps aisdk + OpenRouter. The ai_usage_log table + spawn_log_ai_usage(AiUsageEntry) helper at log_ai_usage_service.rs:13 capture every AI call's tokens, cost, model, provider, latency. 26 AiUsageFeature variants already cover every call site (TextAgentSuggestions, DashboardBriefing, AnalyzeCall, SummarizeCall, UpdateContactMemory, GenerateReport, ExtractTasks, RealtimeTurn, …).

No new AiUsageFeature variants needed

Each AiAgentKind maps to an existing feature variant. Billing tier matrix stays stable. Admin dashboards keep working. The enum is the integration seam — don't grow it.
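
One way to hold that seam in code is an exhaustive `match` from kind to feature. The sketch below uses a subset of the kind names from the §4 lineup and feature names mentioned above; the specific pairings are our assumption, not confirmed mappings.

```rust
/// Subset of the existing `AiUsageFeature` variants named in this document.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AiUsageFeature {
    TextAgentSuggestions,
    DashboardBriefing,
    AnalyzeCall,
    UpdateContactMemory,
    GenerateReport,
    ExtractTasks,
}

/// Subset of the proposed agent kinds from the launch lineup.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AiAgentKind {
    DashboardBriefingAgent,
    TextReplyAgent,
    CustomReportAgent,
    AnalyzerAgent,
    TaskExtractionAgent,
    ContactMemoryAgent,
}

impl AiAgentKind {
    /// Assumed pairing. The match has no wildcard arm, so adding a kind
    /// without choosing a feature fails to compile.
    pub fn usage_feature(self) -> AiUsageFeature {
        match self {
            Self::DashboardBriefingAgent => AiUsageFeature::DashboardBriefing,
            Self::TextReplyAgent => AiUsageFeature::TextAgentSuggestions,
            Self::CustomReportAgent => AiUsageFeature::GenerateReport,
            Self::AnalyzerAgent => AiUsageFeature::AnalyzeCall,
            Self::TaskExtractionAgent => AiUsageFeature::ExtractTasks,
            Self::ContactMemoryAgent => AiUsageFeature::UpdateContactMemory,
        }
    }
}
```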

§4 Target architecture

The 6 dimensions of an `AiAgent`

┌────────────────────────────────────────────────────────────┐
│                          AiAgent                           │
├────────────────────────────────────────────────────────────┤
│  Persona  (SOUL.md-equivalent: tone, voice, identity)      │
│  Memory   (short-term scratchpad + long-term facts)        │
│  Goals    (current run goals + persistent objectives)      │
│  Rules    (per-agent; platform/org layers added at run)    │
│  Tools    (allowlist; resolved at run-time)                │
│  Access   (read/write scopes; defaults to triggering user) │
└────────────────────────────────────────────────────────────┘

Agent kinds — the launch lineup

Kind	Phase	What it does	Replaces
`DashboardBriefingAgent`	P1	Daily summary card on the dashboard	`generate_dashboard_briefing_service`
`TextReplyAgent`	P2	3 suggestions per inbound message	`mods::text_agent`
`CustomReportAgent`	P3	Scheduled digest (leads, calls, messages, tasks)	`mods::report`
`AutonomousCampaignAgent`	P4	Multi-step contact campaigns	`mods::plan` executor
`AnalyzerAgent`	P8	Call analysis (custom analyzers)	`mods::analyzer`
`TaskExtractionAgent`	P8	Extract todos from calls/messages	`create_tasks_from_call_service`
`AutoTagAgent`	P8	Tag contacts after call	`auto_tag_contact_from_call_service`
`ContactEnrichmentAgent`	P8	Enrich contact fields from convo	`enrich_contact_service`
`ContactMemoryAgent`	P8	Maintain contact memory notes	`update_contact_memory_service`
`AssistantAgent`	P8	In-app assistant chat	`mods::assistant`
`ReflectionAgent`	P9	Meta-agent that distills other agents' logs	(new)

Naming: internal vs. UI

Internal (Rust code): AiAgent, mods::ai_agent, ai_agent table, AiAgentResource permission. Disambiguates from existing voice Agent.
User-facing (UI copy): "Agents" everywhere. Voice agents become "Voice Agents". Owners see one mental model.

§5 Data model

10 new tables. 4 existing tables augmented with nullable FKs. No drops in the migration window.

ER diagram

                        organization (existing)
                                   │
                                   ▼
                               ai_agent
                       ┌───────────┴───────────┐
                       ▼                       ▼
                 ai_agent_run          ai_agent_schedule
                       │
     ┌─────────────────┼────────────────┐
     ▼                 ▼                ▼
ai_agent_log    ai_agent_memory    ai_usage_log (existing)
 (timeline)     (working+long)
     │
     ▼
ai_agent_reflection / ai_agent_skill / ai_agent_lesson
(rolled up from logs)

Existing tables augmented with FK:
  plan.ai_agent_id → ai_agent.id (autonomous campaign)
  plan_template.ai_agent_id → ai_agent.id
  text_agent.ai_agent_id → ai_agent.id (text reply)
  contact_note.written_by_ai_agent_run_id → ai_agent_run.id

The big four — what they store, why they exist

1. `ai_agent` — the identity (the SOUL.md row)

One row per agent. Org-scoped. Holds: kind, name, slug, is_system_default, is_active; JSONB fields for persona (tone/voice/signature/traits/soul_md), goals, rules_override, tools_allowlist, access_scope, budget; plus model_override, enable_reflection, enable_autopilot.

Indexes: (organization_id, kind, slug) UNIQUE; (kind) for dispatch fan-out; partial on active rows.

2. `ai_agent_run` — one execution

Has state (the AiAgentRunState enum JSONB, same shape as PlanState), parent_run_id for orchestration chains, trigger_source JSONB (Manual { user_id } | Cron { schedule_id } | Webhook { entity, entity_id } | Triggered { by_run_id }), input_payload, output_payload, total cost/tokens/tool-calls, and trace_id for OTel.

3. `ai_agent_log` — unified, OTel-compatible timeline

The single most important new table. Every operation an agent performs becomes one row. Six kinds:

`kind`	When written	Payload
`system_event`	State transition, schedule, retry, budget enforced	event, prev_state, new_state, reason
`thought`	LLM produced reasoning text	text, confidence
`llm_generation`	Every aisdk call	provider, model, finish_reason, summaries + FK to `ai_usage_log`
`tool_call`	Every typed tool invocation	`ToolCallVariant<I,O>` per kind (typed I/O)
`reflection`	A reflection cycle ran	scope, period, source_log_ids[]
`failover`	New path failed, legacy path took over	from_path, to_path, reason
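
A possible in-code shape for these six kinds is a single enum with one variant per row of the table. This is a sketch with trimmed payloads: the field types are assumptions, and the tool-call payload is flattened to strings here rather than the typed `ToolCallVariant<I,O>` the table describes.

```rust
/// Sketch of the six `ai_agent_log.kind` values with simplified payloads.
#[derive(Debug, Clone, PartialEq)]
pub enum AiAgentLogEntry {
    SystemEvent { event: String, prev_state: String, new_state: String, reason: String },
    Thought { text: String, confidence: f32 },
    LlmGeneration { provider: String, model: String, finish_reason: String },
    ToolCall { tool: String, input: String, output: String },
    Reflection { scope: String, period: String, source_log_ids: Vec<u64> },
    Failover { from_path: String, to_path: String, reason: String },
}

impl AiAgentLogEntry {
    /// The string stored in the `kind` column.
    pub fn kind(&self) -> &'static str {
        match self {
            Self::SystemEvent { .. } => "system_event",
            Self::Thought { .. } => "thought",
            Self::LlmGeneration { .. } => "llm_generation",
            Self::ToolCall { .. } => "tool_call",
            Self::Reflection { .. } => "reflection",
            Self::Failover { .. } => "failover",
        }
    }

    /// True for the kinds a cost dashboard cares about (LLM generations and tool calls).
    pub fn counts_toward_cost(&self) -> bool {
        matches!(self, Self::LlmGeneration { .. } | Self::ToolCall { .. })
    }
}
```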

Indexes for fast analysis AND reflection pipeline: (ai_agent_run_id, created_at) primary for timeline rendering; (ai_agent_id, created_at DESC) for "recent activity"; (organization_id, created_at DESC) partial on llm_generation|tool_call for cost dashboards; GIN on entry for JSONB attribute search.

4. `ai_agent_schedule` — cron-driven runs

One row per scheduled agent. cron_spec + timezone + next_run_at + is_enabled + config_payload JSONB (recipients, filters — kind-specific). A single run_due_ai_agents_job polls every minute with SELECT … FOR UPDATE SKIP LOCKED and dispatches.
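
The selection step of that poll can be sketched in memory as follows. The struct is a hypothetical subset of the `ai_agent_schedule` columns, timestamps are simplified to Unix seconds, and row locking is deliberately left out because it belongs to the database query.

```rust
/// Hypothetical in-memory view of an `ai_agent_schedule` row.
#[derive(Debug, Clone, PartialEq)]
pub struct AiAgentSchedule {
    pub id: u64,
    pub next_run_at: u64, // Unix seconds; the real column type is an assumption
    pub is_enabled: bool,
}

/// Return the ids of schedules the poller should dispatch at `now`, oldest first.
/// Row locking (`FOR UPDATE SKIP LOCKED`) is a database concern and is not modelled here.
pub fn due_schedule_ids(schedules: &[AiAgentSchedule], now: u64) -> Vec<u64> {
    let mut due: Vec<&AiAgentSchedule> = schedules
        .iter()
        .filter(|s| s.is_enabled && s.next_run_at <= now)
        .collect();
    // Stable sort: schedules with equal times keep their input order.
    due.sort_by_key(|s| s.next_run_at);
    due.into_iter().map(|s| s.id).collect()
}
```

Dispatching oldest first means a backlog after downtime drains in the order runs were originally due.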

The evolution tables (Phase 9)

Table	What it stores
`ai_agent_memory`	Short-term scratchpad + long-term facts. JSONB `value`; nullable `embedding` (pgvector deferred)
`ai_agent_reflection`	Daily/weekly/monthly digests. `content_markdown`, `key_insights` JSONB, `source_run_ids`, tokens consumed
`ai_agent_skill`	Voyager-style growing library. `prompt_fragment`, `trigger_condition`, `success_count`, `failure_count`, admin curation flags
`ai_agent_lesson`	Reflexion verbal lessons. One-line learning + provenance FK to failure run + optional user feedback

System rules

platform_ai_rule — system-wide immutable rules (slug, applies_to_kinds JSONB, rule_text, priority). Edited only by code-reviewed migrations. The existing organization_ai_rule table becomes the org tier.

Why JSONB instead of separate persona/goal/rule tables

We don't proliferate side tables until a dimension proves it needs first-class queryability. Persona, goals, agent-rules, tools-allowlist, access-scope, and budget all start as JSONB columns on ai_agent. If a future requirement (e.g., "list every agent that has rule X") needs a structured query, we promote that one dimension to a table — not before.

§6 Migration phases

Twelve phases, numbered 0 through 11 (the last is optional). The first ships in a week with zero behavior change. The next three retarget existing modules. Phase 4 attaches Plans to the new runtime structurally without changing the executor.

PHASE 0 Foundations
M · 1 week · first PR

Land ai_agent, ai_agent_run, ai_agent_log tables + Rust types + generic run_ai_agent executor + admin-only debug endpoint. Zero behavior change. Adds the AiAgentResource permission.
PHASE 1 Dashboard Briefing Agent (pilot)
M · lowest blast radius

Retarget generate_dashboard_briefing_service to run via the new runtime. Failures are cosmetic (stale briefing) — perfect first pilot. Per-org flag, A/B parity test, hard token budget.
PHASE 2 Text Reply Agent
L · structured output

Refactor mods::text_agent to back its persona/rules on ai_agent. Critical path (every inbound message). Auto-fallback to legacy on schema parse failure. 10% rollout watched for a week before 100%.
PHASE 3 Custom Report Agent + schedule dispatcher
L · cron consolidation

Replace SendDailyReportJob with RunDueAiAgentsJob driven by ai_agent_schedule. Mutually-exclusive flags prevent duplicate sends. 7-day shadow-run before cutover.
PHASE 4 Autonomous Campaign Agent (attach plans)
XL · structural only

Every plan + plan_template gets an ai_agent_id FK (kind=AutonomousCampaignAgent). Plan log dual-writes to ai_agent_log. Executor body unchanged. Plans-specific bookkeeping (contacts, actions, re-enrollment) stays where it is.
PHASE 5 Seed default agents on signup
S · 2–3 days

New seed_default_agents_service called from finalize_signup_service:174. Creates DashboardBriefing + CustomReport agents. Non-blocking like existing notification-pref seeder.
PHASE 6 Layered rules (Platform > Org > Agent)
M

3-tier rule injection via build_layered_rules. Platform rules immutable; ship via code-reviewed migration. Every run snapshots effective rules in its log for auditability.
PHASE 7 Memory unification
M

All contact-memory writes stamped with ai_agent_run_id. New ai_agent_memory table for short-term scratchpad + long-term facts. pgvector deferred (JSONB + GIN until proven needed).
PHASE 8 Onboard remaining AI ops
XL · 6 small PRs · parallelizable

One PR per kind: AnalyzerAgent, TaskExtractionAgent, AutoTagAgent, ContactEnrichmentAgent, ContactMemoryAgent (messages batch), AssistantAgent. Each is a drop-in replacement of the inline aisdk call.
PHASE 9 Self-evolution pipeline
XL · opt-in

Daily/weekly/monthly reflection cron + skill extraction (Voyager) + lesson capture (Reflexion). ReflectionAgent kind. UI at /agents/{id}/evolution. Opt-in per org due to cost.
PHASE 10 Budgets, throttling, observability UI
L

Per-agent + per-org budget enforcement before every LLM call. Unified timeline view at /agents/{id}/runs/{run_id}. Cost/latency/success-rate metrics dashboard.
PHASE 11 Voice agent rename (optional, late)
L · mechanical

Resolve long-term collision: Agent → VoiceAgent. Only after the new runtime is stable for ≥6 months. Pure rename PR with cargo check guardrails.

Top 5 risks & mitigations

#	Risk	Mitigation
1	Naming collision with existing voice `Agent`	Use `AiAgent` internally; defer voice rename to Phase 11
2	Live plan execution disruption (paying customers)	Coexist with feature flags; Phase 4 is structural-only (no semantic change)
3	Billing/usage logging double-counting	Mutually-exclusive flags; duplicate-detection guard in `spawn_log_ai_usage`
4	Structured-output parse failure (Text Agent)	Auto-fallback to legacy path; failover recorded as `ai_agent_log.kind=failover`; 10%→100% rollout
5	Cron-job overlap during Custom Report transition	Mutually-exclusive flag; legacy cron short-circuits when new flag on; `FOR UPDATE SKIP LOCKED`

§7 Observability: tracing every operation

Every thought, tool call, LLM generation, and system event becomes one indexed row in ai_agent_log. Optimized for fast timeline rendering AND for feeding the reflection pipeline.

The indexes that matter

Index	Powers
`(ai_agent_run_id, created_at)`	Timeline UI
`(ai_agent_id, created_at DESC)`	Recent activity per agent · Reflection pipeline reads this
`(organization_id, created_at DESC)` partial	Cost dashboards (LLM gen + tool calls only)
`(kind, created_at DESC)`	Filter by kind in admin UI
GIN on `entry` JSONB	Attribute search ("find all tool_call where input.to_number = X")

Day-one OTel-compatible

We don't ship to Langfuse/Datadog yet — but the field names, span/parent_span structure, and trace IDs are OTel-conformant. When we want to export, it's a collector config change, not a refactor.

§8 Self-evolution: the learning loop

The loop closes when build_ai_agent_prompt_service retrieves the top-K relevant skills + recent lessons and injects them into the next run's prompt. Voyager pattern: agents get better at the same task family over time, without retraining.

                       ┌─────────────────────────┐
ai_agent_log (run) ──► │ RunDailyReflectionJob   │ ──► ai_agent_reflection (day)
                       └─────────────────────────┘
                                    │
                                    ▼
                       ┌─────────────────────────┐
                       │ WeeklyDistillationJob   │ ──► ai_agent_reflection (week)
                       └─────────────────────────┘     + ai_agent_skill (Voyager: success patterns)
                                    │                  + ai_agent_lesson (Reflexion: failure lessons)
                                    ▼
                       ┌─────────────────────────┐
                       │ MonthlyDistillationJob  │ ──► ai_agent_reflection (month)
                       └─────────────────────────┘     + curated skills / retired lessons

Cost discipline

Opt-in per org with a 14-day free trial. Default off.
Cheapest available model for reflection (e.g. deepseek-v3.2).
Hard per-agent monthly token budget enforced.
Skills with failure rate > threshold are auto-disabled.
Weekly distillation re-evaluates daily reflections (averaging effect against bad-day reinforcement).
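
The auto-disable rule and the top-K retrieval could be combined roughly as below. This is a sketch under stated assumptions: the struct is a subset of `ai_agent_skill`, the threshold is a parameter we made up, and ranking by raw success count stands in for whatever relevance scoring the real prompt builder uses.

```rust
/// Hypothetical subset of an `ai_agent_skill` row.
#[derive(Debug, Clone, PartialEq)]
pub struct AiAgentSkill {
    pub id: u64,
    pub success_count: u32,
    pub failure_count: u32,
}

impl AiAgentSkill {
    /// Share of recorded uses that failed; 0.0 when the skill has never been used.
    pub fn failure_rate(&self) -> f64 {
        let total = self.success_count + self.failure_count;
        if total == 0 {
            0.0
        } else {
            f64::from(self.failure_count) / f64::from(total)
        }
    }
}

/// Drop skills above `max_failure_rate`, then keep the `k` with the most successes.
/// Ranking by success count is a placeholder for real relevance scoring.
pub fn select_skills(skills: &[AiAgentSkill], max_failure_rate: f64, k: usize) -> Vec<u64> {
    let mut kept: Vec<&AiAgentSkill> = skills
        .iter()
        .filter(|s| s.failure_rate() <= max_failure_rate)
        .collect();
    kept.sort_by(|a, b| b.success_count.cmp(&a.success_count));
    kept.into_iter().take(k).map(|s| s.id).collect()
}
```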

Admin curation

The owner can view /agents/{id}/evolution: every reflection, every extracted skill (with success/failure counts), every lesson. They can edit, suppress, or retire any of them. This keeps the loop transparent — agents that learn must also be agents you can teach.

§9 The three initial agents (Phase 1–3 deliverables)

Dashboard Briefing Agent

Trigger: Dashboard page load + scheduled (daily 5 AM org-tz)

Persona default: "Concise, data-driven analyst. Plain English, no jargon."

Tools: get_workspace_summary, get_workspace_needs_attention, get_engagement_stats

Output: Markdown briefing rendered on the dashboard

Maps to existing: generate_dashboard_briefing_service.rs

Text Reply Agent

Trigger: Webhook (inbound SMS)

Persona default: Inherits from existing text_agent (purpose, escalation, restricted topics)

Tools: query_knowledge

Output: Structured 3-suggestion JSON (high/med/low confidence)

Special: Auto-fallback to legacy on schema parse failure
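
The auto-fallback can be sketched as a small generic wrapper: try the new path, and if it errors, run the legacy path and hand back a failover record for `ai_agent_log`. The function name and the hard-coded path labels are hypothetical; the `from_path` / `to_path` / `reason` fields follow the `failover` payload in §5.

```rust
/// Payload of a `failover` log entry (fields from the log-kind table in §5).
#[derive(Debug, Clone, PartialEq)]
pub struct FailoverEntry {
    pub from_path: String,
    pub to_path: String,
    pub reason: String,
}

/// Try the new path first; on error, run the legacy path and report the failover.
pub fn run_with_fallback<T, E, N, L>(new_path: N, legacy_path: L) -> (T, Option<FailoverEntry>)
where
    E: std::fmt::Display,
    N: FnOnce() -> Result<T, E>,
    L: FnOnce() -> T,
{
    match new_path() {
        Ok(value) => (value, None),
        Err(err) => {
            // Path labels are placeholders chosen for this sketch.
            let entry = FailoverEntry {
                from_path: "ai_agent".to_string(),
                to_path: "legacy".to_string(),
                reason: err.to_string(),
            };
            (legacy_path(), Some(entry))
        }
    }
}
```

Returning the failover record instead of logging inside the wrapper keeps it free of I/O, so the caller decides how the row is written.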

Custom Report Agent

Trigger: Scheduled (ai_agent_schedule row, default daily at 5 AM org-tz)

Persona default: "Helpful daily-digest assistant. Highlight what changed and what needs attention."

Toggleable sections: new leads · handled calls · messages in · messages out · prioritized tasks

Delivery: notify_service → email + in-app + push (per user pref)

Maps to existing: generate_daily_report_service.rs

§10 Org onboarding: defaults on signup

Hook added to finalize_signup_service.rs:174, right beside the existing notification-preference seeding. The new seed_default_agents creates:

1× DashboardBriefingAgent per org — system default, active, autopilot
1× CustomReportAgent per org — system default, active, daily 5 AM schedule
1× TextReplyAgent per phone number provisioned — autopilot=false (suggestions, not auto-replies)

Each seeded agent carries is_system_default = true. The org admin can edit any of these. Setting is_system_default = false detaches the row from future platform updates, so a platform-level persona update doesn't overwrite a customer's edits.
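
In code, that guard might amount to a filter on the flag before applying a platform update. The struct and function below are hypothetical and reduce the persona to a plain string purely for illustration.

```rust
/// Hypothetical subset of an `ai_agent` row, with the persona reduced to a string.
#[derive(Debug, Clone, PartialEq)]
pub struct SeededAgent {
    pub id: u64,
    pub persona: String,
    pub is_system_default: bool,
}

/// Apply a platform-level persona update and return how many rows changed.
/// Rows with `is_system_default == false` are left untouched.
pub fn apply_platform_persona(agents: &mut [SeededAgent], new_persona: &str) -> usize {
    let mut updated = 0;
    for agent in agents.iter_mut().filter(|a| a.is_system_default) {
        agent.persona = new_persona.to_string();
        updated += 1;
    }
    updated
}
```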

Non-blocking like notification preferences

If seeding fails, signup still completes. We log the error and the org can have agents seeded later via a backfill job. Signup never fails because the agent table had a transient issue.

§11 Layered rules: platform > org > agent

Three layers concatenated into every agent's system prompt. Platform rules are immutable from any UI; they ship via code-reviewed migration.

EFFECTIVE_SYSTEM_PROMPT =
    [Platform rules]     ← immutable; in code + platform_ai_rule table
  + [Org rules]          ← organization_ai_rule table; org admin editable
  + [Agent rules]        ← ai_agent.rules_override; per-agent editable
  + [Persona]            ← ai_agent.persona
  + [Goal]               ← ai_agent.goals + per-run goal
  + [Retrieved skills]   ← top-K ai_agent_skill rows
  + [Recent lessons]     ← recent ai_agent_lesson rows
  + [Kind instructions]  ← static, per-kind
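
A minimal sketch of the rule-layering part of that assembly follows. The function name `build_layered_rules` comes from Phase 6, but its signature and output format are assumptions, and the sketch covers ordering only: it does not model how an org rule is prevented from contradicting a platform rule.

```rust
/// Concatenate the three rule tiers in precedence order: platform, then org, then agent.
/// Empty tiers are skipped so the prompt carries no empty headings.
pub fn build_layered_rules(platform: &[&str], org: &[&str], agent: &[&str]) -> String {
    let tiers = [
        ("Platform rules", platform),
        ("Org rules", org),
        ("Agent rules", agent),
    ];
    let mut out = String::new();
    for (title, rules) in tiers {
        if rules.is_empty() {
            continue;
        }
        out.push_str(&format!("[{}]\n", title));
        for rule in rules {
            out.push_str(&format!("- {}\n", rule));
        }
    }
    out
}
```

Because the output is a plain string, the same value can be written into the run's log as the effective-rules snapshot.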

Example platform rules (immutable)

Never make legal, medical, or financial advice claims.
Never share PII outside the org boundary.
Always announce when a message is AI-generated if asked.
Never autopilot a destructive action without explicit org-owner approval.
Always log every tool call with full input/output.

Example org rules (editable)

Sign messages as "Alex from Acme Co."
Use first-name only when addressing contacts.
Never discuss competitor X.

Example agent rules (per-agent overrides)

This text agent only handles inbound questions about pricing.
This report agent emails the founder, not the team.

Auditable enforcement

Each ai_agent_run snapshots the effective rules it used into its ai_agent_log. Platform changes ship via code-reviewed migration, not ad-hoc SQL. If an agent acted in a way that violates a rule, the log proves what rules were in force at the time.

§12 Verification plan

How we know each phase is safe to ship.

Phase	Acceptance signal
0	`cargo check` green; migrations apply + roll back in CI; admin endpoint manually exercised; state-machine unit tests pass
1	Side-by-side parity test for 7 days; semantic content diff acceptable; cost parity within ±10%; no double-write to `ai_usage_log`
2	Nightly synthetic: 100 simulated inbounds → 3 suggestions each on new path, 0 fallbacks; 10% rollout watched 1 week before 100%
3	7-day shadow-run dispatcher verified; single test org switched to new job; no duplicate reports
4	Every plan + template has non-null `ai_agent_id` after backfill; plan execution success rate stable 2 weeks
5	New-signup smoke: org gets 2+ `ai_agent` rows immediately
6	Unit test: org-rule cannot override platform-rule; effective rules snapshotted in log
7	Plan-driven memory writes show `written_by_ai_agent_run_id` set
8	Per-kind parity test against legacy code path
9	30-day-old agent has ≥1 reflection + ≥1 skill, viewable in admin UI
10	Budget exceeded → next run rejected with user-visible message + system_event log entry
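
The Phase 10 signal (budget exceeded, next run rejected) suggests a check-then-reserve step before each LLM call. The sketch below is one way to express it; the type names, the single monthly token limit, and the idea of reserving an estimate up front are all assumptions.

```rust
/// Hypothetical per-agent monthly token budget.
#[derive(Debug, Clone, PartialEq)]
pub struct TokenBudget {
    pub monthly_limit: u64,
    pub used_this_month: u64,
}

/// Returned when a call would exceed the budget; enough detail for a user-visible message.
#[derive(Debug, Clone, PartialEq)]
pub struct BudgetExceeded {
    pub requested: u64,
    pub remaining: u64,
}

impl TokenBudget {
    /// Check before an LLM call; reserve the tokens only if they fit.
    pub fn try_reserve(&mut self, estimated_tokens: u64) -> Result<(), BudgetExceeded> {
        let remaining = self.monthly_limit.saturating_sub(self.used_this_month);
        if estimated_tokens > remaining {
            return Err(BudgetExceeded {
                requested: estimated_tokens,
                remaining,
            });
        }
        self.used_this_month += estimated_tokens;
        Ok(())
    }
}
```

A rejected call leaves `used_this_month` unchanged, so the caller can write the system_event entry and surface the message without any cleanup.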

End-to-end smoke (after Phase 5)

Sign up a new org
Provision a phone number
Observe DashboardBriefing + CustomReport + TextReply agents created
Send an SMS to the phone number → observe 3 suggestions generated via new path
Wait for daily report cron tick → observe email delivered
Open dashboard → observe briefing rendered
Inspect /agents/{id}/runs/{run_id} timeline for each
Confirm ai_usage_log rows have matching ai_agent_log rows

§13 Open questions for you

These are the decisions where multiple reasonable answers exist. The recommended option is listed first under each question. Please confirm or redirect.

Q1 · NAMING

Use AiAgent (defer voice rename) — or Agent (rename voice now) — or a fresh word entirely?

(a) AiAgent internally; UI says "Agents"; rename voice → VoiceAgent later (Phase 11). Internal disambiguation, clean UX, low day-one risk.
(b) Rename voice → VoiceAgent first, then use Agent. Cleaner code long-term but expensive coordination during the migration.
(c) Fresh word (Operator, Worker, AiAssistant). Avoids collision but introduces yet another concept.

Q2 · CONCEPTUAL FRAMING

Are "Plans" subsumed by Agents, or do they stay as a separate concept?

(a) Plans become one kind: AutonomousCampaignAgent. Other agent kinds are simpler; they don't need contacts, actions table, or re-enrollment policy.
(b) Force everything into the plan shape. Adds complexity to simple agents.
(c) Plans + Agents stay as fully separate concepts. Loses the unification you asked for.

Q3 · CUTOVER STRATEGY

How aggressive should the cutover be?

(a) Coexist with feature flags; retire legacy paths after 30 days at 100%. Safest for paying customers.
(b) Hard cutover per phase. Faster but each phase ships with rollback panic if something breaks.

Q4 · SELF-EVOLUTION

Default-on, opt-in, or paid premium?

(a) Opt-in per org with a 14-day free trial. Demonstrates value without surprise costs.
(b) Default-on with hard token budgets. Aggressive learning but billing complaints likely.
(c) Premium-tier feature only. Monetizes evolution; lower-tier orgs miss out.

Q5 · VECTOR EMBEDDINGS

Add pgvector for semantic memory now or later?

(a) Defer to a focused spike after Phase 9. Initial implementation: JSONB + GIN + recency. Add when one kind proves the need.
(b) Add pgvector in Phase 0. Higher upfront cost; locks in retrieval strategy before we know what's needed.

Q6 · TOOL-CALL TYPING

Typed per kind, one megaenum, or untyped JSONB?

(a) One typed enum per kind, JSONB column, kind-discriminated parse. Type safety per kind; no megaenum; shared storage.
(b) Single megaenum across all kinds. Type safety everywhere; gets unwieldy.
(c) Untyped, just JSONB. Loses the safety already proven valuable in plans.

Q7 · TEXT-SUGGESTION OUTPUT

Structured output (schema::<T>) vs prompt template?

(a) Keep structured output, auto-fallback to legacy on parse failure. Reuses existing approach; safe.
(b) Switch to a prompt template + post-hoc parsing. More fragile.
(c) Mixed: structured for capable models, prompt template for fallback model. Most complex; only if (a) proves unreliable.

Q8 · VOICE AGENT FUTURE

When (if ever) does the voice agent join the AiAgent family?

(a) Phase 12+: voice joins, sharing only persona/rules/budget; realtime session machinery stays separate.
(b) Force voice agents into the same ai_agent_run shape now. Probably forces awkward modeling.
(c) Leave voice agents fully separate forever. Misses cross-cutting features (rules, budget, audit).

Q9 · CUSTOM REPORTS — HOW MUCH FREEDOM?

Canned options, free-text prompt, or visual builder?

(a) Canned options + a single optional 500-char custom prompt. Predictable cost; predictable quality.
(b) Fully free-text custom report prompt. Maximum flexibility; harder to budget; quality varies.
(c) Visual builder (sections + filters + frequency) with no free-text. Most polished UX; biggest scope.

Q10 · WHERE TO START

First PR scope?

(a) Phase 0 only. Schema + types + admin endpoint, no behavior change. Low review burden, sets the foundation.
(b) Phase 0 + Phase 1 together. Faster end-to-end demo; bigger review.
(c) Skip Phase 0; rename plan_template to agent in place. Riskiest; collides with voice agent.

§14 References

Primary sources used in this research.