Loquent · Plans → Agents Research · v1
Migration design · research artifact

From Plans to Agents.
A unified AI runtime for Loquent.

Loquent already has most of an agent system — it's just spread across eight modules under different names. This document proposes a single AiAgent abstraction that owns every AI operation in the platform: text suggestions, transcription, call analysis, contact memory, reports, dashboard, task extraction, autonomous campaigns. Personas, memory, rules, tools, access. Inspired by Hermes Agent. Built to learn.

Audience · You + future coding agents
Status · Research; awaiting your direction on §13
First PR · Phase 0 only — ships in a week

TL;DR — the nine decisions that shape everything

01

Build a new generic AiAgent runtime in src/mods/ai_agent/. Don't rename Plan in place.

02

The voice agent at mods::agent already owns the name. Use AiAgent internally; show "Agents" in UI.

03

Coexist with feature flags. Plans keep running. Each retargeting rolls out per org with a 30-day cleanup tail.

04

Phase 0 ships schema + types + admin endpoint. Phase 1 pilots Dashboard Briefing (lowest blast radius).

05

6 dimensions per agent: Persona, Memory, Goals, Rules, Tools, Access. Hermes-style profile isolation.

06

Self-evolution blends Generative Agents reflection + Reflexion lessons + Voyager skill library.

07

Observability follows OpenTelemetry GenAI semantic conventions on ai_agent_log. OTel-ready day one.

08

3-tier rules: Platform > Org > Agent. Platform rules are immutable; ship via code-reviewed migration.

09

Reflection is opt-in per org (cost discipline). Defaults stay on cheap models with hard budgets.

§1 Context — why we're doing this

Loquent's AI capabilities live in eight disconnected places. Each has its own prompt-building, model resolution, tool list, logging shape, notification path, and permission story. There is no shared identity an end user can name, configure, or watch evolve.

  • 8+ AI modules with their own flow
  • 26 AiUsageFeature variants
  • 71 files reference mods::plan
  • 54 files reference mods::agent (voice)
  • 39 files reference mods::text_agent
  • 0 unified persona or memory model

What an org owner experiences today

They see a dashboard briefing, a daily report email, three reply suggestions on every inbound message, and (if they're using campaigns) a Plan timeline. None of those four AI surfaces are aware of each other. The dashboard briefing doesn't know the campaign agent has been pestering a contact about something specific. The text-reply suggestions don't know what the daily report told the owner this morning.

What we want them to experience

One mental model: "These are my agents. They have personalities, they follow rules I taught them, and they're learning from how I use them." Concretely:

  • One identity per agent (org-scoped, kind-tagged, persona on file)
  • One timeline showing every operation, with cost and latency per step
  • One rule system where platform rules layer under org rules layer under per-agent rules
  • One memory attribution: every contact note knows which agent and which run wrote it
  • One learning loop: daily reflections → weekly skills → monthly distillation

§2 Research summary

2.1 Hermes Agent — the source of inspiration

You mentioned Hermes Agent as the model. The reference is Nous Research's Hermes Agent — a self-improving multi-agent framework where every agent is a profile: a fully isolated unit with its own configuration, persona (SOUL.md), memory database, tool list, and cron jobs. Profiles connect independently to Telegram, Discord, Slack, WhatsApp, Signal, Email. Each is a complete CLI entity with its own personality.

What we steal from Hermes

The SOUL.md pattern — codifying persona as a first-class, shareable record. The isolated-profile concept — each agent is a complete identity, not a function call. Loquent's adaptation: multi-tenant SaaS where org boundaries matter and agents share infrastructure.

2.2 Agent anatomy across modern frameworks

| Dimension | Hermes | CrewAI | LangGraph | Letta (MemGPT) |
| --- | --- | --- | --- | --- |
| Persona | SOUL.md file | role, goal, backstory | System prompt in node | Agent identity + persistent memory |
| Memory | File DB + session history | Short-term + opt-in long-term | Explicit state object | 3-tier: Core / Recall / Archival |
| Goals | SOUL.md + cron | Tasks in a Crew | Graph nodes (implicit) | Memory-driven |
| Rules | SOUL.md + env | System prompt + tools | Node LLM instruction | Constitutional layer (WIP) |
| Tools | Function-calling + skill library | Per-agent tool list | Tools as graph nodes | Tool calls within memory |
| Access | Gateway channel + env | Task-scoped | Node-level (implicit) | Function signatures |

No framework has built-in multi-tenant access control or organization-scoped rules. Loquent has to design this layer itself.

2.3 Self-evolution — three mechanisms we'll blend

| Method | Trigger | Storage | How Loquent uses it |
| --- | --- | --- | --- |
| Generative Agents (Park et al. 2023) | Importance threshold (~2–3×/day) | Time-indexed text + embeddings | Daily/weekly/monthly ai_agent_reflection rows; meta-LLM digests the log |
| Reflexion (Shinn et al. 2023) | External feedback / failure | Episodic memory buffer | ai_agent_lesson rows: one-line verbal learnings from failed runs + user feedback |
| Voyager (Wang et al. 2023) | Task success | Indexed code/prompts in vector DB | ai_agent_skill rows: extracted prompt fragments after N successes; retrieved into future runs |

2.4 Observability — OpenTelemetry GenAI conventions

Since 2024, OpenTelemetry has defined semantic conventions for GenAI tracing. This is the emerging industry standard that unifies Langfuse, LangSmith, Helicone, Traceloop. We adopt it in ai_agent_log day one so a future export to any tracing backend is a config change, not a refactor.

| Loquent field | OTel attribute | Example |
| --- | --- | --- |
| ai_agent_run.trace_id | trace.id | uuid |
| ai_agent_log.span_id | gen_ai.* span id | uuid |
| ai_agent_log.parent_span_id | span.parent_id | uuid |
| ai_agent.kind | custom: agent.kind | text_reply_agent |
| ai_usage_log.model | gen_ai.request.model | claude-opus-4-7 |
| ai_usage_log.input_tokens | gen_ai.usage.input_tokens | 1500 |
| ai_usage_log.cached_tokens | gen_ai.usage.cache_read_tokens | 200 |
| computed | custom: cost_cents | 0.18 |
| ai_agent_log.duration_ms | span duration | 850 |
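
A minimal sketch of how the mapping above could be applied when a row is exported. The attribute keys are copied from the table; the `LlmGenerationRow` struct and the `to_otel_attributes` function are hypothetical names used only for illustration, and a real exporter would go through an OpenTelemetry SDK rather than plain tuples.

```rust
/// Hypothetical flattened view of one `llm_generation` log row joined with
/// its `ai_usage_log` row. Field names follow the "Loquent field" column.
pub struct LlmGenerationRow {
    pub agent_kind: String,
    pub model: String,
    pub input_tokens: u64,
    pub cached_tokens: u64,
    pub cost_cents: f64,
}

/// Turn a row into (attribute key, value) pairs using the keys from the
/// mapping table. `agent.kind` and `cost_cents` are the custom attributes.
pub fn to_otel_attributes(row: &LlmGenerationRow) -> Vec<(&'static str, String)> {
    vec![
        ("agent.kind", row.agent_kind.clone()),
        ("gen_ai.request.model", row.model.clone()),
        ("gen_ai.usage.input_tokens", row.input_tokens.to_string()),
        ("gen_ai.usage.cache_read_tokens", row.cached_tokens.to_string()),
        ("cost_cents", row.cost_cents.to_string()),
    ]
}
```

Keeping the key names in one function means a later change to the conventions touches a single place instead of every call site.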

§3 Current state in Loquent

Three modules already do most of what we need. They just don't know about each other.

The Plan system — already 90% an agent runtime

src/mods/plan/ (71 file references) is a sophisticated autonomous-execution framework. It has a typed tool-call log, an 8-state JSONB state machine, a cron-polled executor, contact assignments, autopilot gating, approval gates, re-enrollment policy, per-template model override. The hard problems are solved here.

The 8 plan states (we reuse this exact shape for AiAgentRunState)

  • PendingReview: user must approve before running
  • StandBy { next_execution_at }: waiting for cron to pick up
  • AwaitingInput: paused on AskUser / approval
  • Executing: LLM loop is active
  • Paused: manually paused
  • Completed { completed_at }
  • Failed { error_message }
  • Stopped: user-cancelled
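
As a rough sketch, the eight states could be carried over into an `AiAgentRunState` enum like the one below. The variant names and payload fields come from the list above; using `u64` Unix seconds for timestamps, and the two helper methods `is_terminal` and `is_due`, are assumptions made for illustration rather than the existing PlanState code.

```rust
/// Sketch of the eight run states, mirroring the plan states listed above.
/// Timestamps are Unix seconds here (an assumption, to avoid extra crates).
#[derive(Debug, Clone, PartialEq)]
pub enum AiAgentRunState {
    PendingReview,
    StandBy { next_execution_at: u64 },
    AwaitingInput,
    Executing,
    Paused,
    Completed { completed_at: u64 },
    Failed { error_message: String },
    Stopped,
}

impl AiAgentRunState {
    /// Assumption: Completed, Failed and Stopped never transition again.
    pub fn is_terminal(&self) -> bool {
        matches!(
            self,
            Self::Completed { .. } | Self::Failed { .. } | Self::Stopped
        )
    }

    /// Assumption: the cron poller only picks up StandBy runs whose
    /// `next_execution_at` has been reached.
    pub fn is_due(&self, now: u64) -> bool {
        matches!(self, Self::StandBy { next_execution_at } if *next_execution_at <= now)
    }
}
```

Carrying the payload inside the variant (rather than in separate nullable columns) keeps impossible combinations, such as a Paused run with a completion time, unrepresentable.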

The 13 typed tool variants (we keep this typed-per-kind pattern)

SendEmail, SendSms, ListPlanContacts, GetContactDetails, GetContactNotes, WriteInteractionNote, UpdateSystemNote, GetConversationHistory, UpdateContact, AskUser, CompletePlan, FailPlan, ScheduleNextExecution

The Text Agent — already a separate module

src/mods/text_agent/ generates 3 high/med/low-confidence reply suggestions per inbound message. Has its own text_agent table (purpose, tier, model, escalation_instructions, restricted_topics, temperature, knowledge_base_ids) and a text_agent_suggestion table. Critical-path code: generate_text_agent_suggestions_service.rs:11.

The Voice Agent — naming collision risk

Critical: mods::agent is already taken

The voice/realtime agent module owns the Agent type, AgentResource permission, the phone_number.agent_id FK, and the agent table — 54 file references. Any new "Agent" abstraction that reuses the bare name will produce ambiguous imports, compile errors, and confused code reviews. We use AiAgent internally and rename voice → VoiceAgent as a late, optional phase once the new runtime is stable.

The AI infrastructure — already well-organized

src/mods/ai/ wraps aisdk + OpenRouter. The ai_usage_log table + spawn_log_ai_usage(AiUsageEntry) helper at log_ai_usage_service.rs:13 capture every AI call's tokens, cost, model, provider, latency. 26 AiUsageFeature variants already cover every call site (TextAgentSuggestions, DashboardBriefing, AnalyzeCall, SummarizeCall, UpdateContactMemory, GenerateReport, ExtractTasks, RealtimeTurn, …).

No new AiUsageFeature variants needed

Each AiAgentKind maps to an existing feature variant. Billing tier matrix stays stable. Admin dashboards keep working. The enum is the integration seam — don't grow it.
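
One way to express that seam is an exhaustive match from kind to feature. The sketch below uses only a subset of both enums; the feature names appear in the list above, but the specific pairings (for example AnalyzerAgent to AnalyzeCall) are assumptions, and `usage_feature` is a hypothetical method name.

```rust
/// Subset of the existing feature enum, using variant names from the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AiUsageFeature {
    DashboardBriefing,
    TextAgentSuggestions,
    GenerateReport,
    AnalyzeCall,
    ExtractTasks,
    UpdateContactMemory,
}

/// Subset of the proposed agent kinds.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AiAgentKind {
    DashboardBriefingAgent,
    TextReplyAgent,
    CustomReportAgent,
    AnalyzerAgent,
    TaskExtractionAgent,
    ContactMemoryAgent,
}

impl AiAgentKind {
    /// Assumed pairing of each kind with an existing usage feature.
    /// The match has no wildcard arm, so adding a kind without choosing
    /// a feature fails to compile.
    pub fn usage_feature(self) -> AiUsageFeature {
        match self {
            Self::DashboardBriefingAgent => AiUsageFeature::DashboardBriefing,
            Self::TextReplyAgent => AiUsageFeature::TextAgentSuggestions,
            Self::CustomReportAgent => AiUsageFeature::GenerateReport,
            Self::AnalyzerAgent => AiUsageFeature::AnalyzeCall,
            Self::TaskExtractionAgent => AiUsageFeature::ExtractTasks,
            Self::ContactMemoryAgent => AiUsageFeature::UpdateContactMemory,
        }
    }
}
```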

§4 Target architecture

The 6 dimensions of an AiAgent

┌──────────────────────────────────────────────────────────────┐
│                           AiAgent                            │
├──────────────────────────────────────────────────────────────┤
│  Persona     (SOUL.md-equivalent — tone, voice, identity)    │
│  Memory      (short-term scratchpad + long-term facts)       │
│  Goals       (current run goals + persistent objectives)     │
│  Rules       (per-agent; platform/org layers added at run)   │
│  Tools       (allowlist; resolved at run-time)               │
│  Access      (read/write scopes; defaults to triggering user)│
└──────────────────────────────────────────────────────────────┘

Agent kinds — the launch lineup

| Kind | Phase | What it does | Replaces |
| --- | --- | --- | --- |
| DashboardBriefingAgent | P1 | Daily summary card on the dashboard | generate_dashboard_briefing_service |
| TextReplyAgent | P2 | 3 suggestions per inbound message | mods::text_agent |
| CustomReportAgent | P3 | Scheduled digest (leads, calls, messages, tasks) | mods::report |
| AutonomousCampaignAgent | P4 | Multi-step contact campaigns | mods::plan executor |
| AnalyzerAgent | P8 | Call analysis (custom analyzers) | mods::analyzer |
| TaskExtractionAgent | P8 | Extract todos from calls/messages | create_tasks_from_call_service |
| AutoTagAgent | P8 | Tag contacts after call | auto_tag_contact_from_call_service |
| ContactEnrichmentAgent | P8 | Enrich contact fields from convo | enrich_contact_service |
| ContactMemoryAgent | P8 | Maintain contact memory notes | update_contact_memory_service |
| AssistantAgent | P8 | In-app assistant chat | mods::assistant |
| ReflectionAgent | P9 | Meta-agent that distills other agents' logs | (new) |

Naming: internal vs. UI

  • Internal (Rust code): AiAgent, mods::ai_agent, ai_agent table, AiAgentResource permission. Disambiguates from existing voice Agent.
  • User-facing (UI copy): "Agents" everywhere. Voice agents become "Voice Agents". Owners see one mental model.

§5 Data model

10 new tables. 4 existing tables augmented with nullable FKs. No drops in the migration window.

ER diagram

              organization (existing)
                       │
                       ▼
                   ai_agent
           ┌───────────┴───────────┐
           ▼                       ▼
     ai_agent_run          ai_agent_schedule
           │
 ┌─────────┼─────────────────┐
 ▼         ▼                 ▼
ai_agent_log   ai_agent_memory   ai_usage_log (existing)
(timeline)     (working+long)
 │
 ▼
ai_agent_reflection / ai_agent_skill / ai_agent_lesson (rolled up from logs)

Existing tables augmented with FK:
  plan.ai_agent_id → ai_agent.id (autonomous campaign)
  plan_template.ai_agent_id → ai_agent.id
  text_agent.ai_agent_id → ai_agent.id (text reply)
  contact_note.written_by_ai_agent_run_id → ai_agent_run.id

The big four — what they store, why they exist

1. ai_agent — the identity (the SOUL.md row)

One row per agent. Org-scoped. Holds: kind, name, slug, is_system_default, is_active; JSONB fields for persona (tone/voice/signature/traits/soul_md), goals, rules_override, tools_allowlist, access_scope, budget; plus model_override, enable_reflection, enable_autopilot.

Indexes: (organization_id, kind, slug) UNIQUE; (kind) for dispatch fan-out; partial on active rows.

2. ai_agent_run — one execution

Has state (the AiAgentRunState enum JSONB, same shape as PlanState), parent_run_id for orchestration chains, trigger_source JSONB (Manual { user_id } | Cron { schedule_id } | Webhook { entity, entity_id } | Triggered { by_run_id }), input_payload, output_payload, total cost/tokens/tool-calls, and trace_id for OTel.
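
The `trigger_source` variants map naturally onto a Rust enum. The following is a sketch only: the variant and field names come from the paragraph above, while the `u64` ID type and the helpers `tag` and `triggering_user` are assumptions for illustration.

```rust
/// Sketch of the `trigger_source` variants. IDs are plain u64 for brevity;
/// the real column types are not specified in this document.
#[derive(Debug, Clone, PartialEq)]
pub enum TriggerSource {
    Manual { user_id: u64 },
    Cron { schedule_id: u64 },
    Webhook { entity: String, entity_id: u64 },
    Triggered { by_run_id: u64 },
}

impl TriggerSource {
    /// Short tag, useful for filtering runs by how they started.
    pub fn tag(&self) -> &'static str {
        match self {
            Self::Manual { .. } => "manual",
            Self::Cron { .. } => "cron",
            Self::Webhook { .. } => "webhook",
            Self::Triggered { .. } => "triggered",
        }
    }

    /// Only a manual trigger names a user directly; other variants would
    /// need a lookup (schedule owner, parent run) to find one.
    pub fn triggering_user(&self) -> Option<u64> {
        match self {
            Self::Manual { user_id } => Some(*user_id),
            _ => None,
        }
    }
}
```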

3. ai_agent_log — unified, OTel-compatible timeline

The single most important new table. Every operation an agent performs becomes one row. Six kinds:

| kind | When written | Payload |
| --- | --- | --- |
| system_event | State transition, schedule, retry, budget enforced | event, prev_state, new_state, reason |
| thought | LLM produced reasoning text | text, confidence |
| llm_generation | Every aisdk call | provider, model, finish_reason, summaries + FK to ai_usage_log |
| tool_call | Every typed tool invocation | ToolCallVariant<I,O> per kind (typed I/O) |
| reflection | A reflection cycle ran | scope, period, source_log_ids[] |
| failover | New path failed, legacy path took over | from_path, to_path, reason |
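
A sketch of the six kinds as one tagged enum. Payload field names follow the table; the tool-call payload is simplified to strings here instead of the typed `ToolCallVariant<I,O>`, and the names `AiAgentLogEntry`, `kind` and `counts_toward_cost` are hypothetical.

```rust
/// One timeline row. Simplified: tool-call input/output are strings here,
/// whereas the design calls for typed I/O per kind.
#[derive(Debug, Clone, PartialEq)]
pub enum AiAgentLogEntry {
    SystemEvent { event: String, prev_state: String, new_state: String, reason: String },
    Thought { text: String, confidence: f32 },
    LlmGeneration { provider: String, model: String, finish_reason: String, ai_usage_log_id: u64 },
    ToolCall { tool: String, input: String, output: String },
    Reflection { scope: String, period: String, source_log_ids: Vec<u64> },
    Failover { from_path: String, to_path: String, reason: String },
}

impl AiAgentLogEntry {
    /// Value stored in the `kind` column.
    pub fn kind(&self) -> &'static str {
        match self {
            Self::SystemEvent { .. } => "system_event",
            Self::Thought { .. } => "thought",
            Self::LlmGeneration { .. } => "llm_generation",
            Self::ToolCall { .. } => "tool_call",
            Self::Reflection { .. } => "reflection",
            Self::Failover { .. } => "failover",
        }
    }

    /// Kinds that a cost dashboard would read: LLM generations and tool calls.
    pub fn counts_toward_cost(&self) -> bool {
        matches!(self, Self::LlmGeneration { .. } | Self::ToolCall { .. })
    }
}
```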

Indexes for fast analysis AND reflection pipeline: (ai_agent_run_id, created_at) primary for timeline rendering; (ai_agent_id, created_at DESC) for "recent activity"; (organization_id, created_at DESC) partial on llm_generation|tool_call for cost dashboards; GIN on entry for JSONB attribute search.

4. ai_agent_schedule — cron-driven runs

One row per scheduled agent. cron_spec + timezone + next_run_at + is_enabled + config_payload JSONB (recipients, filters — kind-specific). A single run_due_ai_agents_job polls every minute with SELECT … FOR UPDATE SKIP LOCKED and dispatches.

The evolution tables (Phase 9)

| Table | What it stores |
| --- | --- |
| ai_agent_memory | Short-term scratchpad + long-term facts. JSONB value; nullable embedding (pgvector deferred) |
| ai_agent_reflection | Daily/weekly/monthly digests. content_markdown, key_insights JSONB, source_run_ids, tokens consumed |
| ai_agent_skill | Voyager-style growing library. prompt_fragment, trigger_condition, success_count, failure_count, admin curation flags |
| ai_agent_lesson | Reflexion verbal lessons. One-line learning + provenance FK to failure run + optional user feedback |

System rules

platform_ai_rule — system-wide immutable rules (slug, applies_to_kinds JSONB, rule_text, priority). Edited only by code-reviewed migrations. The existing organization_ai_rule table becomes the org tier.

Why JSONB instead of separate persona/goal/rule tables

We don't proliferate side tables until a dimension proves it needs first-class queryability. Persona, goals, agent-rules, tools-allowlist, access-scope, and budget all start as JSONB columns on ai_agent. If a future requirement (e.g., "list every agent that has rule X") needs a structured query, we promote that one dimension to a table — not before.

§6 Migration phases

Twelve phases, numbered 0 through 11 (the last is optional). The first ships in a week with zero behavior change. The next three retarget existing modules. Phase 4 attaches Plans to the new runtime structurally without changing the executor.

  1. PHASE 0 Foundations
    M · 1 week · first PR

    Land ai_agent, ai_agent_run, ai_agent_log tables + Rust types + generic run_ai_agent executor + admin-only debug endpoint. Zero behavior change. Adds the AiAgent permission resource.

  2. PHASE 1 Dashboard Briefing Agent (pilot)
    M · lowest blast radius

    Retarget generate_dashboard_briefing_service to run via the new runtime. Failures are cosmetic (stale briefing) — perfect first pilot. Per-org flag, A/B parity test, hard token budget.

  3. PHASE 2 Text Reply Agent
    L · structured output

    Refactor mods::text_agent to back its persona/rules on ai_agent. Critical path (every inbound message). Auto-fallback to legacy on schema parse failure. 10% rollout watched for a week before 100%.

  4. PHASE 3 Custom Report Agent + schedule dispatcher
    L · cron consolidation

    Replace SendDailyReportJob with RunDueAiAgentsJob driven by ai_agent_schedule. Mutually-exclusive flags prevent duplicate sends. 7-day shadow-run before cutover.

  5. PHASE 4 Autonomous Campaign Agent (attach plans)
    XL · structural only

    Every plan + plan_template gets an ai_agent_id FK (kind=AutonomousCampaignAgent). Plan log dual-writes to ai_agent_log. Executor body unchanged. Plans-specific bookkeeping (contacts, actions, re-enrollment) stays where it is.

  6. PHASE 5 Seed default agents on signup
    S · 2–3 days

    New seed_default_agents_service called from finalize_signup_service:174. Creates DashboardBriefing + CustomReport agents. Non-blocking like existing notification-pref seeder.

  7. PHASE 6 Layered rules (Platform > Org > Agent)
    M

    3-tier rule injection via build_layered_rules. Platform rules immutable; ship via code-reviewed migration. Every run snapshots effective rules in its log for auditability.

  8. PHASE 7 Memory unification
    M

    All contact-memory writes stamped with ai_agent_run_id. New ai_agent_memory table for short-term scratchpad + long-term facts. pgvector deferred (JSONB + GIN until proven needed).

  9. PHASE 8 Onboard remaining AI ops
    XL · 6 small PRs · parallelizable

    One PR per kind: AnalyzerAgent, TaskExtractionAgent, AutoTagAgent, ContactEnrichmentAgent, ContactMemoryAgent (messages batch), AssistantAgent. Each is a drop-in replacement of the inline aisdk call.

  10. PHASE 9 Self-evolution pipeline
    XL · opt-in

    Daily/weekly/monthly reflection cron + skill extraction (Voyager) + lesson capture (Reflexion). ReflectionAgent kind. UI at /agents/{id}/evolution. Opt-in per org due to cost.

  11. PHASE 10 Budgets, throttling, observability UI
    L

    Per-agent + per-org budget enforcement before every LLM call. Unified timeline view at /agents/{id}/runs/{run_id}. Cost/latency/success-rate metrics dashboard.

  12. PHASE 11 Voice agent rename (optional, late)
    L · mechanical

    Resolve long-term collision: Agent → VoiceAgent. Only after the new runtime is stable for ≥6 months. Pure rename PR with cargo check guardrails.

Top 5 risks & mitigations

| # | Risk | Mitigation |
| --- | --- | --- |
| 1 | Naming collision with existing voice Agent | Use AiAgent internally; defer voice rename to Phase 11 |
| 2 | Live plan execution disruption (paying customers) | Coexist with feature flags; Phase 4 is structural-only (no semantic change) |
| 3 | Billing/usage logging double-counting | Mutually-exclusive flags; duplicate-detection guard in spawn_log_ai_usage |
| 4 | Structured-output parse failure (Text Agent) | Auto-fallback to legacy path; failover recorded as ai_agent_log.kind=failover; 10%→100% rollout |
| 5 | Cron-job overlap during Custom Report transition | Mutually-exclusive flag; legacy cron short-circuits when new flag on; FOR UPDATE SKIP LOCKED |

§7 Observability — tracing every operation

Every thought, tool call, LLM generation, and system event becomes one indexed row in ai_agent_log. Optimized for fast timeline rendering AND for feeding the reflection pipeline.

The indexes that matter

| Index | Powers |
| --- | --- |
| (ai_agent_run_id, created_at) | Timeline UI |
| (ai_agent_id, created_at DESC) | Recent activity per agent · Reflection pipeline reads this |
| (organization_id, created_at DESC) partial | Cost dashboards (LLM gen + tool calls only) |
| (kind, created_at DESC) | Filter by kind in admin UI |
| GIN on entry JSONB | Attribute search ("find all tool_call where input.to_number = X") |

Day-one OTel-compatible

We don't ship to Langfuse/Datadog yet — but the field names, span/parent_span structure, and trace IDs are OTel-conformant. When we want to export, it's a collector config change, not a refactor.

§8 Self-evolution — the learning loop

The loop closes when build_ai_agent_prompt_service retrieves the top-K relevant skills + recent lessons and injects them into the next run's prompt. Voyager pattern: agents get better at the same task family over time, without retraining.

                        ┌─────────────────────────┐
ai_agent_log (run) ──►  │ RunDailyReflectionJob   │ ──► ai_agent_reflection (day)
                        └─────────────────────────┘
                                     │
                                     ▼
                        ┌─────────────────────────┐
                        │ WeeklyDistillationJob   │ ──► ai_agent_reflection (week)
                        └─────────────────────────┘   + ai_agent_skill (Voyager: success patterns)
                                     │                + ai_agent_lesson (Reflexion: failure lessons)
                                     ▼
                        ┌─────────────────────────┐
                        │ MonthlyDistillationJob  │ ──► ai_agent_reflection (month)
                        └─────────────────────────┘   + curated skills / retired lessons
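
The retrieval step that closes the loop could look roughly like the sketch below. This document does not define how relevance is scored, so the approach here is an assumption: treat `trigger_condition` as a plain keyword, keep skills whose keyword appears in the task description, and rank the matches by observed success rate. `Skill`, `success_rate` and `top_k_skills` are hypothetical names.

```rust
/// Subset of an `ai_agent_skill` row, using column names from the data model.
#[derive(Debug, Clone, PartialEq)]
pub struct Skill {
    pub prompt_fragment: String,
    /// Assumption: a plain keyword; the real format is not specified.
    pub trigger_condition: String,
    pub success_count: u32,
    pub failure_count: u32,
}

impl Skill {
    /// Fraction of recorded uses that succeeded; 0.0 when never used.
    pub fn success_rate(&self) -> f64 {
        let total = self.success_count + self.failure_count;
        if total == 0 {
            0.0
        } else {
            f64::from(self.success_count) / f64::from(total)
        }
    }
}

/// Keep skills whose keyword occurs in the task text (case-insensitive),
/// order them by success rate (highest first), and return at most `k`.
pub fn top_k_skills<'a>(skills: &'a [Skill], task: &str, k: usize) -> Vec<&'a Skill> {
    let task = task.to_lowercase();
    let mut hits: Vec<&Skill> = skills
        .iter()
        .filter(|s| task.contains(s.trigger_condition.to_lowercase().as_str()))
        .collect();
    hits.sort_by(|a, b| b.success_rate().total_cmp(&a.success_rate()));
    hits.truncate(k);
    hits
}
```

A keyword match is the cheapest stand-in while pgvector is deferred; swapping in embedding similarity later would only change the filter and the sort key.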

Cost discipline

  • Opt-in per org with a 14-day free trial. Default off.
  • Cheapest available model for reflection (e.g. deepseek-v3.2).
  • Hard per-agent monthly token budget enforced.
  • Skills with failure rate > threshold are auto-disabled.
  • Weekly distillation re-evaluates daily reflections (averaging effect against bad-day reinforcement).
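
Two of the guards above are simple enough to sketch directly: the hard monthly token budget and the auto-disable rule for failing skills. The function names, the `min_samples` guard, and the idea of checking an estimated token count before the call are assumptions for illustration; the document only states that a budget and a failure-rate threshold exist.

```rust
/// Outcome of a pre-call budget check.
#[derive(Debug, PartialEq)]
pub enum BudgetDecision {
    Allow,
    Reject { reason: String },
}

/// Reject a call when the tokens already used this month plus the estimate
/// for the next call would exceed the agent's monthly limit.
pub fn check_monthly_budget(used_tokens: u64, estimated_tokens: u64, monthly_limit: u64) -> BudgetDecision {
    if used_tokens.saturating_add(estimated_tokens) > monthly_limit {
        BudgetDecision::Reject {
            reason: format!("monthly token budget of {} would be exceeded", monthly_limit),
        }
    } else {
        BudgetDecision::Allow
    }
}

/// Disable a skill once its failure rate is above the threshold.
/// `min_samples` (an assumption) avoids disabling on one or two early failures.
pub fn should_auto_disable(success_count: u32, failure_count: u32, max_failure_rate: f64, min_samples: u32) -> bool {
    let total = u64::from(success_count) + u64::from(failure_count);
    if total == 0 || total < u64::from(min_samples) {
        return false;
    }
    (f64::from(failure_count) / total as f64) > max_failure_rate
}
```

A rejected check is the natural place to write the `system_event` log entry, so the refusal shows up on the same timeline as the runs it blocked.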

Admin curation

The owner can view /agents/{id}/evolution: every reflection, every extracted skill (with success/failure counts), every lesson. They can edit, suppress, or retire any of them. This keeps the loop transparent — agents that learn must also be agents you can teach.

§9 The three initial agents (Phase 1–3 deliverables)

Dashboard Briefing Agent

Trigger: Dashboard page load + scheduled (daily 5 AM org-tz)

Persona default: "Concise, data-driven analyst. Plain English, no jargon."

Tools: get_workspace_summary, get_workspace_needs_attention, get_engagement_stats

Output: Markdown briefing rendered on the dashboard

Maps to existing: generate_dashboard_briefing_service.rs

Text Reply Agent

Trigger: Webhook (inbound SMS)

Persona default: Inherits from existing text_agent (purpose, escalation, restricted topics)

Tools: query_knowledge

Output: Structured 3-suggestion JSON (high/med/low confidence)

Special: Auto-fallback to legacy on schema parse failure
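
The auto-fallback can be sketched as a small wrapper: try the new path, and if it errors, run the legacy path and return a failover record for the log. The `Failover` struct reuses the payload fields of the `failover` log kind; `run_with_fallback` and the literal path labels are hypothetical.

```rust
use std::fmt::Display;

/// Payload for an `ai_agent_log` row of kind `failover`.
#[derive(Debug, PartialEq)]
pub struct Failover {
    pub from_path: &'static str,
    pub to_path: &'static str,
    pub reason: String,
}

/// Run the new path; on error, run the legacy path instead and report
/// why the switch happened. The legacy path is assumed not to fail here.
pub fn run_with_fallback<T, E: Display>(
    new_path: impl FnOnce() -> Result<T, E>,
    legacy_path: impl FnOnce() -> T,
) -> (T, Option<Failover>) {
    match new_path() {
        Ok(value) => (value, None),
        Err(err) => (
            legacy_path(),
            Some(Failover {
                from_path: "ai_agent",
                to_path: "legacy",
                reason: err.to_string(),
            }),
        ),
    }
}
```

Returning the failover record instead of logging inside the wrapper keeps the function pure, so the caller decides where the row is written and the rollout dashboards can count fallbacks directly.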

Custom Report Agent

Trigger: Scheduled (ai_agent_schedule row, default daily at 5 AM org-tz)

Persona default: "Helpful daily-digest assistant. Highlight what changed and what needs attention."

Toggleable sections: new leads · handled calls · messages in · messages out · prioritized tasks

Delivery: notify_service → email + in-app + push (per user pref)

Maps to existing: generate_daily_report_service.rs

§10 Org onboarding — defaults on signup

Hook added to finalize_signup_service.rs:174, right beside the existing notification-preference seeding. The new seed_default_agents_service creates:

  • 1× DashboardBriefingAgent per org — system default, active, autopilot
  • 1× CustomReportAgent per org — system default, active, daily 5 AM schedule
  • 1× TextReplyAgent per phone number provisioned — autopilot=false (suggestions, not auto-replies)

Each seeded agent carries is_system_default = true. The org admin can edit any of these. Setting is_system_default = false orphans the row from future platform updates — so a platform-level persona update doesn't overwrite a customer's edits.
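
The effect of the flag on a platform update can be sketched as follows. `AgentRow` is a hypothetical three-field subset of the `ai_agent` table, and `apply_platform_persona` is an illustrative name, not an existing service.

```rust
/// Minimal subset of an `ai_agent` row for this illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct AgentRow {
    pub id: u64,
    pub is_system_default: bool,
    pub persona: String,
}

/// Apply a platform-level persona update only to rows still flagged as
/// system defaults. Returns how many rows were changed.
pub fn apply_platform_persona(rows: &mut [AgentRow], new_persona: &str) -> usize {
    let mut updated = 0;
    for row in rows.iter_mut().filter(|r| r.is_system_default) {
        row.persona = new_persona.to_string();
        updated += 1;
    }
    updated
}
```

Rows whose flag was cleared are skipped entirely, which is what keeps a customer's edits from being overwritten.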

Non-blocking like notification preferences

If seeding fails, signup still completes. We log the error and the org can have agents seeded later via a backfill job. Signup never fails because the agent table had a transient issue.

§11 Layered rules — system > org > agent

Three layers concatenated into every agent's system prompt. Platform rules are immutable from any UI; they ship via code-reviewed migration.

EFFECTIVE_SYSTEM_PROMPT =
    [Platform rules]     ← immutable; in code + platform_ai_rule table
  + [Org rules]          ← organization_ai_rule table; org admin editable
  + [Agent rules]        ← ai_agent.rules_override; per-agent editable
  + [Persona]            ← ai_agent.persona
  + [Goal]               ← ai_agent.goals + per-run goal
  + [Retrieved skills]   ← top-K ai_agent_skill rows
  + [Recent lessons]     ← recent ai_agent_lesson rows
  + [Kind instructions]  ← static, per-kind
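
A sketch of the concatenation, assuming each layer has already been loaded into memory. The section order follows the listing above; the `PromptLayers` struct, the bracketed section headers in the output, and the choice to skip empty sections are assumptions, and `build_effective_system_prompt` is a hypothetical name.

```rust
/// All inputs to the effective system prompt, already loaded.
pub struct PromptLayers {
    pub platform_rules: Vec<String>,
    pub org_rules: Vec<String>,
    pub agent_rules: Vec<String>,
    pub persona: String,
    pub goal: String,
    pub retrieved_skills: Vec<String>,
    pub recent_lessons: Vec<String>,
    pub kind_instructions: String,
}

/// Concatenate the layers in fixed order, platform rules first.
/// Empty sections are omitted (an assumption) to keep the prompt short.
pub fn build_effective_system_prompt(layers: &PromptLayers) -> String {
    let sections: Vec<(&str, String)> = vec![
        ("Platform rules", layers.platform_rules.join("\n")),
        ("Org rules", layers.org_rules.join("\n")),
        ("Agent rules", layers.agent_rules.join("\n")),
        ("Persona", layers.persona.clone()),
        ("Goal", layers.goal.clone()),
        ("Retrieved skills", layers.retrieved_skills.join("\n")),
        ("Recent lessons", layers.recent_lessons.join("\n")),
        ("Kind instructions", layers.kind_instructions.clone()),
    ];
    sections
        .into_iter()
        .filter(|(_, body)| !body.trim().is_empty())
        .map(|(title, body)| format!("[{}]\n{}", title, body))
        .collect::<Vec<_>>()
        .join("\n\n")
}
```

Because the output is a single string built in one place, it is also the natural value to snapshot into the run's log for auditing.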

Example platform rules (immutable)

  • Never make legal, medical, or financial advice claims.
  • Never share PII outside the org boundary.
  • Always announce when a message is AI-generated if asked.
  • Never autopilot a destructive action without explicit org-owner approval.
  • Always log every tool call with full input/output.

Example org rules (editable)

  • Sign messages as "Alex from Acme Co."
  • Use first-name only when addressing contacts.
  • Never discuss competitor X.

Example agent rules (per-agent overrides)

  • This text agent only handles inbound questions about pricing.
  • This report agent emails the founder, not the team.

Auditable enforcement

Each run snapshots the effective rules it used into its ai_agent_log. Platform changes ship via code-reviewed migration, not ad-hoc SQL. If an agent acted in a way that violates a rule, the log proves what rules were in force at the time.

§12 Verification plan

How we know each phase is safe to ship.

| Phase | Acceptance signal |
| --- | --- |
| 0 | cargo check green; migrations apply + roll back in CI; admin endpoint manually exercised; state-machine unit tests pass |
| 1 | Side-by-side parity test for 7 days; semantic content diff acceptable; cost parity within ±10%; no double-write to ai_usage_log |
| 2 | Nightly synthetic: 100 simulated inbounds → 3 suggestions each on new path, 0 fallbacks; 10% rollout watched 1 week before 100% |
| 3 | 7-day shadow-run dispatcher verified; single test org switched to new job; no duplicate reports |
| 4 | Every plan + template has non-null ai_agent_id after backfill; plan execution success rate stable 2 weeks |
| 5 | New-signup smoke: org gets 2+ ai_agent rows immediately |
| 6 | Unit test: org-rule cannot override platform-rule; effective rules snapshotted in log |
| 7 | Plan-driven memory writes show written_by_ai_agent_run_id set |
| 8 | Per-kind parity test against legacy code path |
| 9 | 30-day-old agent has ≥1 reflection + ≥1 skill, viewable in admin UI |
| 10 | Budget exceeded → next run rejected with user-visible message + system_event log entry |

End-to-end smoke (after Phase 5)

  1. Sign up a new org
  2. Provision a phone number
  3. Observe DashboardBriefing + CustomReport + TextReply agents created
  4. Send an SMS to the phone number → observe 3 suggestions generated via new path
  5. Wait for daily report cron tick → observe email delivered
  6. Open dashboard → observe briefing rendered
  7. Inspect /agents/{id}/runs/{run_id} timeline for each
  8. Confirm ai_usage_log rows have matching ai_agent_log rows

§13 Open questions for you

These are the decisions where multiple reasonable answers exist. Recommended option is highlighted. Please confirm or redirect.

Q1 · NAMING

Use AiAgent (defer voice rename) — or Agent (rename voice now) — or a fresh word entirely?

  • Rename voice → VoiceAgent first, then use Agent. Cleaner code long-term but expensive coordination during the migration.
  • Fresh word (Operator, Worker, AiAssistant). Avoids collision but introduces yet another concept.
Q2 · CONCEPTUAL FRAMING

Are "Plans" subsumed by Agents, or do they stay as a separate concept?

  • Force everything into the plan shape. Adds complexity to simple agents.
  • Plans + Agents stay as fully separate concepts. Loses the unification you asked for.
Q3 · CUTOVER STRATEGY

How aggressive should the cutover be?

  • Hard cutover per phase. Faster but each phase ships with rollback panic if something breaks.
Q4 · SELF-EVOLUTION

Default-on, opt-in, or paid premium?

  • Default-on with hard token budgets. Aggressive learning but billing complaints likely.
  • Premium-tier feature only. Monetizes evolution; lower-tier orgs miss out.
Q5 · VECTOR EMBEDDINGS

Add pgvector for semantic memory now or later?

  • Add pgvector in Phase 0. Higher upfront cost; locks in retrieval strategy before we know what's needed.
Q6 · TOOL-CALL TYPING

Typed per kind, one megaenum, or untyped JSONB?

  • Single megaenum across all kinds. Type safety everywhere; gets unwieldy.
  • Untyped — just JSONB. Loses the safety already proven valuable in plans.
Q7 · TEXT-SUGGESTION OUTPUT

Structured output (schema::<T>) vs prompt template?

  • Switch to a prompt template + post-hoc parsing. More fragile.
  • Mixed: structured for capable models, prompt template for fallback model. Most complex; only if (a) proves unreliable.
Q8 · VOICE AGENT FUTURE

When (if ever) does the voice agent join the AiAgent family?

  • Force voice agents into the same ai_agent_run shape now. Probably forces awkward modeling.
  • Leave voice agents fully separate forever. Misses cross-cutting features (rules, budget, audit).
Q9 · CUSTOM REPORTS — HOW MUCH FREEDOM?

Canned options, free-text prompt, or visual builder?

  • Fully free-text custom report prompt. Maximum flexibility; harder to budget; quality varies.
  • Visual builder (sections + filters + frequency) with no free-text. Most polished UX; biggest scope.
Q10 · WHERE TO START

First PR scope?

  • Phase 0 + Phase 1 together. Faster end-to-end demo; bigger review.
  • Skip Phase 0; rename plan_template to agent in place. Riskiest; collides with voice agent.

§14 References

Primary sources used in this research.

  1. Hermes Agent — Nous Research · Profiles
  2. Generative Agents — Park et al., Stanford/Google (UIST '23)
  3. Reflexion: Language Agents with Verbal Reinforcement Learning — Shinn et al. (NeurIPS 2023)
  4. Voyager: Open-Ended Embodied Agent with LLMs — Wang et al. (2023)
  5. Anthropic — Building Effective Agents (Dec 2024)
  6. CrewAI Documentation — Agents & Memory
  7. Letta (MemGPT) — Agent Memory
  8. OpenTelemetry GenAI Semantic Conventions
  9. Langfuse — Open-Source LLM Observability
  10. Policy-as-Prompt — arXiv:2509.23994