Product UX Direction

Status: working direction

Calciforge currently exposes a lot of power through chat commands, installer prompts, config files, docs, and service logs. That is workable for early operators, but it makes the product feel clunky because the user has to remember hidden state: which agent is active, which channel they are speaking through, which host owns a local link, which security layer is in force, and which command shape is safe.

The product should feel like a control surface between a person, their agents, and the security gateway. Chat remains valuable, but commands should behave more like a small conversational CLI: discoverable, consistent, state-aware, and easy to recover from.

Principles

Command Shape

The ! prefix is not sacred. It is useful because it is easy to detect and unlikely to collide with natural agent prompts, but it also makes Calciforge feel like a bot command layer bolted onto an agent conversation.

Near term, keep ! for compatibility but make commands more predictable:

Longer term, consider letting channel adapters accept slash-style commands, quick replies, or buttons where the channel supports them, alongside a local web control panel. The text grammar should remain the source of truth so agents and humans can use the same operations.
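One way to keep the text grammar as the source of truth is to normalize every prefix into a single parsed command before dispatch. The sketch below is illustrative only: `Command`, `parse_command`, and the `allow_slash` flag are hypothetical names, not Calciforge's actual API.

```rust
/// A parsed control command, independent of the prefix used to enter it.
/// Hypothetical type for illustration; not Calciforge's real command model.
#[derive(Debug, PartialEq, Eq)]
pub struct Command {
    pub name: String,
    pub args: Vec<String>,
}

/// Parse a chat message into a command, or return `None` when the message
/// is ordinary text that should pass through to the agent.
///
/// `!` is always accepted; `/` is accepted only when `allow_slash` is set
/// (an assumed per-channel option). Both prefixes share one grammar.
pub fn parse_command(input: &str, allow_slash: bool) -> Option<Command> {
    let trimmed = input.trim();
    let rest = if let Some(r) = trimmed.strip_prefix('!') {
        r
    } else if allow_slash {
        trimmed.strip_prefix('/')?
    } else {
        return None;
    };

    // Require a letter right after the prefix so "!!", "! wow", or a lone
    // "/" are treated as plain text rather than malformed commands.
    if !rest.chars().next()?.is_ascii_alphabetic() {
        return None;
    }

    let mut words = rest.split_whitespace();
    let name = words.next()?.to_lowercase();
    let args = words.map(str::to_string).collect();
    Some(Command { name, args })
}
```

With this shape, `!secure input` and `/secure input` produce the same `Command`, so a slash mode adds an entry point without adding a second grammar.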

Channel-Native Affordances

Calciforge should not force every channel into plain text when the transport offers better controls. Keep text as the universal fallback, but model richer controls as optional channel capabilities:

Matrix needs a separate security track for end-to-end encryption. The current Matrix channel uses plaintext Client-Server API rooms; E2EE support requires Olm/Megolm device/session management, trust or verification policy, encrypted media handling, and clear recovery behavior when keys are unavailable. Until that lands, docs should label Matrix as plaintext and operators should choose Signal or another encrypted channel for sensitive chat content.

Expose these capabilities through configuration, not channel assumptions. ui_mode = "auto" can enable safe native controls for a direct channel while ui_mode = "text" keeps bridge-heavy setups, such as WhatsApp through Matrix or Beeper, on the plain text interface. Button presses should always call the same command handlers as text input so both modes stay behaviorally identical.
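A minimal sketch of that rule, assuming `ui_mode` is parsed into an enum and that a button press carries the command text it stands for. `UiMode`, `Inbound`, `use_native_controls`, and `handle` are hypothetical names, and `run_command` is a placeholder for the real handlers.

```rust
use std::str::FromStr;

/// Parsed form of the `ui_mode` setting. Only the two values named in the
/// text are modeled here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UiMode {
    Auto,
    Text,
}

impl FromStr for UiMode {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "auto" => Ok(UiMode::Auto),
            "text" => Ok(UiMode::Text),
            other => Err(format!("unknown ui_mode: {other}")),
        }
    }
}

/// What arrived from the channel (hypothetical shape).
pub enum Inbound {
    Text(String),
    /// A button press carries the exact command text the button represents.
    Button { command_text: String },
}

/// Native controls are used only when configuration allows them AND the
/// channel reports support; configuration alone never forces them on.
pub fn use_native_controls(mode: UiMode, channel_supports_buttons: bool) -> bool {
    mode == UiMode::Auto && channel_supports_buttons
}

/// Single entry point: typed text and button presses reach the same handler.
pub fn handle(inbound: Inbound) -> String {
    let text = match inbound {
        Inbound::Text(t) => t,
        Inbound::Button { command_text } => command_text,
    };
    run_command(&text)
}

/// Placeholder for the real command handlers.
fn run_command(text: &str) -> String {
    format!("ran: {}", text.trim())
}
```

Because the button variant collapses into command text before dispatch, a bridge-heavy setup pinned to `ui_mode = "text"` loses only the rendering, not any behavior.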

Telegram and WhatsApp likely have useful non-text surfaces. Matrix and SMS/iMessage need explicit research against their current APIs and the libraries Calciforge uses before committing to a shared abstraction. A reasonable architecture is a channel capability trait: handlers ask for a high-level interaction such as “single choice”, “approval”, “artifact”, or “form link”; each channel renders the best native affordance it can, then falls back to deterministic text.
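The capability trait described above could look roughly like this. It is a sketch under assumptions: `Interaction`, `Rendered`, `ChannelCapabilities`, and both example channels are hypothetical, and the `Native` variant holds a descriptive string as a stand-in for a real channel payload.

```rust
/// High-level interactions a handler can request, mirroring the four
/// kinds named in the text.
pub enum Interaction {
    SingleChoice { prompt: String, options: Vec<String> },
    Approval { prompt: String },
    Artifact { title: String, url: String },
    FormLink { prompt: String, url: String },
}

/// What a channel ends up sending.
#[derive(Debug, PartialEq, Eq)]
pub enum Rendered {
    /// Stand-in for a native widget payload (buttons, lists, and so on).
    Native(String),
    /// Deterministic plain-text fallback.
    Text(String),
}

pub trait ChannelCapabilities {
    /// Return a native rendering, or `None` if this channel has no native
    /// affordance for the interaction. The default supports nothing.
    fn render_native(&self, _interaction: &Interaction) -> Option<Rendered> {
        None
    }

    /// Best native affordance first, deterministic text otherwise.
    fn render(&self, interaction: &Interaction) -> Rendered {
        self.render_native(interaction)
            .unwrap_or_else(|| Rendered::Text(render_text(interaction)))
    }
}

/// Text fallback shared by every channel so output stays predictable.
pub fn render_text(interaction: &Interaction) -> String {
    match interaction {
        Interaction::SingleChoice { prompt, options } => {
            let mut out = prompt.clone();
            for (i, opt) in options.iter().enumerate() {
                out.push_str(&format!("\n{}. {}", i + 1, opt));
            }
            out
        }
        Interaction::Approval { prompt } => format!("{prompt}\nReply yes or no."),
        Interaction::Artifact { title, url } => format!("{title}: {url}"),
        Interaction::FormLink { prompt, url } => format!("{prompt}\n{url}"),
    }
}

/// A channel with no native controls: everything falls back to text.
pub struct PlainTextChannel;
impl ChannelCapabilities for PlainTextChannel {}

/// A channel that can show approval buttons but nothing else natively.
pub struct ButtonChannel;
impl ChannelCapabilities for ButtonChannel {
    fn render_native(&self, interaction: &Interaction) -> Option<Rendered> {
        match interaction {
            Interaction::Approval { prompt } => {
                Some(Rendered::Native(format!("buttons[Approve|Deny]: {prompt}")))
            }
            _ => None,
        }
    }
}
```

Handlers only ever call `render`, so adding a channel means implementing `render_native` for whatever it supports and inheriting the text path for the rest.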

WhatsApp is worth treating as a dependency-risk item. If the embedded WhatsApp Web library cannot expose reply buttons or lists safely, Calciforge can still ship text/media support and use Telegram or the local web UI as a control surface. A narrow fork may be justified later if native WhatsApp controls become important enough and the upstream crate does not accept or prioritize the needed API surface.

Secret Input UX

!secure input and !secure bulk should read as local-network paste flows:

Local Web Control Surface

A local web UI would reduce chat-command pressure without replacing chat:

This can start as localhost/LAN-only and later support authenticated remote access if the security model is explicit.
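As a rough illustration of a localhost/LAN-only gate, the check below accepts only loopback, RFC 1918 private, and link-local IPv4 peers, plus IPv6 loopback. `is_local_or_lan` is a hypothetical helper, and the policy is an assumption: it deliberately rejects other IPv6 ranges for simplicity, and a real deployment would also need to decide which interface the server binds to.

```rust
use std::net::IpAddr;

/// Returns true when the peer address is on this host or a typical LAN.
/// Assumed policy for illustration, not a complete security boundary.
pub fn is_local_or_lan(addr: IpAddr) -> bool {
    match addr {
        IpAddr::V4(v4) => v4.is_loopback() || v4.is_private() || v4.is_link_local(),
        // Only loopback is accepted for IPv6 in this sketch.
        IpAddr::V6(v6) => v6.is_loopback(),
    }
}
```

Authenticated remote access would replace this gate with an explicit identity check instead of widening the accepted address ranges.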

References