Original architecture documented by Eric Cheng and his AI assistant. This v2 adapts the design for Discord-first command surfaces, Qwen3-TTS, and Mac mini deployments.

OpenClaw → iMessage Flow

Command your AI assistant via Discord or Telegram. It sends messages — text or AI-generated voice — to your contacts via iMessage. Everything runs locally on a Mac. No cloud TTS, no third-party message routing.

What changed in v2

| Component | v1 | v2 |
| --- | --- | --- |
| Command surface | Telegram only | Discord (primary) + Telegram |
| Hardware | MacBook Pro | Mac mini (always-on, headless) |
| TTS engine | Kokoro-82M (fixed presets) | Qwen3-TTS 0.6B/1.7B (voice cloning, natural language control) |
| Voice profiles | Preset voice IDs | Voice design (describe it) or voice clone (3s sample) |
| Languages | 8 | 10 (added Korean, Russian) |
| Security | 5 layers | 5 layers + FileVault, macOS Firewall, Tailscale |

High-Level Architecture

      graph LR
        DISCORD["Discord<br/>Primary command surface"]
        TELEGRAM["Telegram<br/>Secondary / mobile"]
        GW["Mac mini<br/>OpenClaw Gateway<br/>always-on, headless"]
        AGENT["Main Agent<br/>Claude Opus"]
        PROFILE["Contact Profiles<br/>language · tone · voice"]
        QWEN["Qwen3-TTS<br/>0.6B / 1.7B · local"]
        FFMPEG["ffmpeg<br/>WAV → M4A"]
        IMSG["imsg CLI<br/>Messages.app"]
        RECV["Recipient<br/>iMessage"]

        DISCORD -->|"command"| GW
        TELEGRAM -->|"command"| GW
        GW --> AGENT
        AGENT -->|"reads"| PROFILE
        AGENT -->|"text"| QWEN
        PROFILE -->|"voice desc / clone ref"| QWEN
        QWEN -->|"WAV"| FFMPEG
        FFMPEG -->|"M4A"| IMSG
        AGENT -->|"text only"| IMSG
        IMSG -->|"iMessage"| RECV
        AGENT -->|"confirms"| GW
        GW -->|"reply"| DISCORD

        classDef discord fill:#151a2e,stroke:#7289da,stroke-width:2px,color:#e2e8f0
        classDef tg fill:#111e30,stroke:#60a5fa,stroke-width:1.5px,color:#e2e8f0
        classDef mac fill:#0d1f18,stroke:#34d399,stroke-width:2px,color:#e2e8f0
        classDef tts fill:#1a1806,stroke:#fbbf24,stroke-width:1.5px,color:#e2e8f0
        classDef imsg fill:#0d1f14,stroke:#4ade80,stroke-width:2px,color:#e2e8f0
        classDef recv fill:#0e1119,stroke:#1a1d24,stroke-width:1.5px,color:#e2e8f0
        class DISCORD discord
        class TELEGRAM tg
        class GW,AGENT,PROFILE mac
        class QWEN,FFMPEG tts
        class IMSG imsg
        class RECV recv
Discord (primary)
Telegram (secondary)
OpenClaw (processing)
TTS pipeline
iMessage (delivery)

Hardware & Account Isolation

Dedicated Mac mini

Always-on · headless · Apple Silicon
  • Runs OpenClaw 24/7 as a LaunchAgent
  • Standalone Apple ID (not your personal account)
  • Dedicated email + eSIM on a standalone iPhone
  • Home network + Tailscale for secure remote access
  • FileVault enabled (full disk encryption)
  • macOS Firewall enabled

Why Physical Isolation

Your data never leaves your hardware
  • iMessage requires a real Apple ID signed into Messages.app
  • Full Disk Access granted to OpenClaw.app process only
  • No personal data on the assistant's machine
  • TTS runs 100% locally — no audio sent to any cloud
  • Tailscale for remote access (never port forwarding)

Message Flow — Step by Step

      graph TD
        CMD["Owner sends command<br/>'Send X a voice message about Y'"]
        ACK["Agent reads context<br/>contact profile + permissions"]
        COMPOSE["Compose message<br/>language · tone · dialect per profile"]
        PREVIEW["Draft preview<br/>sent to owner for approval"]
        APPROVE{"Approve?"}
        ITERATE["Owner gives direction<br/>'more grit' · 'shorter' · 'add a joke'"]
        TTS_CHECK{"Voice message?"}
        TTS["5a. Qwen3-TTS<br/>voice design or clone"]
        CONVERT["5b. ffmpeg<br/>WAV → M4A"]
        SEND_TEXT["imsg send --text"]
        SEND_VOICE["imsg send --file"]
        LOG["Log to outbound-log.md"]
        CONFIRM["Confirm on Discord"]

        CMD --> ACK
        ACK --> COMPOSE
        COMPOSE --> PREVIEW
        PREVIEW --> APPROVE
        APPROVE -->|"revise"| ITERATE
        ITERATE --> COMPOSE
        APPROVE -->|"send it"| TTS_CHECK
        TTS_CHECK -->|"yes"| TTS
        TTS --> CONVERT
        CONVERT --> SEND_VOICE
        TTS_CHECK -->|"text only"| SEND_TEXT
        SEND_TEXT --> LOG
        SEND_VOICE --> LOG
        LOG --> CONFIRM

        classDef cmd fill:#151a2e,stroke:#7289da,stroke-width:2px,color:#e2e8f0
        classDef agent fill:#0d1f18,stroke:#34d399,stroke-width:1.5px,color:#e2e8f0
        classDef decision fill:#1a1806,stroke:#fbbf24,stroke-width:2px,color:#e2e8f0
        classDef tts fill:#1a1806,stroke:#fbbf24,stroke-width:1.5px,color:#e2e8f0
        classDef send fill:#0d1f14,stroke:#4ade80,stroke-width:2px,color:#e2e8f0
        classDef log fill:#0e1119,stroke:#1a1d24,stroke-width:1.5px,color:#e2e8f0
        class CMD,PREVIEW,CONFIRM,ITERATE cmd
        class ACK,COMPOSE agent
        class APPROVE,TTS_CHECK decision
        class TTS,CONVERT tts
        class SEND_TEXT,SEND_VOICE send
        class LOG log
1. Owner sends command on Discord (or Telegram)

Natural language: "Send María a voice message wishing her happy birthday". Discord is the primary surface — a dedicated channel the owner uses for all assistant interaction. Telegram works too, especially on mobile.

2. Agent reads context

The agent loads memory/contacts/<name>.md for language, tone, voice preferences, and relationship context, then checks PERMISSIONS.md to verify that outbound messaging is allowed for this contact.
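The context step can be sketched as a default-deny lookup. This is only an illustration: the guide does not specify the format of PERMISSIONS.md, so the `Permissions` class below is a hypothetical parsed form (contact name mapped to allowed message types), and `contact_profile_path` simply mirrors the path layout quoted above.

```python
from dataclasses import dataclass, field


@dataclass
class Permissions:
    """Hypothetical parsed form of PERMISSIONS.md: contact -> allowed message types."""
    allowed: dict[str, set[str]] = field(default_factory=dict)

    def outbound_allowed(self, contact: str, message_type: str) -> bool:
        # Default deny: unknown contacts and unlisted message types are refused.
        return message_type in self.allowed.get(contact, set())


def contact_profile_path(name: str) -> str:
    # Path layout taken from the guide: memory/contacts/<name>.md
    return f"memory/contacts/{name}.md"
```

A contact that is missing from the mapping is treated the same as a contact with no allowed message types, which matches the "default deny" rule described later in the security section.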

3. Compose message

Writes the message respecting the contact's language (e.g. Castilian Spanish with distinción), humor level, and relationship dynamic. The agent brings its own personality — it's not just a relay.

4. Draft preview → Owner

The full draft is sent back for review. Nothing leaves the Mac until explicitly approved. The owner can iterate with tone direction — "more casual", "add humor", "shorter". Multiple rounds are normal and encouraged.
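The preview and revise cycle is a simple loop: show the draft, read the owner's reply, and either stop or revise. The sketch below assumes two hypothetical callbacks, `get_feedback` (posts the draft to the owner and returns their reply) and `revise` (asks the agent for a new draft); the approval phrases are illustrative, taken from the "send it" and "yes" wording used in this guide.

```python
from typing import Callable

# Replies treated as approval; an assumption based on the wording in this guide.
APPROVAL_PHRASES = {"send it", "yes"}


def review_loop(
    draft: str,
    get_feedback: Callable[[str], str],
    revise: Callable[[str, str], str],
) -> str:
    """Repeat preview -> feedback -> revise until the owner approves the draft."""
    while True:
        feedback = get_feedback(draft)
        if feedback.strip().lower() in APPROVAL_PHRASES:
            return draft  # only an explicitly approved draft leaves the loop
        draft = revise(draft, feedback)
```

Because the function returns only on an approval phrase, there is no code path that sends an unreviewed draft.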

5. TTS generation (voice messages)

Qwen3-TTS generates speech using either voice design (describe the voice: "warm male, slight accent, moderate pace") or voice clone (a 3-second reference audio sample). It runs 100% locally on Apple Silicon. The output WAV is then converted to M4A with ffmpeg for iMessage.
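The WAV to M4A conversion can be sketched with a small wrapper around ffmpeg. The 128k AAC bitrate follows the "AAC 128k" label in the TTS diagram; the function names are hypothetical, and the Qwen3-TTS call itself is left out because its invocation is not documented here.

```python
import subprocess
from pathlib import Path


def ffmpeg_args(wav_path: str, bitrate: str = "128k") -> list[str]:
    """Build the ffmpeg command that re-encodes a WAV file as AAC in an M4A container."""
    m4a_path = str(Path(wav_path).with_suffix(".m4a"))
    # -y overwrites an existing output file; -c:a aac selects ffmpeg's AAC encoder.
    return ["ffmpeg", "-y", "-i", wav_path, "-c:a", "aac", "-b:a", bitrate, m4a_path]


def convert_to_m4a(wav_path: str) -> str:
    """Run ffmpeg (must be on PATH) and return the path of the converted file."""
    args = ffmpeg_args(wav_path)
    subprocess.run(args, check=True)  # raises CalledProcessError if ffmpeg fails
    return args[-1]
```

Keeping the argument builder separate from the `subprocess` call makes the command easy to inspect or log before anything is executed.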

6. Send via iMessage

Text: imsg send --to "+1..." --text "message". Voice: imsg send --to "+1..." --file output.m4a. The CLI automates Messages.app, so the recipient sees a normal iMessage from the assistant's Apple ID.
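A thin wrapper can build the two imsg invocations shown above. This sketch uses only the `send`, `--to`, `--text`, and `--file` arguments quoted in this guide; the function name is hypothetical, and any other imsg behaviour should be checked against the CLI's own help output.

```python
from typing import Optional


def imsg_send_args(
    to: str, text: Optional[str] = None, file: Optional[str] = None
) -> list[str]:
    """Build an `imsg send` argument list for either a text or a file message."""
    if (text is None) == (file is None):
        raise ValueError("provide exactly one of text or file")
    args = ["imsg", "send", "--to", to]
    if text is not None:
        args += ["--text", text]
    else:
        args += ["--file", file]
    return args
```

Passing the resulting list to `subprocess.run` (rather than a shell string) avoids quoting problems when the message contains quotes or other shell metacharacters.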

7. Log & confirm

Every outbound message is logged to memory/outbound-log.md with timestamp, channel, recipient, summary, and approval reference. A delivery confirmation is sent to the owner on Discord.
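An audit entry with the five fields listed above could be written as follows. The line layout (a Markdown bullet with pipe-separated fields) is an assumption made for illustration; the guide names the fields and the file but not the format.

```python
from datetime import datetime, timezone
from typing import Optional


def format_log_entry(
    channel: str,
    recipient: str,
    summary: str,
    approval_ref: str,
    when: Optional[datetime] = None,
) -> str:
    """Format one audit line: timestamp, channel, recipient, summary, approval reference."""
    when = when or datetime.now(timezone.utc)
    stamp = when.isoformat(timespec="seconds")
    return f"- {stamp} | {channel} | {recipient} | {summary} | approval: {approval_ref}"


def append_log(entry: str, path: str = "memory/outbound-log.md") -> None:
    """Append the entry to the log file; append mode never rewrites earlier lines."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(entry + "\n")
```

Opening the file in append mode keeps the log write-once in practice, which is what makes it useful as an audit trail.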

Qwen3-TTS — Voice Generation

      graph LR
        TEXT["Approved text"]
        DESC["Voice description<br/>'warm, friendly, slight accent'"]
        CLONE["Reference audio<br/>3-10s clip"]
        QWEN["Qwen3-TTS<br/>0.6B or 1.7B<br/>local ONNX"]
        WAV["output.wav"]
        FFMPEG["ffmpeg<br/>AAC 128k"]
        M4A["output.m4a<br/>iMessage-ready"]

        TEXT --> QWEN
        DESC -->|"voice design"| QWEN
        CLONE -->|"voice clone"| QWEN
        QWEN --> WAV
        WAV --> FFMPEG
        FFMPEG --> M4A

        classDef input fill:#0d1f18,stroke:#34d399,stroke-width:1.5px,color:#e2e8f0
        classDef voice fill:#eef0ff,stroke:#5865f2,stroke-width:1.5px,color:#3b40a0
        classDef process fill:#1a1806,stroke:#fbbf24,stroke-width:1.5px,color:#e2e8f0
        classDef output fill:#0d1f14,stroke:#4ade80,stroke-width:2px,color:#e2e8f0
        class TEXT input
        class DESC,CLONE voice
        class QWEN,FFMPEG process
        class WAV,M4A output

Voice Design

Describe the voice you want

Natural language control: "A deep, calm voice with warmth. Professional but approachable. Moderate pace." No pre-made voices to browse — just describe it.

Voice Clone

3 seconds of audio → your voice

Zero-shot cloning from a short reference clip. Store a sample per contact and the system reproduces their voice, or yours. Use clear audio with minimal background noise.
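Before storing a reference clip it can help to check its length. The sketch below uses Python's standard `wave` module and assumes the clip is an uncompressed WAV file; the 3 to 10 second window comes from the "3-10s clip" label in the diagram above, and the function names are hypothetical.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file from its frame count and sample rate."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_usable_reference(
    path: str, min_seconds: float = 3.0, max_seconds: float = 10.0
) -> bool:
    """Check that a reference clip falls inside the assumed 3-10 second window."""
    return min_seconds <= wav_duration_seconds(path) <= max_seconds
```

This checks duration only; audio quality and background noise still need a listen.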

10 Languages

Per-contact language profiles

Chinese, English, Japanese, Korean, German, French, Russian, Portuguese, Spanish, Italian — plus dialectal voice profiles. Language set per contact, not globally.

Command Surfaces — Discord + Telegram

Discord (Primary)

Dedicated channel for assistant interaction
  • Private server with a dedicated #assistant channel
  • Rich formatting: embeds, reactions, threads for long conversations
  • Desktop + mobile — always accessible
  • Approval via reactions or simple "yes"
  • Delivery confirmations appear inline in the conversation

Telegram (Secondary)

Mobile-first, lightweight
  • Bot token configured in OpenClaw channels
  • dmPolicy: "pairing" — one-time code for unknowns
  • Great for quick commands on the go
  • Same approval flow — draft → preview → approve
  • Both surfaces see the same contact profiles and permissions

Security Architecture — 5 Layers + OS Hardening

      graph TD
        REQ["Outbound message request"]
        L1["Layer 1: SOUL.md<br/>Hard rules · Default DENY"]
        L2["Layer 2: AGENTS.md<br/>Mandatory permission check"]
        L3["Layer 3: PERMISSIONS.md<br/>Per-contact allowlists"]
        L4["Layer 4: Channel Config<br/>dmPolicy: allowlist<br/>groupPolicy: disabled"]
        L5["Layer 5: Audit Log<br/>outbound-log.md"]
        OS["OS Hardening<br/>FileVault · Firewall · Tailscale"]
        SEND["✅ Message sent"]
        BLOCK["❌ Blocked"]

        REQ --> L1
        L1 -->|"pass"| L2
        L1 -->|"deny"| BLOCK
        L2 -->|"pass"| L3
        L2 -->|"deny"| BLOCK
        L3 -->|"approved"| L4
        L3 -->|"unknown"| BLOCK
        L4 -->|"pass"| L5
        L5 --> SEND
        OS -.->|"protects all layers"| L1

        classDef req fill:#0e1119,stroke:#1a1d24,stroke-width:1.5px,color:#e2e8f0
        classDef layer fill:#fff1f2,stroke:#be123c,stroke-width:1.5px,color:#881337
        classDef os fill:#eef2ff,stroke:#6366f1,stroke-width:1.5px,color:#3730a3
        classDef pass fill:#0d1f18,stroke:#34d399,stroke-width:2px,color:#e2e8f0
        classDef block fill:#fef2f2,stroke:#dc2626,stroke-width:2px,color:#7f1d1d
        class REQ req
        class L1,L2,L3,L4,L5 layer
        class OS os
        class SEND pass
        class BLOCK block

Permission Tiers

| Tier | Policy | Applies To |
| --- | --- | --- |
| T1 | Autonomous | Read messages, triage, 2FA |
| T2 | Pre-approved | Specific contacts + types |
| T3 | Approval required | All outbound (default) |
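The tier table can be read as a lookup that falls back to T3 whenever nothing more permissive matches. In this sketch the action names and the `pre_approved` set of (contact, message type) pairs are hypothetical; only the three tiers and their meanings come from the table.

```python
from enum import Enum
from typing import Optional


class Tier(Enum):
    T1_AUTONOMOUS = 1
    T2_PRE_APPROVED = 2
    T3_APPROVAL_REQUIRED = 3


# Hypothetical action names for the T1 row: read messages, triage, 2FA.
T1_ACTIONS = {"read_messages", "triage", "2fa"}


def resolve_tier(
    action: str,
    contact: Optional[str] = None,
    message_type: Optional[str] = None,
    pre_approved: frozenset = frozenset(),
) -> Tier:
    """Map an action to its permission tier, defaulting to approval required."""
    if action in T1_ACTIONS:
        return Tier.T1_AUTONOMOUS
    if action == "send" and (contact, message_type) in pre_approved:
        return Tier.T2_PRE_APPROVED
    return Tier.T3_APPROVAL_REQUIRED  # default for all outbound and unknown actions
```

With an empty `pre_approved` set, every send resolves to T3, which is the default posture the guide describes.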

OS-Level Hardening

Beyond the application layer
  • FileVault: Full disk encryption — data at rest is protected
  • macOS Firewall: Enabled, blocks unsolicited inbound
  • Tailscale: Private mesh VPN for remote access (no port forwarding)
  • FDA scoped: Full Disk Access only for OpenClaw.app
  • Gateway loopback: OpenClaw binds to 127.0.0.1 only
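The loopback requirement is easy to verify for a configured bind address. The sketch below uses Python's standard `ipaddress` module; the function name is hypothetical, and how the address is read from the OpenClaw config is left out because the config schema is not shown in this guide.

```python
import ipaddress


def is_loopback_bind(host: str) -> bool:
    """Return True only if host is a literal loopback IP such as 127.0.0.1 or ::1."""
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example "localhost"); refuse rather than resolve it.
        return False
```

An address such as 0.0.0.0 fails this check, which is the case that would expose the gateway to the local network.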

What unauthorized contacts experience

No indication that an AI is involved

Inbound Monitoring — Reply Detection

      graph LR
        CRON["⏰ Scheduled check<br/>cron job"]
        CHATS["imsg chats<br/>--limit 10"]
        HIST["imsg history<br/>--chat-id N"]
        TRIAGE{"Triage"}
        URGENT["🔴 URGENT<br/>relay immediately"]
        NOTABLE["🟡 NOTABLE<br/>relay to owner"]
        SKIP["⚪ SKIP<br/>ignore"]

        CRON --> CHATS
        CHATS --> HIST
        HIST --> TRIAGE
        TRIAGE --> URGENT
        TRIAGE --> NOTABLE
        TRIAGE --> SKIP

        classDef cron fill:#eef2ff,stroke:#6366f1,stroke-width:1.5px,color:#3730a3
        classDef imsg fill:#edfcf0,stroke:#16a34a,stroke-width:1.5px,color:#14532d
        classDef decision fill:#1a1806,stroke:#fbbf24,stroke-width:2px,color:#e2e8f0
        classDef urgent fill:#fef2f2,stroke:#dc2626,stroke-width:1.5px,color:#7f1d1d
        classDef notable fill:#fffbeb,stroke:#ca8a04,stroke-width:1.5px,color:#713f12
        classDef skip fill:#f3f4f6,stroke:#6b7280,stroke-width:1px,color:#374151
        class CRON cron
        class CHATS,HIST imsg
        class TRIAGE decision
        class URGENT urgent
        class NOTABLE notable
        class SKIP skip

Monitoring architecture

Lightweight, runs on a cheap model

A scheduled cron job checks imsg chats for recent activity. New replies are triaged: urgent items are relayed immediately to the owner on Discord, notable items are batched into the next check, and routine messages are skipped. The monitoring agent runs on a cheaper model to avoid burning tokens on routine checks.
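Once the triage model has labelled each reply, routing is a three-way split. The sketch below covers the routing only and assumes the labels arrive as the strings URGENT, NOTABLE, and SKIP from the diagram; treating an unrecognised label as notable is a design choice made here so that nothing is dropped silently.

```python
from typing import Iterable


def route_replies(labelled: Iterable[tuple[str, str]]) -> tuple[list[str], list[str]]:
    """Split (label, message) pairs into messages to relay now and messages to batch."""
    relay_now: list[str] = []
    batch: list[str] = []
    for label, message in labelled:
        if label == "URGENT":
            relay_now.append(message)
        elif label == "SKIP":
            continue  # routine message: ignored
        else:
            # NOTABLE, plus any unexpected label, waits for the next scheduled check.
            batch.append(message)
    return relay_now, batch
```

The caller would post `relay_now` to the owner immediately and hold `batch` until the next cron run.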

Contact Profiles

Stored in memory/contacts/<name>.md

Each contact gets fully personalized handling

Software Stack

OpenClaw Gateway

LaunchAgent · always-on
  • Main agent: Claude Opus (Anthropic)
  • Sub-agents: Cheaper models for heartbeats and triage
  • Config: ~/.openclaw/openclaw.json
  • Gateway binds to loopback only

iMessage Channel

Outbound messaging surface
  • CLI: imsg (Homebrew)
  • DM policy: allowlist
  • Group policy: disabled
  • FDA: Scoped to OpenClaw.app

Qwen3-TTS

Local · no cloud dependency
  • Models: 0.6B (fast) / 1.7B (quality)
  • Runtime: Python 3.12 venv or Ollama
  • Voice modes: Design, clone, or both
  • License: Open-source

Supporting tools

Standard utilities
  • ffmpeg: Audio format conversion
  • Tailscale: Secure remote access
  • Homebrew: Package management
  • Messages.app: macOS built-in

Future Expansion

Tier 2 Autonomy

Pre-approve specific contacts + message types. "Confirm plans with María" without owner approval each time. Trust earned incrementally.

Agent Mode

Allow the assistant to respond autonomously to approved contacts using their personality profile — a conversational AI with the owner's voice and context.

SMS & WhatsApp

SMS via the dedicated iPhone's eSIM (imsg --service sms). WhatsApp via linked device on same Mac. Same permission framework applies.

Voice Cloning Library

Build a library of 3-second voice samples per contact. The assistant picks the right voice automatically — or blends characteristics from multiple references.

Core principle: The owner is always in control. Every outbound message requires explicit approval by default. The assistant has the capability to send messages autonomously — but the permission defaults to deny. Trust is earned incrementally, tier by tier.