Ryan's AI employee. Runs 24/7. Ships while you sleep. This page documents everything about the Molty installation — architecture, config, capabilities, and status.
💬 Open Molty in Telegram

The AI employee that never clocks out — Ryan's personal assistant, developer, and business operator.
Primary brain — Anthropic's most capable model. Deep reasoning, expert coding, nuanced conversation. Falls back through Sonnet → Haiku → GPT 5.2.
Creates PRs, builds websites, deploys to Cloudflare, manages GitHub repos. Never pushes live — always branches and PRs for review.
DMs, group chats, inline buttons. Streams responses in real-time. Handles images, files, voice messages.
Heartbeats, proactive monitoring, adaptive rate limiting, memory maintenance. Manages itself so you don't have to.
Clawdbot gateway running on Ryan's Mac, connected to Telegram, with full filesystem and tool access.
Clawdbot Gateway
- Version: v2026.1.24
- OS: macOS (Darwin x64)
- Runtime: Node v22
- Mode: Local, loopback
- Port: 18789
- Main agent workspace: `/clawd-main`

Flow: Telegram → Clawdbot Gateway → Claude API → Tools (exec, browser, web, files, cron, memory)
Messages arrive via Telegram long-polling, get processed by the gateway which manages sessions, routes to the right agent, calls the AI model, executes tool calls, and streams the response back. The gateway handles auth, rate limiting, model fallbacks, and session persistence automatically.
Multi-model setup with automatic failover for maximum reliability.
When the primary model hits a rate limit (429), Molty automatically falls back to the next model in the chain. Cooldown is exponential — the primary is retried after increasing intervals.
Most capable model. Deep reasoning, complex coding, long context. ~$15/MTok input, $75/MTok output.
Smart and fast. Great for most tasks. ~$3/MTok input, $15/MTok output. 5x cheaper than Opus.
Lightweight and fast. Much higher rate limits. ~$0.25/MTok input, $1.25/MTok output.
Different provider entirely. If all Anthropic models are limited, falls to OpenAI. Never fully stops.
When a model gets a 429, Molty backs off with exponential cooldown: 1 min → 5 min → 25 min → 1 hour. After each cooldown, it retries the higher-tier model; once that model responds, the cooldown resets. All of this happens transparently.
Anthropic automatically caches the system prompt across calls within the same session. Cached tokens don't count toward rate limits and cost 90% less. First message in a session may take ~55s (cold cache), subsequent ones are much faster.
Three layers of protection ensure Molty never fully stops responding.
Automatic failover through 4 models across 2 providers. If one model is rate-limited, the next one takes over instantly.
Adaptive monitoring daemon watches for 429 errors, high concurrency, and system load. Auto-enables request queuing when needed.
System prompt cached across API calls. Cached tokens don't count toward rate limits — reduces effective token usage by 60-80%.
A background daemon that watches for problems and automatically activates additional protections.
A background process runs every 30 seconds, checking three signals:
- 3+ rate-limit errors in recent logs trigger the queue
- 12+ active subagents trigger the queue
- 80%+ CPU usage triggers the queue
- When any threshold is crossed, the priority queue activates automatically
- Priority order: DMs (priority 1) → group chats (priority 2) → background tasks (priority 3)
- Short waits are invisible — they look like normal processing time
- After 5 minutes of stability (no errors, normal load), the queue disables itself
| Command | Action |
|---|---|
| `./scripts/smart-queue-monitor.sh status` | Check monitor status + current metrics |
| `./scripts/smart-queue-monitor.sh daemon` | Start monitoring daemon |
| `./scripts/smart-queue-monitor.sh stop` | Stop monitoring daemon |
| `./scripts/queue-control.sh emergency-enable` | Force-enable basic queue immediately |
| `./scripts/queue-control.sh disable` | Force-disable queue |
Two agents configured — Molty (main) and Hanna (secondary).
- ID: `main`
- Workspace: `/Users/home/clawd-main`
- Model: Claude Opus 4.5
- Role: Primary assistant — coding, business ops, proactive work
- Default: Yes — handles all unrouted messages
- ID: `hanna`
- Workspace: `/Users/home/clawd/agents/hanna`
- Model: Claude Opus 4.5
- Role: Secondary agent — separate Telegram bot
- Groups: Disabled
| Setting | Value | Description |
|---|---|---|
| `maxConcurrent` | 4 | Max simultaneous main agent runs |
| `subagents.maxConcurrent` | 8 | Max simultaneous sub-agent runs |
| `compaction` | `safeguard` | Context compaction with memory flush before truncation |
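In config form, those limits might look like this — the key names come from the table above, but the surrounding structure is an assumption:

```json
{
  "maxConcurrent": 4,
  "subagents": { "maxConcurrent": 8 },
  "compaction": "safeguard"
}
```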
Telegram is the primary channel. WhatsApp plugin is enabled but not linked.
| Setting | Value |
|---|---|
| DM Policy | allowlist — only Ryan (ID: 7191564227) |
| Group Policy | allowlist — Ryan + Leads group |
| Stream Mode | partial — real-time streaming edits |
| Ack Reactions | Group mentions only (group-mentions) |
| Capabilities | Inline buttons enabled |
Molty Bot: @MoltyBot (main agent)
Hanna Bot: Separate Telegram bot (DMs only, groups disabled)
Plugin enabled but not currently linked. Available for future use — supports DMs, groups, voice messages, reactions.
File-based memory with daily logs, long-term curation, and group isolation.
- Daily logs: `memory/YYYY-MM-DD.md`
- Long-term: `MEMORY.md` — curated
- Group isolation: `memory/groups/<slug>/`
- Curation: reviewed during heartbeats
| File | Purpose |
|---|---|
| `SOUL.md` | Personality, behavior rules, core mission |
| `USER.md` | Ryan's info — name, timezone, context |
| `IDENTITY.md` | Molty's name, creature, vibe, emoji |
| `AGENTS.md` | Operating procedures, memory protocol, group chat rules |
| `TOOLS.md` | Local tool notes (cameras, SSH, TTS, etc.) |
| `HEARTBEAT.md` | Periodic task checklist for heartbeat polls |
| `IDEAS.md` | Pipeline of ideas, opportunities, things to explore |
| `MEMORY.md` | Curated long-term memory (main session only) |
Current projects, repos, and deployments managed by this install.
- Site: denveraitraining.com
- Repo: RShuken/denveraitraining
- Stack: Next.js → Cloudflare Pages
- PRs: Stripe checkout, SEO polish
- Site: openclawinstall.net
- Repo: RShuken/openclawinstall
- Stack: Next.js → Cloudflare Pages
- PRs: Stripe checkout, SEO polish
- Repo: RShuken/molty-bot-sites
- Stack: CF Worker + D1 + Stripe
- PRs: /checkout/quick endpoint
- Features: Lead capture, Turnstile, payments
Vishnu (Ian Ferguson)
- Host: Mac Mini via Tailscale
- SSH: `roberts-mac-mini`
- Model: Claude Sonnet 4 + Haiku fallback
Quick reference for important configuration values.
| Profile | Provider | Mode |
|---|---|---|
| `anthropic:default` | Anthropic | Token |
| `openai-codex:default` | OpenAI Codex | OAuth |
| Skill | Purpose |
|---|---|
| `goplaces` | Google Places API queries |
| `openai-image-gen` | Image generation via OpenAI |
| `openai-whisper-api` | Audio transcription via Whisper |
| Plugin | Status |
|---|---|
| `telegram` | ✅ Enabled |
| `whatsapp` | ✅ Enabled (not linked) |
| Hook | Status |
|---|---|
| `boot-md` | ✅ Loads workspace .md files at session start |
| `command-logger` | ✅ Logs slash commands |
| `session-memory` | ✅ Session persistence |
- Restart or reset: run `clawdbot gateway restart` from the terminal, or send `/reset` in Telegram to reset the current session without restarting the gateway.
- Add a group: add it to `channels.telegram.groupAllowFrom` in the config, then restart the gateway. Molty will automatically create isolated memory for the new group.
- Check costs: send `/status` in Telegram to see current session costs, or check the Anthropic console at console.anthropic.com.
- SSH hosts: `roberts-mac-mini` (Ian's Vishnu install) via Tailscale. Additional SSH hosts can be added to `~/.ssh/config`.
- Enable queuing: start the monitor with `./scripts/smart-queue-monitor.sh daemon`, or use `./scripts/queue-control.sh emergency-enable` to force-enable the queue.