Visual Explainer

How Tokens Work

Why sending "test 123" to Vishnu isn't as simple as it looks — and what that means for rate limits.

What happens when you send a message

You type a tiny message. Behind the scenes, something much bigger gets assembled and sent to the AI.

Step 1: 💬 You type "test 123" in Telegram

Your message is tiny — just 3 tokens. Practically nothing.
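
If you want to check a count like this yourself, Anthropic provides a token-counting endpoint. Here's a minimal sketch using the Python SDK; the model ID is only an example.

```python
# Minimal sketch: ask the API how many tokens a message costs before sending it.
# Assumes the `anthropic` Python SDK is installed and ANTHROPIC_API_KEY is set;
# the model ID is only an example.
import anthropic

client = anthropic.Anthropic()

count = client.messages.count_tokens(
    model="claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "test 123"}],
)
print(count.input_tokens)  # a short message like this comes out to just a few tokens
```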

Step 2: ⚙️ Clawdbot receives it & builds the request

Before your message reaches the AI, Clawdbot has to package everything Vishnu needs to understand who he is, what he's doing, and what you've been talking about.

Step 3: 📦 The request bundles everything together

Your 3-token message gets wrapped in a massive context package:

What's actually sent to the AI:

- System prompt: ~15,000 tk
- Workspace files: ~5,000 tk
- Chat history: ~2,000–50,000 tk
- Your message ("test 123"): 3 tk

Total sent per message: ~22,000+ tk

Step 4: 🚀 Sent to Anthropic's API as ONE request

The entire package — all 22,000+ tokens — gets sent in a single API call. Anthropic sees the full payload, not just your message.
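
To make the packaging concrete, here is a rough sketch of what a bot like Clawdbot might do for every incoming message. It's an illustration under the assumptions in this guide (the file names come from the "training manual" section below; the helper functions and model ID are made up for the example), not Clawdbot's actual code.

```python
# Illustrative sketch only, not Clawdbot's actual implementation.
# Every turn, the full context is rebuilt and sent as ONE request.
from pathlib import Path
import anthropic

client = anthropic.Anthropic()

def build_system_prompt() -> str:
    # Identity and instructions (~15,000 tk) plus workspace files (~5,000 tk).
    parts = [Path(name).read_text() for name in ("SOUL.md", "AGENTS.md", "USER.md")]
    return "\n\n".join(parts)

def send_turn(history: list[dict], user_text: str) -> str:
    # The API keeps no state between calls, so the system prompt and the
    # entire history ride along with even a 3-token message.
    history.append({"role": "user", "content": user_text})
    response = client.messages.create(
        model="claude-sonnet-4-20250514",  # example model ID
        max_tokens=2000,
        system=build_system_prompt(),
        messages=history,
    )
    reply = response.content[0].text
    history.append({"role": "assistant", "content": reply})
    return reply

history: list[dict] = []
print(send_turn(history, "test 123"))  # 3-token message, ~22,000+ token request
```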

Step 5: ✨ Vishnu responds

Anthropic processes everything and sends back a response of ~500–2,000 tokens.

🔍 Your "test 123" was 3 tokens. But the total request was 22,000+ tokens. That's like writing a 3-word post-it note, but delivering it inside a 60-page binder.

3 tokens (your message) vs. 22,003+ tokens (what's actually sent)

Why does it need all that context?

The AI doesn't remember anything between messages. Every single time you send a message, the entire context must be re-sent from scratch.

🧠 No Memory Between Calls

The AI is completely blank every time. It's like calling a brand new employee for each message — they know nothing.

📋 The Training Manual

SOUL.md, AGENTS.md, USER.md, etc. are like the employee's training manual. Sent every single time, so Vishnu knows who he is.

📜 Reading Back the Transcript

The conversation history is like reading back the entire chat from the beginning — every message you've ever sent in this session.

📞 The Phone Call Analogy

Imagine calling a help desk where they never remember you. Each call: "Hi, my name is..., I'm working on..., we last discussed..."
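
Concretely, "reading back the transcript" means the request body itself carries every earlier turn. Here's a sketch of what the payload for a second message might look like; the values are placeholders, not real Clawdbot data.

```python
# Shape of a Messages API request on the SECOND turn of a chat.
# Placeholder values; the point is that turn 1 is re-sent in full.
second_turn_request = {
    "model": "claude-sonnet-4-20250514",  # example model ID
    "max_tokens": 2000,
    "system": "<SOUL.md + AGENTS.md + USER.md + workspace files, ~20,000 tk>",
    "messages": [
        {"role": "user", "content": "test 123"},                     # turn 1, re-sent
        {"role": "assistant", "content": "<Vishnu's first reply>"},  # re-sent too
        {"role": "user", "content": "hello"},                        # the new message
    ],
}
```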

📈 Conversations grow over time

As you chat more, the conversation history grows. What starts as a 22K token request can balloon to 50K, 70K, or even more — because every previous message is included.

- Message 1: ~22,000 tk
- Message 10: ~28,000 tk
- Message 30: ~42,000 tk
- Message 60: ~65,000 tk
- Message 100+: ~90,000+ tk
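
Working backwards from the chart above, each round trip adds something like 700 tokens to every later request, on top of a ~22K baseline. A back-of-the-envelope estimator (both constants are assumptions pulled from this guide's numbers, not measured values):

```python
# Rough growth estimate based on the figures in the chart above.
# Both constants are illustrative assumptions.
BASE_CONTEXT = 22_000       # system prompt + workspace files + first message
TOKENS_PER_EXCHANGE = 700   # rough average added per user/assistant round trip

def estimated_request_tokens(message_number: int) -> int:
    """Approximate size of the request sent for the Nth message in a session."""
    return BASE_CONTEXT + (message_number - 1) * TOKENS_PER_EXCHANGE

for n in (1, 10, 30, 60, 100):
    print(n, estimated_request_tokens(n))
# roughly 22K, 28K, 42K, 63K, 91K, in line with the chart above
```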

What happens with multiple chats

When several group chats fire messages at the same time, the tokens stack up fast — and can blow past rate limits in seconds.

- 👥 Group Chat 1: "test 123" → 25,000 tk
- 👥 Group Chat 2: "hello" → 25,000 tk
- 👥 Group Chat 3: "testing" → 25,000 tk
- 💬 Direct Message: "hi" → 20,000 tk

⚡ ~95,000 tokens in seconds

How that compares to rate limits:

- Free Tier (40,000 tokens/min): ~95,000 tk is 237% of the limit. BLOCKED ✕
- Paid Tier 1 (80,000 tokens/min): 119% of the limit. BLOCKED ✕
- Paid Tier 2 (160,000 tokens/min): 59% of the limit. OK ✓

⚠️ Four simple messages — "test 123", "hello", "testing", "hi" — fired at the same time can exceed your rate limit and cause Vishnu to stop responding until the limit resets.
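
The arithmetic behind that comparison is simple enough to script. A small sketch using the per-chat estimates and tier limits quoted above (these are the numbers from this guide, not an authoritative list of Anthropic's current rate limits):

```python
# Sketch: does a burst of concurrent requests fit under a tokens-per-minute limit?
# Figures are the ones used in this guide; check your own tier's real limits.
concurrent_requests = {
    "Group Chat 1 ('test 123')": 25_000,
    "Group Chat 2 ('hello')": 25_000,
    "Group Chat 3 ('testing')": 25_000,
    "Direct Message ('hi')": 20_000,
}
tier_limits = {"Free": 40_000, "Tier 1": 80_000, "Tier 2": 160_000}

total = sum(concurrent_requests.values())  # ~95,000 tokens in one burst
for tier, limit in tier_limits.items():
    share = total / limit * 100
    verdict = "OK" if total <= limit else "BLOCKED until the minute window resets"
    print(f"{tier}: {share:.0f}% of {limit:,} tk/min -> {verdict}")
```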

How to avoid rate limits

A few simple habits can keep Vishnu running smoothly without hitting walls.

⏱️ Space out your messages

Don't fire messages in all groups at once. Give Vishnu a few seconds between messages so they don't all hit the API simultaneously.

📈 Upgrade your Anthropic tier

Higher plan tiers give you more tokens per minute. Tier 2 (160K/min) gives 4× the room compared to free (40K/min).

🔄 Use /reset regularly

This clears the conversation history, shrinking the token payload back down. Great after long sessions.

🚦 Throttle concurrency

We can configure Vishnu to process one message at a time instead of all at once — avoiding token spikes.
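
Here's what "one message at a time" could look like, sketched with a simple asyncio semaphore. This is an illustration of the idea only; the call_model helper is hypothetical, and none of it is Clawdbot's actual configuration or code.

```python
# Illustration only: serialize outgoing API calls so a burst from several chats
# doesn't all land inside the same one-minute rate-limit window.
import asyncio

api_slot = asyncio.Semaphore(1)   # one in-flight request at a time
SPACING_SECONDS = 5               # small gap between requests (arbitrary choice)

async def call_model(chat_id: str, text: str) -> None:
    ...  # hypothetical: bundle the context and make the single API call, as sketched earlier

async def handle_incoming(chat_id: str, text: str) -> None:
    async with api_slot:                 # later messages wait their turn
        await call_model(chat_id, text)
        await asyncio.sleep(SPACING_SECONDS)

async def main() -> None:
    # Four chats firing "at once" now get processed one after another.
    chats = [("group-1", "test 123"), ("group-2", "hello"),
             ("group-3", "testing"), ("dm", "hi")]
    await asyncio.gather(*(handle_incoming(c, t) for c, t in chats))

asyncio.run(main())
```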

🔱 New to Vishnu? Start with the full guide.

← Vishnu Getting Started Guide