OpenCrabs is a Rust‑based orchestration layer that lets AI agents run faster and smarter. The latest release, 0.2.2, brings big changes to how the system counts tokens and manages memory. In this article we’ll walk through the new token‑counting logic, the memory improvements, and how these updates help developers build more reliable AI workflows.
Why Token Counting Matters in AI Agents
When an AI model like Claude or GPT processes text, it breaks the input into tokens. Each token is a small piece of text, and the model can only handle a limited number of tokens in one request. If you exceed that limit, the request fails or the model truncates the input.
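To build intuition, a rough rule of thumb for English text is about four characters per token. The sketch below uses that heuristic to estimate whether a set of messages fits in a context window; real counting uses an actual tokenizer (such as tiktoken), and the function names here are illustrative, not part of OpenCrabs.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate using the common ~4-characters-per-token
    heuristic for English text. A real count requires the model's
    tokenizer; this is only a ballpark figure."""
    return max(1, round(len(text) / chars_per_token))

def fits_in_context(messages: list[str], limit: int) -> bool:
    """Check whether the estimated total stays within the model's limit."""
    return sum(estimate_tokens(m) for m in messages) <= limit
```

Because the heuristic undercounts code and non-English text, any production system should tokenize properly rather than estimate.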
OpenCrabs keeps track of how many tokens are in the conversation history so it can stay within the model’s limits. The old version counted tokens incorrectly, which caused two problems:
- Inflated token numbers – The system added up tokens from every tool call, even when the same text was reused.
- Misleading billing data – Because the counter was inflated, reported usage overstated what users had actually consumed.
The 0.2.2 update fixes these issues and gives developers a clearer view of token usage.
OpenCrabs 0.2.2 Token Counting
The new token‑counting logic uses the official tiktoken library with the cl100k_base encoding. This means the count matches what the OpenAI API reports. The key changes are:
- Per‑message token count – Only the output tokens of a message are shown, not the combined input and output.
- Context token display – The UI now shows the last iteration’s token count, not the cumulative total.
- Accurate billing – The usage field still accumulates for billing, but the displayed count is now correct.
Because the token counter is now accurate, developers can design longer conversations without worrying about hitting the limit unexpectedly.
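The distinction between the displayed count and the billing total can be sketched as a small tracker. This is an illustrative model of the behavior described above, not OpenCrabs' actual internals; the field names are assumptions.

```python
from dataclasses import dataclass

@dataclass
class TokenTracker:
    """Sketch of the 0.2.2 behavior: the UI shows only the last
    iteration's output tokens, while `usage` keeps accumulating
    for billing."""
    last_iteration: int = 0  # what the UI displays
    usage: int = 0           # what billing accumulates

    def record(self, input_tokens: int, output_tokens: int) -> None:
        # Display only the latest iteration's output, not a running sum.
        self.last_iteration = output_tokens
        # Billing still accounts for everything sent and received.
        self.usage += input_tokens + output_tokens
```

Separating the two numbers is what fixed the "inflated count" bug: the old code effectively displayed `usage` instead of `last_iteration`.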
Memory Improvements in 0.2.2
OpenCrabs uses a three‑tier memory system: a brain file, daily logs, and a semantic search tool. The 0.2.2 release tightens the compaction logic and improves the search experience.
3‑Tier Memory System
- Brain MEMORY.md – A file that holds user‑defined notes that load at the start of each session.
- Daily Memory Logs – After each compaction, a summary is written to a file named by date. Multiple compactions per day stack in the same file.
- Memory Search Tool – A tool that queries past logs using QMD (a lightweight semantic search engine). If QMD is missing, the tool falls back to reading the log file directly.
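The QMD-or-fallback behavior described above can be sketched as follows. The `qmd search` subcommand and function signature are assumptions for illustration; only the fallback-to-plain-text idea comes from the release notes.

```python
import subprocess
from pathlib import Path

def memory_search(query: str, log_file: Path, qmd_cmd: str = "qmd") -> list[str]:
    """Try the semantic search engine first; if it is not installed,
    scan the daily log directly (plain-text fallback)."""
    try:
        out = subprocess.run(
            [qmd_cmd, "search", query],
            capture_output=True, text=True, check=True,
        )
        return out.stdout.splitlines()
    except (FileNotFoundError, subprocess.CalledProcessError):
        # Fallback: return any log lines containing the query.
        if not log_file.exists():
            return []
        return [line for line in log_file.read_text().splitlines()
                if query.lower() in line.lower()]
```

The fallback trades relevance ranking for availability: a substring scan is cruder than semantic search, but the tool never fails outright when QMD is absent.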
Compaction Threshold and Summary Display
The compaction threshold was lowered from 80 % to 70 % of the context window. This means the system will compact earlier, giving more headroom for new messages. When compaction happens, the full summary is shown in the chat as a system message, so users can see exactly what was remembered.
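The threshold check itself is simple; a minimal sketch of the trigger condition, assuming the fraction is compared against the total context window:

```python
def should_compact(context_tokens: int, context_window: int,
                   threshold: float = 0.7) -> bool:
    """Compact once the context reaches the configured fraction of the
    window (0.7 in 0.2.2, down from 0.8)."""
    return context_tokens >= threshold * context_window
```

With a 100k-token window, compaction now fires at 70k tokens instead of 80k, leaving roughly 10k more tokens of headroom for new messages.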
QMD Auto‑Indexing
After each compaction, qmd update runs in the background to keep the search index fresh. This ensures that the memory search tool can find relevant past conversations quickly.
How These Changes Help Developers
1. Predictable Token Usage
With accurate token counting, developers can set a hard limit for each conversation and be confident that the system will not exceed it. This is especially useful when building agents that need to stay within a strict budget.
2. Better Memory Management
The earlier compaction and summary display mean that agents can keep more context without losing important details. The semantic search tool also makes it easier to retrieve past information, which is critical for long‑running tasks.
3. Transparent Billing
Because the token counter now matches the provider’s count, users can see exactly how many tokens they are using. This transparency helps avoid surprise charges.
Using the New Features
Below is a quick guide on how to take advantage of the new token‑counting and memory features.

Checking Token Usage
opencrabs status
The command shows the current token count for the last iteration. If you want to see the total usage for billing, use:
opencrabs usage
Managing Memory
- Add a note to the brain:
  opencrabs brain add "Remember to check the API key"
- View daily logs:
  opencrabs logs today
- Search past conversations:
  opencrabs memory_search "API key"
If QMD is not installed, the search tool will read the log file directly and give you a plain‑text result.
Configuring Compaction
You can adjust the compaction threshold in the config file:
[agent]
compaction_threshold = 0.7
This setting tells OpenCrabs to compact when the context window is 70 % full.
Real‑World Example
Imagine you’re building a customer support agent that needs to remember a user’s order history. With the old token counter, the agent might think it has room for more messages and then fail when the model hits the limit. With the new counter, the agent will compact earlier, keep a summary of the order history, and use the memory search tool to pull up details quickly. The result is a smoother experience for the user and fewer errors for the developer.
Integrating OpenCrabs with Neura AI
If you’re already using Neura AI’s platform, you can integrate OpenCrabs to add advanced memory and token‑counting features. Neura’s Router Agents can call OpenCrabs as a tool, and the memory search can feed back into the agent’s reasoning loop.
For more details on how to set up Neura’s Router Agents, visit the Neura AI product page.
If you want to see how other teams use OpenCrabs, read the case studies on the Neura blog: Case Studies.
Future Roadmap
The OpenCrabs team plans to add more advanced memory compression techniques and support for additional tokenizers. They also want to make the compaction process fully configurable via environment variables, so developers can fine‑tune the behavior for different workloads.
Conclusion
OpenCrabs 0.2.2 brings precise token counting and smarter memory management to AI agents. These updates help developers build more reliable, cost‑effective, and user‑friendly applications. By using the new token‑counting logic and memory tools, you can keep your conversations within limits, retrieve past information quickly, and avoid unexpected billing surprises.