OpenCrabs 0.2.2 is the latest release of the Rust‑based orchestration layer for building and running AI agents. This update focuses on two core areas: more accurate token counting and smarter memory handling. In this article we’ll walk through what changed, why it matters, and how you can start using the new features right away.

Why Token Counting Matters

When you run an AI model, the amount of text you send to the model is measured in tokens. A token can be a word, part of a word, or even a punctuation mark. Most providers charge by the number of tokens, so knowing exactly how many tokens you’re using helps you control costs and avoid hitting limits.
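Exact counts come from each provider’s tokenizer, but a rough estimate is often enough to budget a prompt before you send it. Here is a minimal sketch using the common “about four characters per token” rule of thumb for English text; real BPE-based tokenizers will give different numbers, so treat this as a ballpark only.

```python
# Rough token estimator -- a sketch, not a real tokenizer.
# Provider tokenizers (BPE-based) will produce different counts.
def estimate_tokens(text: str) -> int:
    # Rule of thumb: ~4 characters per token for English text.
    return max(1, len(text) // 4)

prompt = "Summarize the last three support tickets for this customer."
print(estimate_tokens(prompt))
```

For billing-accurate numbers, use the count your provider returns with each response rather than a heuristic like this.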

OpenCrabs 0.2.2 improves token counting in two ways:

  1. Accurate context token display – The UI now shows the exact number of tokens used in the last API call, not a cumulative total that can be misleading.
  2. Per‑message token count – Each message’s token count is shown separately, so you can see which parts of the conversation are the most expensive.

These changes mean you can spot expensive prompts faster and adjust them before you hit a quota.

Memory Management Overhaul

Memory is the brain of an AI agent. It stores past conversations, tool results, and any other data the agent needs to remember. OpenCrabs 0.2.2 introduces a three‑tier memory system:

| Tier | What it holds | When it’s used |
| --- | --- | --- |
| Brain (MEMORY.md) | User‑curated notes that persist indefinitely | Loaded at every session start |
| Daily memory logs | Auto‑summaries of each day’s conversation | Saved after each compaction |
| Memory search | Quick lookup of past logs via QMD | Used when the agent needs to recall something |

The new memory search tool uses QMD (a lightweight search engine) to find relevant past logs. If QMD isn’t installed, the tool falls back to a simple file read and returns a helpful hint.
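The search-with-fallback pattern described above can be sketched as follows. The `qmd search` invocation and the log path are assumptions for illustration; the real tool’s interface may differ.

```python
# Sketch of "search via QMD, fall back to a plain file read".
# The qmd CLI invocation here is an assumed interface, not the
# documented OpenCrabs internals.
import shutil
import subprocess
from pathlib import Path

def memory_search(query: str, log_path: Path) -> str:
    if shutil.which("qmd"):
        # QMD is installed: delegate the lookup to the search engine.
        result = subprocess.run(
            ["qmd", "search", query], capture_output=True, text=True
        )
        return result.stdout
    # Fallback: read the daily log directly and surface a hint,
    # mirroring the behavior the release notes describe.
    hint = "QMD not found. Try reading the daily log directly with read_file.\n"
    return hint + log_path.read_text()
```

The point of the fallback is that the agent degrades gracefully: search quality drops, but recall still works.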

Auto‑Compaction and Summaries

When the conversation grows too long, OpenCrabs automatically compacts the history. The new compaction summary is now shown to the user as a system message, so you know exactly what was kept. This transparency helps you debug why an agent might forget something.
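Conceptually, compaction keeps the most recent messages verbatim and replaces older ones with a summary message that the user can see. Here is a minimal sketch of that shape; the summarizer below is a placeholder string, whereas OpenCrabs generates the summary with the model itself.

```python
# Minimal compaction sketch: retain the tail of the conversation and
# condense everything older into one visible summary message.
# The summary text is a stub for illustration.
def compact(messages: list[str], keep_last: int = 4) -> list[str]:
    if len(messages) <= keep_last:
        return messages  # nothing to compact yet
    older, recent = messages[:-keep_last], messages[-keep_last:]
    summary = f"[compaction summary] {len(older)} earlier messages condensed."
    return [summary] + recent
```

Surfacing the summary as a system message is what makes forgetting debuggable: if the agent loses a fact, you can check whether it survived the condensed portion.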

QMD Auto‑Index

After each compaction, OpenCrabs runs qmd update in the background. This keeps the search index fresh without manual intervention.

How to Upgrade to 0.2.2

If you’re already using OpenCrabs, upgrading is straightforward:

cargo install opencrabs --force

The binary will download the latest release. After installation, run:

opencrabs --version

You should see OpenCrabs 0.2.2. The configuration file (~/.opencrabs/config.toml) will be updated automatically, preserving your custom settings.

Using the New Token Counting Features

OpenCrabs 0.2.2 shows token counts in two places:

  1. Context Token Display – In the terminal UI, the top bar now shows context_tokens for the last API call. This replaces the old inflated value that added up all calls.
  2. Message Token Count – Each chat message now has a token_count field that shows only the output tokens, not the input.

Example

Suppose you send a prompt that is 120 tokens long and the model returns 80 tokens. The UI will now display:

Context: 120 tokens
Message 1: 80 tokens

You can see that the prompt cost 120 tokens, and the response cost 80 tokens. If you had a long chain of messages, you could spot which one is the biggest cost driver.
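With per-message counts available, finding the biggest cost driver in a chain is a one-liner. The message list below is invented for illustration; the field name `token_count` matches the one described above.

```python
# Finding the most expensive message in a conversation chain,
# using per-message token counts like the ones the UI now shows.
# The messages themselves are made up for this example.
messages = [
    {"role": "user", "token_count": 120},
    {"role": "assistant", "token_count": 80},
    {"role": "user", "token_count": 450},
    {"role": "assistant", "token_count": 95},
]
biggest = max(messages, key=lambda m: m["token_count"])
print(biggest["token_count"])  # the 450-token user message dominates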

Leveraging the Memory Search Tool

The memory_search tool is a powerful addition. Here’s how to use it:

opencrabs memory_search "last meeting notes"

The tool will query the QMD index and return the most relevant log file. If QMD isn’t available, you’ll see a message like:

QMD not found. Try reading the daily log directly with read_file.

You can then open the log file manually:

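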

opencrabs read_file ~/.opencrabs/memory/2025-10-01.md

Practical Use Case

Imagine you’re building a customer support agent that needs to remember a user’s order history. With the new memory search, the agent can quickly pull up the last order details without storing everything in memory all the time. This keeps the conversation short and within token limits.

Configuring the New Features

OpenCrabs 0.2.2 introduces a few new config options. Open the config file:

nano ~/.opencrabs/config.toml

You’ll find sections like:

[agent]
max_concurrent = 4
approval_policy = "auto-session"

The max_concurrent setting controls how many tool calls can run at once. The approval_policy determines whether the agent asks for approval before executing a plan.
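What a `max_concurrent = 4` setting implies is a cap on simultaneous tool calls, which is naturally modeled with a semaphore. The runner below is a stand-in, not the OpenCrabs internals:

```python
# Sketch of concurrency capping with a semaphore: at most
# MAX_CONCURRENT tool calls run at any moment. The tool runner
# is a stand-in for illustration.
import asyncio

MAX_CONCURRENT = 4

async def run_tool(name: str, sem: asyncio.Semaphore) -> str:
    async with sem:
        await asyncio.sleep(0.01)  # pretend to do work
        return f"{name}: done"

async def main() -> list[str]:
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    tasks = [run_tool(f"tool-{i}", sem) for i in range(8)]
    return await asyncio.gather(*tasks)

results = asyncio.run(main())
print(len(results))
```

A lower cap trades throughput for predictability; it also keeps a misbehaving plan from fanning out into dozens of parallel calls.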

You can also adjust the compaction threshold:

[compaction]
threshold = 70

This sets the compaction trigger to 70% of the context window instead of the previous 80%.
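In concrete numbers: with a 70% threshold, compaction fires once the conversation reaches 70% of the model’s context window. The 200,000-token window below is an example figure, not a specific model’s limit.

```python
# What a 70% compaction threshold means in practice.
# The context window size is an illustrative example.
context_window = 200_000
threshold = 0.70

trigger_at = int(context_window * threshold)
print(trigger_at)  # compaction kicks in at 140,000 tokens
```

Lowering the threshold compacts earlier and more often, which keeps calls cheaper at the cost of more frequent summarization.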

Integrating OpenCrabs with Other Tools

OpenCrabs works well with other open‑source projects. For example:

  • WhisperClip – Use WhisperClip to transcribe audio and feed the text into OpenCrabs for further processing.
  • ClawSocial – Automate social media posts and let OpenCrabs decide the best time to post based on engagement data.
  • Artifacto – Combine Artifacto’s chat interface with OpenCrabs to create a more interactive agent experience.

Example Workflow

  1. Record a meeting with WhisperClip.
  2. Transcribe the audio to text.
  3. Feed the transcript into OpenCrabs.
  4. Use the memory search to pull up past meeting notes.
  5. Generate a summary and post it to social media via ClawSocial.

This end‑to‑end workflow shows how the new token counting and memory features help keep the conversation concise and relevant.

Tips for Managing Token Costs

Even with accurate token counting, you still need to be mindful of costs. Here are some quick tips:

  • Shorten prompts – Keep your questions concise. Remove unnecessary filler words.
  • Chunk large documents – Split long documents into smaller sections before sending them to the model.
  • Use summarization – Summarize long responses before storing them in memory.
  • Monitor token usage – Use the context token display to spot spikes.
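The chunking tip above can be sketched as a token-budgeted splitter. The four-characters-per-token estimate is a rough heuristic; a real pipeline would split on a tokenizer’s boundaries instead of raw character offsets.

```python
# Sketch of "chunk large documents": split text into pieces that
# each fit a token budget, using a rough chars-per-token estimate.
def chunk_text(text: str, max_tokens: int = 500) -> list[str]:
    max_chars = max_tokens * 4  # heuristic: ~4 characters per token
    return [text[i : i + max_chars] for i in range(0, len(text), max_chars)]

doc = "lorem ipsum " * 1000
chunks = chunk_text(doc, max_tokens=500)
print(len(chunks), max(len(c) for c in chunks))
```

Sending chunks one at a time, and summarizing as you go, keeps any single call well under the context limit.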

By following these practices, you can keep your usage within budget while still getting high‑quality responses.

Common Questions About OpenCrabs 0.2.2

| Question | Answer |
| --- | --- |
| Does 0.2.2 support all providers? | Yes, it works with OpenAI, Anthropic, Gemini, and any OpenRouter‑compatible provider. |
| Can I disable the memory search? | Yes, set memory_search.enabled = false in the config. |
| What happens if QMD is missing? | The tool falls back to a simple file read and gives a helpful hint. |
| Is the new token counting backward compatible? | Existing projects will still run; the new display is an addition, not a change to the API. |

Getting Help and Community Resources

If you run into issues, the OpenCrabs community is active on GitHub and Discord. You can also check out the Neura AI blog for tutorials on integrating OpenCrabs with other tools. For example, the case study on FineryMarkets shows how a financial firm used OpenCrabs to automate data analysis. You can read it here: https://blog.meetneura.ai/#case-studies.

Final Thoughts

OpenCrabs 0.2.2 brings practical improvements that make building AI agents easier and more cost‑effective. Accurate token counting helps you stay within budget, while the new memory system keeps conversations focused and relevant. Whether you’re a hobbyist or a professional developer, these changes will help you create smarter, more reliable agents.

If you’re interested in exploring more AI tools, check out Neura AI’s product lineup at https://meetneura.ai/products. You can also learn about the team behind these innovations at https://meetneura.ai/#leadership.