If you spend enough time working with AI coding agents, you quickly hit a wall of exhaustion. Not from the coding itself, but from the relentless, overwhelming politeness. You ask for a simple script optimization, and you receive a three-paragraph essay on the nature of functions, a polite greeting, and a cheerful sign-off.
It is exhausting to read. More importantly, it is expensive to run. Every "Certainly! I'd be happy to help you with that!" is a handful of tokens you are paying for—both in actual API costs and in the time it takes the model to stream the text to your screen.
Enter a brilliantly primitive solution: Caveman.
The Grug Brain Approach to AI
A new open-source GitHub project, JuliusBrussee/caveman, is gaining serious traction as a plugin for Claude Code and Codex. Its core pitch is elegantly simple: make the agent speak in a stripped-down, prehistoric style, while leaving the actual code, errors, and technical terminology perfectly intact.
Caveman actively removes the filler, the apologies, and the verbose English output. The result? A claimed reduction in output-token usage of around 75%.
This isn't just a novelty trick for a quick laugh on X. For developers deep in iterative coding or long debugging sessions, shorter responses mean fundamentally faster interactions. Less token spend, lower latency, and zero mental overhead required to skim past paragraphs of pleasantries.
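To see where the savings come from, here is a rough sketch. Both responses below are invented examples, and word count is used as a crude stand-in for token count (real tokenizers count differently):

```python
# Invented before/after responses; word count approximates tokens.
polite = (
    "Certainly! I'd be happy to help you with that! Here is the "
    "optimized version of your script. Let me know if you have any "
    "questions, and happy coding!"
)
caveman = "script slow. loop bad. use dict. fast now."

def words(text: str) -> int:
    """Count whitespace-separated words as a crude token proxy."""
    return len(text.split())

saved = 1 - words(caveman) / words(polite)
print(f"polite: {words(polite)} words, caveman: {words(caveman)} words")
print(f"reduction: {saved:.0%}")
```

Even this toy example lands in the same neighborhood as the project's claim, because almost none of the polite response carries technical information.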
How to Arm Your Agent with a Club
Developers are currently using Caveman as a plugin or skill inside their terminal-based agent workflows. Once installed, it persistently alters the agent's output style, though users can easily toggle it off with simple commands like "normal mode" or "stop caveman".
If you want to experiment with compressed outputs, the installation paths are straightforward:
For general skill systems:
npx skills add JuliusBrussee/caveman
For Claude Code specifically, you can pull it directly from the marketplace:
claude plugin marketplace add JuliusBrussee/caveman
claude plugin install caveman@caveman
For Codex: The workflow is a bit more manual. You need to clone the repository, open it within Codex, and then locate and install Caveman using the /plugins command.
The Practical Reality (And Where We Are Going)
Let's look at the actual math. The most defensible way to view Caveman is as a highly aggressive prompting and plugin compression trick; it isn't changing the underlying weights of the foundation model. And while a 75% reduction in output tokens is massive, your true, end-to-end savings will be slightly lower once you account for the extra input tokens the plugin's instructions add to every request.
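To put that caveat in numbers, here is a back-of-the-envelope sketch. The prices, session size, and overhead figure are all assumptions chosen for illustration; only the 75% reduction comes from the project's claim:

```python
# Every number below is an assumption for illustration, not a
# measured figure: prices, session length, and plugin overhead.
output_price_per_m = 15.00  # assumed USD per 1M output tokens
input_price_per_m = 3.00    # assumed USD per 1M input tokens
baseline_tokens = 200_000   # assumed verbose output over a long session
claimed_reduction = 0.75    # the figure claimed for Caveman
overhead_tokens = 2_000     # assumed extra input tokens per session from the plugin

baseline_cost = baseline_tokens / 1_000_000 * output_price_per_m
compressed_cost = (
    baseline_tokens * (1 - claimed_reduction) / 1_000_000 * output_price_per_m
)
overhead_cost = overhead_tokens / 1_000_000 * input_price_per_m

net_savings = baseline_cost - compressed_cost - overhead_cost
effective_reduction = net_savings / baseline_cost

print(f"baseline:  ${baseline_cost:.2f}")
print(f"caveman:   ${compressed_cost + overhead_cost:.2f}")
print(f"effective reduction: {effective_reduction:.1%}")
```

Under these assumptions the effective reduction comes out just under the headline 75% figure: real, but a shade smaller than the claim once overhead is counted.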
But focusing purely on the exact penny-savings misses the wider signal.
Caveman is a telling indicator of where professional AI tools are heading. We are rapidly exiting the era of "conversational" AI for productivity tasks. When we use AI for heavy-duty software execution, we don't want a chatty assistant; we want a silent, hyper-efficient engine that understands intent and executes without a word.
The future of AI interaction isn't full natural-language chat. It is highly compressed, task-specific execution. Sometimes, the most advanced way forward is to just embrace the caveman.