Rust · CLI · LLM · Developer Tools · Open Source

RTK — Cut LLM Token Usage by 80% with a Single Rust Binary

RTK is an open-source CLI proxy that sits between your AI coding assistant and the terminal. Smart filtering, grouping, and deduplication reduce token consumption by 60–90% across 100+ commands — with under 10ms overhead.

2026-04-07

The Problem: AI Assistants Are Drowning in Terminal Noise

Every time your AI coding assistant runs git status, ls, or cargo test, the raw output floods its context window with boilerplate, whitespace, and redundant information. A single npm test run can produce 25,000 tokens — most of which the model will never meaningfully use.

This is not just a cost problem. Context windows are finite. Every token wasted on verbose git log output is a token that could have been used to reason about your actual code. The result: slower responses, worse suggestions, and higher bills.

RTK (Rust Token Killer) is an open-source CLI proxy that solves this. It sits between your AI assistant and the terminal, compressing command output by 60–90% — transparently, in under 10ms, with zero changes to your workflow.

How It Works

RTK intercepts shell commands before they reach the terminal and applies four optimization strategies to the output:

Smart Filtering

Strips boilerplate headers, progress bars, ANSI escape codes, and formatting noise that carries zero semantic value for the model.
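As a rough illustration of the idea (not RTK's actual filter code), stripping ANSI color sequences can be done with a single POSIX sed substitution:

```shell
# Illustrative sketch only -- not RTK's internal implementation.
# ANSI color codes (ESC [ ... m) cost tokens but carry no meaning
# for a language model, so they can be removed wholesale.
esc=$(printf '\033')
printf '\033[32mPASS\033[0m test_login\n' | sed "s/${esc}\[[0-9;]*m//g"
# prints: PASS test_login
```

The same pattern generalizes to progress bars and cursor-movement sequences, which are pure rendering artifacts.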

Grouping

Aggregates similar items — e.g., 50 passing tests become a single summary line, while failures are preserved verbatim.
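A minimal sketch of that behavior (illustrative only, not RTK's code) in awk: count passing tests, print failures verbatim, and emit one summary line at the end.

```shell
# Sketch of the grouping idea: passes collapse to a summary,
# failures survive untouched.
printf 'PASS a\nPASS b\nPASS c\nFAIL d: assert_eq failed\n' | awk '
  /^PASS/ { pass++; next }   # count passing tests, do not print them
  { print }                  # failures and other lines pass through
  END { print pass " tests passed" }'
# prints:
# FAIL d: assert_eq failed
# 3 tests passed
```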

Truncation

Preserves the relevant head and tail of long outputs while removing the redundant middle. Context is maintained, noise is not.
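The head-and-tail idea can be sketched in a few lines of shell (illustrative only; RTK's limits and marker text will differ):

```shell
# Sketch of head/tail truncation: keep the first 3 and last 2 lines
# of a 100-line output, replacing the middle with an omission marker.
out=$(seq 1 100)
echo "$out" | head -n 3
echo '... [95 lines omitted] ...'
echo "$out" | tail -n 2
```

The start usually carries the command's framing and the end carries its verdict; the middle is where repetition lives.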

Deduplication

Collapses repeated entries (like identical warnings across files) into a single entry with an occurrence count.
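Conceptually this is the classic sort | uniq -c pattern (shown here purely as a sketch, not RTK's implementation):

```shell
# Sketch of deduplication: identical repeated warnings collapse to
# one line prefixed with an occurrence count.
printf 'warning: unused variable\nwarning: unused variable\nwarning: unused variable\nerror: mismatched types\n' \
  | sort | uniq -c
# prints each distinct line once with its count:
# the error appears once, the warning three times
```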

The entire process is transparent. Your AI assistant calls git status as usual — a hook rewrites it to rtk git status behind the scenes. The assistant never knows the difference. You get the same information, at a fraction of the token cost.

Real-World Benchmarks

Measured over a 30-minute Claude Code session on a medium-sized TypeScript/Rust project:

| Operation | Calls | Tokens (standard) | Tokens (RTK) | Savings |
| --- | --- | --- | --- | --- |
| ls / tree | 10x | 2,000 | 400 | -80% |
| cat / read | 20x | 40,000 | 12,000 | -70% |
| grep / rg | 8x | 16,000 | 3,200 | -80% |
| git status / diff / log | 20x | 15,500 | 3,600 | -77% |
| cargo / npm test | 9x | 33,000 | 3,300 | -90% |
| Total | | ~118,000 | ~23,900 | -80% |

That's ~94,000 tokens saved in a single 30-minute session. Overhead per command: under 10ms.

Works with Every Major AI Coding Tool

RTK integrates with 10 AI coding assistants through a single rtk init command:

rtk init -g                   # Claude Code
rtk init -g --copilot         # GitHub Copilot (VS Code & CLI)
rtk init -g --agent cursor    # Cursor
rtk init -g --gemini          # Gemini CLI
rtk init -g --codex           # OpenAI Codex
rtk init --agent windsurf     # Windsurf
rtk init --agent cline        # Cline / Roo Code
rtk init -g --opencode        # OpenCode

The mechanism is hook-based command rewriting. For Claude Code, rtk init installs a PreToolUse hook that transparently rewrites git status to rtk git status. The AI tool is completely unaware of the transformation.
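For Claude Code, the installed hook entry in settings.json has roughly this shape. The schema (PreToolUse, matcher, command hooks) is Claude Code's; the rtk subcommand name below is a placeholder — the real entry is written for you by rtk init:

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          { "type": "command", "command": "rtk claude-hook" }
        ]
      }
    ]
  }
}
```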

Key Features

100+ Supported Commands

RTK ships with optimized filters for file operations (ls, find, cat), git, test runners (cargo, pytest, jest, playwright), build tools, linters, package managers, AWS CLI (25 subcommands), Docker, and Kubernetes.

Built-In Analytics

Track exactly how many tokens you're saving:

rtk gain              # summary of total savings
rtk gain --graph      # ASCII chart over 30 days
rtk gain --history    # recent commands with per-command savings
rtk gain --daily      # day-by-day breakdown
rtk discover          # find commands you're running without RTK

Single Binary, Zero Dependencies

RTK is written in Rust and compiles to a single static binary. No runtime dependencies, no Node.js, no Python. Install via Homebrew, cargo, or download a pre-built binary for macOS (Intel + ARM), Linux, or Windows.

Privacy-First Design

All filtering happens locally. No source code, file paths, command arguments, or secrets leave your machine. Optional anonymous telemetry (aggregate usage stats only) can be disabled with a single environment variable:

export RTK_TELEMETRY_DISABLED=1

Getting Started

# Homebrew (macOS / Linux)
brew install rtk

# Quick install script
curl -fsSL https://raw.githubusercontent.com/rtk-ai/rtk/refs/heads/master/install.sh | sh

# From source
cargo install --git https://github.com/rtk-ai/rtk

Then initialize for your AI tool:

rtk init -g    # Claude Code (global hooks)
rtk --version  # verify installation

Note

RTK requires no configuration beyond rtk init. It works immediately with 100+ commands. Advanced users can customize filters and excluded commands via ~/.config/rtk/config.toml.
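A config file might look something like the following — every key name here is hypothetical and shown only to convey the shape; consult the project README for the actual schema:

```toml
# Hypothetical sketch only -- illustrative key names, not RTK's
# documented configuration options.
[commands]
exclude = ["ssh", "vim"]    # commands that should never be rewritten

[filters]
max_lines = 200             # truncate output beyond this many lines
```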

Why It Matters

Lower costs — 80% fewer tokens per session means significantly lower API bills for teams using AI-assisted development at scale.
Better responses — less noise in the context window means the model spends its capacity reasoning about your code, not parsing boilerplate.
Longer sessions — context windows fill more slowly, so your assistant maintains coherence across larger tasks without losing earlier context.
Zero friction — no workflow changes required. One rtk init and it works in the background forever.
Open source — MIT licensed, 19k+ stars, actively maintained with multiple releases per week. Full transparency on what gets filtered and how.

Optimizing your AI-assisted development workflow?

We help teams integrate AI coding tools efficiently — reducing costs and improving output quality. Let’s talk.
