ArgosBrain turns your codebase into a deterministic graph your AI agent can query directly. No more grep loops, no more re-reads, no more "I think this is where it lives". Just exact answers, in ms, at zero token cost.
Claude reads files to remember. ArgosBrain remembers, so Claude doesn't have to. Same prompt, same repo, same model.
"Uber has exhausted its full 2026 AI token budget and is managing costs by deploying expensive frontier models for initial development, then switching to cheaper or open-source alternatives at scale."
— @TheDeepDiveFeed · publicly disclosed · 2026
→ Agent that re-reads: budget runs out by Q3, forcing a downgrade to cheaper models at scale.
→ Agent that remembers: same dollar budget, 3.7× more agent work, and frontier models stay on the entire pipeline.
The choice isn't "frontier model vs cheaper model". It's "agent that re-reads vs agent that remembers". Drop the re-read tax and you don't have to downgrade anything.
↳ Methodology, raw data, and the 30-day field study these numbers come from: The Re-Read Tax →
ArgosBrain is a local structural graph of your codebase — every symbol, every call, every import — that your coding agent queries via standard MCP. Claude Code, Cursor, Codex, Cline, Aider — all talk to it natively with zero changes to your workflow. You install it once, and your agent stops re-reading the same files for the rest of your life.
That's the chunk ArgosBrain eliminates.
Honest claim ready to fact-check.
| Repo size | Static | Repo context | History | Code | Net savings |
|---|---|---|---|---|---|
| Side project (10K LOC) | 17K · 45% | 18K · 47% | 2K · 5% | 1K · 3% | ~35-45% |
| Mid SaaS (50K LOC) | 17K · 28% | 40K · 65% | 3K · 5% | 1K · 2% | ~55-65% |
| Large product (250K LOC) | 17K · 15% | 80K · 73% | 10K · 9% | 1K · 1% | ~70-75% |
| Monorepo (1M+ LOC) | 17K · 8% | 180K · 85% | 12K · 6% | 2K · 1% | ~80-85% |
The bigger your repo, the more we save. On a 250K-LOC project we cut roughly 70%. On a monorepo, closer to 85%. We can't make the 16% Claude Code system overhead disappear — that's Anthropic's architecture, not ours.
Defaults from Anthropic's published averages: $6/dev/day Claude Code average, $13/dev/day enterprise, top 10% up to $30/day.[6][7]
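To see how these per-developer defaults combine with the net-savings ranges in the table, here is a rough sketch. The function names, the 21 working days per month, and the idea that the saving applies evenly to the whole daily spend are illustrative assumptions, not ArgosBrain's published methodology.

```python
def monthly_spend(devs: int, usd_per_dev_day: float, working_days: int = 21) -> float:
    """Estimated monthly agent spend for a team (working_days is an assumption)."""
    return devs * usd_per_dev_day * working_days


def spend_after_savings(spend: float, net_savings: float) -> float:
    """Spend remaining after a fractional net saving, e.g. 0.55 for 55%."""
    if not 0.0 <= net_savings <= 1.0:
        raise ValueError("net_savings must be between 0 and 1")
    return spend * (1.0 - net_savings)


if __name__ == "__main__":
    # Ten developers at the $6/dev/day average quoted above.
    baseline = monthly_spend(devs=10, usd_per_dev_day=6.0)
    # Mid SaaS row of the table: roughly 55-65% net savings.
    best_case = spend_after_savings(baseline, 0.65)
    worst_case = spend_after_savings(baseline, 0.55)
    print(f"baseline ${baseline:,.0f}/month")
    print(f"with savings ${best_case:,.0f} to ${worst_case:,.0f}/month")
```

Swap in the $13 or $30 daily figures and the savings range for your own repo size to get a comparable estimate.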
The existing options, LongMemEval and RULER, measure generic recall on chat transcripts. Neither touches actual codebases. We wrote our own and released it under the MIT license.
LongMemCode kubernetes-2k is our open-source corpus of 1,456 structural scenarios across 8 categories on the real Kubernetes v1.32.0 codebase (333 MB Go source, 38,771 symbols, 232,756 call-graph edges) — symbol existence, caller enumeration, reachability, naming convention, blast radius, plus 100 real bug-fix commits mined from git history. Every scenario has a deterministic ground truth derived from the actual AST. No LLM judge. Either the answer matches the AST or it doesn't.
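A minimal sketch of what "matches the AST or it doesn't" can look like in practice. The `Scenario` shape, the function names, and the example file:line strings are hypothetical; the actual LongMemCode runner may score differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One benchmark question with its ground-truth answer set (hypothetical shape)."""
    question: str
    expected: frozenset[str]


def passes(scenario: Scenario, answer: set[str]) -> bool:
    """Exact set match: no partial credit and no LLM judge."""
    return frozenset(answer) == scenario.expected


def accuracy(results: list[tuple[Scenario, set[str]]]) -> float:
    """Fraction of scenarios whose answer matches ground truth exactly."""
    if not results:
        return 0.0
    return sum(passes(s, a) for s, a in results) / len(results)


if __name__ == "__main__":
    callers = Scenario("callers of pkg.Foo", frozenset({"a.go:10", "b.go:22"}))
    print(passes(callers, {"b.go:22", "a.go:10"}))  # order does not matter
    print(passes(callers, {"a.go:10"}))             # a missing caller fails
```

Because scoring is plain set equality, two runs over the same answers always produce the same score.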
Yes, we built it. Yes, our engine runs against it. But the runner, the scenarios, and the per-scenario raw results are public — anyone can clone, run on their own laptop, and try to break the numbers. Reproducibility is the only honest answer to "but you graded yourselves." Source: github.com/CataDef/LongMemCode
17,171 files. 303,722 symbols. 2,245,124 call-graph edges. Two runs, two skills. Security audit: 22 sink categories triaged, zero reachable critical findings, library gaps disclosed publicly. 70 seconds, $0.33. Architectural code tour: the AI deduced the engineering culture — spine, heartbeat, naming convention modulo machine-generated noise. 6 seconds, $0.11.
A lot of devs we talk to already keep notes about their code in Obsidian, Notion, a wiki, or a CLAUDE.md file. That's a good habit — keep it.
ArgosBrain isn't a replacement for any of those. It's a different layer entirely: instead of storing what you think about your code, it stores what your code is, and your agent queries it directly via MCP. The two tools stack — they don't compete.
| | Obsidian (and similar) | ArgosBrain |
|---|---|---|
| Job | Your second brain — thoughts, designs, decisions | Your agent's code index — symbols, callers, types |
| Who writes the data | You, when you sit down to think | Your repo, automatically on every commit |
| Who reads it | You (and occasionally your agent, if you paste) | Your agent, on every prompt, via MCP |
| Best for | PRDs, design docs, journal entries, learnings | Code structure questions, refactoring, call graphs |
| Stays in sync with code | You maintain it | File-hash invalidation re-indexes only the diff |
| Costs tokens to query | Yes — pasting notes loads them into context | No — 0 tokens, sub-ms graph lookup |
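The file-hash invalidation named in the table can be sketched roughly as follows. This illustrates the general technique only; the function names, the SHA-256 choice, and the `*.py` default pattern are assumptions, not ArgosBrain's actual implementation.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file's bytes, used as its content fingerprint."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def stale_files(
    root: Path, known: dict[str, str], pattern: str = "*.py"
) -> tuple[list[str], dict[str, str]]:
    """Return (files that need re-indexing, updated hash table).

    A file is stale when it is new or its digest differs from the stored one.
    Deleted files simply drop out of the updated table.
    """
    current = {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob(pattern))
        if p.is_file()
    }
    changed = [name for name, digest in current.items() if known.get(name) != digest]
    return changed, current
```

On the first call with an empty table every file is stale; on later calls only files whose bytes changed are returned, so the indexer touches only the diff.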
CLAUDE.md is narrative. ArgosBrain is structural. They stack.
CLAUDE.md is a markdown file you put in your repo root. Claude Code reads it on every session and injects it into the agent's prompt as project context. It's the place for your dev journal — preferences, conventions, the things you want the agent to always know.
Keep writing those. Just understand what they're good at — and what they're not.
… /spec, run with vitest, never jest."
… npm install."
Preferences and conventions. The kind of thing that doesn't change every commit.
It can't answer questions that depend on the current state of the code. Three concrete examples — same scenario, what each can and can't do.
… verifyToken called?"
… verifyToken to validateToken and commit.
… verifyToken. The agent gets stale instructions and may invent a wrong API.
… validateToken the moment your file is saved.
… argos.list_callers() on demand and gets only the answer it needs: file:line of 6 callers, ~30 tokens.
Tool-use histograms from two real Claude Code sessions on the same Next.js codebase (~400 files). The only variable: whether the project had a CLAUDE.md telling Claude to reach for ArgosBrain first.
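First session:

| Calls | Tool | Note |
|---|---|---|
| 189 | Edit | |
| 177 | Read | |
| 146 | Bash | |
| 127 | Grep | every code question |
| 9 | Glob | |
| 9 | Write | |
| 6 | ToolSearch | |
| 2 | AskUserQuestion | |
| 0 | mcp__argos__* | MCP installed, never called |

Second session:

| Calls | Tool | Note |
|---|---|---|
| 26 | Bash | |
| 24 | Read | |
| 23 | Edit | |
| 16 | TodoWrite | |
| 13 | mcp__argos__symbol_exists | |
| 8 | mcp__argos__search | |
| 3 | ToolSearch | |
| 1 | mcp__argos__ingest_codebase | |
| 0 | Grep | agent trusts structure now |
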
In the second session, search also surfaced four SSRF call-sites the first session's Grep had missed — they lived outside the directory the audit was pointed at, reachable only via Causal edges in the call graph.
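Why a directory-scoped text search can miss call-sites that a graph traversal finds is easiest to see in a small sketch. The breadth-first walk below treats the graph as plain caller-to-callee edges; the symbol names, the file paths, and the mapping of the "Causal edges" mentioned above onto simple call edges are all assumptions for illustration.

```python
from collections import deque


def reachable(call_graph: dict[str, list[str]], start: str) -> set[str]:
    """Every symbol reachable from `start` by following call edges (BFS)."""
    seen = {start}
    queue = deque([start])
    while queue:
        caller = queue.popleft()
        for callee in call_graph.get(caller, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen - {start}


if __name__ == "__main__":
    # Hypothetical graph: an API route calls into a service in another directory.
    graph = {
        "src/app/api/import/route.ts:POST": ["src/lib/services/fetch.ts:fetchRemote"],
        "src/lib/services/fetch.ts:fetchRemote": ["src/lib/services/http.ts:get"],
    }
    hits = reachable(graph, "src/app/api/import/route.ts:POST")
    # Symbols outside the audited directory that the route still reaches.
    print(sorted(s for s in hits if not s.startswith("src/app/api/")))
```

A search confined to `src/app/api/` never opens the service files, while the traversal reaches them because it follows edges rather than directory boundaries.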
CLAUDE.md for what humans write: preferences, conventions, narrative.
We didn't write these reviews. Claude Opus 4.7 did, unprompted, during a live 1 237-turn coding session on a production Next.js SaaS. The agent graded ArgosBrain against Grep and RAG on real jobs it had to do that day. Below are its seven own-word assessments, unedited beyond light trimming. The eighth card (multi-modal) ships in v0.2; it arrived after the review, so it's ours, labelled as such.
"The initial audit scoped src/app/api/ and found two SSRF sites. ArgosBrain surfaced four more in src/lib/services/ — the agent had to follow causal edges across directories Grep wasn't pointed at."
— Claude Opus 4.7 · dogfood session · 2026-04-22
2× RECALL VS. GREP
"Argos returned a CLEAR match: uploadVideoToTikTok(videoBuffer: Buffer, …) takes a Buffer, not a URL. The agent was about to patch the call site as if it accepted a URL — that retrieval prevented a silently-broken commit."
— Claude Opus 4.7 · dogfood session · 2026-04-22
PREVENTED A BAD COMMIT
"Before deleting an RLS-bypassing route I thought was dead, I asked Argos for its callers. It returned NO_CONFIDENT_MATCH — exhaustive over the ingested codebase. Not 'I didn't find any'; 'there are none.' Deleted with confidence, no regression."
— Claude Opus 4.7 · dogfood session · 2026-04-22
SAFE DEAD-CODE CUT
"I was about to write a new handler. Argos pulled up the existing one from an older session — same behaviour, already tested. Saved me a duplicate route and the tech debt that comes with it."
— Claude Opus 4.7 · dogfood session · 2026-04-22
NO DUPLICATE HANDLERS
"Before adding a new admin check, Argos surfaced ADMIN_EMAILS as the project's established pattern. The agent used the same convention instead of inventing its own. Tiny detail; compounds over months."
— Claude Opus 4.7 · dogfood session · 2026-04-22
STYLE-CONSISTENT PRS
"'Does sanitizeHtml exist in this project?' — answered 'no' in 40ms with confidence 1.0. Grep on 400 files would have taken a full second and left the question ambiguous. The agent stopped hunting for ghosts."
— Claude Opus 4.7 · dogfood session · 2026-04-22
< 50 MS DEFINITIVE NEGATIVES
"Before committing to a feature, the agent used Argos to map every file a change would touch — six, across three service boundaries. It flagged the effort as disproportionate and deferred the work. A human tech lead would have done the same scope check."
— Claude Opus 4.7 · dogfood session · 2026-04-22
ACCURATE EFFORT ESTIMATES
"User shared a UI mockup. The LLM interpreted it — 'a 3-step Stripe checkout, Place Order button disabled until terms accepted' — and Argos stored that interpretation linked to checkoutHandler. Two weeks later, the 'why is the button disabled?' question resolved instantly."
1 CALL = IMAGE + CONTEXT + CODE LINK
Each service is a working pipeline backed by deterministic structural retrieval, with file:line citations, sub-millisecond P99, $0 per query, and local execution. Click in for the full pitch, the side-by-side math, and how to reproduce it on your repo. See all 14 services →
ArgosBrain ships one binary that adapts to where you work — security audits, MCP-compatible coding agents, enterprise compliance, e-commerce theming. Click your scenario for the opinionated install + tooling guide.
… sections/*.liquid on every turn.
Every session starts from scratch. Every query re-embeds files you've already seen. Every run rebuilds the repo map and throws it away.
The community has a name for this: context rot. Chroma's 2025 study measured it across 18 frontier models — every one degrades as input grows. Anthropic shipped a Memory Tool in September 2025, but it's a file primitive, not a brain.
Meanwhile, you're paying for the same file to be read 40 times a week. Cursor Ultra is $200/mo. Claude Max is $200/mo. Token bills don't lie.
Compiled Rust binary runs locally. Tree-sitter + SCIP parse your codebase into a unified graph. 28 languages. Updates instantly on file save.
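To make "parse your codebase into a unified graph" concrete, here is a toy stand-in. It uses Python's built-in `ast` module rather than Tree-sitter or SCIP, handles a single module, and only records direct calls by simple name. Treat it as a picture of the idea, not of ArgosBrain's parser.

```python
import ast


def index_source(source: str) -> tuple[set[str], set[tuple[str, str]]]:
    """Return (defined function names, (caller, callee) edges) for one module."""
    tree = ast.parse(source)
    symbols: set[str] = set()
    edges: set[tuple[str, str]] = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            symbols.add(node.name)
            # Calls inside nested functions are also attributed to the outer one.
            for inner in ast.walk(node):
                # Only direct calls by simple name, e.g. verify_token(...).
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    edges.add((node.name, inner.func.id))
    return symbols, edges


if __name__ == "__main__":
    sample = (
        "def verify_token(token):\n"
        "    return bool(token)\n"
        "\n"
        "def login(token):\n"
        "    return verify_token(token)\n"
    )
    symbols, edges = index_source(sample)
    print(sorted(symbols))
    print(sorted(edges))
```

Once symbols and edges live in sets like these, "does this symbol exist?" becomes a membership test and "who calls it?" a filter over edges, with no file re-read.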
Any agent asks structural questions via standard MCP tools — symbol_exists, resolve_member, list_symbols, search. Sub-ms answers. Integrate it into your custom internal tools effortlessly.
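For custom internal tools, a request to one of these tools is an ordinary MCP `tools/call` message over JSON-RPC 2.0. The sketch below only builds the request body. The tool name comes from the list above, but the argument key `name` is a guess at the tool's input schema, and transport setup (stdio or HTTP, plus the MCP initialize handshake) is left out.

```python
import json
from itertools import count

# Monotonic JSON-RPC request ids for this process.
_request_ids = count(1)


def tools_call_request(tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


if __name__ == "__main__":
    # The "name" argument key is an assumption; check the server's tool listing.
    print(tools_call_request("symbol_exists", {"name": "sanitizeHtml"}))
```

An MCP client can ask the server for its `tools/list` response to get the real input schema before relying on any argument names.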
$0 per query, forever. No LLM in the retrieval loop. Local-first. Zero data egress. Toggle on/off, see the diff.
Same repo. Same prompt. Same model (Claude Opus 4.7, temperature=0).
Left window: agent alone. Right window: agent + ArgosBrain.
LLM summarization destroys ASTs. We parse them.
$52M raised in the category — zero products built for code.
100M tokens still hallucinate symbols. And cost real money per call.
Our answers are ground truth, in 0.8 milliseconds.
They ship your code to their servers and bill LLM cost per query.
We run local. $0 per query. Zero data egress.
Locked to one editor. We work in every MCP agent — including the ones above.
One brain, every tool.
One giant table is unreadable. Here's the same information split into seven categories — ArgosBrain first, everyone else ranked against us. Click any competitor for the full page with citations.
| Tool | Cost per query |
|---|---|
| ArgosBrain | $0, no LLM on read path |
| Zep / Graphiti | Free retrieval (graph + semantic) |
| Mem0 | Embedding + vector search |
| MCP memory server | Substring + full body |
| Aider | ~1 000 tokens / request |
| Continue | Prompt tokens (chunks injected) |
| Cursor · Windsurf · Copilot | Prompt tokens every relevant query |
| CLAUDE.md | Full file in system prompt, every turn |
| Cline Memory Bank | Full MD bank at every session start |
| Letta | LLM tool-call on every read |

| Tool | Code indexing |
|---|---|
| ArgosBrain | SCIP + live LSP + tree-sitter, tiered per language |
| Aider | Tree-sitter surface names + PageRank |
| Continue | Tree-sitter text chunks for embedding |
| Copilot | Semantic repo indexing (opaque) |
| Cursor · Windsurf | Undocumented |
| Cline · Mem0 · Zep · Letta · CLAUDE.md · MCP memory | No code indexing: prose / text / JSON |

| Tool | Staleness handling |
|---|---|
| ArgosBrain | File-hash invalidation, automatic |
| Copilot | 28-day auto-expire + citation validation |
| Aider | Recomputed per request (always fresh) |
| Zep / Graphiti | Bi-temporal edges (not code-aware) |
| Continue | On re-index |
| Cursor · Windsurf | Unknown |
| Cline · CLAUDE.md | Manual edit only |
| Mem0 · Letta · MCP memory | None |

| Tool | Runs locally |
|---|---|
| ArgosBrain | Yes, default; runs in-process |
| Windsurf · Zed · Cline · Aider · Continue · CLAUDE.md · MCP memory | Yes |
| Mem0 · Letta | OSS self-host yes; Cloud no |
| Zep | CE deprecated Apr 2025; Graphiti OSS only |
| Cursor · Copilot | Cloud-only |

| Tool | Published benchmarks |
|---|---|
| ArgosBrain | LongMemCode 99.2–100% across 16 corpora, P99 ≤ 0.82 ms |
| Zep | DMR 94.8%, LongMemEval +18.5% / −90% latency |
| Mem0 | LoCoMo 91.6%, LongMemEval 93.4% |
| Letta | Terminal-Bench #1 OSS (Letta Code) |
| All others | None published |

| Tool | MCP support |
|---|---|
| ArgosBrain | Is an MCP server; runs under every MCP client |
| MCP memory server | Yes (reference implementation) |
| Continue · Cline · Mem0 · Zep · Letta | MCP client only: can consume, not serve |
| CLAUDE.md | Convention, not a protocol |
| Cursor · Windsurf · Copilot · Aider | No: memory locked inside their tool |

"State integrity degrades at 500K to 2M tokens. Roughly one-fifth to one-tenth the scale where retrieval architecture becomes critical." — Mark Hendrickson · Apr 2026
Long-context models don't solve memory. BEAM scores showed RAG degrading from 30.7% at 1M tokens to 24.9% at 10M, and contradiction resolution near zero at every tier. ArgosBrain's verify / dispute / zone transitions are exactly the write-integrity layer those numbers say is missing.
"Every practitioner has felt it. Your GraphRAG system is useless for weeks — hallucinating, missing obvious connections. Then suddenly, it works." — Alexander Shereshevsky · Graph Praxis
Flat vector RAG breaks on codebases because codebases are high-connectivity graphs (call sites, inheritance, imports). ArgosBrain is graph-first by design — petgraph + HNSW + keyword hybrid — which is why every cell in the "code-native" row above is red except ours.
"We win at everything" is a lie and engineers smell it instantly. Here's what we don't ship today.
Cursor and Copilot ship memory inside the editor with zero install. ArgosBrain runs as an MCP server you configure.
Mem0 Cloud and Zep Cloud offer multi-user team memory out of the box. ArgosBrain is local-first; team sync is roadmap, not shipped.
Mem0 holds 91.6% on LoCoMo. ArgosBrain targets ≥91.6% on LongMemEval — match, not beat. Our moat is code, not chat.
For pure-English queries like "rate limit fail open" — no symbol names, no identifiers — Grep is still the faster tool. Argos is for structural code questions; we'll point you at Grep when that's the right answer.
Database rows, RLS policies, deploy logs, third-party API responses, runtime errors. Not our job. Use psql, provider CLIs, deploy hooks, browser devtools. We store code memory — not a proxy to production systems.
We don't ship a vision stack. Your agent's LLM interprets the file; we make sure that interpretation is remembered — linked to your codebase. One less binary, one less supply-chain surface, one less thing to audit.
Sign in with GitHub to get your free key. Your dashboard then shows a single copy-paste install line that includes your key — paste it in your terminal and you're done.
↑ This is what you'll paste in your terminal. Sign in to get the version with your free key embedded.
… 127.0.0.1:3733 (open it with argosbrain dashboard).
No 30-day trial clock. No credit card on the Free tier. Cancel any paid plan at any time; your subscription stays active through the end of the billing period, and we offer a 14-day refund on your first paid charge.