# Building Your Own LLM Wiki with Claude Code: A Minimalist's Guide (Without the Obsidian Lock-In)

## TL;DR
- The indexing era is ending. Once LLMs start generating most of your notes, traditional vector databases become a tax — embeddings to rebuild, egress fees to pay, dimensions to rotate. A folder of plain markdown files is suddenly a feature, not a limitation. [Link]
- Karpathy's LLM Wiki idea went viral for a reason. Instead of re-reading raw documents on every query, you compile a living markdown wiki once and let the agent keep it fresh. [Link]
- You do not need Obsidian to enjoy Obsidian's best gift: its searchable conventions ([[wikilinks]], YAML frontmatter, atomic notes). Borrow the convention, drop the app. `mq` + SQLite FTS5 + ripgrep beats a vector DB for personal-scale archives. Reindexing 200 files takes about 3 seconds. BM25 ranking gives you the right files first, instead of an alphabetical dump.
- The garden, not the mausoleum. Pair `/deep-thinking:ground` for retrieval and `/deep-thinking:deep-research` for generation, and your knowledge base grows itself while you sleep. [Link]
## Introduction: When the Filing Cabinet Becomes the Liability
For about thirty years, the unspoken rule of personal knowledge was that a human writes, a database indexes. You typed; PostgreSQL, Elasticsearch, or — more recently — Pinecone quietly catalogued. That contract is breaking in 2026, and not for the reason most people expected.
The break is not that humans stopped writing. It is that AI agents started writing far more than humans ever could. A single `/deep-research` run from Claude Code routinely lands a long-form, citation-heavy markdown dossier in a single sitting — denser, better-cited, and more contextually loaded than any blog post you would have written by hand. Multiply that by a few queries a week, and the curve goes vertical fast.

Suddenly the operating cost of "managing" this content gets ugly. Vector database providers in 2026 quietly bake in egress fees of about $0.08–$0.09 per GB on AWS, index rebuild compute of $12–$40 per 10M vectors, and roughly 1.5× HNSW storage overhead. [Link] For a personal archive, that is overkill the size of a small mortgage.
Doing it locally hurts in a different way. Spinning up your own Milvus or Weaviate instance, embedding every note, monitoring rebuilds — the operational overhead alone has been called out as the silent cost of self-hosted vector stores. [Link] For one person, that is a part-time job nobody asked for.
So a thought sneaks in: what if the markdown files themselves were the database? Andrej Karpathy thought it loud enough that the entire ecosystem heard him — but he was not the first to notice that the old contract was broken. Ten months earlier, somebody had already laid the funeral wreath.
## Karpathy's Argument: A Wiki Is a Compiled Artifact, Not a Live Query
On June 16, 2025, the Australian writer Joan Westenberg deleted, by her own hand, five years of an Obsidian vault: roughly ten thousand notes, thousands of backlinks, the whole thing meticulously mapped under Tiago Forte's PARA system. Her essay "I Deleted My Second Brain" hit the front pages of Hacker News and r/ObsidianMD on the same afternoon. [Link] The diagnosis was brutal:
> "Instead of accelerating my thinking, it began to replace it. Instead of aiding memory, it froze my curiosity into static categories... PKM systems promise coherence, but they deliver abstracted confusion. The map swallows the territory." — Joan Westenberg, I Deleted My Second Brain (2025) [Link]
Westenberg's closing line — "I don't want to manage knowledge. I want to live it" — became the unofficial epitaph of capture-maximalist PKM. Ten months later, Karpathy answered the same diagnosis with a different prescription. Westenberg buried the mausoleum. Karpathy drew the blueprint for a garden. Same dead patient, opposite funerals.
In April 2026, Andrej Karpathy published a small GitHub Gist titled `llm-wiki.md`. [Link] The thesis is brutally short: stop running expensive retrieval on every question, and instead let the LLM continually re-compile your raw notes into a structured, navigable markdown wiki.

The clean way to put it comes from MindStudio's explainer: instead of having the LLM re-read your raw documents every time you ask a question, build a persistent, structured wiki once and keep it updated forever. [Link] Karpathy's analogy was software compilation. Your notes are source code; the wiki is the binary; Claude Code is the build system.
This flips a quiet assumption that RAG (Retrieval-Augmented Generation) had baked into the field for two years. RAG says, "Keep raw text, and re-retrieve at query time." LLM Wiki says, "Keep raw text, but also keep a living refined index — and let the agent maintain it." [Link]
Why does this matter for cost? Because a markdown file is its own index. There is no embedding to rebuild when you edit a paragraph, no dimensional collapse when your domain shifts, no monthly bill for the privilege of hosting your own thoughts. Steph Ango, Obsidian's CEO, wrote the line that aged from a 2023 essay into a 2026 anthem:
> "The files you create are more important than the tools you use to create them. Apps are ephemeral, but your files have a chance to last." — Steph Ango, File over app (2023) [Link]
That is the philosophical foundation. The next question is engineering: how minimal can you make the stack while still getting 80% of the benefit?
## The Minimalist Twist: Borrow the Convention, Skip the App
The dominant LLM-Wiki recipe in 2026 pairs Obsidian with Claude Code. AgriciDaniel/claude-obsidian crossed roughly 3,500 stars on GitHub within three weeks of its early-April 2026 launch, and a half-dozen serious clones followed in its wake (e.g., NicholasSpisak/second-brain, eugeniughelbur/obsidian-second-brain). [Link] This works. It is also, frankly, more software than most people need.
Here is the crucial observation that the Reddit crowd surfaced first — the top comment of a 311-upvote r/ClaudeAI guide on running Claude Code over an Obsidian vault:
> "An Obsidian vault is just a mass of markdown text files... putting Claude in front of the vault folder is enough. If you want to optimize, use ripgrep & fts5 in front of your vault." — u/emptyharddrive, r/ClaudeAI [Link]
Obsidian is, in the end, a markdown renderer with a graph view. You can love it (I do) and still admit that if you already live in a terminal with Claude Code, the renderer is optional. What is not optional is Obsidian's conventions — and that is the trick.
The Obsidian vault contract gives you four free superpowers that every modern markdown tool understands:
| Convention | What It Gives You | Why the Agent Cares |
|---|---|---|
| `[[Wikilinks]]` | First-class internal references | Lets the agent traverse a graph instead of a flat search [Link] |
| YAML frontmatter | Typed metadata per file (tags, dates, `human_reviewed`) | Filterable provenance; closes the cognitive-debt loop |
| Atomic notes (Matuschak) | One concept per file — "atomic, concept-oriented, densely linked" | BM25 scores stay meaningful; no "kitchen-sink" dilution |
| Folder-as-domain | `software-engineering/`, `economy/`, `fashion/` | Clean directory partitions for grounding scope |
So the minimalist move is: make the agent write Obsidian-flavored markdown, but never actually open Obsidian unless you feel like it. The vault is a folder. Your editor is whatever you already use (vim, VS Code, `glow` for in-terminal preview). And the search engine? Three small CLIs.
## The CLI Stack: mq, FTS5, ripgrep
The trick to making a folder feel like a database is choosing search tools that understand markdown's shape, not just its bytes. Three pieces do almost everything a personal-scale wiki needs.
### 1. mq — The jq for Markdown
`mq` is exactly what its tagline says: a jq-like query language for markdown's AST. [Link] Where ripgrep sees text, `mq` sees structure — `## Heading`, fenced code blocks, lists, tables — as first-class objects. [Link]

Why is that worth installing one more tool? Because `rg '^## '` happily matches `## comment` inside a code block. In a code-heavy doc, that can mean something like a 40% false-positive rate. `mq '.h2'` ignores code blocks by design. When you are mapping a 2,000-line research file before deciding what to read, this difference is the difference between a clean ten-line outline and ten lines of confusion.
```bash
# Quick TOC of any markdown file — never read cold again
$ mq '.h2' deep-research.md

# Auto-generate a clickable table of contents
$ mq 'select(or(.h1,.h2)) | to_link' deep-research.md

# Pull just JavaScript code blocks
$ mq '.code("js")' deep-research.md
```
### 2. SQLite FTS5 — A Local Search Engine That Fits in One File
SQLite FTS5 is the official full-text search extension shipped inside SQLite. [Link] It supports BM25 ranking out of the box, runs as a single file with no daemon, and reindexes faster than you can context-switch.
The default ranking function — `bm25()` — returns a relevance score where lower is better, just like a golf scorecard. [Link] Plug that into your query and you get the top 5 most relevant files instead of the top 30 alphabetical hits. For a 200-file wiki, that single change saves about 70% of the context budget on every grounding call.
```sql
-- The query at the heart of the /ground skill
SELECT rel_path, bm25(notes_fts) AS score
FROM notes_fts
WHERE notes_fts MATCH 'fts5 OR mq OR "LLM Wiki"'
  AND (human_reviewed != 'false' OR human_reviewed IS NULL)
ORDER BY score
LIMIT 10;
```
One footgun worth flagging up front: FTS5's default tokenizer treats hyphens and whitespace as token boundaries. So `MATCH 'remote-control'` parses as `remote AND control`, and `control` then gets read as a column name, blowing up with `Error: no such column: control`. The fix is mechanical — wrap multi-word or hyphenated terms in double quotes: `MATCH '"remote-control"'`. Once you know it, you never trip on it again.

Reindexing is the other thing FTS5 quietly nails. Even a fresh-from-scratch reindex of ~200 markdown files completes in roughly 3 seconds on a modern laptop. That is fast enough to run after every batch of `/deep-research` outputs, which means your search index is never meaningfully stale.
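A reindex of that shape is small enough to sketch in full. Here is a minimal, stdlib-only take on what such a script could look like — the table and column names mirror the example query above, but the frontmatter probing and function name are my assumptions, not the plugin's actual code:

```python
import sqlite3
from pathlib import Path

def reindex(vault_dir: str, db_path: str = "wiki.db") -> int:
    """Rebuild the FTS5 index over every .md file in the vault. Returns file count."""
    conn = sqlite3.connect(db_path)
    # Drop and rebuild from scratch — fast enough at personal scale (~200 files)
    conn.execute("DROP TABLE IF EXISTS notes_fts")
    conn.execute("""
        CREATE VIRTUAL TABLE notes_fts USING fts5(
            rel_path UNINDEXED,
            human_reviewed UNINDEXED,
            body
        )""")
    count = 0
    for md in Path(vault_dir).rglob("*.md"):
        text = md.read_text(encoding="utf-8", errors="replace")
        # Naive frontmatter probe — good enough for a human_reviewed flag
        reviewed = "false" if "human_reviewed: false" in text[:500] else "true"
        conn.execute(
            "INSERT INTO notes_fts (rel_path, human_reviewed, body) VALUES (?, ?, ?)",
            (str(md.relative_to(vault_dir)), reviewed, text),
        )
        count += 1
    conn.commit()
    conn.close()
    return count
```

With the table populated, the BM25 query shown above works unchanged; rerunning the function rebuilds everything from scratch, which at a couple hundred files stays in the low seconds.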
### 3. ripgrep — The Default Hammer
ripgrep (`rg`) is the universal donor of code search. It is so well-regarded that Claude Code's built-in `Grep` tool is internally backed by `@vscode/ripgrep`, and one CodeAnt writeup makes the case bluntly: every coding agent should be using `rg` instead of GNU `grep`. [Link] For our purposes, `rg` is the fallback when FTS5 is unavailable, or when you need raw content extraction with `-C 5` line context for citation.

Together, the layered roles look like this:
| Stage | Tool | Job | Bad Habit It Replaces |
|---|---|---|---|
| Discovery | Glob + FTS5 | Rank files by keyword density (BM25) | Reading every file alphabetically |
| Map | `mq '.h2'` | Outline a file before opening it | Cold-reading a 2,000-line doc |
| Pinpoint | `rg -n -C 5` | Pull cited passages with line numbers | Quoting from memory |
| Verify | Read offset/limit + frontmatter check | Confirm provenance before quoting | Citing your own LLM-generated synthesis as truth |
This stack costs you exactly $0/month and roughly two `brew install` lines. Compare that with a Pinecone subscription. The math writes itself.
## The Skill Layer: /deep-thinking:ground and /deep-thinking:deep-research
The CLIs above are the muscles. You still need a brain telling them what to do. That is what Claude Code skills are for — small, declarative instruction sets that a slash command pulls in on demand. [Link]
I packaged my own search-and-generate workflow as a public plugin called deep-thinking, which lives at github.com/JSON-OBJECT/claude-code/tree/main/plugins/deep-thinking. Two skills do the heavy lifting.
### /deep-thinking:ground — Six-Stage Markdown Source Grounding
`/ground` is the answer-anything skill. It enforces a six-stage pipeline (numbered 0 through 5) before a single sentence of an answer is allowed to be written:
| Stage | Activity | Primary Tool |
|---|---|---|
| 0. Awareness | Pick the right tool layer (internal vs shell vs MCP) | — |
| 1. Discovery | Narrow + rank candidates | FTS5 BM25, Glob fallback |
| 2. Map | Outline long files cheaply | `mq '.h2'` |
| 3. Pinpoint | Extract quotable context | `rg -n -C 5` |
| 4. Verify | Targeted read + frontmatter provenance gate | Read offset/limit |
| 5. Augment | Web search, only for genuine gaps | Brave Search MCP |
Two design choices keep it honest. First, every claim in the final answer must carry a `file:line` citation — no "I think the file says X" allowed. Second, the answer is written as flowing magazine-style prose with metaphors, not a bulleted data dump. Bullets are for the grounding summary at the top; the answer body reads like a tech essay. This was a deliberate call: the moment knowledge-base output starts looking like a Wikipedia infobox, your brain stops engaging with it.

Worth being upfront about a quiet hazard the skill protects against: when an LLM cites its own previously generated synthesis as ground truth, you get a closed feedback loop that MIT Media Lab's 2025 EEG study (Kosmyna et al., Your Brain on ChatGPT) labeled the "accumulation of cognitive debt" — measurable declines in critical thinking and information depth in LLM-assisted writers. [Link] The Stage-4 frontmatter gate is the brake: every agent-generated note carries `generated_by:` and `human_reviewed: false` until a human stamps it, and `/ground` refuses to use unreviewed notes as primary evidence. The mausoleum keeps its skeletons; the garden keeps its compost pile.
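The gate itself is a few lines of logic. As a hedged sketch — the field names follow the frontmatter convention described above, but the function names and the deliberately naive parser are mine, not the plugin's code:

```python
def parse_frontmatter(text: str) -> dict:
    """Extract simple 'key: value' pairs from a leading YAML frontmatter block."""
    meta = {}
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return meta  # no frontmatter at all
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta

def usable_as_evidence(text: str) -> bool:
    """A note counts as primary evidence unless it is agent-generated and unreviewed."""
    meta = parse_frontmatter(text)
    if "generated_by" in meta and meta.get("human_reviewed", "false") != "true":
        return False
    return True
```

Human-authored notes without a `generated_by:` field pass through untouched; only the agent's own drafts are quarantined until reviewed.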
### /deep-thinking:deep-research — The Generator
`/deep-research` is the other half of the loop. Hand it a topic; it dispatches multiple sub-agents in parallel to gather sources, cross-verify, and write a long-form, citation-heavy markdown report in Obsidian convention — `[[wikilinks]]` to other notes in the vault, YAML frontmatter declaring its own provenance, atomic enough that a future query can grab one section without dragging in the rest.

This is the input side of the wiki. You ask, it researches, the file lands in the right folder. The next time you ask a related question, `/ground` finds it via FTS5 BM25 in milliseconds. The cycle closes.

Schedule `/deep-research` on a cron (or Claude Code's built-in `/loop`) for the topics you actively track, and you have what Karpathy promised: a wiki that compiles itself, indefinitely. Set a budget cap with `--max-budget-usd` so a hallucinating sub-agent cannot torch your monthly quota — the April 2026 GitHub thread from a Max 20x subscriber who burned 80% of their weekly window on a single Opus 4.6 spiral is the kind of cautionary tale you only need to read once. [Link]
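As a config sketch, the nightly loop can live in a plain crontab. The exact `claude` invocation, the `topics.txt` file, and the reindex script name here are assumptions drawn from the workflow described above — verify the flag spelling against your installed CLI before trusting it with a budget:

```shell
# Hypothetical crontab: regenerate one tracked topic at 03:00, reindex at 03:30.
0 3 * * *  cd ~/wiki && claude -p "/deep-thinking:deep-research $(head -1 topics.txt)" --max-budget-usd 2
30 3 * * * cd ~/wiki && python3 fts5-reindex.py
```

The budget cap is the important line: a capped nightly run can at worst waste $2, not a week's quota.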
## A Day in the Life of the Garden
What does it actually feel like? Less like operating software, more like tending a small plot.
You wake up. Overnight, a cron-scheduled `/deep-research` produced a fresh 300-line dossier on a topic you were tracking — say, the April 2026 Anthropic outage [Link] or the latest Obsidian Bases changelog. [Link] The file is parked in `_inbox/` with `human_reviewed: false`. You spend ten minutes reading it, fix two factual nits, change the flag to `true`, and move it into the right domain folder. The agent did the gathering; you did the judging.

At lunch you ask Claude a question. `/ground` runs FTS5 in 80ms, picks the three most relevant files, maps their headings with `mq`, pulls cited passages with `rg`, and writes you a paragraph-style answer with `file:line` citations you can audit. No vector DB was queried. No embedding was rebuilt. Total cost: about $0.04 of API tokens.

In the afternoon, you delete a stale file. No reindex required — well, technically yes, but the reindex is `python3 fts5-reindex.py` and finishes before you can switch tabs.

This is the gardener's posture. You plant (a question turns into a `/deep-research` run), you weed (you reject hallucinated outputs at the `human_reviewed` gate), you compost (`_archive/` for files that have served their purpose), and you watch a graph of related markdown files quietly thicken over months. Maggie Appleton's "digital garden" metaphor was always the right one; the LLM era just made it operationally cheap. [Link]
## Conclusion: The Filing Cabinet Was Always the Wrong Metaphor
The deeper lesson of the LLM Wiki movement is that knowledge management was solving the wrong problem for two decades. Tools like Notion and Evernote assumed the bottleneck was capture — get more in, faster, with prettier toolbars. But once an agent can pour more high-context text into your vault overnight than you could write in a year, capture is no longer the bottleneck. Curation is. What survives the `human_reviewed: true` gate becomes your real second brain. The rest is compost.

This stands in deliberate contrast to the cloud-PKM industry, which spent 2024 and 2025 chasing AI-tinted features — Notion AI, Mem.ai, Reflect — at premium subscription tiers, while quietly tightening lock-in. Cory Doctorow's coined word for that arc, enshittification, is no longer a slur; it is a forecast. The minimalist stack — markdown plus three CLIs plus a Claude Code plugin — is enshittification-proof in the way only plain text can be: when Anthropic's prices change, when Obsidian's license drifts, when whatever startup you trusted gets acquired, your folder of `.md` files keeps working.

The result is something quietly architectural: the convention layer and the rendering layer get separated for the first time. Obsidian gave us a beautiful coupling — the conventions and the app shipped together — and that coupling earned its roughly 1.5 million monthly actives honestly. But coupling is also a liability the moment the convention is the only part you actually need. Borrowing Obsidian's vocabulary while running on `mq` + FTS5 + `rg` is the same trick that made Markdown itself a winner: the format outlived every editor that ever rendered it.

Of course, this approach has trade-offs. You lose the graph view (until you write your own with `mq` and D3, which is not hard but is not free either). You lose the casual-user onboarding ramp — non-terminal users will rightly stick with Obsidian. And you take on the discipline of frontmatter hygiene; without `human_reviewed` flags, you absolutely will fall into the cognitive-debt loop the MIT study warned about. The minimalist stack is a sharp knife; treat it like one.

Looking ahead, the more interesting question is not "which tool wins" but what happens to the unit economics of personal knowledge when generation is essentially free and curation becomes the scarce resource. My bet is that the next two years belong to people who learn to be ruthless gardeners: who delete more than they keep, who promote fewer files than the agent generates, and who treat their wiki not as an archive to preserve but as a living plot they walk through every morning with a pair of shears. The mausoleum was always the wrong building. We were supposed to be growing something.
## References
- Primary Sources
  - https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f
  - https://www.sqlite.org/fts5.html
  - https://github.com/harehare/mq
  - https://github.com/JSON-OBJECT/claude-code/tree/main/plugins/deep-thinking
  - https://stephango.com/file-over-app
  - https://code.claude.com/docs/en/skills
  - https://obsidian.md/changelog/2025-05-21-desktop-v1.9.0/
  - https://www.media.mit.edu/publications/your-brain-on-chatgpt/
- Tech Media
  - https://arstechnica.com/ai/2026/04/anthropic-tested-removing-claude-code-from-the-pro-plan/
  - https://www.mindstudio.ai/blog/andrej-karpathys-llm-wiki-knowledge-base-claude-code
  - https://www.mindstudio.ai/blog/llm-wiki-vs-rag-internal-codebase-memory
  - https://ranksquire.com/2026/03/04/vector-database-pricing-comparison-2026/
  - https://airbyte.com/data-engineering-resources/milvus-database-pricing
- Tooling References
  - https://www.sqlitetutorial.net/sqlite-full-text-search/
  - https://www.codeant.ai/blogs/why-coding-agents-should-use-ripgrep
  - https://dev.to/harehare/mq-the-missing-link-between-jq-and-markdown-bge (community-authored)
  - https://www.obsibrain.com/blog/obsidian-linking-the-complete-guide-to-connecting-your-notes (personal blog)
- Personal Blogs
  - https://maggieappleton.com/ (digital garden essays)
  - https://www.joanwestenberg.com/i-deleted-my-second-brain-692aa40d59d5f06dd5131e43/ (June 2025 PKM critique that opened the mausoleum metaphor)
  - https://notes.andymatuschak.org/Evergreen_notes (Andy Matuschak's three principles: atomic, concept-oriented, densely linked)
- Community Discussions
  - https://www.reddit.com/r/ClaudeAI/comments/1qr19df/ (user-reported `rg` + FTS5 setup)
  - https://github.com/anthropics/claude-code/issues/46727 (community-reported sub-agent hallucinations)



