
April 18, 2026

Building a Persistent Knowledge Base for Claude Code: From Session Amnesia to Compounding Intelligence

A long-form engineering deep-dive into applying Karpathy's LLM-wiki pattern to Claude Code: transcript-replay auto-updates, Obsidian as source of truth, the honest trade-offs of Graphiti and custom builds, and a full survey of where else a personal knowledge base can live in 2026.

Building a Persistent Knowledge Base for Claude Code

Author: Puvaan Shankar
Date: April 2026
License: MIT
Keywords: Claude Code, Knowledge Base, Obsidian, Karpathy LLM Wiki, Graphiti, Transcript Replay, MCP, Basic Memory, Agent Memory, Digital Garden


1. Abstract

LLM coding agents have a chronic problem: every session begins with amnesia. Claude Code opens a repo, re-reads the architecture, re-greps for patterns, and re-discovers the same gotchas that were solved three sessions ago. Retrieval-Augmented Generation (RAG) helps, but it re-derives context on every query rather than accumulating it. In early 2026, Andrej Karpathy proposed an alternative: have the LLM incrementally maintain a wiki — a persistent, cross-referenced markdown knowledge base that sits between raw sources and the agent, so synthesis compounds.

This post documents the full build of that pattern for Claude Code, from the Karpathy gist to a working system in a week. The architecture has three layers (raw codebases, an Obsidian vault, and a schema in CLAUDE.md), two hooks (SessionStart injects the wiki, Stop spawns a background transcript-replay extractor), two slash commands (/wiki-bootstrap, /wiki-lint), and a specialized wiki-curator subagent. It also documents everything I considered and rejected — Graphiti, a custom SQLite+vec+graph tool, and a handful of hosted alternatives — and why. The blog closes with an honest survey of where else a personal knowledge base can live in 2026, including the unusual case of X/Twitter as a wiki.


2. The problem: session amnesia is a tax you pay every day

Open Claude Code in a mid-sized repo. The first 20 minutes look like this:

  1. Read CLAUDE.md (if it exists) — ~500 tokens.
  2. Grep the codebase for “where does X live?” — 5–10 tool calls, 5–10 K tokens of context.
  3. Ask the model to summarize the architecture — ~1 K tokens response.
  4. Start the actual work.

That preamble happens on every session. Over a month of daily use, you are spending 50–80% of each session's opening tokens re-deriving what you already knew yesterday. If you /clear, you pay it again. If you switch between two repos during the day, you pay it twice. If you return from vacation, Claude re-learns the project from scratch as if you’d never worked on it.

Worse, the knowledge is never written down in a form the model can reuse. You might have realized last Tuesday that a specific API endpoint returns unexpected errors for a particular input shape, or that a background job has a silent timeout that only surfaces under load. You fixed the bug. You moved on. Nothing captured that insight in a form your next session will read. Git log has “fix(scope): pin config” — technically correct, operationally useless; it doesn’t tell the model why that configuration matters or what the failure mode looks like.

This is the tax. A persistent knowledge base is the only way to stop paying it.


3. The Karpathy wiki pattern

On February 23, 2026, Andrej Karpathy published a short gist proposing an alternative to RAG. The core insight:

Traditional RAG requires the LLM to rediscover knowledge from scratch on every question. The knowledge isn’t accumulated — it’s re-derived. Asking a question requiring synthesis of five documents means the LLM must find and reassemble relevant fragments each time.

His proposed architecture has three layers:

| Layer | Role |
| --- | --- |
| Raw sources (immutable) | Your curated documents. The codebase, in our case. |
| The wiki (LLM-generated) | Markdown files with summaries, entity pages, and cross-references. The LLM maintains this. |
| Schema document | Configuration telling the LLM how to maintain the wiki. |

And three operations:

  • Ingest: when adding a source, the LLM extracts key information, updates the relevant wiki pages, maintains cross-references, and logs the action.
  • Query: ask questions against the wiki; valuable answers can be filed back as new pages.
  • Lint: periodically health-check for contradictions, orphaned pages, missing cross-references.

The wiki is a persistent, compounding artifact. Cross-references already exist; contradictions are flagged; synthesis reflects everything accumulated. The human curates sources and asks questions; the LLM handles the bookkeeping.

It’s worth noting what this is not: it’s not RAG (no vector retrieval on every query), not a knowledge graph (no triples or SPARQL), not a memory system (no sliding window or summarization). It’s a persistent markdown artifact that the model reads at the start of a session and updates at the end. The simplicity is the point.


4. What I already had (and why I was only a few hooks short)

An audit before building was the right first move. It turned out my existing stack already had most of the pieces:

1. A mature Obsidian vault at ~/obsidian/myvault/. Structured folders (Work/, Notes/Claude/, Personal/), 13 templates (ADR, incident postmortem, TIL, meeting notes), standardized YAML frontmatter, a consistent tag system. Years of curation. Existing notes on Claude Code hooks, tips, and prompts already collected under Notes/Claude/.

2. Per-project CLAUDE.md files across ~30 codebases. Uneven depth — work projects rich, side projects sparse — but the pattern is established.

3. Claude Code’s auto-memory system at ~/.claude/projects/<proj>/memory/. 12 projects were maintaining MEMORY.md files plus specialized feedback files (feedback_issue_creation.md, feedback_mr_template.md). Active, but session-scoped.

4. PreToolUse hooks at ~/programming/claude-code-setup/ blocking .env files and sensitive paths from Read|Grep.

Three layers of knowledge, well-maintained, completely disconnected. Claude Code never read the Obsidian vault on session start. Learnings from sessions never flowed into the vault. The memory feedback files were growing into mini-wikis inside Claude’s own state, duplicating effort.

The missing piece was exactly what Karpathy described: a schema plus two hook integrations, one to inject the vault at SessionStart and one to harvest learnings via the Stop hook. Everything else already existed.


5. The design space: three alternatives, one pick

Before building anything, I mapped the design space into three viable architectures.

5.1 Option A — Obsidian only

Extend ~/obsidian/myvault/ with a new Projects/<repo>/ folder. Each codebase gets _index.md, Gotchas.md, Decisions/, Patterns.md, Glossary.md. SessionStart hook reads these pages and injects into context. Stop hook prompts Claude to append learnings.

Pros: zero infrastructure, human-readable, survives tool churn, works with the Obsidian ergonomics I already use daily.

Cons: depends on Claude’s in-session discipline to update; no temporal reasoning; keyword search only.

5.2 Option B — Graphiti (temporal knowledge graph)

Graphiti is Zep AI’s open-source framework for building temporal knowledge graphs as agent memory. An LLM extracts (subject, predicate, object) triples from unstructured text and stores them in Neo4j with valid-from and valid-until timestamps. When new facts contradict old ones, the old ones are invalidated (not overwritten), preserving history. You can query “what did I believe about X as of January?” and get the old answer.

Pros: automatic entity extraction, bi-temporal reasoning, robust against drift, hybrid search (graph + vector + FTS), MCP server already exists.

Cons: Neo4j running locally (500 MB idle), LLM API costs per ingestion ($0.01–0.05 per session-end), schema-bound (re-ingest on upgrades), not human-readable without a UI.

5.3 Option C — Build a custom “Obsidian + graph” tool

Markdown as source of truth, SQLite-with-extensions as the index:

  • FTS5 for full-text search
  • sqlite-vec for semantic embeddings
  • Graph tables (entities, relations) with temporal columns
  • A file watcher re-indexes on change
  • FastMCP server exposes search, neighbors, as_of, related_files

Pros: no Neo4j, no vendor lock-in, markdown stays portable, full control.

Cons: 3–4 weekends to MVP plus ongoing maintenance; the “build vs. buy” tax; Basic Memory MCP already does roughly 80% of this.

5.4 The pick

Option A. For a handful of codebases with a few thousand facts, the value of a temporal graph is theoretical. Keyword search over markdown is fine. The real problem is reliable session-end updates, not query power — and that can be solved without a graph DB.

Option B remains available as a Phase 2 parallel index if the Obsidian-only version hits its ceiling. Option C stays shelved until I’ve validated the gap with Basic Memory first.

This is the build-vs-buy discipline: don’t build what you can’t yet articulate a specific need for. Ship the minimum, run it for a month, then decide from data.


6. Architecture

┌──────────────────────────────────────────────────────────────┐
│  Raw sources (immutable)                                     │
│  ~/work/*,  ~/programming/*   — codebases                    │
└──────────────────────────────────────────────────────────────┘
                          ↑ read on demand
┌──────────────────────────────────────────────────────────────┐
│  The wiki (LLM-maintained, human-reviewed)                   │
│  ~/obsidian/myvault/Projects/<repo>/                         │
│    ├─ _index.md         Architecture overview + wikilinks    │
│    ├─ Gotchas.md        Append-only list of non-obvious traps│
│    ├─ Decisions/        ADRs per non-obvious choice          │
│    ├─ Patterns.md       Codebase-specific patterns           │
│    ├─ Glossary.md       Domain terms                         │
│    └─ _pending.md       Auto-drafted staging file            │
└──────────────────────────────────────────────────────────────┘
                          ↑ consulted first at SessionStart
┌──────────────────────────────────────────────────────────────┐
│  The schema (how-to-maintain)                                │
│  ~/.claude/CLAUDE.md      — "Project Wiki" section           │
│  Per-project CLAUDE.md    — points to vault pages            │
└──────────────────────────────────────────────────────────────┘

Three mechanisms drive it:

  1. SessionStart hook — injects _index.md + Gotchas.md as additionalContext so Claude is oriented before reading any code.
  2. Stop hook + background extractor — reads the session transcript, drafts wiki-worthy learnings to _pending.md via a headless claude -p call, and exits without blocking.
  3. Slash commands and a subagent: /wiki-bootstrap seeds a new project; /wiki-lint audits for drift; the wiki-curator agent handles all other vault operations so the main thread stays clean.

7. Implementation

7.1 The SessionStart hook

Every new session, resumption, or /clear fires a script that resolves the current repo, reads the matching wiki folder, and emits it as additional context. Hard-capped at ~2 K tokens (8 KB) so it never bloats the session.

#!/bin/bash
# ~/programming/claude-code-setup/wiki-session-start.sh

set -uo pipefail
INPUT=$(cat)
VAULT="$HOME/obsidian/myvault/Projects" INPUT="$INPUT" \
/usr/bin/python3 <<'PYEOF'
import json, os, subprocess, sys

# Byte cap (~2 K tokens). Using bytes avoids a tokenizer dependency;
# UTF-8 prose averages ~4 bytes/token so 8192 bytes ≈ 2 K tokens.
MAX_BYTES = 8192
payload = json.loads(os.environ.get("INPUT", "{}"))
cwd = payload.get("cwd") or os.getcwd()

# Resolve repo name: git root basename
try:
    r = subprocess.run(["git", "-C", cwd, "rev-parse", "--show-toplevel"],
                       capture_output=True, text=True, timeout=2)
    repo = os.path.basename(r.stdout.strip()) if r.returncode == 0 else None
except Exception:
    repo = None

if not repo:
    sys.exit(0)  # Not a git repo — silent exit

wiki_dir = os.path.join(os.environ["VAULT"], repo)
index = os.path.join(wiki_dir, "_index.md")
gotchas = os.path.join(wiki_dir, "Gotchas.md")
pending = os.path.join(wiki_dir, "_pending.md")

parts = []
if os.path.exists(index):
    parts.append(f"## Project wiki: {repo} (from Obsidian)\n")
    parts.append(f"Source: `~/obsidian/myvault/Projects/{repo}/`\n")
    parts.append("### _index.md\n")
    parts.append(open(index).read())
    if os.path.exists(gotchas):
        parts.append("\n### Gotchas.md\n")
        parts.append(open(gotchas).read())
    if os.path.exists(pending):
        parts.append("\n### _pending.md (unreviewed)\n")
        parts.append(open(pending).read())
else:
    parts.append(f"## No wiki for `{repo}` yet. Run `/wiki-bootstrap`.")

context = "\n".join(parts)
if len(context.encode("utf-8")) > MAX_BYTES:
    context = context.encode("utf-8")[:MAX_BYTES].decode("utf-8", errors="ignore")
    context += "\n\n[...truncated — see Obsidian]"

print(json.dumps({
    "hookSpecificOutput": {
        "hookEventName": "SessionStart",
        "additionalContext": context
    }
}))
PYEOF
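For reference, the script is wired up under the hooks key in ~/.claude/settings.json. This is a sketch of that registration (field names follow the hooks format as I understand it; check the current hooks documentation before copying):

```json
{
  "hooks": {
    "SessionStart": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "$HOME/programming/claude-code-setup/wiki-session-start.sh"
          }
        ]
      }
    ]
  }
}
```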

7.2 The Stop hook and transcript replay

The Stop hook’s only job is to not block session exit. It spawns the extractor in the background with nohup ... &; disown and returns immediately:

#!/bin/bash
# ~/programming/claude-code-setup/wiki-session-stop.sh

set -uo pipefail
INPUT=$(cat)
EXTRACTOR="$HOME/programming/claude-code-setup/wiki-extract-learnings.sh"
LOG_DIR="$HOME/programming/claude-code-setup/logs"
mkdir -p "$LOG_DIR"

[ ! -x "$EXTRACTOR" ] && exit 0

LOG_FILE="$LOG_DIR/wiki-extract-$(date +%Y%m%d-%H%M%S).log"
(
  echo "$INPUT" | nohup "$EXTRACTOR" >"$LOG_FILE" 2>&1 &
  disown
) >/dev/null 2>&1 &

exit 0

The extractor (running detached) is where the real work happens. It reads the transcript JSONL, strips tool calls and results (too noisy, too large), builds a compact view of just user and assistant messages, caps it at ~10 K tokens (40 KB), and feeds it to a headless claude -p call with a narrow prompt:

You are extracting durable, wiki-worthy learnings from a Claude Code session transcript into Obsidian notes. Rules: only include non-obvious architectural insights, decision rationale, and gotchas that aren’t derivable from code. Skip file paths, function signatures, recent change history. Avoid duplicates against the existing Gotchas.md (provided). If nothing qualifies, output exactly NO_LEARNINGS.

The output is written to Projects/<repo>/_pending.md. At the next session start, the hook reads _pending.md along with the rest of the wiki and flags it to the user for review. Nothing auto-merges into Gotchas.md or the ADR folder — that’s a deliberate human-in-the-loop step.
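For concreteness, a staged entry in _pending.md might look like this (the format is illustrative, not the actual schema, which lives in the curator conventions):

```markdown
## Pending learnings (2026-04 session)
<!-- drafted by the extractor; review before merging into the wiki -->

- **Candidate gotcha:** the background job drops items silently once the queue
  passes its threshold; only surfaced under load testing.
  - Suggested target: Gotchas.md
  - Status: needs-review
```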

This decouples update quality from Claude’s in-session discipline. The extractor has one job, the full transcript, a narrow prompt, and a staging target. It’s far more reliable than asking mid-session Claude to remember to file things.
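The filtering-and-capping step can be sketched in Python. The transcript-entry shape here (a top-level type plus message.content) is an assumption inferred from what the extractor needs; Claude Code's actual JSONL schema may differ:

```python
import json

MAX_BYTES = 40_960  # ~10 K tokens at ~4 bytes/token, matching the extractor's cap

def compact_transcript(jsonl_lines):
    """Keep only user/assistant text; drop tool calls and tool results."""
    parts = []
    for line in jsonl_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines rather than fail the whole run
        if entry.get("type") not in ("user", "assistant"):
            continue
        content = entry.get("message", {}).get("content", "")
        if isinstance(content, list):
            # Content blocks: keep text blocks, skip tool_use / tool_result noise
            content = " ".join(b.get("text", "") for b in content
                               if isinstance(b, dict) and b.get("type") == "text")
        if isinstance(content, str) and content.strip():
            parts.append(f"{entry['type']}: {content.strip()}")
    text = "\n".join(parts)
    # Byte cap mirrors the SessionStart hook: truncate, then drop any split rune
    return text.encode("utf-8")[:MAX_BYTES].decode("utf-8", errors="ignore")
```

The compacted string is what gets piped into the headless claude -p call along with the prompt above.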

7.3 Slash commands

Two user commands live in ~/.claude/commands/:

  • /wiki-bootstrap — seeds a new repo’s wiki. Reads the project’s CLAUDE.md, scans .claude/projects/*/memory/feedback_*.md, greps for TODO|FIXME|HACK|XXX (with 2 lines of context), scans the last 100 commits for fix(...) and refactor(...) rationale, synthesizes _index.md + Gotchas.md, and shows the drafts for approval before writing.
  • /wiki-lint — audits an existing wiki. For every page: verifies referenced file paths still exist; greps for function and class names in backticks and flags zero-match references; checks ADRs whose referenced code has moved; finds orphan wikilinks. Produces a punch list; never auto-fixes. Sampled mode for large vaults.
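One of those lint checks, orphan-wikilink detection, is simple enough to sketch. This is a minimal illustration, not the command's actual implementation, and find_orphan_links is a hypothetical helper:

```python
import glob
import os
import re

# Captures the target of [[Target]] or [[Target|alias]] or [[Target#heading]]
WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def find_orphan_links(wiki_dir):
    """Flag [[wikilinks]] that point at no existing note in the wiki folder."""
    paths = glob.glob(os.path.join(wiki_dir, "**", "*.md"), recursive=True)
    notes = {os.path.splitext(os.path.basename(p))[0] for p in paths}
    orphans = []
    for path in paths:
        text = open(path, encoding="utf-8").read()
        for target in WIKILINK.findall(text):
            if target.strip() not in notes:
                orphans.append((os.path.relpath(path, wiki_dir), target.strip()))
    return orphans
```

The real command layers on the path and function-name checks, but the shape is the same: scan, cross-check, emit a punch list, never auto-fix.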

7.4 The wiki-curator subagent

The main thread shouldn’t do vault operations inline — it bloats context with file reads and frontmatter validation. A specialized subagent at ~/.claude/agents/wiki-curator.md handles it all: query, append gotcha, create ADR, update _index.md, lint, migrate from memory/ feedback files, review _pending.md. The agent has tight conventions (vault layout, frontmatter schema, what-goes-where rules) and returns a compact micro-report:

Action: appended 1 gotcha to Gotchas.md
Files touched: ~/obsidian/myvault/Projects/<your-repo>/Gotchas.md
Key finding: background job silently drops items when queue exceeds threshold (now documented with workaround)
Follow-ups: none

The main thread sees the micro-report; the agent absorbs the token cost of reading three vault files, checking for duplicates, and formatting the frontmatter. Classic delegation pattern.

7.5 Schema: the global CLAUDE.md update

The final piece is writing down the rules so Claude knows what to do. A new “Project Wiki” section in ~/.claude/CLAUDE.md specifies:

  • Where the vault lives.
  • When to write: non-obvious architecture, decision rationale, gotchas, codebase-specific patterns.
  • When NOT to write: anything derivable from code, git history, task state, or already captured in CLAUDE.md.
  • How the Stop hook and _pending.md staging file work.
  • How to invoke /wiki-bootstrap and /wiki-lint.
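Condensed, the section reads something like this (paraphrased; the real file is longer):

```markdown
## Project Wiki

- Vault: ~/obsidian/myvault/Projects/<repo>/
- Write: non-obvious architecture, decision rationale, gotchas, repo-specific patterns.
- Never write: anything derivable from code, git history, or task state.
- Session end: the Stop hook drafts candidates into _pending.md; review before merging.
- Commands: /wiki-bootstrap (seed a new repo), /wiki-lint (audit for drift).
```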

This is the “schema document” Karpathy described — the configuration telling the LLM how to maintain the wiki.


8. Token accounting: is this actually efficient?

A fair question. Every injection costs tokens. Let’s do the math.

A note on units: the scripts cap context in bytes for simplicity (no tokenizer dependency). UTF-8 prose averages ~4 bytes/token, so 8 KB ≈ 2 K tokens and 40 KB ≈ 10 K tokens. All size estimates below include both units for clarity.

Per-session cost (what you pay during the session):

| Moment | What loads | Size |
| --- | --- | --- |
| SessionStart | _index.md + Gotchas.md | ≤2 K tokens (8 KB, capped) |
| Mid-session | Claude reads code only for specifics | Saves ~5–25 K tokens of re-exploration |
| Stop | Background spawn only | 0 tokens in-session |

Background cost (decoupled from the session):

| Moment | What runs | Cost |
| --- | --- | --- |
| Session ends | wiki-extract-learnings.sh spawns claude -p | ~10 K tokens / ~$0.02 per session-end |

Net: a single _index.md injection saves the model from doing 5–10 Read/Grep calls to orient itself. On a typical session that previously burned ~5–12 K tokens on “where does this service live? what does that module do? what’s the request flow?” the injection collapses that to ~500 tokens of pre-digested context. For a developer running 5–10 sessions per day across 10+ repos, the monthly savings are substantial and the background extraction cost is a rounding error.

More importantly, this is asymptotically better than RAG: RAG pays retrieval cost on every query; this pays ingest cost once per session and retrieval is a single file read. The cost curve flattens as the wiki grows.

One caveat: the first few bootstrapped projects will have sparse wikis, and the injection won’t yet pay for itself. Value compounds over weeks. This is an investment, not an optimization.
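As a sanity check on those claims, here is the monthly arithmetic using the post's own estimates (every figure is a rough midpoint, not a measurement):

```python
# All figures are this post's rough estimates, taken at midpoints.
re_exploration = 8_500   # tokens spent orienting without a wiki (5-12 K range)
injection = 500          # pre-digested context read at SessionStart instead
sessions_per_day = 7     # midpoint of 5-10 sessions/day
work_days = 22

in_session_saving = (re_exploration - injection) * sessions_per_day * work_days
background_dollars = 0.02 * sessions_per_day * work_days  # ~$0.02 per extraction

print(in_session_saving)             # tokens/month kept out of interactive context
print(round(background_dollars, 2))  # monthly dollar cost of background extraction
```

Roughly 1.2 M interactive tokens saved per month against a few dollars of background extraction, which is why the extraction cost reads as a rounding error.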


9. What could go wrong (and mitigations)

I don’t want this to read like a success story. There are real risks.

The wiki could become stale. The codebase moves faster than the wiki. File paths get renamed, decisions get overturned, and the wiki still says the old thing. Mitigation: /wiki-lint on a cadence (quarterly, or after large refactors). More importantly, the “never write file paths or function signatures” rule means most wiki content is architectural — it stays true through renames.

The Stop hook’s extractor could drift in quality. The prompt decides what counts as “wiki-worthy.” A bad prompt produces noise; a too-strict prompt produces NO_LEARNINGS when there was real value. Mitigation: _pending.md is a staging file reviewed by a human, not an auto-merge. The first 2–3 weeks of output are training data for the prompt.

The vault could bloat. If every session produces a new ADR, the Decisions/ folder becomes unnavigable. Mitigation: the prompt explicitly rejects decisions that belong in Gotchas.md (shorter, bullet form) and penalizes duplicates.

Claude might hallucinate from the wiki. If _index.md is wrong, Claude confidently repeats the wrongness. Mitigation: lint catches stale references to code; the wiki’s role is orientation, not authority. Code is still the source of truth for specifics.

Background extraction could silently fail. No one notices for weeks. Mitigation: log file per run in ~/programming/claude-code-setup/logs/; occasional ls -la shows whether extractions are happening. Phase 2 could add a systemd-style health check.

Vault-as-git-repo conflicts. If the vault is a git repo and the Stop hook writes _pending.md during an active Obsidian sync, you can get merge conflicts. Mitigation: the extractor writes atomically (temp file + mv), and _pending.md is excluded from any vault sync.
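The atomic-write pattern from that last mitigation is worth making concrete. A minimal Python sketch (the helper name is mine, not from the actual extractor):

```python
import os
import tempfile

def atomic_write(path: str, text: str) -> None:
    """Write via a temp file in the target directory, then rename into place.

    os.replace is atomic on POSIX, so a concurrent reader (Obsidian, a sync
    client) sees either the old file or the new one, never a half-write.
    The temp file lives in the same directory so the rename stays on one
    filesystem, which is what makes it atomic.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".pending-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(text)
        os.replace(tmp, path)  # atomic swap; tmp ceases to exist on success
    except BaseException:
        os.unlink(tmp)  # never leave stray temp files in the vault
        raise
```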


10. The build-vs-buy survey: where else could a knowledge base live in 2026?

The user-facing question at the end of this build was: “Where else could I put a wiki?” Because Obsidian is one answer, not the only answer. Here’s the honest survey.

10.1 Local-first markdown tools

Obsidian remains the gold standard. Local plaintext, offline-first, extensive plugin ecosystem, trivial for Claude Code to read via file paths. No native MCP but that’s a half-day of glue code. Best for: anyone whose knowledge base is already here, control-oriented users, academic writers.

Logseq is the outliner alternative: block-based, free, open-source, native PDF annotation. Agent access works the same (file-based) but block structure is noisy to extract whole-note summaries. Best for: daily journalers, bottom-up knowledge construction.

Dendron is Obsidian inside VS Code — hierarchical markdown, same file-based agent access. No mobile app. Best for: developers who refuse to leave the editor, projects with hierarchical API docs.

Foam, Reor, SilverBullet are niche. Foam is VSCode-based Obsidian-lite, Reor adds local AI search to markdown but isn’t mature, SilverBullet has embedded Lua for scripted notes. Work fine with agent file access but aren’t designed for it.

Verdict: Obsidian for ecosystem depth. Logseq for outliners. Dendron for editor-centric workflows. Skip the rest unless you have a specific need.

10.2 Hosted / cloud knowledge tools

Notion dominates this category. Flexible, mature API, official MCP server in the registry. Agents can read/write pages and query databases. Token efficiency depends on your schema — bloated pages with long property lists get expensive. Best for: teams on Notion already, collaborative editing, anyone who needs a polished UI without self-hosting.

Roam Research pioneered bidirectional linking and the “second brain” category. Block-level, dense interconnection. No official MCP but API exists. $15/mo minimum. Best for: people already using Roam as their thinking model.

Tana is Roam’s evolution with supertags (typed data). Bridges Roam’s freeform feel with Notion’s structure. Agent integration unclear. Best for: early adopters of structured outlining.

Mem.ai is AI-native — organizes your notes with LLM embeddings automatically. No open API. Not agent-accessible for external tools. Best for: passive users who don’t want to organize, useless for agent-driven workflows.

Craft, RemNote, Capacities are design-forward niche tools. Craft for Apple users, RemNote for spaced-repetition learners, Capacities for object-oriented knowledge. None are agent-accessible to speak of.

Verdict: Notion is the only mature option here for agent access. Everything else is proprietary with no hooks. If you’re already on Roam, staying is fine; migrating for agent access isn’t worth it yet.

10.3 Git-native approaches

GitHub wiki (built-in to every repo) is free, version-controlled, Markdown in a wiki branch. Claude Code reads it directly via the GitHub MCP server. No search UI, small attack surface. Best for: open-source projects, team onboarding, API references alongside code.

Repo-based /docs — just a folder in the repo. No tooling. Version-controlled alongside the code. Perfect for inline ADRs, runbooks, and CONTRIBUTING-adjacent docs. Agent reads it via file access. Underrated.

Static site generators (Quartz, Astro + Starlight, Docusaurus) build a searchable site from Markdown. Production-grade, free hosting on GitHub Pages. Agents read the source markdown via git clone; the built site is for humans.

  • Quartz for digital gardens with backlinks and graph view; minimal JS.
  • Starlight for modern docs with great accessibility.
  • Docusaurus for versioned, React-based docs.

Verdict: /docs in git for team knowledge, Quartz for personal digital gardens, Starlight/Docusaurus for public docs. None have APIs, but all are readable via file fetch once cloned — which is often enough for a coding agent.

10.4 Self-hosted wikis

Wiki.js is the modern lightweight pick — Node.js, beautiful UI, strong auth. API available. Best for: teams wanting polish without MediaWiki complexity.

BookStack organizes as Books → Chapters → Pages. PHP-based, clean, great for internal corporate docs. No official API but the database is yours. Best for: structured corporate wikis.

Outline is modern and collaborative with a documented REST API. Self-hostable. Best for: teams wanting a Notion-like experience without vendor lock-in.

TiddlyWiki is a single-file wiki — your entire knowledge base is one HTML file. Ultra-portable, offline-first, vintage-feeling but bulletproof. Best for: solo users, travelers, minimalists.

MediaWiki is the WordPress of wikis. Powers Wikipedia. Overkill for most. Best for: large communities, archival knowledge, projects needing Wikipedia-scale infrastructure.

DokuWiki is lightweight, file-based (no database), PHP. Simple, ancient, stable. Best for: sysadmins already running PHP.

Verdict: Wiki.js or Outline for teams. BookStack for structured internal docs. TiddlyWiki for solo users who want portability. Skip MediaWiki/DokuWiki unless you already run them.

10.5 Agent-native and MCP-native systems

This is the category that’s moved fastest in 2026.

Basic Memory — MCP server that treats local markdown as a knowledge graph. Entity extraction, semantic search, already Claude-Code-native. Closest off-the-shelf equivalent to what this blog post builds from scratch.

Mem0 is the most mature agent memory system in 2026. Combines vector embeddings with optional knowledge graphs, supports multi-LLM backends, official MCP server. Hosted or self-hosted. Compliance features for teams. Best for: personalization agents, long-running bots, regulated environments.

Zep / Graphiti models memory as temporal knowledge graphs with validity windows. REST API. Best for: agents that need to track how facts change over time.

Letta is an agent framework where the agent explicitly manages its own memory using dedicated tools. More transparent than automatic extraction. Best for: long-running stateful agents where explicit memory control matters.

OpenMemory, SuperMemory, and smaller projects exist but are less mature. Revisit in a year.

Verdict: Basic Memory if you want the wiki pattern pre-built. Mem0 for managed agent memory at scale. Zep/Graphiti when your domain is genuinely temporal. Letta for agent-owned memory.

10.6 The unconventional: “public thinking” and X/Twitter as a wiki

This is where the user asked — “where else besides Obsidian, maybe X?” — and it deserves an honest answer.

People do use X/Twitter as a quasi-knowledge base. Build-in-public threads, TILs, permalinks to your own insights. Tools like TweetThreadSaver let you archive threads and export to Markdown or LLM-friendly JSON. Each tweet has a permanent URL.

The strengths are real: high visibility, real-time discussion, audience building, accountability (writing publicly forces clarity), and permalinks that survive across devices.

The weaknesses are also real:

  • Character limits destroy nuance. A thread is a hack around a fundamental constraint.
  • Search is crippled. X’s native search misses your own tweets from three months ago.
  • Algorithmic decay. Old tweets are functionally invisible.
  • Privacy is public-by-default. Every rough draft is indexed by Google.
  • Threading is nonlinear and fragile. Edit history is limited; tweets can vanish.
  • Agent access requires the API, which is rate-limited and now requires paid tiers.

Verdict: use X for broadcasting insights and building audience, not as primary storage. The discoverability is real but retrieval is poor. Treat X as an output channel, not a database. Pair it with a proper knowledge base that you export from (Obsidian, Notion, a blog) to X, and archive your own threads into structured storage for later LLM retrieval.

Dev.to, Medium, personal blogs are strictly better for this: longer-form, searchable, archival, often with APIs. If you’re going to publish, publish somewhere permanent. (This blog you’re reading is an Astro static site — the source markdown lives in git, which means a coding agent can read every post with git clone and grep. That’s a knowledge base by accident, and a good one.)

10.7 Specialized: scientific and academic

Zettlr is a Markdown editor built for academic writing with native Zotero integration via BibTeX. Graph view, Zettelkasten support, Pandoc-powered exports. Best for: researchers, dissertation writers, citation-heavy workflows.

Zotero is the reference manager — stores PDFs, annotations, metadata. Not a knowledge base alone, but the foundation for citation-driven work. Obsidian + Zotero is a common pairing.

Verdict: Zettlr + Zotero for anyone doing real research. Agent access requires file-system reads, not an API.

10.8 Database-backed

SQLite + FTS5 + sqlite-vec is the minimalist’s dream. Single file, full-text search, vector embeddings via a Rust extension. Direct SQL from the agent. Excellent token efficiency because you filter before sending to Claude. Best for: solo developers, embedded workflows, portable knowledge bases.
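A minimal taste of that stack using only Python's standard library (this assumes your sqlite3 build ships with FTS5, which recent CPython builds do; the table and content are illustrative):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE notes USING fts5(title, body)")
con.executemany("INSERT INTO notes VALUES (?, ?)", [
    ("Gotchas", "background job silently drops items when the queue exceeds its threshold"),
    ("Decisions", "pinned the client library to dodge an upstream regression"),
])
# MATCH runs full-text search over all columns; rank orders by BM25 relevance
rows = con.execute(
    "SELECT title FROM notes WHERE notes MATCH ? ORDER BY rank", ("queue",)
).fetchall()
print(rows)
```

The agent-facing win is the filtering: the SQL runs locally and only the handful of matching rows ever reach the model's context.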

Postgres + pgvector if you already run Postgres. Vector similarity search alongside regular relational queries. Best for: teams that have infrastructure, anything beyond a single machine.

Vector databases (Qdrant, Chroma, Weaviate):

  • Qdrant — open-source, Rust-based, strong metadata filtering. Best for large-scale retrieval.
  • Chroma — Python-native, in-process or server. Best for LLM RAG pipelines.
  • Weaviate — integrates vector search with knowledge graphs. Best for enterprise semantic search.

Verdict: SQLite + sqlite-vec for portability. Postgres + pgvector if Postgres is already running. Chroma for quick RAG. Qdrant for scale. Weaviate for graph-adjacent workflows. None of these are knowledge bases — they’re retrieval engines you wrap.

10.9 The summary table

| Use case | Pick | Why |
| --- | --- | --- |
| Solo developer, markdown purist, offline-first | Obsidian + git sync | Local ownership, mature, zero SaaS |
| Team collaboration, content management | Notion | Mature MCP, scales, collaborative |
| Open-source project docs | GitHub wiki + Docusaurus | Free, versioned, beautiful |
| Personal digital garden | Quartz + git | Aesthetic backlinks, ownership |
| Internal corporate wiki | BookStack / Wiki.js | Self-hosted, structured, no lock-in |
| Agent memory / long-running bots | Mem0 or Letta | Built for agents, stateful, managed |
| Academic writing + citations | Zettlr + Zotero | Citation management, pro output |
| Vector search at scale | Qdrant or Postgres+pgvector | Mature, self-hostable |
| Ultra-minimalist stack | SQLite + sqlite-vec | Single file, no servers |
| Building in public | X/Twitter, then export to proper storage | Audience, not retrieval |
| Want the wiki pattern pre-built | Basic Memory MCP | Close match, already MCP-native |

The honest conclusion: no single tool fits everyone. Obsidian for locals, Notion for teams with MCP needs, database-backed for AI-heavy pipelines. The best knowledge base is the one you’ll actually maintain and feed — discipline matters more than the stack.


11. A specialized agent for knowledge-base operations

The main thread shouldn’t curate the vault inline — it’s a narrow, detail-heavy job. I built a wiki-curator subagent at ~/.claude/agents/wiki-curator.md that specializes in exactly seven operations:

  1. Query — given a topic + repo, find relevant pages and return a compact summary.
  2. Append gotcha — check for duplicates; if new, append to Gotchas.md with what/why/workaround.
  3. Create ADR — fill the existing Architecture Decision Record.md template and cross-link.
  4. Update _index.md — edit existing sections, never duplicate.
  5. Lint — verify file paths and function names against the codebase; produce a punch list.
  6. Migrate from memory/ — sweep feedback files, classify, promote the durable ones.
  7. Review _pending.md — classify auto-drafted suggestions as accept/reject/needs-user.

The agent’s prompt explicitly forbids:

  • Writing anything derivable from code.
  • Touching Credentials.md or files under Template/.
  • Using Write when Edit would do (never rewrite a whole file when you haven't read the content you'd be clobbering).
  • Reporting lint drift without actually grepping (a function might have moved, not been deleted).

Return format is a four-line micro-report: Action / Files touched / Key finding / Follow-ups. The main thread sees the report; the agent absorbs the token cost of reading three files, validating frontmatter, and formatting output. Classic delegation pattern that keeps both contexts clean.
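For reference, a trimmed sketch of what the agent file can look like. The frontmatter keys follow Claude Code's subagent format; the prompt wording and tool list here are illustrative, not the exact file:

```markdown
---
name: wiki-curator
description: Curates the Obsidian project wiki — query, append gotchas,
  create ADRs, lint, migrate memory/, review _pending.md.
tools: Read, Grep, Glob, Edit
---

You maintain the wiki under ~/obsidian/myvault/Projects/.

Rules:
- Never write anything derivable from code.
- Never touch Credentials.md or anything under Template/.
- Prefer Edit over Write; read a file before modifying it.
- Verify lint findings with Grep before reporting drift.

Always reply with exactly four lines:
Action: <what you did>
Files touched: <paths>
Key finding: <one sentence>
Follow-ups: <list, or "none">
```

Keeping the return contract in the prompt itself is what makes the four-line micro-report reliable; the main thread never has to re-specify it.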


12. What’s next (Phase 2 candidates)

Phase 1 ships as above. I plan to run it for 4–6 weeks before evaluating these candidates:

A. Install Basic Memory MCP as a parallel experiment. See whether its auto-extraction + graph layer changes my queries. If it’s 80% of what I want, keep it and skip Graphiti. If it’s 100%, I saved a month of custom building.

B. Add Graphiti as a secondary index. Only if specific temporal queries start feeling necessary (“what did I believe about service X in Q1?”). Obsidian stays as source of truth; Graphiti ingests transcripts in parallel.

C. Build a custom SQLite+vec tool. Only if Basic Memory and Graphiti both miss something I can articulate concretely. Speculative builds waste weekends; targeted builds ship.

D. Cross-repo pattern surfacing. If the same gotcha appears in three repos (e.g., “GCP region availability quirks”), promote it from per-project Projects/<repo>/ to Notes/GCP/. The wiki-curator can detect this with duplicate analysis.

E. Vault-as-git-repo with pre-commit hooks. Commit the vault on every _pending.md write. Gives an audit trail and rollback. Low cost, high safety.

F. An Obsidian plugin to visualize wiki drift. Heatmap of which pages lint-failed in the last run. Purely nice-to-have.
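Candidate E is only a few lines of shell. A hypothetical helper (the function name, vault path, and commit-message format are my assumptions) that stages the wiki root and commits only when something actually changed:

```shell
# Hypothetical auto-commit helper for candidate E: stage the wiki root,
# commit only if the staged tree differs from HEAD (no empty commits).
vault_autocommit() {
  vault_dir="$1"
  (
    cd "$vault_dir" || exit 1
    git add Projects/
    # `git diff --cached --quiet` exits 0 when nothing is staged.
    git diff --cached --quiet || \
      git commit -q -m "wiki: auto-commit $(date -u +%Y-%m-%dT%H:%M:%SZ)"
  )
}

# e.g. called from the Stop hook right after _pending.md is written:
# vault_autocommit "$HOME/obsidian/myvault"
```

Running it twice without intervening edits produces one commit, not two, which keeps the audit trail readable.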


13. Closing thoughts

The Karpathy pattern is, in retrospect, obvious. LLMs are synthesis engines, not knowledge stores. Giving them a place to compound what they’ve synthesized converts every session into an investment instead of a fixed cost. Markdown is the right substrate because it survives tool churn — five years from now, Obsidian might not exist, but the files will still open.

The meta-lesson is build-vs-buy discipline. Every piece of this system was one small decision: (1) Obsidian is already running, so reuse it; (2) Graphiti is overkill for current scale, so skip it; (3) a custom tool was tempting, but Basic Memory exists, so validate the gap first; (4) the hardest problem was reliable updates, not query power — solved by transcript replay, not by a graph database. Each "no" saved weeks.

The framing I keep coming back to is: session amnesia is a tax you didn’t know you were paying. Once you start paying attention to it, you can’t stop noticing. Every time Claude re-greps the codebase on a fresh session, every time you explain the same architecture to the same model, every time you /clear and feel the room go cold — that’s the tax. The wiki is how you stop.

If you’re building LLM coding workflows in 2026, you should have some version of this. It doesn’t have to be Obsidian, doesn’t have to be this architecture. But somewhere between the codebase and the model, there needs to be a layer that accumulates, or you’ll re-derive your way through the same twelve gotchas every Monday morning forever.


14. Appendix: the complete file layout

~/.claude/
├── CLAUDE.md                    # Global schema — includes "Project Wiki" section
├── settings.json                # SessionStart + Stop hooks wired
├── agents/
│   └── wiki-curator.md          # Specialist subagent
└── commands/
    ├── wiki-bootstrap.md        # Seed a new repo's wiki
    └── wiki-lint.md             # Audit for drift

~/programming/claude-code-setup/
├── wiki-session-start.sh        # SessionStart hook — injects vault
├── wiki-session-stop.sh         # Stop hook — fire-and-forget
├── wiki-extract-learnings.sh    # Background transcript replay
└── logs/                        # Extraction run logs

~/obsidian/myvault/
├── CLAUDE.md                    # Vault-specific conventions
├── Template/                    # Existing templates (ADR, TIL, etc.) — unchanged
└── Projects/                    # NEW — the wiki root
    ├── README.md                # How the system works
    └── <repo>/                  # Per-codebase
        ├── _index.md
        ├── Gotchas.md
        ├── Decisions/
        ├── Patterns.md
        ├── Glossary.md
        └── _pending.md          # Staging file from Stop hook
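For completeness, the hook wiring in `settings.json` looks roughly like this. The command paths mirror the layout above; treat the schema as a sketch against Claude Code's current hooks format rather than a copy of my exact file:

```json
{
  "hooks": {
    "SessionStart": [
      {
        "hooks": [
          { "type": "command", "command": "~/programming/claude-code-setup/wiki-session-start.sh" }
        ]
      }
    ],
    "Stop": [
      {
        "hooks": [
          { "type": "command", "command": "~/programming/claude-code-setup/wiki-session-stop.sh" }
        ]
      }
    ]
  }
}
```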


This post documents a real build shipped in April 2026. Source files, hook scripts, and agent definitions are in my personal dotfiles and available on request. If you build a variation, I’d genuinely like to hear how it went — the community around agent knowledge bases is still forming, and every implementation is a data point.