Why This Matters
If you use AI to research anything serious (tax law, immigration, financial planning, career strategy, health decisions), you will end up with hundreds of files. Some of them are good. Some are outdated. Some are duplicates. Some are AI-generated with no source attribution. You will not know which is which.
I ended up with 1,150 markdown files after six weeks of AI-assisted research. Searching for "Beckham Law 401k" returned five files. None of them declared itself the answer. Dated task lists sat next to canonical knowledge docs. There was no way to tell if the information was from last week or six weeks ago.
The system I describe here solves three problems: knowing which file is authoritative on a topic, knowing when information went stale, and being able to search by concern instead of by folder. If your vault is similar to mine, this is the structure that made it usable again.
Folder Structure
Numbered top-level folders, grouped by domain. The numbers enforce sort order and make it easy to navigate. Adapt the domain names to your life; the pattern matters more than the specific labels.
Three rules that kept my vault clean: nothing gets deleted (losers move to Archive with a manifest), working artifacts live separately from knowledge (a gate check from February is not research), and one canonical file per major topic.
This vault is one layer of a three-layer system: Google Drive holds source documents (PDFs, contracts, tax forms), the vault holds extracted knowledge, and a workspace holds operational scripts and configs. The folder numbers mirror across all three so the same mental model works everywhere.
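For concreteness, the shape looks something like this (the labels below are illustrative placeholders, not my actual folder names; the numbered-prefix pattern is the point):

```
00-Inbox/
10-Tax/
20-Immigration/
30-Finance/
40-Career/
50-Health/
90-Archive/
```

The same numeric prefixes repeat in Drive and in the workspace, so "30-something" means finance everywhere.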
Freshness Metadata
This is the highest-value change you can make. Every research file carries YAML frontmatter that tells you at a glance whether to trust it.
Status is time-based: current if the content was verified within 14 days, needs-review within 30, stale beyond that. Confidence depends on source quality; government publications and court rulings get high, AI-generated research without manual verification gets low. Source tracks where the information came from so you know whether it was verified against primary material.
Once files have this metadata, Dataview can surface everything that went stale. Instead of trusting a file because it exists, you trust it because the frontmatter tells you when it was last checked and how confident you should be.
From my "Beckham Law 2026 Updates" file:

```yaml
---
confidence: high
source: secondary
status: current
tags: [move-critical, needs-lawyer, tax-spain]
---
```
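Those thresholds are easy to encode. A minimal sketch, assuming each file's frontmatter carries a `verified` date (the field name is my assumption, suggested by the dashboard's Verified column):

```python
from datetime import date

def freshness_status(verified: date, today: date) -> str:
    """Map days since last verification onto the three status values."""
    age_days = (today - verified).days
    if age_days <= 14:          # verified within 14 days
        return "current"
    if age_days <= 30:          # verified within 30 days
        return "needs-review"
    return "stale"              # anything older
```

Run this over every file's frontmatter and you have the feed for the stale list on the dashboard.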
Canonical Authority
When the same topic lives in multiple files, nobody knows which one to trust. The fix is simple: pick a winner and label it.
The canonical file gets an info banner at the top: "This is the authoritative document on [Topic]." Every other file on that topic gets a redirect: "See [[Canonical File]]." When I search for a topic and land on a secondary file, the first thing I see is where the real answer lives. No guessing.
```yaml
confidence: high
status: needs-review
```
This works because Obsidian's callout syntax (> [!info] and > [!tip]) renders as colored banners that are impossible to miss. The reader knows immediately: this file has context, but the canonical lives elsewhere.
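A sketch of the two banners as text templates (the function names are mine; the wording is paraphrased from the pattern above):

```python
def canonical_banner(topic: str) -> str:
    """Callout placed at the top of the canonical file."""
    return f"> [!info] This is the authoritative document on {topic}.\n"

def redirect_banner(canonical: str) -> str:
    """Callout placed at the top of every secondary file on the topic."""
    return f"> [!tip] See [[{canonical}]]. The real answer lives there.\n"
```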
Cross-Cutting Tags
Folders organize by domain. Tags organize by concern. If you need to prepare for a lawyer call, the relevant files are scattered across four folders. A single search for #needs-lawyer surfaces everything in one place.
Define tags based on actions and concerns, not topics (the folders already handle topics). Mine include tags like #needs-lawyer, #move-critical, and #tax-spain.
Tags can be auto-extracted from content using regex patterns. The script scans for IRS form numbers, Spanish legal terms, visa keywords, property addresses, and action-oriented language, then writes the tags into the frontmatter. You define the patterns once; the script applies them to every file.
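A minimal sketch of that scan, with illustrative patterns rather than my real taxonomy:

```python
import re

# Illustrative pattern -> tag mapping; define these once for your own vault.
TAG_PATTERNS = {
    r"\bForm\s+(8938|3520|W-8BEN)\b": "tax-us",
    r"\bBeckham\b|\bIRNR\b": "tax-spain",
    r"\b(visa|residence permit|NIE)\b": "immigration",
    r"\b(call|ask|confirm with)\s+(the\s+)?lawyer\b": "needs-lawyer",
}

def extract_tags(content: str) -> list[str]:
    """Return sorted tags whose pattern matches anywhere in the content."""
    return sorted(
        tag
        for pattern, tag in TAG_PATTERNS.items()
        if re.search(pattern, content, flags=re.IGNORECASE)
    )
```

The real script would then merge the result into each file's `tags:` frontmatter line rather than just returning it.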
The Dashboard
Home.md opens on vault launch. It has three sections: quick access links to the 10 files I open every week, Dataview queries that surface problems (stale files, items needing lawyer review, actionable deadlines), and navigation links to every domain folder.
The Dataview queries are the point. A table that shows all files with status: stale sorted by oldest first means I always know what needs refreshing. A vault health summary (48 current, 257 needs-review, 69 stale) tells me at a glance whether the system is decaying or maintained.
A trimmed snapshot of Home.md (the Stale Research and Vault Health sections are rendered Dataview tables):

```markdown
# 🏠 My Vault

## ⚡ Quick Access

| Topic | Canonical File |
|---|---|
| Beckham Law | Beckham Law 2026 Updates |
| Immigration | Immigration Index |
| 401k + Retirement | 401k Beckham Canonical |
| Exit Tax | US Exit Tax Canonical |
| FIRE Strategy | FIRE Strategy Index |

## 🔴 Stale Research

| File | Verified | Status |
|---|---|---|
| Fiscal Residency Rules | 2026-02-14 | stale |
| Beckham Application Process | 2026-02-14 | stale |
| Healthcare Transition | 2026-02-15 | stale |
| Portfolio Construction | 2026-02-16 | stale |

## 📊 Vault Health
```
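The health summary is just a count over frontmatter. A sketch that tallies `status` values, assuming the simple one-line YAML fields shown earlier (a real version would walk the vault directory and parse only the frontmatter block):

```python
import re
from collections import Counter

STATUS_RE = re.compile(r"^status:\s*(\S+)\s*$", re.MULTILINE)

def vault_health(files: dict[str, str]) -> Counter:
    """Count files per status value; `files` maps filename -> content."""
    counts = Counter()
    for content in files.values():
        match = STATUS_RE.search(content)
        counts[match.group(1) if match else "missing"] += 1
    return counts
```

A file with no `status:` line lands in a "missing" bucket, which is itself a useful maintenance signal.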
The Plugins
Six community plugins. Each solves a specific problem; none add sync complexity or framework dependencies. If your vault has lots of research files, cross-domain topics, and metadata you want to query, these are the ones that matter.
Install via CLI
You do not need to click through Obsidian's settings. All community plugins can be installed by downloading release files directly into .obsidian/plugins/ and registering them in community-plugins.json. This is how I did it; it is also how you script it for a new machine.
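A sketch of the registration half, assuming the standard `.obsidian/community-plugins.json` layout (downloading each plugin's release files, typically `main.js` and `manifest.json`, into `.obsidian/plugins/<id>/` is left out here):

```python
import json
from pathlib import Path

def register_plugin(vault: Path, plugin_id: str) -> list[str]:
    """Add plugin_id to .obsidian/community-plugins.json, creating it if absent."""
    config = vault / ".obsidian" / "community-plugins.json"
    config.parent.mkdir(parents=True, exist_ok=True)
    enabled = json.loads(config.read_text()) if config.exists() else []
    if plugin_id not in enabled:       # idempotent: safe to re-run
        enabled.append(plugin_id)
    config.write_text(json.dumps(enabled, indent=2))
    return enabled
```

Loop this over a list of plugin IDs and you have a bootstrap script for a new machine.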
The Graph
Obsidian's built-in graph view is pretty but not useful out of the box. Configured properly (archive excluded, orphans hidden, color-coded by domain), it becomes a structural diagnostic tool. You can see when a topic cluster is disconnected from the rest of the vault, or when a domain has grown disproportionately large.
Maintenance
A knowledge vault decays by default. New files appear without metadata. Old files go stale. Duplicates creep back in when different tools create files with slightly different names.
I have an AI assistant that handles this automatically; it runs freshness checks, applies tags, flags stale research, and deduplicates when naming collisions appear. But the vault itself is plain markdown. Nothing about the structure requires AI or any specific tool. If your maintenance approach is a weekly 20-minute review where you check the Dataview stale list and update frontmatter by hand, that works too.
The system is the structure, not the automation. The automation just keeps it honest.
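The manual version of the first check (new files arriving without metadata) is a few lines. A sketch, assuming frontmatter always opens the file with a `---` line:

```python
from pathlib import Path

def files_missing_frontmatter(vault: Path) -> list[Path]:
    """Markdown files that do not open with a YAML frontmatter block."""
    return sorted(
        path
        for path in vault.rglob("*.md")
        if not path.read_text(encoding="utf-8").startswith("---\n")
    )
```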
If you are interested in how AI agents read and maintain a knowledge vault like this (memory systems, session continuity, multi-agent coordination), I wrote a separate guide on that: How My AI Remembers.
Knowledge Vault Setup Guide
Folder structure, YAML schema, tag taxonomy, plugin configs, templates, Home.md dashboard, CLI install script, and dedup logic. Hand it to your AI or follow it yourself.
Download the .md.