Matt Giaro AI Second Brain Build

Last updated: 31 May 2026

Build a living, AI-queryable knowledge system in Obsidian and Codex that ingests web content, supports journaling grounded in your own saved knowledge, and maintains a personal CRM, all interconnected and self-updating.

// TL;DR

The Matt Giaro AI Second Brain Build is a step-by-step framework for creating a living, AI-queryable knowledge system in Obsidian and Codex. It combines three core pillars — a Wiki/Knowledge Base, a Journal, and a CRM — all governed by a single agents.md file. Content enters through a RAW folder, gets processed by AI into cross-linked wiki pages, and the system grows smarter with every interaction. Use it when you want a personal knowledge management setup that actively resurfaces your saved content instead of becoming a passive dumping ground.

Framework

// When should I use the Matt Giaro AI Second Brain framework?

Use this skill when setting up a personal knowledge management system that goes beyond passive storage, or when a user wants to build a wiki, journaling layer, or CRM that an AI can actively query and cross-reference.

// What do I need before building the AI Second Brain?

Obsidian vault path (required)
The local folder path where the new Obsidian vault will be created and stored
Content sources (required)
List of YouTube videos, articles, tweets, podcast transcripts, or other web content the user wants to seed the system with
Three core pillars selection (required)
Which of the three pillars the user needs: Wiki/Knowledge Base, CRM, Journal — and what domain-specific variants apply (e.g. workout logs instead of CRM)
AI coding environment (required)
The IDE/agent front-end the user will use to build and chat with the system (Codex, Claude Code, or equivalent)
GitHub repo URL (optional)
Private GitHub repository URL for automated backup commits

// What are the core principles behind the AI Second Brain?

The Dumping Ground Problem

Most second brain systems are just storage — information goes in and never comes back out. The system only has value if it actively resurfaces knowledge when you need it, not just when you go looking for it.

Three Core Pillars

Every build centres on three elements: the Wiki/Knowledge Base (the central store), the CRM or equivalent relational layer (people, contacts, events), and the Journal (the active interaction layer). The knowledge base sits at the centre and everything else connects to it.

Knowledge Base at the Centre

The wiki is not one pillar among equals — it is the gravitational core. The journal and CRM derive their value from being grounded in and cross-linked to the knowledge base. Remove the centre and the other layers become isolated silos.

Grounded Responses (Not Blank-Slate AI)

When journaling or querying, the AI must not respond from its generic LLM knowledge alone. Responses must be grounded in the user's own saved content: 'You saved this video 3 days ago that says...' is the target output quality, not a generic ChatGPT answer.

Wiki as Living Entity

Borrowed from Andrej Karpathy's LLM Wiki concept: the wiki is not static. Every query, journal entry, and new ingestion updates the wiki — adding pages, cross-linking entities, and logging changes — so the system grows smarter with every interaction.

agents.md as the Brain's Constitution

All behaviour of the system is governed by a single agents.md file that contains plain-language instructions (prompts) for how the AI should ingest, process, query, journal, and manage the CRM. Tweaking behaviour means editing this file, not rebuilding the system.

RAW → Processed Pipeline

Source material always enters through the RAW folder — immutable and unmodified. Once processed by the AI into wiki pages, the source file moves to RAW/Processed so the user always knows what has and has not been ingested.
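As a rough illustration of the RAW → Processed rule, here is a minimal Python sketch. In the framework itself the AI agent performs these moves by following agents.md; the function names `pending_sources` and `mark_processed` are hypothetical, and the sketch assumes clipped sources are `.md` files sitting directly inside RAW.

```python
import shutil
import tempfile
from pathlib import Path


def pending_sources(vault: Path) -> list[Path]:
    """Return markdown files sitting directly in RAW (not yet ingested)."""
    raw = vault / "RAW"
    # Only top-level files count; RAW/Processed and RAW/Assets are subfolders.
    return sorted(p for p in raw.iterdir() if p.is_file() and p.suffix == ".md")


def mark_processed(vault: Path, source: Path) -> Path:
    """Move an ingested source into RAW/Processed without changing its content."""
    processed = vault / "RAW" / "Processed"
    processed.mkdir(parents=True, exist_ok=True)
    target = processed / source.name
    if target.exists():
        raise FileExistsError(f"{target} already exists; refusing to overwrite")
    shutil.move(str(source), str(target))
    return target


def demo() -> tuple[list[str], list[str]]:
    """Run the move once on a throwaway vault and report pending files before/after."""
    with tempfile.TemporaryDirectory() as tmp:
        vault = Path(tmp)
        (vault / "RAW").mkdir()
        (vault / "RAW" / "clip-a.md").write_text("# Clip A\n", encoding="utf-8")
        (vault / "RAW" / "clip-b.md").write_text("# Clip B\n", encoding="utf-8")
        before = [p.name for p in pending_sources(vault)]
        mark_processed(vault, vault / "RAW" / "clip-a.md")
        after = [p.name for p in pending_sources(vault)]
        return before, after


if __name__ == "__main__":
    print(demo())
```

Because a file is either directly in RAW or in RAW/Processed, listing the top level of RAW is enough to see what still needs ingesting.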

Pattern Detection Across Journals

The journal layer should not just respond to today's entry in isolation. It must scan past journal entries for recurring themes and struggles, surface patterns, and factor those patterns into its response — making advice progressively more personalised over time.
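To make the idea of recurring themes concrete, the sketch below uses plain keyword matching as a crude stand-in for what the LLM does when it reads past journal entries. The function name `recurring_themes`, the theme list, and the sample entries are all hypothetical; the real system relies on the model's reading of the journal files, not on a fixed keyword list.

```python
import re

# Hypothetical sample data: journal date -> entry text.
EXAMPLE_ENTRIES = {
    "2026-05-01": "Procrastinated on the client brief again.",
    "2026-05-08": "Good day, shipped the draft.",
    "2026-05-15": "Felt stuck; procrastinated until evening.",
}


def recurring_themes(
    entries: dict[str, str], themes: list[str], min_entries: int = 2
) -> dict[str, list[str]]:
    """Map each theme to the sorted dates of entries that mention it,
    keeping only themes found in at least `min_entries` separate entries."""
    hits: dict[str, list[str]] = {}
    for theme in themes:
        # Whole-word, case-insensitive match of the theme keyword.
        pattern = re.compile(r"\b" + re.escape(theme) + r"\b", re.IGNORECASE)
        dates = sorted(d for d, text in entries.items() if pattern.search(text))
        if len(dates) >= min_entries:
            hits[theme] = dates
    return hits


if __name__ == "__main__":
    print(recurring_themes(EXAMPLE_ENTRIES, ["procrastinated", "stuck"]))
```

A theme mentioned in only one entry is dropped, which mirrors the distinction between a one-off remark and a pattern worth surfacing.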

// How do you build the AI Second Brain step by step?

1
Define the user's Three Core Pillars
Confirm which pillars apply: Wiki/Knowledge Base (always required), plus any of: Journal, CRM, classroom notes, workout logs, client records, recipes, research papers. The wiki is non-negotiable; the other two are customisable to the user's domain.
2
Create a new Obsidian vault and note its local folder path
Install Obsidian (free, obsidian.md). Create a vault named meaningfully (e.g. 'Second Brain'). Save it in a dedicated folder. Record the exact folder path — it is required for the Codex project setup in step 4.
3
Install and configure the Obsidian Web Clipper browser extension
Add Obsidian Web Clipper to Chrome. In settings: add the vault name exactly as it appears in Obsidian, set the default template to pull source title, source URL, creation date, and a 'web-clip' tag, and set Note Location to 'RAW' so all clipped content lands in the correct folder. The Web Clipper automatically pulls full YouTube transcripts — this is the primary ingestion mechanism.
4
Open the Obsidian vault folder as a project in Codex (or Claude Code)
In Codex, choose 'Add New Project' → 'Use Existing Folder' → select the vault folder. This links the AI coding environment directly to the Obsidian vault so all file operations are visible in both tools simultaneously.
5
Build the wiki architecture using Karpathy's LLM Wiki as the structural blueprint
Prompt Codex: 'Build out the wiki architecture based on Karpathy's LLM Wiki [GitHub URL]. The current folder is the Obsidian vault. Build from scratch.' The correct minimal file structure is: /RAW (source material), /RAW/Processed (ingested sources), /RAW/Assets (optional attachments), /Wiki (AI-generated markdown pages), agents.md (operational instructions), index.md (catalogue of all wiki content), log.md (change history). If Codex builds extra files, prompt it to prune back to this minimal structure explicitly.
6
Customise agents.md to enforce the full processing pipeline
The agents.md file governs all system behaviour. Ensure it includes these processing steps for ingestion: (1) Read source from RAW, (2) Create or update wiki pages, (3) Update entity/concept/topic/overview/synthesis/comparison pages, (4) Cross-link generated wiki pages back to the original source page, (5) Update index.md, (6) Append an entry to log.md, (7) Move the source file from RAW to RAW/Processed. For YouTube sources: extract and add channel name to the original source page front matter (not the wiki page).
7
Seed the knowledge base with initial content via the Web Clipper
Navigate to relevant YouTube videos, articles, or web pages. Click the Web Clipper to save each one directly to the RAW folder. Seed with enough content (5–10 sources minimum) to give the wiki meaningful cross-linking from the start. Meta-tip: ingesting the Karpathy LLM Wiki GitHub page itself is a useful first seed.
8
Process RAW files via Codex to build out the wiki
In Codex, open a new chat in the second brain project and prompt: 'Process the files inside the RAW folder.' The AI will read each source, generate wiki pages, extract entities (people, companies, tools, ideas, themes), cross-link related pages, update index.md and log.md, and move files to RAW/Processed. Processing time scales with content volume — allow 3–6 minutes per batch.
9
Add Journal and CRM pillars by updating agents.md via a Codex prompt
Prompt Codex to update agents.md with the following rules.
Journal: if a chat begins with 'journal', save the full conversation as a new markdown file in /Journal named [date]-[short-title].md; add an entry to /Journal/index.md with date, title, and summary link; log the entry in log.md; and ground the AI response in wiki content, past journal entries, and CRM data.
CRM: if the user says 'add to CRM' or 'CRM:', create or update a file in /CRM named after the person, capturing contact details, how and where you met, relationship context, and notes; maintain /CRM/index.md in alphabetical order with a short bio per contact; and log updates in log.md.
10
Test all three pillars before automating
Test wiki query: ask a question in a new Codex chat and verify the response cites specific saved sources. Test CRM: say 'Add to CRM: [Name] — met at [event] in [year]' and verify a CRM file and index entry are created. Test journal: start a chat with 'journal' on a separate line, write a real entry, and verify the response references wiki content (not generic LLM output), a journal file is created in /Journal, and the journal index and log are updated.
11
Set up the hourly automation to process new RAW files without manual prompting
In Codex, go to Automations → New Automation. Title: 'Process Second Brain RAW Files'. Work tree: Local. Project: Second Brain. Schedule: Hourly (or chosen cadence). Prompt: 'If there are any unprocessed files inside the RAW directory, please process them.' Model: use the strongest available (e.g. GPT-4.5 or equivalent) on high reasoning. This means the user only needs to clip content — processing happens automatically.
12
Connect a private GitHub repository for automated backup
Create a new private repository on GitHub. In Codex, prompt: 'Commit this current version to my private GitHub repo [URL].' Then edit the automation prompt to append: 'Once everything is processed, commit and push the current version to the main branch on GitHub.' Backup now happens every hour alongside processing.
13
Iterate on agents.md to dial in behaviour over time
The system is entirely prompt-driven. Any behavioural change — new folder categories, additional entity types to extract (companies, tools, people), stricter cross-linking rules — is made by editing agents.md directly in Obsidian or by prompting Codex to update it. Use the Obsidian graph view to visually monitor how interconnected the wiki is becoming; this is the best signal of system health.
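The minimal file structure from step 5 can also be created or checked by hand. The following is a sketch under stated assumptions, not part of the original build: the helper names `scaffold` and `unexpected_top_level` are hypothetical, and dot-prefixed entries (such as Obsidian's `.obsidian` settings folder) are skipped so they are not reported as extras.

```python
import tempfile
from pathlib import Path

# Minimal layout named in step 5; RAW/Assets is optional there, so it is not created.
REQUIRED_DIRS = ["RAW", "RAW/Processed", "Wiki"]
REQUIRED_FILES = ["agents.md", "index.md", "log.md"]


def scaffold(vault: Path) -> None:
    """Create any missing folders and empty files of the minimal layout."""
    for d in REQUIRED_DIRS:
        (vault / d).mkdir(parents=True, exist_ok=True)
    for f in REQUIRED_FILES:
        (vault / f).touch(exist_ok=True)


def unexpected_top_level(vault: Path) -> list[str]:
    """List top-level entries outside the minimal layout (candidates to prune)."""
    allowed = {"RAW", "Wiki", *REQUIRED_FILES}
    return sorted(
        p.name
        for p in vault.iterdir()
        if p.name not in allowed and not p.name.startswith(".")
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        vault = Path(tmp)
        scaffold(vault)
        (vault / "notes-extra.md").touch()  # simulate an over-built file
        print(unexpected_top_level(vault))
```

Once the Journal and CRM pillars from step 9 exist, their folders would need to be added to the allowed set, otherwise this check would flag them.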

// What are real-world examples of the AI Second Brain in action?

A freelance designer wants to stop losing track of interesting design articles and YouTube tutorials they watch, and wants to journal about creative blocks with AI support grounded in their own saved content.

Build a three-pillar system: Wiki/Knowledge Base seeded with design tutorials and articles via Web Clipper; Journal pillar so when they write 'journal: I feel creatively blocked on this client brief', the AI responds by citing their own saved content ('You saved a video 2 weeks ago about design constraints as creative fuel — here's what it suggested...') rather than generic advice; CRM pillar to store notes about design clients and collaborators met at events. The graph view will show design tools, techniques, and client names all interconnecting over time.

A student wants a research assistant that connects lecture notes, academic papers, and textbook highlights, and allows them to ask questions that draw only on material they have actually studied.

Replace the CRM pillar with a 'Classroom Notes' pillar. Clip research papers and articles via Web Clipper into RAW. Paste or type lecture notes directly into RAW as markdown files. The wiki processes all sources and extracts concepts, authors, and themes. When the student asks 'What did I save about neuroplasticity?', the grounded response cites only their ingested sources. Journal entries become reflection logs on study sessions, grounded in their own saved research.

A sales professional wants to remember every conversation from conferences and connect those people to the ideas discussed, so they can follow up meaningfully months later.

The CRM pillar is the primary value driver. After each event, open Codex and say 'Add to CRM: [Name] — met at [Conference], discussed [topic], their role is [X], follow up about [Y].' The system creates a contact record and cross-links it to any wiki pages on the discussed topics. Later, prompt 'What did I discuss with [Name] about [topic]?' and the response draws from both the CRM record and relevant wiki content.

// What mistakes should I avoid when building the AI Second Brain?

Building a dumping ground: saving content into RAW without the processing pipeline in place means information just accumulates and dies — the same problem as any passive second brain system.
Letting Codex over-build the initial architecture: it will create dozens of unnecessary files if not constrained. Explicitly prompt it to prune back to the minimal Karpathy structure if it builds more than the five required top-level elements (/RAW, /Wiki, agents.md, index.md, log.md).
Skipping the agents.md cross-linking instruction: without the 'cross-link wiki pages back to the original source page' rule, wiki pages become orphaned and the interconnection that makes the graph view valuable never develops.
Adding the YouTube channel name to the wiki page instead of the original source page: the channel name belongs in the front matter of the source file in RAW, not on the generated wiki page.
Journaling without the 'journal' prefix: the system only routes to journal-handling logic if the chat begins with the 'journal' trigger word. Without it, the entry is treated as a wiki query and is not saved as a journal file.
Using a weak AI model for the automation: the hourly automation should use the strongest available model on high reasoning — using a lightweight model produces shallow wiki pages with poor entity extraction and cross-linking.
Never looking at the graph view: the graph view in Obsidian is the primary signal of whether the system is working. A flat, disconnected graph after weeks of use indicates the cross-linking rules in agents.md are not functioning correctly.
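The 'journal' prefix mistake above is easier to see with a deterministic version of the routing rule. In the real system the LLM applies the rule by reading agents.md; this sketch only imitates it. The function names `route` and `journal_filename` are hypothetical, and the ISO date format and lowercase slug are assumptions, since the framework only specifies the pattern [date]-[short-title].md.

```python
import re
from datetime import date


def route(message: str) -> str:
    """Classify a chat message as 'journal', 'crm', or 'query' by how it opens."""
    text = message.lstrip()
    first_line = text.splitlines()[0].strip().lower() if text else ""
    # 'journal' must be a whole word, so 'journaling tips' stays a query.
    if re.match(r"journal\b", first_line):
        return "journal"
    if first_line.startswith("add to crm") or first_line.startswith("crm:"):
        return "crm"
    # Anything else is treated as a wiki query and is not saved as a journal file.
    return "query"


def journal_filename(day: date, short_title: str) -> str:
    """Build '[date]-[short-title].md' using an ISO date and a lowercase slug."""
    slug = re.sub(r"[^a-z0-9]+", "-", short_title.lower()).strip("-")
    return f"{day.isoformat()}-{slug}.md"


if __name__ == "__main__":
    print(route("journal\nI feel creatively blocked on this client brief."))
    print(route("I feel creatively blocked on this client brief."))
    print(journal_filename(date(2026, 5, 31), "Creative block on client brief"))
```

The second call shows the failure mode: the same text without the trigger word falls through to the query branch, so no journal file would be written.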

// What do the key terms in the AI Second Brain system mean?

Three Core Pillars: The three structural layers of the system: (1) Wiki/Knowledge Base — the central store of all ingested content, (2) CRM or equivalent relational layer — notes about people, clients, contacts, or events, (3) Journal — the active interaction layer where the user writes and receives AI responses grounded in the other two pillars.
Knowledge Base at the Centre: The architectural principle that the wiki is the gravitational core of the system — the journal and CRM derive value only by being grounded in and cross-linked to it.
Grounded Response: An AI reply that is anchored to the user's own saved wiki content rather than generated from generic LLM knowledge. The target quality marker is a response that says 'You saved this video 3 days ago that says...' rather than a blank-slate ChatGPT answer.
RAW Folder: The immutable input layer of the system. All source material — web clips, YouTube transcripts, articles, meeting notes — enters here untouched before AI processing.
RAW/Processed: The subfolder where source files are moved after the AI has successfully ingested and wiki-fied them, providing a clear record of what has and has not been processed.
agents.md: The system's operational constitution — a single markdown file containing all plain-language instructions (prompts) that govern how the AI ingests content, queries the wiki, handles journal entries, and manages the CRM. Changing system behaviour means editing this file.
index.md: A catalogue file maintained at the wiki level (and optionally within Journal and CRM folders) that lists all entries with titles, dates, and links, allowing the AI to scan the full knowledge base efficiently before responding.
log.md: A running change history file that records every ingestion, wiki update, journal entry, CRM update, and query — providing an audit trail of how the system has evolved.
Obsidian Web Clipper: A Chrome browser extension that saves any web page or YouTube video (including its full transcript) as a markdown file directly into the RAW folder of the Obsidian vault, serving as the primary content ingestion mechanism.
LLM Wiki (Karpathy's concept): Andrej Karpathy's architectural blueprint for using an LLM to maintain a self-building, cross-linked markdown wiki from raw source material. Matt Giaro uses this as the foundational wiki layer and extends it with journal and CRM pillars.
Graph View: Obsidian's visual map of all notes and their interconnections. Used as the primary health metric of the second brain — a densely interconnected graph indicates the cross-linking and entity extraction are working correctly.
Pattern Detection: The journal layer's capability to scan past journal entries for recurring themes and struggles, surface those patterns explicitly, and factor them into responses to new journal entries — making the system progressively more personalised.
Dumping Ground Problem: Matt Giaro's term for the failure mode of most second brain systems: information is saved but never resurfaces, making the system a passive archive rather than an active knowledge tool.
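The graph-view health signal defined in the glossary can be approximated in code by looking for wiki pages with no links in or out. This is a hedged sketch, not part of the framework: it assumes a flat Wiki folder, links written in Obsidian's `[[Page]]`, `[[Page|alias]]`, or `[[Page#heading]]` form, and link targets that match file names exactly. The names `link_targets` and `orphaned_pages` are hypothetical.

```python
import re
import tempfile
from pathlib import Path

# Capture the note name up to a closing bracket, alias bar, or heading marker.
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def link_targets(text: str) -> set[str]:
    """Extract note names from [[Name]], [[Name|alias]] and [[Name#heading]]."""
    return {name.strip() for name in WIKILINK.findall(text)}


def orphaned_pages(wiki: Path) -> list[str]:
    """Return names of wiki pages with no inbound and no outbound links."""
    pages = {p.stem: p.read_text(encoding="utf-8") for p in wiki.glob("*.md")}
    # Keep only links that point at another existing page (self-links ignored).
    outbound = {
        name: link_targets(text) & (pages.keys() - {name})
        for name, text in pages.items()
    }
    linked_to = set().union(*outbound.values())
    return sorted(n for n in pages if not outbound[n] and n not in linked_to)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        wiki = Path(tmp)
        (wiki / "A.md").write_text("See [[B|the B page]].", encoding="utf-8")
        (wiki / "B.md").write_text("No links here.", encoding="utf-8")
        (wiki / "C.md").write_text("Standalone note.", encoding="utf-8")
        print(orphaned_pages(wiki))
```

A long list of orphans after weeks of use would point to the same problem as a flat graph view: the cross-linking rules in agents.md are not taking effect.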

// Frequently asked questions

What is the Matt Giaro AI Second Brain?

The Matt Giaro AI Second Brain is a framework for building a living, AI-queryable knowledge system in Obsidian and Codex. It uses three core pillars — a Wiki/Knowledge Base at the centre, a Journal for AI-grounded reflections, and a CRM for tracking people and relationships. All behaviour is governed by a single agents.md file, content enters through a RAW folder pipeline, and the system grows smarter with every interaction instead of becoming a passive dumping ground.

What is agents.md in the AI second brain system?

The agents.md file is the operational constitution of the entire system — a single markdown file containing plain-language prompts that govern how the AI ingests content, processes wiki pages, handles journal entries, and manages the CRM. Changing any system behaviour means editing this one file rather than rebuilding the architecture. It enforces the full processing pipeline: reading RAW sources, creating wiki pages, cross-linking, updating the index and log, and moving processed files.

How do I build an AI second brain in Obsidian step by step?

Start by creating a new Obsidian vault, installing the Web Clipper extension, and opening the vault folder as a project in Codex. Prompt Codex to build the wiki architecture based on Karpathy's LLM Wiki blueprint, then customise agents.md to enforce your processing pipeline. Seed the system with 5–10 sources via Web Clipper, process them through Codex, add Journal and CRM pillars by updating agents.md, and test all three pillars. Finally, set up the hourly automation for processing and connect a private GitHub repository for backup.

How do I set up the Obsidian Web Clipper for the AI second brain?

Install the Obsidian Web Clipper Chrome extension and configure it by adding your vault name exactly as it appears in Obsidian. Set the default template to capture source title, source URL, creation date, and a 'web-clip' tag. Set the Note Location to 'RAW' so all clipped content lands in the correct input folder. The Web Clipper automatically pulls full YouTube transcripts, making it the primary ingestion mechanism for the system.

How does the Matt Giaro AI Second Brain compare to a regular Obsidian vault or Notion setup?

A regular Obsidian vault or Notion workspace is passive storage — you save content but rarely resurface it. The Matt Giaro AI Second Brain actively processes, cross-links, and queries your content through AI. Responses are grounded in your own saved material rather than generic LLM output. The agents.md file governs all behaviour in one place, the RAW-to-Processed pipeline ensures nothing gets lost, and the system grows smarter with every interaction through automatic wiki updates and pattern detection across journals.

When should I use the AI second brain framework instead of a regular note-taking app?

Use this framework when your note-taking has become a dumping ground — you save content but never find or use it again. It is ideal when you need AI to actively query and cross-reference your saved knowledge, when you want journaling grounded in your own material rather than generic advice, or when you need a personal CRM that connects people to the ideas you discussed. If you only need simple note capture without retrieval, a regular app suffices.

What results can I expect after building the AI second brain?

After building the system, you can ask questions and receive answers grounded in your own saved content — responses like 'You saved this video 3 days ago that says...' instead of generic ChatGPT output. Your journal entries will reference relevant wiki content and detect recurring patterns over time. The CRM cross-links people to topics discussed. The Obsidian graph view shows a densely interconnected knowledge network that grows automatically with every piece of content you clip.

What are the three core pillars of the AI second brain?

The three core pillars are: (1) Wiki/Knowledge Base — the central store of all ingested and processed content, always required; (2) CRM — a relational layer for tracking people, contacts, clients, and relationship context; and (3) Journal — the active interaction layer where you write entries and receive AI responses grounded in your wiki and CRM data. The Knowledge Base sits at the centre as the gravitational core; the other two derive their value from being cross-linked to it.

What is a grounded response in the AI second brain?

A grounded response is an AI reply anchored to your own saved wiki content rather than generated from the LLM's generic training data. The target quality is a response that says 'You saved this video 3 days ago that says...' or 'Based on the article you clipped about X...' rather than a blank-slate ChatGPT answer. This is enforced through the agents.md instructions and only works when the wiki has been properly seeded and processed with your content.

Can I use Claude Code instead of Codex for the AI second brain?

Yes, Claude Code works as a direct substitute for Codex. The framework requires any AI coding environment that can open a local folder as a project and perform file operations within it. Claude Code, Codex, or any equivalent agent front-end that reads and writes to your Obsidian vault folder will work. The key requirement is that the AI can see all files in the vault, create and modify markdown files, and execute the processing pipeline defined in agents.md.
