
Stop Arguing With Your AI. Start Showing It What You See.

Imagine trying to teach someone to cook over the phone.

You’re walking them through your grandmother’s pasta recipe—the one with the garlic that needs to be just golden, not brown. You describe every step perfectly. The timing. The technique. The little flip of the wrist when you toss the noodles.

And then they say: “It’s burning. What do I do?”

Here’s the thing: you can’t help them. Not really. Because you can’t see the pan. You can’t see how high the flame is. You can’t see that they accidentally grabbed the chili flakes instead of oregano. All you have is their panicked description and your best guess about what might be going wrong.

This, my friend, is exactly what happens when you ask Claude Code to fix a bug.

(Stay with me here.)

.

.

.

The Merry-Go-Round Nobody Enjoys

You’ve been on this ride before. I know you have.

You describe the bug to Claude. Carefully. Thoroughly. You even add screenshots and error messages because you’re a good communicator, dammit.

Claude proposes a fix.

You try it.

It doesn’t work.

So you describe the bug again—this time with more adjectives and maybe a few capitalized words for emphasis. Claude proposes a slightly different fix. Still broken. You rephrase. Claude tries another angle. Round and round we go.

This is the debugging merry-go-round, and nobody buys tickets to this ride on purpose.

The instinct—the very human instinct—is to blame the AI.

  • “Claude isn’t smart enough for this.”
  • “Maybe I need a different model.”
  • “Why can’t it just SEE what’s happening?”

That last one?

That’s actually the right question.

Just not in the way you think.

Here’s what I’ve learned after spending more time than I’d like to admit arguing with AI about bugs: Claude almost never fails because it lacks intelligence. It fails because it lacks visibility.

Think about what you have access to when you’re debugging. Browser dev tools. Console logs scrolling in real-time. Network requests you can inspect. Elements that highlight when you hover. The actual, living, breathing behavior playing out on your screen.

What does Claude have?

The code. Just the code.

That’s it.

An infographic titled "THE VISIBILITY GAP" compares the debugging experience of a human developer ("WHAT YOU SEE" with an eyes emoji) to that of an AI model ("WHAT CLAUDE SEES" with a brain emoji). The left column, for the developer, has a list of six items with green checkmarks: "Browser dev tools", "Console logs in real-time", "Network requests", "Elements highlighting", "Actual behavior on screen", and "The sequence of events". The right column, for Claude, shows a red 'X' and the phrase "Just the code" for every single one of these six points. A thick navy line separates the two columns. Below this comparison, a horizontal bar separates a final summary section with two pill-shaped callout boxes. The left green box says "YOU: Full visibility" with a magnifying glass icon, and the right red-accented box says "CLAUDE: Reading with a blindfold" with a blindfold icon. The background is a light off-white, and the overall style is clean and minimalist.

You’re asking a brilliant chef to fix your burning pasta—but they can only read the recipe card. They can’t see the flame. They can’t smell the smoke. They’re working with incomplete information and filling in the gaps with educated guesses.

Sometimes those guesses are right. (Claude is genuinely brilliant at guessing.)

Most of the time? Merry-go-round.

.

.

.

The Two Bugs That Break AI Every Time

After countless Claude Code debugging sessions—some triumphant, many humbling—I’ve noticed two categories that consistently send AI spinning:

The Invisible State Bugs

React’s useEffect dependencies.

Race conditions. Stale closures. Data that shapeshifts mid-lifecycle like some kind of JavaScript werewolf. These bugs are invisible in the code itself. You can stare at the component for hours (ask me how I know) and see nothing wrong. The bug only reveals itself at runtime—in the sequence of events, the timing of updates, the order of renders.

It’s happening in dimensions Claude can’t perceive.
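To make that concrete, here’s a minimal sketch (hypothetical component and endpoint, not from a real project) of a bug that reads perfectly fine on the page:

import { useEffect, useState } from "react";

// Nothing here looks wrong; the bug exists only at runtime.
function SearchResults({ query }: { query: string }) {
  const [results, setResults] = useState<string[]>([]);

  useEffect(() => {
    fetch(`/api/search?q=${encodeURIComponent(query)}`)
      .then((res) => res.json())
      .then(setResults); // race: a slow response for an OLD query can land last and win
  }, [query]);

  return <ul>{results.map((r) => <li key={r}>{r}</li>)}</ul>;
}

Nothing in the static text betrays the failure; it lives entirely in which response arrives last.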

The “Wrong Address” Bugs

CSS being overridden by inline JavaScript. WordPress functions receiving unexpected null values from somewhere upstream. Error messages that point to line 7374 of a core file—not your code, but code three function calls removed from the actual problem.

The error exists.

But the source? Hidden in cascading calls, plugin interactions, systems talking to systems.

Claude can’t solve either category by reading code alone.

So what do we do?

We give Claude eyes.

(I told you to stay with me. Here’s where it gets good.)

.

.

.

Method 1: Turn Invisible Data Into Evidence Claude Can Actually See

Let me walk you through a real example.

Because theory is nice, but showing you what this looks like in practice? That’s the good stuff.

I had a Products Browser component. Simple filtering and search functionality—the kind of thing you build in an afternoon and then spend three days debugging because life is like that sometimes.

Each control worked beautifully in isolation:

Products Browser interface showing a search for "apple" returning 3 matching results: Apple fruit priced at $1.99, Apple Airpods at $129.99, and Apple MacBook Pro at $1999.99. The status shows "Ready - products: 100 · view: 3"

Search for “apple” → Three results. Beautiful.

Products Browser interface with category filter set to "laptops" displaying 5 laptop products including MacBook Pro, Asus Zenbook, Huawei Matebook X Pro, Lenovo Yoga 920, and Dell XPS 13. Status shows "Ready - products: 100 · view: 5"

Filter by “laptops” → Five results. Chef’s kiss.

But combine them?

Products Browser showing broken behavior when combining search term "apple" with category filter "laptops". Results incorrectly display Apple fruit and Apple Airpods alongside MacBook Pro, ignoring the laptops-only filter

Search “apple” + category “laptops” → Broken. The filter gets completely ignored, like I never selected it at all.

Classic React hook dependency bug.

If you’re experienced with React, you spot this pattern in your sleep. But if you’re newer to the framework—or if you vibe-coded this component and touched a dozen files before realizing something broke—you’re stuck waiting for Claude to get lucky.

I spent three rounds asking Claude to fix it. Each fix addressed a different theoretical cause. None worked.

That’s when I stopped arguing and started instrumenting.

An infographic titled "METHOD 1: THE LOGGING WORKFLOW" shows a four-step flowchart with navy blue boxes and arrows on a light background. The first box, labeled "STEP 1", contains the text "Ask for logging (not fix)" with the caption below it: "Add logging" not "fix this". An arrow points right to the second box. The second box, labeled "STEP 2", contains the text "Run test + copy console" with the caption: You become the bridge. An arrow points right to the third box. The third box, labeled "STEP 3", contains the text "Feed logs back to Claude" with the caption: Evidence in, insight out. An arrow points right to the fourth and final box. The fourth box, labeled "RESULT", contains the text "Claude SEES the problem" with the caption: One-shot fix (finally!).

Step 1: Ask Claude to Add Logging (Not Fixes)

Instead of another “please fix this” prompt, I asked Claude to help me see what was happening:

Terminal command showing user request to Claude: "I can search products by keywords or filter them by category, but both don't work together at the same time. Add logging to track data changes."

Notice what I didn’t say: “Fix this bug.”

What I said: “Add logging to track data changes.”

This is the mindset shift that changes everything.

Claude added console.log statements to every useEffect that touched the view state:

Claude Code diff view showing 37 lines added to src/App.tsx, adding console.log statements to useEffect hooks that track triggers like 'products loaded', 'query changed', and 'filter changed' with their corresponding state values

Each log captured which effect triggered, what the current values were, and what got computed. Basically, Claude created a running transcript of everything happening inside my component’s brain.
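The diff itself isn’t reproduced here, but the shape of the instrumentation was roughly this (hypothetical names, mirroring the screenshots):

useEffect(() => {
  const next = searchProducts(products, query); // searchProducts: hypothetical helper
  console.log("useEffect:search", {
    trigger: "query changed",
    query,
    products: products.length,
    view: next.length,
  });
  setView(next);
}, [products, query]);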

Step 2: Run the Test and Capture What You See

I opened the browser, selected “laptops” from the category filter, then typed “apple” in the search box.

Split screen showing Products Browser on the left with search "apple" and category "laptops" selected, and Chrome DevTools console on the right displaying multiple useEffect log entries showing the filter and search triggers firing

The console lit up like a Christmas tree of evidence.

Step 3: Feed the Logs Back to Claude

Here’s where the magic happens. I copied that console output—all of it—and pasted it directly into Claude:

Terminal showing user pasting console logs to Claude, displaying useEffect:filters and useEffect:search log entries that reveal each search keystroke triggers a reset of category, minRating, maxPrice, and sort to DEFAULTS

And Claude? Claude saw everything:

Claude's analysis of the logs explaining the bug: the category filter was ignored because applySearchOnly resets category back to "all", and the search effect ran last so it "won" and overwrote the filter results. Proposes fix using single useMemo

Claude found the bug immediately.

The logs revealed the whole story: when I selected a category, useEffect:filters fired and correctly filtered the products. But then when I typed in the search box, useEffect:search fired—and it ran against the full product list, completely ignoring the category filter.

The search effect was overwriting the filter results.

Last effect wins. (JavaScript, you beautiful chaos gremlin.)
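If you want to see the shape of the trap, here’s a self-contained sketch (byCategory, byQuery, and useProductView are hypothetical stand-ins; the real component isn’t shown):

import { useEffect, useState } from "react";

type Product = { name: string; category: string };

// Hypothetical stand-ins for the real filter and search helpers
const byCategory = (ps: Product[], c: string) =>
  c === "all" ? ps : ps.filter((p) => p.category === c);
const byQuery = (ps: Product[], q: string) =>
  ps.filter((p) => p.name.toLowerCase().includes(q.toLowerCase()));

function useProductView(products: Product[], category: string, query: string) {
  const [view, setView] = useState<Product[]>([]);

  useEffect(() => {
    setView(byCategory(products, category)); // fires when the filter changes
  }, [products, category]);

  useEffect(() => {
    setView(byQuery(products, query)); // fires on every keystroke, against the FULL list
  }, [products, query]);

  return view; // whichever effect ran last wins, silently discarding the other's work
}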

Claude proposed the fix: replace multiple competing useEffect hooks with a single useMemo that applies all transforms together:

// One deterministic pass: every filter and the sort apply together,
// so there is no effect ordering left to race.
const view = useMemo(() => {
  return transformProducts(products, {
    query,
    category,
    minRating,
    maxPrice,
    sortKey,
    sortDir,
  });
}, [products, query, category, minRating, maxPrice, sortKey, sortDir]);
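transformProducts itself isn’t shown in the screenshots. As a rough idea, a combined transform might look something like this (hypothetical types and logic, not the actual code):

type Product = { name: string; category: string; price: number; rating: number };
type ViewOptions = {
  query: string;
  category: string;
  minRating: number;
  maxPrice: number;
  sortKey: keyof Product;
  sortDir: "asc" | "desc";
};

// Filters and the sort run in one pass, in a fixed order, so the result
// is the same no matter which control the user touched last.
function transformProducts(products: Product[], o: ViewOptions): Product[] {
  const dir = o.sortDir === "asc" ? 1 : -1;
  return products
    .filter((p) => o.category === "all" || p.category === o.category)
    .filter((p) => p.name.toLowerCase().includes(o.query.toLowerCase()))
    .filter((p) => p.rating >= o.minRating && p.price <= o.maxPrice)
    .sort((a, b) => (a[o.sortKey] > b[o.sortKey] ? dir : a[o.sortKey] < b[o.sortKey] ? -dir : 0));
}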

One attempt. Bug fixed.

The difference between “Claude guessing for 20 minutes” and “Claude solving it instantly” was 30 seconds of logging.

That’s not hyperbole. That’s just… math.

.

.

.

Method 2: Map the Problem Before Anyone Tries to Solve It

The second method works for a different beast entirely—the kind of bug where even the error message is lying to you.

Here’s a WordPress error that haunted me for hours:

Deprecated: strpos(): Passing null to parameter #1 ($haystack) of type string 
is deprecated in /var/www/html/wp-includes/functions.php on line 7374

Warning: Cannot modify header information - headers already sent by 
(output started at /var/www/html/wp-includes/functions.php:7374) 
in /var/www/html/wp-includes/option.php on line 1740

If you’ve done any WordPress development, you recognize this particular flavor of suffering.

The error points to core WordPress files—not your code. Something, somewhere, is passing null to a function that expects a string. But where? The error message is about as helpful as a fortune cookie that just says “bad things happened.”

I’d made changes to several theme files.

Any one of them could be the culprit.

And the cascading nature of WordPress hooks meant the error could originate three or four function calls before the actual crash.

After a few rounds of Claude trying random fixes (bless its heart), I tried something completely different.

The Brainstorming Prompt That Changes Everything

Terminal showing user pasting WordPress deprecation errors (strpos and str_replace warnings about null parameters) and requesting: "Let's brainstorm ways to fix this. Use ASCII diagrams."

Instead of “fix this,” I asked Claude to brainstorm debugging approaches—and to visualize them with ASCII diagrams.

(I know. ASCII diagrams. In 2025. But stay with me, because this is where Claude Code debugging gets genuinely interesting.)

Claude Maps the Error Chain

Claude started by analyzing the flow of the problem:

Claude's ASCII diagram titled "ERROR CHAIN" showing data flow from Theme Code through WordPress Core (functions.php lines 7374 and 2196) to PHP Engine, with null values being passed that trigger strpos() and str_replace() deprecated warnings

The diagram showed exactly what was happening: some theme code was passing null to WordPress core functions, which then passed that null to PHP string functions, which threw the deprecation warning.

But which theme code? Claude identified the suspect locations:

Claude's ASCII diagram titled "SUSPECT LOCATIONS" listing four possible sources of null values: filter callbacks returning null, options/meta returning null from get_option(), admin menu/page registration with null titles, and wp_safe_redirect with null URL

Four possible sources.

Each with code examples showing what the problematic pattern might look like.

This is Claude thinking out loud, visually. And it’s incredibly useful for Claude Code debugging because now we’re not guessing—we’re investigating.

Multiple Debugging Strategies (Not Just One)

Rather than jumping to a single fix and hoping, Claude laid out several approaches:

Claude's ASCII diagram showing "APPROACH OPTIONS" with four debugging strategies: A) Search for all filter callbacks, B) Find WordPress functions using strpos internally, C) Instrument error with debug_backtrace(), D) Search for common patterns
  • Option A: Search all filter callbacks for missing return statements.
  • Option B: Find which WordPress functions use strpos internally.
  • Option C: Add debug_backtrace() at the error point to trace the caller.
  • Option D: Search for common patterns like wp_redirect with variables.

Four different angles of attack.

This is what systematic debugging looks like—and it’s exactly what you need when you’re stuck in the merry-go-round.

Claude Does Its Homework

Here’s where Opus 4.5 surprised me.

Instead of settling on the first approach, it validated its theories by actually searching the codebase:

Claude Code executing multiple search commands: searching for wp_redirect, add_filter, get_option patterns, and reading specific files like AdminMenu.php and MembershipPlans.php

It searched for wp_redirect calls, add_filter patterns, get_option usages—systematically eliminating possibilities like a detective working through a suspect list.

Then it updated its diagnosis based on what it found:

Claude's updated ASCII diagram showing the probable call chain: Something returns null path, which goes to wp_normalize_path(), which then causes the strpos() error. Suspects listed: wp_enqueue with null source, file_exists on null path, filter returning null instead of path

The investigation narrowed.

The error was coming from path-handling functions—something was returning a null path where a string was expected.

The Summary That Actually Leads Somewhere

Claude concluded with a clear summary of everything we now knew:

Claude's Investigation Summary showing three identified errors receiving NULL values, the functions they're called during (script/style enqueueing, admin page rendering, template loading), and noting that the original fix was correct but there's a second source of null values

And multiple approaches to fix it, ranked by how surgical they’d be:

Claude's "Possible Approaches" diagram showing four options: A) Add debug backtrace to find exact source, B) Wrap all path-sensitive functions with null checks, C) Check FluentCart plugin for similar issues, D) Disable theme components one by one

Did it work?

First attempt. Approach A—adding a debug backtrace—immediately revealed a function in FluentCartBridge.php that was returning null when $screen->id was empty.

One additional null check.

Bug gone.

All those rounds of failed attempts? They were doomed from the start because Claude was guessing blindly. Once it could see the error chain visually—once it had a map instead of just a destination—the solution was obvious.

.

.

.

Why This Actually Works (The Part Where I Get a Little Philosophical)

Both of these methods work because they address the same fundamental gap in Claude Code debugging: AI doesn’t fail because it’s not smart enough. It fails because it can’t see what you see.

When you’re debugging, you have browser dev tools, console logs, network requests, and actual behavior unfolding on your screen. Claude has code files.

That’s it.

It’s working with incomplete information and filling the gaps with educated guesses.

Here’s the mindset shift that changed everything for me:

👉 Stop expecting AI to figure it out. Start helping AI see what you see.

You become the eyes. AI becomes the analytical brain that processes patterns and proposes solutions based on the evidence you feed it.

It’s a collaboration. A partnership. Not a vending machine where you insert a problem and expect a solution to drop out.

A flowchart, "WHICH METHOD SHOULD YOU USE?", for when the bug won’t die. State/timing bugs (React hooks, race conditions, data flow, stale closures) point to Method 1, add logging: turn invisible data into visible evidence. Cascading/wrong-address bugs (the error points to the wrong file, multiple possible sources, plugin/system interactions) point to Method 2, ASCII brainstorm: map before you solve.

When to Use Logging

Add logs when the bug involves:

  • Data flow and state management
  • Timing issues and race conditions
  • Lifecycle problems in React, Vue, or similar frameworks
  • Anything where the sequence of events matters

The logs transform invisible runtime behavior into visible evidence.

React’s useEffect, state updates, and re-renders happen in milliseconds—too fast to trace mentally, but perfectly captured by console.log. Feed those logs to Claude, and suddenly it can see the movie instead of just reading the script.

When to Use ASCII Brainstorming

Use the brainstorming approach when:

  • Error messages point to the wrong location
  • The bug could originate from multiple places
  • You’ve already tried the obvious fixes (twice)
  • The problem involves cascading effects across systems

Asking Claude to brainstorm with diagrams forces it to slow down and map the problem systematically. It prevents the merry-go-round where AI keeps trying variations of the same failed approach. By exploring multiple angles first, you often find the root cause on the very first real attempt.

.

.

.

The Line Worth Tattooing Somewhere (Metaphorically)

Here’s what I want you to take away from all of this:

Don’t argue with AI about what it can’t see. Show it.

The next time Claude can’t solve a bug after a few rounds, resist the urge to rephrase your complaint. Don’t add more adjectives. Don’t type in all caps. (I know. I KNOW. But still.)

Instead, ask yourself: “What am I seeing that Claude isn’t?”

Then find a way to bridge that gap—through logs, through diagrams, through screenshots, through any method that gives AI the visibility it needs to actually help you.

.

.

.

Your Next Steps (The Warm and Actionable Version)

For state and timing bugs:

  1. Pause. Take a breath. Step off the merry-go-round.
  2. Ask Claude to add logging that tracks the data flow.
  3. Run your test, copy the console output, paste it back to Claude.
  4. Watch Claude solve in one shot what it couldn’t guess in twenty.

For complex, cascading bugs:

  1. Paste the error message (yes, the whole confusing thing).
  2. Add: “Let’s brainstorm ways to debug this. Use ASCII diagrams.”
  3. Let Claude map the problem before it tries to solve it.
  4. Pick the most surgical approach from the options it generates.

That bug that’s been driving you up the wall? The one Claude keeps missing?

Give it eyes.

Then watch it solve what seemed impossible.

You’ve got this. And now Claude does too.


The 4 Golden Rules of Vibe Coding (A Year-End Manifesto)

You remember early 2025, right?

I was bouncing between ChatGPT Pro, Claude web, and Cursor like a pinball with a deadline. Copy from o1 pro. Paste into my editor. Fix the bug it introduced. Pray it works. Try Cursor for a second opinion. Watch it rewrite my entire file when I asked for one measly line.

Rinse. Repeat. Question your life choices.

(We’ve all been there. And if you say you haven’t, well, I’m not sure I believe you.)

Then May hit. Anthropic added Claude Code to their Max plan—same $200/month I was already burning on ChatGPT Pro, but now I could stop copy-pasting and start orchestrating.

That shift changed everything.

Here’s the thing: I wrote 30+ articles this year documenting every breakthrough, every spectacular failure, every “wait, that’s how it’s supposed to work?” moment. If you only read one piece from me in 2025—make it this one.

What follows are the 4 immutable laws of Vibe Coding I discovered this year. They turned chaotic AI sessions into systematic, predictable wins. Once you see them, you can’t unsee them.

Ready? Let’s go.

.

.

.

Rule #1: The Blueprint Is More Important Than The Code

Let me tell you about the single biggest mistake I see developers make.

They type “build me a task management app” and hit Enter. Claude generates code. Components. Database schemas. Authentication logic.

And then… it’s nothing like what they imagined.

They blame the AI. “It hallucinated again.”

But here’s what I’ve learned after shipping dozens of projects with Claude Code: hallucinations are usually just ambiguity in your prompt. That’s it. That’s the secret nobody wants to admit.

AI is a terrible architect. Give it vague instructions, and it fills in the blanks with whatever patterns it’s seen most often. (Which, spoiler alert, aren’t YOUR patterns.)

But AI is an amazing contractor.

Give it clear blueprints—specific requirements, explicit constraints, visual references—and it executes with surgical precision. Like a really talented carpenter who just needs you to stop saying “make it nice” and start handing over actual measurements.

Infographic titled "THE BLUEPRINT SPECTRUM" comparing two approaches to using AI for software development. The left side, under the heading "VAGUE PROMPT," shows the input "Build me a task app" leading to a box labeled "AI GUESSES" with a brain emoji. This results in a red box with cross marks (❌) listing failures: "Wrong auth," "Wrong UI," "Wrong schema," and "3 rewrites." Below this, an orange pill-shaped box indicates a "Success Rate: ~30%" with a warning emoji (⚠️). The right side, under the heading "DETAILED BLUEPRINT," shows the input "PRD + ASCII Wireframes + User Stories + Constraints" leading to a box labeled "AI EXECUTES" with a robot arm emoji. This results in a green box with checkmarks (✅) listing successes: "Exact auth," "Exact UI," "Exact schema," and "First try." Below this, a blue pill-shaped box indicates a "Success Rate: 97%" with an upward trend arrow emoji (📈). A vertical blue line separates the two comparison sections.

The technique: Interview yourself first

Instead of asking Claude to “build me an app,” I use a brainstorming prompt (inspired by McKay Wrigley and Sabrina Ramonov) that flips the entire script.

The AI interviews me.

  • “What’s the core problem this solves?”
  • “Who uses it?”
  • “What does the main screen look like?”
  • “What happens when the user clicks X?”

By the time I’ve answered those questions, I’ve got a Product Requirements Document. Not AI-generated slop—my vision, clarified.

Claude becomes the junior dev who asks great questions before writing a single line of code. I stay the architect who actually understands what we’re building.

(This is the way it should be.)

The secret weapon: ASCII wireframes

Text descriptions get misinterpreted. Every. Single. Time.

You say “a sidebar with navigation.” Claude hears “full-width hamburger menu.”

So I started including ASCII art wireframes in my prompts:

+------------------+---------------------------+
|   SIDEBAR        |      MAIN CONTENT         |
|  [Dashboard]     |                           |
|  [Projects]      |   +-------------------+   |
|  [Settings]      |   |   Project Card    |   |
|                  |   +-------------------+   |
|                  |   +-------------------+   |
|                  |   |   Project Card    |   |
+------------------+---------------------------+

Sounds primitive, right? Almost embarrassingly low-tech.

The results say otherwise.

When I started including visual plans, my first-try success rate hit 97%. Claude understood layout and hierarchy immediately. No more “that’s not what I meant” rewrites. No more three rounds of “closer, but still wrong.”

👉 The takeaway: Stop typing code and start drawing maps. The blueprint is where the real work happens.


.

.

.

Rule #2: Separate The “Thinker” From The “Builder”

At the beginning, I was using Claude Code for everything.

Planning. Building. Reviewing. Debugging.

One model to rule them all.

And it almost worked.

Almost.

But I kept running into the same problems. Claude would rewrite perfectly good code. Add complex abstractions I never asked for. Solve a simple bug by restructuring half my app.

I asked for email OTP login. I got a 12-file authentication framework.

I asked to fix a type error. Claude decided my entire architecture was wrong.

(It wasn’t. I promise you, it wasn’t.)

The discovery: Specialized roles

Then I stumbled onto a workflow that changed everything—and honestly, I felt a little silly for not seeing it sooner.

Use one model to think. Use another to build.

For me, that’s GPT-5/Codex (The Thinker) and Claude Code (The Builder).

Codex asks clarifying questions. It creates comprehensive plans. It reviews code like a senior engineer who’s seen every possible edge case and still remembers them all.

Claude Code executes. Fast. Reliably. It handles files, terminal commands, and edits without wandering off into philosophical debates about code architecture.

Together? Magic.

The review loop

The workflow looks like this:

  1. Plan (Codex): Describe what I want to build. Codex asks questions, creates a detailed implementation plan.
  2. Build (Claude Code): Feed the plan to Claude. Let it execute.
  3. Review (Codex): Paste the implementation back to Codex. It checks against the original plan, catches bugs, finds edge cases.
Infographic titled "THE THINKER-BUILDER LOOP" with a brain emoji and a robot arm emoji. The diagram shows a cyclical workflow between two AI models. It starts with a box labeled "GPT-5 / CODEX (Thinker)" pointing down to "STEP 1: Plan" ("Ask questions", "Create spec"), which leads to a box labeled "CLAUDE CODE (Builder)". Claude Code then points down to "STEP 2: Build" ("Execute the plan", "Files, commands"). A feedback loop points from Claude Code back up to GPT-5 with the text "Catches bugs single-model workflows miss." Another arrow from Claude Code points up to "STEP 3: Review" ("Does this match the plan?"), which then points back to the GPT-5 box, completing the loop. At the bottom, an orange banner summarizes "THE RESULT: Specialized models = Specialized results," with the comparison "7 hours of solo debugging ✅ → 27 minutes with the loop ⚡".

That third step—the review loop—catches issues that single-model workflows miss every time. EVERY time.

Taming the overengineering monster

Claude has a tendency to overcomplicate. It’s well-documented at this point. (If you’ve used it for more than a week, you know exactly what I’m talking about.)

My fix? The Surgical Coding Prompt.

Instead of “add this feature,” I tell Claude:

“Analyze the existing patterns in this codebase. Implement this change using the minimal number of edits. Do not refactor unless explicitly asked. Show me the surgical changes—nothing more.”

From 15 files to 3 files. From 1000+ lines to 120 lines.

Same functionality. 90% less complexity.

👉 The takeaway: Treat your AI models like a team, not a Swiss Army knife. Specialized roles produce specialized results.


.

.

.

Rule #3: Don’t Just Prompt—Teach “Skills”

Here’s a question that haunted me for months:

“Why do I keep explaining the same patterns over and over?”

Every new project, I’d spell out my authentication approach. My database schema conventions. My error handling patterns. Every. Single. Time.

Claude would forget by the next session. Sometimes by the next prompt.

I was treating AI like a goldfish with a keyboard.

(No offense to goldfish. They’re trying their best.)

The “I know kung fu” moment

Then Anthropic launched Claude Skills—and everything clicked.

Skills let you package your coding patterns into reusable modules. Instead of explaining “here’s how I do authentication” for the 47th time, you create an auth-skill. Enable it, and Claude instantly knows your entire implementation.

The exact patterns. The exact folder structure. The exact error messages.

Every project uses the same battle-tested approach. Zero drift. Zero “well, last time I used a different library.”

It’s like downloading knowledge directly into Claude’s brain.

Matrix-style. (Hence the name.)

Building your first skill

The process is stupidly simple:

  1. Take code that already works in production
  2. Document the patterns using GPT-5 (it’s better at documentation than execution)
  3. Transform that documentation into a Claude Skill using the skill-creator tool
  4. Deploy to any future project

The documentation step matters. GPT-5 creates clean, structured explanations of your existing implementations. Claude Skills uses those explanations to replicate them perfectly.

The compound learning effect

Here’s where it gets really interesting.

I built an Insights Logger skill that captures lessons while Claude codes. Every architectural decision, every weird bug fix, every “oh that’s why it works that way” moment—automatically logged.

At the end of each session, I review those insights. The good ones get promoted to my CLAUDE.md file—the permanent knowledge base Claude reads at the start of every project.

Infographic titled "THE COMPOUND LEARNING PIPELINE". The top shows four columns labeled "SESSION 1", "SESSION 2", "SESSION 3", and "SESSION N". Each session has a downward arrow pointing to a box labeled "Insight Logger" with a brain emoji. Arrows from all four "Insight Logger" boxes converge and point downwards into a large central blue box titled "CLAUDE.md (Permanent Knowledge)". Inside this box, a bulleted list with emojis includes: "Auth patterns from Session 1 🔒", "Auth patterns from Session 2 💾", "Database quirks from Session 2 💻", "API fixes from Session 3 🔌", and "...compounding over time 📈". A final arrow points downwards from the "CLAUDE.md" box to a large orange callout box at the bottom with the title "EVERY NEW SESSION STARTS SMARTER". Text inside reads: "Claude reads CLAUDE.md → Already knows your patterns, quirks, and lessons 🚀". The flow illustrates how insights from multiple sessions are collected and stored to improve future interactions.

Each coding session builds on the last. Compound learning, automated.

👉 The takeaway: Prompting is temporary. Skills are permanent. If you’re explaining something twice, you’re doing it wrong.


.

.

.

Rule #4: Friction Is The Enemy (So Automate It Away)

Let me describe a scene you’ll recognize.

You’re deep in flow state. Claude Code is humming along. Building components, wiring up APIs, making real progress.

And then:

Allow Claude to run `npm install`? [y/n]

You press Enter.

Allow Claude to run `git status`? [y/n]

Enter.

Allow Claude to run `ls src/`? [y/n]

Enter. Enter. Enter. Enter. Enter.

By prompt #47, you’re not reading anymore. You’re a very tired seal in a circus act nobody asked for.

(Stay with me on this metaphor—it’s going somewhere.)

Anthropic calls this approval fatigue. Their testing showed developers hit it within the first hour of use.

And here’s the terrifying part: the safety mechanism designed to protect you actually makes you less safe. You start approving everything blindly. Including the stuff you should actually read.

An infographic contrasting the two security models. The "APPROVAL FATIGUE TIMELINE": hour 1, carefully reading prompts; hour 2, starting to skim; hour 3, reflexively pressing Enter; hour 4, approving anything blindly. The result: a malicious postinstall script steals your SSH keys, and you approved it without reading. The "SANDBOX PROTECTION MODEL": system paths (~/.ssh/, ~/.aws/, ~/.bashrc) are blocked at the OS-kernel level, invisible to Claude, while the project directory (src/, components/, package.json) gets full read/write autonomy with zero prompts. The result: a prompt injection tries to read ~/.ssh/id_rsa, and the kernel says no.

The sandbox solution

Claude Code’s sandbox flips the entire model.

Instead of asking permission for every tiny action, the sandbox draws clear boundaries upfront. Work freely inside them. Get blocked immediately outside them.

On Linux, it uses Bubblewrap—the same tech powering Flatpak. On macOS, it’s Seatbelt—the same tech restricting iOS apps.

These boundaries are OS-enforced. Prompt injection can’t bypass them.

Claude can only read/write inside your project directory. Your SSH keys, AWS credentials, shell config? Invisible. Network traffic routes through a proxy allowing only approved domains.

You run /sandbox, enable auto-allow mode, and suddenly every sandboxed command executes automatically. No prompts. No friction. No approval fatigue.

The 84% reduction in permission prompts? Nice. The kernel-level protection that actually works? Essential.

Parallel experimentation with Git Worktrees

Here’s another friction point that kills vibe coding: fear of breaking the main branch.

My fix: Git Worktrees with full isolation.

Standard worktrees share your database. They share your ports. Three AI agents working on three features leads to chaos. (Ask me how I know.)

I built a tool that gives each worktree its own universe. Own working directory. Own PostgreSQL database clone. Own port assignment. Own .env configuration. Now I run three experimental branches simultaneously. Let three Claude instances explore three different approaches. Pick the winner. Delete the losers.

No conflicts. No fear. No “let me save my work before trying this crazy idea.”

👉 The takeaway: Safe environments allow for dangerous speed. Eliminate friction, and experimentation becomes free.


.

.

.

The Synthesis: What Separates Hobbyists From Shippers

These 4 rules are what separate “people who play with AI” from “people who ship software with AI.”

  • Rule #1: The blueprint is more important than the code.
  • Rule #2: Separate the thinker from the builder.
  • Rule #3: Don’t just prompt—teach skills.
  • Rule #4: Friction is the enemy.

Each rule builds on the last.

Clear blueprints feed into specialized models. Specialized models benefit from reusable skills. Reusable skills only matter if friction doesn’t kill your flow.

It’s a system. Not a collection of random tips.

Where to start

Don’t try to implement all four at once.

That’s a recipe for burnout.

  1. Start with Rule #4. Enable the sandbox. Regain your sanity. Stop being a tired circus seal.
  2. Then move to Rule #1. Before your next feature, write the PRD first. Interview yourself. Draw the ASCII wireframe.
  3. Rule #2 and Rule #3 come naturally after that. You’ll feel the pain of overengineering (and want specialized roles). You’ll get tired of repeating yourself (and want skills).

The system reveals itself when you need it.

Your challenge for 2026

Pick one project you’ve been putting off. Something that felt too complex for AI assistance.

  • Apply Rule #1: Write the blueprint first. ASCII wireframes and all.
  • Apply Rule #4: Set up the sandbox before you start.

Then let Claude execute.

Watch what happens when AI has clear boundaries and clear instructions. Watch how different it feels when you’re orchestrating instead of babysitting.

What will you build first?

Here’s to an even faster 2026.

Now go ship something.


This post synthesizes a year’s worth of vibe coding experimentation. Browse the full archive to dive deeper into any technique—from CLAUDE.md setup to sub-agent patterns to WordPress automation.


Claude Skills Part 2: How to Turn Your Battle-Tested Code Into a Reusable Superpower

You’ve built the perfect authentication system.

It took months. Every edge case handled. Every security hole plugged. Production-tested across three different apps.

And now you’re starting project number four.

Time to rebuild it. Again. From scratch.

Because that’s how AI works, right? It gives you a solution, not your solution.

Wrong.

Last week, I showed you how Claude Skills changed everything – letting you replicate YOUR exact patterns across every project.

Today, I’m going to show you exactly how to create your own Claude Skills.

By the end of this article, you’ll know how to turn any feature into a reusable skill that Claude Code can deploy perfectly every time.

.

.

.

The Secret: It’s Not About Code, It’s About Documentation

Here’s what most developers get wrong about Claude Skills.

They think it’s about copying code files. Dumping your lib/auth folder into a skill and calling it done.

That’s not how it works.

Claude Skills aren’t code repositories.

They’re implementation guides that teach Claude your specific patterns, your architecture decisions, your way of solving problems.

And the key to creating a powerful skill?

Comprehensive documentation that captures not just WHAT your code does, but HOW and WHY it works.

Let me show you exactly how I turned my authentication system into the Claude Skill I demonstrated in Part 1.

.

.

.

Step 1: Let GPT-5 Document Your Implementation (10 Minutes)

This is counterintuitive, but stay with me.

I don’t use Claude to document my Claude Skills. I use GPT-5.

Why? Because GPT-5 is meticulous. It’s the senior architect who notices every pattern, every decision, every subtle implementation detail.

Here’s my exact process:

Initial prompt to GPT-5 asking it to analyze the authentication codebase

I give GPT-5 this prompt:

I want to update the authentication implementation docs at 
#file:authentication.md to match the current implementation of authentication for 
this app.

Read the codebase, analyze how this app implemented the authentication, then 
update the docs.

Ask me clarifying questions until you are 95% confident you can complete this task 
successfully.

a. If the question is about choosing different options, please provide me with a list 
of options to choose from. Mark the option with a clear label, like a, b, c, etc.
b. If the question needs custom input that is not in the list of options, please ask me
to provide the custom input.

Always mark each question with a number, like 1/, 2/, 3/, etc. so that I can easily 
refer to the question number when I answer.

For each question, add your recommendation (with reason why) below each 
options. This would help me in making a better decision.

Notice the key elements:

  • 95% confidence threshold (forces thoroughness)
  • Structured question format (speeds up the process)
  • Recommendations included (leverages GPT-5’s analysis)
GPT-5 analyzing the codebase, showing it reading files and understanding patterns

Watch as GPT-5 systematically explores:

  • Authentication routes (/app/api/auth/**)
  • Library files (auth.ts, jwt.ts, rate-limit.ts)
  • Middleware implementation
  • Database schema
  • Environment variables

GPT-5 asks targeted questions:

GPT-5 asking clarifying questions about CSRF protection, hCaptcha, and other implementation details

I answer with just the option letters.

Me answering with simple "1/a, 2/a, 3/a" format

No lengthy explanations needed. GPT-5 already understands my intent.

The result?

GPT-5 compiling the comprehensive authentication documentation

A complete authentication implementation guide covering:

  • System architecture
  • Security measures (CSRF, rate limiting, audit logging)
  • Database schema with relationships
  • API endpoints with exact paths
  • JWT token strategy with sessionVersion
  • Environment variables with defaults
  • Migration checklist for new projects

302 lines of detailed documentation. Every decision documented. Every pattern explained.

Time spent: 10 minutes.

Now, you might notice something different in my screenshots – I’m using GPT-5 in GitHub Copilot instead of my usual Codex.

The reason?

I’d hit my Codex weekly limits when writing this. (Yes, even I burn through those limits when I’m deep in development mode.)

But here’s what I discovered: GPT-5 in GitHub Copilot is an excellent substitute for Codex. In terms of performance – especially when it comes to analyzing codebases – I honestly can’t tell the difference.

Same meticulous analysis.
Same comprehensive documentation.
Same quality output.

.

.

.

Step 2: Create ASCII Wireframes for the UX Flow (5 Minutes)

Here’s where most skill creators stop. They have the backend documentation.

But Claude Skills need to understand the FULL implementation – including the UI.

This is where ASCII Wireframes become your secret weapon.

I ask GPT-5:

Asking GPT-5 to create ASCII wireframes

Why ASCII instead of HTML mockups?

HTML mockup for a login page: ~500 lines, ~15,000 tokens.
ASCII wireframe for the same page: ~50 lines, ~1,500 tokens.

Same information. 90% fewer tokens.

GPT-5 creating comprehensive ASCII wireframes document

GPT-5 creates wireframes for every screen:
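The generated document itself isn’t reproduced here, but the flavor is roughly this (an illustrative sketch, not the actual output):

+--------------------------------------+
|               SIGN IN                |
|                                      |
|  Email   [_________________________] |
|                                      |
|          [ Send OTP code ]           |
|                                      |
|  Enter the 6-digit code we emailed:  |
|  [_][_][_][_][_][_]    [ Verify ]    |
|                                      |
|  (error toast appears top-right)     |
+--------------------------------------+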

Every interaction mapped. Every flow documented. Claude will know EXACTLY what UI to build.

Total documentation time: 15 minutes.

.

.

.

Step 3: Transform Documentation Into a Claude Skill (5 Minutes)

Now we have comprehensive documentation and wireframes. Time to turn them into a Claude Skill.

First, you need the skill-creator skill itself. Get it from Anthropic’s skills repository.

Project structure showing skill-creator in .claude/skills folder and documentation in notes folder

My project structure:

.claude/
  skills/
    skill-creator/     # The skill that creates skills
notes/
  authentication.md            # Our documentation
  authentication_wireframes.md # Our wireframes

Start a new Claude Code session and ask:

Asking Claude "what are the available skills?"

Claude shows the available skills:

Now the magic moment:

Asking Claude to create authentication skill
Please use the skill-creator skill to create a new skill that shows how to set up
authentication exactly like this app does. Please refer to the documentation
@.notes/authentication.md and wireframes @.notes/authentication_wireframes.md.

Watch as Claude:

  1. Reads the skill-creator instructions
  2. Explores your authentication codebase
  3. Analyzes your documentation
  4. Studies the wireframes
Claude exploring the codebase to understand the authentication implementation

It’s not just copying files.

It’s understanding your implementation and transforming it into teachable instructions.

Claude creating the authentication-setup skill structure

Claude creates a complete skill structure:

authentication-setup/
  ├── SKILL.md                    # Main skill instructions
  ├── scripts/                    # Initialization scripts
  ├── references/                 # Your documentation
  │   ├── database_schema.md      # Prisma schema reference
  │   ├── environment_variables.md # Required env vars
  │   ├── dependencies.md         # NPM packages needed
  │   └── implementation_guide.md # Step-by-step guide
  └── assets/                     # Templates and examples
      └── migration_checklist.md  # Deployment checklist

Key insight: The skill doesn’t contain your actual code files. It contains instructions for recreating your patterns.
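For a sense of what that looks like, here’s an illustrative SKILL.md skeleton (not the generated one; the name and description frontmatter fields are the ones Anthropic’s skill format expects):

---
name: authentication-setup
description: Set up email OTP authentication (JWT sessions, CSRF, rate limiting) following these battle-tested patterns.
---

# Authentication Setup

1. Install the packages listed in references/dependencies.md
2. Apply the Prisma models in references/database_schema.md
3. Create the lib/ and app/api/auth/ files per references/implementation_guide.md
4. Configure the variables in references/environment_variables.md
5. Walk through assets/migration_checklist.md before shipping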

Claude packages everything into authentication-setup.zip:

Claude showing the packaged skill is ready

Time to create skill: 5 minutes.

.

.

.

Step 4: Deploy Your Skill to Another Project (2 Minutes)

This is where the payoff happens.

Uploading the skill to a new Next.js project's .claude/skills folder

Take your authentication-setup.zip and extract it to any project:

new-project/
  .claude/
    skills/
      authentication-setup/  # Your skill goes here

Start Claude Code in the new project. Type:

In new project, typing "/authentication-setup" to trigger the skill

Claude immediately recognizes the skill:

Claude recognizing and loading the authentication-setup skill
Claude implementing the complete authentication system

Claude doesn’t just copy files. It:

  • Analyzes your current project structure
  • Adapts to your existing patterns
  • Integrates with your current code
  • Maintains your architectural decisions

Your exact authentication system. In a completely new project. In under 10 minutes.

.

.

.

The Power of Compound Knowledge

Think about what just happened.

You took months of refinement, every bug fix, every security improvement, every UX enhancement…

And packaged it into 20 minutes of documentation work.

Now that skill can be deployed to:

  • Every new project you start
  • Every team member’s workspace
  • Every client implementation
  • Every prototype that needs auth

The math is staggering:

1 skill × 10 projects = 10 hours saved
10 skills × 10 projects = 100 hours saved
Your entire toolkit as skills = career-changing productivity

But here’s what nobody talks about…

.

.

.

Why GPT-5 + Claude Skills Is The Ultimate Combo

Using GPT-5 to document for Claude Skills isn’t random. It’s strategic.

GPT-5’s superpower: Meticulous analysis and comprehensive documentation
Claude’s superpower: Following detailed instructions perfectly

When you combine them:

  1. GPT-5 extracts every pattern and decision from your code
  2. Claude Skills preserves that knowledge permanently
  3. Claude Code implements it flawlessly every time

It’s like having a senior architect (GPT-5) document your best practices, then having infinite junior developers (Claude Code instances) who can implement those practices perfectly.

No knowledge loss.
No pattern drift.
No “I think I did it differently last time.”

.

.

.

Common Mistakes to Avoid

Mistake #1: Skipping the Documentation Phase

“I’ll just copy my code files into the skill.”

Wrong.

Skills need context, not just code.

Without documentation, Claude won’t understand your architectural decisions.

Mistake #2: Forgetting the UI Wireframes

Backend-only skills create Frankenstein features.

Same logic, completely different UI.

Always include wireframes.

Mistake #3: Not Testing in a Clean Project

Always test your skill in a fresh project.

That’s where you’ll discover missing dependencies or assumptions.

.

.

.

Your Skills Library Starts Today

Here’s your action plan:

1. Identify Your Most Reused Feature

What do you build in every project?

  • Authentication system?
  • Admin dashboard?
  • Payment integration?
  • File upload handling?

2. Document It With GPT-5 (15 minutes)

Use my exact prompt. Let GPT-5 extract every pattern.

3. Create ASCII Wireframes (5 minutes)

Map the UI flow. Every screen. Every interaction.

4. Generate The Skill (5 minutes)

Use skill-creator. Let Claude package your knowledge.

5. Test In a New Project

Deploy it. Use it. Refine it.

6. Repeat For Your Next Feature

Build your library one skill at a time.

.

.

.

The Compound Effect Nobody Sees Coming

Every skill you create makes the next project easier.

But here’s what really happens:

Month 1: You create 3 skills (auth, payments, dashboard)
Month 2: You create 5 more (file upload, search, notifications…)
Month 3: You realize you can build entire apps in hours

By month 6?

You’re not coding anymore. You’re orchestrating.

  • “Use authentication-setup skill.”
  • “Use payment-processing skill.”
  • “Use admin-dashboard skill.”

Complete applications assembled from your battle-tested components.

Each implementation identical to your best work.

No quality degradation.
No pattern drift.
No forgotten edge cases.

This isn’t the future of development. It’s available right now.

.

.

.

Part 3 Preview: Teaching Claude Any Library

Next week, I’ll show you something even more powerful.

How to create Claude Skills that teach Claude to perfectly integrate ANY library or SDK into your apps.

Imagine:

  • “Use the stripe-integration skill” → Your exact Stripe patterns
  • “Use the websocket-setup skill” → Your real-time architecture
  • “Use the testing-harness skill” → Your testing methodology

Not generic implementations. YOUR implementations.

But for now…

Open that project with your best authentication system.

Document it with GPT-5.

Turn it into a skill.

Watch as 10 minutes of work today saves you 10 hours next month.

What feature will you turn into a Claude Skill first?

Stop rebuilding.

Start packaging.

Now.


P.S. – Since creating my authentication-setup skill two weeks ago, I’ve deployed it to 6 different projects. Total time saved: 14 hours. Total consistency: 100%. Every deployment identical to my best implementation. That’s the power of turning your code into Claude Skills.

P.P.S. – The skill-creator skill itself is open source. You can find it at github.com/anthropics/skills. But the real magic? It’s in the skills YOU create from YOUR battle-tested code.


Claude Skills: Your “I Know Kung Fu” Moment Has Arrived (Part 1 of 3)

You’re building your fifth Next.js app this month.

Time for authentication. Again.

You fire up Claude Code. “Build me email OTP authentication with JWT, password management, rate limiting, CSRF protection…”

Claude starts coding.

It looks good.

But wait – the JWT implementation is different from your last project. The rate limiting uses a different pattern.

The password hashing… is this bcrypt or argon2 this time?

Every. Single. App.

A different variation of the same feature.

Not because you want variety.

But because that’s how AI works – it gives you a solution, not your solution.

Until now.

.

.

.

The Problem That’s Been Staring Us in the Face

We’ve all been here.

You’ve built authentication for your SaaS app.

It’s perfect.
Production-tested.
Battle-hardened.

Next project comes along.

Same authentication needs.

But when you ask Claude Code to build it:

  • Different JWT strategy (why are we using RS256 now?)
  • Different database schema (wait, where’s the sessionVersion field?)
  • Different security patterns (no rate limiting on the OTP endpoint?)
  • Different file structure (why is auth logic scattered across 15 files?)

Sure, it works.

But it’s not your authentication system. The one you’ve refined over months. The one with all the edge cases handled.

It’s like having a master chef forget their signature recipe every time they walk into a new kitchen.

The real tragedy?

You have the perfect implementation.

It’s sitting right there in your last project. But there’s been no way to teach Claude Code your patterns, your architecture, your way of doing things.

Until Claude Skills changed everything.

.

.

.

Enter Claude Skills: The “I Know Kung Fu” Moment

Remember that scene in The Matrix?

Neo downloads kung fu directly into his brain. Instant mastery. No training montage required.

That’s Claude Skills.

But instead of martial arts, you’re uploading your battle-tested code patterns directly into Claude’s knowledge base.

Here’s what just became possible:

Before Claude Skills:

  • “Build authentication” → Random implementation each time
  • Hours of tweaking to match your standards
  • Inconsistent patterns across projects
  • The eternal “wait, how did I do this last time?” dance

After Claude Skills:

  • “Use the authentication-setup skill” → Your exact implementation
  • Same patterns, same structure, same security measures
  • 10 minutes from zero to production-ready auth
  • Perfect consistency across every project

This isn’t just about saving time. It’s about something much bigger.

.

.

.

The Authentication Skill That Changes Everything

Let me show you exactly what I mean.

I built a Next.js boilerplate with a complete authentication system:

  • Email OTP with 6-digit codes via Resend
  • JWT sessions (access + refresh tokens)
  • Password management with bcrypt
  • CSRF protection on all mutations
  • Rate limiting per email/IP
  • Audit logging for security events
  • Whitelist-only email access
  • PostgreSQL + Prisma ORM

This wasn’t a weekend hack.

This was weeks of refinement. Every edge case handled. Every security hole plugged.

Then I turned it into a Claude Skill.

Now watch what happens.

Claude listing available skills when asked

I simply ask: “What are the available skills?”

Claude shows me my arsenal:

  • authentication-setup – Complete auth system with everything I mentioned
  • multi-tenant-setup – Organizations, memberships, invitations
  • dashboard-shell-setup – Production-ready dashboard with resizable sidebar
  • ai-sdk-v5 – AI integrations for OpenAI, Anthropic, Gemini

Each skill is a complete feature set, ready to deploy.

.

.

.

The 10-Minute Authentication Implementation

Here’s where it gets interesting.

I tell Claude: “Please use the authentication-setup skill to set up the email OTP authentication for this app.”

I provide my PostgreSQL connection string and mention I need dev-only password login for testing.

Claude recognizes the skill and asks permission.

This is important – you maintain control over what gets implemented.

Claude asking permission to use the skill

Watch as Claude:

  1. Examines your current project structure
  2. Reads the skill instructions
  3. Checks for existing code to avoid conflicts
Claude initializing the skill and exploring the codebase

This isn’t blind copy-paste. It’s intelligent integration.

Here’s the genius part.

Claude asks configuration questions:

  • Want to use shadcn/ui components or plain HTML?
  • Have a Resend API key ready?
  • Which additional features to include?

These aren’t random questions.

They’re part of the skill definition, ensuring the implementation matches YOUR specific needs for THIS project.

I select my preferences.

User answering the configuration questions

Notice how Claude provides options but also accepts custom inputs. The skill is flexible, not rigid.

Then Claude presents the complete implementation plan:

The plan includes:

  • Phase 1: Dependencies & Infrastructure
  • Phase 2: Core Library Files
  • Phase 3: API Routes
  • Phase 4: Middleware & Route Protection
  • Phase 5: UI Implementation
  • Phase 6: Database & Scripts
  • Phase 7: Configuration & Documentation

Every. Single. Detail. Planned. Before writing any code.

.

.

.

Watching the Magic Happen

With the plan approved, Claude goes to work.

Installing dependencies and setting up Prisma

First, dependencies. Notice it installs exactly what my authentication system needs:

  • @prisma/client for database
  • bcrypt for password hashing
  • jose for JWT operations
  • resend for emails
  • zod for validation
  • @zxcvbn-ts for password strength

Not a generic “auth package” in sight. These are my chosen tools.
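
For reference, the by-hand equivalent is a single install command. A sketch, assuming the password-strength package is the scoped @zxcvbn-ts/core:

npm install @prisma/client bcrypt jose resend zod @zxcvbn-ts/core
npm install -D prisma   # CLI for migrations and client generation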

Creating the database schema and migrations

The Prisma schema matches my pattern exactly:

  • User model with sessionVersion for instant session invalidation
  • OtpToken model with attempt tracking
  • AuditLog for security events

This isn’t Claude’s guess at a schema.

This is MY schema.

Building the authentication library files

Watch as it creates:

  • lib/auth.ts – Core authentication logic
  • lib/csrf.ts – CSRF protection
  • lib/rate-limit.ts – Rate limiting with database tracking
  • lib/jwt.ts and lib/jwt-edge.ts – Separate JWT handlers for Node and Edge runtime
Creating API routes with proper structure

326 lines in lib/auth.ts alone. Every function, every pattern, exactly as I designed it.

The API routes follow my exact pattern:

app/api/auth/
├── request-otp/route.ts
├── verify-otp/route.ts
├── dev-signin/route.ts
├── signout/route.ts
├── refresh/route.ts
└── profile/
    ├── set-password/route.ts
    └── change-password/route.ts

Not scattered.

Not random.

My structure.

Implementing middleware and protected routes

Edge middleware that:

  • Verifies JWT without database calls (performance!)
  • Protects /dashboard/* and /settings/*
  • Redirects with ?next= parameter for smooth auth flow

The login page with:

  • Tabbed interface (OTP + Dev Password)
  • React Hook Form + Zod validation
  • Sonner toast notifications
  • Proper loading states
More UI implementation details

Protected layouts, profile pages, dashboard – all following the same patterns.

Claude completing the implementation

Done.

In under 10 minutes of actual implementation time.

The complete getting started guide

Claude even provides a getting started guide:

  1. Create superadmin: SEED_EMAIL=you@example.com npm run seed:superadmin
  2. Start dev server: npm run dev
  3. Visit http://localhost:3000
  4. Sign in with your email

Everything documented. Everything working.

.

.

.

The Proof: Side-by-Side Comparison

But does it actually match the original?

Login page comparison - original boilerplate (top) vs Claude Skills implementation (bottom)

Look at those login pages.

The structure, the tabs, the form layout – virtually identical.

Some minor styling differences, but the core implementation?

Exact match.

Profile page comparison - original (left) vs Claude Skills implementation (right)

The profile pages tell the same story. Same sections, same password management, same session handling.

Here’s the kicker: The Claude Skills version actually improved on my original in some places. Cleaner code organization. Better error messages. More consistent styling.

The student became the master.

Then taught the next student to be even better.

Want techniques like these weekly?

Join The Art of Vibe Coding—short, practical emails on shipping with AI (without the chaos).

No spam. Unsubscribe anytime. Seriously.

.

.

.

Why This Changes Everything

This isn’t just about authentication.

Think bigger.

Every feature you’ve perfected can become a skill:

  • Your payment integration with Stripe
  • Your real-time notification system
  • Your file upload handling
  • Your admin dashboard layout
  • Your API error handling patterns
  • Your testing setup

Build once. Refine until perfect. Convert to skill. Deploy everywhere.
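
If you haven’t peeked inside one yet: a skill is just a folder Claude Code can discover, with a SKILL.md describing when and how to use it. A minimal sketch of the layout (anything beyond SKILL.md is illustrative):

.claude/skills/authentication-setup/
├── SKILL.md      # name + description frontmatter, then step-by-step instructions
└── templates/    # reference files for the skill to copy or adapt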

The compound effect:

1 perfected feature × 10 projects = 10 hours saved
10 perfected features × 10 projects = 100 hours saved
100 perfected features × your entire career = …you do the math

But it’s not just time saved.

It’s consistency gained. It’s quality guaranteed. It’s your personal coding style, preserved and replicated perfectly.

.

.

.

The Hidden Benefits Nobody Talks About

Benefit #1: Your Code Patterns Become Your Coding Standard

No more “wait, which project had the good auth implementation?”

Your Claude Skills ARE your coding standard.

Benefit #2: Onboarding Becomes Trivial

New developer joins your team? “Here are our Claude Skills. Use them.”

Instant consistency.

Benefit #3: You Stop Reinventing Wheels

That perfect rate limiting implementation from 6 months ago?

It’s in your skill.

Ready to deploy.

Benefit #4: Your Skills Get Better Over Time

Find a bug in your auth flow? Fix it in the skill.

Every future project gets the improvement.

Benefit #5: You Can Share (Or Sell) Your Expertise

Those Claude Skills you’ve perfected?

They’re valuable. Share with your team. Open source them. Monetize them.

.

.

.

Your “I Know Kung Fu” Moment Awaits

Here’s what you need to understand:

Claude Skills aren’t just another AI feature. They’re the bridge between “AI that codes” and “AI that codes the way YOU code.”

It’s the difference between:

  • A junior developer googling solutions
  • A senior developer implementing YOUR solutions

Every pattern you’ve perfected.
Every architecture you’ve refined.
Every security measure you’ve battle-tested.

All of it can become a Claude Skill.

All of it can be replicated perfectly across every project you ever build.

The question isn’t whether you should start using Claude Skills.

The question is: which of your battle-tested features will you turn into a skill first?

In Part 2, I’ll show you exactly how to convert your existing codebase into a Claude Skill.

The process is simpler than you think, and the payoff is massive.

But for now, open that project with the perfect authentication system.

The one you’ve been copying manually to every new project.

It’s time to teach Claude YOUR kung fu.


P.S. – That authentication skill I demonstrated? It’s now deployed in 5 different production apps. Same code. Same patterns. Same security. Total implementation time across all 5 apps: under an hour. The old way would have taken days, and each would have been slightly different. That’s the power of Claude Skills.

P.P.S. – Part 2 drops next week: “How to Convert Your Battle-Tested Code Into Claude Skills.” I’ll walk you through turning your existing codebase into reusable skills, with specific examples and templates you can use immediately.

7 min read The Art of Vibe Coding, Claude, Codex, GPT-5, Vibe Coding

Codex plans with ASCII Wireframes → Claude Code builds → Codex reviews

I used to dive straight into The Codex-Claude Code Workflow and watch it build features.

Sometimes it worked brilliantly.

Other times?

The implementation worked but looked nothing like what I imagined.

The problem wasn’t Claude Code.

It was me.

I was expecting it to read my mind about UI layout and catch its own bugs.

Then I added two things to my workflow:

ASCII wireframes and systematic code review.

Now?

97% of my features work perfectly on the first shot.

No back-and-forth debugging. No “that’s not quite right” moments. Just clean, working code that matches my vision.

Let me show you exactly how this works.

.

.

.

The Old Way Was Good. The New Way Is Magic.

Here’s what I used to do:

  1. GPT-5 plans the feature
  2. Claude Code implements it
  3. Hope for the best

It worked.

Kind of.

But UI was always a gamble, and bugs only showed up during testing.

Here’s what I do now:

  1. GPT-5 asks questions until it’s 95% confident
  2. GPT-5 creates ASCII wireframes (visual blueprint, minimal tokens)
  3. Claude Code implements (following both plan AND wireframes)
  4. GPT-5 reviews the git diff (catches bugs before I even test)
  5. Claude Code applies fixes (surgical corrections)

The difference?

Night and day.

.

.

.

Real Example: Building WordPress Collapsible Sections

My WordPress newsletter was getting unwieldy. Long code snippets and prompts made posts hard to read.

Readers had to scroll forever to get through everything.

I needed collapsible sections.

Click to expand when you want details, stay collapsed when you don’t.

Here’s how the new workflow handled it.

Step 1: Setting Up for Success

I hit GPT-5 with my requirements.

But here’s the critical part – I told it explicitly: “DON’T WRITE OR EDIT ANY FILES.”

Initial prompt to GPT-5 requesting planning for collapsible sections feature

Prompt:

__YOUR_INSTRUCTIONS_HERE__

Read the codebase, and help me come up with a plan to implement everything above.

Make sure to include a short description for this plan in paragraph format at the beginning of the plan.

IMPORTANT: DON'T WRITE OR EDIT ANY FILES.

Use web search if you need to find solutions to problems you encounter, or look for the latest documentation.

Ask me clarifying questions until you are 95% confident you can complete this task successfully.

a. If the question is about choosing different options, please provide me with a list of options to choose from. Mark the option with a clear label, like a, b, c, etc.
b. If the question needs custom input that is not in the list of options, please ask me to provide the custom input.

Always mark each question with a number, like 1/, 2/, 3/, etc. so that I can easily refer to the question number when I answer.

For each question, add your recommendation (with reason why) below each options. This would help me in making a better decision.

And I added this game-changer: “Ask me clarifying questions until you are 95% confident you can complete this task successfully.”

Plus numbered questions with lettered options.

No typing paragraphs. Just “1/a, 2/c, 3/b”.

Done.

Step 2: GPT-5 Goes Detective Mode

GPT-5 immediately dove into my codebase:

GPT-5 starting to explore the codebase structure

Then came the questions.

Not random questions.

Strategic ones:

GPT-5 listing detailed questions about implementation choices

Look at those options:

  • Where should content live? (Theme files, CPT, or plugin)
  • How should expand/collapse work? (Native HTML, JS toggle, or hybrid)
  • Content delivery? (Inline, lazy-load, or auto-switch)

Every question had options. With recommendations. With reasons.

This is what thinking looks like.

Step 3: Quick Answers, Deep Impact

I fired back my choices:

My responses to GPT-5's questions

Fast. Decisive. No essays needed.

But GPT-5 wasn’t done:

GPT-5 asking additional clarifying questions

More edge cases.

More decisions.

Every answer shaped the final implementation.

Step 4: The Plan Emerges

After gathering everything, GPT-5 delivered:

GPT-5's finalized implementation plan with all details

This wasn’t a vague outline.

It was a blueprint:

  • Exact file structure
  • CPT registration details
  • Shortcode parameters
  • REST endpoint specs
  • Caching strategy
  • Security measures

Everything. Decided. Documented.

The ASCII Wireframe Revolution

Here’s where things get interesting.

I asked GPT-5: “Based on this plan, give me the UX flow map and screen-by-screen content using ASCII wireframe.”

Asking GPT-5 to create ASCII wireframes

Prompt:

Based on this plan, give me the ux flow map and screen-by-screen content using the ASCII wireframe. Put it at @notes/wireframes.md.

What came next blew my mind:

GPT-5 creating the ASCII wireframes

The Power of ASCII Wireframes

Example ASCII wireframe showing the admin interface layout

Look at this beauty:

ASCII wireframe showing the content editing interface

This isn’t just a sketch.

It’s a contract.

Claude Code sees this and knows EXACTLY what to build.

No interpretation. No guesswork. Just pure clarity.

Why ASCII beats everything else:

  • 10x fewer tokens than HTML
  • Zero ambiguity
  • Instant understanding
  • Easy to modify
  • Works in any terminal
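
If you’ve never seen one, here’s the flavor — a minimal sketch in the spirit of the collapsible-section wireframe, not GPT-5’s actual output:

+------------------------------------------------+
| Next Worktrees Manager                  [Copy] |
|------------------------------------------------|
| #!/usr/bin/env bash                            |
| set -euo pipefail                              |
| ...                                 (fade-out) |
|               [ Show more v ]                  |
+------------------------------------------------+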

Step 5: Claude Code Takes the Wheel

Armed with the plan AND wireframes, I unleashed Claude Code (Sonnet 4.5):

Providing the plan and wireframes to Claude Code

Watch what happened:

Claude Code starting implementation based on plan and wireframes

Claude didn’t hesitate. Didn’t guess. It knew exactly what to build because the wireframes showed it.

Claude Code completing the implementation

Implementation complete. First try.

Step 6: The Code Review That Catches Everything

This is where most devs stop.

Not me.

I fired up Codex CLI’s code review feature:

Custom code review prompt to GPT-5

Prompt:

Please read the git diff, and review the code changes to see if the implementation is correct and follows the plan @notes/plan.md and wireframes @notes/wireframes.md.

GPT-5 went full detective:

GPT-5 starting the code review process
GPT-5's completed code review with findings

Found issues? You bet:

  • P1: Missing lazy-mode fallback for non-JS users
  • P1: Copy button behavior was wrong
  • P2: Preview line count didn’t match CSS
  • Security improvements needed

These weren’t “nice to have” fixes. These were bugs waiting to happen.

Step 7: Surgical Fixes

I handed GPT-5’s review to Claude Code:

Claude applied every fix.

No arguments. No confusion. Just clean corrections.

The End Result: From Concept to Production

Here’s what we built. In one shot. With this workflow.

The WordPress Admin Experience

WordPress admin showing the new Longform Sections custom post type in the menu

Clean integration into WordPress admin. “Longform Sections” sits right where it should, right below Pages.

The Longform Section edit screen with copy shortcode functionality

Look at that meta box on the right. Copy shortcode with one click. Multiple usage examples. All the parameters documented right there.

Every attribute explained:

  • id – Post ID (required)
  • title – Custom title or defaults to post title
  • lines – Preview line count (default 6)
  • mode – smart|inline|lazy
  • expand_text / collapse_text – Customizable labels
  • deep_link – Enable direct linking to sections
  • copy_button – Show copy button for code

No documentation needed. It’s self-explanatory.
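
Dropping a section into a post then looks something like this — a hypothetical example built from the attributes above (the actual shortcode tag depends on the implementation):

[longform_section id="123" title="Next Worktrees Manager" lines="6" mode="smart" copy_button="true"]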

The Frontend Magic

The collapsible section showing a bash script in collapsed state

This is what readers see. Clean. Collapsed. A bash script preview with “Show more” button ready.

The “Next Worktrees Manager” section shows just enough code to give context. The fade-out effect tells readers there’s more. One click to reveal everything.

Click “Show more” and boom – the full script appears.

The expanded state showing the full bash script content

.

.

.

Why This Workflow Changes Everything

97% Success Rate Isn’t Luck

It’s the result of:

  • Clear communication through wireframes
  • Systematic planning with questions
  • Meticulous review before testing
  • Surgical corrections based on feedback

You don’t get 97% success by luck.

You get it by design.

That’s how “good enough” becomes “ships perfectly.”

.

.

.

Your Turn to Level Up

Stop settling for “close enough” implementations.

Stop debugging for hours.

Stop the back-and-forth madness.

Here’s your action plan:

  1. Tonight: Save the workflow prompts
  2. Tomorrow: Try it on one small feature
  3. This week: Build something complex
  4. Next month: Wonder how you ever worked differently

.

.

.

The Bottom Line

I built a production-ready WordPress feature in less than 20 minutes.

Not a prototype. Not “mostly working.” Production-ready.

97% of my features now ship on the first try.

The 3% that don’t? External dependencies. Environment issues. Things no AI can predict.

This workflow transforms you from someone who “uses AI to code” into someone who orchestrates AI to build exactly what you envision.

ASCII wireframes + systematic planning + code review = development superpowers.

What will you build when implementation matches imagination perfectly?


P.S. – This workflow needs GPT-5 (via Codex CLI) for planning/review and Claude Code (Sonnet 4.5) for implementation. Together, they’re unstoppable. Apart, they’re just good. Choose wisely.

5 min read Claude, Codex, The Art of Vibe Coding, Vibe Coding

How I Vibe Code With 3 AI Agents Using Git Worktrees (Without Breaking Anything)

You know that feeling when you’re vibe coding with Claude Code and it suggests a “minor database refactor” – then suddenly your entire local database is corrupted?

Or when you want to test Claude Code’s approach versus Codex’s approach, but switching branches means losing all your test data?

I was so tired of this dance.

So I built a solution.

Next.js Worktrees Manager – a bash script that creates completely isolated development environments with their own PostgreSQL databases.

Let me show you exactly how I vibe code with multiple AI agents simultaneously.

.

.

.

The Problem Nobody Talks About

Here’s what happens every single day when you’re vibe coding:

You’re in the zone with an AI coding agent.

Claude Code, Codex, Cursor, Windsurf – doesn’t matter which.

The agent suggests changes.
You accept them.
Then you realize it modified your database schema.

Now what?

git reset --hard?

Sure, that fixes the code. But your database?

Those migrations already ran.
That test data is gone.
Your local environment is now broken.

The worst part?

You want to compare different AI agents’ implementations.

But without git worktrees, that means:

  • Constantly switching branches
  • Re-running migrations
  • Losing test data
  • Fighting port conflicts
  • Wasting hours on environment management instead of coding

Regular git workflows weren’t built for this level of experimentation.

.

.

.

The Solution: Git Worktrees + True Isolation

Next.js Worktrees Manager does one thing brilliantly:

It extends git worktrees with database isolation.

Each worktree gets:

  • Its own working directory (via git worktrees)
  • Its own PostgreSQL database (cloned from your main)
  • Its own port assignment (run multiple dev servers simultaneously)
  • Its own .env configuration

Git worktrees handle the code isolation. My script handles everything else.
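
To make that division of labor concrete, here’s roughly what one isolated environment amounts to by hand — a sketch with illustrative names; the script adds safety checks, port bookkeeping, and cleanup:

# one isolated environment, manually
git worktree add -b claude-auth worktrees/claude-auth    # separate working directory
createdb -T myapp myapp_claude_auth                      # clone the database from a template
# then point DATABASE_URL in worktrees/claude-auth/.env at myapp_claude_auth
# and run the dev server on its own port: PORT=3001 npm run dev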

.

.

.

Installation (30 Seconds)

git clone https://github.com/nathanonn/next-worktrees-manager.git
cd next-worktrees-manager
chmod +x worktrees.sh

Done.

.

.

.

Let Me Show You The Magic

Say you want three different AI agents to implement authentication.

Here’s the entire process:

./worktrees.sh setup \
  --branches claude-auth,codex-auth,gemini-auth \
  --db-url postgresql://localhost:5432/myapp

Watch what happens:

  1. Creates three worktrees in worktrees/ directory
  2. Clones your database three times
  3. Updates each .env with the correct database URL
  4. Assigns ports 3001, 3002, and 3003

Time elapsed: Less than 60 seconds.

Now run all three simultaneously:

cd worktrees/claude-auth && PORT=3001 npm run dev
cd worktrees/codex-auth && PORT=3002 npm run dev
cd worktrees/gemini-auth && PORT=3003 npm run dev

Open your browser:

  • http://localhost:3001 – Claude’s implementation
  • http://localhost:3002 – Codex‘s implementation
  • http://localhost:3003 – Gemini’s implementation

Test them side-by-side.

Break things.

Each environment is completely isolated.

.

.

.

The Cleanup Is Even Better

Every setup creates a group ID like wt-20251008-191431.

When you’re done experimenting:

./worktrees.sh clean --group wt-20251008-191431

Cleaning up worktrees and databases with a single command

One command. All worktrees deleted. All databases dropped. Your main branch untouched.

It’s like those experiments never happened.

.

.

.

Use Cases That Actually Matter

Testing AI Agent Outputs

Stop wondering which AI agent produces better code. Test them simultaneously:

./worktrees.sh setup \
  --branches gpt5-approach,claude-opus-approach,gemini-3-approach \
  --db-url postgresql://localhost/myapp

Run all three.

See which implementation is cleaner.

Make an informed decision.

Feature Variations

Building multiple payment providers?

Keep them isolated:

./worktrees.sh setup \
  --branches stripe-checkout,paypal-checkout,crypto-checkout \
  --db-url postgresql://localhost/ecommerce

No more commenting out code.

No more environment variable juggling.

Production Debugging

Need to reproduce a production bug safely?

./worktrees.sh setup \
  --branches prod-bug-fix \
  --db-url postgresql://localhost/production_copy \
  --setup-cmd "npm run seed:production"

Break things freely.

Your main environment stays clean.

Team Development

Multiple developers. Same codebase. Zero conflicts:

./worktrees.sh setup \
  --branches alice/feature,bob/feature,charlie/fix \
  --db-url postgresql://localhost/team_db

Everyone gets their own database.

No more “waiting for migrations.”

.

.

.

The Features That Matter

Custom Ports

./worktrees.sh setup \
  --branches v1,v2,v3 \
  --db-url postgresql://localhost/db \
  --start-ports 3000,4000,5000

Auto-Prisma

If you use Prisma, it runs prisma generate automatically:

./worktrees.sh setup \
  --branches test \
  --db-url postgresql://localhost/db \
  --auto-prisma on  # Default

Custom Setup

Need to install packages or seed data?

./worktrees.sh setup \
  --branches experiment \
  --db-url postgresql://localhost/db \
  --setup-cmd "npm install && npm run seed"

Force Recreate

Testing the same branch repeatedly?

./worktrees.sh setup \
  --branches test \
  --db-url postgresql://localhost/db \
  --force

.

.

.

Safety Built In

The script protects you from yourself:

  • Local databases only – Can’t accidentally touch remote databases
  • Clean git check – Won’t create worktrees with uncommitted changes
  • Connection handling – Safely terminates active connections
  • Dry run mode – Preview with --dry-run
  • Confirmation required – Bulk cleanup needs --yes

.

.

.

Performance Reality Check

Max 10 branches per setup.

Why?
PostgreSQL connection limits.
But honestly, if you need more than 10 parallel experiments, you have bigger problems.

Database cloning is instant.

PostgreSQL’s CREATE DATABASE ... TEMPLATE performs a fast file-level copy of the source database — no dump and restore.

Even gigabyte databases clone in seconds.
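
You can verify this outside the script, assuming a local myapp database (the template must have no active connections while it’s being cloned):

psql -c 'CREATE DATABASE myapp_scratch TEMPLATE myapp;'   # file-level copy, no dump/restore
psql -c 'DROP DATABASE myapp_scratch;'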

.

.

.

Your New Daily Commands

# Create experiment
./worktrees.sh setup --branches feat/test --db-url postgresql://localhost/db

# Check status
./worktrees.sh status --group wt-20251008-191431

# Get start commands
./worktrees.sh start --group wt-20251008-191431  

# Clean up
./worktrees.sh clean --group wt-20251008-191431

# Nuclear option
./worktrees.sh clean --all --yes

.

.

.

Who Actually Needs This

You need this if you:

  • Work with AI coding agents daily
  • Test multiple implementations regularly
  • Value experimental freedom
  • Hate database conflicts
  • Want true environment isolation

You don’t need this if you:

  • Never experiment
  • Work on simple CRUD apps
  • Don’t use PostgreSQL
  • Enjoy manual environment management

.

.

.

The Bottom Line

Stop tiptoeing around your development database.

Stop losing hours to environment setup.

Stop choosing between experimentation and stability.

This script gives you true isolation in under a minute. Break things. Test wildly. Your main branch stays untouched.

Get the script: github.com/nathanonn/next-worktrees-manager

Your future self – vibe coding with three AI agents simultaneously using git worktrees – will thank you.


P.S. – I’ve used this script to test different AI agent implementations. Not once has my main database been corrupted. That peace of mind alone makes vibe coding actually enjoyable again.

8 min read The Art of Vibe Coding, Codex, Vibe Coding

The Latest Codex CLI Commands That Will Save Your Sanity (And Your Rate Limits)

You fire up Codex CLI for a quick task.

Five minutes later, you’re rate-limited.

“When can I code again?” you wonder, staring at the unhelpful error message.

Or maybe you’re trying to get structured JSON from exec, but it returns a wall of text instead.

Or you want Code Review feedback, but you’re not sure how to get the most out of /review.

Sound familiar?

Codex CLI v0.41 just shipped with fixes for all of these headaches.

But here’s the thing – most developers don’t even know these features exist.

Let me show you the three commands that will transform how you use Codex CLI, plus the exact fixes for the issues that drive us all crazy.

.

.

.

But First: Are You Even On v0.41?

Most people aren’t.

They’re running older versions and wondering why things don’t work right.

Check your version right now:

codex --version
codex-cli 0.41.0

Not on 0.41? Here’s the fastest way to update:

If you use npm:

npm install -g @openai/codex@latest

If you use Homebrew:

brew update && brew upgrade codex

30 seconds. That’s all it takes.

Now let’s dive into the commands that actually matter.

The /status Command: Finally See Your Usage BEFORE You Hit The Wall

Here’s what drives me crazy.

You’re coding. Everything’s flowing. Then BAM – rate limit exceeded.

No warning. No heads up. Just sudden death to your productivity.

The worst part? You had no idea you were even close to the limit.

Before v0.41, you were flying blind. Now, /status shows you exactly how much runway you have left.

.

.

.

The Game-Changing Update Nobody’s Talking About

v0.41 added usage tracking to /status.

Look at this beauty:

codex /status

See those percentages?

  • 10% used on your 5-hour limit
  • 15% used on your weekly limit
  • Exact reset times for both

Now you can actually plan.

Big task coming up? Check your usage first. Running low? Save it for after the reset.

No more surprises.

.

.

.

The Blank Usage Bug (And The 2-Second Fix)

But wait. You run /status and the usage section is… empty?

This happens to everyone. Here’s the stupidly simple fix:

# Send a throwaway message first
codex
> hi
> /status

Why does this work?

Usage data loads after your first message. It’s a known bug. Now you know the workaround.

.

.

.

How to Never Get Surprised Again

Make checking /status a habit:

# Add to your shell profile
alias cs="codex /status"  # Quick check

# Before starting big tasks
cs  # Am I good to go?

When you see you’re at 80% on your 5-hour limit, you know it’s time to wrap up or switch to lighter tasks.

When you see you’re at 90% on your weekly limit on a Wednesday, you know to save heavy lifting for next week’s reset.

The power isn’t just seeing the limits. It’s planning around them.

.

.

.

Want techniques like these weekly?

Join The Art of Vibe Coding—short, practical emails on shipping with AI (without the chaos).

No spam. Unsubscribe anytime. Seriously.

The /review Command: Your AI Code Reviewer That Actually Works

You just finished a feature.

Before committing, you want a second pair of eyes on it.

But your teammate is busy. Or it’s 2 AM. Or you just want a sanity check before the PR.

Enter /review – the command that turns Codex into your always-available Code Reviewer.

.

.

.

The Interactive Menu (The Feature Most People Miss)

Here’s what blew my mind when I discovered it.

Just type /review with no arguments:

codex /review

And you get this beautiful interactive menu:

Screenshot showing /review interactive menu with 4 options: "1. Review uncommitted changes", "2. Review a commit" (highlighted), "3. Review against a base branch", "4. Custom review instructions"

Pick option 2, and watch this magic:

Screenshot showing commit selection screen with a searchable list of recent commits like "added ab-testing guide", "feat: Update API activity monitor", "Refactor database schema"

You can:

Search commits by typing – Just start typing to filter

Navigate with arrow keys – Up/down through your commit history

Hit Enter to review – Instant analysis begins

Screenshot showing code review in progress for commit 97fd1d4, with Codex finding issues like "Fix clean command option docs - The guide advertises ./ab-worktrees.sh clean --src-db-url but only recognizes --dry-run"

Why this changes everything:

No memorizing commit hashes. No complex git commands. Just browse, select, review.

It’s like having a git GUI built into your terminal.

.

.

.

Three Ways to Review (From Basic to Brilliant)

1. Review Your Staged Changes

The basics. Stage what you want reviewed:

git add -p  # Stage specific chunks
codex /review

2. Review a Specific Commit

Made a commit but having second thoughts?

# Review the last commit
codex /review commit HEAD

# Review any commit
codex /review commit abc123def

3. The Power Move: Custom Instructions

This is where /review becomes magical:

# Security-focused review
codex /review diff --base main \
  --instructions "Check for SQL injection, XSS, and auth bypasses"

# Performance review
codex /review diff --base main \
  --instructions "Find N+1 queries and unnecessary database calls"

# API contract review
codex /review diff --base main \
  --instructions "Verify backward compatibility and semantic versioning"

.

.

.

GitHub PR Reviews (The Feature Nobody Uses)

Did you know Codex can review PRs directly on GitHub?

Just comment on any PR:

@codex review for memory leaks

Setup required: Enable Code Review for your repo first. Takes 2 minutes.

.

.

.

The Selective Staging Trick

Here’s a pro move most developers miss:

Don’t review everything. Review what matters.

# Stage only the business logic
git add -p src/core/

# Skip the test files and configs
# Now review JUST the important stuff
codex /review

Why this matters:

  • Focused reviews catch more issues
  • Less noise, better signal
  • Faster reviews, clearer feedback

Stage surgically. Review intelligently.

.

.

.

The exec Command: Finally, JSON That Doesn’t Suck

You need Codex to analyze something and return JSON.

So you run:

codex exec "analyze the codebase and return findings as JSON"

What you get back?

A wall of text with some JSON-ish stuff buried in the middle. Good luck parsing that in your CI pipeline.

v0.41 fixed this nightmare.

.

.

.

The Schema Revolution

Here’s the game-changer: --output-schema

You tell Codex EXACTLY what JSON structure you want. It delivers exactly that.

Watch this magic:

# 1. Create your schema (analysis-schema.json)
{
  "type": "object",
  "properties": {
    "summary": { "type": "string" },
    "issues": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "severity": {
            "type": "string",
            "enum": ["low", "medium", "high", "critical"]
          },
          "file": { "type": "string" },
          "description": { "type": "string" }
        }
      }
    }
  },
  "required": ["summary", "issues"]
}

# 2. Run with schema enforcement
codex exec --output-schema analysis-schema.json \
  "Find security issues in the codebase"

# 3. Get PERFECT JSON every time

The result?

Parseable. Predictable. Production-ready JSON.

Every. Single. Time.
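
Which makes it pipeline-friendly. A sketch of consuming it in CI — assuming your version writes the final message to stdout (check codex exec --help for the output flags in your build):

codex exec --output-schema analysis-schema.json \
  "Find security issues in the codebase" > findings.json

# fail the build if any critical issues were found
jq -e '[.issues[] | select(.severity == "critical")] | length == 0' findings.json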

.

.

.

See What Codex Is Thinking

Ever wonder what Codex is planning before it executes?

The new --include-plan-tool flag shows you:

codex exec --include-plan-tool \
  "Refactor the auth module for better testing"

Why this matters:

  • Debug long-running commands
  • Understand Codex’s approach
  • Catch issues before execution
  • Learn from its planning process

.

.

.

The Model Bug You Need to Know

Warning: There’s a fresh bug where --output-schema gets ignored with gpt-5-codex.

The workaround? Use gpt-5 instead:

# Doesn't respect schema (bug)
codex config set model gpt-5-codex
codex exec --output-schema schema.json "task"  # Returns unstructured text

# Works perfectly
codex config set model gpt-5
codex exec --output-schema schema.json "task"  # Returns proper JSON

The team knows. Fix coming soon. For now, use gpt-5 when you need schemas.

.

.

.

The Hidden Gem: No More Ripgrep Errors

Remember those annoying postinstall errors?

Error: ripgrep binary not found
Please install ripgrep manually

v0.41 bundles ripgrep. No more install failures in CI. It just works.

Small fix. Huge relief.

.

.

.

Your Cheat Sheet (Copy This)

# Version check
codex --version                              # Are you on 0.41?

# Status & rate limits
codex /status                                # See limits & reset times

# Review commands
codex /review                                # Review staged changes
codex /review commit HEAD                    # Review last commit
codex /review diff --base main \
  --instructions "focus on X"                # Custom focused review

# Exec with structure
codex exec --output-schema schema.json "task"     # Get JSON back
codex exec --include-plan-tool "task"             # See the planning

.

.

.

Your Action Items

  1. Update to v0.41 right now (seriously, it takes 30 seconds)
  2. Add rate limit checks to your shell profile: alias cs="codex /status"
  3. Create JSON schemas for your common exec tasks
  4. Set up /review in your git hooks: add codex /review || exit 1 to .git/hooks/pre-commit (see the sketch after this list)
  5. Switch models if you hit bugs (gpt-5 vs gpt-5-codex)
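
For item 4, a minimal sketch of that hook (assuming a POSIX shell — and since /review runs interactively, fully unattended pipelines may prefer codex exec with a review prompt instead):

#!/usr/bin/env sh
# .git/hooks/pre-commit — block the commit when the review fails
codex /review || exit 1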

.

.

.

The Bottom Line

Codex CLI v0.41 isn’t just an incremental update.

It’s the difference between guessing and knowing.

Between random failures and predictable workflows.

Between text soup and structured data.

These three commands – /status, /review, and exec – aren’t just features. They’re workflow transformers that turn Codex from a cool tool into an essential part of your development process.

The tools are ready. The bugs have workarounds. The productivity gains are real.

What will you build now that your tools actually work?


P.S. – I discovered the usage tracking feature after hitting rate limits three times in one day. Now I check /status before starting any big task. Haven’t been surprised by a rate limit since.

P.P.S. – Yes, the model-specific bugs are annoying. But switching models takes 2 seconds, and the productivity gains are worth the minor inconvenience. The team ships fixes fast – check the GitHub issues for updates.

6 min read Claude, Codex, The Art of Vibe Coding, Vibe Coding

The Codex-Claude Code Workflow: How I Plan With GPT-5 and Execute With Claude Code

I used to dive straight into Claude Code’s Plan Mode and ask it to build features.

Sometimes it worked brilliantly – Claude would research, plan, and implement perfectly. Other times, despite Plan Mode’s capabilities, it built something that technically worked but wasn’t quite what I had in mind.

The problem?

Even with Plan Mode, I wasn’t always giving Claude Code enough context about my specific requirements.

I was expecting it to infer implementation details, UI decisions, and edge cases from a brief description.

Then I discovered a game-changing workflow: Use Codex to plan meticulously, then Claude Code to execute flawlessly.

Let me show you exactly how this works with a real example – adding a context menu to rename and delete chat sessions in my YouTube AI app.

.

.

.

The Magic Is In The Questions

Here’s the breakthrough:

Instead of jumping straight into implementation, I start by having Codex ask me clarifying questions until it’s 95% confident it can create a perfect plan.

This brilliant approach comes from Sabrina Ramonov’s YouTube video “3 ChatGPT Prompts That Feel Like Millionaire Cheat Codes”.

I’ve refined her prompt to work perfectly with Codex for feature planning.

.

.

.

Step 1: Start With Questions, Not Code

I begin with this carefully crafted prompt to Codex:

In the Chat sidebar, I want to add context menu for user to rename the session as well as deleting the session.

Help me come up with a plan to implement this. DON'T WRITE OR EDIT ANY FILES.

Ask me clarifying questions until you are 95% confident you can complete this task successfully.

a. If the question is about choosing different options, please provide me with a list of options to choose from. Mark the option with a clear label, like a, b, c, etc.
b. If the question needs custom input that is not in the list of options, please ask me to provide the custom input.

The key elements here:

  • DON’T WRITE OR EDIT ANY FILES – This keeps Codex focused on planning
  • 95% confidence threshold – Forces thorough understanding before proceeding
  • Structured options – Makes decision-making faster and reveals possibilities I hadn’t considered
Original state of the chat sidebar without context menu
Initial prompt to Codex asking for the plan

.

.

.

Step 2: The Clarifying Questions Phase

This is where the magic happens. Codex doesn’t just ask random questions – it systematically explores every aspect of the feature:

Codex presenting its first round of clarifying questions with options

Notice how Codex provides options for each decision:

  • Rename UI: Dialog with text input, inline edit, or popover
  • Delete confirmation: AlertDialog, no confirmation, or typing “DELETE”
  • Mobile behavior: Different menu types or desktop-only
  • Post-action behavior: Keep selection, reset, or select next

I can quickly pick options without having to type lengthy explanations:

My responses to Codex's questions using the option labels

But Codex doesn’t stop there. It asks follow-up questions to refine the details:

Codex asking additional clarifying questions

Questions about:

  • Handling duplicate titles
  • Whitespace validation
  • Button visibility
  • Delete confirmation text
  • Test IDs for E2E testing

I continue answering:

My responses to the additional questions

.

.

.

Want techniques like these weekly?

Join The Art of Vibe Coding—short, practical emails on shipping with AI (without the chaos).

No spam. Unsubscribe anytime. Seriously.

Step 3: The Comprehensive Plan

After gathering all requirements, Codex produces a detailed implementation plan:

Codex's final comprehensive plan with all implementation details - Part 1 of 4
Codex's final comprehensive plan with all implementation details - Part 2 of 4
Codex's final comprehensive plan with all implementation details - Part 3 of 4
Codex's final comprehensive plan with all implementation details - Part 4 of 4

The plan is incredibly thorough, covering:

Backend Architecture:

  • PATCH and DELETE routes with exact endpoints (/api/chat/sessions/:sessionId)
  • Specific HTTP status codes (200, 204, 401, 404)
  • Storage interface methods with complete signatures
  • MemStorage and PostgreSQLStorage implementation details
  • Ownership verification and cascade deletion logic
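
Those specs are concrete enough to exercise by hand — a hypothetical sketch (host, port, session ID, and auth handling are illustrative):

# rename a session (plan says 200 with the updated session)
curl -X PATCH http://localhost:3000/api/chat/sessions/abc123 \
  -H 'Content-Type: application/json' \
  -d '{"title": "Renamed session"}'

# delete a session (plan says 204; 401/404 for bad auth or unknown ID)
curl -X DELETE http://localhost:3000/api/chat/sessions/abc123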

Client-Side Implementation:

  • useChat hook extensions with renameSessionMutation and deleteSessionMutation
  • Cache invalidation strategies
  • Toast feedback for success/error states
  • Context state management

UI Components:

  • ContextMenu wrapper with ContextMenuTrigger
  • Always-visible “…” actions button with DropdownMenu
  • Dialog components with specific state management
  • Input validation rules (trim, non-empty, allow duplicates)
  • Loading states that disable actions during pending operations

UX Specifications:

  • Left-click for selection, right-click for actions
  • Mobile-first “…” button approach
  • Preserved updatedAt timestamps during rename
  • Session order stability
  • Sidebar sheet behavior (no auto-close on dialog open)

Testing Infrastructure:

  • Comprehensive test IDs for E2E testing
  • API contract examples with request/response formats
  • Testing checklist with specific scenarios

This isn’t just a vague outline – it’s a blueprint that any developer could follow.

.

.

.

Step 4: From Plan to Implementation

Now comes the handoff. I copy Codex’s plan and paste it directly into Claude Code:

Copying the plan from Codex and pasting into Claude Code

Claude Code immediately understands the comprehensive plan:

Claude Code acknowledging and analyzing the plan

Watch as Claude Code systematically implements each part of the plan:

Claude Code examining the existing codebase structure
Claude Code making the necessary changes to implement the features

The implementation is surgical and precise – Claude Code knows exactly what to build because the plan is so detailed:

Claude Code's completion message showing successful implementation
The working context menu with rename and delete options

.

.

.

Step 5: The Review Loop

But we’re not done yet. This is where the workflow truly shines.

I take the git diff from Claude Code’s implementation and ask Codex to review it:

Please read the git diff, analyze the implementation and check if everything is implemented as advised.

DON'T WRITE OR EDIT ANY FILES.
Asking Codex to review the implementation

Codex meticulously reviews every change:

Codex's detailed review findings

In this case, Codex found two deviations from the agreed plan:

  1. Timestamp updates during rename (should be preserved)
  2. Button visibility on desktop (had unnecessary opacity classes)

.

.

.


Step 6: Refinement Based on Feedback

I copy Codex’s feedback back to Claude Code:

Providing Codex's feedback to Claude Code

Claude Code immediately understands and implements the corrections:

Claude Code implementing the fixes
Claude Code confirming all fixes are complete

The final result?

A perfectly implemented feature that matches the original specifications exactly:

Final working implementation with all refinements

.

.

.

Why This Workflow Is Revolutionary

1. Thinking Before Coding

By forcing myself to answer Codex’s questions, I’m thinking through edge cases and implementation details BEFORE any code is written.

This prevents the “oh wait, what about…” moments during development.

2. Leveraging Strengths

  • Codex excels at: Analysis, planning, asking the right questions, and meticulous review
  • Claude Code excels at: Understanding context, implementing complex features, and following detailed plans

3. Faster Development

Counter-intuitively, spending 10 minutes on planning saves hours of refactoring.

The implementation is right the first time.

4. Better Code Quality

The review loop catches issues that might slip through.

Having Codex review Claude Code’s work is like having a senior developer review every PR.

5. Learning Through Questions

Codex’s questions often reveal considerations I hadn’t thought of.

“Should duplicate titles be allowed?”
“What happens to the selection after delete?”

These questions make me a better developer.

.

.

.

The Prompt That Makes It Work

The secret sauce is in the prompt structure. By combining:

  • Clear feature description
  • Explicit “don’t code” instruction
  • 95% confidence threshold (credit to Sabrina Ramonov)
  • Structured option format

We transform Codex from a code generator into a thoughtful architect who ensures every detail is considered before implementation begins.

.

.

.

Your Next Feature

Try this workflow on your next feature:

  1. Start with Codex – Use the clarifying questions prompt
  2. Answer thoughtfully – Pick from options or provide custom input
  3. Get the plan – Let Codex create a comprehensive blueprint
  4. Implement with Claude Code – Copy the plan and watch it build
  5. Review with Codex – Check the implementation against the plan
  6. Refine with Claude Code – Apply any feedback from the review

This isn’t just about building features faster. It’s about building them right the first time.

While the context menu might seem like a simple example, this same workflow scales beautifully to complex features. Whether you’re building authentication systems, real-time collaboration, or intricate data visualizations – the pattern of meticulous planning with Codex followed by precise execution with Claude Code ensures success every time.

What feature will you plan with Codex and build with Claude Code?


P.S. – The combination of Codex’s analytical planning and Claude Code‘s implementation prowess has transformed how I develop. No more half-baked features or forgotten edge cases. Just clean, well-planned, thoroughly reviewed code.

7 min read The Art of Vibe Coding, Claude, Codex, GPT-5, Vibe Coding

Claude Code vs Codex: Why I Use Both (And You Should Too)

Everyone’s asking “Claude Code vs Codex – which one should I use?”

You’re asking the wrong question.

After tons of testing Claude Code vs Codex head-to-head, I discovered something game-changing: they’re not competitors, they’re the perfect team.

  • Claude Code builds brilliantly,
  • Codex reviews meticulously, and
  • Together they create code that’s both powerful and bulletproof.

Let me show you exactly how this works with a real example from my WordPress theme.

.

.

.

Claude Code vs Codex: The Problem With Choosing Just One

Codex (GPT-5 High) Alone: Minimal to a Fault

Ask Codex to build something from scratch, and you’ll get code that works… technically.

But it’s like asking for a house and getting a tent.

Sure, it provides shelter, but is that really what you wanted?

In the Claude Code vs Codex comparison, Codex’s minimalism means:

  • Basic functionality only
  • No edge case handling
  • Missing quality-of-life features
  • Requires significant enhancement

Claude Code Alone: The Over-Engineering Trap

Claude Code (especially Opus 4.1) goes the opposite direction.

Ask for a simple feature, and it builds you a spacecraft.

The code becomes so complex that even Claude loses track of what it created.

The over-engineering pattern:

  • Abstract factories for simple functions
  • Unnecessary design patterns
  • 20 files when 3 would suffice
  • Complexity that breeds bugs

.

.

.

Claude Code vs Codex: The Solution is Both

After extensive testing of Claude Code vs Codex in production environments, here’s the breakthrough: Use Claude Code to build, then Codex to review.

When comparing Claude Code vs Codex strengths:

Claude Code excels at:

  • Understanding requirements
  • Creating comprehensive implementations
  • Handling complex integrations
  • Building from scratch

GPT-5 (Codex) excels at:

  • Finding security vulnerabilities
  • Catching inconsistencies
  • Identifying missing edge cases
  • Suggesting surgical improvements

Together, they’re unstoppable.

.

.

.

Real-World Example: Claude Code + Codex in Action

Let me walk you through exactly how Claude Code + Codex work together on a real feature – adding a newsletter subscription shortcode to my WordPress theme.

Phase 1: Claude Code Implementation

I asked Claude Code to do two things:

  1. Create a newsletter subscribe form shortcode
  2. Add a guide tab in the theme options to show users how to use this shortcode

Here’s my original prompt:

Screenshot showing the original prompt to Claude Code about injecting subscribe form via shortcode and creating a guide tab

Claude Code immediately understood the context and used a sub-agent to explore the codebase:

Screenshot showing Claude Code’s thinking process and planning the implementation

After investigation, Claude successfully implemented both the shortcode and the guide tab:

Screenshot showing Claude Code’s successful completion message with all implemented features

The initial implementation included:

  • ✅ Newsletter subscribe form shortcode
  • ✅ Guide tab added to Theme Options
  • ✅ Advanced template system with 6 visual styles
  • ✅ Comprehensive documentation for users
  • ✅ Integration features with FluentCRM
  • ✅ Best practices and placement recommendations
Screenshot showing the initial newsletter subscription form - functional but needs refinement

Phase 2: Codex Review

This is where the magic happens.

I took Claude Code’s implementation and gave it to Codex for review:

Screenshot showing how I provided the implementation summary from Claude Code along with my prompt asking Codex to “Read the git diff and analyze if the implementation is correct”

My exact prompt to Codex was simple: “Read the git diff and analyze if the implementation is correct. Attached is how the subscribe form looks like.”

Codex immediately went into detective mode:

Screenshot showing Codex starting its review process - “I’ll inspect the git diff and scan the theme files”

Codex immediately went into detective mode, analyzing every aspect:

Screenshot showing Codex’s detailed analysis process - searching for nonces, analyzing functionality
Screenshot showing Codex continuing analysis - checking shortcode usage, examining CSS classes

The Meticulous Review Results

After thorough analysis, Codex identified several critical issues that Claude Code missed:

Screenshot showing Codex’s comprehensive findings

What Codex Found:

  1. Security Gap: The shortcode form lacked nonce and honeypot protection
    • No wp_nonce_field in the shortcode form
    • Template had security features, but shortcode didn’t use them
  2. Integration Issues:
    • Shortcode posted to wrong endpoint
    • AJAX handler expected different parameters
    • JS selector wouldn’t catch all form variants
  3. Documentation Mismatches:
    • PHP usage example was incorrect
    • Guide showed wrong template path syntax
    • Style parameter documentation was misleading
  4. Small but Important Details:
    • Footer template had a stray “◆” character instead of proper close icon
    • Hidden context field missing for non-JS submissions
Screenshot showing Codex’s detailed recommended fixes

Want techniques like these weekly?

Join The Art of Vibe Coding—short, practical emails on shipping with AI (without the chaos).

No spam. Unsubscribe anytime. Seriously.

Phase 3: Applying the Fixes (With a Twist)

Here’s where I made an interesting choice.

Instead of taking Codex’s recommendations back to Claude Code, I asked Codex itself to apply the fixes:

Screenshot showing me requesting “Yes, please apply all the fixes”

Why did I choose Codex over Claude Code for the fixes?

I wanted to test if Codex could handle implementation as well as review.

Spoiler: it absolutely can.

Codex methodically applied each improvement:

Screenshot showing Codex’s completion summary - “Applied the fixes across shortcode, JS, template, and Guide tab”
Screenshot showing detailed list of all changes Codex made
Screenshot showing the 4 files changed with specific line counts

Note: You could absolutely ask Claude Code to apply these fixes instead. Both approaches work. The choice depends on your workflow preference and which tool already has the most context about your specific requirements.

The Final Result

The difference was night and day:

Before Codex Review:

Screenshot of initial basic form

After Codex Review:

Screenshot of final polished form with all improvements

The final implementation now included:

  • ✅ Full security with nonce + honeypot
  • ✅ Proper AJAX/REST integration
  • ✅ Consistent styling across all contexts
  • ✅ Accurate documentation
  • ✅ Clean UI with proper icons
  • ✅ Hidden context field for fallback

.

.

.

The Workflow That Changes Everything

Here’s my exact process:

Step 1: Initial Implementation with Claude Code

"Build [feature] following our project rules"

Let Claude Code do what it does best – create comprehensive implementations.

Step 2: Export for Review

Generate a git diff or summary of changes. Include:

  • The implementation code
  • Any UI screenshots
  • The intended functionality

Step 3: Codex Review

"Review this implementation for security, consistency, and correctness.
Attached is [git diff/code/screenshots]"

Watch as Codex finds issues you never would have caught.

Step 4: Apply Improvements (Two Options)

Option A: Ask Claude Code to apply the fixes

"Apply these recommended fixes: [Codex's feedback]"

Claude Code implements the improvements with full context of the original implementation.

Option B: Ask Codex to apply the fixes directly

"Yes, please apply all the fixes"

Codex can handle both review AND implementation – as I demonstrated in this example.

Both approaches work.

Choose based on:

  • Which tool has more context about your requirements
  • Your comfort level with each tool
  • The complexity of the fixes needed

.

.

.

Why Claude Code & Codex Together Works So Well

Complementary Strengths

The Claude Code vs Codex combination leverages what each does best:

Claude Code brings:

  • Creative problem-solving
  • Comprehensive implementations
  • Deep context understanding
  • Rapid development

Codex brings:

  • Meticulous attention to detail
  • Security vulnerability detection
  • Consistency checking
  • Edge case identification

.

.

.

Pro Tips for Maximum Effectiveness

1. Let Claude Code Explore First

Always use Claude Code’s codebase-explorer agent for initial investigation.

It understands context better than starting fresh.

2. Be Specific with Codex

Don’t just say “review this.” Say:

  • “Check for security vulnerabilities”
  • “Verify integration points”
  • “Validate documentation accuracy”

3. Screenshot Everything

Visual proof helps both AIs understand what you’re building.

4. Don’t Skip the Review

Even if Claude Code’s implementation seems perfect, run it through Codex.

Those “small” issues compound into big problems.

5. Keep the Feedback Loop Tight

Apply fixes immediately while context is fresh.

Don’t let reviews pile up.

.

.

.

The Bottom Line: Claude Code vs Codex is the Wrong Question

Stop treating Claude Code vs Codex as an either/or decision.

Start using them as collaborators.

The Claude Code vs Codex debate misses the point entirely.

They’re not competitors fighting for your attention – they’re complementary tools that achieve greatness together.

Claude Code is your brilliant architect who designs and builds. Codex is your meticulous inspector who ensures everything is perfect.

Together, they don’t just write code – they craft production-ready solutions that are secure, consistent, and maintainable.

My newsletter shortcode went from “it works” to “it’s bulletproof” in one review cycle.

That’s the power of using the right tool for the right job.

Your next feature deserves both the creativity of Claude Code and the precision of Codex. Why settle for less?


P.S. – This workflow has become so essential that I now budget time for both implementation and review in every feature. The 30 minutes spent on review saves hours of debugging later. Try it on your next feature and see the difference.

11 min read The Art of Vibe Coding, Claude, Vibe Coding

How a Read-Only Sub-Agent Saved My Context Window (And Fixed My WordPress Theme)

I was staring at my WordPress theme’s newsletter page, confused.

The “Recent Issues” section was supposed to show my latest newsletter posts. Instead, it was completely wrong.

The typical approach would be to let Claude Code dive into the codebase, reading file after file, searching for the problem.

But there’s a catch – each file Claude reads consumes precious context tokens.

By the time it finds the issue, it might have used 80% of its context window, leaving little room for actually fixing the problem.

That’s when I discovered a game-changing approach: the read-only sub-agent pattern.

.

.

.

The Context Window Problem Nobody Talks About

When Claude Code explores a codebase, it’s like a detective gathering evidence.

Each file it reads, each search it performs, adds to its memory.

But unlike a human detective who can forget irrelevant details, Claude keeps everything in its context window.

Here’s what typically happens:

  1. You ask Claude to fix a bug
  2. Claude reads 10-20 files looking for the issue
  3. Each file might be hundreds of lines
  4. Suddenly, 80% of the context is consumed
  5. Claude struggles to maintain focus on the actual fix

It’s like trying to solve a puzzle while carrying every piece you’ve ever looked at. Eventually, you run out of hands.

.

.

.

Enter the Codebase Explorer Sub-Agent

The solution?

Delegate the investigation to a specialized sub-agent that only reads and reports back.

Think of it as hiring a research assistant.

The assistant does all the reading, takes notes, and gives you a concise report.

You then use that report to make decisions without having to read everything yourself.

Here’s the actual sub-agent configuration I use:

The key insight: This sub-agent can ONLY read files, not modify them.

It explores, analyzes, and documents its findings in a markdown report.

.

.

.

Real-World Example: Fixing My Newsletter Display

Let me show you exactly how this worked when fixing my WordPress theme.

Step 1: Launching the Investigation

Instead of diving straight into the code, I asked Claude to use the codebase-explorer agent to investigate why newsletter issues weren’t displaying:
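
The prompt is simple — something in this shape (illustrative wording, not a verbatim copy):

Use the codebase-explorer agent to investigate why the Recent Issues section on the newsletter page shows the wrong posts. Document your findings in a markdown report. DON'T WRITE OR EDIT ANY FILES.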

Step 2: The Sub-Agent Takes Over

The codebase-explorer agent immediately went to work:

Notice how it’s thinking through the problem systematically. It’s not just randomly reading files – it’s forming a strategy.

Step 3: Sub-Agent Investigation & Documentation

The sub-agent went deep into the codebase, systematically exploring files and tracing the newsletter implementation:

After thorough investigation, it documented everything in a comprehensive markdown report:

  • Summary of the issue: Newsletter posts stored as regular WordPress posts with a specific category
  • Key files involved: page-newsletter.php, functions.php, theme options
  • The specific bug: Incorrect variable usage with setup_postdata()
  • Recommended fix: Change loop variable from $issue to $post

Critical detail: The sub-agent consumed 51,000 tokens during its investigation.

Without the sub-agent pattern, those 51k tokens would have been loaded directly into the main agent’s context window!

Step 4: Main Agent Takes Over With Clean Context

With the sub-agent’s analysis complete, the main agent took over with a nearly empty context window.

It could now work with surgical precision, reading only the specific file named in the analysis instead of re-exploring the codebase.

The main agent confirmed the issue: the code was using $issue as the loop variable, but WordPress’s setup_postdata() function expects the global $post variable in order to work correctly.
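To make the bug concrete, here’s a minimal sketch of the problematic pattern. This is my reconstruction, not the theme’s verbatim code – the $issues variable, the get_posts() query, and the 'newsletter' category slug are all illustrative:

```php
<?php
// Reconstruction of the buggy pattern (illustrative names and an
// assumed 'newsletter' category slug – not the theme's exact code).
$issues = get_posts( array( 'category_name' => 'newsletter' ) );

foreach ( $issues as $issue ) {
    setup_postdata( $issue ); // the global $post is never reassigned...
    the_title();              // ...so template tags render the wrong post
}
wp_reset_postdata();
```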

Step 5: The Main Agent’s Surgical Fix

Armed with the sub-agent’s report, and with roughly 90% of its context still available, the main agent proposed a surgical, two-line fix.
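In spirit, the change looked like this – the same illustrative sketch as above, now corrected:

```php
<?php
// Corrected sketch: loop over the global $post so setup_postdata()
// and the template tags agree on the current post.
global $post;
$issues = get_posts( array( 'category_name' => 'newsletter' ) );

foreach ( $issues as $post ) {   // was: as $issue
    setup_postdata( $post );     // was: setup_postdata( $issue )
    the_title();                 // now renders the correct issue
}
wp_reset_postdata();             // restore the original global $post
```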

Step 6: Success!

The newsletter issues now display perfectly!

.

.

.

Why the Sub-Agent Must Be Read-Only

You might wonder: “Why not let the sub-agent fix the problem directly?”

Here’s why the read-only constraint is crucial:

  1. Context Preservation: The main reason – saving those 51,000 tokens for the main agent’s context
  2. Avoiding Code Duplication: If the sub-agent makes changes and passes control back, the main agent might not be aware of what was modified. This often leads to:
    • Duplicate implementations of the same fix
    • Violating the DRY (Don’t Repeat Yourself) principle
    • Conflicting code changes
    • Confusion about the current state of files
  3. Maintaining Control: The main agent maintains a coherent understanding of all changes made to the codebase

The read-only pattern ensures clean handoffs: The sub-agent investigates and reports, the main agent decides and implements.

No confusion, no duplication, no wasted context.
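Sketched as a flow:

```
you --> main agent --(delegates)--> codebase-explorer (read-only)
             ^                              |
             |                              v
             +------ notes/<analysis>.md <--+
```

The main agent reads the single report, then makes every edit itself.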

.

.

.

The Context Window Visualization

Let me show you exactly how context consumption differs between the two approaches:

Without Sub-Agent: Direct Investigation

Result: High risk of context overflow, limited ability to handle complex fixes

With Sub-Agent: Delegated Investigation

Result: Minimal context usage, full capacity for implementation
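In rough numbers (matching “The Numbers That Matter” below), the main agent’s context after investigation looks like this:

```
Without sub-agent:  [################----]  ~80% consumed
With sub-agent:     [##------------------]  <10% consumed
                    (the ~51k investigation tokens stay in the sub-agent)
```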

The Compound Effect Over Multiple Tasks

And the savings compound: because each investigation’s tokens stay inside a disposable sub-agent, a single session can absorb several investigate-then-fix cycles before the main agent’s context fills up.

.

.

.

The Numbers That Matter

Without the sub-agent pattern:

  • Files read directly: 15-20
  • Context consumed: ~80%
  • Investigation time: 20+ minutes
  • Risk of context overflow: High
  • Ability to handle complex fixes after investigation: Limited

With the sub-agent pattern:

  • Files read by main agent: 1 (the analysis report)
  • Context consumed: <10%
  • Investigation time: Same, but parallelized
  • Risk of context overflow: Minimal
  • Ability to handle complex fixes: Excellent

.

.

.

Setting Up Your Own Codebase Explorer

Here’s the complete sub-agent configuration you can use:

---
name: codebase-explorer
description: Use this agent when you need to investigate, search, or analyze the codebase without making any modifications. This agent specializes in read-only operations to understand code structure, find relevant files, trace dependencies, and document findings. Perfect for preliminary investigation before code changes, understanding feature implementations, or analyzing issues.\n\nExamples:\n- <example>\n  Context: The user wants to understand how authentication is implemented before making changes.\n  user: "I need to add a new authentication method. Can you first investigate how the current auth system works?"\n  assistant: "I'll use the codebase-explorer agent to investigate the authentication implementation and document the findings."\n  <commentary>\n  Since this requires read-only investigation of the codebase before making changes, use the codebase-explorer agent to analyze and document the current implementation.\n  </commentary>\n</example>\n- <example>\n  Context: The user is debugging an issue and needs to understand the code flow.\n  user: "There's a bug in the payment processing. I need to understand all the files involved in the payment flow."\n  assistant: "Let me launch the codebase-explorer agent to trace through the payment processing flow and identify all related files."\n  <commentary>\n  The user needs read-only investigation to understand the codebase structure, perfect for the codebase-explorer agent.\n  </commentary>\n</example>\n- <example>\n  Context: Before implementing a new feature, understanding existing patterns is needed.\n  user: "We need to add a new API endpoint. First, let's see how the existing endpoints are structured."\n  assistant: "I'll use the codebase-explorer agent to analyze the existing API endpoint patterns and document them."\n  <commentary>\n  This requires read-only analysis of existing code patterns, ideal for the codebase-explorer agent.\n  </commentary>\n</example>
tools: mcp__ide__getDiagnostics, mcp__ide__executeCode, Glob, Grep, Read, WebFetch, TodoWrite, WebSearch, BashOutput, KillBash, Write
model: sonnet
color: cyan
---

You are a specialized codebase exploration and analysis agent. Your sole purpose is to perform read-only operations to investigate, search, and understand codebases, then document your findings comprehensively.

**Core Responsibilities:**

1. Search and locate files relevant to specific features, issues, or components
2. Analyze code structure, dependencies, and relationships between files
3. Trace execution flows and identify all components involved in specific functionality
4. Document findings in a clear, structured markdown file

**Operational Constraints:**

-   You must NEVER modify, edit, or create any code files
-   You must NEVER make changes to existing functionality
-   You are strictly limited to read-only operations: viewing, searching, and analyzing
-   Your only write operation is creating/updating your findings document in the notes folder

**Workflow Process:**

1. **Initial Assessment**: Understand the specific issue/feature to investigate
2. **Strategic Search**: Use targeted searches to locate relevant files:
    - Search for keywords, function names, class names related to the topic
    - Look for imports and dependencies
    - Check configuration files and entry points
3. **Deep Analysis**: For each relevant file found:
    - Document its purpose and role
    - Note key functions/classes it contains
    - Identify its dependencies and what depends on it
    - Record important implementation details
4. **Relationship Mapping**: Trace how files connect and interact
5. **Documentation**: Create a comprehensive markdown report

**Documentation Format:**
Your findings must be written to a markdown file in the `notes/` folder with a descriptive name like `notes/[timestamp]_analysis_[issue/feature/component/etc].md`. Structure your report as:

````markdown
# Codebase Analysis: [Topic/Feature/Issue]

## Summary

[Brief overview of what was investigated and key findings]

## Relevant Files Identified

### Core Files

-   `path/to/file1.ext`: [Purpose and key responsibilities]
-   `path/to/file2.ext`: [Purpose and key responsibilities]

### Supporting Files

-   `path/to/support1.ext`: [Role in the system]
-   `path/to/support2.ext`: [Role in the system]

## Implementation Details

### [Component/Feature Name]

-   Location: `path/to/implementation`
-   Key Functions/Classes:
    -   `functionName()`: [What it does]
    -   `ClassName`: [Its responsibility]
-   Dependencies: [List of imports and external dependencies]
-   Used By: [What other parts of the code use this]

## Code Flow Analysis

1. Entry point: `file.ext:functionName()`
2. Calls: `another.ext:processFunction()`
3. [Continue tracing the execution flow]

## Key Observations

-   [Important patterns noticed]
-   [Potential areas of interest]
-   [Configuration or environment dependencies]

## File Relationships Map

```
[ASCII or text-based diagram showing file relationships]
```

## Additional Notes

[Any other relevant information for the main agent]
````

**Search Strategies:**

-   Use grep/ripgrep for pattern matching across the codebase
-   Search for class/function definitions and their usages
-   Look for import statements to understand dependencies
-   Check test files to understand expected behavior
-   Review configuration files for feature flags or settings

**Quality Checks:**

-   Ensure all mentioned files actually exist and paths are correct
-   Verify that your analysis covers the complete scope requested
-   Double-check that no modifications were made to any files
-   Confirm your findings document is saved in the notes folder

**Communication Protocol:**
When you complete your analysis:

1. Save your findings to the notes folder
2. Provide the exact path to your findings file
3. Give a brief summary of what was discovered
4. Explicitly state: "Analysis complete. Findings documented in [path/to/notes/file.md] for main agent review."

Remember: You are a read-only investigator. Your value lies in thorough exploration and clear documentation, enabling the main agent to make informed decisions without consuming excessive context through tool calls.

.

.

.

Pro Tips for Maximum Effectiveness

1. Be Specific with Investigation Goals

Don’t just say “investigate the auth system.”

Say “find all files involved in user login, session management, and authentication middleware.”
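For example (illustrative prompts):

```
Vague:    "Investigate the auth system."

Specific: "Use the codebase-explorer agent to find all files involved in
           user login, session management, and authentication middleware.
           Document entry points, the request flow, and key dependencies."
```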

2. Use the Analysis Document

The sub-agent’s markdown report becomes your reference.

Keep it open while the main agent works.

3. Chain Multiple Investigations

Need to understand both authentication AND payment processing? Run two separate investigations, get two reports.

4. Preserve Context for Complex Fixes

By keeping the main agent’s context clean, you can tackle complex refactoring that would otherwise hit limits.

.

.

.

The WordPress Gotcha That Almost Got Me

This pattern revealed something interesting about WordPress development.

The setup_postdata() function is notoriously finicky about variable names. It specifically requires the global $post variable to be set.

This is the kind of subtle bug that could consume hours of debugging.

With the sub-agent pattern, we found it in minutes without polluting the main context.

.

.

.

Beyond WordPress: Universal Application

While my example uses WordPress, this pattern works for any codebase:

  • React applications: Trace component hierarchies and prop flows
  • Node.js APIs: Map endpoint relationships and middleware chains
  • Python projects: Understand module dependencies and import structures
  • Ruby on Rails: Explore MVC relationships and gem integrations

The principle remains the same: Delegate investigation to preserve implementation context.

.

.

.

Your Next Steps

  1. Copy the sub-agent configuration from this post
  2. Add it to your Claude Code project (as a markdown file such as `.claude/agents/codebase-explorer.md`)
  3. Next time you need to investigate, use the magic words: “Use the codebase-explorer agent to investigate…”
  4. Watch your context efficiency soar

.

.

.

The Bottom Line

Context window management isn’t just about efficiency – it’s about capability.

By using a read-only sub-agent for investigation, you’re not just saving tokens. You’re ensuring Claude Code maintains the focus and context needed to implement complex solutions.

My newsletter display bug?

Fixed in minutes with a two-line change. But more importantly, I had 90% of my context window still available for additional improvements.

Stop letting investigation consume your implementation capacity.

Start using the codebase-explorer pattern.

Your future self – knee-deep in a complex refactoring with plenty of context to spare – will thank you.


P.S. – This pattern has fundamentally changed how I work with Claude Code. I now investigate first, implement second, and never worry about context overflow. Try it on your next debugging session and see the difference.