ISSUE #43 • Mar 27, 2026 • 12 MIN READ

Workflow Engineering in Action: Building a Reddit Summarizer From Scratch With Claude Code

Here’s a confession.

I follow about a dozen subreddit threads. AI tooling, Claude Code tips, local LLM experiments, dev workflows. And every single morning, I open Reddit fully intending to spend five minutes catching up.

Forty-five minutes later, I’m still scrolling.

Ninety percent of it is noise. Reposts, complaints (like those weekly usage/rate limits rants in r/ClaudeCode), low-effort memes, questions that got answered three threads ago. But buried somewhere in there — a workflow trick someone discovered at 2am, a Claude Code hack that actually works in production, a case study with real numbers — that stuff is gold.

I just couldn’t find it fast enough.

So I decided to build something. A simple Express server that would connect to the Reddit API, pull posts and comments from my favorite subreddits, store them locally as JSON files, and let me point Claude at the data to surface only what matters.

And here’s the part that matters for you: I built it using the Claude Code Workflow Engineering process I described in the previous issue. Start to finish. No shortcuts. No “eh, I’ll just wing this part.”

(Okay, I was tempted. But I didn’t.)

What follows is every step of that process applied to a real project — from a blank folder to a working app with full tests passing on the first attempt. Every screenshot. Every command.

Stay with me.

The Starting Point: One Idea, Zero Code

Here’s what my project folder looked like when I started: an idea.md file describing what I wanted, and the Workflow Engineering slash commands from the previous issue.

That’s it. No boilerplate. No template repo. No starter code. Just an idea and a process.

Project folder showing only the idea.md file and workflow engineering commands

The idea itself was pretty straightforward: an Express server that fetches posts and comments from configured subreddits within the last 24 hours, then saves everything as JSON files organized by subreddit and date. No database — just files on disk. Once the data is collected, I can ask Claude to read it and find the good stuff for me.

The one wrinkle? Reddit’s API now requires OAuth 2.0. So the app needs to handle the full authorization flow — token exchange, refresh tokens, the whole dance — before it can fetch anything.

With a clear idea written down, I handed it to the workflow.

Let’s walk through what happened.

Step 1: Brainstorm the Specs

I triggered the /spec_brainstorm command and pointed Claude at my idea file.

Claude Code terminal showing /spec_brainstorm command being triggered with the idea.md file

Now, I’ve tried building apps like this before — dumping everything into one prompt and letting Claude run. It got through maybe 60% before the code started contradicting itself. Requirements from the top of the conversation were ghosted by the bottom.

The Claude Code Workflow Engineering approach is different. Instead of jumping into code, Claude started asking clarifying questions. Real ones. With options, explanations, and a recommendation for each.

The first round covered core architecture decisions: How should data collection be triggered? (Manual API endpoint, cron scheduler, or both?) What kind of frontend does this need beyond the OAuth setup page? How should filtering work?

Claude Code presenting multiple-choice question about data collection trigger method with options for manual API endpoint, built-in cron scheduler, or both

Claude Code asking about frontend scope with options for minimal OAuth-only page or full dashboard UI

Summary of first round answers covering Summarizer, Trigger, Frontend, and Filtering categories

The second round went deeper: How should subreddits be configured? (Config file, hardcoded, or environment variables?) What data should go into the JSON files? (Posts only, posts + all comments, or posts + top comments?) Language preference?

Summary of second round answers covering Config, Data scope, and Language with TypeScript selected

Once Claude had enough context from both rounds, it wrote the full specification document.

Claude Code writing the complete specs.md file based on all the answers provided

Two rounds of questions. Clear decisions documented in a file. The specs existed as an artifact on disk — ready to be read by a completely fresh session with zero memory of this conversation.

That last part matters more than you might think.

(We’re about to see why.)

Step 2: Review the Specs

Here’s where most people go wrong. And I know this because I was most people.

On an earlier project, I skipped the review step. The specs looked fine to me. Three hours into implementation, I found a conflict that would have taken a reviewer two minutes to flag. Two minutes.

So now I don’t skip it.

Here’s the thing: the agent that wrote the specs is the worst possible agent to review them. It already “knows” what it meant. It won’t catch ambiguity because it can fill in the gaps from memory. A fresh agent reading the same file cold? It has no such luxury.

New session. /spec_review command.

New Claude Code session showing /clear followed by /spec_review command

A fresh Claude instance — with zero memory of the brainstorming conversation — read the specs and started poking holes.

And it found real problems. Using GET for state-changing operations (a REST convention violation and a security risk — someone could trigger data collection just by visiting a URL). Writing refresh tokens directly to .env at runtime (which, ferpetesake, doesn’t work the way the spec assumed). Vague OAuth state storage. And more.

Claude Code presenting spec review findings including P1 GET for state-changing operations, P2 writing refresh token to .env, and P3 vague OAuth state storage

Now, here’s where your judgment comes in. Claude surfaced a long list of potential issues — some critical, some nice-to-have. You don’t have to fix everything. You get to choose what matters.

I went through them in three tiers.

First — the spec-breaking issues:

Multi-select interface showing spec fixes with options like REST endpoint methods, OAuth state storage, error strategy, and collect-all endpoint behavior

Second — important improvements:

Second page of issue selection showing P5 Collect-all timeout, P6 Pagination limit, P7 Partial failure, and P8 Date boundary

Third — lower priority fixes (I picked the ones with real consequences):

Third page showing lower priority issues including P10 More Reddit error codes, P11 User-Agent source, and P13 Path traversal security fix

Before applying fixes, Claude asked clarifying questions to make sure the solutions would be solid. How should /collect-all handle job tracking? What should the filename date represent — when data was collected or when the post was created? Where should the Reddit username for the User-Agent header come from?

Claude asking about collect-all endpoint job tracking approach with sequential vs parallel options

Claude asking about date boundary logic for filename dates with a visual example showing collection date mapping

Claude asking about User-Agent source with env var recommended, showing example .env configuration

With answers in hand, Claude updated the specs — changing GET to POST for state-changing endpoints, adding proper error handling, fixing the OAuth storage approach, adding pagination limits, and patching a path traversal vulnerability.

Claude modifying the specs file with red/green diff showing changes to API endpoints and OAuth flow

Summary table of all spec changes applied, showing 10 fixes across REST methods, token storage, OAuth state, pagination, error handling, and path traversal

Ten issues addressed. Specs refined. The artifact on disk now reflected a far more robust design than what the brainstorming session produced alone.

And we still haven’t written a single line of code.

(On purpose.)

Step 3: Write the Test Plan

I’ll be honest with you — this step almost didn’t happen.

Writing tests for code that doesn’t exist yet? It felt ceremonial. Like filling out a form nobody would read. I almost skipped it.

Then the test plan revealed two requirements I’d completely glossed over in the spec.

So now I never skip it.

Fresh session. /write_test_plan command.

New session with /clear followed by /write_test_plan command

Claude read the specs and produced a structured test plan: 33 test cases organized by priority. 8 Critical, 14 High, 10 Medium. Each one with preconditions, specific steps, and expected outcomes.

Claude writing test_plan.md with 33 test cases organized into sections covering config, OAuth, collection, filtering, storage, API, errors, frontend, and date handling

Why does this matter so much?

Because writing test cases forces deep analysis of every requirement. Turning “handle pagination limits” into a specific test case — with exact inputs, steps, and expected outputs — requires genuine understanding. Shallow understanding produces shallow tests, and you’d catch that now rather than three hours into debugging.

And there’s a second benefit: the test plan gives implementation a concrete target. Every task will map to specific test cases. “Done” stops being a gut feeling and starts being a checkmark.

Step 4: Write the Implementation Plan

Another fresh session. /write_impl_plan command.

New session with /clear followed by /write_impl_plan command

Claude read both the specs and the test plan, then generated an implementation plan — 10 tasks, each explicitly linked to the test cases it would satisfy.

Claude writing implementation plan showing project overview and task structure with dependencies

The plan organized tasks into execution waves based on dependencies. Every task mapped to specific test case IDs.

Implementation plan summary table showing 10 tasks with their key test case mappings and execution order across 5 waves

This is the last thinking step. After this, every design decision has been made. Every task has a defined scope. Every success criterion sits in a file on disk.

Now — and only now — we build.

Step 5: Execute the Implementation

Fresh session. /do_impl_plan command.

New session with /clear followed by /do_impl_plan command

Here’s where the Claude Code Workflow Engineering approach earns its keep.

Instead of running all 10 tasks in a single session (which would cause context degradation as the window fills up — I’ve been there, remember?), Claude created each task and processed them in waves using sub-agents. Each sub-agent got a fresh context window. It read the implementation plan from disk, found its assigned task, and executed with laser focus.

Wave 1 started with the foundation — project scaffolding.

Claude creating all tasks with dependencies and executing Wave 1 Task 1 for project scaffolding

Then the waves rolled forward, with parallel tasks running wherever dependencies allowed:

Wave progression showing tasks running in parallel — config loading, OAuth routes, collection logic, and filtering all executing concurrently across waves

Later waves handling comment fetching, error handling, and storage management with sub-agents completing tasks

Final implementation waves covering API routes and the frontend OAuth setup page

Implementation done. 16 files created. All 10 tasks completed across multiple waves.

Implementation summary showing all files created with their purposes — from package.json to the OAuth frontend page

Every sub-agent worked from the same artifact — the implementation plan on disk. No context bleeding between tasks. No “forgetting” early requirements while working on later ones.

Fresh context, every single wave.

Step 6: Setup Before Testing

Before running the test plan, I needed to set up the actual Reddit integration. Three things:

A config file defining which subreddits to monitor (I chose ClaudeCode and ClaudeAI — for obvious reasons):

config.json file showing two subreddits configured — ClaudeCode with minScore 10 and minComments 5, and ClaudeAI with defaults

A Reddit app registration to get OAuth credentials:

Reddit's create application page with RdSummarizer as the app name, web app type selected, and localhost redirect URI configured

And a .env file with the credentials:

.env file showing REDDIT_CLIENT_ID, REDDIT_CLIENT_SECRET, REDDIT_REDIRECT_URI, REDDIT_USERNAME, REDDIT_REFRESH_TOKEN (empty), and PORT=5566

Straightforward stuff. Let’s get to the good part.

Step 7: Run the Test Plan

The final step. Fresh session. /run_test_plan command.

(Deep breath.)

New session with /clear followed by /run_test_plan command

Claude read the test plan, explored the codebase, and confirmed this was a fresh test run with 33 test cases ready to execute.

Claude reading the test plan and exploring the codebase structure, confirming 33 test cases for a fresh test run

It created tasks for each test case, set up tracking files, and organized execution by dependencies and priority.

Claude creating 33 test case tasks with dependencies, setting up test-status.json and test-results.md tracking files

I asked Claude to skip TC-003 (environment variable validation) since that one needed manual testing with specific env states.

User asking Claude to skip TC-003 env validation test, Claude acknowledging and marking it as skipped

Then the tests ran. One sub-agent per test case. Each with fresh context.

Test execution Phase 2 running TC-001 through TC-006 with sub-agents, showing config loading, invalid config, OAuth redirect, and callback tests passing

Mid-test execution showing TC-007 through TC-010 passing — token refresh, collection happy path, subreddit validation, and hours parameter tests

Continued test execution with TC-011 through TC-019 — collection, filtering, storage, and error handling tests all passing with sub-agents

Test execution TC-020 through TC-025 — storage directory creation, merge deduplication, Reddit API pagination, comments fetching, and rate limit handling all passing

Final batch of tests TC-026 through TC-033 including error codes 401 403 404 429 5xx, frontend OAuth page, User-Agent header, and collection date naming all passing

All automated tests passed on the first attempt.

Zero code fixes required.

Test completion summary showing all 32 automated tests passed on first attempt with zero code fixes needed, and implementation matched the specification

Here’s the full results table:

Final score: 32/33 passed. 0 failed. 1 skipped (TC-003 — manual user testing). 0 known issues. 0 total fix attempts.

Test execution summary showing 32/33 passed, 0 failed, 1 skipped TC-003 for manual user testing, and all automated test cases passed on first attempt with zero code fixes

Let that sit for a second.

Every automated test passed on the first try. No code fixes needed. The implementation matched the specification because the specification had been thoroughly brainstormed, independently reviewed, and tested-before-built.

The ceremony I almost skipped? Turns out it was doing the heavy lifting all along.

Putting the App to Work

With all tests green, I could actually use the thing.

First up: the OAuth flow. I started the server and opened the setup page — a simple “Connect to Reddit” button.

Reddit Summarizer Setup page showing Not Connected status with an orange Connect to Reddit button

One click, and Reddit’s authorization page appeared.

Reddit OAuth authorization page asking to allow RdSummarizer to access posts and comments and maintain access indefinitely

After approving, the app received a refresh token and displayed it with clear instructions to add it to .env.

Reddit Summarizer success page showing the refresh token with a Copied button and instructions to add REDDIT_REFRESH_TOKEN to the .env file — **Note: The refresh token is fake.**

Token saved. Now I asked Claude to hit the /api/collect-all endpoint and pull data from both configured subreddits.

Claude Code running the collect-all endpoint with hours=24, showing successful collection from ClaudeCode and ClaudeAI subreddits with post counts

The data landed exactly where the specs said it would — JSON files organized by subreddit and date.

File explorer showing collected JSON data in logs folder organized by subreddit, with actual Reddit post data visible including titles, scores, and timestamps from the ClaudeCode subreddit

Now for the payoff.

I asked Claude to read the collected data and surface the latest Claude Code tips, workflows, and real-world case studies.

User prompt asking Claude to find and summarize the latest Claude Code tips, workflows, and case studies from the collected posts

The collected data was large — 64k tokens. Claude spawned 6 sub-agents to process it in parallel, each analyzing a chunk.

Claude processing the large data file with 6 parallel sub-agents, each analyzing a chunk of posts — ranging from 23.8k to 82.3k tokens

And here’s what came out — a synthesized summary of everything worth knowing from the last 24 hours across both subreddits:

Claude's synthesized insights showing top hacks like Force Opus sub-agents, hook-based context injection, notification sounds on Mac, and workflow optimizations including must-have settings and measure twice cut once workflows

Two subreddits. Hundreds of posts and comments. Distilled into actionable insights in under a minute.

I would never consume that volume of data and extract insights that fast by scrolling Reddit manually. The app collects and organizes. Claude analyzes and summarizes. And because all of this runs through my Claude subscription, there’s no separate API cost for the summarization part.

My morning Reddit scroll just went from 45 minutes to about 2.

Why the Workflow Made This Possible

You might be thinking: “Okay, but couldn’t you have built this without all the workflow steps? It’s just an Express server with some API calls.”

Honestly? Probably. This project is small enough that a skilled developer could prompt their way through it in one session.

But here’s what would have been different.

1. The spec review caught 10 issues before any code existed.

Using GET for state-changing operations. Writing tokens to .env at runtime. Missing pagination limits. A path traversal vulnerability. Any one of these would have meant debugging sessions after implementation — or worse, shipping a security hole you never noticed.

2. The test plan gave implementation a concrete target.

33 test cases, defined before Claude wrote a single line of code. When every task maps to specific success criteria, you don’t end up with “it seems to work” confidence. You end up with full tests passed on the first attempt confidence. There’s a world of difference between those two.

3. Fresh sessions prevented context rot.

The brainstorm session accumulated context from two rounds of Q&A. The review session started clean — and immediately found problems the brainstorming agent was blind to. The implementation used sub-agents in waves, each with its own fresh context window. No degradation. No forgotten requirements.

4. The artifacts served as shared memory.

Every step read from the previous step’s output file. Specs fed the review. Reviewed specs fed the test plan. Test plan fed the implementation plan. Implementation plan fed the sub-agents. Nothing lived “in context.” Everything lived on disk, where any fresh session could pick it up.

And here’s the part I keep coming back to: the workflow scales.

This project happened to be small.

The next one might not be.

And the exact same six commands:

/spec_brainstorm
/spec_review
/write_test_plan
/write_impl_plan
/do_impl_plan
/run_test_plan

…will work the same way regardless of what you’re building.

You design the process once. You refine it over time. Then you apply it to everything.

That’s the whole promise of Claude Code Workflow Engineering. And I think this little Reddit project makes a decent case for it.

Your Turn

The full source code is on GitHub: reddit-summarizer

If you want to use the same workflow for your own projects, grab the Workflow Engineering Starter Kit — all six command files, ready to drop into your .claude/commands/ folder.

Here’s what I’d suggest:

Pick a project idea you’ve been sitting on
Write it down in an idea.md file — even a rough paragraph works
Run the six-step workflow end to end
Pay attention to what the spec review catches — that’s usually where the biggest surprise shows up

What are you going to build with it?

Go engineer it.

Nathan Onn

Freelance web developer. Since 2012 he’s built WordPress plugins, internal tools, and AI-powered apps. He writes The Art of Vibe Coding, a practical newsletter that helps indie builders ship faster with AI—calmly.

Github Linkedin