Category: AI & Automation

AI tools, Claude Code, automation, and AI-assisted workflows

  • How to Use AI Agent Teams to Optimize Your Product Pages

    How to Use AI Agent Teams to Optimize Your Product Pages

    Most product pages are built once and forgotten. Someone writes a description, uploads photos, sets a price, and moves on. Months later, the page is still converting at 1% and nobody’s touched it because “it’s fine.”

    The problem is that a good product page isn’t one skill. It’s copywriting, conversion rate optimization, visual design, and brand consistency all at once. No single AI prompt holds all of those disciplines in focus simultaneously.

    I’ve written about the adversarial agent approach before — assembling specialized AI agents into a team, giving each one a scoring rubric, and iterating until they all agree the work is good. I recently applied this to a real Shopify product page with a four-agent team: a copywriter, a CRO specialist, a branding expert, and a visual designer. The conversion rate doubled in seven days.

    Here’s how to adapt this for your own pages.

    Score First, Then Build a Task List

    The key adaptation for product pages is turning agent feedback into a concrete task list you can work through.

    Point your agent team at the current page and have each specialist score it out of ten against their rubric. You’ll get feedback like: “6/10 — Add to Cart button blends into the background, social proof is buried below three scrolls” from the CRO agent, and “5/10 — product descriptions are feature lists, not benefit statements” from the copywriter.

    Combine all of their recommendations into a single prioritized list. This is your improvement backlog. The types of changes that consistently surface across e-commerce pages:

    • Primary action prominence — more contrast, higher placement on mobile, larger touch target for the CTA. Almost always the highest-impact change.
    • Mobile layout — product images eating too much vertical space, pushing price and CTA below the fold.
    • Benefit-oriented copy — shifting descriptions from “what this is” to “what this does for you.”
    • Social proof repositioning — moving reviews and trust signals closer to the point of purchase decision.
    • FAQ expansion — every unanswered objection is a reason to leave the page.

    Work through the list with yourself in the loop. Don’t hand everything to the AI and walk away. Agents occasionally recommend changes that score well on their rubric but don’t fit your broader context — aggressive urgency tactics that feel off-brand, or rewrites of sections you’ve crafted for a specific reason.

    After each batch of changes, re-score. You’ll see numbers climb, and you’ll see new issues surface that weren’t visible before. If you’re not familiar with the challenges of split testing, this iterative approach with agent scoring is a practical alternative — you get structured feedback without needing statistical significance on every change.

    Build Features Instead of Buying Apps

    One thing that came out of this process: AI agents can build small features that would normally cost $10 to $20 a month as a Shopify app.

    The CRO agent suggested social proof notifications — the little popups showing recent purchases. Instead of installing an app, an AI agent wrote a script that pulls real order data from the Shopify API, stores it in metafields, and displays it with a Liquid snippet. Twenty minutes of agent time, no monthly fee, no bloated JavaScript, no third-party tracking.
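
    For a sense of what that kind of script amounts to, here's a minimal sketch. It's not the exact code the agent wrote, and it assumes a custom app token in SHOPIFY_ADMIN_TOKEN with access to read orders and write metafields:

    import os
    import json
    import requests

    # Minimal sketch, not the production script: pull recent orders and stash a
    # summary in a shop metafield that the theme can read.
    SHOP = os.environ["SHOP_DOMAIN"]            # e.g. "my-store.myshopify.com"
    TOKEN = os.environ["SHOPIFY_ADMIN_TOKEN"]
    HEADERS = {"X-Shopify-Access-Token": TOKEN, "Content-Type": "application/json"}
    BASE = f"https://{SHOP}/admin/api/2024-01"

    # Pull the five most recent orders.
    orders = requests.get(
        f"{BASE}/orders.json", headers=HEADERS, params={"limit": 5, "status": "any"}
    ).json()["orders"]

    # Keep only what the popup needs: first name, city, product title.
    recent = [
        {
            "name": (o.get("customer") or {}).get("first_name", "Someone"),
            "city": (o.get("shipping_address") or {}).get("city", ""),
            "product": o["line_items"][0]["title"] if o["line_items"] else "",
        }
        for o in orders
    ]

    # Store the summary in a shop-level metafield for a Liquid snippet to render.
    requests.post(
        f"{BASE}/metafields.json",
        headers=HEADERS,
        json={
            "metafield": {
                "namespace": "social_proof",
                "key": "recent_orders",
                "type": "json",
                "value": json.dumps(recent),
            }
        },
    )

    A Liquid snippet in the theme then reads shop.metafields.social_proof.recent_orders and renders the popups; run the script on a schedule so the data stays fresh.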

    This works for a surprising number of app store features. Countdown timers, stock warnings, cross-sell blocks, announcement bars. If the feature is simple enough to describe, an agent can build a lightweight version that does exactly what you need. This is the same growth engineering approach I’ve been using across my marketing stack — treating your code editor as the platform instead of buying SaaS for everything.

    Then Work on the Economics

    A better-converting page is only half the equation. If margins are thin and average order value is low, you can’t scale paid advertising profitably.

    Once conversion improvements stabilize, shift the agent team to pricing structure. Have them model bundle configurations, free shipping thresholds, COGS at different quantities, pick and pack costs, and shipping rates across weight breaks. The goal is maximizing contribution margin per order while maintaining conversion rates.

    What came out of this for me was more aggressive than I would have tested on my own. The AI ran the numbers without the emotional anchoring that comes from having set the original prices yourself. No bias. Just math.
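
    To make "just math" concrete, here's a toy version of the kind of contribution margin model the agents produce. The numbers are illustrative, not from my store:

    # Toy contribution-margin model. All numbers are made up; the point is the
    # shape of the calculation the agents run, not the specific values.
    def contribution_margin(price, units, cogs_per_unit, pick_pack, shipping, fees_pct=0.029):
        costs = cogs_per_unit * units + pick_pack + shipping + price * fees_pct
        return price - costs

    single = contribution_margin(price=29.0, units=1, cogs_per_unit=8.0, pick_pack=2.5, shipping=6.0)
    bundle = contribution_margin(price=49.0, units=2, cogs_per_unit=8.0, pick_pack=3.0, shipping=7.5)

    print(f"single unit: ${single:.2f} contribution per order")   # ~$11.66
    print(f"two-pack:    ${bundle:.2f} contribution per order")   # ~$21.08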

    The structural changes worth considering:

    • Bundle incentives inside the cart — present options the moment someone adds a product, not on a separate page.
    • Tiered thresholds — make each additional item feel like an obvious deal. Free shipping at one level, a percentage off at the next.
    • Higher price points — if your page is now doing its job with strong copy and visible social proof, customers may tolerate more than you assume.

    Measure Patiently

    Page layout changes show results fast. My conversion improvements were clear within the first week.

    Avoid changing too much at the same time. It’s hard to isolate which changes were improvements and which were duds.

    Give it a shot on your site – let me know how it goes.

  • Let’s Talk About the Openclaw in the Room

    Let’s Talk About the Openclaw in the Room

    Everyone’s talking about Openclaw this week. If you haven’t seen it: it takes a Claude model, strips off the guardrails, wraps it in some extra tooling, and lets it run autonomously. People are impressed. I ran it. And I have thoughts.

    What Openclaw actually does

    There are really three things going on:

    First, it runs in what they call dangerous mode. No safety rails, full access to your machine. The agent will scour your computer for API keys hidden in config files, environment variables, wherever. It may use them. It may publish them. You don’t know. This is why the security-conscious crowd runs it on dedicated cloud hardware with nothing on it they didn’t explicitly provision. That’s the right instinct.

    Second, it has a built-in cron that lets the agent schedule its own work. This is the part that matters most. Tell it to manage your X account and it will keep posting all day without stopping. It doesn’t run to completion and then wait for you to kick it again. It stays alive.

    Third, it shifts the interface to be chat-centric through existing messaging channels. The win here is portability. You can talk to it while commuting, ask questions from your phone, and it has the full context of your projects, your files, your authentication. That’s something you don’t get when you open a fresh conversation in ChatGPT.

    My take: too big a leap

    I’ve been running agents hard for weeks. I’ve built multi-agent teams with Claude Code and pushed the current tooling about as far as it goes. And my honest reaction to Openclaw is that it jumped too far.

    The user interface introduces a huge number of configuration options. There are a lot of moving parts to set up. It’s not an incremental lift from the interfaces people are already comfortable with. It’s a full departure. And I think that matters more than the community is acknowledging right now.

    There may also be some architectural choices that are going to be hard to walk back from. When you build a foundation that’s too complex from day one, you end up having to simplify later, and simplifying is always harder than starting simple.

    The real insight: agents need a heartbeat

    Strip away the configuration and the dangerous mode and the chat interface. What’s the core idea that makes Openclaw feel alive?

    It’s a loop.

    Current AI models just turn off. They don’t compute any signals between conversations. There’s no input, no processing, nothing running. They’re not awake unless someone talks to them or they’re working through a task. They have no self-start feature. When they reach the end of a prompt, they effectively pass out and don’t wake up until someone asks them another question.

    That’s wildly different from a human brain, which keeps running between conversations. You finish talking to someone and you keep thinking about what they said. You notice things. You have ideas at 2am.

    The insight, whether it came from the RALPH loop concept or from Openclaw’s cron, is the same: give the agent a heartbeat. A daemon process that periodically checks in and says “is there anything new to do?” That’s what keeps a little bit of life alive in these things.

    What a heartbeat enables

    With just a simple startup hook, every time your agent wakes up it checks:

    • Are there new blog posts or news to process?
    • Did anybody post something on a website I’m monitoring?
    • What time is it, and should I adjust the smart home lights?
    • Are there new GitHub issues or error logs on the server?
    • Is there anything left in the PRD that needs building?
    • Can I rerun the unit tests to make sure everything still passes?

    Each check is an opportunity for the agent to take a bigger action. That action might be posting to Twitter, writing a marketing report, continuing development on a project, or flagging something that needs human attention.

    This is a different thing entirely from scheduled tasks in ChatGPT. Those run a prompt on a timer, sure. But they don’t spin off and create new things. They don’t continuously work through a multi-step project. A local agent with a heartbeat can pick up where it left off, assess the state of a project, and keep going. I’ve been using this kind of persistent agent approach for growth engineering with Claude Code and the difference is night and day.

    A simpler path

    I’ve been building this into the Culture framework. There’s a daemon that auto-updates itself when the core code changes and pings each agent on a schedule. Anything the agent wants to do gets triggered on that heartbeat. Check a website, generate some content, participate in a larger process.

    Right now it’s basic. But the direction is clear. Delayed jobs: “check this in 30 minutes” and the agent schedules itself to wake up in 30 minutes. Recurring tasks on a cron: run this report every two days, check inventory every morning, post a thread every afternoon. These patterns are well established in SaaS operations. Work queues, background jobs, scheduled tasks. Every serious web application runs on them. The difference is that now the worker picking up the job is an AI agent instead of a function.

    And the whole thing sits on top of Claude Code. No new interface to learn. No massive configuration surface. Just a daemon and a skill file, extending the tools people are already using.
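
    If you want a feel for how little code a heartbeat needs, here's a minimal sketch. It isn't the Culture daemon itself, just the core loop, and it assumes Claude Code's non-interactive mode (claude -p) is installed and working in the project directory:

    import subprocess
    import time

    # Minimal heartbeat: wake the agent on an interval and let it decide what to do.
    # A real daemon adds logging, delayed jobs, and cron-style schedules on top.
    HEARTBEAT_PROMPT = (
        "This is a scheduled heartbeat. Check the project for anything new to do: "
        "new issues, failing tests, unfinished PRD items. If there is nothing, reply DONE."
    )

    while True:
        try:
            subprocess.run(
                ["claude", "-p", HEARTBEAT_PROMPT],
                cwd="/path/to/project",  # hypothetical project path
                timeout=30 * 60,         # don't let a single beat run forever
            )
        except subprocess.TimeoutExpired:
            pass                         # a stuck beat shouldn't kill the daemon
        time.sleep(60 * 60)              # beat once an hour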

    Incremental beats revolutionary

    Openclaw might get there. They might simplify the interface and solidify the architecture. But right now it feels like it skipped a few steps.

    I think the safer bet is incremental. Add one thing at a time to the tools people already know. The daemon is the single most valuable addition: it turns a stateless prompt-response tool into something that behaves like a persistent agent. Combine that with skill files for context and you have most of what makes Openclaw exciting without the complexity tax. It’s the same philosophy behind mini AI automations: small additions, compounding returns.

    If you want to try this approach, join the Culture at join-the-culture.com. It’s early, but the idea is simple: give your agents a heartbeat and see what they do with it.

  • Adversarial Agents: How AI Teams Build Better Creative Work

    Adversarial Agents: How AI Teams Build Better Creative Work

    In software engineering, tests and code exist in tension. Unit tests verify the program is correct. The program, in turn, validates that the tests make sense. They reinforce each other. Neither is complete without the other.

    I’ve been applying this same adversarial principle to creative work with AI, and it’s producing noticeably better results than single-agent prompting.

    The single-agent problem

    A single AI agent thinks linearly, one token at a time. Ask it to build a landing page, and it’ll produce something reasonable. But a good landing page isn’t just one skill. It’s copywriting, web design, conversion rate optimization, brand compliance, marketing strategy, and sometimes legal considerations, all at once.

    No single pass through a context window can hold all of those disciplines in focus simultaneously. The agent will nail the copy but forget the CRO fundamentals, or get the design right but drift off brand voice. Something always slips.

    Setting up the adversarial team

    The fix is to stop asking one agent to do everything and instead assemble a team where each member brings a deep specialization.

    I have a main orchestrator agent spawn sub-agents (or use Claude Code’s team features), and each team member pulls in a dedicated skill file loaded with context for their domain. A copywriting agent might have 500 examples from top copywriters, excerpts from books, your favorite and least favorite examples. A web design agent has example pages, layout patterns, accessibility standards. A branding agent carries your full brand guidelines, voice documentation, and imagery specs.

    These skill files can be massive and detailed. That’s the point. You’re front-loading each agent’s short-term memory with deep expertise before it ever looks at your work. I touched on this idea of building AI-operable systems in a previous post, and the skill file approach takes it even further.

    The rubric and scoring loop

    Each specialized agent receives the current draft of whatever you’re building and evaluates it through its own lens. The CRO agent, for example, might score against a rubric like:

    • Is the value proposition clear above the fold?
    • Are CTAs bold with action-oriented copy?
    • Are social proof elements (ratings, testimonials) visible?
    • Where is the pricing positioned?
    • Is there urgency (countdown timer, limited availability)?
    • Is the page scannable with clear visual hierarchy?

    It scores each dimension, produces an overall rating out of 10, and returns the score along with its top recommendations for improvement.

    Every agent does this independently, through its own lens. The copywriter scores the writing. The designer scores the layout and visuals. The brand agent checks voice and visual consistency. Each one comes back with a number and a list of suggestions.

    Convergence through conflict

    This is where it gets interesting. These agents don’t naturally agree. Good copywriting might clash with brand voice. Bold CRO tactics might conflict with clean design sensibility. Compliance requirements can undercut persuasive copy.

    They’re in genuine tension, just like real team members with different expertise.

    The orchestrator’s job is to synthesize:

    1. Collect all scores. If any agent scores below 9 out of 10, another iteration is needed.
    2. Read the feedback from all agents and identify the most impactful changes.
    3. Revise the deliverable, balancing competing recommendations.
    4. Send it back out for another round of scoring.

    Each cycle tightens the work. The copy gets sharper and the design gets more intentional. Objections get handled. Details that a single-pass agent would miss get caught by one specialist or another.
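
    In rough Python, the loop looks something like this. The load_draft, ask_specialist, and revise_draft helpers are hypothetical stand-ins for however you spawn sub-agents and apply their feedback (Claude Code tasks, API calls, whatever you use):

    # Rough shape of the orchestrator loop.
    SPECIALISTS = ["copywriter", "cro", "brand", "design"]
    TARGET = 9
    MAX_ROUNDS = 5

    draft = load_draft("landing_page.html")

    for round_num in range(1, MAX_ROUNDS + 1):
        # Each specialist returns (score, notes) for the current draft.
        reviews = {name: ask_specialist(name, draft) for name in SPECIALISTS}
        scores = {name: score for name, (score, _notes) in reviews.items()}
        print(f"round {round_num}: {scores}")

        if all(score >= TARGET for score in scores.values()):
            break  # every specialist is satisfied

        # Collect the feedback, prioritize the most impactful changes, and revise.
        feedback = [notes for _score, notes in reviews.values()]
        draft = revise_draft(draft, feedback)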

    The GAN connection

    This plays on one of my favorite concepts in AI: the generative adversarial network. In a classic GAN, one model generates images while a second model tries to determine if each image is real or AI-generated. They train against each other. The generator improves because the discriminator keeps catching it, and the discriminator improves because the generator keeps getting better at fooling it.

    What makes GANs clever is that they create a self-improving feedback loop without needing manually labeled training data. The adversarial structure itself is the training signal.

    What I’m describing with agent teams operates at a higher level: LLMs in role-based scenarios providing structured feedback to each other. But the principle is the same: tension between evaluators and creators drives quality upward through iteration.

    What this actually looks like

    Over the past couple weeks, I’ve used this pattern for:

    • Landing pages for my business. Multiple sales pages where CRO, copywriting, brand, and design agents each scored and refined the work through several iteration cycles.
    • A full blog redesign pulling in SEO, marketing strategy, brand identity, and web design as separate evaluation lenses. I’ve been using this kind of growth engineering with Claude Code approach across a lot of my marketing work.
    • A short playbook on using AI for business, where editorial, subject matter, and audience-fit agents each had their say.
    • Software where domain expertise agents (say, one that understands CPG accounting) worked alongside a coding agent to build something neither could have built alone.

    In each case, the final product had a completeness that single-pass generation just doesn’t produce. You notice it. Fewer holes, fewer “oh we forgot about that” moments.

    The cost

    Let’s be honest about the trade-offs. This approach burns through tokens. A landing page might take 30 to 40 minutes of agent runtime with multiple research phases, iteration loops, browser screenshots for visual verification, and re-scoring cycles.

    That’s a lot compared to a single prompt that returns something in 30 seconds. But 30 minutes for a landing page that’s been reviewed by the equivalent of five specialists? I’ll take that trade every time.

    You’re trading tokens for quality assurance. The same way a real team costs time and money to review each other’s work, the agent team costs compute. But the output is closer to what a real team would produce.

    Same brain, different books

    I keep coming back to this thought. These agents are fundamentally the same model. Claude is Claude, whether it’s playing the copywriter or the CRO specialist. The difference is what you loaded into its context window before it started working.

    It’s like having the same person walk into the room, but each time they’ve just finished reading five different books. The copywriting agent just absorbed every example and principle you could fit in. The brand agent just re-read your entire brand bible. They bring different perspectives because they’re primed with different information, not because they’re different intelligences.

    That framing is why I think this works so well. You’re giving the same capable reasoner different source material to reason from, and the disagreements that emerge are real, not manufactured.

    Running a company of agents

    Working this way is starting to feel less like programming and more like management. You delegate work, wait for feedback, reconcile conflicting opinions, make a call, and send it back for another round. I wrote about the early stages of this shift in The AI CEO, and it’s accelerating faster than I expected.

    In some of these cases, you’re delegating to a team. It won’t be long before you’re delegating to departments. Fully AI departments with dozens or hundreds of agents that have been sub-delegated to operate on specific pieces of a larger project.

    I’m already routinely running five to ten agents against the same deliverable. Scale that up and you start to see the shape of something that looks a lot like an org chart, except every box is an AI agent with a specialized skill set.

    Try it

    If your AI tool of choice supports agents and sub-agents, try this. Even a rough version works:

    1. Pick a deliverable: a landing page, a blog post, a piece of code.
    2. Identify three or four disciplines that matter for quality. Copy, design, SEO, whatever fits.
    3. Create a skill prompt for each discipline, as detailed as you can make it.
    4. Have each specialist score the work on a 1 to 10 rubric with specific recommendations.
    5. Iterate until every specialist scores a 9 or above.

    You’ll burn more tokens and it’ll take longer. But I haven’t gone back to single-pass generation for anything that matters. Once you’ve seen what a team of agents produces compared to one agent winging it, the difference is hard to unsee. The same idea that makes software testing indispensable, that adversarial pressure produces better results, turns out to work just as well when the thing being tested is a creative deliverable instead of a codebase.

  • From a Week to Four Hours: Building Chrome Extensions with AI

    From a Week to Four Hours: Building Chrome Extensions with AI

    A year ago, I built my first Chrome extension. It took the better part of a week.

    A few days ago, I built my second Chrome extension. It took four hours.

    Same developer. Similar complexity. Almost no retained knowledge about Chrome extension development between the two projects. The difference was the AI.

    The First Extension

    The first project was a scraper for Amazon Seller Central—pulling data out of the seller dashboard and generating reports. I built it with one of the ChatGPT 4.x models, whichever was current at the time.

    It was painful. But impressive at the time.

    Not because Chrome extensions are impossibly hard, but because I’d never built one before and the AI couldn’t quite get me there cleanly. Every step involved back-and-forth. I’d describe what I wanted, get code that didn’t work, debug it, explain the error, get a fix that broke something else, repeat.

    The manifest file alone took multiple attempts to get right. Permissions, content scripts, background workers—each concept required me to learn enough to understand why the AI’s suggestions weren’t working, then nudge it toward a solution.

    By the end of the week I had a working extension, but I’d earned it through iteration and frustration.

    The Second Extension

    Fast forward to last week. I needed another Chrome extension—this one scrapes recipe information from web pages and submits it to a backend API. Different purpose, but similar complexity to the first project.

    I opened Claude Code and described what I wanted.

    One prompt later, I had a working prototype running locally.

    Not a starting point. Not scaffolding that needed extensive modification. A working extension that did the core job. From there, it was small iterations—mostly around authentication with my backend. But the foundation was solid from the first response.

    What Changed

    The moments that stood out weren’t dramatic. They were just… easy in a way that felt wrong.

    The manifest: Chrome extensions require a manifest.json file that defines permissions, scripts, icons, and metadata. Last year, this was a source of misunderstandings and rejections. This time, Claude one-shot it. Correct permissions, proper structure, sensible defaults. I didn’t have to understand why it worked—it just did.

    The submission process: I’d completely forgotten how to submit an extension to the Chrome Web Store. Claude walked me through it—descriptions, screenshots, privacy policy requirements, the review process. Not generic advice, but specific guidance tailored to what I’d built.

    Performance and security: After the core functionality worked, I prompted my way through improvements. “Make this more efficient.” “Are there any security concerns?” Each time, I got specific changes to the code. I did a cursory review to make sure nothing looked insane, but I didn’t have to dive deep into the implementation to fix anything myself.

    Four hours from start to ready-for-submission.

    The Gap Is Closing

    I’m not a better developer than I was a year ago—at least not at Chrome extensions. I’d forgotten almost everything I learned during that first project. But the AI got dramatically better.

    ChatGPT 4.x was helpful but unreliable. It got me part of the way there, then I had to fight through the gaps. Claude Code with Opus 4.5 understood what I was trying to build and just… built it.

    The difference isn’t subtle. It’s not 20% faster or “somewhat easier.” It’s the difference between a week of grinding and an afternoon of iterating.

    What This Means

    I think about this when people ask whether AI is actually useful for development, or if it’s just hype. The answer depends entirely on when you last tried it.

    If your experience with AI coding assistants was ChatGPT circa 2024, you probably remember the frustration—code that almost worked, endless debugging, the feeling that you could’ve done it faster yourself. That was real.

    But the tools from six months ago aren’t the tools from today. The gap between “AI assistant that helps” and “AI that builds” is closing fast. For a task I’d done exactly once before, with knowledge I’d completely lost, I went from a week to four hours.

    That’s not incremental improvement. That’s a phase change.


    Both extensions are in production. One took a week of frustration. One took an afternoon.

  • How I Use AI to Write and Publish Blog Posts

    How I Use AI to Write and Publish Blog Posts

    This post is a bit meta. I’m using the exact workflow I’m about to describe to write and publish this very article.

    Here’s the setup: I speak my ideas out loud, an AI turns them into polished prose, another AI generates the hero image, and a set of scripts I built with AI assistance handles the publishing. The whole thing lives in a GitHub repository that you can clone and use yourself.

    Let me walk you through how it works.

    The Problem With Writing

    I have ideas. Lots of them. The bottleneck has never been coming up with things to write about—it’s the friction between having a thought and getting it published.

    Traditional blogging requires you to:

    1. Sit down and type out your thoughts
    2. Edit and format the content
    3. Find or create images
    4. Log into WordPress
    5. Copy-paste everything into the editor
    6. Set featured images, categories, meta descriptions
    7. Preview, fix issues, publish

    Each step is a context switch. Each context switch is an opportunity to abandon the post entirely. My drafts folder is a graveyard of half-finished ideas.

    Voice First

    The breakthrough was realizing I don’t need to type. I use Wispr Flow for voice-to-text dictation. It runs locally on my Mac and transcribes speech with surprisingly good accuracy.

    Now when I have an idea for a post, I just… talk. I ramble through my thoughts, explain the concept as if I’m telling a friend, and let the words flow without worrying about structure or polish.

    The output is messy. It’s conversational, full of “um”s and tangents. But it captures the core ideas in a way that staring at a blank page never did.

    AI as Editor

    This is where Claude Code comes in. I take my raw dictation and ask Claude to transform it into a structured blog post. Not just grammar cleanup—actual restructuring, adding headers, tightening the prose, finding the narrative thread in my stream of consciousness.

    The key is that I stay in control. Claude produces a markdown draft, and I review it. I keep what works, rewrite what doesn’t, add details Claude couldn’t know. The AI handles the tedious transformation from spoken word to written word. I handle the judgment calls about what’s actually worth saying.

    The Publishing Pipeline

    Here’s where it gets interesting. I built a set of CLI tools that Claude Code can use to handle the entire publishing workflow.

    When I’m ready to publish, I have a conversation like this:

    Me: "Generate a cyberpunk-style hero image for this post about AI blogging workflows,
    crop it to 16:9, and publish to WordPress with the featured image attached."
    
    Claude: [Generates image with Gemini] → [Crops and converts to JPG] →
            [Uploads to WordPress] → [Converts markdown to Gutenberg blocks] →
            [Creates post with featured image] → Done. Here's your URL.

    One conversation. Full pipeline. No clicking through WordPress admin panels.

    How the Tools Work

    The publishing toolkit includes:

    Voice capture – Wispr Flow transcribes my dictation to text

    Content transformation – Claude Code converts raw transcription to structured markdown

    Image generation – The Nano Banana Pro plugin generates hero images using Google’s Gemini model

    Image processing – A Python script crops images to 16:9 and converts to web-optimized JPG

    WordPress publishing – Another Python script handles media uploads, post creation, and metadata via the WordPress REST API

    File organization – Each post lives in its own dated folder with the markdown source, images, and a metadata JSON file for future edits

    The WordPress MCP server that ships with Claude Code can create posts, but it can’t upload media or set featured images. So I built CLI tools to fill those gaps. Claude Code runs them as needed during the publishing conversation.
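
    If you're curious what the WordPress piece boils down to, here's a minimal sketch of the two REST calls involved, assuming an application password in environment variables and the standard WordPress REST API endpoints. The real script also handles Gutenberg conversion, categories, and SEO metadata:

    import os
    import requests

    WP_URL = os.environ["WP_URL"]                    # e.g. "https://example.com"
    AUTH = (os.environ["WP_USER"], os.environ["WP_APP_PASSWORD"])

    # 1. Upload the hero image to the media library.
    with open("featured.jpg", "rb") as f:
        media = requests.post(
            f"{WP_URL}/wp-json/wp/v2/media",
            auth=AUTH,
            headers={
                "Content-Disposition": 'attachment; filename="featured.jpg"',
                "Content-Type": "image/jpeg",
            },
            data=f.read(),
        ).json()

    # 2. Create the post with the uploaded image set as the featured image.
    post = requests.post(
        f"{WP_URL}/wp-json/wp/v2/posts",
        auth=AUTH,
        json={
            "title": "How I Use AI to Write and Publish Blog Posts",
            "content": "<!-- wp:paragraph --><p>...</p><!-- /wp:paragraph -->",
            "status": "draft",
            "featured_media": media["id"],
        },
    ).json()

    print(post["link"])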

    Everything in Git

    The entire setup lives in a GitHub repository. Each blog post is a folder:

    posts/
    ├── 2026-01-13-ai-powered-blog-workflow/
    │   ├── content.md          # This post
    │   ├── featured.jpg        # Hero image
    │   ├── hero.png            # Original generated image
    │   └── meta.json           # WordPress post ID, dates, SEO fields

    Version control for blog posts. If I need to update something, I know exactly where to find it. The meta.json file stores the WordPress post ID so I can push updates to the live site.

    The Meta Part

    Here’s what’s happening right now:

    1. I dictated the concept for this post using Wispr Flow
    2. I asked Claude Code to turn my rambling into a structured article
    3. I reviewed and edited the markdown
    4. I’ll ask Claude to generate a hero image
    5. Claude will crop it, upload it to WordPress, and publish

    The workflow I’m describing is the workflow producing this description. It’s turtles all the way down.

    Try It Yourself

    The publishing toolkit is open source: github.com/mfwarren/personal-brand

    You’ll need:

    • A WordPress site with REST API access
    • An application password for authentication
    • Claude Code with the Nano Banana Pro plugin for image generation
    • Wispr Flow (or any voice-to-text tool) for dictation

    Clone the repo, configure your credentials, and start talking. The gap between having an idea and publishing it has never been smaller.


    Written by dictation, edited by AI, published by CLI. The future of blogging is conversational.

  • How a Holiday Tech Support Call Turned Into a Full-Stack AI Project

    How a Holiday Tech Support Call Turned Into a Full-Stack AI Project

    Like many eldest sons, I have a standing role as family tech support. This holiday season, that role led me somewhere unexpected: launching a new product.

    The Call

    I was visiting my parents over the holidays when they asked for help with a recipe app called MasterCook. They’d been using it for years, but the service was being decommissioned. Could I help them migrate their recipes somewhere else?

    I looked at the recommended migration path. Then I looked at the replacement applications. They were… not great. Clunky interfaces, limited features, the kind of software that feels abandoned even when it’s technically still maintained.

    I had a week of vacation left. I thought: I can build something better than this.

    One Week Later

    That thought became save.cooking – and it’s grown far beyond what I originally imagined.

    What started as a simple tool to import MasterCook recipe files has evolved into a fully-featured AI-enhanced meal planning platform:

    Core Features:

    • Import recipes from MasterCook (.mxp, .mx2) and other formats
    • AI-powered recipe parsing that actually understands ingredients and instructions
    • Vector embeddings that map recipe similarity – find dishes related to ones you love
    • Automatic shopping list generation synced to your weekly meal plan
    • Public recipe sharing with user profiles
    • Full meal plan sharing – not just individual recipes

    Technical Details I Never Would Have Tackled Alone:

    • JSON-LD structured data for Google Recipe rich results
    • Pinterest-optimized images and metadata
    • Open Graph tags specifically tuned for recipe content
    • Responsive Next.js frontend (not my usual stack)

    The site already has over 300 public recipes in its database, and that number grows daily.

    The AI Difference

    Here’s the thing: I’m not a Next.js developer. I’ve built backends, APIs, CLIs – but modern React frontends aren’t my wheelhouse. A year ago, this project would have taken months and looked significantly worse.

    With Claude Code handling the heavy lifting, I could focus on product decisions while the AI handled implementation details. Need Pinterest meta tags? Claude knew the exact format. Want vector similarity search? Claude set up the embeddings pipeline. Struggling with a responsive layout? Claude fixed the CSS.

    This isn’t about AI writing code for me. It’s about AI expanding what I can realistically build. The cognitive load of learning a new framework while also designing features while also handling deployment – that’s usually where side projects die. AI agents absorbed that load.

    The Graveyard Problem

    Recipe websites are a graveyard. AllRecipes feels like it hasn’t been updated since 2010. Food blogs are drowning in ads and life stories before you get to the actual recipe. Apps come and go, taking your data with them.

    People have stopped expecting good software in this space. They’ve accepted that finding a recipe means scrolling past someone’s childhood memories and closing seventeen popups.

    I think we can do better. I think we should do better. Cooking is fundamental – it’s one of the few things that genuinely brings people together. The software around it shouldn’t be an obstacle.

    What’s Next

    save.cooking is live and growing. I’m using it daily for my own meal planning. Features are shipping weekly:

    • Ingredient substitution suggestions
    • Nutritional analysis
    • Collaborative meal planning for households
    • Recipe scaling that actually works
    • Smarter shopping list organization by store section

    If you’ve got recipes trapped in old software, or you’re just tired of the current options, come check it out at save.cooking.

    And if you’re a developer wondering what you could build in a week with AI assistance – the answer might surprise you. The constraint isn’t technical capability anymore. It’s just deciding what’s worth building.


    Built with Claude Code over a holiday week. The family tech support call that actually paid off.

  • Claude Code First Development: Building AI-Operable Systems

    Claude Code First Development: Building AI-Operable Systems

    Most developers think about AI coding assistants as tools that help you write code faster. But there’s a more interesting question: how do you architect your systems so an AI can operate them?

    I’ve been running production applications for years. The traditional approach is to build admin dashboards – React UIs, Django admin, custom internal tools. You click around, run queries, check metrics, send emails to users. It works, but it’s slow and requires constant context-switching.

    Here’s the insight: Claude Code is a command-line interface. It can run shell commands, read output, and take action based on what it sees. If you build your admin tooling as CLI commands and APIs instead of web UIs, Claude Code becomes your admin interface.

    Instead of clicking through dashboards to debug a production issue, you tell Claude: “Find all users who signed up in the last 24 hours but haven’t verified their email, and show me their signup source.” It runs the commands, parses the output, and gives you the answer.

    This is Claude Code First Development – designing your production infrastructure to be AI-operable.

    The Architecture

    There are three layers to this:

    1. Admin API Layer

    Your application exposes authenticated API endpoints for admin operations. Not public APIs – internal endpoints that require admin credentials. These give you programmatic access to:

    • User data (lookups, activity, state)
    • System metrics (signups, WAU, churn, error rates)
    • Operations (send emails, trigger jobs, toggle features, issue refunds)

    2. CLI Tooling

    Command-line tools that wrap those APIs. Claude Code can invoke these directly:

    ./admin users search --email "foo@example.com"
    ./admin metrics signups --since "7 days ago"
    ./admin jobs trigger welcome-sequence --user-id 12345
    ./admin logs errors --service api --last 1h

    3. Credential Management

    The CLI tools handle authentication – reading tokens from environment variables or config files. Claude Code doesn’t need to know how auth works, it just runs commands.

    Building the CLI Tools

    The great thing about AI Developer Agents is that you don’t need to code these tools yourself.

    Based on the data models in this application, build a command line CLI tool and a Claude Code skill to
    use it. The CLI tool should authenticate with admin-only scoped API endpoints to be able to execute basic CRUD
    operations, report on activity metrics, generate reports, and provide insights that help control the application
    in the production environment without relying on an administrator dashboard.
    Build authentication into the CLI tool to save credentials securely.
    examples:
    ./admin-cli users list
    ./admin-cli users add user@example.com --sent-invite
    ./admin-cli reports DAU
    ./admin-cli error-log

    Level Up

    Here are prompts you can give Claude Code to build out this infrastructure for your specific application:

    Initial CLI Scaffold

    Create a Python CLI tool using Click for admin operations on my [Django/FastAPI/Express]
    application. The CLI should:
    - Read API credentials from environment variables (ADMIN_API_URL, ADMIN_API_TOKEN)
    - Have command groups for: users, metrics, logs, jobs
    - Output JSON by default with an option for table format
    - Include proper error handling for API failures
    
    Start with the scaffold and user search command.
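
    For reference, here's one plausible shape of the scaffold that prompt produces. Treat it as a sketch rather than the canonical output; Claude will vary the details:

    import os
    import json
    import click
    import requests

    API_URL = os.environ["ADMIN_API_URL"]
    API_TOKEN = os.environ["ADMIN_API_TOKEN"]

    def api_request(method, path, **kwargs):
        """Call the admin API with bearer-token auth; raise on HTTP errors."""
        resp = requests.request(
            method,
            f"{API_URL}{path}",
            headers={"Authorization": f"Bearer {API_TOKEN}"},
            **kwargs,
        )
        resp.raise_for_status()
        return resp.json()

    @click.group()
    def cli():
        """Admin operations for the production app."""

    @cli.group()
    def users():
        """User management commands."""

    @users.command()
    @click.option("--email", default=None, help="Filter by email address")
    def search(email):
        """Find users matching the filter and print them as JSON."""
        result = api_request("GET", "/admin/users/search", params={"email": email})
        click.echo(json.dumps(result, indent=2))

    if __name__ == "__main__":
        cli()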

    Adding User Management

    Add these user management commands to my admin CLI:
    
    1. users search - find users by email, name, or ID
    2. users get <id> - get full user profile including subscription status
    3. users recent - list signups from last N hours/days with filters for source and verification status
    4. users activity <id> - show recent actions for a user
    
    Each command should have sensible defaults and output JSON.

    Adding Metrics Commands

    Add metrics commands to my admin CLI that query our analytics:
    
    1. metrics signups - signup counts grouped by day/week with source breakdown
    2. metrics wau - weekly active users over time
    3. metrics churn - churn rate and churned user counts
    4. metrics health - overall system health (error rates, response times, queue depths)
    5. metrics revenue - MRR, new revenue, churned revenue (if applicable)
    
    Include --since flags for time windows and sensible output formatting.

    Adding Log Access

    Add log viewing commands to my admin CLI:
    
    1. logs errors - recent errors across services with filtering
    2. logs user <id> - all log entries related to a specific user
    3. logs request <id> - trace a specific request through the system
    4. logs search --pattern "..." - search logs by pattern
    
    Format output for terminal readability - timestamps, service names, messages on separate lines.

    Adding Actions/Jobs

    Add commands to trigger admin actions:
    
    1. jobs list - show available background jobs
    2. jobs trigger <name> - trigger a job with optional parameters
    3. jobs status <id> - check job status
    4. email send <user_id> <template> - send a specific email
    5. email templates - list available templates
    
    Include --dry-run flags where destructive or user-facing operations are involved.

    Building the API Endpoints

    Create admin API endpoints for my [framework] application to support the admin CLI:
    
    1. GET /admin/users/search?email=&id=
    2. GET /admin/users/<id>
    3. GET /admin/users/<id>/activity
    4. GET /admin/users/recent?since=&source=&verified=
    5. GET /admin/metrics/signups?since=&group_by=
    6. GET /admin/metrics/wau
    7. GET /admin/logs?service=&level=&since=
    8. POST /admin/jobs/trigger
    
    All endpoints should require Bearer token authentication. Use our existing User and
    Activity models. Return JSON responses.

    Making Tools Work Well With Claude Code

    Claude Code reads text output. The better your tools format their output, the more effectively Claude can interpret and act on the results.

    Principle 1: JSON for Data, Text for Logs

    Return structured data as JSON – Claude parses it accurately:

    $ ./admin users get 12345
    {
      "id": 12345,
      "email": "user@example.com",
      "created_at": "2024-01-15T10:30:00Z",
      "subscription": "pro",
      "verified": true
    }

    But format logs for human readability – Claude understands context better:

    $ ./admin logs errors --last 1h
    [2024-01-15 10:45:23] api: Failed to process payment for user 12345: card_declined
    [2024-01-15 10:47:01] worker: Job send_welcome_email failed: SMTP timeout
    [2024-01-15 10:52:18] api: Rate limit exceeded for IP 192.168.1.1

    Principle 2: Include Context in Output

    When something fails, include enough context for Claude to suggest fixes:

    $ ./admin jobs trigger welcome-email --user-id 99999
    {
      "error": "user_not_found",
      "message": "No user with ID 99999",
      "suggestion": "Use 'admin users search' to find the correct user ID"
    }

    Principle 3: Support Filtering at the Source

    Don’t make Claude grep through huge outputs. Add filters to your commands:

    # Bad - returns everything, Claude has to parse
    $ ./admin logs errors --last 24h
    
    # Good - filtered at the API level
    $ ./admin logs errors --last 24h --service api --level error --limit 20

    Principle 4: Dry Run Everything Destructive

    Any command that modifies state should support --dry-run:

    $ ./admin email send 12345 password-reset --dry-run
    {
      "would_send": true,
      "recipient": "user@example.com",
      "template": "password-reset",
      "subject": "Reset your password",
      "preview_url": "https://admin.yourapp.com/email/preview/abc123"
    }

    This lets Claude verify actions before executing them, and lets you review what it’s about to do.
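
    Wiring a --dry-run flag into a Click command is nearly free. This sketch builds on the scaffold sketched earlier (the cli group and api_request helper); the /admin/email/send path is a placeholder for whatever your API actually exposes:

    @cli.group()
    def email():
        """Email commands."""

    @email.command()
    @click.argument("user_id")
    @click.argument("template")
    @click.option("--dry-run", is_flag=True, help="Preview the email without sending it")
    def send(user_id, template, dry_run):
        """Send a templated email to a user, or preview it with --dry-run."""
        payload = {"user_id": user_id, "template": template, "dry_run": dry_run}
        result = api_request("POST", "/admin/email/send", json=payload)
        click.echo(json.dumps(result, indent=2))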

    Principle 5: Exit Codes Matter

    Use proper exit codes so Claude knows when commands fail:

    import click
    import requests

    # `users` is the Click command group; `api_request` and `output` are the CLI's
    # shared helpers for authenticated API calls and JSON printing.
    @users.command()
    @click.argument("user_id")
    def get(user_id: str):
        """Fetch a user by ID; exit non-zero if the user doesn't exist."""
        try:
            result = api_request("GET", f"/admin/users/{user_id}")
            output(result)
        except requests.HTTPError as e:
            if e.response.status_code == 404:
                click.echo(f"User {user_id} not found", err=True)
                raise SystemExit(1)
            raise

    Note: when a command fails with a clear error message and a non-zero exit code, Claude can see exactly what went wrong and often fix it immediately.

    Integrating With Claude Code Skills

    Claude Code supports Skills – custom commands that extend its capabilities. You can create a Skill that wraps your admin CLI and provides context about your specific system.

    Just tell Claude Code to document your new CLI into a skill:

    Create a claude code skill to document how to use admin-cli, then give me examples of what I can do with this new skill.

    Now Claude Code has context about your admin tools and can use them appropriately.

    MCP Tool Integration

    For deeper integration, you can expose your admin API as an MCP (Model Context Protocol) server. This lets Claude call your admin functions directly as tools rather than shelling out to CLI commands, and it opens the tooling up to people beyond terminal-centric administrators.
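
    A minimal version of that server, using the FastMCP helper from the official MCP Python SDK, might look like this. The api_request import is the same hypothetical admin-API helper the CLI uses:

    from mcp.server.fastmcp import FastMCP

    from admin_cli import api_request   # hypothetical: reuse the CLI's admin API helper

    mcp = FastMCP("admin-tools")

    @mcp.tool()
    def search_users(email: str) -> dict:
        """Find users by email via the admin API."""
        return api_request("GET", "/admin/users/search", params={"email": email})

    @mcp.tool()
    def recent_errors(service: str = "api", since: str = "1h") -> dict:
        """Return recent error log entries for a service."""
        return api_request("GET", "/admin/logs", params={"service": service, "since": since})

    if __name__ == "__main__":
        mcp.run()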

    Security Considerations

    A few things to keep in mind:

    1. Scope your tokens. Create admin API tokens with the minimum permissions needed. Don’t use your personal account credentials.

    2. Audit everything. Log every admin API call with the token used, timestamp, and parameters. You want a trail of what Claude (or you) did.

    3. Require confirmation for destructive operations. Either use --dry-run flags or build confirmation into your workflow. Claude should show you what it’s about to do before doing it.

    4. Separate read and write access. Consider having read-only tokens for investigation and separate tokens for operations that modify state.

    5. Don’t expose production databases directly. Always go through your application layer. This gives you validation, audit logging, and the ability to revoke access.

    Putting It Together

    Here’s what a typical debugging session looks like with this setup:

    You: "Users are complaining about not receiving welcome emails. Investigate."
    
    Claude: Let me check recent signups and email job status.
    
    $ ./admin users recent --since 24h --verified
    [shows recent verified users]
    
    $ ./admin logs errors --service email --last 24h
    [2024-01-15 08:15:00] email: SMTP connection timeout to smtp.sendgrid.net
    [2024-01-15 08:15:01] email: Failed to send welcome-email to user 12340
    [2024-01-15 08:15:01] email: Failed to send welcome-email to user 12341
    ...
    
    Looks like the email service had SMTP connection issues starting at 8:15 AM.
    Let me check how many users were affected:
    
    $ ./admin metrics health
    {
      "email_queue_depth": 847,
      "email_error_rate": 0.94,
      ...
    }
    
    There are 847 emails stuck in the queue with a 94% error rate. The issue appears
    to be SMTP connectivity. Once that's resolved, you may want to replay the failed
    jobs. I can show you the affected users if you'd like to manually trigger their
    welcome emails.

    This is the power of Claude Code First Development. You’re not clicking through dashboards or writing one-off SQL queries. You’re having a conversation with an AI that has direct access to your systems through well-designed tooling.

    Build the CLI tools. Expose the APIs. Give Claude the access it needs to help you operate your systems. That’s the future of production debugging.

  • Starting Daily Founder Fuel

    Starting Daily Founder Fuel

    This past week, I had an idea for an app. It came from an impulse to jot down thoughts about my business challenges, thoughts that needed to be written out, journaled, thought through, and developed further. I wanted to make this a daily practice, because writing has obvious value: through writing, ideas become more concrete, more memorable, and easier to share. So writing was on my mind as I considered how to approach this.

    I also wanted to maintain a balanced approach to my thought process. Some days are for strategic thinking, others for sales processes, numbers, finance, long-term growth, or professional development. As a founder or entrepreneur, it’s crucial not to fall into old patterns of focusing only on preferred areas but to address all necessary aspects that might otherwise be neglected. Having a structured approach ensures a balanced distribution of thoughts and developing ideas, preventing a single-minded focus and fostering a holistic view of the business.

    I began searching for journaling apps tailored to entrepreneurs, addressing their specific concerns and questions to improve their business and life. Most journals available are generic, catering to a wide audience with personal goals and life thoughts. I wanted something more specific to business ideas. While I enjoy writing on paper, a physical book can be easily forgotten. To counter this, I decided to create an email newsletter that would appear daily, ensuring it remains visible and part of my routine. This way, it consistently prompts daily reflection without being easily hidden or forgotten.

    After collecting journaling prompt ideas and quotes, I realized many turned into homework-like tasks, which, while interesting and fun, also served as valuable exercises. Questions about handling team conflicts, delegation, sales tactics, personal skill development, team motivation, and defining unique sales propositions kept me engaged. These prompts helped test my clarity of thought and understanding of various business aspects, ensuring I stayed sharp and well-rounded in my approach.

    Seeing a need for such a resource, I launched a website, dailyfounderfuel.com, and created a newsletter signup so everyone could try it. I populated it with prompts scheduled out for the next several months. This experiment required minimal effort and low cost—around $40 initially and $10 monthly for email service, domain, and hosting. This small investment sets up an ongoing experiment to see if there’s interest in such a resource. If successful, I’ll have a valuable list of entrepreneurs and founders. If not, it’s a learning exercise. Either way, it’s a worthwhile endeavor. If this sounds interesting, check out dailyfounderfuel.com and sign up.

  • Mini AI Automations

    Mini AI Automations

    I’ve been reading the book “Buy Back Your Time” by Dan Martell. It gave me a new perspective on how to think about hiring people. Well worth the read if this is in your wheelhouse.

    One of the core ideas in the book is to find the time-sucking tasks and remove those first by hiring people to do these (usually) simpler tasks. It’s also a great use case for applying AI.

    So for the past few weeks I’ve been auditing my time to find the 30 minutes here and there that could be automated away.

    That sent me down the rabbit hole of no-code automation systems like Make.com. With Make, it’s easy to deal with a world that now runs on a dozen disconnected services:

    • Gmail for email
    • Google Sheets for spreadsheets
    • Notion for business documentation
    • Slack for team communication
    • E-commerce platforms – Shopify, Amazon
    • …and countless others

    Writing code to connect to and authenticate with all of these services, manage all the keys, and understand the APIs well enough to do something quickly is the kind of boring boilerplate that:

    • Can often be implemented with Make or Zapier using drag and drop
    • Can be written by AI

    With these approaches, a number of simple automations can be built relatively easily, each saving you hours of time. This week I created an automation to organize ad creatives into Google Drive, and another to cross-post blog posts.

    Interestingly, these tools have spawned a new industry of automation consultants who drop into your business to find various processes that can be automated.

    Even without a consultant, there are thousands of pre-built templates that can be tweaked to match your needs. These tools are designed so that not much technical experience is required to use them.

    Whatever path you choose, these tools can help whittle away the minor annoying tasks that suck up time, and for that reason they’re worth playing with. Buy back some of your time with a little automation this weekend.

  • The AI CEO

    The AI CEO

    It was just one year ago that a user on Twitter (@jacksonfall – now deactivated) went viral. This was shortly after the launch of GPT-4, a model that was a massive leap beyond the previous generation. He proposed a challenge: the AI would act as the CEO of a startup, directing the finances, developing the strategy, and commanding people to do the work.

    The AI was dubbed HustleGPT, and it quickly turned into a sensation that attracted thousands of dollars of investment to see the experiment through.

    It quickly directed the launch of an ecommerce shop to sell green gadgets and initiated some advertising to promote the website.

    However, things fell apart a couple weeks later.

    Operation of the venture was ostensibly transferred to a Discord community that formed around it. From there, very little progress was made. The website stagnated.

    Just two months after starting, the project was shuttered and @jacksonfall went quiet.

    After a year of working with GPT-4, it’s become more apparent that despite its capabilities, there are limitations that make it challenging to manage something as complex as a startup.

    In the meantime, a year of progress has happened. OpenAI, Google, Claude, Grok, and many more models have been trained and improved. Context windows have expanded and logical reasoning has improved. Agent systems (multiple AIs working together) are starting to show promising results. See Devin, the AI software engineer.

    Are we ready to try this AI CEO experiment again?

    This past week I started to explore this. The result has been a revamp of my website: mattwarren.co

    Will this have a similar fate to HustleGPT?