
  • Lessons From a Decade of Programmatic SEO


    This is the final post in a three-part series on programmatic SEO. Part one covered what it is and whether it’s worth your time. Part two walked through the simplest way to get started. This post is the retrospective — what I’ve learned from building programmatic SEO projects since 2014, what actually works, and what’s coming next.

    Lesson 1: Google Always Catches Up

    In 2014, my Automatic Blog Machine product was making money. Article spinning worked. Keyword stuffing worked. Building a hundred sites with rotated content and pointing links between them worked. For about six months.

    Then Google’s Panda update got smarter, and everything I’d built evaporated. Rankings disappeared overnight. Revenue went to zero. The sites were worthless.

    Every generation of programmatic SEO has its version of this story. Somebody finds a technique that games the algorithm, it works for a while, and then Google closes the loophole. Article spinning died. Exact-match domain networks died. Private blog networks died. Thin template pages with swapped city names and nothing else — those died too.

    The lesson isn’t that Google is unbeatable. It’s that any approach built on fooling the algorithm has an expiration date. The only programmatic SEO that survives long-term is the kind that would still make sense if Google didn’t exist — pages that people actually want to read.

    Lesson 2: The Quality Bar Keeps Rising

    What counted as “good enough” in 2014 would get you penalized today. And what’s acceptable today will probably look thin in three years.

    In the article spinning era, uniqueness was the bar. If the text didn’t trigger a duplicate content check, it was “good enough.” Nobody was reading these pages — they existed to rank, not to serve readers.

    In the template era, usefulness was the bar. If the page had real data — actual business listings, real product specs, genuine local information — it could rank even with a formulaic template. The information was valuable even if the presentation was boring.

    Now, in the AI era, the bar is comprehensive quality. The page needs real data, good writing, proper formatting, useful structure, internal links, and a design that doesn’t scream “this was generated.” Readers expect the same quality from a programmatic page that they’d expect from a hand-written one.

    This isn’t Google being arbitrary. It’s reflecting what users actually want. Every time people complain about search quality — and they complain a lot — Google tightens the screws. The sites that survive each tightening are the ones that were already over-delivering on quality.

    The practical takeaway: build to a quality standard that’s higher than what currently ranks. If the top results for your target query are mediocre, don’t match them — beat them. That margin is your insurance against the next algorithm update.

    Lesson 3: Small Sites Can Win Specific Niches

    The biggest misconception about programmatic SEO is that you need to be Yelp or Zapier to succeed. You don’t. Those companies succeed because they operate at massive scale across broad categories. But scale and breadth aren’t the only ways to win.

    Small, focused sites win by going deeper than the big players bother to. A mega-site might have a page for “plumbing in Austin” but it won’t have a page about Austin’s specific water hardness regulations and what they mean for residential plumbing maintenance. That level of specificity is where the opportunity lives.

    The best small-site programmatic SEO projects share three traits:

    Deep niche expertise. The creator knows the subject well enough to spot what’s missing from existing content. They’re not just generating pages — they’re filling genuine information gaps.

    Specificity that big sites can’t match. A large directory has breadth but not depth. They can’t afford to write 2,000-word deep dives for every long-tail variation. You can — especially with AI handling the research and drafting.

    Willingness to maintain and update. Most programmatic sites get published and abandoned. The ones that win long-term keep their data fresh. If your competitor pages reference 2023 pricing, update yours to 2026 pricing. If a local regulation changed, update your city page. This sounds obvious, but almost nobody does it.

    Lesson 4: Internal Linking Is the Multiplier

    I underestimated internal linking for years. Then I saw the data.

    A set of programmatic pages with no links between them behaves like a hundred isolated blog posts. Google crawls them independently, doesn’t understand the relationship between them, and treats each page as a standalone piece of content competing on its own merits.

    The same set of pages with intentional internal linking becomes a content hub. Google understands the topical relationship. Authority flows between pages. When one page ranks well, it lifts the others. The whole is genuinely greater than the sum of its parts.

    For programmatic SEO specifically, the linking structure should be systematic:

    • Every page links to the hub — the main topic page that anchors the entire collection
    • Related pages link to each other — city pages in the same state, comparison pages in the same category, FAQ pages on related topics
    • The hub links to its best-performing spokes — as you learn which pages rank, link from your strongest page to support the weaker ones
    • External content links in too — your blog posts, your about page, your other site content should all link to relevant programmatic pages
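The four rules above translate into surprisingly little code. Here's a minimal sketch, assuming each page record carries a `slug` and a `group` (its state, category, or topic cluster); linking the hub to *all* spokes is a simplification of the "best-performing spokes" rule, which would need traffic data to prioritize.

```python
from collections import defaultdict

def build_link_map(pages, hub_slug):
    """Build a hub-and-spoke internal link map.

    pages: list of dicts like {"slug": "plumbing-austin", "group": "texas"}
    hub_slug: the main topic page that anchors the collection.
    Returns {slug: [slugs that page should link to]}.
    """
    links = defaultdict(set)
    by_group = defaultdict(list)
    for page in pages:
        by_group[page["group"]].append(page["slug"])

    for page in pages:
        slug = page["slug"]
        links[slug].add(hub_slug)              # every page links to the hub
        for sibling in by_group[page["group"]]:
            if sibling != slug:
                links[slug].add(sibling)       # related pages link to each other
        links[hub_slug].add(slug)              # hub links out to its spokes

    return {slug: sorted(targets) for slug, targets in links.items()}

# Hypothetical three-page set with one hub
pages = [
    {"slug": "plumbing-austin", "group": "texas"},
    {"slug": "plumbing-dallas", "group": "texas"},
    {"slug": "plumbing-miami", "group": "florida"},
]
link_map = build_link_map(pages, hub_slug="plumbing-guide")
```

Running the link map at build time, rather than linking by hand, is what keeps the structure consistent as the page count grows.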

    When I added systematic internal linking to a set of pages I’d published months earlier, some of them jumped from page 3 to page 1 within weeks. The content hadn’t changed. The links made Google understand what it was looking at.

    Lesson 5: Failures Teach More Than Successes

    I want to be honest about the projects that didn’t work, because the failure modes are instructive.

    The 10,000-page experiment (2024). After writing about programmatic SEO as a concept, I decided to test it at scale. Build a large site, publish thousands of pages, see what happens. The content was AI-generated with some data enrichment, but the quality was inconsistent. Some pages were genuinely useful. Many were thin. Google’s March 2024 core update hit the site hard. Traffic dropped 70% in a week. The lesson: volume without consistent quality is a liability, not an asset.

    The comparison site (2023). I built a site with product comparison pages using early ChatGPT-generated content. The information was plausible but not always accurate. Some product features were hallucinated. Some pricing was wrong. Readers complained in comments. Google noticed the bounce rates. The site never gained traction. The lesson: AI content without real data sourcing produces pages that look right but aren’t. Readers can tell.

    The directory that worked (2025). On the other hand, a small directory project — fewer than 100 pages — that aggregated genuinely hard-to-find local information performed well from day one. Each page took longer to produce because the data required real research. But because the information wasn’t available elsewhere in a consolidated format, the pages ranked quickly and stayed ranked. The lesson: less content, more value per page, wins.

    The pattern across every failure was the same: I prioritized quantity over quality. Every success came from the opposite decision.

    Lesson 6: The Maintenance Problem Is Real

    Here’s something nobody talks about in programmatic SEO guides: what happens after you publish?

    Content decays. Prices change. Businesses close. Regulations update. Links break. Data goes stale. A page that was accurate when you published it becomes misleading six months later — and misleading content eventually gets outranked by something fresher.

    For hand-written blog posts, this is manageable. You have 50 posts, you review them periodically, you update what’s outdated. For 500 programmatic pages, the maintenance burden is significant.

    The solutions I’ve found:

    Build refresh into the pipeline. If your data comes from scrapeable sources, schedule regular re-scrapes. Have the AI compare new data to old data and flag pages that need updates. Automate the parts that can be automated.
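As an illustration, the flagging step can be as simple as diffing a fresh scrape against a cached copy. This is a sketch under assumed data shapes (a JSON cache keyed by page slug), not a full pipeline:

```python
import json
import tempfile
from pathlib import Path

def flag_stale_pages(cache_path, new_data):
    """Compare freshly scraped data against the cached copy, update the
    cache, and return the slugs of pages whose source data changed."""
    cache = Path(cache_path)
    old_data = json.loads(cache.read_text()) if cache.exists() else {}
    stale = [slug for slug, fields in new_data.items() if old_data.get(slug) != fields]
    cache.write_text(json.dumps(new_data, indent=2))  # refresh the cache
    return stale

# Demo with a throwaway cache file and made-up pricing data
cache_file = Path(tempfile.mkdtemp()) / "cache.json"
cache_file.write_text(json.dumps({"plumbing-austin": {"avg_price": 150}}))
stale = flag_stale_pages(cache_file, {
    "plumbing-austin": {"avg_price": 165},  # price changed -> flag for update
    "plumbing-dallas": {"avg_price": 140},  # new data -> flagged too
})
```

The flagged slugs become the AI's work queue: only pages whose underlying data moved get rewritten.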

    Prioritize maintenance by traffic. Not every page needs to be updated on the same schedule. Your top 20% of pages by traffic deserve monthly reviews. The rest can be quarterly or annual. Focus your attention where it has the most impact.

    Design for easy updates. If your page template separates structured data from narrative content, updating the data is easy — just refresh the numbers. If every fact is buried in flowing prose, updating requires rewriting paragraphs. Think about maintainability when you design your template.
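To make that concrete, here's one sketch of a template where the structured data lives in a dict and the narrative is a separate block. The field names and numbers are invented for illustration:

```python
# Structured fields render into fixed slots; prose stays in its own block.
TEMPLATE = """\
# Plumbing services in {city}

**Average service call:** ${avg_price}
**Licensed plumbers in the area:** {plumber_count}

{narrative}
"""

def render_page(data, narrative):
    """Render one page. A data refresh only touches `data`;
    the narrative prose doesn't need rewriting."""
    return TEMPLATE.format(narrative=narrative, **data)

page = render_page(
    {"city": "Austin", "avg_price": 150, "plumber_count": 420},
    narrative="Austin's hard water puts extra strain on residential pipes...",
)
```

When prices change next quarter, you update three dict values and re-render, instead of hunting for numbers buried in paragraphs.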

    Remove pages that can’t be maintained. If a category of pages depends on data you can no longer source reliably, it’s better to remove those pages than to let them go stale. A smaller, accurate site outperforms a larger, unreliable one.

    Lesson 7: AI Changed Everything (But Not How You Think)

    The biggest shift in programmatic SEO isn’t that AI can write content. It’s that AI can do research.

    Content generation was always the easy part. Even before AI, you could spin articles, fill templates, generate text. The hard part was getting accurate, specific, useful information for each page. That required actual research — visiting sources, extracting data, cross-referencing facts, understanding context.

    What’s different now is that AI agents can do that research at scale. Claude Code can browse the web, read source documents, extract specific data points, and compile them into structured content — for every row in your spreadsheet. That’s not just faster writing. That’s faster research, which was always the bottleneck.

    This changes the economics completely. A project that would have required weeks of manual research to populate with real data can now be researched in hours. The constraint shifts from “can I gather enough information?” to “is this information worth publishing?”

    But here’s the nuance: AI research still needs human judgment. The AI doesn’t know which sources are trustworthy for your niche. It doesn’t know when a fact is technically accurate but misleading in context. It doesn’t know the difference between a useful page and a page that merely looks useful. That judgment is still yours — and it’s what separates programmatic SEO that works from programmatic SEO that gets penalized.

    Where This Is All Heading

    Three trends are shaping the future of programmatic SEO:

    AI search is changing the game. Google’s AI Overviews, ChatGPT’s search, Perplexity — these tools synthesize information from across the web and present it directly to the user. If an AI can answer the query by reading your page and summarizing it, the user might never visit your site. This means programmatic pages need to offer something beyond summarizable facts — interactive tools, downloadable resources, visual comparisons, or depth that can’t be condensed into a snippet.

    E-E-A-T matters more than ever. Google’s emphasis on Experience, Expertise, Authoritativeness, and Trustworthiness is a direct response to the flood of AI-generated content. Sites with a real author, real expertise, and real experience behind them get preferential treatment. For programmatic SEO, this means connecting your template pages to your broader brand — author bios, links to your other work, evidence that a real person stands behind the content.

    The bar for “unique value” keeps climbing. Aggregating publicly available information into a cleaner format used to be enough. Increasingly, the winning programmatic sites add something genuinely new — original analysis, proprietary data, interactive tools, expert commentary layered on top of the aggregated data. The template is just the delivery mechanism. The unique value is what gets the page ranked.

    The Only Rule That Never Changes

    After a decade of building, failing, rebuilding, and occasionally succeeding at programmatic SEO, one principle has held constant through every algorithm update, every technology shift, and every competitive wave:

    If the page helps the reader, it will eventually rank. If it doesn’t, it eventually won’t.

    Every technical decision — the template structure, the data sources, the publishing pace, the internal linking, the AI tooling — is in service of that one question. Would a real person find this page useful?

    Build for that standard, and the algorithm updates become opportunities instead of threats. The sites that survive Google’s crackdowns are always the ones that were building for readers, not for robots.

    The tools have never been better. AI can research, write, and publish at a scale that was unimaginable even two years ago. But the strategic question is the same one it’s always been: are you creating something of value, or are you just creating more noise?

    If you’ve read all three posts in this series, you have everything you need to answer that question for yourself. Start with the concept. Build with the simplest approach that works. And keep the long view in mind — because the sites that win in programmatic SEO are the ones that are still useful five years from now.

    For more on building AI-powered content workflows, check out how I use AI to write and publish blog posts. And if you want to see the original post that started this whole series, that’s here.

  • The Simplest Programmatic SEO You Can Build Today



    In the last post, I explained what programmatic SEO is and when it’s worth pursuing. The short version: it’s creating web pages using templates and data instead of writing every page by hand.

    But knowing what it is and actually building it are different things. Most guides jump straight to complex tech stacks — custom databases, headless CMS platforms, expensive plugins — and lose 90% of readers before they publish a single page.

    The reality in 2026 is that AI has collapsed most of those steps. You don’t need to manually copy-paste pages from a spreadsheet. You don’t need to learn a page builder plugin. You can start with an AI assistant, a WordPress site, and a clear idea of what pages you want to create.

    Step 1: Let AI Build Your Data Set

    Every programmatic SEO project starts with a list of pages. The old advice was to sit down with a spreadsheet and fill in rows by hand. That still works — but why would you?

    Instead, start by telling an AI assistant what you’re trying to build. Be specific about your niche and what kind of pages you want. For example:

    “Give me a list of 50 cities in Texas with populations over 50,000, along with their county, population, and top three industries.”

    Or: “Research and list every competitor in the meal prep delivery space, with their pricing, delivery areas, and key differentiators.”

    Or: “What are the 30 most common questions people ask about home solar installation, organized by stage of the buying process?”

    The AI generates your seed data in seconds. Export it to a Google Sheet or CSV file, and you’ve got the skeleton of your project. Each row is a potential page. Each column is a variable that changes between pages.

    Here’s where the multiplication happens. Say you have 20 cities and 5 services. That’s 100 potential pages — “[service] in [city]” — generated from two simple lists. Add industries, and you’ve got another dimension. The data set grows fast.
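That multiplication is literally a cross product. A sketch, with hypothetical lists far smaller than a real project's:

```python
from itertools import product

cities = ["Austin", "Dallas", "Houston"]   # e.g. 20 in a real project
services = ["plumbing", "roofing"]         # e.g. 5 in a real project

# Every (service, city) pair becomes a candidate page.
pages = [
    {
        "slug": f"{service}-{city.lower()}",
        "title": f"{service.title()} in {city}",
    }
    for service, city in product(services, cities)
]

print(len(pages))  # 2 services x 3 cities = 6 candidate pages
```

Each dict here maps to one row in your spreadsheet; adding a third list (industries, price tiers) multiplies the set again.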

    Keep a local copy of everything. Download your research, cache your data sources, save reference material to your computer. You don’t want to re-fetch the same information every time you work on the project. A local folder with your spreadsheets, source documents, and reference data becomes your project’s knowledge base.

    Step 2: Design Your Template

    Before you generate a single page, you need to know what a good page looks like. This is the most important step, and it’s worth spending real time on.

    Pick one row from your data set — one city, one product, one question — and build the best possible page for it. Not blindly with AI. By hand. Think about what someone searching for that query actually wants to know, and make sure the page delivers it. Your pages need to be good enough that people stay and read.

    This manual page becomes your template. Study it:

    • What headings did you use?
    • What data points appear on every page versus what’s unique?
    • How long does it need to be to genuinely answer the question?
    • What internal links connect it to related pages in your set?

    Once you’re happy with the template, describe it clearly — the structure, the sections, the tone, what goes where. This description becomes your prompt for generating every other page.

    Step 3: Establish Your Brand Guide Early

    This is something most programmatic SEO guides skip entirely, and it’s why so many pSEO sites feel like they were stamped out of a factory.

    Before you generate content at scale, decide on your brand voice and visual identity. Write it down. These decisions are hard to change later, and consistency is what separates a site that feels trustworthy from one that feels like spam.

    For writing voice, decide:

    • First person or third person?
    • Authoritative and expert, or friendly and conversational?
    • Technical language or plain English?
    • What phrases or patterns does your brand use? What does it avoid?

    Feed this brand guide to your AI as context for every page it generates. The difference between “write a page about solar installation in Austin” and “write a page about solar installation in Austin using this voice guide” is enormous. Without it, every page will sound like generic AI output. With it, they’ll sound like they came from the same knowledgeable author.
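In practice, "feeding the brand guide as context" just means assembling it into the prompt alongside the template description and the row's data. A sketch, with made-up guide text and field names:

```python
# Hypothetical brand guide and template spec -- yours will differ.
BRAND_GUIDE = """\
Voice: first person, friendly but expert. Plain English, no jargon.
Avoid: filler phrases ("in today's fast-paced world"), exclamation marks.
"""

TEMPLATE_SPEC = """\
Structure: H1 title, two-sentence intro, a 'Key facts' section built from
the data, three H2 sections answering the searcher's question, links footer.
"""

def build_prompt(row):
    """Assemble the generation prompt for one spreadsheet row."""
    facts = "\n".join(f"- {key}: {value}" for key, value in row.items())
    return (
        f"Write a page about {row['service']} in {row['city']}.\n\n"
        f"Brand voice guide:\n{BRAND_GUIDE}\n"
        f"Page structure:\n{TEMPLATE_SPEC}\n"
        f"Verified facts to use:\n{facts}\n"
    )

prompt = build_prompt({"service": "solar installation", "city": "Austin",
                       "avg_cost": "$12,000", "permit_office": "Austin DSD"})
```

The same three blocks go into every generation call, so the voice stays constant while only the row data changes.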

    For visual identity, decide:

    • What style of images will you use? AI-generated, stock photos, custom graphics?
    • What specific image style and prompt will keep results consistent across all pages?
    • What color palette and typography will carry through the site?
    • What layout template will you commit to before you start publishing?

    Spend an afternoon getting your image generation prompt right. Test it on 5-10 variations and make sure the results feel cohesive. A site where every hero image looks like it belongs to the same brand signals quality. A site where every image looks randomly generated signals the opposite.

    Step 4: Generate and Publish With AI

    Here’s where modern tools change the game entirely. You don’t need to manually create pages one by one, and you don’t need an expensive import plugin to do it for you.

    An AI coding assistant like Claude Code can take your spreadsheet, your template, and your brand guide and do the heavy lifting:

    1. Research each row — For every entry in your data set, the AI can search the web, pull real information from multiple sources, and compile facts that are specific to that page. A page about “plumbing services in Austin” shouldn’t contain generic plumbing advice — it should reference Austin’s actual building codes, local licensing requirements, and water quality specifics.
    2. Write the content — Using your template structure and brand voice, the AI drafts each page. Because it’s working from real research rather than generating from memory, the content is grounded in verifiable facts.
    3. Publish directly — Tools like the WordPress REST API let AI publish pages directly to your site, complete with formatting, categories, tags, and featured images. No copying and pasting between tools.
    4. Review each page — And this is the step you never skip. Read every page before it goes live, especially in the beginning. Check that the facts are accurate, the voice is consistent, and the page would pass the quality test from the last post: would a real person feel their time was respected?
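Step 3 is less exotic than it sounds. Below is a minimal sketch against WordPress's core REST API posts endpoint (`/wp-json/wp/v2/posts`); the site URL, credentials, and template fields are placeholders, and creating posts as drafts keeps step 4's human review mandatory:

```python
import base64
import json
import urllib.request

def build_post_payload(row, template):
    """Fill the page template from one spreadsheet row and wrap it
    in the JSON body the posts endpoint expects."""
    return {
        "title": template["title"].format(**row),
        "content": template["body"].format(**row),
        "status": "draft",  # draft first: the review step stays in the loop
    }

def publish(site, user, app_password, payload):
    """POST a draft via the WordPress REST API, authenticating with
    a WordPress application password (HTTP Basic auth)."""
    req = urllib.request.Request(
        f"{site}/wp-json/wp/v2/posts",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    token = base64.b64encode(f"{user}:{app_password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["id"]

payload = build_post_payload(
    {"service": "Plumbing", "city": "Austin"},
    {"title": "{service} in {city}",
     "body": "<p>What {city} homeowners should know about plumbing.</p>"},
)
# publish("https://example.com", "user", "xxxx xxxx xxxx xxxx", payload)
```

Promote a draft to `"status": "publish"` only after it passes your review.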

    For the first 10-20 pages, review every single one. As you get confident that your template and prompts produce reliable output, you can shift to reviewing a sample — but never stop reviewing entirely.

    Start Slow, Accelerate Later

    There’s a temptation to use these tools to publish hundreds of pages in a weekend. Resist it.

    When a new site suddenly appears with 500 pages, Google notices. And not in a good way. A brand-new domain with a flood of content looks exactly like the kind of spam site that Google’s algorithms are designed to catch — regardless of how good the content actually is.

    The better approach is to start with a handful of pages and grow steadily:

    Week 1-2: Publish 5-10 of your best pages. Obsess over quality. Make sure every fact is right, every image looks good, every internal link works.

    Week 3-6: Add 3-5 pages per week. Monitor which pages get indexed and start appearing in search. Pay attention to what Google seems to like.

    Month 2-3: If pages are getting indexed and attracting some traffic, increase your pace. Maybe 10 pages per week. Keep reviewing quality.

    Month 3+: If the signal is positive, you can ramp up further. But always tie the pace to the quality you can maintain.

    This gradual approach does two things. It gives Google time to build trust in your domain. And it gives you time to learn what’s working — which page structures perform best, which topics attract traffic, and which ones fall flat. That feedback loop is worth more than a thousand pages published blind.

    Picking Your First Project

    The hardest part isn’t the technology. It’s choosing what to build.

    Here are five proven patterns that work well for a first project, ordered from simplest to most ambitious:

    1. FAQ pages for your niche. Take the 20-30 most-asked questions in your field and create a dedicated page for each one. Have AI research the best current answer for each, pulling from authoritative sources. This is the lowest-risk starting point because each page targets a specific long-tail query with clear search intent.

    2. Comparison pages. “[Product A] vs [Product B]” for every meaningful combination in your space. AI can research current pricing, features, and reviews for each product. The data changes, so keep local copies and plan to refresh these periodically.

    3. Location + service pages. “[Service] in [city]” combinations. This is the classic multiplication approach — 10 services across 20 cities gives you 200 pages. AI can research city-specific details (regulations, demographics, local competitors) to make each page genuinely useful rather than just swapping the city name.

    4. Tool or resource directories. Curate every tool, service, or resource in a specific category. AI can research pricing, features, and user reviews from across the web, then present it in a consistent format. The value is in the consolidation — saving the reader from visiting 30 different websites.

    5. Data-driven analysis pages. Turn public datasets into readable insights. Government databases, industry reports, and public APIs contain enormous amounts of information that nobody has bothered to make accessible. AI can process raw data and present it in plain language for specific audiences.

    Pick one. Build 10 pages. See what happens.

    Common Mistakes to Avoid

    Having tried (and failed at) programmatic SEO more than once, I can tell you which mistakes kill projects:

    Starting too big. Don’t plan 1,000 pages before you’ve proven 10 work. Build the smallest possible version, see if it gets traffic, then scale what works.

    Skipping the brand guide. Without a consistent voice and visual identity, your site will feel like a content farm even if the information is good. Invest the time upfront.

    No quality review. Publishing AI-generated pages without reading them is how sites get penalized. Review every page early on. Spot-check as you scale. Never publish blind.

    Thin content. If your template produces pages with 200 words of generic text and a data table, that’s not enough. Each page needs to genuinely answer the searcher’s question. If you can’t make a page useful, don’t create it.

    Ignoring internal linking. A hundred orphan pages with no links between them won’t perform. Every page should link to related pages in your set, and your set should link back to your main site content. Build the web of connections from day one.

    Sloppy images. Inconsistent or obviously AI-generated images with different styles on every page undermine trust. Pick one style, refine the prompt, and stick with it across the entire site.

    Going too fast on a new domain. Publishing hundreds of pages on a fresh domain in your first week is a red flag to Google. Start slow, build trust, accelerate when you see positive signals.

    What to Do This Week

    If this approach sounds interesting, here’s a concrete starting point:

    1. Pick a pattern from the five options above that fits your expertise or business
    2. Ask an AI assistant to generate your seed data — cities, competitors, questions, whatever your pattern requires
    3. Build one perfect page by hand — this becomes your template and quality benchmark
    4. Write your brand guide — voice, tone, image style, what to avoid
    5. Search for your target queries and compare your template page to what’s already ranking

    If your page is better than what’s currently out there, you’ve found your project. The tools to scale it are available right now — and most of them are free or close to it.

    In the next post, I’ll share lessons from a decade of building programmatic SEO projects — what actually works long-term, what gets penalized, and where this is all heading as AI gets more capable. For more on how AI fits into content workflows, check out my AI-assisted content strategy. And if you’re a builder looking for the technical deep dive, growth engineering with Claude Code covers the pipeline side in detail.

    But start with the pattern and the brand guide. Everything else follows from those two decisions.

  • What Is Programmatic SEO (And Is It Worth Your Time?)


    A decade ago, I launched a product called Automatic Blog Machine. The idea was simple: use natural language processing to find synonyms and rotate sentence structures so that scraped content wouldn’t get flagged as duplicate text. Spin a paragraph enough times and Google’s algorithms couldn’t tell it was the same article published across a hundred different sites.

    It worked — for about six months. Then Google got smarter, the rankings disappeared, and I learned an expensive lesson about building on a foundation of trickery.

    That was my introduction to programmatic SEO. And while the tools have changed dramatically since then, the core question hasn’t: can you create content at scale without it being garbage?

    What Programmatic SEO Actually Is

    Programmatic SEO is creating web pages using templates and data instead of writing every page by hand. That’s it. No magic, no dark art.

    Think about it this way. A real estate site with a page for every neighborhood in a city — those pages aren’t hand-written. They pull from a database: median home price, school ratings, walkability score, recent sales. The template is the same, but the data makes each page unique and useful.

    That’s programmatic SEO at its simplest. You define a pattern, plug in data, and generate pages that target specific search queries.

    Some real-world examples that are probably already in your life:

    • Yelp has a page for every “best [restaurant type] in [city]” combination
    • Zapier has integration pages for every app pairing — thousands of them
    • NerdWallet has comparison pages for financial products across every category
    • Tripadvisor has pages for every hotel, restaurant, and attraction in every city on Earth

    These aren’t hand-crafted blog posts. They’re templates filled with structured data, and they drive millions of organic search visits every month.

    The Spectrum of Complexity

    Here’s where people get intimidated. They hear “programmatic SEO” and picture a team of engineers building complex data pipelines. But the spectrum is much wider than that.

    The simple end: A Google Sheet with 50 rows of FAQ questions, turned into individual pages on a Wix or WordPress site. Each page targets a specific long-tail search query. No code required.

    The middle: A WordPress site with a template that pulls in data from a spreadsheet or simple database. Maybe you’re building city-specific landing pages for a local service, or comparison pages for products in your niche.

    The advanced end: A full pipeline that scrapes data sources, enriches it with AI, generates unique content for each page, and publishes automatically. This is where tools like Claude Code come in — but you don’t need to start here.

    The point is that programmatic SEO isn’t binary. You don’t need a sophisticated tech stack to benefit from the approach. You need a repeatable pattern and data to fill it.

    A Decade of Cat and Mouse

    My Automatic Blog Machine story isn’t unique. The history of programmatic SEO is really the history of people trying to create content at scale and Google trying to separate the valuable from the worthless.

    The early era (2010-2015): Article spinning, keyword stuffing, link farms. Content was generated to game algorithms, not to help readers. Google’s Panda and Penguin updates torched most of it. My product was part of this wave, and it deserved to get squashed.

    The template era (2016-2022): Smarter operators moved to database-driven templates. If you had genuinely useful structured data — business listings, product specs, local information — you could build pages that actually served a purpose. This worked better because there was real information behind each page, even if the presentation was formulaic.

    The early AI era (2023-2024): ChatGPT arrived, and suddenly everyone could generate “unique” text at scale. But that first wave of chat-model content had obvious problems. The hallucinations were rampant. There was no way to connect the model to real data sources, so it would confidently make up facts, invent statistics, and fabricate references. If you read enough AI-generated content from that period, you developed a sixth sense for it — the same vague structure, the same filler phrases, the same lack of specificity.

    Some people tried to work around this. I experimented with using web search APIs to pull real content, then feeding it to ChatGPT to create summaries and rephrase things in a more natural way. It was better than pure hallucination, but still produced that unmistakable AI voice. And Google was getting better at detecting it.

    Where we are now (2025-2026): This is where things genuinely changed. The current generation of AI tools — particularly agent-based systems like Claude Code — can do something the earlier models couldn’t: go out on the internet, find ten real references for every claim, consolidate and synthesize that information, and present it in a way that actually helps the reader.

    That’s a fundamentally different value proposition than spinning synonyms or generating hallucinated text.

    The Real Turning Point

    Here’s the thing that changed my mind about programmatic SEO after years of skepticism.

    When you can connect AI to real data sources — web scraping, APIs, databases, live search results — you’re not faking content anymore. You’re doing genuine research at scale. The AI becomes a research assistant that can:

    • Pull together information from dozens of sources for a single page
    • Take complicated language (legal documents, scientific papers, technical specs) and rephrase it for different audiences
    • Cross-reference facts across multiple sources to reduce hallucination
    • Tie together related concepts in ways that would take a human researcher hours
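    The cross-referencing idea above can be sketched in a few lines. This is a minimal illustration, not any real tool’s implementation: the source names and claims are made-up toy data, and a real pipeline would compare claims semantically rather than by exact string match. The core move, though, is the same: only keep facts corroborated by multiple independent sources.

    ```python
    from collections import defaultdict

    def cross_reference(claims_by_source, min_sources=2):
        """Keep claims that appear in at least `min_sources` independent
        sources; everything else gets flagged for human review."""
        support = defaultdict(set)
        for source, claims in claims_by_source.items():
            for claim in claims:
                support[claim.lower().strip()].add(source)
        verified = [c for c, srcs in support.items() if len(srcs) >= min_sources]
        flagged = [c for c, srcs in support.items() if len(srcs) < min_sources]
        return verified, flagged

    # Toy data standing in for text extracted from real pages
    claims = {
        "city-hall": ["Permit fee is $50", "Office closes at 5pm"],
        "review-site": ["Permit fee is $50", "Parking is free"],
        "local-blog": ["Office closes at 5pm"],
    }
    verified, flagged = cross_reference(claims)
    ```

    Anything in the flagged list is exactly the kind of claim an unsupervised model would have happily asserted anyway — which is why grounding in multiple sources matters.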

    Could someone get this information by doing a Google search themselves? Maybe. Could they have a conversation with an AI chatbot and get similar answers? Possibly. But if the value you’re providing involves pulling together many sources, consolidating scattered information, and presenting it in a clear format — that’s real work, even if a machine is doing it.

    Think about a directory site that aggregates local business information from public records, review sites, and social media — then presents it in a clean, searchable format with plain-language summaries. That’s providing genuine value. The information exists on the internet already, but it’s scattered across dozens of sites in inconsistent formats. Consolidating it is the service.

    Or consider taking dense regulatory documents and creating simple, city-specific guides for small business owners. The source material is public, but it’s written in legal language that most people can’t easily parse. Making it accessible is the value.
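    At its simplest, that kind of city-specific guide is structured data plus a plain-language template. Here is a minimal sketch under invented assumptions — the cities, permit names, fees, and field names below are all illustrative, not from any real dataset:

    ```python
    # Hypothetical structured data extracted from public regulations;
    # every value here is made up for illustration.
    CITY_RULES = {
        "Austin": {"permit": "Food Truck Permit", "fee": 125, "renewal_months": 12},
        "Denver": {"permit": "Mobile Vendor License", "fee": 200, "renewal_months": 24},
    }

    TEMPLATE = (
        "To operate in {city}, you need a {permit}. "
        "It costs ${fee} and must be renewed every {renewal_months} months."
    )

    def render_guide(city):
        """Render one plain-language guide page from structured data."""
        return TEMPLATE.format(city=city, **CITY_RULES[city])
    ```

    The template is trivially formulaic — the value lives in the data extraction step that turned legal language into those structured fields.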

    When Programmatic SEO Is Worth It

    Not every site or business benefits from this approach. Here’s an honest framework for deciding.

    It’s probably worth exploring if:

    • You can identify a clear pattern of search queries (like “[thing] in [place]” or “[X] vs [Y]”)
    • Structured data exists that could populate those pages (public databases, APIs, scraped information)
    • Each generated page would genuinely answer someone’s question
    • You’re willing to invest upfront in building the pipeline, knowing the payoff is gradual
    • You have some technical comfort, even if it’s just spreadsheets and a basic website builder

    It’s probably not worth it if:

    • Your topic requires deep original thought or personal experience on every page
    • The search queries you’d target are already dominated by massive sites with real authority
    • You can’t identify a repeatable template that works across many variations
    • You’re only interested in tricking Google rather than helping readers
    • You need results next week (programmatic SEO is a long game)

    The honest truth: Most people who attempt programmatic SEO either give up before publishing enough pages to see results, or they cut corners on quality and get penalized. The sweet spot is finding a niche where you can provide genuine value at scale — and that niche is more specific than you think.

    The Quality Test

    Before I invest time building programmatic pages for any topic, I apply a simple test:

    If a real person landed on this page from a Google search, would they feel like their time was respected?

    Not “would they click around the site.” Not “would Google’s algorithm reward it.” Would an actual human being read this page and think, “Good, that’s what I needed to know”?

    If the answer is yes, the approach is sound regardless of how the content was created — by hand, by template, by AI, or by some combination. If the answer is no, no amount of technical sophistication will save it. Google is remarkably good at figuring out when people are disappointed by what they find.

    This is the real shift in programmatic SEO. It’s no longer about creating content that fools algorithms into thinking you’re providing value when you’re not. It’s about actually providing value — and using automation to do it at a scale that would be impossible manually.

    Where to Start

    If you’re curious about programmatic SEO but don’t want to build a complex pipeline on day one, start here:

    1. Find your pattern. What questions do people search for in your space that follow a repeatable format? Use Google’s autocomplete, “People also ask” boxes, or a tool like AlsoAsked to spot templates.
    2. Check the competition. Search for a few variations of your pattern. If the top results are from massive sites with huge authority, pick a more specific niche. If the results are thin or unhelpful, you’ve found an opportunity.
    3. Build one page by hand. Before automating anything, manually create the best possible version of one page in your template. This becomes your quality benchmark.
    4. Then scale gradually. Start with 10-20 pages, not 1,000. See how they perform. Adjust your template based on what works. Only then consider building automation.
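    Steps 1 and 4 can be sketched as a simple cross product of a query pattern and your data — the format-string slots below stand in for the “[thing] in [place]” placeholders, and the services and cities are made-up examples:

    ```python
    from itertools import product

    def expand_pattern(pattern, **slots):
        """Fill a query pattern with every combination of slot values."""
        keys = list(slots)
        return [
            pattern.format(**dict(zip(keys, combo)))
            for combo in product(*slots.values())
        ]

    pages = expand_pattern(
        "best {service} in {city}",
        service=["plumber", "electrician"],
        city=["Austin", "Boise", "Reno"],
    )

    # Scale gradually: publish a first batch, not the full cross product
    first_batch = pages[:20]
    ```

    The cross product grows fast — two slot lists of 50 entries each is already 2,500 pages — which is exactly why building one page by hand first, then publishing a small batch, matters.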

    The tools available today — from simple no-code builders to full AI agent pipelines — make the scaling part easier than ever. But the strategic thinking that goes into choosing what to build? That’s still on you.

    I’ve written more about the technical side of building these pipelines in my post on programmatic SEO, and if you’re interested in how AI fits into a broader content workflow, take a look at how I use AI to write and publish blog posts. For the growth-minded builders, growth engineering with Claude Code gets into the deeper technical possibilities.

    But honestly? Start with the pattern. Everything else follows from that.

  • Scrape Google Search Results Page

    Here’s a short script that scrapes the first 100 listings from Google’s organic search results.

    You might use it to find and track the positions of your sites for certain target keyword phrases over time. That could be a very good way to determine, for example, whether your SEO efforts are working. Or you could use the list of URLs as a starting point for some other web crawling activity.

    As written, the script simply dumps the list of URLs to a text file.

    It uses the BeautifulSoup library to help with parsing the HTML page.

    Example Usage:

    $ python GoogleScrape.py
    $ cat links.txt
    https://www.mattwarren.co/2009/07/01/rss-twitter-bot-in-python/
    http://www.blogcatalog.com/blogs/halotis.html
    http://www.blogcatalog.com/topic/sqlite/
    http://ieeexplore.ieee.org/iel5/10358/32956/01543043.pdf?arnumber=1543043
    http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1543043
    ......
    $

    Here’s the script:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # (C) 2009 HalOtis Marketing
    # written by Matt Warren
    # https://www.mattwarren.co/
    # (updated for Python 3 and BeautifulSoup 4)
    
    import urllib.parse
    import urllib.request
    
    from bs4 import BeautifulSoup  # pip install beautifulsoup4
    
    
    def google_grab(query):
        """Fetch the first 100 organic results for `query` and return their URLs.
    
        Note: Google's result markup changes frequently; the 'l' class on
        result links dates from when this script was written and will likely
        need updating to match the current page structure.
        """
        address = "http://www.google.com/search?q=%s&num=100&hl=en&start=0" % (
            urllib.parse.quote_plus(query))
        request = urllib.request.Request(
            address, None,
            {'User-Agent': 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)'})
        with urllib.request.urlopen(request) as urlfile:
            page = urlfile.read(200000)
    
        soup = BeautifulSoup(page, 'html.parser')
        links = [a['href'] for a in soup.find_all('a', attrs={'class': 'l'})]
        return links
    
    
    if __name__ == '__main__':
        # Example: write the result URLs to a file, one per line
        links = google_grab('halotis')
        with open("links.txt", "w") as f:
            f.write("\n".join(links))