Internal Training Manual

SEO/GEO Training Manual

Operating playbook for the SEO/GEO/Website Lead at Kaizen AI Lab. How to direct Dr. Strange to audit sites, write agent-friendly copy, deploy schemas, and ship a measurable SAGEO optimization plan.

Prepared for: Jen Villadolid (SEO/GEO/Website Lead)
Author: Sebastian 🦀
Version: 1.0
Date: 2026-05-17
Status: DRAFT — pending Don's review before Jen's first onboarding meeting


Welcome, Jen

This manual is your operating playbook for running SEO and GEO at Kaizen AI Lab. The thing you're walking into is unusual: most agencies in this space don't have a productized framework yet. We do — it's called SAGEO (SEO + GEO + AEO combined). And we don't run it by hand. We run it by directing an AI agent — Dr. Strange 🔮 — who has four specialized skills built into him.

Your job is not to be the person who manually audits sites and writes copy. Your job is to be the strategist and director who:

  1. Tells Dr. Strange what to audit and how
  2. Reviews his output
  3. Turns his findings into a client roadmap
  4. Iterates with him to ship the work

Think of Dr. Strange as a junior analyst who never sleeps, never gets tired, never asks for a raise, but who will produce literal garbage if you don't brief him well. The quality of what comes out of this system is entirely a function of how well you brief it. That's the actual skill.

This manual teaches you that skill.


Part 0: The Lay of the Land

What You're Walking Into

Kaizen has built — over the last 60 days — a complete SEO/GEO/AEO operating system. The pieces:

  • SAGEO Template v1.1 — our productized client deliverable framework (the what we ship)
  • Dr. Strange's 4 skills — seo-audit, geo-analysis, seo-content-writer, schema-markup (the how we execute)
  • GEO Score — proprietary 0-100 measurement system that tracks AI citation rates (the proof to clients)
  • 900+ SME library files — pre-built industry knowledge that seeds Citation Magnet content (the competitive moat)
  • ahrefs-intel skill — colony-shared Ahrefs data layer (full keyword, backlink, traffic, SERP, and broken-link intel via API). You never have to open ahrefs.com to do your job. Dr. Strange calls this skill for SEO work; Black Widow and Jarvis use it too. The measurement layer is autonomous.
  • Discord-based workflow — every Dr. Strange job posts to #seo-geo for Don's review (the quality gate)

Why GEO Matters Right Now

The market has fundamentally shifted. From a16z's 2025 analysis:

  • ChatGPT queries are 23 words on average vs. Google's 4-word average
  • Session length in AI search: ~6 minutes vs. Google's 60-90 seconds
  • ChatGPT is already driving referral traffic to tens of thousands of distinct domains
  • Apple is building Perplexity/Claude into Safari — Google's distribution chokehold is cracking

From Andrew Warner's interview with Zapier (March 2026):

  • Zapier is mentioned millions of times per month by LLMs in product recommendations
  • The play is no longer "rank on Google" — it's "be the brand the model recommends"

From @denohawari (March 2026):

  • His team has driven $30.52M in client revenue using LLM SEO over the past year
  • A B2B SaaS scaled from $20k to $100k MRR in 4 months — 760%+ non-branded traffic growth
  • Method: "decision pages" — [competitor] vs [your brand], alternatives to [X], best [tool] for [specific use case]

Translation: Traditional SEO still drives ~60% of website discovery, but it's the floor, not the ceiling. The growth vector is GEO/AEO — being cited when someone asks ChatGPT, Perplexity, Claude, or Google AI Overviews for a recommendation in your client's industry.

The ⅓ / ⅔ Heuristic (Calibratable, Not Fixed)

Our current allocation for client work: ⅓ of effort on SEO foundation, ⅔ on GEO/AEO optimization.

This is a starting baseline, not a law. After 3-5 client builds produce real data, we calibrate. Some clients (local service businesses with strong existing SEO) may need 20/80. Brand-new businesses with no web presence may need 50/50 until their SEO floor exists.

The Three Disciplines Defined

| Discipline | Goal | Key Mechanism | Key Metric |
|---|---|---|---|
| SEO | Rank in Google/Bing | Keywords, backlinks, technical, E-E-A-T | Rankings, organic traffic, CTR |
| GEO | Get cited in AI responses | Structured content, statistical claims with citations, domain optimization | AI citation frequency, AI Share of Voice |
| AEO | Be the direct answer (voice + AI Overviews) | Declarative sentences, Q&A format, Speakable schema | Brand mentions in AI answers, zero-click visibility |

Part 1: Meet Dr. Strange — The Tool You're Directing

Dr. Strange 🔮 is an autonomous AI agent in our colony with four specialized skills. He lives in Discord. You instruct him via natural-language prompts; he produces deliverables; everything routes through #seo-geo for Don's review before going to the client.

He also calls a fifth, shared colony skill — ahrefs-intel — for all Ahrefs data. That skill is documented separately at kaizen-colony/skills/ahrefs-intel/SKILL.md and is callable by other bots too (Black Widow for bizdev intel, Jarvis for research). You won't typically invoke ahrefs-intel directly — Dr. Strange chains it into his audit flow — but you should know it exists so you understand where Ahrefs data comes from.

His Four Skills

Skill 1: seo-audit — Technical + On-Page + Local Auditing

What it does: Full diagnostic of a website. Crawls up to 50 pages, runs Lighthouse + PageSpeed Insights, validates schema, checks Core Web Vitals, audits on-page SEO (titles, metas, headings, content depth), checks local SEO (GBP, NAP consistency, citations), and compares against competitors if you provide them.

Inputs you give him:

  • target_url — the site to audit (required)
  • client_name — for file organization (required)
  • audit_scope — full | technical_only | on_page_only | local_only (default: full)
  • keyword_targets — array of keywords to check rankings for (optional)
  • competitor_urls — array of competitor URLs (optional, but recommended)

What you get back:

  • seo-audit-report.md — full human-readable report with severity-ranked issues
  • technical-issues.json — machine-readable issue list (category, severity, recommendation, effort)
  • keyword-rankings.json — Brave Search + Ahrefs data per keyword
  • competitor-comparison.md — side-by-side comparative analysis

Cost per audit: ~$0.30 in API calls. High-margin, demo-ready.

How to invoke (paste this in #seo-geo):

Dr. Strange — run seo-audit:
  client_name: "Rideout Law"
  target_url: "https://rideoutlawgroup.com"
  audit_scope: "full"
  keyword_targets:
    - "California foreclosure attorney"
    - "wrongful foreclosure lawyer"
    - "loan modification attorney California"
  competitor_urls:
    - "https://competitor1.com"
    - "https://competitor2.com"

Skill 2: geo-analysis — AI Citation Audit (Our Secret Weapon)

What it does: Fires 10-20 target queries at GPT-4o and Grok 3 times each, classifies each response as cited (1.0) / referenced (0.5) / absent (0.0), takes majority vote, calculates a GEO Score 0-100 per model + overall. Also runs Brave Search for traditional SERP baseline. Stores results in append-only history file for trend tracking.

This is the skill that closes deals. When a prospect asks "why should I pay you?", you run this against their current site, show them they score 12/100, and show them their competitor scores 47/100. Game over.
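The scoring math is simple enough to sanity-check by hand. A minimal sketch of the cited/referenced/absent majority-vote arithmetic (illustrative only — the function names are hypothetical, not the skill's actual internals):

```python
from statistics import mode

# Each query is run 3 times per model; each run is classified as
# cited (1.0), referenced (0.5), or absent (0.0).
def query_score(run_scores):
    # Majority vote across the 3 runs for one query.
    return mode(run_scores)

def geo_score(per_query_runs):
    # GEO Score per model: mean of per-query majority votes, scaled 0-100.
    votes = [query_score(runs) for runs in per_query_runs]
    return round(100 * sum(votes) / len(votes), 1)

runs = [
    [1.0, 1.0, 0.5],  # cited in 2 of 3 runs -> majority: cited (1.0)
    [0.0, 0.5, 0.0],  # mostly absent        -> 0.0
    [0.5, 0.5, 0.5],  # referenced each time -> 0.5
    [0.0, 0.0, 0.0],  # absent               -> 0.0
]
print(geo_score(runs))  # 37.5 = 100 * (1.0 + 0.0 + 0.5 + 0.0) / 4
```

The same arithmetic explains why 10+ queries matter: with fewer, one lucky citation swings the score by 10+ points.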

Inputs:

  • client_url — required
  • client_name — required
  • test_queries — array, minimum 10 queries (the questions ideal customers would ask AI)
  • ai_models — ["openai", "grok"]

What you get back:

  • geo-report.md — human-readable scorecard with query-by-query breakdown
  • geo-scores.json — machine-readable scores per model + per query
  • data/geo-history/{client_slug}.json — append-only historical log for trend tracking

Cost per audit: ~$0.15-0.30. Re-run monthly to show progress.

Query design tips (this is the leverage point):

  • Mirror how actual buyers prompt AI, not how SEO pros think about keywords
  • LLM queries are 23 words on average — write long, natural prompts
  • Mix branded ("Is [client] a good [service]?") and unbranded ("best [service] in [city]") queries
  • Include comparison queries ("[client] vs [competitor]")
  • Include decision queries ("Should I hire [client] for [use case]?")
  • Include problem-aware queries ("My [problem]. Who should I call?")

How to invoke:

Dr. Strange — run geo-analysis:
  client_name: "Rideout Law"
  client_url: "https://rideoutlawgroup.com"
  ai_models: ["openai", "grok"]
  test_queries:
    - "best foreclosure attorney in California"
    - "what should I do if I receive a notice of default in California"
    - "Rideout Law Group reviews"
    - "California foreclosure defense lawyer near me"
    - "can I sue my lender for wrongful foreclosure in California"
    - "Rideout Law vs [competitor]"
    - "I just got a foreclosure notice in Orange County, who do I call"
    - "best lawyer for loan modification in California"
    - "California homeowner facing foreclosure what are my options"
    - "specialized foreclosure attorneys Sacramento California"

Skill 3: seo-content-writer — Client Website Copy

What it does: Writes web copy for the CLIENT'S brand (not Don's voice — the client's voice). Adapts to brand tone, incorporates SME context, targets specific keywords, and optionally applies GEO optimization patterns.

Content types he can produce:

  • service_page — Service pages for any industry (1000-1800 words) — the default for most clients
  • practice_area — Legal/law firm clients only. Practice area pages (1200-2000 words). Functionally the legal industry's name for service pages, with conventions (jurisdictions, case results, attorney credentials) baked in. Don't use this for non-legal clients — pick service_page instead.
  • location_page — Location-specific pages (800-1500 words) — any industry with geographic service areas
  • faq — FAQ pages (1500-3000 words) — any industry
  • blog — Blog posts (1000-2500 words) — any industry
  • landing_page — Conversion pages (500-1000 words) — any industry
  • bio — Team bios (300-500 words) — any industry
  • meta_descriptions — Batch title tags + meta descriptions — any industry

Reminder: Kaizen works across many verticals — financial services, F&B, real estate, tea/hospitality, SaaS, healthcare, professional services, and more. The four Dr. Strange skills are industry-agnostic. The only place industry matters in this skill is practice_area (legal only) vs service_page (everyone else). All other content types are universal.

Inputs:

  • client_name — required
  • content_type — required (one of above)
  • target_keywords — required array
  • word_count_target — required integer
  • client_voice — required string (describes tone/style)
  • sme_context — optional array of SME knowledge strings or file paths
  • geo_optimize — optional boolean (apply GEO patterns)

What you get back:

  • content.md — finished copy
  • seo-metadata.json — title tag, meta description, OG tags, canonical
  • schema-suggestions.json — recommended JSON-LD schema types

How to invoke:

Dr. Strange — run seo-content-writer:
  client_name: "Rideout Law"
  content_type: "practice_area"
  target_keywords: ["wrongful foreclosure", "foreclosure defense attorney California", "predatory lending lawsuit"]
  word_count_target: 1500
  client_voice: "Professional, authoritative, warm but not casual. Trust-building. Avoid legal jargon — clients are scared homeowners, not lawyers."
  sme_context:
    - "/data/workspace/sme-cowork-library/coworkhive-sme-library-900/foreclosure-defense-sme.md"
  geo_optimize: true

Skill 4: schema-markup — JSON-LD Structured Data

What it does: Generates, validates, or audits JSON-LD structured data. This is the machine-readable layer that AI crawlers consume — it's how you tell Google and ChatGPT "this is a LegalService at this address with these specific services."

Three actions:

  • generate — produces JSON-LD for a given schema type
  • validate — checks existing JSON-LD against schema.org spec
  • audit_existing — fetches a page, finds all schema, identifies gaps + errors

Supported schema types:

  • LocalBusiness / LegalService / MedicalBusiness / etc.
  • FAQPage — Q&A pairs that produce FAQ rich results
  • Article — for blog posts (headline, author, dates)
  • Person — for team bios
  • Organization — sitewide entity definition
  • BreadcrumbList — for interior pages
  • Service — for service offerings
  • Review — for testimonials

How to invoke (audit existing site):

Dr. Strange — run schema-markup:
  action: "audit_existing"
  client_name: "Rideout Law"
  page_url: "https://rideoutlawgroup.com"

How to invoke (generate new schema):

Dr. Strange — run schema-markup:
  action: "generate"
  schema_type: "LegalService"
  client_name: "Rideout Law"
  entity_data:
    name: "Rideout Law Group"
    description: "California foreclosure defense and wrongful foreclosure attorneys"
    url: "https://rideoutlawgroup.com"
    phone: "+19165550101"
    email: "info@rideoutlawgroup.com"
    address:
      street: "..."
      city: "Sacramento"
      state: "CA"
      zip: "95814"
    geo:
      lat: 38.5816
      lng: -121.4944
    services: ["Foreclosure Defense", "Loan Modification", "Wrongful Foreclosure Lawsuits"]
    area_served: ["California", "Sacramento County", "Los Angeles County", "Orange County"]
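For orientation, here is roughly how that entity_data maps onto JSON-LD. This is a hand-built sketch of the kind of markup the skill emits, not its exact output (the OfferCatalog nesting is one common way to express services; the skill may structure it differently):

```python
import json

# Sketch: LegalService JSON-LD assembled from the entity_data fields above.
schema = {
    "@context": "https://schema.org",
    "@type": "LegalService",
    "name": "Rideout Law Group",
    "description": "California foreclosure defense and wrongful foreclosure attorneys",
    "url": "https://rideoutlawgroup.com",
    "telephone": "+19165550101",
    "email": "info@rideoutlawgroup.com",
    "address": {
        "@type": "PostalAddress",
        "addressLocality": "Sacramento",
        "addressRegion": "CA",
        "postalCode": "95814",
    },
    "geo": {"@type": "GeoCoordinates", "latitude": 38.5816, "longitude": -121.4944},
    "areaServed": ["California", "Sacramento County", "Los Angeles County", "Orange County"],
    "hasOfferCatalog": {
        "@type": "OfferCatalog",
        "name": "Services",
        "itemListElement": [
            {"@type": "Offer", "itemOffered": {"@type": "Service", "name": s}}
            for s in ["Foreclosure Defense", "Loan Modification",
                      "Wrongful Foreclosure Lawsuits"]
        ],
    },
}
print(json.dumps(schema, indent=2))
```

The output goes into a <script type="application/ld+json"> tag in <head>, then through Google's Rich Results Test before delivery.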

Part 2: The Standard Engagement Flow

This is the playbook for every new SEO/GEO client. Treat it as a checklist.

Phase 1: Discovery + Baseline (Day 1-3)

Goal: Establish where the client currently stands. Build the proof points you'll use to demonstrate ROI later.

Step 1: Load the client context (mandatory, do this BEFORE any audit)

Adapted from @bloggersarvesh's Claude Cowork system. Before you run any Dr. Strange skill, post this context block in #seo-geo:

CLIENT CONTEXT BLOCK — [Client Name]

BUSINESS BASICS:
- Business name:
- Primary URL:
- Address (if local business):
- Phone:
- GBP URL:
- Years in business:
- Team size:

SERVICES + MARKET:
- Primary service:
- Secondary services:
- Service areas:
- Target customer:
- Average customer value:

SEO GOALS:
- Top 5 keywords client wants to rank for:
- Keywords currently ranking for (if known):
- Keywords client should rank for but doesn't:

CURRENT STANDINGS:
- Reviews: [X total, X star rating, X/month]
- GBP monthly views (if known):
- Monthly website traffic (if known):
- Map pack status:
- Biggest SEO problem (one sentence):

COMPETITORS (minimum 3):
1. [name] — [URL] — [GBP if local] — why they're beating us
2. [name] — [URL] — [GBP if local] — why they're beating us
3. [name] — [URL] — [GBP if local] — why they're beating us

WHAT'S ALREADY BEEN TRIED:
- [list prior SEO work, agencies, tools, results]

This context gets referenced by Dr. Strange in every subsequent skill invocation. Do not skip this step. This is the #1 reason agencies produce generic-feeling work — they never load the business context before starting.

Step 2: Run seo-audit (full scope)

This gives you the technical + on-page + local picture.

Step 3: Run geo-analysis (10-20 queries)

This gives you the AI visibility baseline. Save the date — every monthly re-run compares against this.

Step 4: Run schema-markup action audit_existing

This shows you what structured data exists, what's broken, and what's missing.

Step 5: Run ahrefs-intel for the full Ahrefs baseline sweep

This is a shared colony skill — Dr. Strange calls it automatically as part of his SEO audit workflow, but you can also invoke it directly when you need ad-hoc intel.

A full new-client baseline sweep includes:

  • domain_overview — DR, traffic, refdomains snapshot
  • traffic_history — 6-month organic traffic + refdomains trendline
  • top_pages — what's currently driving traffic (don't break these)
  • keyword_universe — every keyword the client ranks for, with quick-wins (position 4-20, volume ≥ 50, KD ≤ 50) auto-flagged
  • content_gap — keywords competitors rank for that the client doesn't, leverage-scored
  • backlink_intel — referring domains, anchor distribution, recent backlinks
  • broken_backlinks — broken inbound links with outreach priority — the highest-ROI recovery play
  • anchor_intel — over-optimization warnings (>25% exact-match = risk)

You invoke it in #seo-geo:

Dr. Strange — run ahrefs-intel baseline:
  client_name: "Rideout Law"
  target: "rideoutlawgroup.com"
  competitors: ["comp1.com", "comp2.com", "comp3.com"]
  country: "us"
  date_from: "2025-11-17"
  date_to: "2026-05-17"

Unit cost: ~500-800 Ahrefs API units per full client baseline (we have 400K/month — capacity for ~500-800 new client baselines monthly, or ~2,500 active retainers).

What you get back:

  • domain_overview.json + domain_overview-summary.md
  • traffic_history.json + chart-ready time series
  • top_pages.json ranked by traffic
  • keyword_universe.json with quick-wins flagged in summary
  • content_gap.json sorted by leverage score (run, don't walk, on these)
  • backlink_intel.json with anchor distribution
  • broken_backlinks.json with outreach priority per link
  • anchor_intel.json with over-optimization warnings
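The quick-win rule from keyword_universe (position 4-20, volume ≥ 50, KD ≤ 50) is worth being able to re-apply yourself, for instance to re-filter a saved keyword_universe.json with different thresholds. A sketch, assuming a simple list-of-dicts layout (the field names here are our assumption, not the skill's guaranteed schema):

```python
def quick_wins(keywords, pos_range=(4, 20), min_volume=50, max_kd=50):
    """Filter keywords down to quick-win candidates: already ranking
    (not top-3, not buried), real search volume, winnable difficulty."""
    lo, hi = pos_range
    return sorted(
        (k for k in keywords
         if lo <= k["position"] <= hi
         and k["volume"] >= min_volume
         and k["kd"] <= max_kd),
        key=lambda k: (k["position"], -k["volume"]),  # closest to page 1 first
    )

kws = [
    {"keyword": "foreclosure attorney sacramento", "position": 7,  "volume": 320, "kd": 22},
    {"keyword": "stop foreclosure california",     "position": 2,  "volume": 900, "kd": 41},  # already top-3
    {"keyword": "notice of default timeline",      "position": 14, "volume": 30,  "kd": 10},  # volume too low
]
print([k["keyword"] for k in quick_wins(kws)])  # ['foreclosure attorney sacramento']
```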

Brand Radar (Ahrefs's 2026 AI-mention tracker) is not yet wired into ahrefs-intel v1.0 — its API endpoint availability still needs to be verified. For now, our internal geo-analysis skill is the source of truth for AI citation tracking. Brand Radar will be added in v1.1 if/when its API ships.

You should not need to open ahrefs.com to do your job. If you ever find yourself wanting to (an ad-hoc visualization, a feature we haven't wrapped, exploratory click-through), tell me — we'll either add it to ahrefs-intel or get you a seat.

Deliverable for Phase 1: A client-baseline-report.md posted in #seo-geo that summarizes:

  • Overall SEO Score from seo-audit (0-100)
  • Overall GEO Score from geo-analysis (0-100)
  • Top 5 critical technical issues
  • Top 5 keyword gaps
  • Top 5 schema gaps
  • Competitor position summary
  • Recommended deployment model (see Part 4 below)

Phase 2: The Three Re-Architecture Pillars (Week 1-4)

Once you have the baseline, Don approves a deployment model and budget, and you start the actual work. Three pillars run in parallel:

Pillar A: Re-Architect the Schema

Goal: Every page has correct, complete, AI-citable structured data.

Workflow:

  1. Use the schema-markup audit_existing output to identify gaps
  2. For each gap, instruct Dr. Strange:
    Dr. Strange — run schema-markup:
      action: "generate"
      schema_type: [type]
      client_name: [name]
      entity_data: { ... }
  3. Validate every generated schema against Google's Rich Results Test: https://search.google.com/test/rich-results
  4. Deliver the JSON-LD to the dev team (or to Carson/Dr. Strange if we're building the site) with placement instructions (typically in <head>)

Schema priority order (from kalicube-geo-playbook.md, Jason Barnard, April 2026):

  1. Entity definition first — Organization or LocalBusiness sitewide (this is the "Entity Home" that AI uses to identify the business)
  2. Service / LegalService schemas on every service page
  3. FAQPage on FAQ sections (single highest-ROI rich result)
  4. Person on bio pages
  5. Article on blog posts
  6. BreadcrumbList sitewide
  7. Review schemas where testimonials exist
  8. Speakable markup on the 2-3 most important answer paragraphs per page (for voice + AI Overviews)
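Speakable (item 8) is the least familiar of these. It's a small JSON-LD block that points voice assistants and AI Overviews at the 2-3 liftable answer passages via CSS selectors. A sketch (the selector names are placeholders for whatever classes the client's page actually uses):

```python
import json

# Speakable: marks which passages are the quotable answers.
# cssSelector values must match real elements on the page.
speakable = {
    "@context": "https://schema.org",
    "@type": "WebPage",
    "name": "Foreclosure Defense in California",
    "speakable": {
        "@type": "SpeakableSpecification",
        "cssSelector": [".answer-summary", ".faq-top-answer"],  # placeholder selectors
    },
}
print(json.dumps(speakable, indent=2))
```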

Critical technical detail (from @Charles_SEO, March 2026):

"Googlebot only fetches the first 2MB of your page's HTML. Everything after that cutoff doesn't exist to Google — not fetched, not rendered, not indexed. Make sure you put your meta tags, title, canonicals, and structured data as HIGH as possible in the document. If they're below the 2MB cutoff, Google doesn't know they exist."

Also: external CSS/JS files get their own 2MB limit per file. PDFs get 64MB.
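You can check a page against the 2MB cutoff instead of guessing. A sketch that reports the byte offset of each critical tag in the raw HTML (it takes bytes, so it works on a saved copy; fetching is deliberately left out):

```python
TWO_MB = 2 * 1024 * 1024

def check_2mb_cutoff(html_bytes, markers=(b"application/ld+json",
                                          b'rel="canonical"', b"<title")):
    """Report where each critical tag first appears; flag anything past 2MB."""
    report = {}
    for m in markers:
        pos = html_bytes.find(m)
        report[m.decode()] = ("MISSING" if pos == -1
                              else "OK" if pos < TWO_MB
                              else f"PAST CUTOFF at byte {pos}")
    return report

# Title + schema early in <head>: fine. No canonical: flagged.
page = (b'<html><head><title>Test</title>'
        b'<script type="application/ld+json">{}</script></head>'
        + b"x" * (3 * 1024 * 1024))
print(check_2mb_cutoff(page))
```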

Pillar B: Re-Write the Copy (Agent-Friendly)

Goal: Every page serves humans AND AI crawlers simultaneously. Humans see modern design + interactivity; AI sees clean semantic HTML with extractable, declarative content.

The 7 GEO Content Principles (apply to every page):

  1. Definitive statements, not hedging. "X is..." not "X may be..." LLMs prefer citable, authoritative declarations.
  2. Bottom-line-up-front structure. Answer first, context second. The first paragraph of every page should be a 2-sentence answer the LLM can lift verbatim.
  3. Question-format H2s. Mirror how users prompt AI. The H2 is the question; the first sentence under it is the answer.
  4. Statistical authority + citations. Every claim backed by "[Source, Year]" inline. Specific numbers ("reduces X by 34%") get cited at significantly higher rates than vague claims.
  5. Entity clarity. First mention of the business = full name + location + descriptor. "Rideout Law Group, a Sacramento, California-based foreclosure defense firm..."
  6. Comparison content. "Unlike traditional X, [Client] does Y." Helps AI position the client in competitive queries.
  7. Breadth + depth. Cover every angle AI might synthesize from. LLMs prefer comprehensive single-page resources over thin pages.

The "Claim-Frame-Prove" passage pattern (from Kalicube Framework, April 2026):

When a user prompts "What should I do if I get a notice of default?", the AI reassembles an answer out of passages that carry Claim, Frame, Proof in a form it can lift verbatim. Passages structured as "Claim first sentence, Frame second sentence, Proof third sentence" extract cleanly. Passages structured as "long discursive paragraph with the answer buried at the end" don't.

Apply to every key paragraph:

  • Claim (sentence 1): The answer.
  • Frame (sentence 2): The context/qualifier.
  • Proof (sentence 3): Statistic, source, case example.

Example for Rideout Law:

  • Claim: California homeowners have 90 days from a notice of default to cure the default or negotiate alternatives.
  • Frame: This is the most critical window in the entire foreclosure process under California Civil Code §2924c.
  • Proof: Rideout Law Group has resolved 87% of cases that enter our office within this 90-day window through loan modification, reinstatement, or wrongful foreclosure litigation (internal data, 2022-2025).

Decision Pages (from @denohawari, March 2026):

These are the highest-ROI pages in the GEO era. They're explicitly built to capture AI recommendation queries:

  • [competitor] vs [your client] — the head-to-head page
  • alternatives to [competitor] — for buyers exiting a competitor
  • best [service] for [specific use case/customer profile] — captures decision-stage queries
  • [service] for [specific industry/region] — niche specificity wins in AI

"AI doesn't reward whoever has the most content, or who's been in the game the longest. It rewards whoever is the clearest answer when buyers ask questions. If you optimize your SEO for AI, you can sideline competitors by capturing their demand before the buyers even start searching."

How to invoke:

Dr. Strange — run seo-content-writer:
  client_name: "Rideout Law"
  content_type: "service_page"
  target_keywords: ["foreclosure defense California", "stop foreclosure California"]
  word_count_target: 1500
  client_voice: "Professional, authoritative, warm. Trust-building tone for scared homeowners."
  sme_context:
    - "/data/workspace/sme-cowork-library/coworkhive-sme-library-900/foreclosure-defense-sme.md"
    - "/data/workspace/clients/rideout-law/discovery-notes.md"
  geo_optimize: true

Voice calibration is everything. The content_writer skill explicitly tests for "does this sound like the CLIENT, not like Don or Sebastian?" If client_voice is vague, Dr. Strange flags it for clarification. Don't let him generate generic voice. Write the client_voice description like you're describing them to a new copywriter.

Pillar C: Build the SEO/GEO Optimization Plan (Ahrefs-Powered)

Goal: A prioritized, time-boxed, evidence-backed roadmap the client signs off on.

The framework (90-day standard):

Weeks 1-2: Foundation

  • All technical issues from seo-audit resolved (P0 + P1)
  • Google Business Profile fully optimized (categories, attributes, services, photos)
  • Google Search Console + Analytics installed and verified
  • XML sitemap + robots.txt + canonical tags audited
  • All sitewide schemas deployed (Organization, BreadcrumbList)
  • /llms.txt published (emerging convention — low cost, do not over-position to client)
  • Baseline GEO Analysis recorded
  • Baseline Ahrefs keyword + backlink snapshot

Weeks 3-4: Core Pages

  • 8-12 pages built/rewritten with dual-audience architecture
  • Per-page schemas deployed (Service, FAQPage, Speakable)
  • FAQ sections populated from real "People Also Ask" data (via the ahrefs-intel keyword_questions action)
  • Image optimization (WebP/AVIF, blur-up, alt text)
  • Internal linking audit + fix (orphan pages, anchor text optimization)

Weeks 5-8: Content Velocity + Authority

  • 2 blog/resource articles per week (question-format, citation-heavy)
  • 10-15 directory submissions (industry-relevant + local citations)
  • 1 Citation Magnet pillar piece (see Part 3)
  • GBP posts weekly with local landmarks + service keywords
  • First Reddit/forum content seeding (more on this in Part 5)
  • Month 2 GEO Audit — compare to baseline

Weeks 9-12: Optimize + Scale

  • Audit which AI citations are working, which aren't
  • Update underperforming pages with fresh stats + citations
  • Expand FAQ coverage based on emerging query patterns
  • Video content for 3-5 highest-value pages
  • Month 3 GEO Audit + first client report
  • Hand off monthly retainer scope

ahrefs-intel is your power tool throughout — all of these are autonomous skill actions, not manual UI work:

| ahrefs-intel action | Use For |
|---|---|
| top_pages | What's already driving traffic? Don't break those |
| keyword_questions | Real "People Also Ask" data → seeds FAQ + blog content |
| keyword_research | Find low-difficulty, high-intent keywords competitors aren't targeting (overview + matching terms) |
| content_gap | Keywords competitors rank for that client doesn't — the priority page list, leverage-scored |
| backlink_intel | Referring-domain profile + anchor text distribution |
| broken_backlinks | Broken inbound links → recover lost SEO equity via publisher outreach |
| anchor_intel | Anchor text health check — flags over-optimization risk |
| serp_analysis | What's currently ranking for a target keyword — per-query competitive landscape |
| traffic_history | Trendline data for monthly client reports + regression detection |

Rank tracking (track 50-200 target keywords over time): currently re-run keyword_universe monthly and diff against last month's snapshot. If we need dedicated Rank Tracker semantics, we add a rank_tracker action in ahrefs-intel v1.1.
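The monthly diff is mechanical. A sketch of comparing two keyword_universe snapshots reduced to {keyword: position} maps (the reduction from the raw JSON is assumed):

```python
def rank_diff(last_month, this_month):
    """Compare two {keyword: position} snapshots.
    Lower position = better rank, so curr < prev means improved."""
    return {
        "improved": {k: (last_month[k], this_month[k]) for k in this_month
                     if k in last_month and this_month[k] < last_month[k]},
        "declined": {k: (last_month[k], this_month[k]) for k in this_month
                     if k in last_month and this_month[k] > last_month[k]},
        "new":  {k: v for k, v in this_month.items() if k not in last_month},
        "lost": {k: v for k, v in last_month.items() if k not in this_month},
    }

diff = rank_diff(
    {"foreclosure attorney": 12, "loan modification": 5, "notice of default": 30},
    {"foreclosure attorney": 8,  "loan modification": 9, "stop foreclosure": 18},
)
print(diff["improved"])  # {'foreclosure attorney': (12, 8)}
```

The "improved" and "declined" buckets feed the monthly client report directly; "lost" is your regression alarm.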



Part 3: The Citation Magnet — Our Differentiator

A Citation Magnet is a dedicated section of the client's website designed to be the authoritative source AI engines cite when answering questions in the client's domain. It's not a blog. It's an AI-native knowledge base.

Why It Matters

This is the productized offering that separates Kaizen from every other SEO agency. We have 900+ SME library files (industry knowledge bases at 13,000+ words each). No competitor has that raw material. The Citation Magnet is how we turn that moat into client value.

The 4-Part Architecture

1. Industry Knowledge Graph

  • Top 50-100 questions in the client's industry
  • Each entry: declarative 1-2 sentence answer + statistic + dated citation + primary source link
  • Uses DefinedTerm + FAQPage + Speakable schema
  • Auto-generatable from the SME library files

2. "AI Audit" Public Page

  • Format: "Here's what AI currently says about [topic], and here's what the data actually shows"
  • Positions the client as the authority correcting AI misinformation
  • Generates backlinks from journalists + industry pros
  • Highly citable (AI prefers correction content)

3. Structured Data Feed

  • /llms.txt — plain-text overview of the site's authoritative topics
  • /knowledge-base/index.json — structured JSON feed of all Q&A pairs
  • XML sitemap with <lastmod> dates signaling freshness
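A sketch of what the /knowledge-base/index.json feed could look like. The field layout is our assumption (no spec mandates one); the point is machine-readable Q&A pairs with dated citations, reusing the Claim-Frame-Prove material from Part 2:

```python
import json

# Hypothetical feed layout for /knowledge-base/index.json.
index = {
    "entity": "Rideout Law Group",
    "updated": "2026-05-17",
    "entries": [
        {
            "question": "How long do California homeowners have to cure a notice of default?",
            "answer": ("California homeowners have 90 days from a notice of default "
                       "to cure the default or negotiate alternatives."),
            "citation": "California Civil Code §2924c",
            "source_url": "https://rideoutlawgroup.com/knowledge-base/notice-of-default",
        },
    ],
}
print(json.dumps(index, indent=2, ensure_ascii=False))
```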

4. Monthly "State of [Industry]" Report

  • Auto-generated from SME library (human review only, not authoring)
  • Published as webpage + downloadable PDF
  • Recurring citation target (AI re-crawls fresh content frequently)
  • Builds email list (gated PDF)
  • Generates social shares + backlinks

Validation Gate (Important)

Before we sell Citation Magnet as a standard deliverable, we need ONE proof-of-concept build that generates a Citation Magnet page from an existing SME library file using Dr. Strange's automated pipeline. If this requires significant human authoring rather than automated generation with human review, we don't ship it as a productized offering yet.

Don will tell you which client gets the pilot. Don't promise Citation Magnet to a new client until pilot status is confirmed with Don.


Part 4: Pricing + Deployment Models

We support three deployment paths. The model determines scope, timeline, and price.

Model A: Greenfield Build (Full SAGEO)

When: New business, no existing site, or existing site is beyond salvaging.

  • Full Astro 6 hybrid architecture on Cloudflare Pages
  • Complete dual-audience page design
  • Citation Magnet architecture included
  • 90-day launch sequence
  • Price range: $8K–$15K + $1K–$2K/mo retainer

Model B: Migration (Existing → Astro)

When: Client has WordPress/Squarespace/Wix with valuable content but a platform that limits SAGEO.

  • Phased content migration to Astro
  • URL mapping + 301 redirect plan
  • Preserve existing SEO equity during transition
  • Citation Magnet added in Phase 2
  • Price range: $5K–$12K + $1K–$2K/mo retainer

Model C: SAGEO Overlay (Keep Existing Platform)

When: Budget-conscious SMB. Most common entry point. Don's preferred starter.

  • Add JSON-LD to existing pages (works on any platform)
  • Publish /llms.txt and knowledge-base/index.json
  • Build Citation Magnet pages as subdirectory/subdomain
  • Add FAQ schema to existing service pages
  • Implement Speakable markup
  • Monthly GEO audits + optimization
  • Price range: $2K–$5K + $500–$1K/mo retainer

The SAGEO Audit as Sales Tool

Standalone GEO Audit (geo-analysis run): $500–$1,000

  • Show prospect their current AI visibility score (likely low)
  • Show where competitors are being cited instead
  • Propose Overlay or Build to close the gap

Cost to Kaizen per audit: ~$0.30 in API calls. High-margin, demo-ready, urgency-creating. This is your lead-gen weapon.

All pricing is currently placeholder. Validate ranges after 3 completed pilot builds. Adjust based on hours invested and client willingness to pay.


Part 5: Off-Site GEO Tactics (The Zapier Playbook)

This is the part most SEO agencies still don't do. From the Zapier interview (Andrew Warner, March 2026) — these are the tactics that get Zapier mentioned millions of times by LLMs:

Tactic 1: Reddit (Highest ROI for GEO)

  • LLMs heavily weight Reddit content because moderators vet answers over time
  • Older posts are more valuable — LLMs trust them more
  • Use a house account (Zapier uses zapier_dave)
  • Answer questions that customers are likely to type into LLMs
  • Ask questions that customers are likely to type into LLMs (then answer your own questions)
  • Don't obsess over upvotes — Zapier found little correlation between vote counts and GEO utility
  • It's a volume play — spread across hundreds of threads, not viral on one

Tactic 2: YouTube

  • Gemini uses YouTube heavily; other LLMs are influenced by it indirectly
  • Create your own videos, especially for B2B (less competition in B2B video)
  • Work with both big creators (polished, high-reach) AND small creators (outsized LLM influence per view)
  • Example from Warner's interview: an 835-view video earned 5.9% of the question citation share in its niche

Tactic 3: Correct Outdated Articles

  • Older articles about the client continue to influence LLM responses long after they're outdated
  • Message publishers with outdated info; ask for updates
  • Publishers often comply because they want credible content
  • Track which 3rd-party articles AI is citing → systematically refresh them

Tactic 4: Tools for Measuring LLM Citations

Beyond our internal geo-analysis:

  • Profound — monitor prompts, track citation rates vs. competitors
  • Petra Labs — similar to Profound, more customizable
  • Amplitude — customer journey funnel analytics, citation changes over time
  • Ahrefs Brand Radar (2026 feature) — built-in AI mention tracking

Part 6: Measurement + Monthly Reporting

Every client gets a monthly SAGEO report. Plain language, never jargon (see Part 8 for the translation guide).

Success Metrics (90-day targets)

Metric | Source | Target
GEO Score | geo-analysis | Baseline + 15 pts
Citations (% queries where AI cites client) | geo-analysis | 30%+
References (% queries where client is named) | geo-analysis | 50%+
SERP Presence (top-10 organic rankings) | Ahrefs Rank Tracker | 60%+ of target keywords
Organic Traffic | Google Analytics | +25% vs. baseline
Lead Attribution | CRM / intake form | "How did you find us?" = "AI assistant" tracked
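The Citations and References metrics are simple percentages over the tested query set. A minimal sketch of the computation — the record fields (`cited`, `named`) are assumptions about the geo-analysis output shape, not its actual schema:

```python
# Each geo-analysis run yields one record per test query.
# Field names here are illustrative assumptions, not the skill's real schema.
results = [
    {"query": "best plumber in phoenix", "cited": True, "named": True},
    {"query": "emergency pipe repair az", "cited": False, "named": True},
    {"query": "water heater installation", "cited": False, "named": False},
]

def rate(results, field):
    """Percentage of queries where the given boolean field is true."""
    return 100 * sum(r[field] for r in results) / len(results)

citations = rate(results, "cited")   # % of queries where AI cites the client
references = rate(results, "named")  # % of queries where the client is named
print(f"Citations: {citations:.0f}%  References: {references:.0f}%")
```

References will always be at least as high as Citations (a cited client is necessarily named), which is why the 90-day targets above set 50%+ for References but only 30%+ for Citations.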

Monthly Report Template

MONTHLY SAGEO REPORT — [Client Name]
Month: [Month Year]

YOUR AI VISIBILITY SCORE: [X]/100 (↑/↓ from last month)

WHEN CUSTOMERS ASK AI ABOUT YOUR INDUSTRY:
- [X] of [Y] questions: AI recommends YOUR business ✅
- [X] of [Y] questions: AI mentions you but not first 🔶
- [X] of [Y] questions: AI doesn't mention you yet 🔴

TOP WINS THIS MONTH:
- "[Query]" — ChatGPT now cites [Client] (was absent last month)
- "[Query]" — moved from mention to direct recommendation

PRIORITY TARGETS NEXT MONTH:
- "[Query]" — competitor [X] is being cited; here's our plan to earn that spot

TRADITIONAL SEO:
- Google rankings: [X] keywords in top 10 (was Y last month)
- Website traffic: [X] visits (+Y% vs last month)
- Leads from website: [X] ([Y] mentioned finding you through AI)

THIS MONTH'S WORK:
- [List of pages updated, schemas added, content shipped]

NEXT MONTH'S PRIORITIES:
- [List of upcoming work, tied to specific GEO Score gaps]

The Revenue Attribution Reality Check

Honest caveat to know going in: the chain from "AI cited the client" → "customer walked in the door" is imperfect. Most AI-driven visits don't carry UTM parameters. Your three best proxies:

  1. The "How did you find us?" intake form field (include "AI assistant: ChatGPT/Perplexity/Claude/etc.")
  2. GEO Score trends (rising = more AI mentions = more downstream brand recall)
  3. Anecdotal feedback ("a customer told me ChatGPT recommended you")

Don't oversell attribution precision to clients. The honest pitch: "AI is increasingly how your customers research. Our job is to make sure when they ask, you're the answer. Here's how we measure that."


Part 7: The Tips Layer (Stuff Most Manuals Miss)

These are the small, high-leverage details that separate good SEO work from great SEO work. Pulled from Keep.md, recent industry sources, and lessons learned.

Technical: The 2MB Rule (Charles Floate, March 2026)

  • Googlebot fetches only the first 2MB of HTML. Everything after that = invisible to Google.
  • Place these AS HIGH IN THE DOCUMENT AS POSSIBLE: <title>, <meta> tags, canonicals, all structured data (JSON-LD).
  • External CSS/JS files get their own 2MB limit per file.
  • PDFs get 64MB. Use PDFs for resource downloads that need to be indexed.
  • WRS (Web Rendering Service) is stateless — it clears localStorage + session data between requests. If your content depends on session state to render, Google can't see it.
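The 2MB rule translates directly into head ordering. A sketch of the target document shape for a hypothetical client page (the URL and business are placeholders):

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <!-- Crawler-critical elements first, safely inside Googlebot's 2MB window -->
  <title>Smith Law — Estate Planning Attorneys in Phoenix</title>
  <meta name="description" content="Flat-fee estate planning for Arizona families.">
  <link rel="canonical" href="https://www.example.com/estate-planning">
  <script type="application/ld+json">
    {"@context": "https://schema.org", "@type": "LegalService", "name": "Smith Law"}
  </script>
  <!-- Only AFTER the above: large inline CSS/JS, analytics, tag managers -->
</head>
<body>
  <!-- page content -->
</body>
</html>
```

If a page builder injects megabytes of inline CSS before the JSON-LD, the structured data can fall past the cutoff even though it "exists" in the source.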

Technical: Shopify Quietly Blocks AI Crawlers

If a client is on Shopify, their robots.txt likely blocks GPTBot, ChatGPT-User, ClaudeBot, anthropic-ai, PerplexityBot, CCBot by default (Shopify update, early 2026). To fix:

  1. Online Store → Themes → Edit code
  2. Add robots.txt.liquid to templates folder
  3. Remove the AI bot disallow blocks OR explicitly Allow: / for AI user agents
  4. Keep /admin, /cart, /checkout, /account disallowed
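Steps 2-3 can be sketched as a `robots.txt.liquid` template. This assumes Shopify's documented `robots` Liquid object (`default_groups`, `group.user_agent.value`, `group.rules`, `group.sitemap`); verify against the current Shopify docs before shipping, since the default group names may change:

```liquid
{% comment %}
  templates/robots.txt.liquid — sketch only. Re-emits every default group
  EXCEPT the blanket AI-bot blocks, then adds explicit groups that allow
  those crawlers while keeping private paths disallowed (step 4).
{% endcomment %}
{%- assign ai_bots = "GPTBot,ChatGPT-User,ClaudeBot,anthropic-ai,PerplexityBot,CCBot" | split: "," -%}

{%- for group in robots.default_groups -%}
  {%- unless ai_bots contains group.user_agent.value %}
{{ group.user_agent }}
    {%- for rule in group.rules %}
{{ rule }}
    {%- endfor %}
    {%- if group.sitemap != blank %}
{{ group.sitemap }}
    {%- endif %}
  {%- endunless -%}
{%- endfor %}

{% for bot in ai_bots %}
User-agent: {{ bot }}
Allow: /
Disallow: /admin
Disallow: /cart
Disallow: /checkout
Disallow: /account
{% endfor %}
```

After deploying, fetch the storefront's /robots.txt and confirm each AI user agent now has an Allow: / group.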

Google's Own Guidance on AI Optimization (Source: Google Search Central, 2026)

What Google explicitly says NOT to do (debunking SEO myths):

  • ❌ Don't create special "machine-readable files" or rewrite content specifically for AI systems
  • ❌ Don't "chunk" content into tiny pieces for AI
  • ❌ Don't try to manipulate AI algorithms
  • ❌ Don't create excessive pages targeting specific AI keywords

What Google says TO do:

  • ✅ Create unique content with a distinct perspective (first-hand reviews, expertise-driven)
  • ✅ Non-commodity content > commodity content (avoid generic "7 Tips for X" pieces)
  • ✅ People-first content that satisfies user needs
  • ✅ Standard semantic HTML, proper JavaScript handling, indexable + crawlable
  • ✅ Good UX across devices
  • ✅ Merchant Center + Google Business Profiles for product + local data

The synthesis: GEO is not about anti-SEO tactics. It's about doing SEO better — with more uniqueness, more first-hand expertise, more declarative structure, more entity clarity. The same content that wins in AI search wins in Google search. There's no "AI-only" content track.

Voice: The Most Common Mistake

When writing client copy, the failure mode is writing in Don's voice or Sebastian's voice instead of the client's voice. Always pressure-test:

  • Does this sound like the client? Or like a consultant?
  • Is the formality calibrated? (Law firm ≠ SaaS startup ≠ medical practice ≠ tea shop)
  • Are industry-specific terms used correctly?
  • Would the client's existing site feel consistent with this?

If client_voice is vague when you brief Dr. Strange, stop and clarify with Don before generating. Generic voice is worse than no voice.

The Kalicube "First-Failing Gate" Rule (Jason Barnard, April 2026)

The AI engine pipeline has 10 gates between "discovered" and "won." Confidence compounds multiplicatively: 90% at each of 10 gates leaves only about 35% at the end (0.9¹⁰ ≈ 0.35). A single weak gate destroys everything downstream.

The rule: Locate the earliest gate where confidence drops below threshold and fix THAT one first. Investing in citation quality when pages don't render properly = waste. Investing in third-party corroboration when entity isn't disambiguated = waste.
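The multiplication is worth internalizing, because it explains why one bad gate dominates. A quick sketch:

```python
# Confidence compounds multiplicatively across the pipeline's 10 gates.
gates = [0.90] * 10
overall = 1.0
for g in gates:
    overall *= g
print(f"10 gates at 90% -> {overall:.1%} end-to-end")  # -> 34.9%

# One weak gate dominates: drop a single gate to 30% and the
# other nine strong gates barely matter.
weak_pipeline = [0.90] * 9 + [0.30]
overall_weak = 1.0
for g in weak_pipeline:
    overall_weak *= g
print(f"one 30% gate -> {overall_weak:.1%} end-to-end")
```

Raising the weak gate from 30% to 90% roughly triples end-to-end confidence; raising an already-strong gate from 90% to 95% moves it only a few points. That asymmetry is the whole argument for fixing the first failing gate first.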

Common first-failing gates:

  1. Rendering — pages don't load / JS fails / 2MB cutoff hits
  2. Indexability — robots.txt or meta noindex blocks
  3. Annotation — schema is missing or broken
  4. Entity disambiguation — Google/LLMs can't tell which "Smith Law" we're talking about (fix with sameAs links to Wikipedia, Wikidata, LinkedIn, GBP)
  5. Extraction — content is written discursively, not in Claim-Frame-Prove blocks
  6. Citation — no statistics, no dated sources, no authoritative anchors

Always diagnose before you act. Your first GEO Analysis run tells you the symptom; your seo-audit + schema audit tells you which gate is failing.

The "Hidden Salesforce" Frame (For Client Conversations)

When selling GEO to skeptical clients, the most effective frame (per Kalicube):

"Seven AI platforms — Google, ChatGPT, Perplexity, Claude, Copilot, Siri, Alexa — are working 24/7 either for you or for your competitors. Untrained, they default to whichever competitor has the best-corroborated content. Trained, they become the most scalable sales channel a business has ever had. The question isn't whether AI is recommending businesses in your space. It's whose business it's recommending."

CEOs understand the salesforce argument in ~8 seconds. Use it.

Content Drift vs. Corroboration Decay (Governance)

Most agencies monitor content drift (your own content going stale). They miss corroboration decay — 3rd-party references that supported your credibility get rewritten, drop off, or get superseded by competitor-favorable articles. Your site hasn't changed; the evidence base under you has.

Add to monthly retainer scope:

  • Track full brand footprint, not just owned site
  • Refresh third-party content on a deliberate cadence (quarterly minimum)
  • Set Google Alerts for client name + key terms; flag any new article that contradicts or supersedes our positioning
  • Reach out to publishers proactively (correction requests, refresh offers)

Decision Pages > Top-of-Funnel Content

From @denohawari's $30M case study:

"Don't write 'what is HR software?' — that's broad keyword content where giants always win. Write '[competitor] vs [your brand]', 'alternatives to [competitor]', 'best [tool] for [specific use case]'. These pages directly match the questions buyers ask AI. AI rewards specificity, not authority age."

For every client, build at minimum:

  • 1 [client] vs [top competitor] page
  • 2-3 best [service] for [specific customer profile] pages
  • 1 alternatives to [competitor] page (if the competitor is gettable)

The Karpathy "Untrained Salesforce" Insight (Operating Note)

Some of the most important content on a well-built website is rarely visited by humans, because it exists to teach the AI what the brand is. The seven AI platforms read that content to form their understanding of the client even when no human ever clicks the page.

Don't prune low-traffic pages purely on analytics. If the page exists to feed the entity understanding (about page, technical service definitions, methodology pages), it earns its place even with zero human visits. AI eats it.


Part 8: Client-Facing Translation Guide

Never lead with technical jargon in sales conversations. Use plain English.

What We Call It (Internal) | What the Client Hears
SAGEO Optimization | "We make sure your business shows up when people search Google AND when they ask AI assistants like ChatGPT for recommendations"
Astro Islands architecture | "Your site loads in under 2 seconds on any device — faster than 95% of your competitors"
Dual-audience page design | "Your website works for both Google and AI assistants — most sites only work for one"
⅓/⅔ SEO/GEO-AEO split | "We spend most of our effort making sure AI recommends YOUR business, not just getting you to page 1 of Google — because that's where your customers are heading"
Citation Magnet | "We build a section of your site that makes you the source AI recommends when customers ask questions in your industry"
/llms.txt | "A file that tells AI systems what your business does and why you're the expert — like a business card for ChatGPT"
JSON-LD structured data | "Hidden code that helps Google and AI understand exactly what services you offer, where you're located, and why you're qualified"
Speakable schema | "We mark the most important parts of your pages so voice assistants like Siri and Alexa can read them to customers"
GEO Analysis / AI citation audit | "We check whether AI assistants are recommending your business — and show you exactly which questions you're winning and losing"
GEO Score | "Your AI Visibility Score — a simple number showing how often AI recommends you vs. your competitors"
E-E-A-T signals | "Proof that you're a real expert — credentials, experience, reviews — the stuff that makes both Google and AI trust you"
FAQPage schema | "We format your FAQ so it can appear directly in Google search results and AI answers — not just on your website"
Decision pages | "Pages that win the moment a buyer is choosing between options — like '[competitor] vs us' or 'best X for [your situation]'"
Topic clusters | "A web of connected pages that makes AI see you as THE expert on your topic, not just someone who wrote one article about it"
Claim-Frame-Prove structure | "We write your content so AI can lift our answers directly into ChatGPT's response — that's how citations happen"

Part 9: Workflow Quick Reference

Where to Post What

What | Channel | ID
Every Dr. Strange skill invocation | #seo-geo | 1487707978860593375
Website build deployments + previews | #website-builds | 1487707980857217154
Client deliverable review (post-approval) | #projects | 1482998914419261542
Don's strategic input / decisions | #the-cabinet | 1482959562901295136
Questions for Sebastian | DM or #the-cabinet
Questions for Don | DM or #the-cabinet

The Quality Gate (Non-Negotiable)

Every Dr. Strange output goes to #seo-geo for Don's review before client delivery. No exceptions.

This includes:

  • Audit reports
  • GEO Analysis reports
  • Generated copy
  • Generated schemas
  • Proposed roadmaps

Why: this is how we catch voice misfires, factual errors, and over-promising before they hit the client. Dr. Strange is fast but not perfect. The review gate is what makes the system trustworthy.

Daily / Weekly / Monthly Rhythm

Daily:

  • Check #seo-geo for any Dr. Strange runs awaiting your review
  • Triage new client requests in #research-queue (if SEO-related)
  • Track active client tasks in shared task queue

Weekly:

  • Run Ahrefs Site Audit on each active retainer client (automated; just review issues)
  • Review Rank Tracker movement
  • Brief Dr. Strange on the next batch of pages/schemas/audits

Monthly:

  • Re-run geo-analysis for every retainer client
  • Generate monthly SAGEO report per client (template in Part 6)
  • Update historical trend data
  • Review 3rd-party corroboration decay (Google Alerts queue)

Part 10: Open Questions for Don (Before Jen Starts)

These are flags from the manual draft that need Don's call before Jen's first day:

  1. Citation Magnet pilot client. Which client gets the first POC build? Until that's validated, we don't sell it as a standard productized offering.
  2. Pricing. All price ranges in Part 4 are placeholders pending 3 pilot builds. Do we want Jen empowered to quote within ranges, or every quote routes through Don?
  3. Ahrefs seat. Does Jen get her own Ahrefs login, or do we share Don's? RESOLVED 2026-05-17: Built kaizen-colony/skills/ahrefs-intel/SKILL.md v1.0. Jen doesn't need her own Ahrefs UI login — all data flows through the colony skill API. We have 400K units/month (Standard plan), enough for ~500-800 new client baselines OR ~2,500 active retainers. Revisit only if Jen needs ad-hoc UI exploration the skill doesn't cover.
  4. Stratus split. Per the strategic context, Jen splits time from Stratus. How many hours/week is she on Kaizen? Affects sequencing.
  5. Onboarding clients. Who's client #1 for Jen + Dr. Strange to run on as her first live engagement? Rideout Law? Geraci LLP? Coworkhive?
  6. Carson coordination. Dr. Strange's web-builder skill (separate from these four) is what Carson uses for greenfield builds. Should Jen pair with Carson on Model A engagements, or stay focused on Model C overlays for the first 90 days?
  7. Tooling. Does Jen need access to Profound or Petra Labs (3rd-party AI citation trackers), or do we stay 100% on our internal geo-analysis + Ahrefs Brand Radar for now?

Appendix A: Source Material

This manual was synthesized from:

Internal docs:

  • kaizen/sageo-optimization-template-v1.1.md — SAGEO template (Jarvis → Sebastian, April 2026)
  • kaizen-colony/skills/dr-strange/seo-audit/SKILL.md
  • kaizen-colony/skills/dr-strange/geo-analysis/SKILL.md
  • kaizen-colony/skills/dr-strange/seo-content-writer/SKILL.md
  • kaizen-colony/skills/dr-strange/schema-markup/SKILL.md
  • kaizen-colony/bot-core-files/ATLANTIS-CHANNEL-MAP.md

External (via Keep.md):

  • Google Search Central, "Optimizing for Generative AI Features on Google Search" (2026)
  • Jason Barnard / Kalicube, "Extending IBM's GEO Playbook: 18-Component Operational Framework" (April 2026)
  • a16z, "How Generative Engine Optimization (GEO) Rewrites the Rules of Search" (May 2025)
  • Andrew Warner + Zapier, "GEO Playbook: How LLMs Decide What to Recommend" (March 2026)
  • @denohawari, "How We Drove $30.52M For Clients With LLM SEO" (March 2026)
  • @bloggersarvesh, "My Chief of SEO, Claude Cowork" + 7 Claude SEO Prompts (March 2026)
  • Charles Floate, "Googlebot's 2MB HTML Cutoff" (March 2026)
  • IBM GEO Playbook via Search Engine Land (April 2026)
  • Ahrefs, "How AI Search Drives Traffic + Conversions" (2026)

Peer-reviewed:

  • Aggarwal et al., "GEO: Generative Engine Optimization" (KDD 2024) — arxiv.org/abs/2311.09735

Manual v1.0 produced by Sebastian 🦀 for Jen Villadolid's onboarding to Kaizen AI Lab. Awaiting Don's review and sign-off before delivery to Jen.