Fill Your SEO Gaps
Module 4: Lesson 5 · 6 min read

Semantic HTML Hierarchy: Diagnose and Fix

By Jules de Bruin

GEO Instructor at Rankscale

Last updated 2026-04-27

TL;DR. RAG systems chunk pages by heading hierarchy. A flat or broken structure (all H2s, or H1 → H4 with no H2/H3) means the engine cannot isolate the right chunk to cite. Rule: one H1, question-format H2s, scannable H3s, never skip a level. Inspect your page in 2 minutes with the accessibility tree.

Why hierarchy matters for AI

When a RAG system retrieves a page, it does not read top-to-bottom. It extracts chunks defined by heading boundaries. A well-structured page looks like this to the engine:

H1: Main topic
  H2: Question 1 → [chunk 1 = content under H2]
    H3: Sub-point → [sub-chunk]
  H2: Question 2 → [chunk 2]
  H2: Question 3 → [chunk 3]

A broken page looks like this:

H1: Main topic
  H4: Random sub-point (skipped H2, H3)
  (no H2 at all)
  Random bolded text that should be an H2 but is just a <strong> tag, not a heading.

On the broken page, the engine cannot decide where chunks begin and end. It either quotes the whole page (too long, gets down-weighted) or quotes nothing (you are not cited).
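To make the chunking idea concrete, here is a minimal sketch that splits a page at every H2 boundary. It is an illustration only: real RAG pipelines differ in how they chunk, and the class name `H2Chunker`, the function `chunk_by_h2`, and both sample pages are hypothetical. It uses only Python's standard `html.parser` module.

```python
from html.parser import HTMLParser


class H2Chunker(HTMLParser):
    """Start a new chunk at every <h2>; everything up to the next <h2> is its content."""

    def __init__(self):
        super().__init__()
        self.chunks = []      # list of {"heading": str, "text": str}
        self._in_h2 = False   # True while the parser is inside an <h2> element

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_h2 = True
            self.chunks.append({"heading": "", "text": ""})

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_h2 = False

    def handle_data(self, data):
        if not self.chunks:
            return  # content before the first <h2> belongs to no chunk
        key = "heading" if self._in_h2 else "text"
        self.chunks[-1][key] += data + " "


def chunk_by_h2(html):
    """Return the page's H2-delimited chunks with whitespace normalized."""
    parser = H2Chunker()
    parser.feed(html)
    parser.close()
    return [
        {"heading": " ".join(c["heading"].split()), "text": " ".join(c["text"].split())}
        for c in parser.chunks
    ]


# Hypothetical sample pages mirroring the two outlines above.
good_page = """
<h1>Main topic</h1>
<h2>Question 1</h2><p>Answer one.</p>
<h3>Sub-point</h3><p>Detail.</p>
<h2>Question 2</h2><p>Answer two.</p>
"""

broken_page = """
<h1>Main topic</h1>
<h4>Random sub-point</h4><p>Some text.</p>
<p><strong>Looks like a heading</strong> but is not one.</p>
"""

good_chunks = chunk_by_h2(good_page)
broken_chunks = chunk_by_h2(broken_page)
print(len(good_chunks), "chunks on the good page")
print(len(broken_chunks), "chunks on the broken page")
```

With this H2-only splitter, the broken page produces no chunks at all, because neither the H4 nor the bolded text creates a boundary the splitter recognizes.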

The 4 rules

Rule 1: Exactly one H1 per page - The H1 is the page's title statement. It should match or closely paraphrase the target prompt. One per page. Not two. Not a div with a CSS class that makes it look like a heading. A real <h1> tag.

Rule 2: H2s in question format - H2s are the major section questions. Phrase them as the reader would ask them. Not:

  • "Features"
  • "Use Cases"
  • "Why Choose Us"

Use:

  • "What features does Rankscale support?"
  • "Who should use Rankscale?"
  • "Why use Rankscale over [alternative]?"

Question-format H2s double as FAQ candidates the engine can extract as direct answers.

Rule 3: H3s are scannable sub-points - H3s break an H2 section into sub-points. Keep them short, specific, and parallel. If your H3s under an H2 do not logically belong together, your H2 is too broad.

Rule 4: Never skip a level - The order is H1 → H2 → H3 → H4. Never go from H1 to H3 directly. Skipping levels breaks the chunking logic, and most accessibility tools flag it.
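Rules 1 and 4 can be checked mechanically. The sketch below reads the real heading tags out of an HTML string and reports any heading that sits more than one level below the heading before it. The names `OutlineParser`, `extract_outline`, and `skipped_levels` are hypothetical, and treating the top of the page as "level 0" (so the first heading must be an H1) is an assumption of this sketch.

```python
import re
from html.parser import HTMLParser

HEADING_TAG = re.compile(r"^h([1-6])$")


class OutlineParser(HTMLParser):
    """Collect (level, text) for every real <h1>..<h6> element, in document order."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None   # level of the heading currently open, if any
        self._parts = []     # text fragments seen inside the open heading

    def handle_starttag(self, tag, attrs):
        match = HEADING_TAG.match(tag)
        if match:
            self._level = int(match.group(1))
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.outline.append((self._level, "".join(self._parts).strip()))
            self._level = None


def extract_outline(html):
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return parser.outline


def skipped_levels(outline):
    """Return (previous_level, (level, text)) for each heading that skips a level."""
    skips = []
    previous = 0  # assumption: page start counts as level 0, so the first heading must be H1
    for level, text in outline:
        if level > previous + 1:
            skips.append((previous, (level, text)))
        previous = level  # moving back up (H3 to H2) is always allowed
    return skips


# Hypothetical page: the H4 directly under the H1 skips H2 and H3.
page = (
    "<h1>Main topic</h1>"
    "<h4>Random sub-point</h4>"
    "<h2>Question 1</h2>"
    "<h3>Sub-point</h3>"
)

outline = extract_outline(page)
h1_count = sum(1 for level, _ in outline if level == 1)
skips = skipped_levels(outline)
print("H1 count:", h1_count)
print("Skipped levels:", skips)
```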

The 2-minute diagnostic

Inspect the page's heading outline in the accessibility tree, then check it against these pass criteria:

  • One H1, matching the page topic
  • 3 to 8 H2s, each a question
  • H3s grouped logically under H2s
  • No skipped levels
  • No div or span elements styled with CSS to look like headings

Any fail = hierarchy gap.
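Once you have a page's heading outline as a list of (level, text) pairs, most of the pass criteria reduce to a short checklist. The sketch below is one possible encoding, not a definitive audit: the function name `audit_outline` is hypothetical, "is a question" is approximated by a trailing question mark, and two criteria (logical H3 grouping, and divs or spans styled as headings) are left out because they cannot be judged from the outline alone.

```python
def audit_outline(outline):
    """Check a list of (level, text) headings against the pass criteria.

    Returns a list of failure messages; an empty list means the outline passes
    the checks this sketch covers.
    """
    failures = []

    h1s = [text for level, text in outline if level == 1]
    if len(h1s) != 1:
        failures.append(f"expected exactly one H1, found {len(h1s)}")

    h2s = [text for level, text in outline if level == 2]
    if not 3 <= len(h2s) <= 8:
        failures.append(f"expected 3 to 8 H2s, found {len(h2s)}")

    # Assumption: a heading counts as a question if it ends with "?".
    for text in h2s:
        if not text.rstrip().endswith("?"):
            failures.append(f"H2 is not phrased as a question: {text!r}")

    previous = 0  # assumption: the first heading on the page must be an H1
    for level, text in outline:
        if level > previous + 1:
            failures.append(f"skipped level before H{level}: {text!r}")
        previous = level

    return failures


# Hypothetical outlines for illustration.
passing = [
    (1, "Main topic"),
    (2, "What features does Rankscale support?"),
    (3, "Sub-point"),
    (2, "Who should use Rankscale?"),
    (2, "Why use Rankscale over [alternative]?"),
]
failing = [
    (1, "Main topic"),
    (1, "Hero banner"),
    (2, "Features"),
    (4, "Details"),
]

passing_failures = audit_outline(passing)
failing_failures = audit_outline(failing)
for message in failing_failures:
    print("FAIL:", message)
```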

The most common failure modes

  • Designer-built pages. Marketing pages often use styled div tags instead of real heading tags. Visual hierarchy looks right; semantic hierarchy is absent.
  • CMS autogenerated pages. Blog templates often wrap the title in <h2> and use <h1> for the site logo. Invert it.
  • Multiple H1s from template modules. Hero banners, CTA sections, and testimonial widgets sometimes ship with their own H1. Audit every imported component.
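Styled div tags are the hardest of these failures to spot, because the page looks correct. One rough heuristic is to flag non-heading elements whose class names hint at heading styling. The sketch below assumes class tokens such as "hero-title" or "h2" are used that way; the pattern `HEADING_LIKE_CLASS`, the class `FakeHeadingFinder`, and the sample markup are hypothetical, and real sites will need the pattern adjusted to their own naming.

```python
import re
from html.parser import HTMLParser

# Assumption: class tokens like these suggest an element is styled as a heading.
HEADING_LIKE_CLASS = re.compile(
    r"(^|[-_])(h[1-6]|heading|headline|title)($|[-_])", re.IGNORECASE
)


class FakeHeadingFinder(HTMLParser):
    """Flag <div>, <span>, and <p> elements whose class suggests heading styling."""

    def __init__(self):
        super().__init__()
        self.suspects = []   # list of (tag, class attribute value)

    def handle_starttag(self, tag, attrs):
        if tag not in {"div", "span", "p"}:
            return  # real heading tags and other elements are not suspects
        class_value = dict(attrs).get("class") or ""
        if any(HEADING_LIKE_CLASS.search(token) for token in class_value.split()):
            self.suspects.append((tag, class_value))


def find_fake_headings(html):
    finder = FakeHeadingFinder()
    finder.feed(html)
    finder.close()
    return finder.suspects


# Hypothetical marketing-page markup.
page = (
    '<div class="hero-title">Main topic</div>'
    '<span class="h2 bold">Features</span>'
    '<div class="card">Body copy</div>'
    '<h2 class="title">Real heading</h2>'
)

suspects = find_fake_headings(page)
for tag, class_value in suspects:
    print(f"<{tag} class=\"{class_value}\"> may be a heading in disguise")
```

A hit is only a prompt for manual review: a class named "title" can be legitimate on a non-heading element, and a styled div with an unrelated class name will slip through.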

Do this now:

If multiple H1s, skipped levels, or marketing-label headings in place of question-format H2s still show up, log back into Page Audit V2 and use its outline diagnostics to justify the rewrite ticket with screenshots. Fix time is typically 1 hour per page.
