Methodology

How the Citation Readiness Score works.

The score is the geometric mean of five dimensions, each scored 0.0 – 1.0, multiplied by 100. The lower bound is 0 (the page is structurally uncitable); the upper bound is 100 (the page is in the top decile of pages we've audited for both search ranking and LLM citation). Two dimensions act as hard blocks at 0.0: if either Dual Fit or Bot-Crawl Health scores zero, we report the structural failure instead of an aggregate.
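A minimal sketch of that aggregation, assuming the five dimensions arrive as a dict of floats. The key names and the function name are illustrative, not CitationDesk's internal identifiers.

```python
import math

# Hypothetical key names for the two hard-block dimensions.
HARD_BLOCK = ("dual_fit", "bot_crawl_health")


def citation_readiness_score(dims):
    """Return (score, blocked): a 0-100 score, or None plus the failing hard blocks."""
    blocked = [name for name in HARD_BLOCK if dims[name] == 0.0]
    if blocked:
        # Report the structural failure instead of an aggregate.
        return None, blocked
    # Geometric mean: the n-th root of the product of n dimension scores.
    geometric_mean = math.prod(dims.values()) ** (1 / len(dims))
    return round(100 * geometric_mean, 1), []


example = {
    "seo_foundation": 0.7,
    "geo_readiness": 0.7,
    "dual_fit": 0.7,
    "entity_coherence": 0.7,
    "bot_crawl_health": 0.7,
}
score, blocked = citation_readiness_score(example)
```

Because the mean is geometric, one weak dimension drags the total down more than it would under an arithmetic mean, and a zero in any dimension yields a total of 0.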

The scoring rubric is open. Below is every input, every threshold, every fail condition. We adapted it from the @geo specialist module of our internal AcePilot framework; the same module audits every public-facing change on our own fleet of ~30 sites before any deploy. We eat our own dog food: CitationDesk's own pages clear ≥0.7 on every dimension before shipping. You can verify this by running our own URLs through the Free tool.

SEO Foundation

Score 0.0 – 1.0

We audit your page against 13 minimum SEO items. Score = (items passing) / 13, rounded to the nearest 0.1.

Stable URL slug (no /p/12345)?
Unique <title> (50–60 chars, primary keyword early)?
Unique <meta description> (~155 chars)?
Single H1 matching title intent?
H2 structure reflecting sub-topics?
≥3 in-context internal links?
Schema.org markup (WebPage minimum + archetype)?
OpenGraph + Twitter Card complete?
Canonical URL declared?
hreflang declared (or single-locale signal)?
Mobile-perfect render at 375px?
Core Web Vitals targets (LCP <1.5s · INP <200ms · CLS <0.05 · TTFB <400ms)?
IndexNow ping on deploy?
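The pass-fraction rule for this checklist can be sketched as follows. The function name and item labels are hypothetical; the sketch only assumes each item resolves to pass or fail.

```python
def checklist_score(results):
    """Fraction of passing items, rounded to one decimal place."""
    passing = sum(1 for passed in results.values() if passed)
    return round(passing / len(results), 1)


# Hypothetical audit: 9 of the 13 items pass.
seo_results = {f"item_{i}": i <= 9 for i in range(1, 14)}
seo_score = checklist_score(seo_results)
```

With 9 of 13 items passing, the raw fraction is about 0.692, which rounds to 0.7.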

GEO Readiness

Score 0.0 – 1.0

The Aleyda Solis 10-characteristic checklist, applied page-by-page. Score = (items passing) / 10.

Accessible — content rendered in first-paint HTML (no JS-gated body)?
Useful — unique dataset / synthesis / POV not in training corpus?
Recognizable — Organization schema + same name + sameAs across the site?
Extractable — H2 headings are quote-ready sentences, DefinedTerm schema for key concepts?
Consistent — same voice, identity, claims across the site?
Corroborated — facts appear in ≥3 independent sources (Wikipedia, Reddit, LinkedIn)?
Credible — E-E-A-T signals (operator identity, methodology, citations)?
Differentiated — explicit POV in a /learn or methodology section?
Fresh — datePublished + dateModified + visible "Last verified"?
Transactable — user can act from the citation (live pricing, email capture)?
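As an illustration of the "Accessible" item, the sketch below checks whether a key phrase is present in the raw HTML a non-JS crawler would receive, ignoring anything inside script or style tags. The class and function names are assumptions for this example, and a real audit would likely need more than a single phrase match.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text outside tags whose content is never shown without JS."""

    SKIP = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def in_first_paint(raw_html, phrase):
    """True if the phrase appears in the text of the unrendered HTML."""
    parser = VisibleText()
    parser.feed(raw_html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    return phrase.lower() in text


static_page = "<html><body><h1>Pricing</h1><p>Plans start at $10.</p></body></html>"
gated_page = (
    '<html><body><div id="root"></div>'
    '<script>render("Plans start at $10.")</script></body></html>'
)
```

Here `in_first_paint(static_page, "plans start at $10")` is true, while the same phrase in `gated_page` exists only inside a script and is not counted.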

Dual Fit

Score 0.0 – 1.0 (0.0 = HARD-BLOCK)

The hardest dimension. Most pages target ONE of {keyword ranking, AI citation}. Dual-Fit pages target both. A 0.0 here means an orphan page that neither route surfaces, so we block the audit verdict pending a structural fix.

Does this page target a real search query a human would type?
Does it answer a fact-question an LLM would extract?
Are both surfaces clearly addressed in title + first 100 words?
Is there a quote-ready format (table, definition, list) the LLM can grab?

Entity Coherence

Score 0.0 – 1.0

LLMs build entity graphs from coherent signals. A scattered identity gets grouped as "miscellaneous"; a coherent entity gets cited as a named source.

Is the brand name + Organization schema consistent across every page?
Is there a Person schema for the operator/author with byline?
Is sameAs present on Person and Organization with ≥2 public profiles?
Is the /about page complete with operator identity + methodology + sameAs?
Are author URLs stable and indexable (not /author/?id=42)?
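The sameAs item above can be approximated by reading a page's JSON-LD blocks. This is a rough sketch under stated assumptions: it handles only top-level objects with a single string `@type` (no `@graph`, no type arrays), and the names `JsonLdExtractor` and `same_as_ok` are illustrative.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect parsed <script type="application/ld+json"> payloads."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                parsed = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                return  # Malformed block: ignore it in this sketch.
            self.items.extend(parsed if isinstance(parsed, list) else [parsed])


def same_as_ok(html, min_profiles=2):
    """Map Person/Organization to whether sameAs lists enough profiles."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    found = {"Person": False, "Organization": False}
    for item in parser.items:
        if not isinstance(item, dict) or item.get("@type") not in found:
            continue
        same_as = item.get("sameAs", [])
        if isinstance(same_as, str):
            same_as = [same_as]  # sameAs may be a single URL string.
        found[item["@type"]] = len(same_as) >= min_profiles
    return found


sample_html = """<html><head>
<script type="application/ld+json">
{"@type": "Organization", "name": "Example Org",
 "sameAs": ["https://a.example/org", "https://b.example/org"]}
</script>
<script type="application/ld+json">
{"@type": "Person", "name": "Example Author", "sameAs": ["https://a.example/me"]}
</script>
</head><body></body></html>"""
```

For `sample_html`, the Organization passes with two profiles and the Person fails with one.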

Bot-Crawl Health

Score 0.0 – 1.0 (0.0 = HARD-BLOCK)

The upstream gate. If bots can't crawl, no amount of GEO optimization matters. A 0.0 here means citation is structurally impossible — we block the audit verdict pending an infrastructure fix.

Does robots.txt allow GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Applebot-Extended, CCBot, Amazonbot, Bytespider, Meta-ExternalAgent?
If on Cloudflare, is AI Crawl Control = "Do not block (allow crawlers)"?
If on Cloudflare, is Manage robots.txt = "Disable robots.txt configuration"?
Is llms.txt present at site root with valid Anthropic + OpenAI + Perplexity-compatible content?
Does GPTBot return HTTP 200 on the audited URL?
Is the body content rendered in first-paint HTML (not JS-gated, since GPTBot and PerplexityBot don't execute JS)?
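The robots.txt item can be checked with Python's standard `urllib.robotparser`. This is a hedged sketch, not the audit's actual implementation: `blocked_crawlers` is a hypothetical helper, it inspects robots.txt text you have already fetched, and it says nothing about CDN-level blocking such as the Cloudflare settings above. The list uses `Google-Extended`, the token name Google publishes.

```python
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = [
    "GPTBot",
    "ClaudeBot",
    "PerplexityBot",
    "Google-Extended",
    "Applebot-Extended",
    "CCBot",
    "Amazonbot",
    "Bytespider",
    "Meta-ExternalAgent",
]


def blocked_crawlers(robots_txt, url):
    """Return the crawler tokens that robots.txt disallows for this URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_CRAWLERS if not parser.can_fetch(bot, url)]


sample_robots = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""
blocked = blocked_crawlers(sample_robots, "https://example.com/page")
```

With `sample_robots`, only GPTBot is reported as blocked. An empty result means every listed crawler is allowed by robots.txt alone.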

How we poll the four LLMs

The poll engine is the second half of the system — the part that asks each LLM your test queries weekly (Free), daily (Pro), or every four hours (Team) and records whether your site got cited.

Perplexity — primary API path (Sonar / Pro). Cited sources are returned structured.
Claude — Anthropic API for the canonical text; claude.ai authenticated session as fallback for the surfaces requiring browsing.
ChatGPT — chat.openai.com via authenticated browser automation. We log the model version, the query, the response, and any cited URLs.
Gemini — Vertex AI for the API path; gemini.google.com browser automation for the surfaces lacking API parity.

Provenance

Every recorded citation includes: timestamp · model + version · query text · cited URL · paragraph quoted (where extractable) · position in the response · competitors also cited. We do not synthesize or approximate — every row is a real LLM response captured at a real time. We expose the raw rows via CSV export (Pro) and API (Team).
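One way to picture a provenance row and its CSV export is sketched below. The field names, types, and the semicolon-joined competitors column are assumptions for illustration, not the actual export schema, and the sample values are placeholders.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class CitationRow:
    timestamp: str         # ISO 8601 capture time
    model: str
    model_version: str
    query: str
    cited_url: str
    quoted_paragraph: str  # empty when the quote is not extractable
    position: int          # rank of the citation within the response
    competitors: str       # other cited domains, semicolon-joined


def to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(CitationRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


# Placeholder values only; no real capture is shown here.
row = CitationRow(
    timestamp="2025-01-01T00:00:00+00:00",
    model="example-model",
    model_version="example-version",
    query="example query",
    cited_url="https://example.com/page",
    quoted_paragraph="",
    position=1,
    competitors="competitor-a.example;competitor-b.example",
)
csv_text = to_csv([row])
```

Keeping every field on one flat row lets a single record be checked against the original response without joining other tables.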

When the methodology updates

LLM-citation patterns change as models update and retrieval pipelines evolve. We rev the methodology when the underlying signal shifts, and we document every change with a dated entry on this page. Our calibration discipline comes from the AcePilot fleet's "learn-from-data" rule: every projection is logged against actuals, and multipliers self-adjust against per-archetype baselines after each ship.

Ready to run the audit on your site? Get your free Citation Readiness Score. Or learn more about the principles behind the score.