The AI Search Glossary

Plain-English, citable definitions for the vocabulary of Generative Engine Optimization, AI search visibility, and large-language-model brand discoverability. Updated for 2026. Each entry is anchor-linkable.

Generative Engine Optimization (GEO)

Also known as: GEO, AI search optimization, LLM SEO

The practice of structuring a brand's content, citations, and structured data so large language models (ChatGPT, Claude, Perplexity, Gemini, Grok, DeepSeek) cite the brand accurately and frequently when users ask buyer-intent questions in its category.

GEO complements traditional search engine optimization rather than competing with it. Where SEO targets the ranking of a URL on a Google results page, GEO targets the probability that a brand is mentioned, recommended, or quoted inside an AI-generated answer. The two disciplines share many tactics (clear titles, structured data, third-party citations), but the optimization surface is different: instead of ten blue links, you are optimizing for the model's output distribution.

In practice, brands working on GEO focus on three levers: (1) being present in high-authority third-party sources that the models retrieve from, (2) publishing canonical, machine-readable summaries of their own products and brand (the role Citorial's Brand Hubs play), and (3) maintaining consistent factual signals — name, address, claim, category — across the open web.

AI Search Visibility

A measurement of how often, how prominently, and how accurately an AI engine mentions a brand when users ask category-relevant questions.

AI search visibility is the leading metric in GEO work. It generalizes "do I appear?" into a triplet of dimensions: presence (am I mentioned at all?), prominence (am I in the first sentence, last sentence, or buried in a list?), and accuracy (is what the model says about me correct?).

A useful visibility audit samples 100–1,000 buyer-intent prompts per category, executes them across multiple LLMs in parallel, and reports per-prompt presence/prominence/accuracy plus an aggregated share-of-voice score. Citorial audits are structured this way.
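
The presence and prominence dimensions can be scored per response with very little code. The sketch below is illustrative only: `score_response` and `VisibilityScore` are hypothetical names, brand matching is a plain case-insensitive substring search, and accuracy is deliberately left out because it needs a fact check against ground truth.

```python
from dataclasses import dataclass


@dataclass
class VisibilityScore:
    present: bool      # is the brand mentioned at all?
    prominence: float  # 1.0 = mentioned at the very start, near 0.0 = near the end


def score_response(answer: str, brand: str) -> VisibilityScore:
    """Score presence and prominence of `brand` in one AI-generated answer."""
    if not brand:
        raise ValueError("brand must be a non-empty string")
    position = answer.lower().find(brand.lower())
    if position == -1:
        return VisibilityScore(present=False, prominence=0.0)
    # Earlier mentions score higher; a mention at index 0 maps to 1.0.
    return VisibilityScore(present=True, prominence=1.0 - position / len(answer))


first = score_response("Acme is a solid pick for flat feet.", "acme")
absent = score_response("Try Foo or Bar instead.", "Acme")
```

A real audit would likely replace the substring search with entity matching (so that "Acme Corp" and "Acme" count as the same brand) and add a separate accuracy review.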

Answer Engine Optimization (AEO)

Also known as: AEO

A discipline closely related to GEO that focuses specifically on optimizing for "answer engines": products like Perplexity, Google AI Overviews, and Microsoft Copilot (formerly Bing Chat) that return a synthesized answer instead of a list of links.

AEO and GEO are often used interchangeably, but AEO usually emphasizes the user-facing answer surface, while GEO emphasizes the underlying generative model. In practice, work that wins one usually wins the other: structured data, clear summaries, strong third-party citations.

A key AEO tactic is to make sure that a brand's most-citable claims appear in indexable HTML — not only inside a PDF or behind a login — because answer engines tend to retrieve from open, crawlable HTML before they reach for anything else.

LLM Optimization (LLMO)

Also known as: LLMO

An umbrella term for any optimization work whose target is how a large language model represents or recommends a brand, product, or person.

LLMO is broader than GEO and AEO. It includes work that is not strictly "search" — for example, making sure a model that is asked to "draft a comparison table" includes a specific brand, or making sure a code-assistant model cites a specific library in code suggestions.

For e-commerce, the most common LLMO surface is buyer-intent recommendation: a model is asked "what's the best running shoe under $150 for flat feet?" — the brand wants to appear in the model's answer with correct framing.

Share of Voice (AI)

Also known as: SoV, AI share of voice

The percentage of relevant AI responses in which a brand is mentioned, normalized over the total set of category-relevant prompts probed.

In an audit that probes 100 prompts and finds your brand mentioned in 23 of them, your share of voice is 23%. The metric is most useful when it is computed alongside the SoV of the top 3–5 competitors, because the absolute number is hard to interpret without that baseline.

Share of voice should always be reported per LLM, not only as a single aggregate — most brands have noticeably different SoV in Perplexity than they do in ChatGPT, and the corrective actions are not always the same.
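
The arithmetic above, reported per LLM, can be sketched as follows. The function name, the engine labels, and the counts are made up for illustration; the only input assumed is one (engine, mentioned) pair per probed prompt.

```python
from collections import defaultdict


def share_of_voice(results: list[tuple[str, bool]]) -> dict[str, float]:
    """Return the percentage of probed prompts that mention the brand, per LLM.

    `results` holds one (llm_name, brand_mentioned) pair per probed prompt.
    """
    probed: dict[str, int] = defaultdict(int)
    mentioned: dict[str, int] = defaultdict(int)
    for llm, was_mentioned in results:
        probed[llm] += 1
        if was_mentioned:
            mentioned[llm] += 1
    return {llm: 100.0 * mentioned[llm] / probed[llm] for llm in probed}


# Illustrative data only: 100 prompts per engine, 23 and 41 mentions.
results = (
    [("chatgpt", True)] * 23 + [("chatgpt", False)] * 77
    + [("perplexity", True)] * 41 + [("perplexity", False)] * 59
)
sov = share_of_voice(results)
```

Running the same function on each competitor's mention flags gives the baseline that makes the absolute number interpretable.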

Buyer-intent prompt

A natural-language query that signals the user is in a buying mindset — comparing products, asking for recommendations, or evaluating specific brands.

Examples: "best running shoes for flat feet under $150", "is brand X worth it for sensitive skin?", "alternatives to brand Y for European shipping". These are the prompts that matter commercially — they are the ones where a citation directly maps to a potential sale.

A good GEO audit explicitly distinguishes buyer-intent prompts from informational prompts ("what causes plantar fasciitis?") because the optimization actions differ: buyer-intent visibility comes from review sites, structured product data, and brand-comparison content, while informational visibility comes from editorial content depth.

Brand Hub

A canonical, machine-readable profile page maintained for a brand on a third-party site, designed to be cited by AI search engines.

Brand Hubs (the term Citorial uses) are similar in spirit to a Wikipedia entry or a Crunchbase profile: a single URL that bundles a brand's name, description, structured data, references, FAQ, and sameAs links into a clean HTML page LLMs can read without JavaScript.

Maintaining a Brand Hub on a domain that LLMs already trust (and that has an explicit `robots.txt` opt-in for AI crawlers) can improve the brand's chance of being surfaced when a user asks a category question.

Retrieval-Augmented Generation (RAG)

Also known as: RAG

A pattern where a language model retrieves relevant documents from an external source at query time and grounds its answer in those documents, instead of relying purely on its training-time memory.

Most consumer-facing AI search products (Perplexity, ChatGPT Search, Google AI Overviews, Gemini) use a retrieval-augmented pipeline. They issue web queries, fetch a handful of pages, and pass them to the language model as context for the final answer.

GEO work largely targets this retrieval stage. If your content is the highest-quality source the retriever can find for a query, your brand ends up grounded into the answer — even if the underlying model has never seen your brand in its training data.
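
To make the retrieve-then-ground flow concrete, here is a toy sketch. It is not how any named product works: production retrievers use search indexes and learned rankers, whereas this one ranks by shared words. The function names, URLs, and page texts are all invented for the example.

```python
def retrieve(query: str, documents: dict[str, str], k: int = 2) -> list[str]:
    """Rank documents by how many query words they share; return the top-k URLs."""
    query_words = set(query.lower().split())

    def overlap(url: str) -> int:
        return len(query_words & set(documents[url].lower().split()))

    return sorted(documents, key=overlap, reverse=True)[:k]


def build_prompt(query: str, documents: dict[str, str], k: int = 2) -> str:
    """Assemble the retrieved pages into context for the language model."""
    sources = retrieve(query, documents, k)
    context = "\n".join(f"[{url}] {documents[url]}" for url in sources)
    return f"Answer using only these sources:\n{context}\n\nQuestion: {query}"


documents = {
    "https://example.com/acme": "acme running shoes for flat feet under 150 dollars",
    "https://example.com/news": "local weather report for the weekend",
    "https://example.com/review": "review of running shoes with arch support",
}
top = retrieve("running shoes for flat feet", documents)
prompt = build_prompt("running shoes for flat feet", documents)
```

The page that best matches the query ends up in the prompt, which is why content quality at the retrieval stage matters even for brands the model never saw in training.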

Structured data (Schema.org)

Also known as: schema.org, JSON-LD

Markup that explicitly labels parts of a page — for example, "this string is the brand name", "this paragraph is an FAQ answer", "this number is a price" — so search engines and LLMs can parse the page reliably.

The most common format for structured data on the web is JSON-LD embedded in a `<script type="application/ld+json">` tag inside the page's HTML. Schema types relevant to e-commerce include `Organization`, `Product`, `Offer`, `Review`, `FAQPage`, and `BreadcrumbList`.

AI search engines lean on structured data heavily when they need to extract a fact under tight latency budgets. A well-marked-up page is easier to cite than a visually-equivalent page that buries the same information in unmarked prose.
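
As an illustration of the JSON-LD format described above, the sketch below builds a `Product` with a nested `Offer` and wraps it in the script tag. The product, brand, and price are placeholders, and `json_ld_script` is a hypothetical helper, not part of any library.

```python
import json


def json_ld_script(data: dict) -> str:
    """Wrap a Schema.org dictionary in the script tag used for JSON-LD."""
    # Escape "</" so a string value can never close the script element early.
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical product; every value here is a placeholder.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Trail Runner",
    "brand": {"@type": "Brand", "name": "Example Brand"},
    "offers": {
        "@type": "Offer",
        "price": "129.00",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
    },
}
snippet = json_ld_script(product)
```

The resulting string can be placed in the page's HTML at build time so the markup is present in the initial response.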

sameAs

A Schema.org property used inside an Organization or Person entity to point to other URLs that refer to the same real-world entity — typically the brand's official social profiles, Wikipedia entry, or industry registries.

sameAs is one of the cheapest disambiguation wins in GEO. When the model encounters a brand name that could be a person, a product, and a software library, the sameAs graph tells it "the brand on this page is the same entity as @brand on X, the LinkedIn page at this URL, and the Wikipedia article at this URL". That removes ambiguity at retrieval time.
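
A sketch of what such a sameAs graph looks like in an `Organization` entity, together with a small sanity check. The profile URLs are placeholders and `invalid_same_as` is a hypothetical helper; the check assumes sameAs values should be absolute http(s) URLs rather than handles like "@brand".

```python
from urllib.parse import urlparse

# Hypothetical organization; the profile URLs are placeholders.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Brand",
    "url": "https://example.com",
    "sameAs": [
        "https://x.com/examplebrand",
        "https://www.linkedin.com/company/examplebrand",
        "https://en.wikipedia.org/wiki/Example_Brand",
    ],
}


def invalid_same_as(entity: dict) -> list[str]:
    """Return sameAs entries that are not absolute http(s) URLs."""
    links = entity.get("sameAs", [])
    if isinstance(links, str):  # sameAs may be a single URL instead of a list
        links = [links]
    bad = []
    for link in links:
        parts = urlparse(link)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            bad.append(link)
    return bad


problems = invalid_same_as(organization)
```

An empty `problems` list only says the links are well-formed; whether each URL really belongs to the brand still has to be confirmed by hand.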

Crawlable HTML

HTML whose meaningful content is present in the initial server response (or in pre-rendered static HTML) and does not require JavaScript execution to become visible.

Most LLM crawlers either do not execute JavaScript at all, or execute it under strict time/cost budgets. A single-page app that only renders content after a client-side fetch is, in practice, invisible to those crawlers.

Two effective fixes: pre-render the relevant pages to static HTML at build time (the approach the Citorial landing site uses), or run server-side rendering on every request. Either way, the goal is to have the brand-relevant text exist in the raw HTML response.
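
One way to test whether brand text exists in the raw HTML response is to parse that response without executing any scripts. The sketch below uses only Python's standard `html.parser`; the class name, function name, and the two sample pages are invented for the example.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text a non-JavaScript crawler would see in raw HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: list[str] = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def text_in_raw_html(html: str, phrase: str) -> bool:
    """True if `phrase` appears in the page text outside scripts and styles."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split())  # normalize whitespace
    return phrase.lower() in text.lower()


static_page = "<html><body><h1>Example Brand</h1><p>Trail shoes.</p></body></html>"
spa_shell = (
    '<html><body><div id="root"></div>'
    '<script>render("Example Brand")</script></body></html>'
)
```

Here `text_in_raw_html(static_page, "Example Brand")` is true, while the same check on `spa_shell` is false because the brand name only exists inside a script. In practice the `html` argument would be the body of a plain HTTP GET with no browser involved.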

hreflang

An attribute on HTML `<link rel="alternate">` elements (also expressible in HTTP headers or XML sitemaps) used to declare that a set of URLs are language or regional variants of the same content, so search engines and LLMs can pick the right one for each user.

Without hreflang, search engines often serve the wrong-language version of a page or, worse, treat the two versions as duplicate content. The link tag — `<link rel="alternate" hreflang="pt-BR" href="https://example.com/pt-br/page" />` — is small but materially important for any brand operating in more than one language.

On Citorial's Brand Hubs, every page that has a sibling translation emits hreflang links pointing at each sibling and an `x-default` pointing at the canonical English version.
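
A minimal sketch of generating such a set of tags, assuming a hypothetical `hreflang_links` helper and a mapping from language code to URL. It is not Citorial's implementation; the URLs reuse the example.com pattern from above.

```python
from html import escape


def hreflang_links(translations: dict[str, str], default_lang: str = "en") -> list[str]:
    """Build one alternate link per language plus an x-default entry."""
    links = [
        f'<link rel="alternate" hreflang="{escape(lang)}" href="{escape(url)}" />'
        for lang, url in sorted(translations.items())
    ]
    links.append(
        '<link rel="alternate" hreflang="x-default" '
        f'href="{escape(translations[default_lang])}" />'
    )
    return links


translations = {
    "en": "https://example.com/page",
    "pt-BR": "https://example.com/pt-br/page",
}
tags = hreflang_links(translations)
```

The same list is meant to be emitted on every language version, including a link to the page itself, since Google's guidance asks each version to list itself alongside its alternates.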

llms.txt

A proposed convention — a Markdown file served at `/llms.txt` — that gives LLM crawlers a curated catalog of the most useful, citable pages on a site.

Where `robots.txt` controls what crawlers may fetch, `llms.txt` is a positive signal: "here is what is worth fetching, summarized for you". The proposal is still emerging in 2026, but several large AI labs have signaled support, and the cost of adding the file is low.

Citorial's `/llms.txt` lists the homepage, brand directory, glossary, how-it-works, and blog, plus the Brand Hub URL pattern and a pointer to the sitemap.
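
For orientation, the sketch below renders a file in the shape the llms.txt proposal describes: an H1 title, a blockquote summary, then H2 sections of Markdown links with short notes. The helper name, site, and URLs are placeholders, and because the proposal is still evolving the layout should be checked against its current text before use.

```python
def render_llms_txt(
    title: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Render an llms.txt document from (name, url, note) link entries."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    return "\n".join(lines)


# Placeholder site; none of these URLs are taken from a real llms.txt file.
llms_txt = render_llms_txt(
    title="Example Brand",
    summary="Trail running shoes and gear, with citable product facts.",
    sections={
        "Core pages": [
            ("Homepage", "https://example.com/", "What the brand sells"),
            ("Glossary", "https://example.com/glossary", "Definitions of key terms"),
        ],
        "Optional": [
            ("Blog", "https://example.com/blog", "Long-form articles"),
        ],
    },
)
```

The output would be written to a file served at `/llms.txt` at the site root.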

robots.txt (AI crawlers)

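The same robots.txt convention search engines have used since the 1990s, now extended with user-agent tokens for AI training and inference crawlers (GPTBot, ClaudeBot, PerplexityBot, CCBot, etc.) and with control-only tokens such as Google-Extended and Applebot-Extended, which do not crawl on their own but tell Google and Apple whether already-crawled content may be used for AI.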

Most brands either ignore AI user-agents (leaving the default allow-all behavior in place) or block them across the board. Citorial recommends the opposite: explicitly opt in to AI crawlers for any content the brand actively wants cited.

The opt-in is the easiest first GEO action a brand can take. It costs nothing, ships in a single commit, and changes which crawlers are willing to fetch the brand's pages.
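
An explicit opt-in might look like the file generated below, which is then checked with Python's standard `urllib.robotparser`. The agent list, the `/account/` path, and the helper name are assumptions made for the example. Note that a crawler obeys only the most specific group that matches it, so the private-path rule has to be repeated inside every AI group.

```python
from urllib.robotparser import RobotFileParser

# Tokens taken from the list above; extend as needed.
AI_USER_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def opt_in_robots_txt(agents: list[str], private_path: str = "/account/") -> str:
    """Explicitly allow the listed AI crawlers while keeping one path off-limits."""
    blocks = [
        # Disallow comes first: Python's parser applies the first matching rule.
        f"User-agent: {agent}\nDisallow: {private_path}\nAllow: /"
        for agent in agents
    ]
    blocks.append(f"User-agent: *\nDisallow: {private_path}")
    return "\n\n".join(blocks) + "\n"


robots_txt = opt_in_robots_txt(AI_USER_AGENTS)

# Verify the rules locally before shipping the file.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
gptbot_public = parser.can_fetch("GPTBot", "https://example.com/brands/acme")
gptbot_private = parser.can_fetch("GPTBot", "https://example.com/account/settings")
```

With this file, `gptbot_public` is true and `gptbot_private` is false. The check only confirms what the rules say; whether a given crawler honors them is up to the crawler.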

Prompt coverage

The number of distinct buyer-intent prompts an audit probes against each LLM. Higher coverage gives more statistical confidence in the share-of-voice estimate.

Coverage of 100 prompts is the practical minimum for a small-category audit — enough to detect a meaningful presence gap but vulnerable to per-prompt noise. 300 prompts is a stable mid-tier, and 1,000 prompts is the gold standard for multi-brand benchmarking.

Citorial's three audit tiers correspond to these three coverage levels.
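
To see why coverage affects confidence, the sketch below computes a 95% Wilson score interval for a measured share of voice at the three coverage levels. It treats each prompt as an independent sample, which is a simplifying assumption (real prompt sets overlap in topic), and the 23% figure is reused from the share-of-voice example purely for illustration.

```python
import math


def wilson_interval(mentions: int, prompts: int, z: float = 1.96) -> tuple[float, float]:
    """Return the Wilson score interval (default 95%) for mentions / prompts."""
    p = mentions / prompts
    denom = 1 + z**2 / prompts
    centre = (p + z**2 / (2 * prompts)) / denom
    half = z * math.sqrt(p * (1 - p) / prompts + z**2 / (4 * prompts**2)) / denom
    return centre - half, centre + half


# The same 23% share of voice measured at three coverage levels.
for mentions, prompts in [(23, 100), (69, 300), (230, 1000)]:
    low, high = wilson_interval(mentions, prompts)
    print(f"{prompts:>5} prompts: {100 * low:.1f}% to {100 * high:.1f}%")
```

Under these assumptions, a 23% reading from 100 prompts is compatible with a true value anywhere from roughly 16% to 32%, while the same reading from 1,000 prompts narrows to roughly 20.5% to 26%.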
