In late 2023, researchers at Princeton, Georgia Tech, the Allen Institute for AI, and IIT Delhi published a paper that coined a phrase the SEO industry has been grappling with ever since. "GEO: Generative Engine Optimization" by Aggarwal et al. introduced a formal framework for thinking about how content can be optimised not for traditional search engine rankings, but for visibility and citation in AI-generated responses.

The paper's core finding was both simple and significant: the attributes that make content rank well in Google are not the same attributes that make content get cited by generative AI systems. And in some cases, the divergence is sharp enough that content optimised for one can actively underperform in the other.

GEO vs SEO: What Is Actually Different

Traditional SEO operates on a retrieval model: a search engine crawls and indexes documents, then ranks them by relevance and authority signals (backlinks, on-page signals, user behaviour) in response to a query. The output is a ranked list. Your goal is to be near the top of that list.

Generative engine optimisation operates on a synthesis model: an AI system retrieves relevant documents (often via vector similarity or RAG), then synthesises a response that draws on those documents. The output is a generated answer with citations (Perplexity) or without (ChatGPT in some modes). Your goal is to be in the set of documents the model draws on and cites.

The mechanics that determine inclusion in a synthesised answer are different from the mechanics that determine position in a ranked list. Specifically:

Authority signals (backlinks) matter less for GEO than they do for SEO. The retrieval component of many generative systems leans on semantic similarity rather than link-based scores such as PageRank.
Content quality signals matter more. The synthesis layer (the LLM) tends to favour content that is coherent, specific, and factually dense when deciding what to incorporate.
Structural signals matter differently. Clear headings and structured content help LLMs extract specific claims; they matter for GEO in ways that go beyond their traditional SEO value.
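
To make the semantic retrieval step concrete, here is a minimal sketch in Python. It assumes a toy bag-of-words vector in place of a learned embedding model, and the function names (embed, cosine, retrieve) and sample documents are hypothetical. Production systems use dense embeddings and a vector index, but the ranking principle is similar: documents are ordered by their similarity to the query, and this sketch uses no link-based signal at all.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:k]


# Hypothetical documents, for illustration only.
documents = [
    "backlinks raise pagerank authority",
    "statistics improve citation rates in ai answers",
    "recipe for banana bread",
]
print(retrieve("ai citation rates", documents, k=1))
```

In this sketch the backlink-focused page never enters the candidate set for the query, because it shares no vocabulary with it. Real embeddings capture meaning rather than exact words, so the effect is less stark in practice.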

The 6 GEO Tactics That Empirically Increase Citation Rates

The Aggarwal et al. paper tested specific content interventions against a benchmark of 10,000 queries (GEO-bench) drawn from nine sources, and measured how visibly each source appeared in the generated response. These are not theoretical recommendations; they are empirically measured.

1. Include Statistics and Quantitative Data

Content that includes specific statistics, particularly cited statistics from credible sources, showed citation visibility improvements of up to 40% in the research. One plausible explanation is that generative models favour claims they can anchor to specific numbers, since a concrete figure leaves less room for error when the model synthesises an answer.

Practical implication: audit your key content pages and look for claims that can be replaced with specific figures. "Many companies use AI for content creation" becomes something like "67% of B2B marketing teams used AI content tools in 2025 (Content Marketing Institute)", provided the figure is real and traceable to its source. The latter is far more citable.
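
A first pass at this audit can be automated. The sketch below, in Python, flags sentences that use a vague quantifier and contain no digits. The word list and the function name flag_vague_claims are assumptions made for illustration, and the sentence splitting is deliberately naive, so treat the output as a list of candidates to review rather than a verdict.

```python
import re

# Hypothetical list of vague quantifiers; extend it for your own content.
VAGUE_TERMS = ("many", "most", "some", "several", "a lot of", "numerous", "often")

_VAGUE_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in VAGUE_TERMS) + r")\b",
    re.IGNORECASE,
)


def flag_vague_claims(text):
    """Return sentences that use a vague quantifier and contain no digit."""
    # Naive split on whitespace that follows ., ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    flagged = []
    for sentence in sentences:
        has_vague_term = _VAGUE_PATTERN.search(sentence) is not None
        has_number = re.search(r"\d", sentence) is not None
        if has_vague_term and not has_number:
            flagged.append(sentence)
    return flagged


page_text = (
    "Many companies use AI for content creation. "
    "Our survey covered 400 marketing teams."
)
for sentence in flag_vague_claims(page_text):
    print("Needs a figure:", sentence)
```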

2. Use Direct Quotations from Authoritative Sources

Pages that include quoted material from recognised experts or authoritative sources showed significantly higher citation rates than pages that paraphrased the same information. This mirrors how human writers research: a direct quote from a credible source is a higher-confidence signal than a paraphrase.

If you are writing about a topic where expert opinion is relevant, include actual quotes — attributed, specific, and sourced. Do not paraphrase when you can quote.

3. Write with Fluency and Precision

This one seems obvious, but the research quantified it: high-fluency content (grammatically correct, well-structured, precise language) was cited more frequently than equivalent content written with lower linguistic quality. A likely reason is that LLMs are trained largely on high-quality text and have learned to treat fluency as a proxy for reliability.

The implication for content strategy: copy editing is not just aesthetics. Vague, hedged, or grammatically loose content underperforms for GEO purposes even if it ranks fine in traditional search.

4. Provide Explicit Citations to Source Material

This is counterintuitive for SEO practitioners: linking out to sources improves your GEO performance. In traditional SEO, excessive outbound links are sometimes considered a dilution risk. In GEO, citing your sources makes your content more credible to the generative model synthesising a response.

One plausible reason: LLMs learned from well-cited academic and journalistic content, and they appear to associate citation density with reliability. Content that looks like a well-researched article, with explicit references, is more likely to be treated as trustworthy than content that reads like marketing copy.
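
Citation density has no standard definition, so the sketch below assumes one: outbound links per 1,000 words of page text. It is written in Python using only the standard library. The class name LinkCounter, the function name, and the domains are hypothetical, and the word count is rough because it includes any text inside script or style tags.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCounter(HTMLParser):
    """Counts words of text and links that point to other domains."""

    def __init__(self, own_domain):
        super().__init__()
        self.own_domain = own_domain.lower()
        self.outbound = 0
        self.words = 0

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        host = urlparse(href).netloc.lower()
        # Relative links have no host, so they count as internal.
        is_own = host == self.own_domain or host.endswith("." + self.own_domain)
        if host and not is_own:
            self.outbound += 1

    def handle_data(self, data):
        self.words += len(data.split())


def outbound_links_per_1000_words(html, own_domain):
    parser = LinkCounter(own_domain)
    parser.feed(html)
    parser.close()
    if parser.words == 0:
        return 0.0
    return 1000 * parser.outbound / parser.words


html = (
    '<p>One two three <a href="https://example.org/study">four</a> '
    '<a href="/about">five</a></p>'
)
print(outbound_links_per_1000_words(html, "mysite.example"))
```

A number like this is only useful for comparing your own pages against each other over time. The research does not suggest a target value.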

5. Use an Authoritative Tone

The research showed that content written in an authoritative, declarative tone performed better than equivalent content written tentatively, although the gain was smaller than for statistics, quotations, and citations. "Research shows X" outperforms "it could be argued that X might..." The model is trying to answer a question, and it tends to favour sources that state things clearly.

This does not mean being overconfident about uncertain claims — that can backfire through inaccuracy. It means expressing genuine expertise and certainty where it is warranted, rather than hedging everything.

6. Include Unique Research or Original Data

Content based on original research — proprietary surveys, unique data analysis, first-hand case studies — showed the highest citation improvement in the GEO paper, with some categories showing 80%+ uplift. This is the highest-effort but highest-return GEO investment.

The logic is clear: a generative model cannot report a finding that exists in only one source without drawing on that source. If your content is the only place a specific data point exists, any answer that includes the data point has to rely on you, and a system that shows citations has a strong reason to cite you. Exclusivity creates citation necessity.

GEO Does Not Replace SEO

It is important to be precise about the relationship between GEO and SEO. They are not competing strategies — they are complementary ones that optimise for different surfaces.

Traditional SEO still drives the majority of web traffic. Google search still works. For many categories and query types, organic search is the dominant discovery channel and will remain so. GEO optimises for a different and growing channel — AI-generated answers — that is increasingly important but not yet dominant in most verticals.

The tactical overlap is also real. Content that is high-quality, well-structured, specifically written, and backed by data tends to perform well in both channels. The divergence is at the margins: the specific investment choices (build more backlinks vs. commission original research; focus on on-page SEO vs. improve citation density) differ between the two frameworks.

Measuring GEO Performance

The challenge with GEO, relative to SEO, is measurement. You cannot check your "position" in an AI answer the way you check your Google ranking. AI systems do not expose consistent, queryable APIs for citation tracking.

The practical approach is systematic prompt testing. Define a set of queries that are relevant to your category — the questions your customers would ask an AI assistant when evaluating solutions like yours. Then test those prompts against ChatGPT, Perplexity, Gemini, and Claude on a regular cadence and record:

Whether your brand is mentioned
Whether you are cited with a link (for systems that provide citations)
What context surrounds the mention
Which competitors are mentioned in the same response

This data, tracked over time, gives you a GEO performance baseline and a way to measure the impact of your content investments.
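
The recording step can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: responses are collected separately (by hand or through whatever access each system offers) and stored as PromptResult records, and a mention is detected by a simple case-insensitive substring match. All names here (PromptResult, summarise, mention_context, Acme, Globex) are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class PromptResult:
    system: str                 # e.g. "Perplexity"
    prompt: str                 # the query that was tested
    response_text: str          # the full answer returned
    cited_urls: list = field(default_factory=list)  # empty if no citations shown


def mention_context(text, brand, width=60):
    """Return the text around the first brand mention, or None if absent."""
    index = text.lower().find(brand.lower())
    if index == -1:
        return None
    return text[max(0, index - width): index + len(brand) + width]


def summarise(results, brand, brand_domain, competitors):
    """Per-system mention rate, citation rate, and competitor mention counts."""
    stats = defaultdict(
        lambda: {"runs": 0, "mentions": 0, "citations": 0, "competitors": defaultdict(int)}
    )
    for result in results:
        entry = stats[result.system]
        entry["runs"] += 1
        text = result.response_text.lower()
        if brand.lower() in text:
            entry["mentions"] += 1
        if any(brand_domain.lower() in url.lower() for url in result.cited_urls):
            entry["citations"] += 1
        for competitor in competitors:
            if competitor.lower() in text:
                entry["competitors"][competitor] += 1
    return {
        system: {
            "mention_rate": entry["mentions"] / entry["runs"],
            "citation_rate": entry["citations"] / entry["runs"],
            "competitor_mentions": dict(entry["competitors"]),
        }
        for system, entry in stats.items()
    }


# Hypothetical records from one weekly run.
results = [
    PromptResult("Perplexity", "best tools?", "Acme and Globex are options.",
                 ["https://acme.example/guide"]),
    PromptResult("Perplexity", "top vendors?", "Globex is popular."),
]
print(summarise(results, brand="Acme", brand_domain="acme.example", competitors=["Globex"]))
```

Storing the raw response text alongside the summary is a deliberate choice: it lets you re-run the analysis later with a better matching rule, since substring matching will miss misspellings and may over-count short brand names.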

Surfaceable automates this tracking — running your defined prompt set across multiple AI systems weekly and tracking brand mention rates, citation frequency, and competitive benchmarks over time. Without systematic measurement, GEO improvement is difficult to validate and hard to prioritise against competing investments.

The Compounding Effect of GEO Investment

There is also a longer-term argument for GEO work: the effects of content quality improvements may compound over time. As LLMs are retrained on newer data, content that consistently performs well in retrieval and synthesis is likely to appear more often in AI training data, which could increase its weight in future model generations.

This means that the brands doing GEO work now are not just winning in current AI search — they are building a representation advantage in the training data that will power the next generation of models. The window to build that advantage, while most competitors are not yet thinking about it systematically, is still open.

Generative Engine Optimisation (GEO): The Research-Backed Guide