Learn how to measure your brand's AI visibility with the right metrics: presence rate, position score, and share of voice across ChatGPT, Claude, and Perplexity.
One of the most common conversations in marketing teams right now goes something like this: "We know AI search matters. We've started optimising for it. But how do we know if it's working?" The honest answer is that most teams cannot say — not because the data does not exist, but because they have not set up the right measurement framework.
Tracking AI visibility requires different metrics than traditional SEO. This article defines those metrics clearly, explains how to collect the data, and shows you how to turn it into actionable insights.
Traditional SEO measurement — rankings, organic traffic, click-through rate — does not capture what happens when a user gets their answer from ChatGPT, Perplexity, or Google AI Overviews without ever visiting a website.
When a user asks "what's the best tool for tracking SEO metrics?" and ChatGPT names three competitors and not your brand, that is a lost impression. It does not show up in your Google Search Console as a lost click. It does not appear in your analytics. It is invisible to your standard reporting.
This is the measurement gap that AI visibility tracking fills.
Definition: The percentage of relevant AI queries in which your brand is mentioned at all.
If you run 50 queries across ChatGPT, Claude, and Perplexity — queries that are relevant to your brand and category — and your brand is mentioned in 18 of them, your presence rate is 36%.
Presence rate is your headline metric. It answers the most fundamental question: "Does AI know we exist?" For most brands, this number is lower than they expect when they first measure it.
How to use it: Set a baseline, then track it monthly. Presence rate responds relatively slowly to optimisation efforts — you are fighting against what LLMs have learned from vast training corpora — but consistent effort in the right areas (entity building, third-party coverage, content structure) does move it over 3-6 months.
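The arithmetic is straightforward. As a minimal sketch — the log format here is hypothetical, one entry per query run with a boolean for whether the brand was mentioned:

```python
def presence_rate(results):
    """Percentage of logged query runs in which the brand was mentioned at all."""
    if not results:
        return 0.0
    return 100 * sum(r["mentioned"] for r in results) / len(results)

# The example above: 18 mentions across 50 relevant queries -> 36%.
log = [{"mentioned": i < 18} for i in range(50)]
print(presence_rate(log))  # 36.0
```

In a real tracker each entry would also carry the model, the query, and a timestamp, so the same log can feed every other metric in this article.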
Definition: When your brand is mentioned, how early in the response does it appear?
Being mentioned in position 1 (the first brand named) is significantly more valuable than being mentioned in position 4, buried after three competitors. Position score can be calculated as an average rank across all responses where you appear, or as a distribution (e.g. 40% of mentions are first, 35% second, 25% later).
How to use it: Brands with high presence rate but poor position scores are being mentioned as afterthoughts — "you might also consider X". The goal is to move from being mentioned to being the primary recommendation for specific use cases.
Position score tends to correlate with how specifically your brand is associated with a use case. If ChatGPT mentions you first for "AI visibility tracking tool" but fourth for "SEO tool", that tells you your association with the niche category is stronger than your association with the broader one.
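Both forms of position score — the average rank and the distribution — fall out of the same log. A sketch with hypothetical rank data matching the distribution example above:

```python
from collections import Counter

def position_score(positions):
    """Average rank across responses where the brand appears (lower is better)."""
    return sum(positions) / len(positions)

def position_distribution(positions):
    """Fraction of mentions at each rank, e.g. {1: 0.4, 2: 0.35, 3: 0.25}."""
    counts = Counter(positions)
    return {rank: counts[rank] / len(positions) for rank in sorted(counts)}

# Hypothetical: recorded ranks for 20 responses that mentioned the brand.
ranks = [1] * 8 + [2] * 7 + [3] * 5
print(position_score(ranks))         # 1.85
print(position_distribution(ranks))  # {1: 0.4, 2: 0.35, 3: 0.25}
```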
Definition: Across your competitive set, what proportion of AI mentions belong to your brand?
If you and four competitors collectively get mentioned 200 times across a set of relevant queries, and your brand accounts for 60 of those mentions, your share of voice is 30%.
How to use it: Share of voice contextualises your presence rate. A 30% presence rate sounds low in isolation, but if the category leader has 35% and everyone else is under 15%, you are well-positioned. Conversely, a 50% presence rate is concerning if a competitor is at 80%.
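Using the numbers from the example above — 200 total mentions across the set, 60 of them yours — the calculation is a one-liner. The competitor names and counts here are hypothetical:

```python
from collections import Counter

def share_of_voice(mention_counts, brand):
    """Brand's percentage of all mentions across the competitive set."""
    total = sum(mention_counts.values())
    return 100 * mention_counts[brand] / total if total else 0.0

# Hypothetical mention tallies for you and four competitors (200 total).
mentions = Counter({"you": 60, "comp_a": 55, "comp_b": 40, "comp_c": 25, "comp_d": 20})
print(share_of_voice(mentions, "you"))  # 30.0
```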
Track share of voice using the same query bank and schedule as presence rate, counting mentions for your brand and each named competitor so the numbers stay directly comparable month to month.
Definition: When your brand is mentioned, how is it framed?
Not all mentions are equal. "Surfaceable is a strong choice for businesses serious about AI visibility" is materially different from "Surfaceable is one option, though some users report a steep learning curve."
Sentiment can be scored on a simple scale: positive, neutral, or negative. More sophisticated approaches score the degree of positivity or negativity and track what specific attributes are mentioned positively (ease of use, accuracy, support) versus negatively.
How to use it: Negative sentiment in AI responses often reflects patterns in your training data — reviews, forum discussions, comparison articles. Address the underlying issues, then generate fresh positive coverage to shift the balance over time.
Definition: How many of your target queries do you appear in at all?
This is the breadth dimension of your AI visibility. You might have a strong presence rate for branded queries ("what is [your brand]?") but low coverage for category queries ("what are the best [tools for X]?"). Query coverage reveals your blind spots.
Build a query bank of 50-100 relevant queries spanning branded, category, use-case, comparison, and competitor questions, and measure how many of them surface your brand at all.
For small teams or initial baseline measurement, you can manually run queries through ChatGPT, Claude, Perplexity, and Gemini, logging whether your brand is mentioned and where. A spreadsheet works.
The limitations are obvious: it is time-consuming, hard to do at scale, and suffers from LLM response variability (the same query can produce different results on different runs, requiring multiple runs per query to get a reliable signal).
Platforms like Surfaceable automate this process: you define your brand, competitors, and query bank; the platform runs queries across multiple AI systems on a regular schedule; and you get a dashboard showing your presence rate, position score, share of voice, and sentiment over time — with trends and competitive benchmarks.
This removes the manual overhead and provides consistent, comparable data. It also handles the variability problem by running queries multiple times and aggregating results.
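One way to picture that aggregation — a sketch of the general technique, not Surfaceable's actual implementation — is to collapse repeated runs of each (model, query) pair into a mention fraction:

```python
from collections import defaultdict

def aggregate_runs(run_log):
    """Collapse repeated runs of the same (model, query) pair into a
    mention fraction, smoothing over LLM response variability."""
    grouped = defaultdict(list)
    for run in run_log:
        grouped[(run["model"], run["query"])].append(run["mentioned"])
    return {
        key: sum(flags) / len(flags)  # e.g. 0.6 = mentioned in 3 of 5 runs
        for key, flags in grouped.items()
    }

# Hypothetical: the same query run five times against one model.
log = [{"model": "chatgpt", "query": "best ai visibility tools", "mentioned": m}
       for m in (True, True, False, True, False)]
print(aggregate_runs(log))  # {('chatgpt', 'best ai visibility tools'): 0.6}
```

A per-query fraction is a more stable signal than a single yes/no, and it rolls up naturally into the presence rate and share of voice figures discussed earlier.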
The quality of your AI visibility data is only as good as the queries you track. Build your query bank around:
Category queries — queries a potential customer would ask when exploring your category without knowing your brand exists. "What are the best tools for tracking AI search visibility?" "How do I measure my brand's presence in ChatGPT?"
Use-case queries — queries framed around the problem your product solves. "How can I tell if my brand appears in AI answers?" "What's the best way to monitor my share of voice in LLM responses?"
Comparison queries — queries comparing your brand or category to alternatives. "ChatGPT vs Perplexity for brand research." "[Your brand] alternatives."
Competitor queries — queries specifically asking about competitors. Monitoring these tells you how competitors are being positioned relative to you.
Aim for a query bank of at least 50 queries across these categories for a meaningful baseline. Larger banks (100-200 queries) give you better statistical reliability and more granular insights by segment.
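In practice the query bank can live in something as simple as a keyed structure. The entries below are illustrative, reusing example queries from this section:

```python
# Hypothetical query bank, grouped by the categories described above.
QUERY_BANK = {
    "category": [
        "What are the best tools for tracking AI search visibility?",
        "How do I measure my brand's presence in ChatGPT?",
    ],
    "use_case": [
        "How can I tell if my brand appears in AI answers?",
        "What's the best way to monitor my share of voice in LLM responses?",
    ],
    "comparison": [
        "ChatGPT vs Perplexity for brand research",
        "[Your brand] alternatives",
    ],
    "competitor": [
        "What is [Competitor X] best known for?",
    ],
}

total = sum(len(queries) for queries in QUERY_BANK.values())
print(total)  # 7 -- scale each list up to reach the 50-200 query range
```

Keeping the categories explicit makes it easy to segment results later — e.g. strong presence on branded queries but weak presence on category queries is a common and actionable pattern.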
Once you have the data, reporting it clearly to stakeholders is its own challenge. The most effective format is a concise monthly summary: the headline metrics (presence rate, position score, share of voice, sentiment) with month-over-month movement, a competitive benchmark, and the optimisation actions taken during the period.
The harder challenge is connecting AI visibility to revenue. This is still an evolving area, but early approaches include tracking referral traffic from AI platforms in your analytics, adding "how did you hear about us?" fields to signup flows, and correlating movements in AI visibility with branded search volume and pipeline.
As AI search matures, the attribution infrastructure will improve. For now, treat AI visibility as a strategic metric — one that matters even before you can precisely quantify its revenue impact.
AI visibility measurement is no longer optional for brands serious about their digital presence. The framework is straightforward: define your query bank, establish your baseline metrics (presence rate, position score, share of voice, sentiment), track them consistently, and tie your optimisation actions back to movement in those metrics.
Tools like Surfaceable make this process scalable without requiring you to manually interrogate AI systems every week. Set up your tracking, establish your baseline, and start generating the data that makes AI visibility a measurable, improvable discipline — not just a vague strategic concern.
Try Surfaceable
See how often ChatGPT, Claude, Gemini, and Perplexity mention your brand — and get a full technical SEO audit. Free to start.
Get started free →