Why Reporting on LLM Traffic Is So Hard

AI: two little letters that have sent the world into an absolute tailspin. Everyone wants a piece of the artificial intelligence pie, and that’s especially true for marketing teams hoping to cash in on LLM visibility.

But there’s just one problem: capturing AI data is like trying to understand what a beach is like by collecting a few grains of sand.

Let’s explore this topic further and look at a few ways Geeky Tech helps our clients work with what’s available.

Everything’s an Educated Guess When You Don’t Have the Real Numbers

Tools like SE Ranking, Ahrefs, and OtterAI do provide data, but it might not be what you think it is. This is not a conspiracy or some pre-sales-pitch smack talk; it’s just the nature of how LLM data is available to the world, and one of the handful of reasons LLM traffic is so hard to report on concretely.

AI engines like ChatGPT, Perplexity, Claude, Gemini, and the entire cast of I, Robot generate ocean loads of data every single day. But it’s not available to the public.

That means that whatever data you’re looking at on your fancy tool dashboard is a product of:

  • The tool prompting the LLM for results
  • Referral traffic from Google Analytics 4
  • Prompting LLMs with common queries, then scraping the results for your brand’s citations/links, and extrapolating from those results
  • User surveys and self-reporting (if available) 

This isn’t nothing, but it’s not the full picture. There’s no LLM equivalent to Google Search Console, so these tools don’t know what search terms or prompts people are actually using. The best they can do is run common queries in which your brand might show up and scrape the answers. Of course, this is merely an educated guess.
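To make the "prompt, scrape, extrapolate" approach concrete, here is a minimal Python sketch. It assumes a tool has already collected a handful of LLM answers for common queries. The sample answers, the brand name "ExampleNet", and the `citation_share` helper are all hypothetical, and real tools are certainly more involved than this.

```python
import re

def citation_share(answers, brand):
    """Return the fraction of sampled answers that mention the brand."""
    if not answers:
        return 0.0
    pattern = re.compile(re.escape(brand), re.IGNORECASE)
    hits = sum(1 for text in answers if pattern.search(text))
    return hits / len(answers)

# Hypothetical answers scraped for a few common queries
sampled_answers = [
    "For rural broadband, ExampleNet and two national providers are often recommended.",
    "Most residents rely on national providers or satellite internet.",
    "Reviews frequently mention examplenet for its local support.",
    "Coverage varies a lot by village, so check availability first.",
]

share = citation_share(sampled_answers, "ExampleNet")
print(f"Brand appeared in {share:.0%} of sampled answers")
```

The result only describes the queries the tool chose to run, not the prompts real people typed, which is exactly why it remains an estimate.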

But that’s not the only reason you’re not seeing the full picture.

The Data We Do Have Is Only Half the Story

[Image: GA4 session sources from LLMs]

In the image above, you’re looking at referral traffic from GA4 with a filter on it. These numbers represent real people landing on your website from today’s most popular AI engines. But thanks to data privacy laws, you’re only seeing maybe 50% of the actual referral traffic, because that’s roughly the industry’s recognised cookie acceptance rate. Where’s the other 50%? Those are the people who have opted out of cookies.

So even when you’re looking at real traffic data, you still don’t have the full picture. 
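As a rough illustration of what that means for your numbers, the Python sketch below scales observed sessions up by an assumed acceptance rate. The 50% default comes from the figure quoted above. The function name and the sample session count are hypothetical, and the result is a ballpark, not a measurement.

```python
def estimate_total_sessions(observed_sessions, acceptance_rate=0.5):
    """Scale observed sessions up by the assumed cookie acceptance rate."""
    if not 0 < acceptance_rate <= 1:
        raise ValueError("acceptance_rate must be greater than 0 and at most 1")
    return observed_sessions / acceptance_rate

observed = 120  # hypothetical AI-referred sessions shown in GA4
print(estimate_total_sessions(observed))  # 240.0
```

If your own consent banner reports a different acceptance rate, pass that in instead of relying on the default.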

But there’s an even more obvious reason why it’s so hard to report on AI traffic.

Google Search Is Keyword-Based, AI Search Is Conversational

How does a person typically reach a landing page from a Google search? They enter the keyword, browse through the SERPs, click on a blue link, et voilà!  

How does a person reach a landing page from a ChatGPT query? That depends entirely on the nature of the conversation.

They may ask a brand-related question, like ‘what is the best ISP on the Isle of Skye?’ and click on the first landing page ChatGPT spits out. Or, they might take their time with it. They might open another window, browse Yelp reviews, look at comparison sites, ask their friends, and then head over to your landing page. Both end in a click, but you’ll never know the journey the user went on in the second example.

Then there are the conversations that start off about tangerines and end up about the French Revolution. For example, let’s say a user is thinking about moving to a remote location in the UK and they’re weighing their options. They ask Perplexity something like ‘Is the Isle of Skye a great place to live?’ which leads to a whole conversation about the Isle of Skye’s climate, school system, cost of living, scenery, etc. FINALLY, ten minutes later, the user asks, ‘But is the internet reliable there?’

If you had access to Perplexity’s search data, at what point would you, an internet service provider in the UK, start to care about where this conversation is going or how it started? From the very first question, which really had nothing to do with ISPs? Or from the reliability question, which doesn’t even mention the Isle of Skye? 

This is why reporting is so difficult. 

[Image: screenshot of a client’s LLM traffic report]

More Traffic Doesn’t Necessarily Mean You’re Doing a Good Job

Gosh, would you get a load of those numbers, huh? This client’s LLM traffic has more than doubled over the past year. This would be a straightforward win if we were looking at Google Search traffic, but since we’re talking about LLMs, we have to take it with a grain of salt because more people used LLMs in 2025 than in 2024.

So, is your website more visible, or are there simply more ChatGPT users? We can only hope that your sessions are a result of both. This is something we don’t have to worry about with Google Search traffic, since we assume that Google Search adoption is as big as it’s ever going to be.
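One hedged way to think about this is to compare your traffic growth against the growth of the LLM user base itself. The Python sketch below does that with placeholder figures. Reliable adoption numbers are not publicly available, so the `users_before` and `users_after` values are pure assumptions, as is the `adoption_adjusted_growth` helper.

```python
def adoption_adjusted_growth(sessions_before, sessions_after, users_before, users_after):
    """Ratio of your traffic growth to overall LLM user growth.

    Above 1.0 suggests you grew faster than the user base; below 1.0, slower.
    """
    traffic_growth = sessions_after / sessions_before
    adoption_growth = users_after / users_before
    return traffic_growth / adoption_growth

# All figures are hypothetical placeholders
ratio = adoption_adjusted_growth(
    sessions_before=400, sessions_after=1000,  # your LLM-referred sessions
    users_before=100, users_after=200,         # assumed index of overall LLM usage
)
print(round(ratio, 2))  # 1.25
```

In this made-up case, traffic grew 2.5x while usage grew 2x, so only part of the increase could be credited to better visibility.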

Now that we’ve waterboarded you with reality, here’s how marketing agencies like Geeky Tech are using the limited data available to gain actionable insights.

How Geeky Tech Works With the Data Available

Setting Up Tracking with Regular Expressions

First, we set up precise filtering in Google Analytics 4 using regex (short for regular expressions) to capture AI referrals from ChatGPT, Perplexity, Claude, Gemini, and Copilot, the biggest LLMs at the time of this writing, plus many other, lesser-known answer engines. This helps us get specific on AI traffic so we can start to see trends and referral patterns.

Regex that we use to filter for LLMs

chatgpt|openai|perplexity|gemini|claude|anthropic|copilot|grok|phind|felo\.ai|you\.com|wenxiaobai|mistral|meta\.ai|llama\.com|app\.writesonic|poe\.com|huggingface|chat\.forefront|kimi\.com
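If you want to see what this pattern catches outside of GA4, here is a small Python sketch that applies the same regex to a list of session sources. The sample sources and the `is_llm_source` helper are hypothetical, and the sketch assumes a partial, case-insensitive match, which may differ from the match type you pick in GA4.

```python
import re

# The filter pattern from above, compiled case-insensitively
LLM_SOURCES = re.compile(
    r"chatgpt|openai|perplexity|gemini|claude|anthropic|copilot|grok|phind"
    r"|felo\.ai|you\.com|wenxiaobai|mistral|meta\.ai|llama\.com"
    r"|app\.writesonic|poe\.com|huggingface|chat\.forefront|kimi\.com",
    re.IGNORECASE,
)

def is_llm_source(session_source):
    """Return True if any part of the session source matches the pattern."""
    return LLM_SOURCES.search(session_source) is not None

# Hypothetical session sources as they might appear in an export
sources = ["chatgpt.com", "google", "perplexity.ai", "copilot.microsoft.com", "bing"]
print([s for s in sources if is_llm_source(s)])
```

Because this is a partial match, short alternatives such as `you\.com` would also match an unrelated source like `thankyou.com`, so it is worth spot-checking what the filter lets through.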

Inference Backed By SEO Experience

This is where all our SEO experience really starts to pay off. Since we’re only ever seeing the full picture through a keyhole, we use session data to make educated inferences. For example… 

  • If traffic from Copilot is hitting your pricing page, we may humbly infer that the user made a branded query. If traffic is going to your home page, it would be safe for us to assume that the query was informational and not branded.
  • If many of the pages getting traffic from AI are your blog posts, then you’re likely creating optimised content, and you should keep going!
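The inferences above can be sketched as a simple lookup on the landing page. This Python example is illustrative only: the paths (`/pricing`, `/blog`), the labels, and the `infer_query_type` helper are assumptions, and the output is a guess about the query, never a record of it.

```python
def infer_query_type(landing_path):
    """Guess the likely query type from the landing page path (heuristic only)."""
    path = landing_path.lower().rstrip("/") or "/"
    if path.startswith("/pricing"):
        return "likely branded"
    if path == "/":
        return "likely informational"
    if path.startswith("/blog"):
        return "content-driven"
    return "unknown"

for page in ["/pricing", "/", "/blog/llm-traffic", "/contact"]:
    print(page, "->", infer_query_type(page))
```

Anything the rules don't cover is labelled "unknown" on purpose, so the sketch never claims more than the session data supports.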

A Wait and See Attitude

We may have burst your bubble about the quality of AI data you’ve been looking at. But even though AI traffic doesn’t give you the whole story, the fact that you’re even getting AI traffic in the first place is a pretty big deal.

Once your reporting is set up, the only thing left to do is to keep up with GEO (generative engine optimisation) practices and continue monitoring the metrics. This way, we can respond quickly to trends with more optimised content and technical improvements.

Final Thoughts

Because of data limitations and the conversational nature of AI search, AI traffic reporting is admittedly quite tricky. But like we said, if you’re getting AI traffic at all at this early stage of the AI revolution, consider this a win.  

At Geeky Tech, we’re committed to helping clients not just understand their reporting but capitalise on AI visibility. As this aspect of GEO evolves, so will our strategies.

About the Author
Genny Methot
Genny Methot is Geeky Tech’s storyteller. She heads up our social media content, blog posts, and the Geek Speak podcast.