
We're delighted to feature our guest author and collaborator, Chris Long, here on the Botify blog to share his data analysis and conclusions. Chris Long is an SEO/AEO consultant and the co-founder of Nectiv, helping Series B and beyond companies improve their organic growth through search and LLMs.
Last summer, I shared a LinkedIn post detailing how marketers could analyze OpenAI’s crawl activity using log file analysis. At the time, I was a bit surprised at how many people reacted to it. It seemed like SEOs were very interested in knowing exactly how AI engines like ChatGPT interact with their site.
That got me thinking: who has lots of data on this topic? I reached out to Botify, the perfect partner to get this information, with one of the biggest log file datasets on the planet. It’s one of the best data sources out there to actually learn how ChatGPT is crawling your site.
Methodology
Access logs create a record anytime a bot or human visits your site. If Google, ChatGPT, or a person visits your site, a record is created and stored on your server. This is what’s known as a log file.
As part of their software, Botify parses and ingests log files for their clients. They work with some very large enterprise partners across retail & e-commerce, technology, publishing, travel, marketplaces, and more.
Exploring the data, I was able to get granularity into insights across fields like search or AI engines, bot user agents, industries, verticals, and more. The dataset spans well over 250 billion log files and counting.
We wanted to learn how ChatGPT’s crawlers specifically interact with the web as a whole. Looking at the period from November 2024 through March 2026, we analyzed roughly 7 billion different log files to gather insights into ChatGPT’s crawling behavior via the activities of three OpenAI bots: ChatGPT-User, GPTBot, and OAI-SearchBot.
Quick background on the crawlers
Before we begin, it’s important to understand the differences between OpenAI’s crawlers. There are three primary bots they use:
- ChatGPT-User: This bot represents a user-initiated action. If someone tells ChatGPT to look at a page or interact with an app, ChatGPT-User is the agent.
- GPTBot: This is the general crawler that OpenAI uses for its training data. If GPTBot is crawling your site, OpenAI could use that to improve the foundational knowledge of the models.
- OAI-SearchBot: This is OpenAI’s web searching crawler. When ChatGPT performs a search that isn’t user-initiated, this is the bot that crawls the web and sources the results. This is the one most associated with SEO and fan-out queries.
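If you want to try the same kind of breakdown on your own logs, here is a minimal sketch in Python. It assumes your access log uses the common "combined" format, where the user agent is the last quoted field on each line, and it simply looks for the three bot names inside that field. The function names and the sample lines are made up for illustration, and user agents can be spoofed, so treat the counts as approximate.

```python
import re
from collections import Counter

# Bot names from this article, matched as case-insensitive substrings.
OPENAI_BOTS = ("ChatGPT-User", "GPTBot", "OAI-SearchBot")

# Assumption: combined log format, so the user agent is the last quoted field.
LAST_QUOTED_FIELD = re.compile(r'"([^"]*)"\s*$')


def classify_user_agent(user_agent):
    """Return the OpenAI bot name found in a user agent string, or None."""
    lowered = user_agent.lower()
    for bot in OPENAI_BOTS:
        if bot.lower() in lowered:
            return bot
    return None


def count_openai_hits(lines):
    """Count log lines per OpenAI bot, skipping all other traffic."""
    counts = Counter()
    for line in lines:
        match = LAST_QUOTED_FIELD.search(line)
        if match is None:
            continue  # line does not end with a quoted field
        bot = classify_user_agent(match.group(1))
        if bot is not None:
            counts[bot] += 1
    return counts


# Made-up sample lines for illustration only (not real traffic).
SAMPLE_LINES = [
    '203.0.113.5 - - [01/Dec/2025:10:00:00 +0000] "GET /a HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.6 - - [01/Dec/2025:10:00:01 +0000] "GET /b HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; OAI-SearchBot/1.0)"',
    '203.0.113.7 - - [01/Dec/2025:10:00:02 +0000] "GET /c HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; ChatGPT-User/1.0)"',
    '203.0.113.8 - - [01/Dec/2025:10:00:03 +0000] "GET /d HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '203.0.113.5 - - [01/Dec/2025:10:00:04 +0000] "GET /e HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
]

if __name__ == "__main__":
    for bot, hits in count_openai_hits(SAMPLE_LINES).items():
        print(bot, hits)
```

In practice you would pass an open file handle instead of `SAMPLE_LINES`, since the function only needs something it can iterate over line by line.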
ChatGPT appears to be losing users
An interesting insight into the data comes from just looking at ChatGPT-User agent trends. When isolating the data to this agent, we can see a dramatic decline in log volume. Since December 2025, there has been a significant drop in user events:

Out of curiosity, I compared December 1st through the present date (March 14th, 2026) against the previous period. During that time, ChatGPT-User activity dropped by a staggering 28%!
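For anyone repeating this comparison on their own data, the period-over-period change is simple arithmetic. The sketch below shows one way to compute it in Python. The event counts are hypothetical and were picked only so the result lands on a 28% decline; they are not figures from the Botify dataset.

```python
def percent_change(previous, current):
    """Relative change from the previous period to the current one, in percent."""
    if previous == 0:
        raise ValueError("previous period has no events, so the change is undefined")
    return (current - previous) * 100.0 / previous


# Hypothetical event counts, chosen only to illustrate a 28% decline.
previous_period_events = 1_000_000
current_period_events = 720_000

if __name__ == "__main__":
    change = percent_change(previous_period_events, current_period_events)
    print(f"ChatGPT-User change: {change:.1f}%")
```

Both periods should cover the same number of days, otherwise the comparison mostly measures the difference in window length.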
A few interesting theories could be derived from fewer ChatGPT-User events:
1. Fewer people may be using ChatGPT
One potential explanation is that fewer people are simply using ChatGPT. Since it’s a fairly direct proxy of when users are interacting with the LLM, a smaller number of users would result in fewer events.
Other studies out there support this notion. For instance, SimilarWeb found that ChatGPT dropped from 86.7% traffic share in January 2025 to just 64.5% by January 2026. Similarly, Sistrix found that ChatGPT usage plateaued around late 2025 and showed usage also dropping.

2. OpenAI may be relying on cached resources
When discussing this trend with the folks at Botify, they shared another possibility. According to their team, it’s possible that the reason we’re seeing less ChatGPT-User traffic is actually because OAI-SearchBot is crawling more.
Assuming that OpenAI is building its own index, it doesn’t need to fetch pages in real time as often — it likely already has a fresh version stored. This is similar to how Gemini relies on Google’s index instead of crawling pages on demand when grounding responses. As a result, this may be an indication that OpenAI’s HTML web index is becoming even more comprehensive, rather than a drop in ChatGPT users or fewer conversations using web search.
OpenAI has tripled their crawl of the web since August 2025
Another very interesting discovery: something major happened when GPT-5 launched. Practically overnight, all three of OpenAI’s major crawlers saw rapid increases:

When segmenting just to the automated crawlers (GPTBot and OAI-SearchBot), we can see a massive difference before and after GPT-5. In total, OpenAI’s crawl of the web is estimated to have tripled since August 2025.

What about the connection to ChatGPT-User?
Initially, this makes sense, especially in connection with ChatGPT-User. If more people are using the platform post-GPT-5, wouldn’t its training bot and web crawler bot run more often as well?
However, remember that ChatGPT-User has dropped by 28%.

That hasn’t been the case with either OAI-SearchBot or GPTBot. Those both remain at all-time-high crawl activities.
It would appear from this data that ChatGPT-User and OAI-SearchBot / GPTBot are disconnected. Even with dramatic declines in ChatGPT-User, OpenAI’s automated bots are still crawling a lot more than they once did.
OAI-SearchBot has increased 3.5x since August 2025
Here’s the trend for OAI-SearchBot. It saw dramatic increases in August and continues to see all-time highs in terms of crawl activity:

Around the time that OpenAI launched GPT-5, there was much speculation around the community that ChatGPT would start leveraging search more. The great Dan Petrovic wrote an article about how GPT-5 Made SEO Irreplaceable. The idea was that OpenAI was sourcing more from web search and trained to be intelligent, not knowledgeable. As it turns out, that was 100% the case.
When comparing OAI-SearchBot activity post-GPT-5, there was a 3.5x increase in events. And that’s not a small number. In just Botify’s dataset, OAI-SearchBot jumped by 2.2 billion events.

Breaking the change down by industry, some types of sites saw much more relative growth in OAI-SearchBot activity than others.
In particular, Healthcare (740.94%) and Media / Publisher (701.91%) saw the largest increases:

It’s worth noting that none of the different verticals saw negative growth in OAI-SearchBot activity:
- Media / Publisher: +701.91%
- Healthcare: +740.94%
- Marketplaces: +215.56%
- Internet / Software / Techno: +204.76%
- Other: +311.19%
- Retail / eCommerce: +194.96%
- Travel: +29.81%
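Percent growth and "x" multiples are easy to mix up, so here is a small Python sketch that converts one into the other. It uses the per-vertical percentages listed above and assumes they are measured against the pre-GPT-5 baseline. A multiple of 3.5x corresponds to +250% growth, and +740.94% works out to roughly 8.4x. The function name is just for illustration.

```python
def growth_percent_to_multiple(growth_percent):
    """Convert percent growth (e.g. +250) into a multiple of the baseline (3.5)."""
    return 1.0 + growth_percent / 100.0


# Per-vertical OAI-SearchBot growth, copied from the list above.
VERTICAL_GROWTH_PERCENT = {
    "Media / Publisher": 701.91,
    "Healthcare": 740.94,
    "Marketplaces": 215.56,
    "Internet / Software / Techno": 204.76,
    "Other": 311.19,
    "Retail / eCommerce": 194.96,
    "Travel": 29.81,
}

if __name__ == "__main__":
    for vertical, pct in VERTICAL_GROWTH_PERCENT.items():
        multiple = growth_percent_to_multiple(pct)
        print(f"{vertical}: {pct:+.2f}% is about {multiple:.1f}x the baseline")
```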
GPTBot activity has increased 2.9x since August 2025
Not only did OpenAI start using web search more frequently after GPT-5, it also more aggressively pushed out its training crawler.

Botify’s dataset shows that GPTBot activity has increased by 2.9x since the release of GPT-5. That’s a delta of +1.8 billion events when comparing the before and after periods.

OpenAI now spends slightly more time searching than training
Another way we wanted to analyze the data was looking at the split between searching (OAI-SearchBot) and training (GPTBot). This is important because marketers might want to know which activity they need to optimize for. Do you spend more time trying to influence what ends up in the training set, or does OpenAI mostly rely on search?
We expressed OAI-SearchBot’s relative crawl against GPTBot as a ratio (OAI-SearchBot / GPTBot) and compared this before and after GPT-5.
- Before GPT-5: 0.95 OAI-SearchBot / GPTBot
- After GPT-5: 1.14 OAI-SearchBot / GPTBot
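The ratio itself is just OAI-SearchBot events divided by GPTBot events over the same window. Here is a minimal Python sketch of that calculation. The event counts are hypothetical and were chosen only so the output reproduces the 0.95 and 1.14 figures above; the real underlying counts are not published in this post.

```python
def search_to_training_ratio(search_events, training_events):
    """OAI-SearchBot events divided by GPTBot events for the same period."""
    if training_events <= 0:
        raise ValueError("training_events must be positive")
    return search_events / training_events


def describe(ratio):
    """Label a ratio: above 1 leans toward searching, below 1 toward training."""
    if ratio > 1:
        return "more searching than training"
    if ratio < 1:
        return "more training than searching"
    return "evenly split"


# Hypothetical counts, picked only to reproduce the ratios in the article.
PERIODS = {
    "Before GPT-5": (950, 1000),
    "After GPT-5": (1140, 1000),
}

if __name__ == "__main__":
    for label, (search, training) in PERIODS.items():
        ratio = search_to_training_ratio(search, training)
        print(f"{label}: {ratio:.2f} ({describe(ratio)})")
```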

After the release of GPT-5, OpenAI is now officially spending more time searching instead of training. This once again lines up with the notion that they’re trying to focus on intelligence and use the web as their knowledge source.
However, our findings show this varies a lot by industry. Some types of sites might get searched more, while others might be a good fit for training data.
To understand this, we looked at the “Relative Crawl Difference” of OAI-SearchBot / GPTBot over the last 3 months. Green results mean that OpenAI is searching more than training, while orange indicates more training instead of searching.
This shows us that whether OpenAI views your associated prompts as needing a “training” or a “searching” response highly depends on your industry. To be clear, this doesn’t mean that the other isn’t important. This just shows which of the two pathways OpenAI is more likely to choose for a given vertical.

If you are in Media / Publishing, OpenAI is searching your type of site much more aggressively than any other vertical (256% OAI-SearchBot / GPTBot crawl difference). Software / Internet sites lean slightly more towards searching, as well.
However, Healthcare (-50%) and Retail / E-commerce (-33%) are more training-focused. GPTBot was relatively more active on these sites.
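To make those percentages a bit more concrete, here is one plausible way to compute a relative crawl difference in Python. This is an assumption on my part about how to read the metric: it treats the value as OAI-SearchBot events minus GPTBot events, divided by GPTBot events. Under that reading, -50% would mean OAI-SearchBot made half as many requests as GPTBot. The event counts below are hypothetical and exist only to reproduce the percentages quoted above.

```python
def relative_crawl_difference(search_events, training_events):
    """Assumed definition: (search - training) / training, as a percentage.

    Positive values lean toward searching, negative values toward training.
    """
    if training_events <= 0:
        raise ValueError("training_events must be positive")
    return (search_events - training_events) * 100.0 / training_events


# Hypothetical (OAI-SearchBot, GPTBot) counts chosen to match the article.
VERTICALS = {
    "Media / Publishing": (356, 100),
    "Healthcare": (50, 100),
    "Retail / E-commerce": (67, 100),
}

if __name__ == "__main__":
    for vertical, (search, training) in VERTICALS.items():
        diff = relative_crawl_difference(search, training)
        leaning = "searching" if diff > 0 else "training"
        print(f"{vertical}: {diff:+.0f}% (leans {leaning})")
```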
Google still dwarfs OpenAI in web crawling
In case you’re wondering how OpenAI stacks up to Google, even with the 3x increase, it’s not even close. This graph showcases a comparison between these four crawlers:
- Googlebot Desktop + Smartphone
- OAI-SearchBot + GPTBot
You can see that the Google-based crawlers completely dwarf OpenAI’s:

In the last month, there were 18.2 billion Google events compared to 887 million OpenAI events. That means, as of now, ChatGPT represents about 4% of Google’s total crawl. Even compared to Bing’s 5.49 billion events, OpenAI only represents 14% of the crawl.
However, it’s important to note that the gap is closing. When looking at the same 30-day period in 2025, it was a much bigger difference. Google crawlers registered 15 billion crawl events, while OpenAI only registered a mere 207 million — 1.38% of Google’s total crawl.
In the span of a year, OpenAI has closed the gap from 1.38% to 4% of Google’s crawl events.
Conclusion
From this analysis, a few things are clear. OpenAI is searching and training more than it ever has before. However, direct usage of ChatGPT seems to be fluctuating, especially since winter 2025.
While OpenAI now uses search slightly more, this is highly dependent on your site. The only way to know for certain how AI systems are interacting with your site is to get under the hood and review your site’s own internal log files.
A huge thanks to the Botify team for their collaboration on this!
We'll be hosting a joint webinar on Thursday, May 21st at 11am ET / 8am PT to discuss these findings live and cover actionable takeaways based on the data. Register now to join the conversation live and go even deeper into the data surrounding AI bot behavior on some of the world's top companies' websites.



