Why You Need an AI Bot Governance Plan (and How to Build One)

October 2, 2025
Morgan McMurray

Since OpenAI released ChatGPT to the public at the end of 2022, the online consumer journey has been turned on its head. Now, 60% of US consumers use AI to find information, and within platforms like ChatGPT and Perplexity, they can even move seamlessly from product discovery to purchase, converting without ever visiting a website. It’s a new opportunity for brands: with highly personalized and refinable AI chats, consumers can describe exactly what they want, which makes them more qualified to buy overall.

On the flip side, there’s always information you don’t want surfaced to consumers — outdated brand information, out-of-stock inventory or incorrect prices, sensitive information, and more. Plus, more AI bots looking for content on your website can tax your infrastructure, which comes with its own costs and consequences.

Unfortunately, you don’t have full control over how LLM-supported platforms surface your content in conversations with consumers — whether the tone is positive or negative, for instance, or if your products are positioned against competitors. 

What you can control are your rules around bot access to your website: which crawler bots are allowed to find your content, and what content is available for ingestion. If you haven’t yet grappled with the question of what AI traffic you want to keep or block, now is the time. 

Why you need an AI governance plan today

Building an AI governance plan is no longer optional. Whether you want your brand represented in generative AI results or actively want to keep it out, leaving incoming AI traffic unmanaged means:

  1. You’ll shoulder unnecessary infrastructure costs as bot traffic skyrockets and burdens your servers.
  2. You’ll lose the valuable control you have over your brand presence in AI search results.

Alternatively, blocking all AI traffic may cause you to lose out on opportunities to enhance visibility on AI platforms and in AI-augmented search results. Plus, if you aren’t feeding your brand information to AI models, you can bet your competitors are.

Here at Botify, our data underscores the urgency: humans are relying more and more on AI to start the consumer journey for them. Across our retail customers, crawls from AI bots rose from ~20 million per day in January 2025 to nearly 80 million per day just six months later, a nearly fourfold increase. At the same time, consumer behavior is changing, resulting in fewer click-throughs.

While fewer clicks occur (just 1–5%), the traffic that does arrive is three times more engaged and twice as likely to convert. Fewer visits from humans may be alarming, but the visitors who do click through are likely to be more valuable to your brand.

How to build your governance plan

Not all AI crawlers are equal. Some drive visibility and conversions; others may increase risk or dilute traffic. The right governance plan helps you decide which to allow, which to block, and how to prioritize your most important content.

1. Understand your bot traffic

Traditional search bots like Googlebot or Bingbot index content for search results. AI crawlers, however, have different goals that directly affect how your brand shows up in generative platforms.

Before you can outline which bots you want to allow, you’ll need to know which ones are actually crawling your site. Analyze bot behavior via log file analysis to see which AI crawlers are interacting with your content, then map those crawlers to their intent:

  • Training crawlers (e.g., GPTBot from OpenAI, ClaudeBot from Anthropic, Amazonbot, and PetalBot from Huawei) collect site content to teach large language models. Allowing them helps shape how AI “understands” your brand, but also means consumers may see summaries that use your content without credit, resulting in few or no click-throughs.
  • Live retrieval crawlers (e.g., ChatGPT-User, Perplexity-User, Bing AI bots) fetch fresh, real-time content and often provide linked citations for queries that go beyond their original training data.
  • Index-building crawlers (e.g., OAI-SearchBot, PerplexityBot) support platforms building their own search indexes, signaling AI’s shift toward independence from Google and Bing.

Other notable players include Bytespider (ByteDance/TikTok), CCBot (Common Crawl), and FacebookBot (Meta).
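As a rough starting point for that mapping, the sketch below tags User-Agent strings with the crawler categories described above and counts hits per site section. The function names, the sample records, and the simple substring matching are all assumptions made for illustration. User-Agent headers can be spoofed, so treat the output as a first pass and confirm it against each vendor's published crawler documentation.

```python
from collections import Counter

# Crawler names are the ones listed above. Substring matching on the
# User-Agent header is an assumption of this sketch, not a verification method.
BOT_CATEGORIES = {
    "training": ["GPTBot", "ClaudeBot", "Amazonbot", "PetalBot"],
    "live_retrieval": ["ChatGPT-User", "Perplexity-User"],
    "index_building": ["OAI-SearchBot", "PerplexityBot"],
    "other": ["Bytespider", "CCBot", "FacebookBot"],
}


def classify_user_agent(user_agent):
    """Return (category, crawler_name) for a known AI crawler, else None."""
    lowered = user_agent.lower()
    for category, names in BOT_CATEGORIES.items():
        for name in names:
            if name.lower() in lowered:
                return category, name
    return None


def summarize_hits(records):
    """Count hits per (category, crawler, top-level site section).

    `records` is an iterable of (user_agent, path) pairs extracted from
    your own server logs; no particular log format is assumed here.
    """
    counts = Counter()
    for user_agent, path in records:
        match = classify_user_agent(user_agent)
        if match is None:
            continue  # not one of the AI crawlers we track
        section = "/" + path.lstrip("/").split("/", 1)[0]
        counts[(match[0], match[1], section)] += 1
    return counts


# Made-up sample records, for illustration only.
SAMPLE_RECORDS = [
    ("Mozilla/5.0 (compatible; GPTBot/1.0)", "/products/trail-shoe"),
    ("Mozilla/5.0 (compatible; ChatGPT-User/1.0)", "/products/trail-shoe"),
    ("Mozilla/5.0 (compatible; Googlebot/2.1)", "/blog/post"),
]

if __name__ == "__main__":
    for key, hits in summarize_hits(SAMPLE_RECORDS).items():
        print(key, hits)
```

Grouping by top-level section keeps the summary small enough to read at a glance; a deeper breakdown would help answer how far into your pages each bot goes.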

A few questions to ask during your analysis:

  • Which bots are finding my content? What’s their purpose?
  • What parts of my site are they exploring? What are they ignoring (or blocked from)?
  • How deep into my pages are they going?

Your goal: begin to understand which bots are finding what content, and why. Once you have the lay of the land, you can begin to craft an actionable governance plan that directs bots to content you want them to find and restricts them from content you don’t.

2. Decide whether AI models should train on your content

Allowing models to train on your content can embed your brand’s expertise into AI systems, but it’s a long-term play: a model’s training data is typically already out of date by the time the model is released.

Some considerations: 

  • If you want your values, policies, and expertise to shape how AI describes your company, consider allowing training on things like brand content and/or editorial content.
  • If you’re concerned about data security, IP protection, or competitors piggybacking on your content, restrict training on those pages.
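One common way to express a training decision like this is through robots.txt rules, and it can be worth checking what a draft actually permits before deploying it. The sketch below uses Python's standard `urllib.robotparser` to test a draft. The paths, the example.com URLs, and the `can_crawl` helper are hypothetical, and the approach assumes the crawlers in question honor robots.txt, which is voluntary.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical draft: keep two training crawlers out of two sections,
# and leave everything else open to all other bots.
DRAFT_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /research/
Disallow: /pricing/

User-agent: ClaudeBot
Disallow: /research/
Disallow: /pricing/

User-agent: *
Allow: /
"""


def can_crawl(robots_txt, user_agent, url):
    """Return True if the draft rules let `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    checks = [
        ("GPTBot", "https://www.example.com/research/report"),
        ("GPTBot", "https://www.example.com/about/"),
        ("ChatGPT-User", "https://www.example.com/research/report"),
    ]
    for agent, url in checks:
        print(agent, url, can_crawl(DRAFT_ROBOTS_TXT, agent, url))
```

In this draft the training crawlers are kept out of the restricted sections while a live retrieval crawler can still reach them, which mirrors treating training and real-time visibility as separate decisions.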

If you decide to block training bots, be sure to prioritize any third-party content you have some control over, such as reviews, social profiles, and Wikipedia. These sources could be used to train models and inform things like brand knowledge (such as your history, values, industry, etc.), consumer sentiment, competitive comparisons, and more. 

3. Make a call on real-time visibility

Live retrieval is the only way to earn linked citations in AI search results, which is vital for both credibility and referral traffic. It’s also how AI platforms surface data like recent reviews, sales, current events, and other fresh content to consumers — information they wouldn't have otherwise. Decide whether you want AI platforms to surface your latest updates.

Consider: 

  • If accuracy and up-to-date info are critical (like pricing, inventory, and product launches), you might say yes.
  • If your revenue depends on traffic volume (such as ads or subscriptions), consider holding back.

4. Be intentional with content access

What you block can matter just as much as what you allow, and AI platforms won’t crawl everything equally. You want bots to find what matters most, meaning content that will build your authority and support the customer journey. To do so, try thinking in terms of tiers. For example:

  • Tier 1: Always allow. This could cover evergreen brand content, key product details, and resources that showcase expertise. Allowing these helps elevate brand stories, core product pages, and high-value editorial.
  • Tier 2: Allow cautiously. You might want to use caution with editorial content, depending on whether you can afford the click loss.
  • Tier 3: Block. Sensitive data, negative sentiment, and thin or duplicate content (such as irrelevant user-generated content or low-value pages) could be good candidates for this tier.

Adjust your tiers and priorities to align with your business outcomes, leveraging the bot analysis you conducted in step one to see which content is already getting attention from bots (and which isn’t).
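To make the tiers concrete, one option is to write them down as a mapping from URL path prefixes to tier names. The sketch below is a minimal illustration: the prefixes, the tier labels, and the `tier_for` helper are hypothetical, and the longest-prefix rule is just one reasonable way to resolve overlapping sections.

```python
# Hypothetical path prefixes mapped to the three tiers described above.
TIER_RULES = [
    ("/about/", "always_allow"),
    ("/products/", "always_allow"),
    ("/products/discontinued/", "block"),
    ("/blog/", "allow_cautiously"),
    ("/account/", "block"),
]

# Assumption: anything not listed is treated cautiously until reviewed.
DEFAULT_TIER = "allow_cautiously"


def tier_for(path):
    """Return the tier for a URL path; the longest matching prefix wins."""
    best_prefix = None
    best_tier = DEFAULT_TIER
    for prefix, tier in TIER_RULES:
        if path.startswith(prefix):
            if best_prefix is None or len(prefix) > len(best_prefix):
                best_prefix, best_tier = prefix, tier
    return best_tier


if __name__ == "__main__":
    for path in ["/products/trail-shoe",
                 "/products/discontinued/old-shoe",
                 "/account/orders",
                 "/press/2025"]:
        print(path, "->", tier_for(path))
```

Letting the longest prefix win means a narrow exception, such as a discontinued-products folder, can override the broader rule for its parent section without reordering the list.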

5. Prioritize platforms wisely

Once you know what kinds of bots you’ll allow and for which content, map those bots to priority platforms where your consumers spend the most time.

For example, retailers may benefit most from AI-supported product recommendations within platforms like Perplexity, or in Google AI Overviews (supported by Google Gemini). Brands catering to Gen Z might prioritize social platforms where their consumer base spends the most time, like TikTok search and Meta AI. 

Whichever platforms make your list, review them frequently, as AI search platforms emerge and evolve fast.

6. Benchmark your competitors on your chosen platforms

Once priority platforms are identified, you need to know how both your brand and your competitors’ brands are currently showing up on them.

Analyze how your high-intent keywords and prompts show up in AI platforms:

  1. What brands are being mentioned in the results?
  2. Are competitors showing up more than you?
  3. What kind of content gets cited?

Establish your benchmarks and continue to monitor competitor visibility to reveal any opportunity gaps.
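A simple benchmark for the first two questions is the share of AI answers that mention each brand. The sketch below assumes you have already collected answer text for your priority prompts by whatever means you use. The brand names and sample answers are made up, and plain substring matching will miss misspellings and brand aliases.

```python
from collections import Counter


def share_of_voice(answers, brands):
    """Return the fraction of answers that mention each brand.

    `answers` is a list of AI-generated answer texts and `brands` a list
    of brand names. Matching is case-insensitive substring search.
    """
    mentions = Counter()
    for text in answers:
        lowered = text.lower()
        for brand in brands:
            if brand.lower() in lowered:
                mentions[brand] += 1
    total = len(answers)
    return {brand: (mentions[brand] / total if total else 0.0)
            for brand in brands}


# Made-up answers and brand names, for illustration only.
SAMPLE_ANSWERS = [
    "Acme and Globex both make well-reviewed trail shoes.",
    "Globex is a popular pick for beginners.",
    "Many runners recommend Initech for wide feet.",
]

if __name__ == "__main__":
    for brand, share in share_of_voice(SAMPLE_ANSWERS, ["Acme", "Globex"]).items():
        print(f"{brand}: {share:.0%}")
```

Re-running the same prompts on a schedule and comparing the shares over time is one way to turn this into an ongoing benchmark.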

7. Build your bot list

With all of this analysis on hand and your allow/block decisions made, you can finalize your AI bot governance plan by specifying the AI bot crawlers that your rules will be applied to. 

Leverage free resources like those listed below to quickly note which bots belong to which platforms, categorize them based on purpose and priority, and list out the rules for each category:

  1. Cloudflare’s List of Verified Bots 
  2. Overview of OpenAI Crawlers
  3. Perplexity Crawlers
  4. Overview of Google Crawlers
  5. Overview of Bing Crawlers
  6. List of AI User Agents, Bots, & Search Crawlers
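Once the list exists, keeping it as structured data makes it easier to review and to turn into rules. The sketch below shows one possible shape, using crawler names and purposes mentioned earlier in this article. The `BotRule` fields, the specific allow/block choices, and the `/account/` path are hypothetical, and robots.txt is only one of several places such rules could be enforced.

```python
from dataclasses import dataclass


@dataclass
class BotRule:
    user_agent: str       # token the crawler identifies itself with
    platform: str         # who operates it
    purpose: str          # "training", "live_retrieval", or "index_building"
    disallow: tuple = ()  # path prefixes to block; empty means allow everything


# Example decisions only; your own plan determines the real values.
BOT_LIST = [
    BotRule("GPTBot", "OpenAI", "training", ("/",)),
    BotRule("OAI-SearchBot", "OpenAI", "index_building", ("/account/",)),
    BotRule("ChatGPT-User", "OpenAI", "live_retrieval"),
    BotRule("PerplexityBot", "Perplexity", "index_building", ("/account/",)),
]


def render_robots_txt(bot_list):
    """Render one robots.txt group per bot, with a comment noting its purpose."""
    groups = []
    for rule in bot_list:
        lines = [f"# {rule.platform}: {rule.purpose}",
                 f"User-agent: {rule.user_agent}"]
        if rule.disallow:
            lines.extend(f"Disallow: {prefix}" for prefix in rule.disallow)
        else:
            lines.append("Disallow:")  # an empty Disallow permits everything
        groups.append("\n".join(lines))
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    print(render_robots_txt(BOT_LIST))
```

Generating the file from one source list keeps the documented plan and the deployed rules from drifting apart as platforms add or rename crawlers.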

Pro tip: Remember, AI bots are generally unable to render certain content on pages, like dynamic content built with JavaScript. Make sure AI bots are seeing vitally important elements like reviews, prices, and more by pre-rendering pages with a bot management solution like SpeedWorkers.
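A quick way to spot this problem is to look at the raw HTML your server returns, before any JavaScript runs, and check whether the elements you care about are present. The sketch below is a minimal illustration that works on an HTML string you have already fetched. The helper name, the sample markup, and the required snippets are hypothetical.

```python
def missing_from_raw_html(raw_html, required_snippets):
    """Return the required snippets that do not appear in the raw HTML.

    A snippet that is missing here but visible in a browser is probably
    injected by JavaScript, so a non-rendering bot would not see it.
    """
    lowered = raw_html.lower()
    return [snippet for snippet in required_snippets
            if snippet.lower() not in lowered]


# Made-up page: the product name is in the markup, but the price is
# filled in later by a script.
SAMPLE_HTML = """
<h1>Trail Shoe</h1>
<div id="price"></div>
<script>renderPrice();</script>
"""

if __name__ == "__main__":
    print(missing_from_raw_html(SAMPLE_HTML, ["Trail Shoe", "$89.00"]))
```

Running a check like this on a handful of key templates (product, category, review pages) gives a rough picture of what needs pre-rendering.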

Licensing: A potential AI governance game-changer

Recently, large publishers and tech companies have pushed for a new standard as AI regulation lags behind: licensing. Really Simple Licensing (RSL) layers content licensing terms for AI tools on top of a site’s robots.txt file. It is designed to stop unauthorized scraping while ensuring fair compensation for creators.

If adopted widely, this could fundamentally change how brands interact with AI bots, ultimately shifting governance from tactical (allow or block) to contractual (license or deny). While it’s too soon to tell whether the strategy will take off, it’s worth watching.

Smart governance sets you up for success

An AI governance plan is about making deliberate and strategic choices, and no two brands should have the same one. Every decision, including whether you allow LLM training, prioritize live retrieval, or block specific content, should map directly back to your business goals. 

Each step of the process to build your nuanced AI governance plan needs thoughtful consideration. For help getting started, you can download our free AI Search Playbook, which includes a full questionnaire that your team can work through to make these decisions. 

By asking the right questions and following the steps above, you’ll build a governance plan that both protects your brand content from misuse and prioritizes brand visibility wherever your consumers search.

Want to learn more? Connect with our team for a Botify demo!