Leading connectivity company Cloudflare has announced a feature to block AI crawlers from accessing website content without permission or compensation.
Crawlers have been a universal tool of the internet for decades, allowing platforms to “read” website content and use it for designated purposes. The most common use is indexing websites to power search engines. Without crawlers, the likes of Google Search could not function. But with the explosion of AI tools, and growing concern over how they source their learning material, AI bots have raised many questions and frustrations by running rampant without any means of control. Until now.
What are AI crawlers?
AI crawlers are automated bots that scan and extract vast amounts of data from websites and other online sources. Their primary purpose is to collect information to train large language models (LLMs) used by companies like OpenAI (for ChatGPT) and Google (for Gemini). They work by systematically “browsing” web pages, following links, and copying various types of content, including text, images, videos, tables, and code.
Unlike traditional web crawlers used by search engines (like Googlebot) primarily for indexing content to make it discoverable in search results, AI crawlers specifically gather data for machine learning and training AI models. This data helps AI models learn to generate more accurate, human-like responses, understand context, and process dynamic content. Some AI crawlers also perform “live retrieval” to pull real-time data for AI assistants to supplement their generated answers with current information.
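The “browse, follow links, copy content” loop described above can be sketched in a few lines of Python. This is a minimal illustration only, not how any particular AI crawler is built: the `SITE` dictionary is a hypothetical in-memory stand-in for a real website, and `LinkExtractor` and `crawl` are names made up for this example. A real crawler would fetch pages over HTTP and handle far more cases.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on one page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical in-memory "website": path -> HTML.
SITE = {
    "/": '<a href="/about">About</a> <a href="/blog">Blog</a>',
    "/about": '<p>About us</p> <a href="/">Home</a>',
    "/blog": '<a href="/blog/post-1">First post</a>',
    "/blog/post-1": "<p>Hello</p>",
}


def crawl(start, pages):
    """Breadth-first crawl: copy each page, then queue any links not yet seen."""
    seen = {start}
    queue = deque([start])
    collected = {}
    while queue:
        path = queue.popleft()
        html = pages.get(path)
        if html is None:
            continue  # dead link: nothing to copy
        collected[path] = html  # the "copying content" step
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return collected


if __name__ == "__main__":
    for path in crawl("/", SITE):
        print(path)
```

Starting from the home page, the sketch reaches every linked page exactly once, which is why a single crawler visit can end up copying an entire site.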
The rise of AI crawlers has sparked discussions about content ownership and compensation, as they collect data that AI models then use to generate responses, potentially reducing traffic to the original content sources.
How is Cloudflare stopping AI crawler bots?
Until Cloudflare's recent announcement, website owners had few reliable ways to prevent AI crawler bots from accessing their content. They could use permissions to restrict some kinds of access to a website, but these had limits. The example most website users will recognise is region locking of video content by country for licensing reasons (though many people use VPNs to work around it).
On 1 July 2025, Cloudflare launched tools for new and existing customers that let website owners decide which crawlers to allow, block AI crawler bots, and stop their content being used as training data.
Described as a permission-based model for the internet, the new system requires crawlers to identify themselves before permission to access is granted or refused. Existing website owners can opt in, while the setting is enabled by default for new accounts.
Once enabled, the new feature allows Cloudflare to block a range of AI crawler bots, including:
- Amazonbot (Amazon)
- Applebot (Apple)
- Bytespider (ByteDance)
- ClaudeBot (Anthropic)
- DuckAssistBot (DuckDuckGo)
- GoogleOther (Google)
- GPTBot (OpenAI)
- Meta-ExternalAgent (Meta)
- PetalBot (Huawei)
In addition, verified bots in AI-related categories such as AI Assistant, AI Crawler and Archiver, along with many others with similar behaviours, will also be blocked by the new feature.
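To show how crawler names like those listed above can be used in practice, here is a small Python sketch that builds a robots.txt file disallowing each of them and checks the result with the standard library's `urllib.robotparser`. This is not Cloudflare's mechanism: robots.txt is an advisory convention that crawlers may choose to ignore, whereas the Cloudflare feature blocks requests at the network level. The helper names `build_robots_txt` and `is_allowed` and the `example.com` URL are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Crawler names taken from the list above.
AI_CRAWLERS = [
    "Amazonbot", "Applebot", "Bytespider", "ClaudeBot", "DuckAssistBot",
    "GoogleOther", "GPTBot", "Meta-ExternalAgent", "PetalBot",
]


def build_robots_txt(blocked):
    """Disallow the whole site for each named crawler; allow everyone else."""
    lines = []
    for agent in blocked:
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    lines += ["User-agent: *", "Allow: /"]
    return "\n".join(lines)


def is_allowed(robots_txt, user_agent, url="https://example.com/article"):
    """Ask the standard-library parser whether this crawler name may fetch the URL.

    Pass the bare crawler name (e.g. "GPTBot"), not a full User-Agent header.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    robots = build_robots_txt(AI_CRAWLERS)
    for name in ["GPTBot", "ClaudeBot", "Googlebot"]:
        print(name, "allowed" if is_allowed(robots, name) else "blocked")
```

Search crawlers such as Googlebot are not in the list, so they fall through to the final allow-all rule, mirroring the article's point that search indexing is unaffected.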
What does this mean for your website?
Once the new Cloudflare feature is applied, AI crawler bots that feed machine learning, or that use your content and imagery for other purposes, will no longer be able to use your site as source material.
This will not hinder your visibility on search engines, affect traditional SEO or similar activity, or stop your site being indexed, because bots in the search engine category remain permitted even when the feature is enabled. However, if you seek visibility through search on AI platforms such as Google's AI Overviews and Gemini or ChatGPT, and have been benefitting from GEO (generative engine optimisation), that visibility may be lost, leaving room for competitors to take those positions.
The feature operates in addition to the existing general bot blocking system that Cloudflare users already manage.
Following the announcement, many major publishers, media and technology companies have welcomed the feature, hailing it as a stepping stone towards transparency between AI companies and creators within the content ecosystem.
Should you block AI crawler bots?
Whether to block AI crawler bots will depend on your business needs and operations. For publishers and creatives, the new Cloudflare feature will be invaluable for protecting intellectual property from AI training systems, which have been the target of many frustrations.
Alternatively, the new permission-based model can also allow website owners to be compensated for access through a pay-per-crawl structure. Cloudflare has shared more details of how this can be implemented, and of the flexibility available to meet business and potential partnership needs.
The method for blocking AI bots depends on your Cloudflare plan; however, the blocking feature is available to all users.
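The choice between allowing, blocking and charging a crawler can be pictured as a simple per-crawler policy. The Python sketch below is purely illustrative and does not use or represent Cloudflare's API: the `POLICY` table, the price, and the `decide` function are all hypothetical, and answering unpaid requests with the standard HTTP 402 Payment Required status is an assumption made for this example.

```python
from dataclasses import dataclass
from http import HTTPStatus

# Hypothetical policy a site owner might choose for each crawler.
POLICY = {"Googlebot": "allow", "ClaudeBot": "charge", "GPTBot": "block"}
PRICE_PER_REQUEST_USD = 0.01  # illustrative figure only


@dataclass
class Decision:
    status: HTTPStatus
    reason: str


def decide(crawler, offered_usd=None, default="block"):
    """Return the response a crawler would get under the hypothetical policy."""
    action = POLICY.get(crawler, default)
    if action == "allow":
        return Decision(HTTPStatus.OK, "allowed free of charge")
    if action == "charge":
        if offered_usd is not None and offered_usd >= PRICE_PER_REQUEST_USD:
            return Decision(HTTPStatus.OK, "paid access")
        # Assumption: unpaid requests are refused with 402 Payment Required.
        return Decision(HTTPStatus.PAYMENT_REQUIRED, "payment required")
    return Decision(HTTPStatus.FORBIDDEN, "blocked")


if __name__ == "__main__":
    print(decide("Googlebot"))
    print(decide("ClaudeBot"))
    print(decide("ClaudeBot", offered_usd=0.01))
    print(decide("GPTBot"))
```

Unknown crawlers fall back to the `default` action, which is set to "block" here to reflect a permission-first stance; a site owner could just as easily choose "allow" or "charge".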