GPTBot

OpenAI’s web crawler is named GPTBot. It can be identified by the following user agent token and full user-agent string.

User agent token: GPTBot
Full user-agent string: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)
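
As a minimal sketch of how a server might recognize this crawler, the check below looks for the GPTBot token in a request's User-Agent header. The function name is_gptbot and the constant FULL_UA are illustrative assumptions, not part of any OpenAI tooling. Because a User-Agent header can be set by any client, a match shows only that the request claims to be GPTBot.

```python
def is_gptbot(user_agent: str) -> bool:
    """Return True if the User-Agent header contains the GPTBot token."""
    # Case-insensitive substring match on the token documented above.
    return "gptbot" in user_agent.lower()


# Full user-agent string as documented above.
FULL_UA = (
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; "
    "compatible; GPTBot/1.0; +https://openai.com/gptbot)"
)

if __name__ == "__main__":
    print(is_gptbot(FULL_UA))
    print(is_gptbot("Mozilla/5.0 (X11; Linux x86_64) Firefox/120.0"))
```

Matching on the token rather than on the full string keeps the check working if the version number in the string changes.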

Functionality

Web pages crawled with the GPTBot user agent may be used to improve future AI models. OpenAI filters out pages that sit behind paywalls, are known to collect personal data, or contain text that violates its guidelines.

By granting GPTBot access to your website, you contribute to the refinement, effectiveness, and safety of AI models. The sections below explain how to restrict GPTBot’s access to your website.

Restricting GPTBot Access

To prevent GPTBot from accessing your website, add the following GPTBot rule to your website’s robots.txt:

User-agent: GPTBot
Disallow: /

Modifying GPTBot’s Access

To allow GPTBot to access only certain parts of your website, add GPTBot rules to your website’s robots.txt as follows:

User-agent: GPTBot
Allow: /directory-1/
Disallow: /directory-2/
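
Before deploying either set of rules, it can help to check how a standards-following parser reads them. The sketch below uses Python's standard-library urllib.robotparser to evaluate the two configurations from this page. The helper name gptbot_allowed and the example.com URLs are assumptions made for illustration, and the result reflects only how Python's parser interprets the rules, which may differ in edge cases from how a given crawler does.

```python
from urllib.robotparser import RobotFileParser


def gptbot_allowed(robots_txt: str, url: str) -> bool:
    """Return True if the given robots.txt text lets GPTBot fetch the URL."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("GPTBot", url)


# Rules that block GPTBot from the whole site.
BLOCK_ALL = """\
User-agent: GPTBot
Disallow: /
"""

# Rules that allow one directory and block another.
PARTIAL = """\
User-agent: GPTBot
Allow: /directory-1/
Disallow: /directory-2/
"""

if __name__ == "__main__":
    # example.com is a placeholder host used only for illustration.
    print(gptbot_allowed(BLOCK_ALL, "https://example.com/index.html"))
    print(gptbot_allowed(PARTIAL, "https://example.com/directory-1/a.html"))
    print(gptbot_allowed(PARTIAL, "https://example.com/directory-2/a.html"))
```

Paths that match no rule in the GPTBot group are treated as allowed by this parser, so the second configuration restricts only /directory-2/.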

IP Egress Ranges

The IP address ranges from which OpenAI’s crawler sends web requests are published on the OpenAI website.
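
Since a user-agent string alone can be spoofed, a site operator may also want to compare the source address of a request against the published ranges. The sketch below shows one way to do that with Python's standard-library ipaddress module. The ranges in EXAMPLE_RANGES are reserved documentation blocks used as placeholders, not OpenAI's actual ranges, and the function name ip_in_ranges is an assumption. Real values would need to be copied from the list OpenAI publishes.

```python
import ipaddress
from typing import Iterable

# Placeholder CIDR blocks reserved for documentation (RFC 5737, RFC 3849).
# These are NOT OpenAI's ranges; substitute the published values.
EXAMPLE_RANGES = ["192.0.2.0/24", "2001:db8::/32"]


def ip_in_ranges(ip: str, cidr_ranges: Iterable[str]) -> bool:
    """Return True if the address falls inside any of the CIDR ranges."""
    address = ipaddress.ip_address(ip)
    # An IPv4 address is never contained in an IPv6 network (and vice versa),
    # so mixed lists can be checked without special-casing.
    return any(address in ipaddress.ip_network(cidr) for cidr in cidr_ranges)


if __name__ == "__main__":
    print(ip_in_ranges("192.0.2.10", EXAMPLE_RANGES))
    print(ip_in_ranges("198.51.100.7", EXAMPLE_RANGES))
```

Combining this address check with the user agent token gives a stronger signal than either check alone.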