Internet robots such as Google’s may visit a website more frequently than the website can serve.
Visits this frequent can be detrimental to the website, with the potential to slow it down for its visitors.
The crawl delay is specified in the robots.txt file.
This file sits at the root of the website. It lets you tell Internet robots how you wish them to index your website. However, they may ignore your advice! This is particularly true of malicious robots, or perhaps a misconfigured new project.
The robots.txt file advises those robots which follow the standard which directories they are permitted to traverse and, the point we are interested in, how long to wait before the next visit.
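As a rough illustration of how a standards-following robot might apply such directory rules, here is a minimal Python sketch using the standard library's urllib.robotparser module. The rules, the ExampleBot name and the example.com URLs are all invented for illustration and are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: every robot is asked to stay out of /private/.
EXAMPLE_RULES = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/public/index.html",
                "https://example.com/private/report.html"):
        print(url, is_allowed(EXAMPLE_RULES, "ExampleBot", url))
```

A well-behaved robot would run a check like this before each request. A robot that ignores the standard simply never runs it.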
To configure the delay, add a Crawl-delay entry to the robots.txt file:
User-agent: Googlebot
Crawl-delay: 60
The number is how many seconds the robot should wait before it returns.
The example targets Google's robot and asks it to return after 60 seconds.
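To check that an entry like this parses as intended, the file can be read back with Python's standard urllib.robotparser module. This is only a sketch: the rules are supplied as a string rather than fetched from a live site, the Googlebot user-agent token is an assumption about how the robot identifies itself, and SomeOtherBot is a made-up name.

```python
from urllib.robotparser import RobotFileParser

# The example entry, held as a string so nothing is fetched over the network.
# "Googlebot" is assumed to be the token the robot matches against.
ROBOTS_TXT = """\
User-agent: Googlebot
Crawl-delay: 60
"""


def crawl_delay_for(robots_txt, user_agent):
    """Return the Crawl-delay in seconds for user_agent, or None if unset."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # crawl_delay() is available from Python 3.6 onwards.
    return parser.crawl_delay(user_agent)


if __name__ == "__main__":
    print(crawl_delay_for(ROBOTS_TXT, "Googlebot"))     # 60
    print(crawl_delay_for(ROBOTS_TXT, "SomeOtherBot"))  # None
```

A robot with no matching entry gets None, meaning the file asks nothing of it. Whether a given robot honours the value is still up to that robot.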
The problem with this approach is that robots only re-read the robots.txt file from time to time. It may be a few days before the file is next read, while you may be keen to slow the crawl rate sooner.
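To see why a change can take a while to be noticed, consider how a robot might cache the file. The sketch below assumes a hypothetical robot that re-reads robots.txt once a day. Real robots choose their own refresh schedule, so the one-day figure and the needs_refresh helper are illustrative assumptions only.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical refresh interval: re-read robots.txt once a day.
MAX_AGE_SECONDS = 24 * 60 * 60


def needs_refresh(parser, now=None, max_age=MAX_AGE_SECONDS):
    """Return True when the parser's copy of robots.txt is older than max_age."""
    if now is None:
        now = time.time()
    # mtime() is the time the rules were last loaded (0 if never loaded).
    return now - parser.mtime() > max_age


if __name__ == "__main__":
    parser = RobotFileParser()
    parser.parse(["User-agent: *", "Crawl-delay: 60"])
    loaded_at = parser.mtime()
    print(needs_refresh(parser, now=loaded_at + 60))               # False
    print(needs_refresh(parser, now=loaded_at + MAX_AGE_SECONDS + 1))  # True
```

Until a check like this returns True, the robot keeps working from its old copy, so a newly added Crawl-delay has no effect in the meantime.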
Alternatively, Google can be informed directly via its Webmaster Tools website.
Click on Search Console, at the top left below the Google logo. From the listed websites, select the one you wish to amend. This shows a page with current status summary graphs and a menu of options on the left.
At the top right, below your account link, click on the gear icon and select Site settings.
Within the second section: Crawl rate, click on the radio button for Limit Google’s maximum crawl rate.
This expands a section with a bar for choosing, between the lowest and highest values, the time delay in seconds between requests.
Make your selection and click on the Save button.
I think it's better to use the robots.txt file. All the robot configurations are in one place for easy reference, and there is no need to search for account login information to configure the settings across multiple providers. But do remember that the file is publicly readable, if that should be of concern.
Depending on the urgency, you can either inform Google directly via Webmaster Tools or configure robots.txt.