URL Validation Testing and Whitelisting

Hi friends! I’m building link checking into my automated tests on Circle. We have a lot of links to drupal.org (references to modules), and that site ends up blocking the tests with a 403 response.

I’m wondering if others have encountered and resolved similar issues with sites that block bot crawling.

That’s probably a bot blocker on the Drupal domain. In general, you should try to avoid running automation against other people’s servers: it offloads costs onto the third party, and it is quite normal for them to try to block that.

You could:

  • Skip these links entirely by adding a blocklist to your link checker. This is appropriate if links to this domain rarely change and can be verified manually once.
  • If you really need to check these links automatically, add a delay in your link checker when hitting external domains (e.g. 1 second between requests). That makes your traffic look much less like a crawler.
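To make the two suggestions above concrete, here is a minimal sketch in Python using only the standard library. The `BLOCKLIST` contents and the `link-checker/1.0` user agent are hypothetical placeholders; adapt them to whatever checker you actually run.

```python
import time
import urllib.error
import urllib.request
from urllib.parse import urlparse

# Hypothetical blocklist: domains we skip entirely and verify manually.
BLOCKLIST = {"drupal.org", "www.drupal.org"}

# Seconds to wait between requests, so we look less like a crawler.
DELAY = 1.0

def check_links(urls):
    """Return a dict mapping each URL to its HTTP status or 'skipped'."""
    results = {}
    for url in urls:
        host = urlparse(url).hostname or ""
        # Match the domain itself and any subdomain of it.
        if host in BLOCKLIST or any(host.endswith("." + d) for d in BLOCKLIST):
            results[url] = "skipped"
            continue
        req = urllib.request.Request(
            url,
            method="HEAD",  # HEAD avoids downloading the body
            headers={"User-Agent": "link-checker/1.0"},
        )
        try:
            with urllib.request.urlopen(req, timeout=10) as resp:
                results[url] = resp.status
        except urllib.error.HTTPError as e:
            results[url] = e.code
        except urllib.error.URLError:
            results[url] = "error"
        time.sleep(DELAY)  # throttle external requests
    return results
```

Blocklisted URLs are reported as `"skipped"` without any network traffic, and everything else is checked with a 1-second pause between requests.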

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.