Ahrefs and robots.txt

Nevil
Shopify Partner

I just figured out that Ahrefs (the SEO tool) cannot crawl any Shopify website because of certain (API?) call limits. We use this tool to see the backlinks and internal link value of certain pages, so that we can optimize them.

The error Ahrefs gets from Shopify is: HTTP/1.1 429 (Too Many Requests)

This is probably because of a Shopify limit, and according to Ahrefs it can only be solved by adding the following text to our site's robots.txt file:

User-agent: AhrefsBot 
Crawl-Delay: 1

But Shopify doesn't allow store owners to change robots.txt, nor will they do it for you. As far as I can tell right now, this means Ahrefs cannot crawl any Shopify site.

Is there any way Shopify can add this to the robots.txt? Or is there another way around this?

Any help would be appreciated!
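
For reference, here is a minimal Python sketch of how a crawler that honors robots.txt could read the Crawl-Delay rule quoted above. It uses the standard library's urllib.robotparser. The helper name crawl_delay_for is made up for illustration, and whether AhrefsBot interprets the directive exactly this way is an assumption.

```python
from urllib.robotparser import RobotFileParser

# The two lines Ahrefs asked for, as quoted in the post above.
ROBOTS_TXT = """\
User-agent: AhrefsBot
Crawl-Delay: 1
"""


def crawl_delay_for(robots_txt, user_agent):
    """Return the crawl delay (in seconds) declared for user_agent, or None.

    crawl_delay_for is a hypothetical helper name, not part of any Ahrefs
    or Shopify API.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


if __name__ == "__main__":
    print(crawl_delay_for(ROBOTS_TXT, "AhrefsBot"))
    print(crawl_delay_for(ROBOTS_TXT, "SomeOtherBot"))
```

A bot with no matching rule gets no delay back, which is why the rule has to name AhrefsBot (or use a wildcard) to have any effect.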

Replies 10 (10)
Sberry
New Member

Hi there,

I noticed this post by Nevil but I think Ahrefs has updated their site crawler since then. I have yet to try out a free account to verify this but here's what I was told by customer service:

Crawl speed is determined by both the number of parallel requests our crawler does and the delay between each request. By default, we send one request and wait 2 seconds after the URL is crawled. To increase crawl speed, you can increase the number of parallel requests and reduce the delay between them.

This can be adjusted in Crawl settings tab for each of your SA projects:

(see my image attachment)

Can any other Shopify users speak to this? It appears you can manually slow down the crawler to get around the API timeout issue.

Also, can any other Shopify users comment on their experience with Ahrefs? So far it seems like our top choice for an all-in-one SEO tool.
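
As a rough illustration of the support reply quoted above, the sketch below estimates request throughput from the two settings it mentions. The formula (each parallel worker fetches one URL, then waits the configured delay) and the 0.5-second fetch time are assumptions for illustration, not Ahrefs' documented behavior.

```python
def estimated_requests_per_second(parallel_requests, delay_s, fetch_s=0.5):
    """Approximate crawl rate if each worker fetches a URL and then waits.

    fetch_s is an assumed average time to download one URL.
    """
    if parallel_requests < 1:
        raise ValueError("parallel_requests must be at least 1")
    if delay_s < 0 or fetch_s <= 0:
        raise ValueError("delay_s must be >= 0 and fetch_s must be > 0")
    return parallel_requests / (fetch_s + delay_s)


if __name__ == "__main__":
    # Default described by support: one request, then a 2-second wait.
    print(estimated_requests_per_second(1, 2.0))
    # More parallel requests and a shorter delay crawl faster.
    print(estimated_requests_per_second(5, 1.0))
```

Under these assumptions, raising the delay or lowering the number of parallel requests is what brings the crawl rate down.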

Richard_Russell
New Member

I thought I would bump this thread with an update.

So, I am now stuck in the middle of this issue. I use both Shopify and Ahrefs every day, so getting Ahrefs to crawl my site is important for getting accurate data.

I spent two hours on May 1, 2019 chasing this issue back and forth between the companies. I exhausted the issue on the Shopify side and am fairly confident that the issue is with Ahrefs. I then spent a lot of time on the chat service (Ahrefs support is not the best) literally arguing with the rep there, who was pointing the finger at Shopify. When I supplied the API document to them, they quickly ended the chat, saying they would take a look at it.

The issue seems to be that the Ahrefs API setup is wrong and they are hammering the Shopify platform, which triggers the 429 errors we all see. I have four other robots crawling my site with no issue. Only Ahrefs has this problem.

Hopefully someone there will take the time to update their API so that it actually works.

Richard_Russell
New Member

May 8, 2019: dropping in an update. This has become a frustrating nightmare for me.

I found a really good guy at Ahrefs who has been working with me to resolve this.

The issue is 100% Shopify's fault. Here is the issue: if you have any monthly plan below the Shopify Plus package, you are subject to lower API limits. The most common package (the one most subscribers are on) has a leak rate of only 2. The higher plans have a leak rate of 4. A leak rate of 4 works fine with the Ahrefs API; a leak rate of 2 does not. It is Shopify's decision not to allow a leak rate of 4 for all subscribers.

What does this mean if you are a paying subscriber with Ahrefs? They use crawl data as part of their calculations for website data. Without the crawl, 100% of your data is inaccurate, meaning you are paying for a service that is giving you the wrong data. Additionally, your site will be ranked significantly lower in their calculations than it should be.

Because Shopify does not allow access to the robots.txt file, there is no fix for the issue unless Shopify agrees to fix it.

I will keep updating this post as I receive more information.
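
To make the "leak rate" idea above concrete, here is a small Python simulation of a leaky-bucket rate limiter. The leak rates of 2 and 4 come from the post; the bucket capacity of 40, the crawler speed of 3 requests per second, and the function name are assumptions chosen for illustration. Whether this API limit is actually what governs storefront crawling is questioned later in this thread.

```python
def count_rejected(request_rate, leak_rate, capacity=40, duration_s=120):
    """Simulate a leaky bucket and count requests that would get a 429.

    request_rate: requests per second sent by the client (assumed steady).
    leak_rate:    requests per second drained from the bucket.
    capacity:     bucket size (40 is an assumed, illustrative value).
    """
    interval = 1.0 / request_rate
    level = 0.0
    rejected = 0
    for _ in range(int(request_rate * duration_s)):
        # The bucket drains continuously between requests.
        level = max(0.0, level - leak_rate * interval)
        if level + 1 <= capacity:
            level += 1  # request accepted, bucket fills by one
        else:
            rejected += 1  # bucket full: the server would answer 429
    return rejected


if __name__ == "__main__":
    # A client sending 3 requests/second for two minutes (assumed numbers).
    print("leak rate 2:", count_rejected(3, 2))
    print("leak rate 4:", count_rejected(3, 4))
```

In this toy model, a client that sends requests faster than the leak rate eventually fills the bucket and starts collecting 429s, while the same client against a higher leak rate never does.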

jackyle
Explorer

It seems to me that the Ahrefs API is set up incorrectly and is overloading the Shopify platform: the Ahrefs crawl produces hundreds of errors, while other crawling tools don't report any.
Also, Shopify does not allow users to edit the robots.txt file, so there is no way for us to fix this problem.

Josh_Uebergang
Shopify Expert

Crawling Shopify stores is a challenge due to API limits. For our SEO clients, we use Ahrefs, but if it fails, we switch to Screaming Frog. We favour using Screaming Frog combined with periodic alarms that remind us to pause the crawl (to prevent crawl errors from the limits) and then resume it 2 minutes later. It's a bit cumbersome, but it works.
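
The manual pause-and-resume routine above could in principle be automated in a custom crawler. Below is a minimal Python sketch of the waiting logic only: it honors a numeric Retry-After header when the server sends one and otherwise backs off exponentially, capped at the 2 minutes mentioned above. The function name and parameters are hypothetical, and this is not how Screaming Frog or Ahrefs work internally.

```python
def next_wait_seconds(status_code, headers, attempt, base_s=1.0, cap_s=120.0):
    """Return how long to pause before sending the next request.

    headers is a plain dict using the exact key "Retry-After".
    attempt counts consecutive 429 responses, starting at 0.
    """
    if status_code != 429:
        return 0.0  # no rate-limit error, no pause needed
    retry_after = headers.get("Retry-After", "").strip()
    if retry_after.isdigit():
        # The server said how long to wait; never exceed our own cap.
        return min(float(retry_after), cap_s)
    # No usable hint from the server: 1s, 2s, 4s, ... up to cap_s.
    return min(base_s * (2 ** attempt), cap_s)


if __name__ == "__main__":
    print(next_wait_seconds(200, {}, 0))
    print(next_wait_seconds(429, {"Retry-After": "5"}, 0))
    print(next_wait_seconds(429, {}, 3))
```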

Run Google Shopping ads? Get the free definitive guide to Google Shopping for Shopify (no optin required): https://www.digitaldarts.com.au/google-shopping
Jason
Shopify Expert

I do need to call out that any limits on crawling are not the API rate limits described at the link below:
https://help.shopify.com/en/api/getting-started/understanding-api-rate-limits

Might save some confusion for others reading this thread later on.

★ I jump on these forums in my free time to help and share some insights. Not looking to be hired, and not looking for work. http://freakdesign.com.au ★
Richard_Russell
New Member

To close out my previous notes on this issue, I am happy to announce that I successfully coordinated an effort between Shopify and Ahrefs to resolve this issue, and that now Ahrefs bots can crawl all Shopify subscriber sites without any further action by the subscriber. The updates on the Ahrefs data side are compiling right as I type this. The fix was on the Shopify tech side, and after they made their updates, the crawls initiated successfully. A BIG thank you to Max over at Ahrefs for driving this issue through to completion. He is a total tech support BOSS. This is being posted on May 13, 2019 for reference.

Marat
New Member

Thanks for the updates! I am in the same boat as you are.

Would you mind describing the steps and requirements for getting the Ahrefs code onto Shopify in a manner that will avoid the API limitations?

Warmest regards,

M


@Richard_Russell wrote:

To close out my previous notes on this issue, I am happy to announce that I successfully coordinated an effort between Shopify and Ahrefs to resolve this issue, and that now Ahrefs bots can crawl all Shopify subscriber sites without any further action by the subscriber. The updates on the Ahrefs data side are compiling right as I type this. The fix was on the Shopify tech side, and after they made their updates, the crawls initiated successfully. A BIG thank you to Max over at Ahrefs for driving this issue through to completion. He is a total tech support BOSS. This is being posted on May 13, 2019 for reference.


 

MigueRock
New Member

I currently have this problem: the Ahrefs and Semrush crawlers cannot get into Shopify to inspect backlinks and duplicate content. Semrush, for its part, takes too long.

14 minutes and 0 URLs crawled.

 

ahrefs.png

SEO Consultant Ecommerce Shopify from Chile
Trevor
Community Moderator

Hello!

As of today, June 21st, 2021, we have launched the ability to edit the robots.txt file to give merchants more control over the information that is crawled by search engines. You can learn more about how to edit your robots.txt file through our community post here.

Due to the age of the topic, I will be locking this thread. If you have any questions about the new feature, please do not hesitate to create a new post under our "Technical Q&A" board.

Trevor | Community Moderator @ Shopify