All things Shopify and commerce
I just figured out that Ahrefs (the SEO tool) cannot crawl any Shopify website because of certain (API?) call limits. We use this tool to see the backlinks and internal link value of certain pages, so we can optimize them.
The error Ahrefs gets from Shopify is: HTTP/1.1 429 (Too Many Requests)
This is probably because of a Shopify limit, and according to Ahrefs it can only be solved by adding the following lines to our site's robots.txt file:
User-agent: AhrefsBot
Crawl-Delay: 1
But Shopify doesn't allow store owners to change robots.txt, nor will they do it for you. As far as I can tell right now, that means Ahrefs cannot crawl any Shopify site.
Is there any way shopify can add this to the robots.txt? Or another way around this?
Any help would be appreciated!
Hi there,
I noticed this post by Nevil but I think Ahrefs has updated their site crawler since then. I have yet to try out a free account to verify this but here's what I was told by customer service:
Crawl speed is determined by both the number of parallel requests our crawler does and the delay between each request. By default, we send one request and wait 2 seconds after the URL is crawled. To increase crawl speed, you can increase the number of parallel requests and reduce the delay between them.
This can be adjusted in the Crawl settings tab for each of your SA projects:
(see my image attachment)
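Worked through, those two knobs multiply into an effective request rate. A quick sketch of the arithmetic (the formula is my own back-of-the-envelope model, not anything Ahrefs publishes):

```python
# Back-of-the-envelope model of crawl speed: each of N parallel workers
# sends a request, then waits `delay_seconds` before its next one, so the
# steady-state rate is roughly N / delay requests per second.
def requests_per_second(parallel_requests: int, delay_seconds: float) -> float:
    return parallel_requests / delay_seconds

# Ahrefs default per the support quote above: 1 request, 2-second delay.
print(requests_per_second(1, 2.0))   # 0.5 requests/second
# Raising parallelism and cutting the delay multiplies the rate, which is
# when a host starts answering 429 Too Many Requests.
print(requests_per_second(10, 0.5))  # 20.0 requests/second
```

So the slowdown workaround is just pushing that number back under whatever cap the host enforces.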
Can any other Shopify users speak to this? It appears you can manually slow down the crawler to get around the API timeout issue.
Also, can any other Shopify users comment on their experience with Ahrefs? So far it seems like our top choice for an all-in-one SEO tool.
I thought I would bump this thread with an update.
So, I am now stuck in the middle of this issue. I use both Shopify and Ahrefs every day, so getting Ahrefs to crawl my site is important for getting accurate data.
I spent two hours on May 1, 2019 chasing this issue back and forth between the two companies. I exhausted the issue on the Shopify side and am fairly confident that the problem is with Ahrefs. I then spent a lot of time on the chat service (Ahrefs support is not the best) literally arguing with the rep there, who kept pointing their finger at Shopify. When I supplied the API document to them, they quickly ended the chat, saying they would take a look at it.
The issue seems to be that the Ahrefs API setup is wrong and they are hammering the Shopify platform, which triggers the 429 errors we all see. I have four other robots crawling my site with no issue; only Ahrefs has this problem.
Hopefully someone there will take the time to update their API so that it actually works.
May 8 2019 dropping in an update. This has become a frustrating nightmare for me.
I found a really good guy at AHREFS who has been working with me to resolve this.
The issue is 100% Shopify's fault. Here it is: if you are on any monthly plan below Shopify Plus, you are subject to lower API limits. The most common plan (the one most subscribers are on) has a leak rate of only 2; the higher plans have a leak rate of 4. The leak rate of 4 works fine with the Ahrefs crawler; the leak rate of 2 does not. It is Shopify's decision not to allow a leak rate of 4 for all subscribers.
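That "leak rate" language matches a leaky-bucket limiter: requests queue into a bucket that drains at a fixed rate, and anything arriving while the bucket is full gets a 429. A toy sketch of why a crawl rate between the two leak rates fails on one plan but not the other (the bucket capacity of 40 and the 3-requests/second crawl rate are my own illustrative assumptions):

```python
class LeakyBucket:
    """Toy model of a leaky-bucket rate limiter: the bucket drains
    `leak_rate` requests per second, and a request is rejected (an
    HTTP 429 on the real platform) when the bucket is at capacity."""

    def __init__(self, capacity: float, leak_rate: float):
        self.capacity = capacity
        self.leak_rate = leak_rate
        self.level = 0.0
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Drain the bucket for the time elapsed since the last request.
        self.level = max(0.0, self.level - (now - self.last) * self.leak_rate)
        self.last = now
        if self.level + 1.0 > self.capacity:
            return False  # over capacity: reject, i.e. answer 429
        self.level += 1.0
        return True


# A crawler sending a steady 3 requests/second: fine against a leak rate
# of 4, but a leak rate of 2 falls behind and eventually rejects requests.
basic = LeakyBucket(capacity=40, leak_rate=2)  # lower plans, per the post
plus = LeakyBucket(capacity=40, leak_rate=4)   # Shopify Plus, per the post
basic_429s = plus_429s = 0
for i in range(600):                           # 600 requests over 200 seconds
    t = i / 3.0
    if not basic.allow(t):
        basic_429s += 1
    if not plus.allow(t):
        plus_429s += 1
print(basic_429s, plus_429s)  # leak rate 2 rejects many; leak rate 4 rejects none
```

The bucket with the higher leak rate never saturates at that crawl speed, which lines up with the behaviour described above.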
What does this mean if you are a paying subscriber with Ahrefs? They use crawling data as part of their calculations for website data. Without the crawl, 100% of your data is inaccurate, meaning you are paying for a service that is giving you the wrong data. Additionally, your site will be ranked in their calculations significantly lower than it should be.
Because Shopify does not allow access to the robots.txt file, there is no fix unless Shopify agrees to make one.
I will keep updating this post as I receive more information.
It seems to me that the Ahrefs crawler is set up incorrectly and hammers the Shopify platform: when Ahrefs collects files it produces hundreds of errors, while other data-collection tools don't report any.
Also, Shopify does not let users touch the robots.txt file, so we can't fix this problem ourselves.
Crawling Shopify stores is a challenge because of the rate limits. For our SEO clients we'll use Ahrefs, but if it fails we jump to Screaming Frog. We favour Screaming Frog combined with periodic alarms reminding us to pause the crawl (to prevent crawl errors from the limits), then resume it two minutes later. A bit cumbersome, but it works.
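If you ever script your own checks instead of the alarm-and-pause routine, the same idea can be automated: back off whenever the store answers 429, honouring a numeric Retry-After header when one is sent. A rough standard-library sketch (function names, retry counts, and timings are my own placeholders):

```python
import time
import urllib.error
import urllib.request

def parse_retry_after(value, default_wait: float = 2.0) -> float:
    """Interpret a Retry-After header (seconds form only); fall back to
    default_wait when the header is missing or not a plain number."""
    if value is not None and value.strip().isdigit():
        return float(value)
    return default_wait

def fetch_with_backoff(url: str, max_retries: int = 5) -> bytes:
    """GET a URL, sleeping and retrying whenever the server answers
    429 Too Many Requests; any other HTTP error is re-raised."""
    for attempt in range(max_retries):
        try:
            with urllib.request.urlopen(url) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            if err.code != 429:
                raise
            # Back off a little longer on each consecutive 429.
            wait = parse_retry_after(err.headers.get("Retry-After"))
            time.sleep(wait * (attempt + 1))
    raise RuntimeError(f"still rate-limited after {max_retries} attempts: {url}")
```

This is essentially what the manual pause does, minus the alarm clock.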
I do need to call out that any limits for crawling are not API limits as noted in the link below:
https://help.shopify.com/en/api/getting-started/understanding-api-rate-limits
Might save some confusion for others reading this thread later on.
To close out my previous notes on this issue, I am happy to announce that I successfully coordinated an effort between Shopify and Ahrefs to resolve this issue, and that now Ahrefs bots can crawl all Shopify subscriber sites without any further action by the subscriber. The updates on the Ahrefs data side are compiling right as I type this. The fix was on the Shopify tech side, and after they made their updates, the crawls initiated successfully. A BIG thank you to Max over at Ahrefs for driving this issue through to completion. He is a total tech support BOSS. This is being posted on May 13, 2019 for reference.
Thanks for the updates! I am in the same boat as you are.
Would you mind describing the steps and requirements for getting the Ahrefs code working on Shopify in a manner that will avoid the API limitations?
Warmest regards,
M
@Richard_Russell wrote: To close out my previous notes on this issue, I am happy to announce that I successfully coordinated an effort between Shopify and Ahrefs to resolve this issue [...]
This problem is currently happening to me: the AHREFS and Semrush crawlers cannot get into Shopify to inspect backlinks and duplicate content, and Semrush takes far too long.
14 minutes and 0 URLs crawled.
Hello!
As of today, June 21st, 2021, we have launched the ability to edit the robots.txt file to give merchants more control over the information that is crawled by search engines. You can learn more about how to edit your robots.txt file through our community post here.
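For reference, the Crawl-Delay directive from the top of this thread can now be added that way. A minimal sketch of a templates/robots.txt.liquid customization, assuming the documented robots.default_groups Liquid objects (double-check the current robots.txt.liquid docs before using this):

```liquid
{%- comment -%} Render Shopify's default robots.txt groups unchanged. {%- endcomment -%}
{% for group in robots.default_groups %}
  {{- group.user_agent }}
  {%- for rule in group.rules %}
    {{ rule }}
  {%- endfor %}
  {%- if group.sitemap != blank %}
    {{ group.sitemap }}
  {%- endif %}
{% endfor %}

{%- comment -%} Extra group: ask AhrefsBot to wait 1s between requests. {%- endcomment -%}
User-agent: AhrefsBot
Crawl-delay: 1
```

Keeping the default loop intact preserves Shopify's standard rules and sitemap entries; only the trailing group is custom.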
Due to the age of the topic, I will be locking this thread. If you have any questions about the new feature, please do not hesitate to create a new post on our "Technical Q&A" board.
Trevor | Community Moderator @ Shopify