Prevent Google Crawling Non-canonical URLs

Ken_Blasted-RC
Tourist
12 0 1

Hi fellow Shopifyers

 

I have an issue with Google crawling thousands of non-canonical URLs rather than spending what little crawl budget it gives us on the URLs that are actually valuable. I've spent some time looking through everything in Search Console and I believe I have located the sources of the various URLs. However, I need some help figuring out how to tell Google not to crawl these pages. Do I need to add a code snippet to theme.liquid, or can it be done using robots.txt?

 

The things I wish to exclude are URLs containing:

  • URLs with "filter" in them (generated by the Filter function on the site.)
  • URLs with "?pr_prod_strat=" in them (these are links from Product Recommendations as far as I can tell)
  • URLs with "?page=1" in them (generated by going to page 2 on a collection and then back to page 1)
  • URLs with "_pos=" in them (generated by the Search function on the site.)

I've managed to find a code snippet which should exclude the URLs generated by the Search function, but can it be modified to exclude the other URL types above as well?

 

{% if template contains 'search' %}
<meta name="robots" content="noindex">
{% endif %}

Replies 4 (4)

claricelin
Shopify Partner
148 19 31

You can create a new robots.txt.liquid template. It becomes an "add-on" to the existing default robots.txt created by Shopify.

Read this for more details: https://help.shopify.com/en/manual/promoting-marketing/seo/editing-robots-txt

 

In the robots.txt.liquid, embed before

 

{%- if group.sitemap != blank -%}
{{ group.sitemap }}
{%- endif -%}
{% endfor %}

 

the following (or any others):

 

Disallow: *?page=*
Disallow: *?pr_prod_strat=*

 

After you save, the final robots.txt should exclude those pages.
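
For reference, a complete robots.txt.liquid along these lines might look like the sketch below. It follows the default template structure described in the Shopify help article linked above and adds one rule per URL type from the original question to the catch-all `*` user-agent group. The four patterns are assumptions rather than tested rules: `/*?*filter*` assumes the filter parameters always appear in the query string, and `/*?page=1$` relies on Google's `$` end-of-URL anchor so that page=10, page=11 and so on stay crawlable.

```liquid
{% for group in robots.default_groups %}
  {{- group.user_agent }}

  {%- for rule in group.rules -%}
    {{ rule }}
  {%- endfor -%}

  {%- if group.user_agent.value == '*' -%}
    {%- comment -%} Assumed patterns; adjust them to the URLs your store actually generates. {%- endcomment -%}
    {{ 'Disallow: /*?*filter*' }}
    {{ 'Disallow: /*?*pr_prod_strat=*' }}
    {{ 'Disallow: /*?page=1$' }}
    {{ 'Disallow: /*?*_pos=*' }}
  {%- endif -%}

  {%- if group.sitemap != blank -%}
    {{ group.sitemap }}
  {%- endif -%}
{% endfor %}
```

After saving, it is worth opening /robots.txt on the storefront to confirm that each added rule appears on its own line under `User-agent: *`.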

 

 

I help store owners to double their traffic and sales by getting on Google and YouTube page one. Check out my YouTube channel for more tips and tactics to drive traffic and sales ➤ https://youtube.com/claricelin

Download my Shopify Marketing Guide to learn both FREE & PAID ways to drive visitors to your store that convert into paying customers. ➤https://claricelin.com/shopify-marketing-guide/
Ken_Blasted-RC
Tourist
12 0 1

Thank you for your response! 

Before going ahead and implementing this, do you know if there can be any negative impact from excluding those pages from crawling? Is there a benefit in letting Google crawl filter, search and recommended-product URLs as well?

claricelin
Shopify Partner
148 19 31

There is no negative impact from excluding those pages. Google wants only unique pages in search results. In this case, you want each individual product page to show up as it is. Those URLs are created by users' interactions with your website and have no added value.

 

Take a filtered page: it represents a specific group of products. Instead of having that page indexed, it might be worthwhile to create a collection page to highlight that group of products and their distinct characteristics. You can also work on the SEO of that collection page by writing a good meta description and, in the long term, get that collection page to rank higher on Google.

 

If there is a group of people looking for those (grouped) products, Google will show them that collection page. It's a more sophisticated solution and benefits the overall SEO of your store too.


dsv1
Visitor
2 0 0

I found URLs with "?pr_prod_strat=" in them (these are links from Product Recommendations as far as I can tell) to be the largest driver of issues. I don't think Shopify has done anything to fix it. I would assume a large portion of users are impacted by it and don't even realize.