Robots.txt is blocking many URLs

Jim_Benham
New Member
11 0 0

I have been going crazy trying to figure out why my robots.txt file is blocking my site. I have been in contact with Shopify Gurus, Shopify Experts, and an app developer trying to get my issue resolved with zero luck. Hopefully I will have some better luck here. 

When I log in to my webmaster tools, I get an error message stating "Severe health issues are found on your site." The page looks like this:

If I click on the "Check site health" link this is what I see

If I click on the "Some important page" link, I am redirected to this page on my website: http://padholdr.com/collections/jeep-dash-mounts/wrangler+2008

If I click on the "Is robots.txt blocking important pages?" link it forwards me to my blocked URL page on webmaster tools, that currently looks like this

After hours and hours I have narrowed down the results to this:

-The only pages being blocked by robots.txt are my pages that have multiple tag drop-down choices. We added the code from this Shopify page showing how to do so: http://docs.shopify.com/support/your-store/collections/filtering-a-collection-with-multiple-tag-drop...

-All of my pages that have multiple tag drop-downs have a + sign at the end of the URL, followed by the second tag, which in my case is a year. If I test any page that has the second tag option (with the plus in the URL), webmaster tools reports that it is blocked by robots.txt because of line 8 of my robots.txt file.

-If I submit a URL that has multiple tags (and so a + in the URL) to Fetch as Google, I get a success response. HOWEVER, the link gets changed: the + is replaced with %20. For example, I submit http://padholdr.com/collections/jeep-dash-mounts/wrangler+2008 but when I click on the successful link, the URL now shows http://padholdr.com/collections/jeep-dash-mounts/wrangler%202008

-If I test the link with the %20 against robots.txt, it is allowed; if I switch it back to a + sign, it is blocked by robots.txt.

I'm no expert by any means, so hopefully someone here can help figure this out. I suspect the problem is caused by having a plus in the URL, but I'm not sure. I would be very grateful for any thoughts!

0 Likes
standoutd
Navigator
1135 0 128

My bet is that the "+" is the culprit. That was my thought before reading further, having clicked the links.

http://www.StandoutDesigns.com ::: Solid Wood TV Furniture for Enthusiasts. Made in USA.
Jim_Benham
New Member
11 0 0

Any idea how to change this? 

0 Likes
standoutd
Navigator
1135 0 128

Disallow: /collections/*+* is the culprit - that's my sense. As far as I know, there's no way to edit your robots.txt on Shopify. Support should be able to confirm. As for %20, that's the URL-encoded form of a space. It replaces spaces (in file names, for example) so the URL can be handled correctly.
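
To make the difference between the two URLs concrete, here is a rough Python sketch of wildcard matching as Google documents it for robots.txt (a rule is a prefix match, * matches any run of characters, a trailing $ anchors the end). The function names are made up for illustration, and the sketch ignores Allow rules and rule precedence, so treat it as an approximation rather than what Google actually runs.

```python
import re
from urllib.parse import unquote


def rule_to_regex(rule):
    """Turn a robots.txt path pattern into a regex (hypothetical helper)."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape everything literally (including "+"), then let "*" match anything.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path, rules):
    """True if any Disallow pattern matches the start of the path."""
    return any(rule_to_regex(rule).match(path) is not None for rule in rules)


RULES = ["/collections/*+*"]  # the rule quoted above
PLUS_PATH = "/collections/jeep-dash-mounts/wrangler+2008"
ENCODED_PATH = "/collections/jeep-dash-mounts/wrangler%202008"

print(is_disallowed(PLUS_PATH, RULES))     # True: the path contains a literal "+"
print(is_disallowed(ENCODED_PATH, RULES))  # False: no "+" in the path
print(unquote(ENCODED_PATH))               # /collections/jeep-dash-mounts/wrangler 2008
```

Under these assumptions the rule only catches paths with a literal + in them, which would explain why the %20 version passes the robots.txt test and the + version does not.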

 

standoutd
Navigator
1135 0 128

For what it's worth, a search for

padholdr jeep dash mounts wrangler 2008 ipad

returns your page as the top result. So the title is indexed at least.

0 Likes
Jim_Benham
New Member
11 0 0

Yeah I see that. I'm not concerned about changing the robots.txt file, but changing my URLs so they don't show plus signs. Any idea how that is done?

 

0 Likes
Jason
Shopify Expert
10360 158 2009

Jim_Benham wrote: "Yeah I see that. I'm not concerned about changing the robots.txt file, but changing my URLs so they don't show plus signs. Any idea how that is done?"

You can't. The plus is used for tag filtering, and it might not have the impact you think it does (even though it probably looks scary in the webmaster tools).

There is some reasoning for the blocking of the tags here.

 

I jump on these forums to help and share some insights. Not looking to be hired, and not looking for work.

Don't hand out staff invites or give admin password to forum members unless absolutely needed. In most cases the help you need can be handled without that.


★ http://freakdesign.com.au ★
0 Likes
aaron_son2
New Member
1 0 0

I am having the same problem today. My webmaster tools says that robots.txt is blocking 2,613 URLs! I noticed one odd thing. Inside the robots.txt it says:

# we use Shopify as our ecommerce platform

User-agent: *
Disallow: /admin
Disallow: /cart
Disallow: /orders
Disallow: /account
Disallow: /collections/*+*
Disallow: /blogs/*+*
Sitemap: https://www.orbitlightshow.com/sitemap.xml

User-agent: Nutch
Disallow: /

Notice the HTTPS in the Sitemap line. Shopify doesn't offer or need HTTPS, so why does robots.txt point to an HTTPS URL? How can we go in there and change the HTTPS to HTTP? I think this is the issue. Is there a way for me to automatically forward HTTPS requests to HTTP?

0 Likes
Shopify_Adam
Shopify Staff
153 0 33

robots.txt is automatically generated by Shopify for each site. We work with Google to make sure it's optimal.

The question regarding "+" in the URL has been asked and answered in this thread: https://ecommerce.shopify.com/c/shopify-discussion/t/did-robots-txt-change-recently-164853#comment-1...

In short, the "excluded URLs" Google is complaining about are every possible permutation of tags. They shouldn't be indexed.
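
As a rough illustration of how quickly those permutations add up, here is a small Python sketch. The tag list is hypothetical (only "wrangler" and "2008" appear in this thread), the helper name is made up, and it assumes tag order matters in the URL, which may not match how Shopify actually builds these pages.

```python
from itertools import permutations


def tag_filter_paths(collection, tags, tags_per_url=2):
    """Build one path per ordered combination of tags (hypothetical helper)."""
    return [
        f"/collections/{collection}/" + "+".join(combo)
        for combo in permutations(tags, tags_per_url)
    ]


# Hypothetical tags for a single collection.
tags = ["wrangler", "2007", "2008", "2009", "2010"]
urls = tag_filter_paths("jeep-dash-mounts", tags)

print(len(urls))  # 20 two-tag URLs from just five tags
print("/collections/jeep-dash-mounts/wrangler+2008" in urls)  # True
```

With more tags and more collections, a count in the thousands is easy to reach, and every one of those paths contains a + and so falls under the /collections/*+* rule.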

VP Product - Shopify
Jim_Benham
New Member
11 0 0

As long as this isn't hurting how Google indexes my site, that is fine. The scary thing is that last night I again got an email from Google stating that they are not indexing my site due to too many errors found in my robots.txt file. My site is down 60% in visitors since I switched to Shopify. I've just invested a bunch of time in SEO, so I'm hoping this fixes my issue and that the time wasn't wasted because Google isn't going to index my site.

0 Likes