Dedicated to the Hydrogen framework, headless commerce, and building custom storefronts using the Storefront API.
Hi! We are using Shappo and the Storefront API to build a Headless storefront. All working great.
Over the last couple of days we saw some strange Google entries showing up. After some research we concluded these were pages Google found in the sitemap generated by Shopify. Somehow Google found its way to this sitemap. Since these pages do not exist (we are doing headless), we obviously don't want this sitemap to exist.
For now we have changed the robots.txt to "Disallow all" on the Shopify side, but this does not feel like the right solution.
A better solution would be to completely disable or edit the sitemap, but we couldn't find this option.
So our question: is it possible to disable or edit the sitemap generated by Shopify?
Thanks!
This is an accepted solution.
Natively on the platform, no.
Is the sitemap accessible via https://store-name.myshopify.com/sitemap.xml or through https://storename.com/sitemap.xml or another way? Or are you referring to the "inner" sitemaps like the collections/products sitemaps etc?
Jumping to solutions without seeing the setup, you could possibly proxy via CloudFlare and use a CloudFlare Worker to control the sitemap (look up sloth.cloud for an idea of how to do this). That said, this makes some assumptions, is a fairly technical setup, and has some associated costs.
What you've said about the Disallow sounds concerning - could easily be the wrong fix and have negative repercussions.
Can you paste a screenshot here of your current sitemap with URL? Or PM me and I can take a quick look if you like.
@KieranR wrote: Jumping to solutions without seeing the setup, you could possibly proxy via CloudFlare and use a CloudFlare Worker to control the sitemap (look up sloth.cloud for an idea of how to do this). That said, this makes some assumptions, is a fairly technical setup, and has some associated costs.
I understand that Shopify stores are already behind CloudFlare and to use your own CloudFlare account you need to be using CloudFlare Enterprise:
https://support.cloudflare.com/hc/en-us/articles/203464660-Using-Cloudflare-with-Shopify
So you can definitely do it with CloudFlare if you pay for it!
Yep exactly what I was saying, have seen that article a while ago.
Shopify incorporates CF natively in the hosting stack for Online Store, hence the need to request Orange2Orange (O2O) through a support ticket to gain control of the zone if you were to go down this route (which is why I said it was a bit technical). I'm not sure how up to date that article is, as I know a couple of people on cheaper CF plans who have requested O2O, but yes, that is what I meant by associated costs.
Still, I don't know if this CF idea would fit OP's situation (there may be a better or easier way to the same end). It needs more context on the headless setup.
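For anyone wondering what the Worker idea might look like, here is a minimal TypeScript sketch. It assumes you already control the Cloudflare zone in front of the Shopify domain (the O2O setup discussed above), and the target URL `https://mydomain.com/sitemap.xml` and the helper name `sitemapRedirectTarget` are hypothetical placeholders, not anything Shopify or Cloudflare provides. It only illustrates the approach and has not been tested against a live store.

```typescript
// Hypothetical target: the headless storefront's own sitemap.
const STOREFRONT_SITEMAP = "https://mydomain.com/sitemap.xml";

// Returns the redirect target for sitemap requests, or null for anything else.
function sitemapRedirectTarget(requestUrl: string): string | null {
  const { pathname } = new URL(requestUrl);
  // Matches /sitemap.xml and "inner" sitemaps such as /sitemap_products_1.xml.
  const isSitemap = /^\/sitemap(_[\w-]+)?\.xml$/.test(pathname);
  return isSitemap ? STOREFRONT_SITEMAP : null;
}

// In a Cloudflare Worker (module syntax), this object would be the
// script's default export.
const worker = {
  async fetch(request: Request): Promise<Response> {
    const target = sitemapRedirectTarget(request.url);
    if (target !== null) {
      // Permanent redirect to the storefront's sitemap.
      return Response.redirect(target, 301);
    }
    // Everything else passes through to the origin unchanged.
    return fetch(request);
  },
};
```

Keeping the path check in a separate pure function makes it easy to test without deploying anything. Whether a redirect, a 404, or a rewritten sitemap is the right response depends on the setup.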
Hi Guys,
Thanks for the reply!
The sitemap is accessible via our custom domain. This is a subdomain, since we are hosting our headless storefront on the root domain on another server. So:
mydomain.com = storefront
sub.mydomain.com = Shopify backend
There is no need for a Shopify sitemap, since all the routing is being done by the storefront on the root domain. This storefront has its own sitemap.
You answered my question, Kieran. It is sadly not possible to disable or edit the sitemap generated by Shopify. This could be a problem in our situation if Google finds its way to the sitemap. All the URLs are routed to 404s, so the sitemap basically tells Google “here are the pages that don’t exist”.
I don’t think the Disallow is a problem in our situation, since we don’t want Google to crawl the subdomain.
CloudFlare is something we could look into. A solution we found for now is to redirect the sitemap URL of the Shopify backend to our own sitemap. This can be done by making a custom Shopify theme, which we have already done for redirecting other URLs.
Thanks for the help guys. Appreciate that.
Erwin
Ahh yep, a 301 redirect from /sitemap.xml to wherever you need should be good too, and is probably all that's needed. But if you start to notice issues in GSC, or pages getting indexed that you don't want indexed, then maybe look into more intensive options like CF.
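If you want to confirm the redirect is actually in place before relying on it, a rough TypeScript check could look like the sketch below. It assumes a runtime with a global `fetch` that exposes manual redirects (for example Node 18+), and both function names and any URLs you pass in are hypothetical. It compares the `Location` header literally, so a relative `Location` would need resolving first.

```typescript
// Pure check: do this status and Location header amount to a permanent
// redirect to the expected sitemap URL?
function isPermanentRedirectTo(
  status: number,
  location: string | null,
  expected: string,
): boolean {
  return (status === 301 || status === 308) && location === expected;
}

// Requests the Shopify sitemap without following redirects and checks
// where it points. Needs network access, so it is not exercised here.
async function checkSitemapRedirect(
  shopifySitemap: string,
  expected: string,
): Promise<boolean> {
  const res = await fetch(shopifySitemap, { redirect: "manual" });
  return isPermanentRedirectTo(res.status, res.headers.get("location"), expected);
}
```

For the setup described in this thread, the call might be `checkSitemapRedirect("https://sub.mydomain.com/sitemap.xml", "https://mydomain.com/sitemap.xml")`.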
Hi! I seem to be unable to implement a redirect through Shopify while the XML sitemap URL is still live. Is there a way to delete it so the page 404s? We're trying to redirect the default Shopify XML sitemap to our custom one.
I have the same issue: the sitemap doesn't show inner pages or product URLs.
@puneet3 the original issue was a pretty custom headless setup, I would be surprised if you are experiencing the exact same thing.
Do you have a link to the example sitemap URL that is missing pages?