Development discussions around Shopify APIs
We were told by the developer of one of the most popular sitemap and noindex manager apps on Shopify that Shopify forces a NOINDEX, NOFOLLOW when you remove a URL from the sitemap via the API, which seems crazy... and I am trying to determine if this is indeed the case.
Yes, if you NOINDEX a page with a <meta name="robots" /> element, you should always remove it from the sitemap.xml, as leaving it in will trigger errors with the search engines because the signals are ambiguous (the meta robots says don't index it while the URL's presence in the sitemap.xml says do index it). However, the converse is not necessarily true: just because you remove a URL from the sitemap.xml does NOT mean you always want it flagged NOINDEX.
But what is FAR worse than forcing a NOINDEX when you remove a URL from the sitemap.xml is to force a NOFOLLOW in the <meta name="robots" /> element. It's VERY detrimental to SEO and the ability of ALL URLs on the site to rank. It creates PageRank/link juice black holes in your site where PageRank/link juice flows into a page but cannot flow back out. This drains your site of link juice and drastically reduces all of your site's URLs' ability to rank.
Is there documentation for the Shopify API that shows ALL metafield keys, ALL values valid for each key, and what the values do as they relate to sitemap, site product search, and meta robots? I haven't seen much in the way of complete documentation, only a couple of examples here https://help.shopify.com/en/api/guides/updating-seo-data.
Thanks.
Hey @jahodson3,
With regards to the seo "hidden" metafield mentioned in this guide, you're correct in that adding this metafield will add both NOINDEX and NOFOLLOW tags to the resource. This is currently the only metafield that allows configuration of SEO index behaviour.
Note that this method of adding the SEO tags through a metafield was created to allow a quick and easy way to remove a particular product/resource from the search engine index and storefront search results. In cases where something has been published/indexed that a merchant wants removed, this metafield can be added to remove the resource from search engine and storefront search results. This should only be used on individual resources, and not on theme templates. If you're looking to add just the NOINDEX tag to a particular resource, this can be done on theme templates. In case you haven't already seen it, this guide shows how to add a Liquid tag that will render a NOINDEX meta tag on the storefront for a particular resource.
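For reference, here is a rough sketch of setting that metafield through the Admin REST API. This is Python just for illustration; the shop domain, access token, and product ID are placeholders, and the payload matches the one in the guide:

```python
import json
from urllib import request

def hidden_metafield_payload():
    """Build the 'seo.hidden' metafield body from the guide.
    value=1 removes the resource from the sitemap and storefront
    search AND adds a noindex, nofollow robots meta tag."""
    return {
        "metafield": {
            "namespace": "seo",
            "key": "hidden",
            "value": 1,
            "value_type": "integer",
        }
    }

def hide_product(shop, token, product_id, api_version="2020-01"):
    """POST the metafield to a product. The shop domain and token
    here are placeholders; this needs a real store and network access."""
    url = f"https://{shop}/admin/api/{api_version}/products/{product_id}/metafields.json"
    req = request.Request(
        url,
        data=json.dumps(hidden_metafield_payload()).encode(),
        headers={
            "Content-Type": "application/json",
            "X-Shopify-Access-Token": token,
        },
        method="POST",
    )
    return request.urlopen(req)  # returns the HTTP response object
```

Keep in mind this is the all-or-nothing behaviour discussed in this thread: there is no flag in the payload to get just one of the effects.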
JB | Solutions Engineer @ Shopify
- Was your question answered? Mark it as an Accepted Solution
- To learn more visit Shopify.dev or the Shopify Web Design and Development Blog
Thanks for the reply JB. This is not something that would seem doable at the theme level. I need to selectively choose particular collections, products, and pages that I don't want in the sitemap and/or site search and/or with noindex and/or noarchive.
Is there a way, using the API, that I could write an app that would provide functionality for users to independently control at the collection, page, and product level the ability to:
1) remove that URL from the sitemap
2) remove that URL from the on-site search
3) add NOINDEX
4) add NOFOLLOW (really don't even want this, should never be used but would do for completeness)
5) add NOARCHIVE
I would want the app to have the 5 checkboxes above and control each option independently of each other so that users could select them in any combination. For example, there are times that you might want to remove a URL from the sitemap, but leave it in site search and leave it indexable. And while I might include a NOFOLLOW option in the app, I would put a warning when selected that this could be detrimental to the site's visibility in organic search (SEO).
I have always heard that Shopify has challenges when it comes to SEO, and now that I've worked with a couple of clients on Shopify I know this is the biggest SEO flaw I've seen with the platform other than page load times. No knowledgeable SEO would ever suggest adding NOFOLLOW to a meta robots element because it literally siphons off link juice and drains the site of link juice.
Is there a way to JUST remove a URL from the sitemap without triggering a meta robots with NOINDEX, NOFOLLOW?
Hey @jahodson3,
I've sent up your feedback regarding the addition of NOFOLLOW in the current metafield. It's very possible there are things I'm not aware of that were considered when the decision was made to add NOFOLLOW with the metafield, but I definitely see where you're coming from in this regard. I can't make any promises as to if/when this will change, but your feedback is in the right hands now.
For your questions, #1 and #2 can be achieved with the metafield we discussed. Note that this metafield is currently the only way to remove a URL from the sitemap. #3, 4, and 5 can be achieved by adding code to the theme template. One idea could be to use a code snippet on the collection, page, or product template, containing conditional logic for rendering the HTML for NOINDEX, NOFOLLOW, or NOARCHIVE based on the resource's tags. Your app can inject this snippet on the template, and then add/remove tags based on the user's selection. This is just one idea, there may be other more elegant ways to do this as well. You may also want to explore using metafields instead of tags for the logic.
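The conditional-rendering idea above can be sketched as plain logic first. Python here is just to illustrate the decision; in a real theme this would be Liquid, and the tag names ("seo-noindex" etc.) are invented for the sake of the example:

```python
def robots_directives(tags):
    """Map hypothetical resource tags to robots meta directives.
    The tag names are made up for illustration."""
    directives = []
    if "seo-noindex" in tags:
        directives.append("noindex")
    if "seo-nofollow" in tags:
        directives.append("nofollow")
    if "seo-noarchive" in tags:
        directives.append("noarchive")
    return directives

def robots_meta_tag(tags):
    """Render the meta robots tag, or nothing if no directive applies."""
    directives = robots_directives(tags)
    if not directives:
        return ""
    return '<meta name="robots" content="%s">' % ",".join(directives)

# e.g. robots_meta_tag(["seo-noindex"]) -> '<meta name="robots" content="noindex">'
```

The point is that each directive is toggled independently, which is exactly what the metafield approach currently cannot do.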
JB | Solutions Engineer @ Shopify
Thanks so much for the reply JB. I appreciate you passing this on. It would be ideal if they could decouple everything. For example, currently:
POST /admin/api/2020-01/products/#{id}/metafields.json
{
  "metafield": {
    "namespace": "seo",
    "key": "hidden",
    "value": 1,
    "value_type": "integer"
  }
}
removes the URL from the sitemap, removes from site search, and adds noindex, nofollow. You could continue to support this existing treatment when value=1 while decoupling by adding new values for the existing metafield namespace=seo, key=hidden such as:
value=1: Continue doing what you do today
value=2: Remove from sitemap and sitesearch only (and do NOT add a meta robots at all)
value=3: Remove from sitemap only
value=4: Remove from site search only
If we had the above then users can handle generating the meta robots the way they want it. It's the fact that removing a URL from the sitemap with the existing method has three other side effects (remove from search, add noindex, add nofollow) which is problematic.
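To make the proposal concrete, here is how an app could interpret the suggested values. This mapping is hypothetical; only value=1 exists in Shopify today, and values 2 through 4 are the proposed extension:

```python
# Proposed semantics for seo.hidden -- NOT what Shopify supports today.
# False under "sitemap"/"site_search" means the URL is removed from it.
PROPOSED_HIDDEN_VALUES = {
    1: {"sitemap": False, "site_search": False, "meta_robots": "noindex,nofollow"},  # current behaviour
    2: {"sitemap": False, "site_search": False, "meta_robots": None},  # no meta robots at all
    3: {"sitemap": False, "site_search": True,  "meta_robots": None},  # sitemap only
    4: {"sitemap": True,  "site_search": False, "meta_robots": None},  # site search only
}

def effects(value):
    """Look up the proposed effects of a seo.hidden value."""
    return PROPOSED_HIDDEN_VALUES[value]
```

With values 2-4, the meta robots element would be left entirely to the merchant or their theme, which removes the coupling that causes the problem described above.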
Again thanks so much for responding and looking into a possible fix/enhancement.
Any update on this?
Hi,
Is there an update on this issue? I am really waiting for the possibility to:
1) remove a URL from the sitemap
2) remove that same URL from the on-site search
3) add NOINDEX
4) add FOLLOW
Right now it is only possible to remove a URL from the sitemap if you also add NOFOLLOW, and you don't want to add NOFOLLOW from an SEO perspective.
Is there any possibility to do this now?
Hope to hear from you.
Any updates? I really need to remove some canonicalized pages from the sitemap but keep them index, follow.
Crickets... I have heard nothing since posting, after they said they would raise the issue with the developers. It's funny, because the current implementation is very detrimental from an SEO perspective, and this should be a VERY minor fix (likely one or two dozen lines of strategically placed code in the API) that would supply much-needed functionality while leaving the existing calls backward compatible for those sites/apps that don't mind their link juice flowing into a black hole. It appears they don't have anyone with a lot of SEO talent. I saw the other day they were advertising for an SEO.
Hi, can someone tell me if I am doing this correctly? I read the tutorial but I don't know how to do this JSON post thing, I am not a developer.
My goal is to remove two pages from the Shopify automatically-generated sitemap. These pages are visible and I cannot just put them as hidden because they are used as sections of different pages. However, I want to remove the pages themselves from the sitemap.
So what I did is how I normally edit metafields. I went to https://testeliz.myshopify.com/admin/bulk?resource_name=Page&edit=metafields.seo.hidden and put a value of 1 in those two pages, please see the screenshot. For the other pages which should remain in the sitemap, I left the metafield empty. Is this correct?
How can I test if these pages are really no longer in the sitemap? If I go now to https://testeliz.myshopify.com/sitemap_pages_1.xml they seem to be gone but I'm not 100% sure if I did it correctly, so I would love some feedback.
Thanks!
Francesca
Firstly, thank you Francesca for sharing this information. Yes, I also used the same method, and after some tests we noticed that the URL is no longer in the sitemap.
Once removed, it will no longer be listed by search engines such as Google.
In addition to the way you mentioned you can do it, you can also use the Metafields Editor app, which is very good by the way.
Interesting links:
https://www.shopify.co.uk/partners/blog/110057030-using-metafields-in-your-shopify-theme
https://www.sunbowlsystems.com/blogs/how-to/metafields-in-shopify-without-using-an-app
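To double-check whether a URL is still listed, you can fetch the sitemap and parse it. A sketch in Python; the sitemap content below is a made-up sample, and in practice you would download something like https://testeliz.myshopify.com/sitemap_pages_1.xml instead:

```python
import xml.etree.ElementTree as ET

# Standard sitemap namespace used by Shopify's generated sitemaps.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def urls_in_sitemap(xml_text):
    """Extract all <loc> URLs from a sitemap.xml document."""
    root = ET.fromstring(xml_text)
    return [loc.text for loc in root.iter(SITEMAP_NS + "loc")]

# Made-up sample of what a sitemap_pages_1.xml might contain:
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.myshopify.com/pages/about</loc></url>
  <url><loc>https://example.myshopify.com/pages/contact</loc></url>
</urlset>"""

urls = urls_in_sitemap(sample)
# A hidden page should no longer appear in the list:
print("https://example.myshopify.com/pages/hidden" in urls)  # False
```

Running this against the live sitemap after setting the metafield gives you a yes/no answer instead of eyeballing the XML.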
Hi @DabsDesign and thanks for your confirmation. I also did some tests and everything is working as expected.
It was from that Sunbowl link that I learned how to do this actually. I was always told that an app was necessary, but it turns out it's not true which is great. We always try to have the fewest number of apps possible both for site speed and because all those small monthly fees add up fast. So when it's possible to do things without an app, it's definitely better.
Shopify you guys really need to make a user-friendly way to access and update the metafields. It should not be necessary to use weird workarounds or an app just to update important fields about our products!
Thank you @cescapesca_86 and @DabsDesign for your explanation of how to hide pages from Google manually. Could you double-check the bulk editor for resource_name=Page?
/admin/bulk?resource_name=Product&edit=metafields.seo.hidden works for me
/admin/bulk?resource_name=Page&edit=metafields.seo.hidden DOES NOT work, the page was not found
@DabsDesign wrote: Once removed, it will no longer be listed by search engines such as Google.
Not entirely true 🙂🙂
The META robots tag is (like all other META data) considered a "signal" by Google, not a "directive". What this means is that they listen to the signal and take it into (automatic) consideration, but they do not promise to always respect it.
There can be many reasons why they choose to keep indexing a page that has META robots NOINDEX; sometimes it's because so many pages link to it that they judge it important to keep. There could also be other (more or less valid) reasons. But the fact, and what we have to understand, is that it is not a directive and not a guaranteed way to remove a page from Google's index.
The XML sitemap has similar problems. It is again just a signal: basically a list of URLs we "suggest" Google should crawl. They often follow that suggestion, but again it's not a directive, so we cannot be sure. Likewise, if we remove a URL from the sitemap, there is absolutely no guarantee it will be removed from the index.
Robots.txt, on the other hand, is a directive. Google promises to respect it (just as all other bots and agents should). So if we exclude a file or directory from crawling in the robots.txt file, Google will not crawl it.
However, you need to understand that crawling and indexing are not the same. Google quite often includes URLs in its index that it has not crawled; some estimates have suggested that up to 1/3 of Google's index is un-crawled URLs. Robots.txt does not prevent indexing, it only prevents crawling.
In reality, of course, a page that is not crawled will usually not rank for anything, because the only keywords Google can associate it with are based on words in the URL and in pages linking to it.
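Python's standard library can illustrate the crawl-directive side of this. Note that this only answers "may this URL be crawled?"; it says nothing about indexing, which is exactly the distinction made above. The robots.txt rules here are a made-up example:

```python
from urllib import robotparser

# A made-up robots.txt that blocks one directory from crawling:
rules = [
    "User-agent: *",
    "Disallow: /collections/private/",
]

parser = robotparser.RobotFileParser()
parser.parse(rules)

# Blocked path: compliant bots will not crawl it (but Google could
# still index the bare URL if other pages link to it).
print(parser.can_fetch("*", "https://example.com/collections/private/secret"))  # False
# Unblocked path: crawling is allowed.
print(parser.can_fetch("*", "https://example.com/pages/about"))  # True
```

This is why robots.txt alone cannot substitute for NOINDEX: it controls crawling, not indexing.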
I too, would LOVE if shopify would extend the seo hidden field with the 3 additional variants! That should be a very easy fix and very, very useful!
Is there an update on this please? The suggestion by Jahodson3 would be an elegant implementation that is backwards compatible for existing merchants, and forward compatible for anyone using the alternative values of 2, 3, or 4. Please, please, please implement this.
Copied again below:
"You could continue to support this existing treatment when value=1 while decoupling by adding new values for the existing metafield namespace=seo, key=hidden such as:
value=1: Continue doing what you do today
value=2: Remove from sitemap and sitesearch only (and do NOT add a meta robots at all)
value=3: Remove from sitemap only
value=4: Remove from site search only
If we had the above then users can handle generating the meta robots the way they want it. It's the fact that removing a URL from the sitemap with the existing method has three other side effects (remove from search, add noindex, add nofollow) which is problematic."
We are currently facing exactly the same problem. Thanks to this post, I now understand that this is how Shopify behaves here. This is unfortunately not optimal for SEO, I definitely agree with the opinion of @jahodson3 and many others in this thread!
We have found the following workaround for us: we manipulate the content_for_header object. I warn you all: be careful with changes to this object.
Have a quick read of this section in the Shopify docs: https://shopify.dev/themes/architecture/layouts#content
This is how we manipulate the content_for_header object:
(The first example is what I want, but I could NOT get it to work; does anyone have suggestions?)
{% capture h_content %}
{{ content_for_header }}
{% endcapture %}
{{ h_content | remove: '<meta name="robots" content="noindex,nofollow">' }}
This workaround is not working. I found out that the matching of the equals sign "=" is not working! Does anyone have an idea why the equals sign fails in this case?
So I built this workaround for the workaround 😅 (it's working!):
{% capture h_content %}
{{ content_for_header }}
{% endcapture %}
{{ h_content | remove: ',nofollow' }}
When I just remove the ",nofollow" the outcome is: <meta name="robots" content="noindex">
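Just to show the string transformation the Liquid filter performs, here is the same thing in plain Python (this is not Shopify's actual rendering pipeline, only an illustration of the substring removal):

```python
def strip_nofollow(header_html):
    """Mimic Liquid's `remove: ',nofollow'` filter on the captured header."""
    return header_html.replace(",nofollow", "")

before = '<meta name="robots" content="noindex,nofollow">'
print(strip_nofollow(before))  # <meta name="robots" content="noindex">
```

Since "noindex" carries an implied "follow", the resulting tag is the SEO-safe combination this whole thread has been asking for.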
Which is fine for us, because we are using the Yoast Plugin. With this plugin we can configure each page/product/blog/... to "noindex,follow" and it will remove the page also from the sitemap. Yoast will create a meta robot tag like this: <meta name="robots" content="noindex, follow"> which is exactly what we want to accomplish.
Overall, this is how our <head> looks now after the above change:
[..]
<!-- This site is optimized with Yoast SEO for Shopify -->
<meta name="robots" content="noindex, follow">
[..]
<!--/ Yoast SEO -->
[..]
<!-- START Content from manipulated content_for_header object -->
<meta name="robots" content="noindex">
[..]
<!-- END Content from manipulated content_for_header object -->
[..]
Unfortunately, we now have the meta robots tag in there twice, but we no longer have a conflicting signal.
I hope this helps others and maybe someone has a solution for my first example? Thanks already to all!
All you need to do now is figure out how to turn off Yoast generating a meta robots tag. Your workaround is generating the correct value ("noindex" which includes an implied "follow").
@jahodson3 wrote:All you need to do now is figure out how to turn off Yoast generating a meta robots tag. Your workaround is generating the correct value ("noindex" which includes an implied "follow").
That is correct, but we want to keep Yoast anyway. We are adding a lot of rich snippets with it. Also, we use it to put noindex on some pages, because it is handy to do it via Yoast.