I am worried about the robots.txt file for my client's website, which you can see here:
Is this blocking the blog and collection pages from being indexed by search engines?
Search engine spiders will crawl your full website to temporarily store your site pages for indexing. Generally speaking, most website owners are happy for search engines to crawl and index any page they want. However, there are cases where you don't want pages to be indexed.
For example, if you are developing a new website, it is generally best to prevent search engines from indexing it so that incomplete pages do not appear in search results. Website owners also sometimes need to stop search engines from indexing specific pages, because for many reasons they don't want every page indexed. And yes, robots.txt is blocking your blogs and pages from being indexed in Google search results.
This is an accepted solution.
@Eavesy, the robots.txt file is not blocking all collection and blog pages on that site from being indexed.
The screenshot shows a stock-standard Shopify robots.txt file. Any blog or collection URL that contains a "+" (plus) character will be blocked from crawling. The strings "%2B" and "%2b" are just the URL-encoded "+" symbol, so they mean the same thing.
While this isn't officially documented by Shopify, I would assume the reason Shopify added this config is to minimize their own server resource usage (at scale). It also helps a bit with unnecessary Googlebot crawl-budget consumption and infinite crawler traps. Most of the time (but not all the time), those faceted navigation pages are not very useful to searchers, so it's usually a good thing that they are not crawled/indexed anyway.
Here's an example of paths that would be allowed vs. blocked with the default Shopify robots.txt:
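As a rough sketch of how that matching plays out, here is a small Python check. It assumes the default file has rules of the form `Disallow: /collections/*+*` and `Disallow: /blogs/*+*`, plus the `%2B` and `%2b` variants. Those patterns are reconstructed from the description above rather than copied from the site's file, and the sample paths and the helper names (`pattern_to_regex`, `is_blocked`) are made up for illustration. Python's built-in `urllib.robotparser` does not handle `*` wildcards, so the sketch does its own matching.

```python
import re

# Assumed Disallow rules, reconstructed from the description in this thread.
# Check the live /robots.txt on the store for the real list.
DISALLOW_PATTERNS = [
    "/collections/*+*",
    "/collections/*%2B*",
    "/collections/*%2b*",
    "/blogs/*+*",
    "/blogs/*%2B*",
    "/blogs/*%2b*",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Without '$', the pattern only has to match a prefix of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


COMPILED = [pattern_to_regex(p) for p in DISALLOW_PATTERNS]


def is_blocked(path: str) -> bool:
    """Return True if any assumed Disallow rule matches the URL path."""
    return any(rx.match(path) for rx in COMPILED)


# Hypothetical sample paths, not taken from the site in question.
SAMPLE_PATHS = [
    "/collections/shoes",
    "/collections/shoes/red+leather",
    "/collections/shoes/red%2Bleather",
    "/blogs/news",
    "/blogs/news/tagged/sale%2bsummer",
]

for path in SAMPLE_PATHS:
    print(("blocked" if is_blocked(path) else "allowed"), path)
```

Under those assumptions, the plain collection and blog paths come back as allowed, and only the paths containing a literal or encoded "+" come back as blocked.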
This doesn't seem to be an issue at all for your site, because tag-based faceted/filtered links aren't even used internally on its collections or blogs.
It's not so much about resources, but the above is correct that there is no block on the collections or blogs. The others posting here are just wrong.
When the filtered-by-tag URLs are used, there's a chance that the tags could be shown in different orders (e.g. something+else or else+something) but still return and show the exact same content. There's an advantage to not indexing those, to reduce duplicate content risks. It doesn't stop your main collection being indexed.
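To make the duplicate-content point concrete, here is a minimal Python sketch. It assumes the tag filter is the last path segment, with tags joined by a literal "+". The helper name `canonical_tag_path` and the example paths are hypothetical.

```python
def canonical_tag_path(path: str) -> str:
    """Sort the '+'-separated tags in the last path segment.

    Two filtered URLs that differ only in tag order collapse to the
    same string, which shows they point at the same content.
    """
    head, _, last = path.rpartition("/")
    tags = sorted(last.split("+"))
    return head + "/" + "+".join(tags)


# Hypothetical filtered-collection paths with the same tags in both orders.
a = canonical_tag_path("/collections/shoes/something+else")
b = canonical_tag_path("/collections/shoes/else+something")
print(a == b)
```

Both orderings reduce to the same path, so a crawler that indexed each one would be storing the same page twice.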