Potential Security Issue: llms.txt Apps Scraping Draft and Unpublished Content

I’ve just reported three separate apps from the Shopify App Store to Shopify. Each of them was designed to generate an llms.txt file for use by large language models (LLMs) such as ChatGPT, Claude, and others.

llms.txt is a relatively new but fast-growing convention: a plain Markdown file served from a site's root that gives LLMs a curated, link-based summary of the site's content. As AI-driven tools become more integrated into search and discovery, many Shopify merchants are looking to optimise their stores for visibility on these platforms. However, this early-stage implementation may come with unintended risks.
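For anyone who hasn't seen one yet, a minimal llms.txt looks roughly like the snippet below. It is purely illustrative; the store name, URLs and descriptions are placeholders:

```
# Example Store

> Short description of the store, intended for LLMs rather than human visitors.

## Products

- [Example Product](https://example-store.com/products/example-product): one-line product description
- [Another Product](https://example-store.com/products/another-product): one-line product description

## Pages

- [Shipping & Returns](https://example-store.com/pages/shipping-returns): policy summary
```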

In my opinion, the issue I encountered is quite serious: each of the apps I tested scraped not only published store content, but also draft products, unpublished pages, and other non-public content.

This presents several risks, including:

Loss of confidentiality: sensitive product information or internal content could be exposed

Compromised data integrity: outdated or placeholder content may be indexed as live

Lack of control: unpublished materials are treated as public without consent or notice

If you’re currently using any app related to llms.txt or AI SEO, I strongly recommend reviewing what content is being included and whether the app offers proper settings to control or exclude sensitive data. None of the apps I tested provided any clear warnings or filtering options.
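For anyone who would rather generate the file themselves than trust a third-party app, below is a minimal sketch of the filtering I would expect such an app to do. It assumes a custom app with a Shopify Admin GraphQL API access token (the shop domain, token variable, API version and output filename are placeholders, not anything from a specific app), and it only writes out products that actually have a public Online Store URL, so drafts and unpublished products never end up in the file:

```typescript
// Sketch: build an llms.txt that only includes published products by querying
// the Shopify Admin GraphQL API with an explicit status filter.
// SHOP_DOMAIN, SHOPIFY_ADMIN_TOKEN and API_VERSION are placeholders.
import { writeFileSync } from "node:fs";

const SHOP_DOMAIN = "your-store.myshopify.com";            // placeholder
const ADMIN_TOKEN = process.env.SHOPIFY_ADMIN_TOKEN ?? ""; // placeholder
const API_VERSION = "2024-10";                              // placeholder

const QUERY = `
  query PublishedProducts($cursor: String) {
    products(first: 50, after: $cursor, query: "status:active") {
      pageInfo { hasNextPage endCursor }
      edges {
        node {
          title
          description
          onlineStoreUrl  # null when the product is not published to the Online Store
        }
      }
    }
  }
`;

async function fetchPublishedProducts() {
  const products: { title: string; url: string; description: string }[] = [];
  let cursor: string | null = null;

  do {
    const res = await fetch(
      `https://${SHOP_DOMAIN}/admin/api/${API_VERSION}/graphql.json`,
      {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "X-Shopify-Access-Token": ADMIN_TOKEN,
        },
        body: JSON.stringify({ query: QUERY, variables: { cursor } }),
      }
    );
    const { data } = await res.json();

    for (const { node } of data.products.edges) {
      // Skip anything without a public Online Store URL: drafts and
      // unpublished products never make it into the file.
      if (!node.onlineStoreUrl) continue;
      products.push({
        title: node.title,
        url: node.onlineStoreUrl,
        // Collapse whitespace so each entry stays on a single Markdown line.
        description: (node.description ?? "").replace(/\s+/g, " ").trim(),
      });
    }

    cursor = data.products.pageInfo.hasNextPage
      ? data.products.pageInfo.endCursor
      : null;
  } while (cursor);

  return products;
}

const products = await fetchPublishedProducts();
const lines = [
  "# Your Store Name",
  "",
  "## Products",
  ...products.map((p) => `- [${p.title}](${p.url}): ${p.description}`),
];
writeFileSync("llms.txt", lines.join("\n"), "utf8");
```

The same idea applies to pages, collections and blog posts: query the Admin API, but only write out items that are actually visible on the Online Store.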

If anyone knows of a reliable solution that handles this properly and respects unpublished content, I’d be grateful for a recommendation.
