ChatGPT + Shopify: querying with privacy

Solved

marie-lav
Visitor

AI chatbots like ChatGPT are rapidly changing how we work with data. I'd love to hear how you use them day to day. I run a Shopify store with a lot of order data and I'd like to tap into ChatGPT for deeper insights, but privacy is a top priority for me. I don't want to share customer names, emails, or order IDs with any external model.

 

How do you keep data safe while still giving GPT enough context to be useful? Do you sanitize or aggregate the data first, or do you run a masking script on the fly? Any advice or workflow suggestions would be greatly appreciated.

 

Thanks

Accepted Solution (1)

BenoitDomingue
Shopify Partner

This is an accepted solution.

You can send a cleaned or partially masked CSV to GPT by removing columns like names or emails and hashing customer IDs. It works, but once you have lots of columns or thousands of rows it quickly becomes messy, and you'll need to re-clean and re-upload the file every time you want fresh insights, which is hard to keep up on a weekly basis.
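
As a rough illustration of the manual approach, here is a minimal Python sketch that drops sensitive columns and replaces a customer ID with a salted hash. The column names, the `SALT` value, and the helper names `hash_id` and `clean_csv` are assumptions made up for this example; a real Shopify export will have different headers.

```python
import csv
import hashlib
import io

# Hypothetical column names; a real Shopify export may differ.
SENSITIVE_COLUMNS = {"Name", "Email", "Billing Name"}
ID_COLUMN = "Customer ID"
SALT = "replace-with-your-own-secret"  # assumption: kept private, never uploaded


def hash_id(value: str) -> str:
    """Return a short salted SHA-256 digest to use in place of the raw ID."""
    digest = hashlib.sha256((SALT + value).encode("utf-8")).hexdigest()
    return digest[:12]


def clean_csv(raw_text: str) -> str:
    """Drop sensitive columns and hash the ID column of a CSV string."""
    reader = csv.DictReader(io.StringIO(raw_text))
    kept = [c for c in reader.fieldnames if c not in SENSITIVE_COLUMNS]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=kept, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        cleaned = {c: row[c] for c in kept}
        if ID_COLUMN in cleaned:
            cleaned[ID_COLUMN] = hash_id(cleaned[ID_COLUMN])
        writer.writerow(cleaned)
    return out.getvalue()


sample = (
    "Name,Email,Customer ID,Total\n"
    "#1001,ann@example.com,501,19.99\n"
    "#1002,bob@example.com,502,5.00\n"
)
cleaned_text = clean_csv(sample)
print(cleaned_text)
```

The same customer always maps to the same hashed ID, so repeat purchases can still be grouped, but the script has to be rerun on every new export.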


Another option is using a Shopify app like Kuma. It connects directly to your store, runs queries against your data without actually sending it anywhere, and only passes anonymized parameters and aggregates to GPT. No sensitive data leaves the store, and you don’t have to clean or upload anything manually.

Replies 3 (3)

websensepro
Shopify Partner

Hi @marie-lav,

 

Great question. To protect privacy while using ChatGPT, we first sanitize the data by removing or masking personal details like names, emails, and order IDs, replacing them with generic placeholders such as Customer_001. When possible, we use aggregated data instead of raw records, so ChatGPT works with summaries like top-selling products or return trends. For efficiency and safety, we can also set up a script that automatically cleans your Shopify exports before any analysis. This way, you get valuable insights without exposing sensitive customer information.
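
To make the two ideas above concrete, here is a small Python sketch of placeholder masking and of aggregation. The order records, field names, and the helpers `mask_customers` and `units_by_product` are all hypothetical and only meant to show the shape of the approach.

```python
from collections import Counter

# Hypothetical order records; field names are illustrative only.
orders = [
    {"email": "ann@example.com", "product": "Mug", "quantity": 2},
    {"email": "bob@example.com", "product": "Tee", "quantity": 1},
    {"email": "ann@example.com", "product": "Tee", "quantity": 3},
]


def mask_customers(records):
    """Replace each distinct email with a stable placeholder like Customer_001."""
    placeholders = {}
    masked = []
    for record in records:
        email = record["email"]
        if email not in placeholders:
            placeholders[email] = f"Customer_{len(placeholders) + 1:03d}"
        masked.append({
            "customer": placeholders[email],
            "product": record["product"],
            "quantity": record["quantity"],
        })
    return masked


def units_by_product(records):
    """Aggregate units sold per product; no customer fields are included."""
    totals = Counter()
    for record in records:
        totals[record["product"]] += record["quantity"]
    return totals.most_common()


masked_orders = mask_customers(orders)
top_products = units_by_product(orders)
print(masked_orders)
print(top_products)
```

Only `masked_orders` or `top_products` would be pasted into a prompt; the placeholder mapping stays on your own machine.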

 


Jonathan-HA
Shopify Partner

I personally use Maple AI, which is privacy-focused and claims to offer end-to-end encryption, meaning they can't see your chats and can't use them for training. They even accept Bitcoin if you don't want to give them your credit card info.

 

It's probably not as good as ChatGPT, but it's been good enough for my use case, which is primarily software development. I believe their model is based on the open-source Llama model.

 

It's probably still a good idea to mask customer info whenever you upload data to an external service, though.

 

You could assign a non-sensitive ID to each record (e.g. a UUID) and use that as the new identifier. Another approach is to generate a random string known only to you (it acts a bit like a password), combine it with the sensitive value, apply a one-way hash function (e.g. MD5, though SHA-256 is a safer choice today), and use the result as the identifier. The reason for mixing in the random string before hashing is that a plain hash of an email or order ID can be reproduced by anyone who hashes the same value, so it could be used as a lookup to recover the original. With the secret included, the ID doesn't exist anywhere else.
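
Both options can be sketched in a few lines of Python. Note that this sketch swaps in HMAC-SHA-256 from the standard library in place of hashing "random string + data" with MD5; it is the same idea of a secret-keyed one-way hash, just with a construction designed for that purpose. `SECRET_KEY`, `keyed_id`, and `uuid_mapping` are names made up for this example.

```python
import hashlib
import hmac
import secrets
import uuid

# Assumption: generated once and stored privately, never uploaded with the data.
SECRET_KEY = secrets.token_bytes(32)


def keyed_id(sensitive_value: str, key: bytes = SECRET_KEY) -> str:
    """Derive a stable pseudonymous ID from a value using HMAC-SHA-256."""
    return hmac.new(key, sensitive_value.encode("utf-8"), hashlib.sha256).hexdigest()


def uuid_mapping(values):
    """Assign a random UUID to each distinct value; keep the mapping locally."""
    return {value: str(uuid.uuid4()) for value in set(values)}


emails = ["ann@example.com", "bob@example.com", "ann@example.com"]

# Option 1: random UUIDs, which need a locally stored lookup table.
mapping = uuid_mapping(emails)
uuid_ids = [mapping[e] for e in emails]

# Option 2: keyed hash, reproducible from the secret key alone.
hashed_ids = [keyed_id(e) for e in emails]

print(uuid_ids)
print(hashed_ids)
```

The UUID route requires you to keep the mapping table if you ever need to translate results back, while the keyed hash only requires keeping the key, since rerunning it on the same email yields the same ID.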
