Bulk CSV download of product catalog for users

Topic summary

A Shopify Plus store with nearly 4 million B2B SKUs needs a way for customers to download filtered product catalogs as CSV files (e.g., “all red t-shirts from Brand X”). The main challenge is Shopify’s limit of one concurrent bulk operation per store, with each operation taking roughly 16 minutes per million SKUs.

Proposed Solutions:

  • Mirror the catalog locally: Set up webhooks to sync product data into a private database (Postgres, Elasticsearch, BigQuery) for instant filtering without hitting Shopify API limits
  • Queue system: Implement job queues (BullMQ, Laravel Queues) to process export requests sequentially, ensuring only one bulk operation runs at a time
  • Backend workflow: When customers submit filters, enqueue a background job that queries Shopify’s GraphQL Bulk API, downloads the JSONL result, applies filters server-side, and generates a CSV
  • Delivery: Upload finished CSVs to cloud storage (S3) and notify customers via email or dashboard link
  • Reserve bulk operations: Use Shopify’s Bulk API only for edge cases or fields not mirrored locally; handle most exports through the local database for speed

The discussion remains open with multiple architectural approaches suggested but no final implementation chosen.


I have a client on Shopify Plus with a very large product catalog, nearly 4 million SKUs (it's a B2B website).

I am looking to provide a way for customers to create a custom downloadable CSV product list based on filters of their choice; for example, they might want to download a CSV of all red t-shirts from a specific brand.

Due to the size of this catalog, I can see potential issues. I am guessing we would need to use GraphQL bulk operations. However, I see that only one bulk operation can run at any given moment, so these exports would likely need to be queued; based on the Admin API limits, it would take roughly 16 minutes to process 1 million SKUs this way.

Any input on how to best go about tackling this challenge?

Hi @m4doyle,

I would like to outline a potential solution for enabling B2B customers to download filtered product CSV files from your Shopify Plus catalog of approximately 4 million SKUs.

The proposed approach involves the following key components:

Customer Interface (Frontend): A straightforward form, embedded as an app or hosted independently, would let customers specify product filters such as brand, color, and category.
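For illustration, the submit step could be as small as the following sketch; the /api/exports route and the filter field names are assumptions to be matched to the backend, not an existing API:

```typescript
// Minimal sketch of the frontend submit step. The /api/exports route and
// the filter field names are assumptions to be matched to the backend.
interface ExportFilters {
  brand?: string;
  color?: string;
  category?: string;
}

async function submitExportRequest(filters: ExportFilters): Promise<void> {
  const res = await fetch("/api/exports", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(filters),
  });
  if (!res.ok) throw new Error(`Export request failed: ${res.status}`);
  // On success, show a "your export is being prepared" notice to the user.
}
```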

Backend Service (Node.js or Laravel): Upon form submission, a background job would be enqueued, containing the selected filter parameters. If no other bulk operation is currently in progress, this job would initiate a Shopify Admin GraphQL Bulk API product query, retrieving only the necessary fields.
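As a rough sketch of that kickoff (the shop domain, API version, and token env var are placeholders, and the field list is illustrative):

```typescript
// Minimal sketch: start a Shopify Admin GraphQL bulk product query.
// SHOP_DOMAIN, API_VERSION, and the token env var are placeholders.
const SHOP_DOMAIN = "your-store.myshopify.com";
const API_VERSION = "2024-10";
const ADMIN_TOKEN = process.env.SHOPIFY_ADMIN_TOKEN!;

const BULK_QUERY = `
mutation {
  bulkOperationRunQuery(
    query: """
    {
      products {
        edges {
          node {
            id
            title
            vendor
            variants {
              edges { node { sku selectedOptions { name value } } }
            }
          }
        }
      }
    }
    """
  ) {
    bulkOperation { id status }
    userErrors { field message }
  }
}`;

async function startBulkProductExport(): Promise<string> {
  const res = await fetch(
    `https://${SHOP_DOMAIN}/admin/api/${API_VERSION}/graphql.json`,
    {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "X-Shopify-Access-Token": ADMIN_TOKEN,
      },
      body: JSON.stringify({ query: BULK_QUERY }),
    }
  );
  const { bulkOperation, userErrors } = (await res.json()).data
    .bulkOperationRunQuery;
  if (userErrors.length) throw new Error(JSON.stringify(userErrors));
  return bulkOperation.id; // gid://shopify/BulkOperation/...
}
```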

Queue and Job Handling: A job queue system, such as BullMQ or Laravel Queues, would manage these tasks. To comply with Shopify’s limitation of a single concurrent bulk operation per store, we would ensure that only one bulk job runs at any given time. Once the Shopify bulk job concludes, as indicated by a webhook notification, the resulting JSONL file would be downloaded.
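A minimal BullMQ sketch of that constraint, assuming a local Redis; waitForBulkOperation is a hypothetical helper that resolves once Shopify reports completion:

```typescript
import { Queue, Worker } from "bullmq";

// Minimal sketch: one export queue whose worker processes a single job at
// a time, mirroring Shopify's one-bulk-operation-per-store limit.
const connection = { host: "127.0.0.1", port: 6379 }; // assumed local Redis

export const exportQueue = new Queue("catalog-exports", { connection });

// Hypothetical helper: resolves with the JSONL URL once Shopify reports
// completion (via the bulk_operations/finish webhook or by polling).
declare function waitForBulkOperation(opId: string): Promise<string>;
// From the kickoff sketch above.
declare function startBulkProductExport(): Promise<string>;

new Worker(
  "catalog-exports",
  async (job) => {
    const { filters } = job.data; // the customer's filter parameters
    const opId = await startBulkProductExport();
    const jsonlUrl = await waitForBulkOperation(opId);
    // ...download the JSONL, apply `filters`, and build the CSV (next step)
    return jsonlUrl;
  },
  { connection, concurrency: 1 } // never more than one bulk op in flight
);
```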

Data Filtering and CSV Generation: The downloaded JSONL file would then be parsed server-side, and the user’s specified filters (e.g., color, brand) would be applied programmatically. Subsequently, a CSV file would be generated using a suitable library like fast-csv or csv-writer.
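A sketch of that step using fast-csv. One caveat: bulk-operation JSONL flattens nested connections (variant lines carry a __parentId pointing at their product), so real code would join those lines first; this simplified version assumes pre-joined rows:

```typescript
import * as fs from "node:fs";
import * as readline from "node:readline";
import { format } from "fast-csv";

// Minimal sketch: stream the downloaded JSONL line by line, keep rows that
// match the customer's filters, and write them out as CSV. The filter
// shape and row fields are assumptions tied to the bulk query above.
interface Filters {
  vendor?: string;
}

async function jsonlToFilteredCsv(
  jsonlPath: string,
  csvPath: string,
  filters: Filters
): Promise<void> {
  const csv = format({ headers: true });
  csv.pipe(fs.createWriteStream(csvPath));

  const rl = readline.createInterface({
    input: fs.createReadStream(jsonlPath),
    crlfDelay: Infinity,
  });

  for await (const line of rl) {
    if (!line.trim()) continue;
    const row = JSON.parse(line);
    // Variant lines carry __parentId instead of product fields; a real
    // implementation would join them to their parent product first.
    if (filters.vendor && row.vendor !== filters.vendor) continue;
    csv.write({ id: row.id, title: row.title, vendor: row.vendor });
  }
  csv.end();
}
```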

Storage and Notification: The generated CSV file would be uploaded to a storage solution such as Amazon S3. Following this, the user would be notified via email with a direct download link, or the link would be made accessible through a user dashboard.
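A sketch of the upload-and-link step with the AWS SDK v3; the bucket name and the 24-hour expiry are illustrative choices:

```typescript
import * as fs from "node:fs";
import {
  S3Client,
  PutObjectCommand,
  GetObjectCommand,
} from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

// Minimal sketch: push the finished CSV to S3 and mint a time-limited
// download link to send to the customer. Bucket and expiry are examples.
const s3 = new S3Client({ region: "us-east-1" });
const BUCKET = "catalog-exports"; // assumed bucket name

async function uploadAndLink(csvPath: string, key: string): Promise<string> {
  await s3.send(
    new PutObjectCommand({
      Bucket: BUCKET,
      Key: key, // e.g. "exports/job-123.csv"
      // Buffered for simplicity; use @aws-sdk/lib-storage's Upload to
      // stream very large files instead.
      Body: fs.readFileSync(csvPath),
      ContentType: "text/csv",
    })
  );
  // Signed URL valid for 24 hours; tune to your retention policy.
  return getSignedUrl(s3, new GetObjectCommand({ Bucket: BUCKET, Key: key }), {
    expiresIn: 60 * 60 * 24,
  });
}
```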

Please let me know if you need more information.

Thanks

Hello @m4doyle

Handling 4 million SKUs for on-the-fly CSV exports definitely pushes Shopify’s APIs to their limits. The Bulk Operations API can help, but you’ll want to layer in queuing, segmentation, and storage so you’re not tying up the store’s single Bulk-operation slot for half an hour at a time. Here’s what to do next:

1. Offload your catalog into your own queryable store
Rather than hammering Shopify every time, set up a private app that listens to product/update webhooks and mirrors relevant fields (title, vendor, options, tags, etc.) into a database or search index (Postgres, Elasticsearch, or even BigQuery).

  • That gives you sub-second filtering on “red t-shirts from Brand X” without touching Shopify API limits; a rough sketch of the webhook handler follows.
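Something like this, assuming Express and Postgres; the route path, table schema, and env vars are placeholders:

```typescript
import express from "express";
import crypto from "node:crypto";
import { Pool } from "pg";

// Minimal sketch: verify and ingest products/update webhooks, upserting
// just the fields you filter on. Table schema and env vars are assumptions.
const app = express();
const db = new Pool(); // reads PG* env vars
const WEBHOOK_SECRET = process.env.SHOPIFY_WEBHOOK_SECRET!;

app.post(
  "/webhooks/products-update",
  express.raw({ type: "application/json" }), // raw body needed for HMAC check
  async (req, res) => {
    // Verify the webhook really came from Shopify.
    const digest = crypto
      .createHmac("sha256", WEBHOOK_SECRET)
      .update(req.body)
      .digest("base64");
    if (digest !== req.get("X-Shopify-Hmac-Sha256")) {
      res.sendStatus(401);
      return;
    }

    const p = JSON.parse(req.body.toString());
    await db.query(
      `INSERT INTO products (id, title, vendor, product_type, tags)
       VALUES ($1, $2, $3, $4, $5)
       ON CONFLICT (id) DO UPDATE SET
         title = EXCLUDED.title,
         vendor = EXCLUDED.vendor,
         product_type = EXCLUDED.product_type,
         tags = EXCLUDED.tags`,
      [p.id, p.title, p.vendor, p.product_type, p.tags]
    );
    res.sendStatus(200); // ack fast so Shopify doesn't retry
  }
);
```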

2. Build a CSV-export endpoint
Create an authenticated route in your app where customers submit their filter criteria (e.g., vendor, tag, option values). Your backend then:

  • Queries your local store for matching SKUs,
  • Streams a CSV directly to the user (or uploads to S3 and emails a link), as sketched below.
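A sketch of that route, assuming Express, the mirrored Postgres table from step 1, and fast-csv; the auth middleware and column names are assumptions:

```typescript
import express from "express";
import { Pool } from "pg";
import { format } from "fast-csv";

// Minimal sketch: filter the mirrored catalog in Postgres and stream CSV
// straight to the response, so large result sets never sit in memory.
// Auth middleware and column names are assumptions.
const app = express();
const db = new Pool();

app.get("/exports/products.csv", async (req, res) => {
  const { vendor, tag } = req.query as { vendor?: string; tag?: string };

  res.setHeader("Content-Type", "text/csv");
  res.setHeader("Content-Disposition", 'attachment; filename="products.csv"');

  const csv = format({ headers: true });
  csv.pipe(res);

  // For millions of rows, use pg-cursor or pg-query-stream instead of
  // buffering the whole result; plain query() is shown for brevity.
  const { rows } = await db.query(
    `SELECT id, title, vendor, product_type
       FROM products
      WHERE ($1::text IS NULL OR vendor = $1)
        AND ($2::text IS NULL OR $2 = ANY(string_to_array(tags, ', ')))`,
    [vendor ?? null, tag ?? null]
  );
  for (const row of rows) csv.write(row);
  csv.end();
});
```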

3. Use Bulk Operations only for one-off Shopify data
Reserve the GraphQL Bulk query for edge cases, like pulling in fields you don’t mirror locally. If you do kick off a Bulk job:

  • Segment it by filter (run separate bulk ops for each vendor or tag group),
  • Poll the job status server-side (see the sketch after this list),
  • Download the JSONL result to your app’s storage when it’s ready, convert it to CSV, then link it to the customer.
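The polling step could look like this; the domain, version, and token are placeholders, and the 10-second interval is an arbitrary choice:

```typescript
// Minimal sketch of the polling step. Domain, version, and token are
// placeholders; the 10-second interval is arbitrary.
const SHOP_DOMAIN = "your-store.myshopify.com";
const API_VERSION = "2024-10";
const ADMIN_TOKEN = process.env.SHOPIFY_ADMIN_TOKEN!;

const STATUS_QUERY = `{
  currentBulkOperation { id status errorCode url }
}`;

async function waitForBulkOperation(opId: string): Promise<string> {
  for (;;) {
    const res = await fetch(
      `https://${SHOP_DOMAIN}/admin/api/${API_VERSION}/graphql.json`,
      {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "X-Shopify-Access-Token": ADMIN_TOKEN,
        },
        body: JSON.stringify({ query: STATUS_QUERY }),
      }
    );
    const op = (await res.json()).data.currentBulkOperation;
    if (op?.id === opId && op.status === "COMPLETED") return op.url; // JSONL link
    if (op?.id === opId && op.status === "FAILED") throw new Error(op.errorCode);
    await new Promise((r) => setTimeout(r, 10_000)); // poll every 10s
  }
}
```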

4. Queue and throttle jobs
If you must rely on Bulk Operations directly for every export, implement a queue:

  • Enqueue each customer request,
  • Process them FIFO,
  • Notify the customer by email (or in-app message) when their CSV is ready.

This FIFO approach avoids collisions, since Shopify only allows one active Bulk job per store. A sketch of the enqueue-and-notify side follows.
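One way to wire this up is BullMQ (my choice here, not prescribed above); sendEmail is a hypothetical stand-in for whatever mailer you use:

```typescript
import { Queue, QueueEvents } from "bullmq";

// Minimal sketch: FIFO enqueueing plus a completion listener that notifies
// the customer. sendEmail is a hypothetical stand-in for your mailer.
declare function sendEmail(to: string, body: string): Promise<void>;

const connection = { host: "127.0.0.1", port: 6379 }; // assumed local Redis
const exportQueue = new Queue("catalog-exports", { connection });
const events = new QueueEvents("catalog-exports", { connection });

export async function requestExport(email: string, filters: object) {
  // Jobs are picked up in insertion order by a concurrency-1 worker, so
  // only one Shopify bulk operation is ever active.
  await exportQueue.add("export", { email, filters });
}

events.on("completed", async ({ jobId, returnvalue }) => {
  const job = await exportQueue.getJob(jobId);
  if (!job) return;
  // Assumes the worker returns the finished CSV's download URL.
  await sendEmail(job.data.email, `Your CSV export is ready: ${returnvalue}`);
});
```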

5. Provide a customer-friendly interface
In your storefront or B2B portal, let customers pick filters and hit “Generate CSV.” Under the hood, you enqueue the job and show them a “Your export is being prepared—check back in X minutes” message (or send them a link later).