Webhook reconciliation for large volume of data

Solved

ArneRademacker
Shopify Partner

Hi everyone,

We are currently developing an app for Shopify which relies on periodic status updates to function. We are considering implementing those status updates as webhooks. However, as webhook delivery is not guaranteed, we also need a reconciliation job.

The status changes that interest us are whether an order has been shipped, cancelled, or returned. There are reasonable webhooks for each of these, but the underlying resource for these events is unfortunately "orders", a large, constantly updating pool of information.

Our current reconciliation approach would see us

  1. Poll all events (for the status changes that interest us) since the last poll
  2. Take the order IDs from those events
  3. Poll all Orders with those IDs
  4. Store the retrieved orders in our own database

However, that means two consecutive requests per event type, each possibly paginated, and a fairly large volume of data to process. In other words, we would basically be querying the majority of orders several times over, once for each event type that we're interested in.
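For illustration, the four steps above could be sketched roughly as follows. This is only a sketch under stated assumptions: `fetch_events`, `fetch_orders`, and the `subject_id` field are hypothetical placeholders for whatever API client and response shape are actually used, and `store` stands in for our own database. Collecting the order IDs into a set across all event types before step 3 means each order is fetched at most once per run.

```python
from typing import Callable, Iterable


def chunked(ids: list, size: int) -> Iterable[list]:
    """Yield successive batches of at most `size` IDs."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def reconcile(
    fetch_events: Callable,   # hypothetical: returns events since a timestamp
    fetch_orders: Callable,   # hypothetical: returns orders for a list of IDs
    store: dict,              # stand-in for our own database
    since,
    batch_size: int = 50,
) -> int:
    # Step 1: poll all relevant events since the last poll.
    events = fetch_events(since)
    # Step 2: take the order IDs, de-duplicated across event types.
    # "subject_id" is an assumed field name for the order ID on an event.
    order_ids = sorted({event["subject_id"] for event in events})
    # Step 3: poll the orders in batches rather than one request per event.
    for batch in chunked(order_ids, batch_size):
        for order in fetch_orders(batch):
            # Step 4: store the retrieved order.
            store[order["id"]] = order
    return len(order_ids)
```

The return value is the number of distinct orders touched, which can be logged per run to see how much volume reconciliation actually handles.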

My questions:

  1. Is there a way to match the Event API responses with the Webhook API responses? Do they contain IDs that could be compared? This would allow us to query for events and simply discard those that we have received via webhook already, saving subsequent requests. Still not great, but it's a start.
  2. Is there a way to query for webhook events that have not been delivered, or did not return a 200? That would be optimal. In that case reconciliation would be a simple scheduled poll for missed events.
  3. What is the recommended strategy for webhook reconciliation beyond "just query everything all the time"? What's the point of webhooks if we're just querying the entirety of the resource anyway? I feel like I'm missing something here. The Shopify Docs seem a little vague on this point.
  4. Is there a better way to receive updates to individual fields of a resource? E.g. could I query for all orders that had their "return" status set within the last 24 hours? (Filtering on updated_at is discouraged by the Shopify Docs, as orders can update very frequently, including for internal technical reasons.)

We've gotten to the point of considering not using webhooks, as a reconciliation function at a regular interval would fulfill mostly the same requirements.

Cheers and thank you to all who would take the time to answer some questions 🙂

Accepted Solution (1)

Liam
Community Manager

This is an accepted solution.

Hi ArneRademacker, 

Great to hear you're working on this app project - I'm sure we'll be able to figure out the best process for receiving reliable notifications about the status changes of orders. I'll answer the questions below.

  1. Matching Event API responses with Webhook API responses: The Event API responses and Webhook API responses do contain IDs that can be compared. The Event API provides a stream of events, and the Webhook API delivers notifications based on those events. By comparing the IDs of the events received via the Event API with the data received through webhooks, you can determine if you have already processed a particular event and avoid duplicating the work.

  2. Querying for webhook events that were not delivered or did not return a 200: Currently, Shopify does not provide a built-in way to query for webhook events that were not successfully delivered or did not return a 200 status. Shopify assumes responsibility for delivering webhooks, and in most cases, retries failed deliveries automatically. If you require more robust delivery monitoring or missed event detection, you would need to build custom logic to track the delivery status of webhooks and handle missed events accordingly.

  3. Recommended strategy for webhook reconciliation: Implementing a reconciliation function at a regular interval is a common strategy to complement webhooks. This function would query the necessary resources, such as orders, and compare them with the data received through webhooks. The purpose of webhooks is to provide real-time event notifications, but a reconciliation process helps ensure data integrity and handles any missed events or discrepancies that may occur due to the asynchronous nature of webhooks.

  4. Receiving updates to individual fields of a resource: Shopify's API does not provide a direct way to receive updates for specific fields of a resource. The recommended approach is to query for the entire resource and filter the results based on the desired field changes or date ranges. While updated_at is discouraged due to frequent updates, you can still utilize it along with other filters to narrow down the results.
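As a rough sketch of the ID comparison in point 1 and the custom tracking mentioned in point 2: the webhook handler records a key for everything it has processed, and the reconciliation job discards polled events whose key is already known. The key used here, the order ID plus an event verb, is an assumption, as are the field names `subject_id` and `verb`; check which identifiers your webhook payloads and event responses really share before relying on it.

```python
def record_webhook(seen: set, order_id: int, verb: str) -> None:
    """Remember that a webhook for this order and status change was processed."""
    seen.add((order_id, verb))


def missed_events(polled_events: list, seen: set) -> list:
    """Return only the polled events that no webhook has covered yet.

    Assumes each polled event is a dict with "subject_id" (the order ID)
    and "verb" (the kind of status change); both names are hypothetical.
    """
    return [
        event
        for event in polled_events
        if (event["subject_id"], event["verb"]) not in seen
    ]


# Usage: one event arrived by webhook, the other was missed.
seen_keys = set()
record_webhook(seen_keys, 10, "fulfilled")
polled = [
    {"subject_id": 10, "verb": "fulfilled"},
    {"subject_id": 11, "verb": "cancelled"},
]
to_fetch = missed_events(polled, seen_keys)
```

Only the orders behind `to_fetch` then need a follow-up request. In a real app the `seen` set would live in a database table rather than in memory.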

Considering the trade-offs and complexities involved in reconciling webhooks, it's important to weigh the benefits against the overhead of managing the reconciliation process. If you find that the reconciliation function at regular intervals meets your requirements and offers a simpler implementation, it might be a better alternative to relying solely on webhooks.
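If you do go with a scheduled reconciliation job, the filtering described in point 4 might look something like the sketch below. It assumes the job queries orders by an updated_at window that slightly overlaps the previous run, then keeps only the orders whose tracked status differs from what is already stored. The `status` field and the labels in `TRACKED` are hypothetical simplifications, not actual order fields.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical status labels for the changes of interest.
TRACKED = ("shipped", "cancelled", "returned")


def poll_window(last_run: datetime, now: datetime,
                overlap: timedelta = timedelta(minutes=5)):
    """Return (start, end) for the next poll, overlapping the previous run
    so that updates landing around the boundary are not skipped."""
    return last_run - overlap, now


def status_changes(fetched_orders: list, known_status: dict) -> dict:
    """Map order ID to new status for orders whose tracked status changed.

    `known_status` is what our own database currently holds per order ID.
    Orders updated for unrelated reasons are filtered out here.
    """
    changes = {}
    for order in fetched_orders:
        status = order.get("status")
        if status in TRACKED and known_status.get(order["id"]) != status:
            changes[order["id"]] = status
    return changes


# Usage: order 2 changed for an unrelated reason, order 3 is already known.
start, end = poll_window(
    datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    datetime(2024, 1, 1, 13, 0, tzinfo=timezone.utc),
)
fetched = [
    {"id": 1, "status": "shipped"},
    {"id": 2, "status": "open"},
    {"id": 3, "status": "returned"},
]
changed = status_changes(fetched, {3: "returned"})
```

Because the comparison is against stored state, re-reading an order inside the overlap window is harmless: it simply produces no change.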

Hope this helps!

Liam | Developer Advocate @ Shopify 

ArneRademacker
Shopify Partner

Hi Liam,

Thank you for your swift response!

Your answers are very helpful, I will add them to our infrastructure planning 🙂

As this does a good job of answering what I asked, I will mark it as accepted, even though I didn't present a singular problem.

Cheers!