We keep noticing that we are missing some of the orders we try to collect via the Admin Orders API. We use the updated_at_min field to get all new, modified, updated, or refunded orders.
We are also noticing that some orders appear to be created or updated with a timestamp in the past.
For example, if I query all orders between timestamp_1 and the current time (call it timestamp_2) using the updated_at_min field, I get n orders. But when I run the same query again (orders between timestamp_1 and timestamp_2) 2 or 6 hours later, I get n+m orders.
So my questions:
1) Can any application control the created_at/updated_at fields when creating or modifying orders?
2) Does Shopify allow orders to be created or updated with a backdated timestamp?
3) Is it possible that some orders are delayed before they become available through the API?
We are basically missing orders in our ETLs.
Our ETL process runs every 10 minutes like this:
1) bookmark_datetime = bookmark_datetime - 30 seconds
2) Get all the orders with updated_at_min=bookmark_datetime
3) Sort the response by the updated_at field and set bookmark_datetime to the updated_at of the last order (bookmark_datetime = latest order updated_at datetime)
4) Go to step 1 after every 10 minutes
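For anyone trying to reproduce this, one cycle of the flow above could be sketched roughly as below. This is only a sketch: `run_cycle`, `fetch_orders`, and the in-memory `FAKE_STORE` are hypothetical names used for illustration, and `fetch_orders` stands in for whatever call returns the orders for a given updated_at_min.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Iterable, List, Tuple

Order = dict  # assumed shape: {"id": ..., "updated_at": "<ISO 8601 with offset>"}


def updated_at(order: Order) -> datetime:
    # Parse the order's updated_at string into a timezone-aware datetime.
    return datetime.fromisoformat(order["updated_at"])


def run_cycle(
    bookmark: datetime,
    fetch_orders: Callable[[datetime], Iterable[Order]],  # hypothetical fetcher
    lookback: timedelta = timedelta(seconds=30),
) -> Tuple[List[Order], datetime]:
    """Run one ETL cycle and return (orders, new_bookmark)."""
    # Step 1: rewind the bookmark by the lookback interval.
    updated_at_min = bookmark - lookback
    # Step 2: get everything updated at or after that point, sorted by updated_at.
    orders = sorted(fetch_orders(updated_at_min), key=updated_at)
    # Step 3: move the bookmark to the latest updated_at seen, if anything came back.
    if orders:
        bookmark = updated_at(orders[-1])
    return orders, bookmark


# In-memory stand-in for the API, for illustration only.
FAKE_STORE = [
    {"id": 1, "updated_at": "2019-09-13T07:29:06-04:00"},
    {"id": 2, "updated_at": "2019-09-13T07:31:00-04:00"},
]


def fake_fetch(updated_at_min: datetime) -> List[Order]:
    return [o for o in FAKE_STORE if updated_at(o) >= updated_at_min]


start = datetime(2019, 9, 13, 11, 29, 30, tzinfo=timezone.utc)
orders, new_bookmark = run_cycle(start, fake_fetch)
```

Note that with this design, any order that only becomes visible after the bookmark has moved more than `lookback` past its updated_at is never fetched again.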
But we sometimes miss orders in this simple flow. The only possible reason I can think of is that Shopify is creating or updating orders with backdated timestamps.
Please help me here. If we miss orders, it throws off all our order, revenue, and profit reports.
To answer your questions directly:
1. created_at and updated_at are immutable, but apps and Shopify internal services can indirectly modify updated_at by modifying the resource itself.
2. I don't completely understand this question. Are you asking whether we or others directly modify the timestamp? If so, see 1.
3. I've never seen a delay longer than a few seconds in order serialization before it can be accessed via the API, and those were what I would consider to be edge cases.
I'd like to better understand your situation. Are you saying that if you make the same GET request to the orders endpoint with the exact same updated_at_min and updated_at_max values (assuming updated_at_max was initially the current time) at Time + 0hr, 2hr, 6hr, etc., you get a growing list of orders back?
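One way to check this would be to save the response for a fixed window and compare it with a later response for the same window. A minimal sketch, assuming each order is a dict with an `id` key; `late_arrivals` is a hypothetical helper name, not part of any Shopify library:

```python
from typing import Iterable, List


def late_arrivals(first_run: Iterable[dict], later_run: Iterable[dict]) -> List[dict]:
    """Return orders present in the later response but absent from the first."""
    seen_ids = {order["id"] for order in first_run}
    return [order for order in later_run if order["id"] not in seen_ids]


# Illustrative data only: the same window queried at Time + 0hr and Time + 2hr.
at_0hr = [{"id": 101}, {"id": 102}]
at_2hr = [{"id": 101}, {"id": 102}, {"id": 103}]
missing_at_first = late_arrivals(at_0hr, at_2hr)
```

If `missing_at_first` is non-empty for an identical window, those are the orders that showed up late.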
We also found that we are missing orders when using the updated_at_min query parameter.
We do rolling, overlapping windows: every hour we request orders updated during the last hour + 15 minutes. We also check for:
- status = open;
- payment status = paid;
- fulfillment status = unfulfilled;
- test = false;
- confirmed = true.
And that sometimes, somehow, makes us miss orders. So we go back in and request all orders updated since last week, and lo and behold, now there are orders from 3 days ago that weren't available yesterday.
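For reference, a request like the one described might be put together as below. This is a sketch under assumptions: the query parameter names (`status`, `financial_status`, `fulfillment_status`, `updated_at_min`) are the ones I recall from the REST Admin API orders endpoint, and I am assuming `test` and `confirmed` are order fields checked on the client side after the response arrives. `build_window_params` and `keep` are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone


def build_window_params(now: datetime,
                        window: timedelta = timedelta(hours=1, minutes=15)) -> dict:
    """Query parameters for 'orders updated during the last hour + 15 minutes'."""
    return {
        "updated_at_min": (now - window).isoformat(),
        "status": "open",                     # assumed parameter name
        "financial_status": "paid",           # assumed parameter name
        "fulfillment_status": "unfulfilled",  # assumed parameter name
    }


def keep(order: dict) -> bool:
    # Assumed to be checked client-side: real (non-test), confirmed orders only.
    return order.get("test") is False and order.get("confirmed") is True


now = datetime(2019, 9, 13, 11, 30, 32, tzinfo=timezone.utc)
params = build_window_params(now)
```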
Which makes us ponder and hypothesize (because we aren't insiders on the API) whether there's any updating happening that doesn't affect the updated_at date. In which case checking for the updated_at_min query parameter is useless at best, and misleading at worst.
I found a gap in Shopify's order publication. Shopify API devs: please note. I will share this in a post by itself, too.
One of our orders was created and processed at 2019-09-13T07:29:00-04:00.
It got updated at 2019-09-13T07:29:06-04:00.
At 2019-09-13T11:30:32.920Z (a little under a minute and a half later) we submitted an hourly request for orders, this time for those updated since 2019-09-13T06:15:42-04:00.
You'd think the response from Shopify would include that order.
About 16 minutes later, at 2019-09-13T11:46:05.495174Z, we performed a daily audit and requested orders updated since 2019-09-12T07:47:13.013-04:00.
That did include the order updated at 2019-09-13T07:29:06-04:00.
Then an hour later, at 2019-09-13T12:30:32.954Z, we submitted another hourly request, this time for orders updated since 2019-09-13T07:15:38-04:00. Note that we extend the updated_at_min window by 15 minutes into the past, just to catch orders missed previously.
That also included the order updated at 2019-09-13T07:29:06-04:00.
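Therefore the Orders API has a gap when publishing orders. Measured from the order's updated_at, it is longer than about a minute and a half (the hourly request that missed the order) and shorter than about 17 minutes (the daily audit that caught it).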
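The bounds can be checked directly from the timestamps quoted above, measuring from the order's updated_at. The small `parse` helper is just there to convert the trailing `Z` into an explicit UTC offset before parsing.

```python
from datetime import datetime


def parse(ts: str) -> datetime:
    # Convert a trailing "Z" to "+00:00" so fromisoformat accepts it.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


order_updated_at = parse("2019-09-13T07:29:06-04:00")
hourly_request = parse("2019-09-13T11:30:32.920Z")
hourly_updated_at_min = parse("2019-09-13T06:15:42-04:00")
daily_audit = parse("2019-09-13T11:46:05.495174Z")

# The order's updated_at fell inside the hourly window, yet it was not returned.
in_hourly_window = hourly_updated_at_min <= order_updated_at <= hourly_request

lower_bound = hourly_request - order_updated_at  # order not yet visible
upper_bound = daily_audit - order_updated_at     # order visible by now

print(in_hourly_window)  # True
print(lower_bound)       # 0:01:26.920000
print(upper_bound)       # 0:16:59.495174
```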
I advise everyone to set their updated_at_min at least 15 minutes earlier than the last time they downloaded orders.
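Overlapping windows mean the same order comes back more than once, so the results need to be de-duplicated. A minimal sketch of one way to do that, assuming each order has an `id` and an ISO 8601 `updated_at`; `merge_orders` is a hypothetical helper, and keeping the most recently updated copy is my own choice here, not something Shopify prescribes:

```python
from datetime import datetime
from typing import Dict, Iterable


def merge_orders(store: Dict[int, dict], batch: Iterable[dict]) -> Dict[int, dict]:
    """Upsert a batch into store (keyed by order id), keeping the newest copy."""
    for order in batch:
        current = store.get(order["id"])
        if current is None or (
            datetime.fromisoformat(order["updated_at"])
            >= datetime.fromisoformat(current["updated_at"])
        ):
            store[order["id"]] = order
    return store


# Two overlapping windows that both contain order 1 (illustrative data only).
store: Dict[int, dict] = {}
merge_orders(store, [{"id": 1, "updated_at": "2019-09-13T07:29:00-04:00"}])
merge_orders(store, [
    {"id": 1, "updated_at": "2019-09-13T07:29:06-04:00"},
    {"id": 2, "updated_at": "2019-09-13T07:40:00-04:00"},
])
```

After both calls, `store` holds one entry per order id, with order 1 at its later updated_at.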
I am revisiting this question since our company uses a very similar flow and we experience the same issues with missing orders.
Has anything been clarified regarding when an order is "visible" for an orders request using updated_at_min? And are there some changes that may lead to updated_at not being changed on the order?
I don't recall having seen a clarification. My advice stands: extend your window by 15 minutes into the past.