Hello,
lately we've been experiencing a large number of timeouts from the Shopify Admin/REST API. This has been going on for a while now, but we hadn't noticed because the retry logic handled it (too) silently. The timeouts happen while trying to update products; they occur intermittently and across various API keys. In a 12h window, out of 183,955 requests we got 27,727 timeouts.
The requests are made from Cloud Functions on GCP using https://github.com/MONEI/Shopify-api-node. We just updated to the latest version, and it is still happening.
So far we have neither been able to discern a pattern nor been able to reproduce it reliably. During the 12h window there were also around 50 Internal Server Errors interspersed between the timeouts.
If someone could take a look at the logs and check how it looks from the server's side, that would be really helpful.
Cheers
Will
bump
It feels like our calls are being throttled by a load balancer or something on Shopify's end. Can someone please check the logs? Bump.
bump
Still ongoing.
Did you ever get this resolved? I am seeing the exact same thing, using Shopify's own GraphQL client.
I can repro by hitting the same endpoint from the server via a curl command on the command line. Random reply times, and often just a timeout.
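For reference, a loop roughly like this (using my own store's public collections endpoint, with an arbitrary 30-second cap) makes the variance easy to see:
# Hit the same endpoint repeatedly and print a timing line per request.
# The URL and the 30-second cap are just examples -- adjust for your own setup.
# A timeout shows up as code=000.
for i in $(seq 1 50); do
  curl -s -o /dev/null --max-time 30 \
       -w 'code=%{http_code} connect=%{time_connect}s total=%{time_total}s\n' \
       https://shapingnewtomorrow.myshopify.com/collections.json
done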
Hey. No, didn't get it resolved. I've scheduled some time in the coming weeks to revisit this, but we may be forced to migrate to https://medusajs.com/ if I can't resolve it. Support was super unhelpful as well, unfortunately.
Thanks -- I wonder if this has something to do with an unfortunate setup that causes Google Cloud traffic to Shopify (via Cloudflare) to be flaky. Similar mentions of this issue all seem to involve traffic from Google Cloud.
I will try to move the traffic to a different host not on Google metal - it will be interesting to see what happens then.
Oh yes, that would be very interesting to know indeed. When are you going to try this? Would you mind letting me know about your findings?
This is an accepted solution.
OK, did some more testing.
By spinning up GCE VM instances, I found that only the ones using Cloud NAT had the issue. VMs with an external IP did not have the issue.
So the fix was actually very easy - I am no expert in Cloud NAT, but I found a setting that caps connections to a given ip:port destination at 64 per VM. I bumped that to 4096, and now everything works with no slow connections or timeouts.
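If you prefer the command line over the console, something along these lines should apply the same change (the NAT, router, and region names below are placeholders for your own setup):
# Raise the per-VM NAT port allocation from the default of 64 to 4096.
# my-nat, my-router and europe-west1 are placeholders -- substitute your own.
gcloud compute routers nats update my-nat \
    --router=my-router \
    --region=europe-west1 \
    --min-ports-per-vm=4096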
I tested it with this from a Linux command line:
siege -t 120s -c 10 -v https://shapingnewtomorrow.myshopify.com/collections.json
I got around 4,500 transactions in 2 minutes.
I hope you can apply similar changes to your setup and get it working.
thanks,
Peter
Wow, that sounds very promising. You're amazing! 😍
What is the setting called?
Alright, found it in the GCP console and in the Terraform provider docs. Works like a charm! Thanks so much @hrstrand, this has been bugging us for half a year!
https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_router_nat
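For anyone else hunting for it: describing the NAT from the command line should also show the current value before you change anything (the default is 64; names below are placeholders):
# Inspect the existing NAT config; the output should include the current
# minPortsPerVm value. my-nat, my-router and europe-west1 are placeholders.
gcloud compute routers nats describe my-nat \
    --router=my-router \
    --region=europe-west1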
Yes, this exactly. I did not know I had the issue until I had an uptick in traffic. I had noticed spurious timeouts, but could not repro. Glad that it fixed your issue - for me, it was good to know that others saw a similar issue, so thanks for reporting it initially 👍