I'm currently implementing a feature that bulk imports orders, converts them to CSV, and uploads them to a custom storage endpoint.
However, the documentation states that nested connections are no longer nested in the JSONL output. As a result, a product, lineItem, etc. that is part of an order might not directly follow the related order in the file.
The JSONL format is perfect for streaming data and avoiding high memory usage. But how do I know the minimum amount of data I need to load to be sure I have covered all relations for a specific item? Is it possible that a connection for an entity on line #1 appears on the very last line of the file? That would break the concept of streaming, as I would still have to load the whole file into memory to restore the relations.
Would be great if you can provide feedback on this topic.
Thank you a lot!!!
Hey @bastian12,
Yes, it is possible for a connection belonging to the object on line #1 to appear at the bottom of the file. Even so, the JSONL format gives you the advantage of not having to load all the data into memory at once. Since every line is a separate object, you can parse the file line by line and use far less memory. You'll still need to restore the connections using __parentId, as you mentioned, but you can do this one object at a time as well.
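As a rough illustration of restoring the connections line by line, here is a minimal Python sketch. It assumes every object has an id field, that child objects reference their parent through a __parentId field, and that there is only one level of nesting (for example, line items directly under orders). The function name attach_children and the children key are made up for this example. It keeps every parsed object, so in practice you would probably store only the fields your CSV needs.

```python
import json
from collections import defaultdict
from typing import Iterable


def attach_children(lines: Iterable[str]) -> dict[str, dict]:
    """Rebuild parent/child relations while reading one JSONL line at a time.

    Assumptions (not confirmed by the thread): each object has an "id",
    children carry "__parentId", and nesting is only one level deep.
    """
    parents: dict[str, dict] = {}
    # Children whose parent has not been seen yet, keyed by parent id.
    pending: dict[str, list[dict]] = defaultdict(list)

    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        obj = json.loads(raw)
        parent_id = obj.get("__parentId")
        if parent_id is None:
            # Top-level object: adopt any children that arrived earlier.
            obj["children"] = pending.pop(obj["id"], [])
            parents[obj["id"]] = obj
        elif parent_id in parents:
            parents[parent_id]["children"].append(obj)
        else:
            pending[parent_id].append(obj)

    return parents
```

Because an open file is iterable line by line, you could call it as attach_children(open("bulk.jsonl")), where bulk.jsonl is a placeholder for the downloaded result file. The file itself is then never read into memory in one piece.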
To ensure you've covered all relations, you can query currentBulkOperation and look at the objectCount. This tells you how many objects the file contains, and you can use it to confirm that you've parsed the same number of objects when restoring the connections.
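The count check could look roughly like the sketch below. It assumes you have already fetched objectCount through your own API call and pass it in as object_count. The helper names are hypothetical, and the value is cast with int() in case it arrives as a string.

```python
from typing import Iterable, Union


def count_objects(lines: Iterable[str]) -> int:
    """Count non-empty JSONL lines without holding them in memory."""
    return sum(1 for line in lines if line.strip())


def is_complete(lines: Iterable[str], object_count: Union[int, str]) -> bool:
    """Compare the number of parsed lines with the reported objectCount.

    object_count is assumed to come from the bulk operation query;
    it is converted with int() so a string value also works.
    """
    return count_objects(lines) == int(object_count)
```

If is_complete returns False, the file was probably truncated or not fully downloaded, so the restored relations should not be trusted yet.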
JB | Developer Support @ Shopify