When a query returns orders and then needs to list all the items on each order, the estimated query cost is much higher than it should be.
The issue stems from the fact that the estimator takes the number of objects requested and multiplies it by the number of child items requested per object, and that is the problem. Suppose I have 100 orders with a single item each, plus a few more, say 5 orders with 10 items each. I can't simply return all the orders with their items: the initial query has to specify a number of items to pull per order, say 20, which makes the API think I'm pulling a cost of roughly 20 × 100 = 2,000 instead of the actual 100 + 5 × 10 = 150. That's a big difference.
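A small sketch of the gap, using hypothetical cost rules: the up-front estimate charges a full page of items for every order, while the actual cost counts only the items really returned. (With the 105 orders of this example, the estimate comes to 2,100, on the order of the ~2,000 above.)

```python
# Hypothetical data: 100 orders with 1 item each, plus 5 orders with 10 items each.
item_counts = [1] * 100 + [10] * 5
page_size = 20  # items requested per order in the query

# Up-front estimate: every order is assumed to return a full page of items.
estimated_cost = page_size * len(item_counts)

# Actual cost: the items each order really returns, capped at the page size.
actual_cost = sum(min(n, page_size) for n in item_counts)

print(estimated_cost)  # 2100
print(actual_cost)     # 150
```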
Given that sub-objects are usually looked up by a key value, I think it would be wise to either reduce the cost for those items, average the cost, or simply count the cost after the query has executed and charge only that.
Because the cost is checked up front, valid queries that would only return a few results get rejected, and more queries than necessary then have to be executed because the data must be paged.
So in the example above, you can either issue several queries requesting 10 items per order, or locate all the orders and page through each one that has multiple items. Either way the overhead costs more than simply accepting the requested query and returning the original results.
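A rough count of the round trips this forces, under one assumed strategy: the initial low-cost query requests a single item per order, and each order with more items needs a follow-up paged request. The numbers are from the example above; the strategy itself is an assumption for illustration.

```python
# Hypothetical data: 100 single-item orders plus 5 orders with 10 items each.
item_counts = [1] * 100 + [10] * 5
first_page = 1  # items per order the initial low-cost query dares to request

# Every order whose items don't fit in the first page needs a follow-up request.
followups = sum(1 for n in item_counts if n > first_page)
total_requests = 1 + followups

print(total_requests)  # 6 requests instead of 1
```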
In summary: a subquery/node should carry a much lower cost for each item after the first, probably following a log-based curve, since in most cases the item counts per order fall off quickly.
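One possible shape for such a curve (purely illustrative; the log base and scaling are my assumptions, not a proposal the API actually implements):

```python
import math

def subnode_cost(n_items: int) -> float:
    """Cost for n child items under a hypothetical log-based schedule:
    full price for the first item, then logarithmically discounted growth."""
    if n_items <= 0:
        return 0.0
    # 1 point for the first item, plus log2 growth for the rest.
    return 1.0 + math.log2(n_items)

print(subnode_cost(1))   # 1.0
print(subnode_cost(10))  # ~4.32
print(subnode_cost(20))  # ~5.32
```

Under this schedule, the example query would be estimated at well under 200 points rather than 2,000, much closer to the data actually returned.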
Overall it would be a good idea to revisit this, and perhaps look at creating more margin for requests or setting the cap over a longer window, e.g. 10,000 points per 10 minutes, as that would leave a lot more room to execute costly queries without hitting the limits. Unfortunately, the more complex the query, the more likely this issue is to arise.