What does this OSDK Error mean?

I'm running the following aggregation via the OSDK:

client(MyObjectType).aggregate({
   $select: { "price:sum": "desc" },
   $groupBy: { "customer": { "$exact": { $defaultValue: "Unkown" } } }
})

It’s throwing the following error:

errorCode: INVALID_ARGUMENT, errorName: AggregationAccuracyNotSupported

What does this error really mean? (As far as I can tell, error descriptions aren't in your documentation.)

___________

When I run it unordered:

client(MyObjectType).aggregate({
   $select: { "price:sum": "unordered" },
   $groupBy: { "customer": { "$exact": { $defaultValue: "Unkown" } } }
})

I don't get any errors. So could it be that there are too many distinct customer values?

That said, exactly the same query worked on other Object Types with more than 10K unique customers to group by.

Disclaimer: I don’t work on the team that handles this but have some experience with similar issues. In short, the description for the error is:

The given aggregation cannot be performed with the requested accuracy.
Try allowing approximate results or adjust your aggregation request.

The longer answer is that the computation for an aggregation can be spread across multiple servers. If each server sums its own share of the numbers and you then ask for the sums in order, combining the per-server results is not guaranteed to give the same answer as doing all the work in one place. The Elasticsearch docs have the full details of what is happening behind the scenes: https://www.elastic.co/guide/en/elasticsearch/reference/8.5/search-aggregations-metrics-cardinality-aggregation.html#_counts_are_approximate
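To make that concrete, here is a toy simulation in TypeScript. It is only a sketch of the general effect, not how the backend actually works: the two "shards", the customer names, and the helper functions are all made up for illustration. Each shard returns only its local top bucket, and merging those partial results misses the customer with the largest true total.

```typescript
// Toy model: rows for the same customer are spread across two hypothetical shards.
type Row = { customer: string; price: number };

// Sum prices per customer for one set of rows.
function sumByCustomer(rows: Row[]): Map<string, number> {
  const sums = new Map<string, number>();
  for (const { customer, price } of rows) {
    sums.set(customer, (sums.get(customer) ?? 0) + price);
  }
  return sums;
}

// Return the n largest sums, in descending order.
function topN(sums: Map<string, number>, n: number): [string, number][] {
  return [...sums.entries()].sort((a, b) => b[1] - a[1]).slice(0, n);
}

// Each shard reports only its local top n; the coordinator merges those partial lists.
function distributedTopN(shards: Row[][], n: number): [string, number][] {
  const merged = new Map<string, number>();
  for (const shard of shards) {
    for (const [customer, sum] of topN(sumByCustomer(shard), n)) {
      merged.set(customer, (merged.get(customer) ?? 0) + sum);
    }
  }
  return topN(merged, n);
}

const shardA: Row[] = [
  { customer: "alice", price: 10 },
  { customer: "bob", price: 6 },
  { customer: "carol", price: 5 },
];
const shardB: Row[] = [
  { customer: "dave", price: 10 },
  { customer: "bob", price: 6 },
  { customer: "carol", price: 1 },
];

// All rows in one place: bob has the largest total (6 + 6 = 12).
const exact = topN(sumByCustomer([...shardA, ...shardB]), 1);

// Per-shard top 1 is alice (10) and dave (10); bob never reaches the coordinator.
const approximate = distributedTopN([shardA, shardB], 1);

console.log({ exact, approximate });
```

Real systems reduce this error by having each shard return more buckets than requested, but with enough distinct keys an exact ordered result can no longer be guaranteed, which is presumably when the error is raised.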

So if you want to stop the error you can swap the exact accuracy requirement to approximate, decrease the number of buckets, or order outside the aggregation query itself. Hopefully this is helpful!
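For the last option, a minimal sketch of ordering on the client might look like the following. The row shape (`$group.customer` and `price.sum`) is my assumption about what the unordered query returns, and `orderBySumDesc` is a hypothetical helper, so check both against the types in your generated SDK.

```typescript
// Assumed shape of one grouped result row; verify against your generated OSDK types.
type GroupedSum = {
  $group: { customer: string };
  price: { sum: number | undefined };
};

// Sort a copy of the rows by sum, largest first; a missing sum is treated as 0.
function orderBySumDesc(rows: readonly GroupedSum[]): GroupedSum[] {
  return [...rows].sort((a, b) => (b.price.sum ?? 0) - (a.price.sum ?? 0));
}

// Stand-in data for the result of the "unordered" query from the question.
const unordered: GroupedSum[] = [
  { $group: { customer: "A" }, price: { sum: 5 } },
  { $group: { customer: "B" }, price: { sum: 42 } },
  { $group: { customer: "Unkown" }, price: { sum: undefined } },
  { $group: { customer: "C" }, price: { sum: 17 } },
];

const ordered = orderBySumDesc(unordered);
console.log(ordered.map((row) => row.$group.customer));
```

With tens of thousands of groups this sort is cheap on the client; the cost to watch is transferring all the buckets in the first place.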


Thank you @ckucera for the really good explanation here.

I was also looking into switching the OSDK queries to "approximate", but I don't think there's an option for it?

The dataset has around 2 million records, with 80K unique values for the attribute I want to group by.

(Just curious here. I'll probably end up doing this in the backend anyway.)