Resource Management fails to attribute usage to resources, even though the data is available in the Dataset API

Resource Management is a great app to see usage, but this may be a bug.

I see these false claims in the Resource Management app:

However, in the Dataset UI I can see one particular dataset which has the exact same usage perfectly attributed to it:

This makes this dataset the topmost resource user!

Clearly, since this data is within the system, it should be reflected in the Resource Management app too.

Hello @ZygD

  1. The date range selected in Resource Management is not exactly the same as the one you are viewing in the resource usage metrics for the individual dataset, which is why the compute details differ.
  2. The unit for usage is not the same on both screens (dataset resource usage metrics and Resource Management). I don’t know why, but it is strange.

Here is an example:
Resource management ↓ (Compute seconds)

Dataset Resource usage metrics ↓ (Compute hours)

Now we can simply convert compute seconds to compute hours:
compute hours = 2748 / (60 * 60) ≈ 0.763
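The same conversion as a minimal Python sketch, in case it helps when comparing the two screens. The function name is made up for illustration and is not part of any Foundry API.

```python
SECONDS_PER_HOUR = 60 * 60


def compute_seconds_to_hours(compute_seconds: float) -> float:
    """Convert a compute-seconds figure into compute hours."""
    return compute_seconds / SECONDS_PER_HOUR


# 2748 compute seconds is the figure quoted from Resource Management above.
hours = compute_seconds_to_hours(2748)
print(round(hours, 3))  # 0.763
```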

Let me know if you have any questions :slightly_smiling_face:

Hi @roshanv329v. Thank you for taking a look.

The dataset was only built once during February (on 27 February), so taking monthly statistics should be fine. There is no way to see individual days in Dataset Resource usage metrics for dates a few months ago. But again, this dataset was only built once, so monthly stats are ok.

Also, I can see compute resources correctly attributed to 4 other datasets in Resource Management, but not to this one. I am showing a problem that may occur in some special cases, but not always.

Hi @ZygD,

Thanks for posting, and apologies for any confusion here. The top resources included in this table are determined based on their rankings within a larger time window (in this case the past 90 days), but then the usage displayed in the table is aggregated only over the selected time range. So in this case, you identified a dataset that was more expensive on one particular day in February, but not within the broader time range for which rankings were calculated.
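To illustrate the behavior described above, here is a rough Python sketch of ranking over a wide window and then aggregating over a narrower selected range. It is not the actual Resource Management implementation: the record layout, resource names, usage figures, and function names are all hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical usage records: (resource, day, compute_seconds).
usage = [
    ("dataset_a", date(2026, 2, 27), 2748),
    ("dataset_b", date(2026, 2, 27), 100),
    ("dataset_b", date(2026, 3, 15), 5000),
    ("dataset_c", date(2026, 4, 1), 4000),
]


def total_by_resource(records, start, end):
    """Sum compute seconds per resource for days in [start, end]."""
    totals = defaultdict(float)
    for resource, day, seconds in records:
        if start <= day <= end:
            totals[resource] += seconds
    return dict(totals)


def top_resources_table(records, rank_start, rank_end, sel_start, sel_end, top_n):
    """Pick the top_n resources by usage in the ranking window,
    then report their usage in the selected range only."""
    ranking_totals = total_by_resource(records, rank_start, rank_end)
    ranked = sorted(ranking_totals, key=ranking_totals.get, reverse=True)[:top_n]
    selected_totals = total_by_resource(records, sel_start, sel_end)
    return {resource: selected_totals.get(resource, 0.0) for resource in ranked}


table = top_resources_table(
    usage,
    rank_start=date(2026, 2, 1), rank_end=date(2026, 5, 1),    # wide ranking window
    sel_start=date(2026, 2, 27), sel_end=date(2026, 2, 27),    # selected day
    top_n=2,
)
print(table)  # {'dataset_b': 100.0, 'dataset_c': 0.0}
```

In this made-up data, dataset_a has the highest usage on the selected day but does not appear in the table, because it is not among the top resources over the wider ranking window.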

I’ve just shipped this change to clear things up in the future:

If you’re looking to do more open-ended analysis of usage data than what the Resource Management app currently supports, consider using the internal dataset export feature.

EDIT: I just looked more closely at your screenshot and realized that you only have four resources in your table, so the issue is not truncation here. Is it possible that the dataset in question has moved between projects?

Thank you @NoahL for looking into this and the shipped feature.
The dataset was not moved; it is still in the same location. It was built twice: once in February and again a couple of weeks ago in May.

The reason I was asking about whether the project may have changed was because I see you have a project filter and a source filter applied in Resource Management, so I’m wondering whether those filters could be excluding the dataset you’re looking for.

I’d also be curious to know how the dataset in question is produced in case that could be relevant to the puzzle.

Back in February this was a regular Python @transform() (Spark transformation).
On 5 June we refactored the transformation to run on DuckDB with SQLFrame, just like in this documented example.

Maybe there is a simpler explanation here: in Resource Management, the date filter is for February 27, 2026 – did the dataset in question build on that day?

There could also be a timezone issue since Resource Management uses UTC but I think the Dataset UI uses local timezone.
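As a quick way to check the timezone theory, the sketch below shows how a build that falls on 27 February in local time can land on a different calendar day in UTC. The build time and the UTC+2 offset are hypothetical; substitute the actual build timestamp and your own timezone.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical local timezone (UTC+2) and hypothetical build time.
local_tz = timezone(timedelta(hours=2))
local_build = datetime(2026, 2, 27, 1, 30, tzinfo=local_tz)


def utc_date_of(local_dt: datetime):
    """Return the UTC calendar date for a timezone-aware datetime."""
    return local_dt.astimezone(timezone.utc).date()


utc_day = utc_date_of(local_build)
print(local_build.date(), utc_day)  # 2026-02-27 2026-02-26
```

If the two dates differ for the real build timestamp, a single-day filter in Resource Management would need to be set to the UTC date rather than the local one.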