Aggregation in Pipeline

Hi, I am facing a weird problem. I have e columns: asset (string), time (timestamp), operating hours (double, cumulative). I wanted to see what is the maximum operating hours for each asset and the related time. I used both roll up and aggregation cart but surprisingly it is giving wrong output. I used asset as the group by column, in aggregation i used max, then in expression i used the operating hours column. i also check the same by putting time column but still wrong output. I noticed I have no access to max by cart. What did I do wrong here?

Hey @Enamul16022, can you share the output you’re expecting to get and the output you’re getting? And to clarify, do you want to get the max operating hours per unique asset and unique time? Or just per the unique asset

Hi, expected output should be maximum operating hours per unique asset and their associated timestamp. for example 2025.04.21.13.45.13 (timestamp), xyz (asset), 2500 (operating hours). The mismatch output is both on timestamp and operating hours. When I checked manually for the maximum operating hours of a specific machines at specific time, it is not being similar to the output I got.

Since you want the associated timestamp I would try to use a window on the asset, get the max operation hours, save that in a separate column, and then filter to only keep the row where the max operation hours = the operation hours for your row.

^ (filter not pictured but you would add it after the window so you only keep rows that match the maximum value)

Let me know if that makes sense or if you have any questions

The output looks better now, still found some mismatch, I will investigate the data quality issues, then still found problem will reply here. Thanks for your help.