Thanks.
It would be nice to have predicate pushdown in the Dataset API, so that it can handle operations in any order and still apply the filter, as Spark and the other tools listed here do: https://sites.google.com/view/raybellwaves/blog/what-data-processing-tool-should-i-use
What I’m actually after is:
expected = (
    Dataset.get("puzzle_inputs")
    .where(Column.get("file_name") == file_name)
    .read_table(format="pandas")[["input"]]
)
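
To make the "any order of operations" part concrete, here is a rough toy sketch in plain Python. `LazyTable`, its `where`/`select`/`collect` methods, and the sample rows are all made up for illustration; none of this is the real Dataset API, and a real engine would push the filter down to storage rather than loop over rows in memory. The idea is just that operations are recorded first, and the filter is applied at scan time even if the column selection was requested earlier.

```python
from typing import Any, Optional


class LazyTable:
    """Hypothetical lazy table: records operations, runs nothing until collect()."""

    def __init__(
        self,
        rows: list[dict[str, Any]],
        predicates: tuple[tuple[str, Any], ...] = (),
        columns: Optional[tuple[str, ...]] = None,
    ) -> None:
        self._rows = rows              # stand-in for the stored data
        self._predicates = predicates  # recorded (column, value) equality filters
        self._columns = columns        # recorded projection, None means all columns

    def where(self, column: str, value: Any) -> "LazyTable":
        # Only record the filter; nothing is evaluated yet.
        return LazyTable(self._rows, self._predicates + ((column, value),), self._columns)

    def select(self, *columns: str) -> "LazyTable":
        # Only record the projection; nothing is evaluated yet.
        return LazyTable(self._rows, self._predicates, columns)

    def collect(self) -> list[dict[str, Any]]:
        result = []
        for row in self._rows:
            # Filters run against the full source row, before projection,
            # regardless of the order in which where/select were called.
            if all(row[col] == val for col, val in self._predicates):
                if self._columns is None:
                    result.append(dict(row))
                else:
                    result.append({col: row[col] for col in self._columns})
        return result


# Made-up sample data standing in for the "puzzle_inputs" dataset.
puzzle_inputs = LazyTable(
    [
        {"file_name": "day1.txt", "input": "1721"},
        {"file_name": "day2.txt", "input": "1-3 a"},
    ]
)

filter_first = puzzle_inputs.where("file_name", "day1.txt").select("input").collect()
select_first = puzzle_inputs.select("input").where("file_name", "day1.txt").collect()
```

Both `filter_first` and `select_first` end up with the same rows, even though `select_first` filters on a column it has already projected away. That is the behaviour I'd like from the Dataset API.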