Does the filter transform in Pipeline Builder on virtual tables fetch only the required data, or does it read all the data?

youngC · October 29, 2025, 2:37am

Hi, is Pipeline Builder's filter rows transform on a virtual table automatically converted into a query condition so that only the required data is fetched? Or will it fetch all the data and then filter in Spark?

We are using a Databricks virtual table.

hugo · October 29, 2025, 9:37am

Hey! For Databricks tables, this depends on whether you have external access enabled.
See: https://www.palantir.com/docs/foundry/available-connectors/databricks#external-access-to-storage-locations-virtual-tables-only

If external access is enabled, we delegate to the Delta/Iceberg Spark connectors, which will attempt file pruning based on table metadata. If the data is partitioned correctly, this can lead to less data being fetched.
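
To give a rough idea of what metadata-based file pruning means, here is a minimal sketch in plain Python. It is not the actual Delta/Iceberg connector code; `FileStats`, `prune_files`, the file names, and the date ranges are all hypothetical. The idea is just that each data file carries min/max statistics, and files whose range cannot match the filter are never read.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    """Hypothetical per-file metadata: path plus min/max of a date column."""
    path: str
    min_date: str  # ISO dates, so string comparison matches date order
    max_date: str


def prune_files(files, lo, hi):
    """Keep only files whose [min_date, max_date] range overlaps [lo, hi]."""
    return [f for f in files if f.max_date >= lo and f.min_date <= hi]


# Illustrative layout: one file per month (assumed, not real data).
files = [
    FileStats("part-000.parquet", "2025-01-01", "2025-01-31"),
    FileStats("part-001.parquet", "2025-02-01", "2025-02-28"),
    FileStats("part-002.parquet", "2025-03-01", "2025-03-31"),
]

# A filter on 2025-02-10..2025-02-20 only needs the February file.
selected = prune_files(files, "2025-02-10", "2025-02-20")
print([f.path for f in selected])  # ['part-001.parquet']
```

If the filter column is spread evenly across every file, each file's range overlaps the filter and nothing gets skipped, which is why the partitioning matters.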

If external access is disabled, we rely on Databricks JDBC, which should push the filters down to Databricks.
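
As a generic illustration of what filter pushdown buys you, here is a small sketch that uses Python's built-in `sqlite3` as a stand-in for the remote source. It does not use Databricks JDBC or any Foundry API, and the `events` table and its rows are made up. With pushdown the filter becomes a `WHERE` clause evaluated at the source, so only matching rows come back; without it, every row is transferred and filtered afterwards.

```python
import sqlite3

# In-memory database standing in for the remote source (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, region TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "EU"), (2, "US"), (3, "EU"), (4, "APAC")],
)

# Without pushdown: fetch every row, then filter on the client side.
all_rows = conn.execute("SELECT id, region FROM events ORDER BY id").fetchall()
filtered_locally = [row for row in all_rows if row[1] == "EU"]

# With pushdown: the filter travels in the query, so only matches are returned.
pushed_down = conn.execute(
    "SELECT id, region FROM events WHERE region = ? ORDER BY id", ("EU",)
).fetchall()

print(len(all_rows), len(pushed_down))  # 4 2
```

Both approaches produce the same result; the difference is how many rows leave the source.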