Optimizing Spark pipelines using Velox

The documentation mentions some specific cases where native acceleration is less likely to help:

There are generally two patterns which indicate poor native acceleration performance:

  • A small percentage of nodes executed natively, as indicated by the ^ symbol.
  • A large number of RowToVeloxColumnar and VeloxColumnarToRowExec nodes resulting in high serialization overheads.
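As a rough way to check a plan against both patterns, here is a minimal Python sketch that counts natively executed nodes and row/columnar transitions in a plan dump. It assumes the plan is available as plain text with one node per line, the usual `+-` / `:-` tree prefixes, and the `^` marker at the start of the node name. The `summarize_plan` helper and the sample plan (including its operator names) are hypothetical and only for illustration; they are not part of any Spark or Velox API.

```python
# Node names taken from the documentation quote above.
TRANSITION_NODES = ("RowToVeloxColumnar", "VeloxColumnarToRowExec")


def summarize_plan(plan_text):
    """Count total, native (^-prefixed), and transition nodes in a plan dump.

    Assumption: one node per line, with tree-drawing characters
    (spaces, ':', '+', '-', '|') before the node name.
    """
    total = native = transitions = 0
    for line in plan_text.splitlines():
        node = line.lstrip(" :+-|")  # drop the tree-drawing prefix
        if not node:
            continue  # skip blank lines
        total += 1
        if node.startswith("^"):
            native += 1
        if any(name in node for name in TRANSITION_NODES):
            transitions += 1
    return {
        "total": total,
        "native": native,
        "transitions": transitions,
        "native_fraction": native / total if total else 0.0,
    }


# Hypothetical plan text, made up for this example.
SAMPLE_PLAN = """\
VeloxColumnarToRowExec
+- ^ProjectExecTransformer
   +- ^FilterExecTransformer
      +- RowToVeloxColumnar
         +- Scan parquet
"""

if __name__ == "__main__":
    print(summarize_plan(SAMPLE_PLAN))
```

A low `native_fraction` or a high `transitions` count relative to `total` would match the two patterns above. What counts as "low" or "high" is a judgment call; the documentation quote does not give thresholds.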

Optimizing and debugging pipelines • Spark • Native acceleration • Palantir
