Optimizing Spark pipelines using Velox

Hi everyone!

Found that Spark pipelines in Foundry can be sped up with Velox (ref: https://www.palantir.com/docs/foundry/transforms-python-spark/velox-overview). However, when I tried it myself, it only increased the build duration.

Does anyone know how to properly estimate which pipeline is a good target for Velox acceleration?

We tested it on a few transforms last year, and didn’t notice much of a change in most cases.

In the end we only moved one transform to use this profile (which is a very large transform doing quite a lot of regex work), but even then the improvement was not dramatic.

Changing the configuration is pretty quick and easy (it’s not like rewriting it in one of the Lightweight languages), so I’d suggest just trying it and seeing the result. Do test with more than one build and take an average, and don’t forget to set the accompanying EXECUTOR_MEMORY_OFFHEAP_FRACTION_* option.
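For what it’s worth, a minimal sketch of the “average over several builds” step in plain Python. `run_build` is a hypothetical stand-in for whatever triggers your build and blocks until it finishes; it is not a real Foundry API:

```python
import statistics
import time

def average_duration(run_build, runs=3):
    """Time `run_build` over several runs and return the mean wall-clock
    duration in seconds. Averaging smooths out run-to-run variance when
    comparing timings before and after enabling the Velox profile."""
    durations = []
    for _ in range(runs):
        start = time.monotonic()
        run_build()  # hypothetical: kick off a build and wait for it
        durations.append(time.monotonic() - start)
    return statistics.mean(durations)
```

Comparing the averaged durations with the profile on vs. off gives a fairer picture than a single build, since cluster warm-up and caching can skew any one run.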

Thanks for the response! (Happy that somebody else also tried Velox.)

Yeah, that’s what I was doing, but the problem is that I need to wait for each build to finish, so the time spent analyzing whether a single dataset is suitable for Velox acceleration was huge.

Generally, I tried to count the conversion operations (VeloxColumnarToRow and RowToVeloxColumnar) in the Spark physical plan, because they are computationally expensive, and then to estimate the effect of applying Velox from that. It shows some results, but it is clear that dataset suitability does not depend on this parameter alone. So I want to find something like a “rule of thumb” in the Velox-Spark world.
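In case it helps anyone, a minimal sketch of that counting step, assuming you can capture the physical plan as text (e.g. via the internal `df._jdf.queryExecution().executedPlan().toString()` in PySpark, or by capturing `df.explain()` output); the node names are the ones mentioned in this thread:

```python
import re

# Row<->columnar conversion nodes that indicate serialization overhead
# when Velox is enabled (names as discussed in this thread).
CONVERSION_NODES = ("RowToVeloxColumnar", "VeloxColumnarToRow")

def count_conversions(plan_text: str) -> int:
    """Return the total number of conversion nodes in a physical plan string."""
    return sum(len(re.findall(name, plan_text)) for name in CONVERSION_NODES)
```

A high count relative to the total number of operators is a warning sign; a low count is necessary but, as noted above, not sufficient for a speedup.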


Yes, sadly I can’t give you a ‘rule of thumb’ for when Velox makes sense or not.

At the moment we don’t really consider it as a major option in our pipelines. It’s nice as in theory it’s just a ‘drop-in’ replacement, but we haven’t found situations where it reliably makes a positive impact.


The documentation mentions some specific cases where it is less likely to help:

There are generally two patterns which indicate poor native acceleration performance:

  • A small percentage of nodes executed natively, as indicated by the ^ symbol.
  • A large number of RowToVeloxColumnar and VeloxColumnarToRowExec nodes resulting in high serialization overheads.
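Building on that quote, a rough sketch for estimating the first signal, the share of natively executed nodes, assuming the rendered plan marks native nodes with a `^` symbol as the docs describe (the exact rendering may differ, so treat this as an approximation):

```python
def native_fraction(plan_text: str) -> float:
    """Approximate fraction of plan operator lines that are marked as
    natively executed, assuming native nodes carry a '^' marker."""
    lines = [ln for ln in plan_text.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    native = sum(1 for ln in lines if "^" in ln)
    return native / len(lines)
```

A low fraction here, or a high conversion-node count, would suggest the transform is a poor candidate for native acceleration.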


