I need to make API calls from a transform (e.g. an External Transform).
I’m not sure what to use: Spark-based transforms or lightweight transforms.
I’m also not sure which approach to take: a thread pool, a process pool, parallelisation via rdd.mapPartitions or rdd.map, or UDFs (several types of UDFs exist as well, …).
Is there any benchmark or example of very fast processing of API calls from a Spark/transforms context?
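
To make the question concrete, here is a rough sketch of the thread-pool option as I understand it. It is not benchmarked. `call_api`, `fetch_all`, and `fetch_partition` are hypothetical names, and `call_api` just sleeps in place of a real HTTP request. The last function only shows the shape that rdd.mapPartitions expects (an iterator in, an iterable out); I have not verified how it behaves in an actual transform.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def call_api(record_id):
    # Hypothetical stand-in for a real HTTP request; the sleep mimics network latency.
    time.sleep(0.05)
    return {"id": record_id, "status": "ok"}


def fetch_all(record_ids, max_workers=8):
    # API calls are I/O-bound, so threads can overlap the time spent waiting.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the same order as the inputs.
        return list(pool.map(call_api, record_ids))


def fetch_partition(rows):
    # Assumed shape for rdd.mapPartitions: receives an iterator over one
    # partition's rows and returns an iterator of results.
    return iter(fetch_all(list(rows)))


if __name__ == "__main__":
    results = fetch_all(range(16))
    print(len(results), results[0])
```

In a lightweight transform I would presumably call `fetch_all` directly on the list of inputs, while in a Spark-based transform something like `fetch_partition` would run once per partition, so each executor gets its own pool. Whether that is actually the fastest option is exactly what I am unsure about.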