I’m trying to use the KNN join, but I want to use a custom function for my distance (it takes in two strings and returns a float). I wrote a Python function and imported it as a UDF. Is there a way for me to use this UDF as a custom expression in the KNN join? I seem to be able to make the UDF into a custom transform, but not a custom expression.
Hey, this currently isn’t supported, since a UDF is not an expression. If your scale isn’t too large, you could try a cross join, run the UDF on every row of the result, and then take the top K rows (the K smallest distances) for each row on the left side.
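For what it's worth, here is a rough sketch of that cross join + top K idea in plain Python, not using the tool's join or UDF API. The names `string_distance` and `knn_cross_join` are made up for illustration, and the `difflib`-based distance is only a stand-in for whatever the real UDF computes.

```python
import heapq
from difflib import SequenceMatcher
from itertools import product


def string_distance(a: str, b: str) -> float:
    """Hypothetical stand-in for the custom UDF: two strings in, one float out."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def knn_cross_join(left, right, k, distance=string_distance):
    """For each left value, return the k (distance, right value) pairs with the smallest distance."""
    scored = {}
    # Cross join: score every (left, right) pair with the distance function.
    for l, r in product(left, right):
        scored.setdefault(l, []).append((distance(l, r), r))
    # Top K: keep only the k closest right values for each left value.
    return {l: heapq.nsmallest(k, pairs) for l, pairs in scored.items()}


left = ["apple", "banana"]
right = ["appel", "bananas", "cherry", "apply"]
neighbors = knn_cross_join(left, right, k=2)
```

Keep in mind this scores every pair, so the work grows with (rows on the left) × (rows on the right), which is why it only makes sense at modest scale.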