I deployed a Python user-defined function (UDF) from my code repository. The function takes multiple str parameters:
import re

# The @function decorator, parse, and addressTerms come from elsewhere and are not shown here.
@function
def standardize_address(address1: str, address2: str,
                        address3: str, address4: str,
                        city: str, state: str, postalCode: str,
                        country: str, dummy: bool = False
                        ) -> str:
    # Join the non-empty parts in the declared parameter order
    address = ' '.join(filter(None, (address1, address2, address3, address4, city, state, postalCode, country)))
    # Collapse runs of whitespace into a single space
    address = " ".join(address.split())
    address = address.upper()
    address = parse(address)
    # Replace whole-word address terms with their standardized forms
    for key, value in addressTerms.items():
        address = re.sub(r'\b' + key + r'\b', value, address)
    # Remove every character that is not alphanumeric or a space
    address = re.sub(r'[^A-Za-z0-9 ]+', '', address)
    return address
I am able to build and tag the code appropriately. I am also able to import the UDF into the pipeline.
However, when I use the function in the pipeline, the parameters appear in a different order from the one declared in the code.
Although the parameters are named and I map the correct field to each of them, the function behaves as if the parameter names do not matter and only the position does.
In this example, instead of returning a string that is address_1 + address_2 + address_3 + … and so on, the function returns country + address_3 + address_2, and so on.
To try to fix this, I also added a dummy boolean parameter with a default value, but that did not help either.
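To illustrate the symptom outside the pipeline, here is a minimal plain-Python sketch. It is not the pipeline itself: join_parts is a hypothetical, simplified stand-in for standardize_address, the sample row is made up, and the shuffled order is an assumption chosen only to mimic the output I see.

```python
def join_parts(address1: str, address2: str, address3: str, country: str) -> str:
    """Hypothetical stand-in for standardize_address: joins non-empty parts in declared order."""
    return " ".join(filter(None, (address1, address2, address3, country)))


# Made-up sample row, keyed by parameter name.
row = {
    "address1": "1 MAIN ST",
    "address2": "APT 2",
    "address3": "FLOOR 3",
    "country": "US",
}

# Keyword binding: each value reaches the parameter with the matching name,
# so the order in which the mapping lists them is irrelevant.
by_name = join_parts(**row)

# Positional binding: only the position matters. The order below is assumed,
# picked to resemble the behaviour described above.
shuffled = [row["country"], row["address3"], row["address2"], row["address1"]]
by_position = join_parts(*shuffled)

print(by_name)
print(by_position)
```

The keyword call is what I expected the pipeline to do; the positional call matches what I actually observe.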