How to format Phone Numbers in a dataset

Hi,
I am new to palantir.I have phone numbers column in my dataset.I am trying to make transformations for my data set.I need help to transform this to a standardized Phone number.I tried Pipeline Builder functions too.It didnt help either.
Phone Number

Hi @afd87510c79e60c7e9b6 , have you looked into using regex? I would specifically use the find and replace expression.

Assuming standard format is:

+<country_code>-<area_code>-<exchange>-<line_number>[x<extension>]

you can use something similar to the regexs below to find and replace:

Step Find Replace Purpose
1 [^0-9xX]+ (empty) Remove non-digits/non-x
2 [ X ] x Normalize extension to ‘x’
3a ^001 +1- Normalize country code
3b ^1 +1- Normalize country code
3c ^(\d{10})(x\d+)?$ +1-$1$2 Add country code to 10 digits
4 ^(+1-)?(\d{3})(\d{3})(\d{4})(x\d+)?$ +1-$2-$3-$4$5 Add dashes

Feel free to tweak it or ask AIP to help generate the right regex for your usecase

Something else that you could do is use the use LLM node if you have AIP on your stack. This would be more computationally expensive but easier to set up because you could just tell the model how exactly you want to clean up the phone numbers

Ive tried using Regex replace but doesn’t seems to work.

There are various kinds of numbers some start with 0s and some end with x1234.For example +1-123-456-7891x000,234-234-1234x12345. This numbers are made up.

You can still use multiple regex replaces to cover the cases you want if a single regex doesn’t work for all formats. If you have a specific example of what you want the regex replace to do you can also ask AI to help with the regex.

If regex doesn’t work I would try using use LLM if that’s easier to set up!