Entity Resolution / Fuzzy Matching

Are there any AIP examples of fuzzy matching / entity resolution between systems using Pipeline Builder? This is a relatively standard use case, and I can’t seem to find any “out of the box” or example solutions. Has anyone figured out a creative way to accomplish this without writing a custom fuzzy matching algorithm?

Hi @ally! There are a few things you can do. You can use the Levenshtein distance expression in an advanced join to match on columns that are “close” to each other.
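For anyone who wants to see what a Levenshtein-based join does conceptually, here is a minimal sketch in plain Python. It is not Pipeline Builder code and not how the expression is implemented there; the function names `levenshtein` and `fuzzy_join`, the lowercasing step, and the `max_distance` threshold of 2 are all assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn `a` into `b`."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def fuzzy_join(left, right, max_distance=2):
    """Hypothetical helper: pair every left value with every right value
    whose case-insensitive edit distance is within `max_distance`."""
    matches = []
    for x in left:
        for y in right:
            d = levenshtein(x.lower(), y.lower())
            if d <= max_distance:
                matches.append((x, y, d))
    return matches


pairs = fuzzy_join(["Acme Corp"], ["ACME Corp.", "Globex"])
```

Note that this compares every row on the left with every row on the right, so on large tables you would normally narrow the candidates first (for example with an exact match on another column) and only then apply the distance threshold.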

You can also use the Use LLM node and ask the model (e.g., GPT-4o) to output the closest match for a particular column value. You can read more about the node here: https://www.palantir.com/docs/foundry/pipeline-builder/pipeline-builder-llm

Let me know if this is what you’re looking for or if you have any follow up questions!

Thank you @helenq !! Joining using Levenshtein distance makes a lot of sense.

How would you use the LLM node to accomplish this when the node can only take in one dataset/transform? That is, if I have column X in table A and want to match it against column Y in table B, how would the LLM node support this?

The LLM node is helpful here if you have a preset list of values you’re expecting: you can ask the model to find the closest match out of that list. If you don’t have a preset list, then the Levenshtein distance join is your best bet!
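To make the preset-list idea concrete, here is a rough sketch of the kind of prompt you might give the model for each row, plus a check that the answer is actually one of the allowed values. It is plain Python and does not call any Pipeline Builder or model API; `build_match_prompt`, `validate_choice`, the prompt wording, and the `NONE` fallback are all hypothetical choices for illustration.

```python
from typing import List, Optional


def build_match_prompt(value: str, candidates: List[str]) -> str:
    """Hypothetical helper: build the per-row prompt text that asks the
    model to choose the closest entry from a fixed list."""
    options = "\n".join(f"- {c}" for c in candidates)
    return (
        "Pick the entry from the list below that most closely matches the input.\n"
        "Reply with the entry exactly as written, or NONE if nothing fits.\n\n"
        f"Allowed values:\n{options}\n\n"
        f"Input: {value}"
    )


def validate_choice(response: str, candidates: List[str]) -> Optional[str]:
    """Accept the model's reply only if it is exactly one of the allowed
    values; anything else (including NONE) is treated as no match."""
    cleaned = response.strip()
    return cleaned if cleaned in candidates else None


allowed = ["Acme Corporation", "Globex Inc"]
prompt = build_match_prompt("Acme Corp.", allowed)
```

Validating the reply against the list is worth doing because a model can return a value that is close to, but not exactly, one of the allowed entries.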
