Clarification on Ontology primary key datatype (String vs Integer)


In Palantir Foundry, I have a few streaming pipelines that cast numeric IDs to STRING before using them as primary keys. These pipelines were developed by Palantir engineers.

I am trying to understand if that is a best practice.

Are STRING PKs generally preferred over INTEGER in the Ontology? If so, why?

If the source ID is purely numeric, what are the trade-offs of casting to STRING vs keeping INTEGER in terms of storage size, index/filter speed, join/shuffle cost, and repeated cast overhead?

Thank you


Hey, sorry for the delayed reply. I would say strings could be preferred over integers because they give you the flexibility to change your IDs from purely numeric values to a mix of numeric and alphanumeric values without breaking your schema and streaming pipeline. I'm not sure about the impact on the other areas you mentioned, but I'll let someone from streaming chime in there.
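To make the flexibility point concrete, here is a minimal sketch in plain Python. It is not Foundry or Ontology API code, and `to_primary_key` and `fits_integer_key` are hypothetical helpers made up for illustration. It assumes the source might one day start emitting IDs like `"A-104"`.

```python
from typing import List, Union


def to_primary_key(source_id: Union[int, str]) -> str:
    """Hypothetical helper: normalize any source ID to a string key."""
    return str(source_id).strip()


def fits_integer_key(source_id: Union[int, str]) -> bool:
    """Hypothetical helper: would this ID fit an integer-typed key column?"""
    try:
        int(source_id)
        return True
    except (TypeError, ValueError):
        return False


# Today the source emits purely numeric IDs.
numeric_ids: List[Union[int, str]] = [101, 102, 9]
# Assumed future change: the source starts emitting alphanumeric IDs too.
mixed_ids: List[Union[int, str]] = [103, "A-104"]

# With a string key, both generations of IDs share one column type.
keys = [to_primary_key(i) for i in numeric_ids + mixed_ids]

# With an integer key, the new ID would force a schema change.
rejected = [i for i in numeric_ids + mixed_ids if not fits_integer_key(i)]

# One trade-off to keep in mind: string keys sort lexicographically,
# which differs from numeric order for IDs of different lengths.
string_order = sorted(to_primary_key(i) for i in numeric_ids)
numeric_order = sorted(int(i) for i in numeric_ids)
```

In this sketch only `"A-104"` ends up in `rejected`, which is the case where an integer key would break. The ordering difference at the end is worth checking if anything downstream sorts or range-filters on the key.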