What's the best way to do geocoding in Foundry?

Hey,
I’d like to convert text addresses into coordinates from within the pipeline, either via Transforms or Pipeline Builder.

I’m thinking of handling it either as an External Transform in Transforms or as a UDF in Pipeline Builder, which would make the external call.

I was wondering if there’s a smarter way to do it. The docs mention Mapbox integrations for geospatial workflows, but I couldn’t find anything on geocoding.

Thanks!

Julien

9 Likes

We would also love to see this as a first-class platform feature. Very common workflow.

5 Likes

AFAIK, there is currently no first-class way of doing this!

We have done this a few times by now, and here are some recommendations from my experience doing it with regular Transforms:

Use an incremental pipeline that caches geocoding results: in an incremental pipeline you can read the previous output and add new geocoding results to it.
This way you won’t need to geocode the same addresses again when they reappear.
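
A minimal sketch of the caching idea, not the actual pipeline code: the previous output is modelled as a plain dict, and only addresses missing from it are sent to the geocoder. The names `normalize`, `geocode_with_cache` and `geocode_fn` are hypothetical. In a real incremental transform the cache would be read from the previous output dataset, and `geocode_fn` would wrap whichever external geocoding service you call.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

# (lat, lon), or None when the lookup returned nothing
Coordinates = Optional[Tuple[float, float]]


def normalize(address: str) -> str:
    """Collapse whitespace and case so trivially different spellings share a key."""
    return " ".join(address.split()).lower()


def geocode_with_cache(
    addresses: Iterable[str],
    cache: Dict[str, Coordinates],
    geocode_fn: Callable[[str], Coordinates],
) -> Dict[str, Coordinates]:
    """Return a copy of the cache extended with results for unseen addresses only."""
    updated = dict(cache)  # leave the caller's cache untouched
    for address in addresses:
        key = normalize(address)
        if key not in updated:
            # Only addresses that were never geocoded trigger an external call.
            updated[key] = geocode_fn(address)
    return updated
```

Keying the cache on a normalized address is a design choice of this sketch. It avoids paying for a second lookup when the same address reappears with different casing or spacing.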

Don’t use UDFs: in my experience UDFs create a ton of overhead and take up multiple workers. It’s much easier to just take a limited number of rows from the dataframe and iterate through them with regular Python code.
If you need parallelisation, you can use threads instead.
If you need to geocode more rows, you can just run the pipeline more frequently and it will fill up your cache over time.
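
A rough sketch of the "limited rows plus threads" approach, using only the Python standard library. `geocode_batch`, `geocode_fn`, `max_rows` and `max_workers` are hypothetical names and defaults. The input list stands in for addresses already collected from a dataframe (for example via a row limit followed by a collect), and `geocode_fn` stands in for the external call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Optional, Tuple

# (lat, lon), or None when the lookup failed
Coordinates = Optional[Tuple[float, float]]


def geocode_batch(
    addresses: Iterable[str],
    geocode_fn: Callable[[str], Coordinates],
    max_rows: int = 100,
    max_workers: int = 8,
) -> Dict[str, Coordinates]:
    """Geocode at most max_rows distinct addresses per run, using a thread pool."""
    # Deduplicate while keeping order, then cap the batch size for this run.
    batch = list(dict.fromkeys(addresses))[:max_rows]

    def safe_geocode(address: str) -> Coordinates:
        # One failed request should not fail the whole batch.
        try:
            return geocode_fn(address)
        except Exception:
            return None

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input batch.
        results = list(pool.map(safe_geocode, batch))
    return dict(zip(batch, results))
```

Threads fit here because the work is dominated by waiting on network calls. Anything beyond `max_rows` is simply left for the next run, which is how the cache fills up over time. Note that this sketch stores failures as `None`; whether to cache or retry those is a choice left open.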

If anyone is interested, I can search through some previous code and share some snippets!

4 Likes

Very interested in seeing some past examples! I am working through this problem now. Echoing the sentiment that this should be a first-class platform feature.

1 Like

I know this is not immediately helpful, but we (Pipeline Builder) have heard that this is an important workflow. We are building out a first-class solution, coming soon (~months)!

2 Likes

Awesome, thank you for the feedback!