Creating API Endpoint in Foundry for Databricks AWS Integration

My Setup

  • Databricks in AWS private bucket
  • Palantir Foundry environment
  • Familiar with Databricks connector in Foundry

What I Need Help With

I need to create an API endpoint in Foundry that can receive data pushed from our AWS Databricks environment. I understand the Databricks connector exists, but I’m struggling with:

  1. How to properly set up the API endpoint in Foundry to receive external data pushes
  2. Configuration needed on the Databricks side to successfully push data
  3. Authentication setup between the systems

Questions

  • Are there examples for creating this type of API endpoint?
  • What’s the recommended approach for handling the data once received?
  • Are there documentation pages that cover this specific use case?

I’ve reviewed available documentation but couldn’t find clear step-by-step instructions for this particular integration. Any guidance or examples would be greatly appreciated.

While we have upcoming support for receiving webhooks in Foundry, it sounds like this might not be totally appropriate for your use case.

If you are trying to push data because your Databricks is in a private AWS bucket, I’d recommend setting up a Data Connection Agent: https://www.palantir.com/docs/foundry/data-connection/set-up-agent/

If you are trying to push data into Foundry for some other reason, I’d recommend using our public APIs, which include endpoints for pushing data directly to datasets and to streams: https://www.palantir.com/docs/foundry/api/v2/general/overview/introduction/?productId=foundry
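For the public API route, here is a minimal sketch of what a dataset file upload from a Databricks job might look like. It only builds the HTTP request, so it can be inspected before anything is sent. The helper name `build_upload_request`, the host, the dataset RID, and the token are placeholders, and the endpoint path and `branchName` parameter reflect my reading of the v2 datasets API, so please verify them against the API reference linked above.

```python
import urllib.parse
import urllib.request


def build_upload_request(host, dataset_rid, file_path, payload, token, branch="master"):
    """Build (but do not send) a file-upload request for a Foundry dataset.

    Assumption: the v2 upload endpoint has the shape
    /api/v2/datasets/{datasetRid}/files/{filePath}/upload and accepts a raw
    octet-stream body. Check the API reference before relying on this.
    """
    # Percent-encode each path segment so nested file paths stay in one segment.
    rid = urllib.parse.quote(dataset_rid, safe="")
    path = urllib.parse.quote(file_path, safe="")
    query = urllib.parse.urlencode({"branchName": branch})
    url = f"https://{host}/api/v2/datasets/{rid}/files/{path}/upload?{query}"
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={
            # Placeholder token; in practice read it from a secret store.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/octet-stream",
        },
    )


# Hypothetical values for illustration only.
req = build_upload_request(
    host="example.palantirfoundry.com",
    dataset_rid="ri.foundry.main.dataset.abc",
    file_path="batch/2024-01-01.csv",
    payload=b"a,b\n1,2\n",
    token="TOKEN",
)
print(req.get_method(), req.full_url)
```

Sending the request is then a call to `urllib.request.urlopen(req)`. That step only works if the Databricks job can reach the Foundry host over HTTPS; otherwise the agent approach is the better fit.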

Can you describe your requirements in terms of business KPIs? What do you want to achieve with this data push? Streaming? Near real time?
Also: do you have direct network connectivity from your Databricks jobs to your Foundry instance?

More context will help to find the best possible integration method.

Below are the requirements for the solution:

  • Frequency: Batch synchronization occurring twice daily (not real-time or streaming)
  • Purpose: The data will undergo processing in Foundry and ultimately be presented in dashboards

No direct network connection exists between our Databricks environment and the Foundry instance.