Can I ingest from an email server or a given email inbox into Foundry?

I have an email inbox that I would like to ingest into Foundry, so that my users can send attachments to be processed in a pipeline.

Is there a way to ingest emails into Foundry, e.g. by monitoring an email inbox and ingesting its content?

As of 4/30/24 there is no email connector. Your best bet is to use external transforms with Python to call your mail server's API, or to save the emails into an S3 bucket (or similar) and ingest them from there into Foundry.
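For the "hit your mail server" option, here is a minimal sketch of the fetching part in plain Python, assuming your server speaks IMAP over SSL. It only uses the standard library (`imaplib`, `email`). The function names `fetch_unseen` and `summarize_message` are made up for illustration, and the Foundry-specific wiring (external transform setup, egress policy, credentials) is intentionally left out.

```python
import email
import imaplib
from email import policy


def _header(msg, name):
    """Return a header as a plain string, or None if it is missing."""
    value = msg[name]
    return str(value) if value is not None else None


def summarize_message(raw_bytes):
    """Parse raw RFC 822 bytes into a small dict of header fields."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    return {
        "message_id": _header(msg, "Message-ID"),
        "from": _header(msg, "From"),
        "subject": _header(msg, "Subject"),
        "date": _header(msg, "Date"),
    }


def fetch_unseen(host, user, password, mailbox="INBOX"):
    """Fetch unseen messages over IMAP and return their raw bytes.

    Assumption: the server accepts IMAP over SSL with a username/password.
    """
    raw_messages = []
    with imaplib.IMAP4_SSL(host) as conn:
        conn.login(user, password)
        # readonly=True leaves messages unseen, so dedupe downstream
        # (for example on Message-ID) if you run this repeatedly.
        conn.select(mailbox, readonly=True)
        status, data = conn.search(None, "UNSEEN")
        if status != "OK":
            return raw_messages
        for num in data[0].split():
            status, parts = conn.fetch(num, "(RFC822)")
            if status == "OK" and parts and isinstance(parts[0], tuple):
                raw_messages.append(parts[0][1])
    return raw_messages
```

Something like `[summarize_message(raw) for raw in fetch_unseen(host, user, password)]` would then give you rows you can write to an output dataset.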

As a workaround, we are using AWS SES → S3 → Foundry.
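With this route, what lands in the bucket is typically the raw MIME message, so the pipeline still has to pull the attachments out. A rough sketch of that step with the standard library `email` package, assuming each ingested file is one complete raw message (`extract_attachments` is a hypothetical helper name, not something from Foundry or AWS):

```python
import email
from email import policy


def extract_attachments(raw_bytes):
    """Return the attachments of one raw MIME message as a list of dicts."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    attachments = []
    for part in msg.iter_attachments():
        attachments.append({
            "filename": part.get_filename(),
            "content_type": part.get_content_type(),
            # decode=True undoes base64 / quoted-printable transfer encoding
            "payload": part.get_payload(decode=True),
        })
    return attachments
```

Each returned dict can then be written out as its own file or row for the rest of the pipeline to process.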

If you are connecting to Outlook, you can use the Microsoft Graph API via a REST source in Foundry.
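To give an idea of what the Graph calls look like, here is a small sketch in plain Python that lists inbox messages with attachments. It assumes you already have an OAuth access token with mail read permission for the mailbox. The helper names are hypothetical, and how you map this onto a REST source in Foundry is not shown here.

```python
import json
import urllib.parse
import urllib.request

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def inbox_messages_url(user_id, top=25):
    """Build the URL that lists inbox messages that have attachments."""
    query = urllib.parse.urlencode(
        {
            "$filter": "hasAttachments eq true",
            "$select": "id,subject,receivedDateTime,hasAttachments",
            "$top": str(top),
        },
        safe="$,",
        quote_via=urllib.parse.quote,
    )
    user = urllib.parse.quote(user_id, safe="@")
    return f"{GRAPH_BASE}/users/{user}/mailFolders/inbox/messages?{query}"


def parse_page(payload):
    """Return (messages, next_link) from one Graph list response."""
    return payload.get("value", []), payload.get("@odata.nextLink")


def fetch_page(url, access_token):
    """GET one page of results; follow next_link until it is None."""
    req = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {access_token}"}
    )
    with urllib.request.urlopen(req) as resp:
        return parse_page(json.load(resp))
```

The attachments themselves would then come from a second call per message (`/messages/{id}/attachments`).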

There is also an available connector for Gmail.

Replying to myself, here is a specific example with Exchange:
https://community.palantir.com/t/ms-exchange-online-to-foundry/386/3

You are probably going to want some intermediary service. Yes, you could use the Gmail JDBC driver, but that won’t work for all email providers, and I don’t think it is straightforward to understand.

I built a PoC of an email streaming pipeline using EmailEngine (https://github.com/postalsys/emailengine) as a data source.

If you have seen Nylas (https://www.nylas.com/) before, this service is similar but self-hostable and more barebones. EmailEngine can connect to an email account (Gmail, Outlook, etc.) via OAuth, and you can configure it to send webhooks to a service of your choosing (in this case, Foundry) based on inbox events, like receiving a new message.

To make this work, I needed to:

  1. Get EmailEngine running - I did this locally with Docker.
  2. Hook up an email account to EmailEngine
  3. Create a Foundry Stream with the correct schema
    1. I had to send a few EmailEngine webhooks to a test service first so I could see and understand the JSON schema
  4. Create an EmailEngine webhook to forward info to the Foundry Stream
  5. Create a Pipeline Builder streaming pipeline to parse the JSON
  6. Profit

I hooked up my own Gmail account to it and created an ontology as a PoC.
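To illustrate the schema step, here is a rough Python sketch of flattening a new-message webhook into a flat record that could match a stream schema. The payload field names (`account`, `event`, `data.messageId`, `data.from.address`, `data.attachments`) are my assumption of what the webhook looks like, so check them against the webhooks you actually capture. `to_stream_record` is a hypothetical helper, and pushing the record to the Foundry Stream is not shown.

```python
import json


def to_stream_record(payload):
    """Flatten one webhook payload into a flat dict of stream columns.

    The field names read from the payload are assumptions; missing
    fields simply come out as None (or 0 for the attachment count).
    """
    data = payload.get("data") or {}
    sender = data.get("from") or {}
    return {
        "account": payload.get("account"),
        "event": payload.get("event"),
        "message_id": data.get("messageId"),
        "subject": data.get("subject"),
        "from_address": sender.get("address"),
        "attachment_count": len(data.get("attachments") or []),
        # Keep the full payload so the pipeline can re-parse it later
        "raw_json": json.dumps(payload, sort_keys=True),
    }
```

Keeping the raw JSON as one column means the stream schema does not have to change every time you want another field downstream.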



Note: If you want to stream the emails to Foundry (a different approach from batch, but likely useful in some scenarios), you can likely use a compute module to host and run the Docker container in Foundry as well:
https://community.palantir.com/t/streaming-from-on-prem-dashboard-service/1402/3