Best practices for bulk decryption using Cipher in Functions: Handling timeouts and API limits

Hi everyone,

I am currently working with Palantir Cipher and attempting to decrypt sensitive data within a Functions (Python & TypeScript v1) repository, following the official documentation.

The Challenge:
I need to decrypt tens of thousands of records. When I call decryptAsync() on each record individually, the Function fails due to execution time limits (timeouts) and other internal errors.

My Questions:

  1. Is there a specific rate limit or a maximum number of calls allowed for the Cipher API within a single Function execution?

  2. Is there a way to perform “bulk decryption” on a specific column/property within Functions, rather than iterating through records one by one?

  3. If processing this volume in Functions is not recommended, what is the best practice for handling large-scale decryption while maintaining security (e.g., using Restricted Views or Batch Pipelines)?

I would appreciate any insights or code snippets on how to handle large datasets with Cipher efficiently.

Thanks in advance!

Hi @wisteria! Apologies for the delayed response.

To answer your questions individually:

  1. The Cipher API has no limits on how many times it can be called, but Functions themselves do have timeouts. You should be able to bump this timeout in Ontology Manager (see the docs here).

  2. I don’t believe there is a bulk endpoint to hit for decryption, but you can definitely collect a bunch of decryptAsync() calls in a Promise.all()! Functions will batch those calls together (see the sketch after this list).

  3. Great question! Functions should be sufficient to handle this, I think. However, if you’re still running into issues, you could always create materializations of your ontology objects with encrypted values and then pass those materializations into Transforms. You can get around the time restrictions of Functions that way! Note that you would need to ensure that the Cipher license used for that has the appropriate permissions (which are different from the ontology action permissions).

    Finally, if you’re able to provide details (no worries if you can’t), what is your current workflow that requires this large-scale decryption methodology? Would dynamic decryption upon accessing specific data also fulfill the workflow requirements?
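
To make that concrete, here’s a minimal sketch of the batching pattern in a TypeScript v1 Function. The `CipherChannel` interface below is just a stand-in for however your Cipher resource is imported into the repository; `decryptAsync()` is the call you’re already using:

```typescript
// Hypothetical shape of a Cipher channel; in a real repository this
// comes from importing your Cipher resource into the Function.
interface CipherChannel {
    decryptAsync(ciphertext: string): Promise<string>;
}

// Collect all the decryptAsync() calls and await them together so the
// Functions runtime can batch the underlying Cipher requests.
async function decryptAll(
    channel: CipherChannel,
    ciphertexts: string[],
): Promise<string[]> {
    return Promise.all(ciphertexts.map((ct) => channel.decryptAsync(ct)));
}
```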

Hi @kat,
Thank you for your quick response!
I now understand that the timeout and other errors are most likely caused by the Function itself (e.g., memory errors) rather than the Cipher API. Thank you also for sharing the best practices!

To answer your question, I’ll simplify my workflow for explanation purposes (the actual workflow is more complex).
My workflow involves aggregating a business metric from an encrypted dataset based on identifier codes and date ranges provided by users via Workshop.
For example, I have an encrypted dataset like the following:

| encrypted_column | value | date |
| --- | --- | --- |
| `<enc:A>` | 1 | 2025-01-01 |
| `<enc:A>` | 2 | 2025-01-05 |
| `<enc:B>` | 1 | 2025-01-03 |

Users input a plaintext identifier code (e.g., ID-20250101-001) and a date range via Workshop. Since the matching between the user’s identifier code and encrypted_column involves complex business rules, encrypting the user input (e.g., ID-20250101-001) as a workaround is not an option. Therefore, all records must be decrypted first, making partial decryption impossible. Furthermore, storing the plaintext of encrypted_column in Palantir Foundry is not allowed due to my customer’s governance requirements, so pre-decryption in Transforms is also not an option.
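
For reference, here is a simplified sketch of what my Function is attempting (the names are illustrative, and `matchesBusinessRules` stands in for the real matching logic):

```typescript
// Hypothetical Cipher channel shape, as in the sketch above.
interface CipherChannel {
    decryptAsync(ciphertext: string): Promise<string>;
}

interface EncryptedRecord {
    encryptedColumn: string; // e.g. "<enc:A>"
    value: number;
    date: string; // ISO date, e.g. "2025-01-01"
}

// Illustrative stand-in for the complex business rules that match a
// decrypted identifier against the user's plaintext code.
function matchesBusinessRules(plaintextId: string, userCode: string): boolean {
    return plaintextId.startsWith(userCode); // the real logic is more involved
}

async function aggregateMetric(
    channel: CipherChannel,
    records: EncryptedRecord[],
    userCode: string, // e.g. "ID-20250101-001", entered in Workshop
    startDate: string,
    endDate: string,
): Promise<number> {
    // Decrypt every record's identifier, batched via Promise.all().
    const plaintextIds = await Promise.all(
        records.map((r) => channel.decryptAsync(r.encryptedColumn)),
    );

    // Filter on the decrypted identifier and the date range, then sum `value`.
    // Only the aggregate leaves the Function; the plaintext is discarded.
    return records.reduce((sum, r, i) => {
        const inRange = r.date >= startDate && r.date <= endDate; // ISO dates compare lexicographically
        return inRange && matchesBusinessRules(plaintextIds[i], userCode)
            ? sum + r.value
            : sum;
    }, 0);
}
```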

This may be a fairly unique use case, but thank you for taking the time to address it!
Thanks again for your help!

Thank you so much for the details! It sounds like you’re taking the right approach for the workflow you described; needing to aggregate the plaintext data is the most common reason to bulk decrypt!

If you’re using Transforms, it may be most efficient to perform the calculation upstream of the ontology (if you’re not working with any ontology edits), which would avoid the need to create materializations. Depending on the complexity of the aggregation you need to perform, it should be possible to collect your metrics by decrypting in memory and simply discarding the plaintext result instead of writing it to a table.
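
Whichever runtime you land on, one way to keep memory bounded is to decrypt in fixed-size chunks and fold each chunk into a running aggregate, so only one chunk’s plaintext is ever held at a time. A quick sketch (the chunk size and `metricOf` extractor are illustrative):

```typescript
// Hypothetical Cipher channel shape, as in the earlier sketches.
interface CipherChannel {
    decryptAsync(ciphertext: string): Promise<string>;
}

// Decrypt in chunks and fold each chunk into a running total so the
// plaintext for at most one chunk exists in memory at any moment.
async function chunkedSum(
    channel: CipherChannel,
    ciphertexts: string[],
    metricOf: (plaintext: string) => number, // maps plaintext to the metric
    chunkSize = 1000,
): Promise<number> {
    let total = 0;
    for (let i = 0; i < ciphertexts.length; i += chunkSize) {
        const chunk = ciphertexts.slice(i, i + chunkSize);
        const plaintexts = await Promise.all(
            chunk.map((ct) => channel.decryptAsync(ct)),
        );
        // Aggregate and discard; the plaintexts go out of scope here.
        total += plaintexts.reduce((sum, p) => sum + metricOf(p), 0);
    }
    return total;
}
```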

Best of luck!
