Zip multiple notepad files within Palantir

We have a requirement to zip multiple Notepad files and export them to another source.
I have a way of doing this for a dataset input in a code repository, as shown below:

```python
from transforms.api import transform, Input, Output
import zipfile

@transform(
    my_output=Output("output_path"),
    source_df=Input("dataset_input"),
)
def compute(ctx, my_output, source_df):
    # List every file in the input dataset
    files = source_df.filesystem().files().collect()

    # Create the zip in the output dataset, then add each input file to it
    with my_output.filesystem().open("foundry_code_examples.zip", 'wb') as write_zip:
        with zipfile.ZipFile(write_zip.name, 'w') as zip_file:
            for file_row in files:
                with source_df.filesystem().open(file_row["path"], 'rb') as markdown_file:
                    zip_file.write(markdown_file.name, arcname=file_row["path"])
```

However, this approach doesn’t work when we pass a Notepad RID as the input; I get this error:

Code references a non-dataset resource

A non-dataset resource is referenced as the input or output of a transform. With a few specific exceptions, it is not possible to reference non-dataset resources (folders, etc.) as transform inputs or outputs.
The referenced non-dataset resources are as below. You can identify which files are referencing them using the code repository search functionality.
ri.notepad.main.notepad-template.*

I would really appreciate any response on whether this is possible in Palantir. Also, feel free to suggest alternative approaches (if any) to zip multiple Notepad files and export them to another source from Palantir.


AIP ASSIST:
Automating ZIP and Export Processes in Foundry
Here’s how to automate zipping and exporting files in Foundry:

  1. Using Pipeline Builder

Basic Setup

  1. Create a new pipeline
  2. Add source files/datasets
  3. Configure compression transforms
  4. Set up export destination

Automation Steps:

```python
from transforms.api import transform, Input, Output

# Example transform configuration
@transform(
    output=Output("/path/to/output"),
    source=Input("/path/to/source"),
)
def compress_and_export(output, source):
    # Compression logic here
    output.write_dataframe(source.dataframe(), output_format="csv")
```

  2. Batch Processing

```python
from transforms.api import transform, Input, Output
import zipfile

@transform(
    my_output=Output(""),
    source_df=Input(""),
)
def batch_zip_export(ctx, my_output, source_df):
    files = source_df.filesystem().files().collect()

    with my_output.filesystem().open("export.zip", 'wb') as write_zip:
        with zipfile.ZipFile(write_zip.name, 'w') as zip_file:
            for file_row in files:
                with source_df.filesystem().open(file_row["path"], 'rb') as input_file:
                    zip_file.write(input_file.name, arcname=file_row["path"])
```
  3. Export Configuration
  • Only files modified since the last export are processed by default (see the incremental sketch below)
  • Files in the destination will be overwritten unless configured otherwise
  • Create dedicated sub-folders for exported data
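A minimal sketch of that "only modified files" behaviour, assuming an incremental transform with the default 'added' read mode; the dataset paths are placeholders:

```python
from transforms.api import transform, incremental, Input, Output

@incremental()
@transform(
    my_output=Output("/path/to/export_staging"),  # placeholder path
    source_df=Input("/path/to/source"),           # placeholder path
)
def export_new_files_only(my_output, source_df):
    # In incremental mode, files() lists only the files added since the
    # last successful build, not the full dataset.
    for file_row in source_df.filesystem().files().collect():
        with source_df.filesystem().open(file_row["path"], 'rb') as src:
            with my_output.filesystem().open(file_row["path"], 'wb') as dst:
                dst.write(src.read())
```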

Best Practices

  1. Filter and prepare data before export
  • Ensure data meets export control rules
  • Filter to necessary data only
  • Optimize file sizes
  2. Configure Output Format (see the sketch after this list)
  • Coalesce partitions if needed
  • Set appropriate compression levels
  • Choose suitable file format for destination
  3. Monitor and Validate
  • Check export logs
  • Verify file integrity
  • Monitor storage usage
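To illustrate the "Configure Output Format" point, here is a minimal sketch that coalesces to a single partition and writes compressed CSV; the paths and the export_ok filter column are assumptions:

```python
from transforms.api import transform, Input, Output

@transform(
    out=Output("/path/to/export_ready"),  # placeholder path
    src=Input("/path/to/source"),         # placeholder path
)
def prepare_for_export(out, src):
    # Keep only rows cleared for export (hypothetical flag column)
    df = src.dataframe().filter("export_ok = true")
    # Coalesce to one partition so the destination receives a single file,
    # and gzip-compress the CSV output
    out.write_dataframe(
        df.coalesce(1),
        output_format="csv",
        options={"compression": "gzip"},
    )
```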

If you need specific guidance on particular file types or export destinations, please provide more details.

Hello Maverick,

Thanks for your response. As you can see, I am already using your batch processing approach.
However, when we take a Notepad RID as an input, it doesn’t work, as we get the error mentioned above.

I want to know an approach that works for Notepad files.

Transforms and Pipeline Builder operate on datasets or media sets as inputs, not on Notepad documents.

Did you check whether there is an API endpoint you could use to get the string content of a Notepad?
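For illustration only, this is the general shape such a call would take from an external client; the Notepad endpoint path here is purely hypothetical, as no such public API is confirmed in this thread:

```python
import requests

FOUNDRY_HOST = "https://your-stack.palantirfoundry.com"  # assumption
TOKEN = "<bearer-token>"

def get_notepad_content(notepad_rid: str) -> str:
    # Hypothetical endpoint -- shown only to illustrate the call pattern
    resp = requests.get(
        f"{FOUNDRY_HOST}/api/hypothetical/notepads/{notepad_rid}/content",
        headers={"Authorization": f"Bearer {TOKEN}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.text
```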


We do have such APIs for datasets, media sets, and objects. However, I couldn’t find any API that gets the content of a Notepad. Please help me with the endpoint if you know of one. Thanks!!
