Bypassing Workshop 200k Export Limit Using Ontology Attachments

I’m trying to work around the Workshop 200k row export limit for five large datasets that I want users to be able to download directly from my application’s landing page. These datasets can grow to 5 million rows, so I need a scalable export mechanism.

Proposed Approach
My idea is to create a separate ontology object with two fields:

Dataset — the dataset name

Attachment — a string storing an ri.attachment… reference

T.StructType([
    T.StructField('Dataset', T.StringType()),
    T.StructField('Attachment', T.StringType())
])
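For reference, the backing dataset behind this object only ever needs one row per export, with the attachment left empty until the upload step runs. A sketch of the seed rows (every name other than "Final 3" is a placeholder I've made up):

```python
# Hypothetical seed rows for the attachment-storing backing dataset.
# Only "Final 3" is a real name from this post; the rest are placeholders.
SEED_ROWS = [
    ("Final 1", None),
    ("Final 2", None),
    ("Final 3", None),
    ("Final 4", None),
    ("Final 5", None),
]

def to_records(rows):
    """Shape rows as (Dataset, Attachment) records matching the schema above."""
    return [{"Dataset": name, "Attachment": rid} for name, rid in rows]
```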

My thought process here is to build a transform that generates the CSV by coalescing each dataset into a single file:

from transforms.api import transform, Input, Output

@transform(
    output=Output("/REDACTED/CSV Datasets Storage/Final3Object"),
    my_input=Input("ri.foundry.main.dataset.ea77c655-b761-4d7c-af64-fa546a2993c7")
)
def compute_final3(output, my_input):
    # Coalesce to a single partition so the output is written as one CSV file
    output.write_dataframe(
        my_input.dataframe().coalesce(1),
        output_format="csv",
        options={"header": "true"}
    )

After the CSV is produced, I want to run a function that uploads the CSV as an attachment and assigns it to the ontology object. However, I’m running into an issue: the attachment upload API expects a local file path, but the CSV exists only as a Foundry dataset path.
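One direction I've been considering (not verified): read the coalesced CSV part file out of the dataset's filesystem and POST the bytes to the platform attachment-upload endpoint directly. This is only a sketch under assumptions; it presumes the public `POST /api/v2/ontologies/attachments/upload` endpoint, a token with the right scopes, and `build_upload_request`/`upload_csv_as_attachment` are helper names I've made up:

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint path; check your stack's platform API docs before relying on it.
ATTACHMENT_UPLOAD_PATH = "/api/v2/ontologies/attachments/upload"

def build_upload_request(base_url, filename, token, data):
    """Build the POST request carrying raw CSV bytes to the upload endpoint."""
    url = f"{base_url}{ATTACHMENT_UPLOAD_PATH}?filename={urllib.parse.quote(filename)}"
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/octet-stream",
        },
    )

def upload_csv_as_attachment(base_url, filename, token, data):
    """Send the request and return the attachment RID from the response body."""
    req = build_upload_request(base_url, filename, token, data)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["rid"]
```

The CSV bytes themselves would come from reading the single part file in the transform's output dataset (e.g. via the dataset filesystem API), which is the part I haven't figured out how to wire to the attachment upload.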

from functions.api import function, OntologyEdit
from ontology_sdk import FoundryClient
from ontology_sdk.ontology.objects import BuilderAttachmentDataset

@function(edits=[BuilderAttachmentDataset])
def create_csv_attachment(object: BuilderAttachmentDataset) -> list[OntologyEdit]:
    client = FoundryClient()
    ontology_edits = client.ontology.edits()

    final3_csv_path = "/REDACTED/.../CSV Datasets Storage/Final3Object"
    # final2_csv_path = ""  # there will be about 5 CSV paths total in this dataset

    if object.dataset == "Final 3":
        # This is where it breaks: upload() wants a local file path,
        # but final3_csv_path is a Foundry dataset path.
        attachment = client.ontology.attachments.upload(
            file_path=final3_csv_path, attachment_name="final3.csv"
        )
        editable_object = ontology_edits.objects.BuilderAttachmentDataset.edit(object)
        editable_object.attachment = attachment
        return ontology_edits.get_edits()
    else:
        return []


The idea is that the CSVs will be regenerated weekly or so, and the ontology attachments will be updated accordingly. Then, in Workshop, I can use an object card with an attachment property so users can download the full dataset directly, bypassing the 200k row limit.


I’m running into issues dynamically assigning attachments to this new attachment-storing dataset (which should only store five rows total). Is there a solution for this, or a better approach I should consider?