Convert PDF files to Mediaset files in Code Repo

I’m working on a transform that needs to convert and store multiple file types, including PDF, DOCX, and PPT files, into media sets in Foundry. However, I’m running into some issues.

Here is a snippet of my current code for reference:

from transforms.api import transform, Input
from transforms.mediasets import MediaSetOutput

@transform(
    input_dataset=Input("ri.foundry.main.dataset.xxxx"),
    output_media_set=MediaSetOutput(
        "ri.mio.main.media-set.yyyy",
        media_set_schema={"schema_type": "document", "primary_format": "pdf"},
        additional_allowed_input_formats=["docx", "ppt"],
        storage_configuration={"type": "native"},
        retention_policy="forever",
        should_snapshot=False,
    ),
)
def upload_files(input_dataset, output_media_set):
    output_media_set.set_write_mode("modify")
    output_media_set.put_dataset_files(
        input_dataset, ignore_items_not_matching_schema=True
    )

Any suggestions for troubleshooting this in Code Repositories? Thanks.

What issues are you running into?

Error: The code in transforms-python/src/myproject/datasets/mediaset_output.py is attempting to create or reference a media set but has not provided all the mandatory parameters required for media set creation.

But I have provided all the required parameters.

Thanks for that information! I think you might also need to provide the write mode of the media set (i.e., transactional or transactionless).

You can follow the docs below to have the UI populate all of this information for you, and to check that you have all the required parameters.

https://www.palantir.com/docs/foundry/transforms-python/media-sets#create-a-media-set
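
As a rough way to run that check by hand, here is a minimal sketch in plain Python (no Foundry imports) that compares the keyword arguments from the snippet in the question against a checklist. The checklist and the key name `write_mode` are assumptions based only on the suggestion above, not a confirmed `MediaSetOutput` signature. The actual parameter name and allowed values should come from the linked docs or the UI-generated code.

```python
# Keyword arguments the question's snippet passes to MediaSetOutput.
provided = {
    "media_set_schema": {"schema_type": "document", "primary_format": "pdf"},
    "additional_allowed_input_formats": ["docx", "ppt"],
    "storage_configuration": {"type": "native"},
    "retention_policy": "forever",
    "should_snapshot": False,
}

# ASSUMPTION: a hypothetical checklist for illustration only. "write_mode" is a
# placeholder name for the transactional/transactionless setting mentioned in
# the reply; verify the real keyword against the Foundry docs.
EXPECTED = (
    "media_set_schema",
    "storage_configuration",
    "retention_policy",
    "write_mode",
)


def missing_parameters(provided, expected=EXPECTED):
    """Return the checklist entries that are absent from the provided kwargs."""
    return [name for name in expected if name not in provided]


missing = missing_parameters(provided)
print(missing)
```

Under these assumptions, the only entry reported as missing is the write-mode placeholder, which matches the suggestion in the reply.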