How to import and read unstructured data files from Foundry in a code repository

Dear Community,
I need some clarity on how to import and read unstructured data files (for example .pdf or .rpt) from Foundry in a code repository. How can the file contents then be loaded into a dataset or DataFrame using Python code?

Many thanks in advance for your support.

You can upload your files to a media set. You can then use the media set in Python transforms or in Pipeline Builder.

Alternatively, you can use this guide to access and parse raw files stored within a dataset:
https://www.palantir.com/docs/foundry/code-examples/raw-file-parsing-transforms/

Dear @bkaplan,
Media sets do not support .rpt, and our .pdf files are not being accepted either.
Focusing on .rpt for now, what would be the approach?

As we are not able to load the data, we are stuck without any progress.

We have also tried loading the files into a dataset, but with no luck (since the files are unstructured, nothing is displayed in the dataset).

Any kind of help or direction would be very welcome!

You should be able to upload the files to a dataset. You will not see anything in the preview, but if you go into the files view (Details > Files), you should see the files that are included. You can then iterate over and parse the raw files in a transform, as @manu flagged above, by reading the individual files and transforming them into a new output DataFrame.

It might be worth taking a look at our docs for building with unstructured data to learn more about how to read from a dataset with no schema and transform it into a DataFrame.
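
To make the "iterate over the raw files in a transform" idea concrete, here is a minimal sketch, not a definitive implementation. It assumes a Python transforms repository where `transforms.api` is available; the dataset paths, the `file_to_row` helper, and the output column names are placeholders invented for illustration. The import is guarded so the helper can also be tried outside Foundry.

```python
# Sketch only: dataset paths, helper name, and column names are placeholders.
try:
    # Only importable inside a Foundry Python transforms repository.
    from transforms.api import transform, Input, Output
except ImportError:
    transform = None


def file_to_row(path, data):
    """Turn one raw file into a (path, size_in_bytes, text) row."""
    # Assumes a text-like file; undecodable bytes become U+FFFD.
    text = data.decode("utf-8", errors="replace")
    return (path, len(data), text)


if transform is not None:

    @transform(
        out=Output("/path/to/parsed_output"),  # placeholder path
        raw=Input("/path/to/raw_files"),       # placeholder path
    )
    def compute(ctx, out, raw):
        fs = raw.filesystem()
        rows = []
        # List the raw files in the schemaless dataset and read each one.
        for status in fs.ls(glob="*.rpt"):
            with fs.open(status.path, "rb") as handle:
                rows.append(file_to_row(status.path, handle.read()))
        # Build a Spark DataFrame from the collected rows and write it out.
        df = ctx.spark_session.createDataFrame(
            rows, schema="path string, size_bytes long, content string"
        )
        out.write_dataframe(df)
```

This version reads every file on the driver, which should be fine for a modest number of small files. A binary format such as .pdf would need a PDF parsing library in place of the plain `decode` call.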

Maybe it’s too late, but this post may be useful to you for processing .rpt files (if they look like the one in the post).
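
For illustration, here is a rough sketch of parsing one kind of .rpt file in plain Python. The layout is an assumption, since the thread does not show the actual file: line 1 holds the column names, line 2 holds runs of dashes marking each column's width, and the remaining lines are fixed-width data rows. The `parse_rpt` name and the sample text are made up for this example.

```python
import re


def parse_rpt(text):
    """Parse fixed-width report text into a list of dicts.

    Assumed layout (not confirmed by the thread): line 1 holds column
    names, line 2 holds runs of dashes marking each column's width,
    and the remaining non-empty lines are data rows.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2:
        return []
    header, ruler, body = lines[0], lines[1], lines[2:]

    # Each run of dashes in the ruler gives one column's (start, end) span.
    spans = [(m.start(), m.end()) for m in re.finditer(r"-+", ruler)]
    names = [header[s:e].strip() for s, e in spans]

    rows = []
    for line in body:
        # Skip a trailing footer such as "(2 rows affected)".
        if line.startswith("("):
            continue
        rows.append({n: line[s:e].strip() for n, (s, e) in zip(names, spans)})
    return rows


# Made-up sample in the assumed layout.
sample = (
    "id   name      \n"
    "---- ----------\n"
    "1    Alice     \n"
    "2    Bob       \n"
    "\n"
    "(2 rows affected)\n"
)
rows = parse_rpt(sample)
```

Values wider than their dash ruler would be cut off by this sketch, so a real file may need the spans adjusted. The resulting list of dicts could then be turned into a DataFrame inside a transform.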

Hope it helps.