Hey, I’m trying to convert .eml files into PDF media references using a dynamic UDF. I don’t want to create an output dataset; I want to keep the .eml files so I can use the transform “Extract rows from a dataset of email files”. Is it possible to build a dynamic UDF in a code repository for use in Pipeline Builder?
What do you mean when you say you don’t want to create a dataset?
As for writing a UDF for use in Pipeline Builder, you can build a Python function that acts as a UDF in Builder (docs), or you can write a Java UDF that can be used in Pipeline Builder.
Thanks for your response. Let me explain better: I am trying to build a Python UDF that, from within Pipeline Builder, takes an input dataset full of .eml files and outputs PDFs that can then be used as media references in my dataset (which also backs an ontology object). The problem I am running into is that there doesn’t seem to be a way to have a UDF that takes unstructured data (.eml files) as input and produces unstructured data (PDF files) as output. The workaround I found is a transforms repository that defines the inputs and outputs explicitly, but I would like to handle everything dynamically within Pipeline Builder. Is my analysis or strategy wrong?
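For reference, the per-file step I have in mind looks roughly like the sketch below. It is plain Python using only the standard library `email` module, so it does not depend on how the UDF ends up being wired into Pipeline Builder. `extract_eml_fields` is a hypothetical helper name, not a Foundry API, and the PDF rendering and media reference registration are deliberately left out because they depend on the platform and on whichever PDF library is available.

```python
from email import message_from_bytes, policy


def extract_eml_fields(raw: bytes) -> dict:
    """Parse the raw bytes of one .eml file into the fields a PDF renderer would need.

    Hypothetical helper for illustration only; it is not part of any Foundry API.
    """
    # policy.default returns an EmailMessage with decoded headers and get_body().
    msg = message_from_bytes(raw, policy=policy.default)

    # Prefer the plain-text body, falling back to HTML if that is all there is.
    body_part = msg.get_body(preferencelist=("plain", "html"))
    body = body_part.get_content() if body_part is not None else ""

    return {
        "subject": str(msg["subject"] or ""),
        "from": str(msg["from"] or ""),
        "to": str(msg["to"] or ""),
        "body": body,
    }


# Small in-memory example standing in for a file read from the input dataset.
sample = (
    b"From: a@example.com\r\n"
    b"To: b@example.com\r\n"
    b"Subject: Hello\r\n"
    b"\r\n"
    b"Hi there\r\n"
)
fields = extract_eml_fields(sample)
```

The returned dictionary would then be handed to whatever step renders the PDF, which is the part I cannot see how to express as a UDF output.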