Optimizing PDF Transformation Workflow: How to Process Only the Latest Uploaded Document in a Media Set?

Hello,

I am developing a workshop where users can upload PDF documents, which are then processed and transformed in the backend. Currently, when a PDF is uploaded, it is placed into a media set, and the entire media set undergoes transformation. This approach is both time-consuming and resource-intensive.

I am seeking advice on how to modify this workflow so that only the most recently uploaded PDF is transformed, rather than reprocessing the entire media set. Given that media sets do not support incremental pipeline builds, I am looking for suggestions on how to achieve more efficient, incremental processing.

Hey @etuhabonye, the media set team will be working on adding incremental media set support in the coming weeks.

In the meantime, you could potentially upload each new set of documents separately and union them together. I'll also let the media set team comment on the above and see whether there is another, preferred workaround.
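
To make the "process separately, then union" idea a bit more concrete, here is a minimal plain-Python sketch. It does not use any Foundry or media set API. `UploadedPdf`, `process_new_uploads`, `latest_upload`, and the `transform` callback are all hypothetical names made up for illustration, and it assumes you can keep a record of which files have already been transformed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class UploadedPdf:
    # Hypothetical record describing one uploaded file (not a Foundry type).
    path: str
    uploaded_at: float  # upload time as a Unix timestamp


def process_new_uploads(
    uploads: List[UploadedPdf],
    already_processed: Dict[str, str],
    transform: Callable[[UploadedPdf], str],
) -> Dict[str, str]:
    """Transform only uploads not seen before, then union with prior results."""
    new_results = {
        pdf.path: transform(pdf)
        for pdf in uploads
        if pdf.path not in already_processed
    }
    # Union step: earlier results are reused as-is, new results are added.
    return {**already_processed, **new_results}


def latest_upload(uploads: List[UploadedPdf]) -> UploadedPdf:
    """Return the most recently uploaded file, by timestamp."""
    return max(uploads, key=lambda pdf: pdf.uploaded_at)
```

With this shape, the expensive `transform` runs only on files that are missing from `already_processed`, so each new upload costs one transformation instead of a full reprocess. How the set of processed results is stored and how the union is expressed in an actual pipeline depends on your setup.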