Problem with Document Page to Image in Logic

jackmiller2003 · February 20, 2025, 10:22am

Hi,

I'm having an issue with the “Document Page to Image in Logic” block. It doesn't seem to accept my document reference (the same document reference works for OCR extraction). I'd love to use this block so that I can pass both the OCR output and an image into the LLM block to get a good PDF → Markdown conversion.

nickk · February 20, 2025, 4:43pm

Hey! I see that you are currently passing a media reference value directly into the input as a JSON string. Could you instead create an Ontology object for each media reference you want to use? You can do this in Pipeline Builder (media set → convert to dataset rows → create new object type with media reference property).

This is happening because the case expression does not let you define a specific media schema type (PDF, PNG, etc.). The value is therefore populated with an empty schema, and the block fails because Convert doc page to image requires a PDF as input.

There is work underway to fix this, through one of the following:

- Becoming more lenient with these media expressions, so that any (or no) media schema is allowed to run and only fails at runtime
- Supporting media references as Logic inputs, where users can manually define the schema
- Supporting a cast that lets you define a media schema
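
To make the failure mode above a bit more concrete, here is a minimal Python sketch of the kind of check involved: a reference that carries no declared type is rejected before any PDF-only step runs. This is not Foundry code. The JSON field names (`mimeType`, `reference`, `id`) and the helper `declares_pdf` are assumptions made up for illustration, so check a real media reference value for its actual shape.

```python
import json


def declares_pdf(media_reference_json: str) -> bool:
    """Return True only if the reference explicitly declares a PDF type.

    The "mimeType" key is a hypothetical field name used for illustration;
    it is not a confirmed part of any real media reference schema.
    """
    try:
        ref = json.loads(media_reference_json)
    except json.JSONDecodeError:
        # Not valid JSON at all, so it cannot declare a type.
        return False
    if not isinstance(ref, dict):
        return False
    return ref.get("mimeType") == "application/pdf"


# A reference with a declared type versus one with an empty schema.
typed = json.dumps({"mimeType": "application/pdf", "reference": {"id": "example"}})
untyped = json.dumps({"reference": {"id": "example"}})

print(declares_pdf(typed))
print(declares_pdf(untyped))
```

In this sketch the untyped value plays the role of the raw JSON string input: nothing in it says the media is a PDF, so a step that requires a PDF has no basis to accept it. Backing the reference with an Ontology object property is what supplies that type information in the workaround described above.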