Does AIP Agent Studio support scanned documents?

narmbrust · January 28, 2025, 4:13pm

(1) Currently, you cannot set input/output token limits in Agent Studio. We are tracking this as a feature request.

(2) In the “Document-context” mode of Agent Studio, we only do text extraction; extracting images within the document is not supported. That is also a feature request we are tracking.

As an alternative, if your document has images you would like to give to an LLM, you could build a pipeline that does the preprocessing you need (for example, running OCR or extracting each page as an image) and then use a custom Function to do the semantic search. That Function can be used in Agent Studio’s new “Function-backed context” or via a Function tool.
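
To make the retrieval step of that alternative more concrete, here is a minimal Python sketch. It assumes the OCR has already run upstream and produced one text string per page. All names here (`ocr_pages`, `retrieve_context`, and the helpers) are hypothetical and are not Agent Studio or Foundry APIs. It also uses plain term-frequency cosine similarity as a stand-in for real embedding-based semantic search, so treat it as an illustration of the shape of the logic rather than a recommended implementation.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_context(query: str, pages: list[str], top_k: int = 2) -> list[tuple[int, str]]:
    """Return up to top_k (page_number, text) pairs, best match first.

    Pages with no token overlap with the query are dropped.
    Stand-in for semantic search: a real version would compare embeddings.
    """
    query_vec = Counter(tokenize(query))
    scored = []
    for page_number, text in enumerate(pages, start=1):
        score = cosine_similarity(query_vec, Counter(tokenize(text)))
        if score > 0:
            scored.append((score, page_number, text))
    # Highest score first; ties broken by page order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(page_number, text) for _, page_number, text in scored[:top_k]]


# Hypothetical OCR output, one string per scanned page.
ocr_pages = [
    "Invoice number 1042 issued to Acme Corp",
    "Payment terms: net 30 days from invoice date",
    "Shipping address and contact details",
]
results = retrieve_context("What are the payment terms?", ocr_pages, top_k=1)
```

In a real setup, the equivalent of `retrieve_context` would live in your custom Function and return the matching page text to the agent as context.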

(3) We natively support PDF, but if your Media Set has a secondary type associated with it, you can upload that type as well. For example, in Agent Studio, I can go to Document-context, Upload Documents, Create a new media set, and add a docx file.