Extract Text from PDF in Pipeline Builder

tiffany · July 18, 2024, 2:36pm

Hi, the PDF text extraction board is not backed by GPT. It is backed by text extraction models that are specialized for reading and parsing PDFs. You cannot use self-hosted models with this board. If you want to use a self-hosted model, you can instead import into your pipeline a UDF that calls the model directly.
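
For the UDF route, here is a rough sketch of what the function body might look like in plain Python. It is not the Pipeline Builder UDF API: the names `PdfTextModel` and `extract_pdf_text` are made up for illustration, and the `model` argument stands in for however you actually call your self-hosted model. The registration and import steps for a real UDF are not shown.

```python
from typing import Callable

# Hypothetical type: any callable that takes raw PDF bytes and returns text.
# In a real UDF this would wrap the call to your self-hosted model.
PdfTextModel = Callable[[bytes], str]


def extract_pdf_text(pdf_bytes: bytes, model: PdfTextModel) -> str:
    """Pass one PDF to the supplied model and return whitespace-normalized text."""
    # PDF files start with the "%PDF-" header; reject anything else early.
    if not pdf_bytes.startswith(b"%PDF-"):
        raise ValueError("input does not start with the %PDF- header")
    raw_text = model(pdf_bytes)
    # Collapse runs of spaces and newlines into single spaces.
    return " ".join(raw_text.split())
```

Keeping the model behind a plain callable means the same function can be tried locally with a stub before it is wired to the real model.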