Implementation of Docling for Intelligent Document Processing in Foundry

Hello,

I have been working for a while on a versatile intelligent document processing pipeline that ingests large numbers of unstructured documents and turns them into objects. I would like to exchange with members of the community who might have similar use cases.

AIP tools are of course one option, but the processing cost can become high, so I will not go into detail on them below.

A lower-compute alternative to AIP would be to use a project like Docling or a small VLM (Qwen 2B, InternVL, SmolVLM).
Docling

My pipeline currently works from two inputs:

  • A batch part, where documents are ingested, rotated and clustered by similarity (purchase orders, goods receipts, invoices, etc.)
  • A template declaration part, where users declare the set of documents they want to ingest. This includes creating a JSON schema for each template and assigning the template to the right cluster.
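
To make the template declaration part more concrete, here is a minimal sketch of how a template could be tied to a cluster. All names (`Template`, `TemplateRegistry`, `INVOICE_SCHEMA`, the field names and the cluster id) are hypothetical and only for illustration; this is not a Foundry or Docling API.

```python
from dataclasses import dataclass


# Hypothetical example schema for an invoice template (JSON Schema style).
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
    },
    "required": ["invoice_number"],
}


@dataclass
class Template:
    """A user-declared document template and the cluster it is assigned to."""
    name: str
    cluster_id: int
    schema: dict


class TemplateRegistry:
    """Maps each document cluster to at most one template."""

    def __init__(self):
        self._by_cluster = {}

    def register(self, template):
        # Refuse to silently overwrite an existing assignment.
        if template.cluster_id in self._by_cluster:
            raise ValueError(f"cluster {template.cluster_id} already has a template")
        self._by_cluster[template.cluster_id] = template

    def for_cluster(self, cluster_id):
        # Returns None when no template was declared for this cluster.
        return self._by_cluster.get(cluster_id)
```

With something like this, each document coming out of the batch part only needs its cluster id to look up the schema that should be passed to the model.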

After that, both the expected schema and the individual document are given to the model, which performs key-value pair extraction / KIE. From my point of view, this gives a low-compute tool that is versatile across document types and produces consistent output that can be transformed into objects.
For now I am only using a small VLM (in a model asset), but Docling could be an interesting way forward thanks to its DocTags output format.
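
As a rough illustration of the schema-guided extraction step, the sketch below builds a prompt from a schema and checks the model's JSON reply against it before it is turned into an object. It uses only the standard library; `build_prompt`, `validate_output`, the prompt wording and the sample schema are assumptions for illustration, and the actual model call is left out on purpose.

```python
import json

# Hypothetical schema for one template (JSON Schema style).
SAMPLE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
    },
    "required": ["invoice_number"],
}

# Python types accepted for each JSON Schema primitive type.
_JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
}


def build_prompt(schema):
    """Build the text instruction sent to the model together with the document."""
    return (
        "Extract the fields described by this JSON schema from the document. "
        "Reply with a single JSON object only.\n" + json.dumps(schema, indent=2)
    )


def validate_output(raw, schema):
    """Parse the model reply and return (data, errors) for a flat schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return {}, [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return {}, ["expected a JSON object"]

    errors = []
    props = schema.get("properties", {})
    for key in schema.get("required", []):
        if key not in data:
            errors.append(f"missing required field: {key}")
    for key, value in data.items():
        spec = props.get(key)
        if spec is None:
            errors.append(f"unexpected field: {key}")
            continue
        type_name = spec.get("type")
        expected = _JSON_TYPES.get(type_name)
        if expected is None:
            continue  # nested or unknown types are not checked in this sketch
        # bool is a subclass of int in Python, so reject it for numeric fields.
        is_stray_bool = isinstance(value, bool) and type_name != "boolean"
        if is_stray_bool or not isinstance(value, expected):
            errors.append(f"wrong type for field: {key}")
    return data, errors
```

Documents whose reply comes back with a non-empty error list could be retried or routed to manual review instead of being written as objects.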

I am looking forward to hearing from anyone working on a similar idea.

Note: I could not find a suitable tag.

Hi there,

Have you tried importing Docling into a code repository? It looks like it should be available via conda.

Alternatively, you can import models from Hugging Face, or even import code via Compute Modules, and use those in your pipeline to process documents.

It’s all there for you!

A word of advice: Don’t focus solely on cost, but look at the benefits too. A car that only starts half the time might not be worth half the cost of one that starts every time.

Hello @jakehop,

Sorry, my initial post was maybe not the best explanation!
I don’t have any issue importing the library or the models used by Docling. I just wanted to exchange on this Intelligent Document Processing use case, and I am curious how other people tackle it in Foundry :slight_smile:

But thank you for your help as well!