Hi everyone,
I uploaded a Docker image containing the docling library to Palantir Foundry using the artifact feature. The image was pulled and uploaded successfully, and I can confirm that the library is present inside the container when I test it locally.
However, I’m not sure how to actually use or import this library in a Python Code Repository in Foundry. I just don’t know the correct approach or configuration to make Python recognize and use the library from the Docker artifact.
What I want is to be able to import the library in Foundry like this:
from docling.document_converter import DocumentConverter
source = "https://arxiv.org/pdf/2408.09869"
converter = DocumentConverter()
result = converter.convert(source)
print(result.document.export_to_markdown())
Thank you for your answer.
I’ve already tried installing Docling via PyPI in both the Code Repository and VS Code environments. Although the installation technically succeeded, I encountered unknown errors that prevented proper usage.
This question is specifically about the model prefetching and offline usage option mentioned in the Docling GitHub repository.
The documentation recommends running the following command to prefetch models:
My question is: where exactly can I run this command in the context of Foundry? Is it possible to execute it from the VS Code terminal connected to the Code Repository, or does it need to be run in a different environment outside of Foundry?
If you need to stay within Foundry, you could run this in a VS Code workspace with two egress policies: one to your own stack and one to the hostname where the models are stored.
I am not sure how your stack is set up, but at least in our case it's significantly easier to run the model download on a local machine and then upload the files to a dataset or package them in a Docker container.
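Once the model files are available locally, something along these lines might work for pointing Docling at them instead of letting it download anything. This is only a sketch: `resolve_models_dir` and `convert_offline` are hypothetical helper names, the directory path is a placeholder, and where the files end up inside Foundry depends on how you mount the dataset or container. The `artifacts_path` option on `PdfPipelineOptions` is the one described in Docling's offline usage docs, but check it against the version you have installed.

```python
from pathlib import Path


def resolve_models_dir(models_dir: str) -> Path:
    """Check that the prefetched model directory exists and is not empty."""
    path = Path(models_dir).expanduser().resolve()
    if not path.is_dir():
        raise FileNotFoundError(f"Model directory not found: {path}")
    if not any(path.iterdir()):
        raise ValueError(f"Model directory is empty: {path}")
    return path


def convert_offline(source: str, models_dir: str) -> str:
    """Convert a PDF to Markdown using only locally stored models."""
    path = resolve_models_dir(models_dir)

    # Imported here so the path check above can be used (and tested)
    # even in an environment where docling is not installed.
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import PdfPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption

    # artifacts_path tells the PDF pipeline where the prefetched models live.
    pipeline_options = PdfPipelineOptions(artifacts_path=str(path))
    converter = DocumentConverter(
        format_options={
            InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)
        }
    )
    result = converter.convert(source)
    return result.document.export_to_markdown()
```

You would then call it with a local PDF and the folder holding the models, for example `convert_offline("paper.pdf", "/path/to/models")`. Both arguments are placeholders. Without egress, the source also has to be a local file rather than a URL.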