Streaming text is an excellent UI pattern for relieving user frustration, but aside from internal applications (some Workshop modules, Agent Studio, etc.), streams don't seem to be supported as function outputs in Foundry.
When building functions and applications with LLMs, one can call createChatCompletionStreamed on several models, but all chunks are collected server-side in Foundry and returned at once.
So why does this even exist?
One can subscribe to Ontology object types, but this won’t solve the problem. It would also be an extremely awkward workaround.
And when is streaming expected to be implemented?
As things stand, the best course of action seems to be calling the model provider APIs directly, which negates the stated advantages of AIP.
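For reference, here's a minimal sketch of what consuming a streamed response directly from a provider looks like. Everything here is hypothetical: `consumeStream` and `fakeBody` are illustrative names, and `fakeBody` is a local stand-in for a fetched response body so the example is self-contained.

```typescript
// Sketch of consuming a provider's streamed response body directly.
// In practice `body` would come from `(await fetch(providerUrl, {...})).body`,
// which is async-iterable in Node 18+; a local generator stands in here.
async function consumeStream(body: AsyncIterable<Uint8Array>): Promise<string[]> {
  const decoder = new TextDecoder();
  const pieces: string[] = [];
  for await (const chunk of body) {
    const text = decoder.decode(chunk, { stream: true });
    // Each piece can be rendered immediately, giving the incremental
    // UI that Foundry function outputs currently cannot provide.
    pieces.push(text);
  }
  return pieces;
}

// Local stand-in for a streamed HTTP response body.
async function* fakeBody(): AsyncGenerator<Uint8Array> {
  const encoder = new TextEncoder();
  for (const token of ["Hello", ", ", "world"]) {
    yield encoder.encode(token);
  }
}
```

With a real provider endpoint, each decoded piece would be flushed to the client as it arrives rather than buffered.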
Hi @jakehop
Thanks for raising this. We are working internally to reproduce and identify the root cause of the issue that you are seeing. Are you able to share the following information to help with the investigation:
How you’re calling createCompletionStreamed?
What request type(s) you’re using?
Model(s) that exhibit this behavior?
Thanks, your feedback is appreciated as we attempt to resolve this issue.
This is more about Foundry architecture, rather than specific models.
Foundry functions do not support streaming responses (see below). I think this is also true even if I’m trying to handle this with a function in an OSDK app, but feel free to keep me honest here.
Looking at the LMS, it appears the following is happening:
The streaming happens server-side within the Foundry Function runtime
The LLM response is collected as chunks on the server
The API returns a Promise of an array of chunks.
The entire array of chunks is returned at once to the caller
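The steps above can be sketched as follows. This is illustrative only: `modelStream` and `collectAllChunks` are hypothetical names modeling the observed behavior, not actual Foundry internals.

```typescript
// `modelStream` stands in for the LLM emitting tokens incrementally.
async function* modelStream(): AsyncGenerator<string> {
  for (const token of ["Streaming", " is", " collected", " server-side"]) {
    yield token;
  }
}

// What the runtime appears to do: drain the whole stream server-side,
// then resolve a Promise with the full chunk array, so the caller never
// sees tokens incrementally.
async function collectAllChunks(stream: AsyncIterable<string>): Promise<string[]> {
  const chunks: string[] = [];
  for await (const chunk of stream) {
    chunks.push(chunk);
  }
  return chunks; // the entire array arrives at once
}
```

From the caller's side, awaiting this Promise yields everything in one shot, which is exactly what I'm observing.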
So back to my original question: Does this API exist solely for Foundry implementations (e.g. Agent Studio), and is it therefore not of much use in its current state?
Docs:
Is there support for streaming responses from LLM queries in Ontology functions?
No, there is currently no support for streaming responses from LLM queries in Ontology functions, but the feature is actively being worked on.