Streaming text is an excellent UI pattern for relieving user frustration, but aside from internal applications (some Workshop modules, Agent Studio, etc.), streams don't seem to be supported as function outputs in Foundry.
When building functions and applications with LLMs, one can call createChatCompletionStreamed on several models, but all chunks are collected server-side in Foundry and returned at once.
So why does this even exist?
One can subscribe to Ontology object types, but this won’t solve the problem. It would also be an extremely awkward workaround.
And when is streaming expected to be implemented?
As things stand, the best course of action seems to be calling the model provider APIs directly, which negates the stated advantages of AIP.
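For reference, here's a minimal sketch of what consuming a streamed response directly from a provider looks like. Everything here is hypothetical: `consumeStream` and `fakeBody` are illustrative names, and `fakeBody` is a local stand-in for a fetched response body so the example is self-contained.

```typescript
// Sketch of consuming a provider's streamed response body directly.
// In practice `body` would come from `(await fetch(providerUrl, {...})).body`,
// which is async-iterable in Node 18+; a local generator stands in here.
async function consumeStream(body: AsyncIterable<Uint8Array>): Promise<string[]> {
  const decoder = new TextDecoder();
  const pieces: string[] = [];
  for await (const chunk of body) {
    const text = decoder.decode(chunk, { stream: true });
    // Each piece can be rendered immediately, giving the incremental
    // UI that Foundry function outputs currently cannot provide.
    pieces.push(text);
  }
  return pieces;
}

// Local stand-in for a streamed HTTP response body.
async function* fakeBody(): AsyncGenerator<Uint8Array> {
  const encoder = new TextEncoder();
  for (const token of ["Hello", ", ", "world"]) {
    yield encoder.encode(token);
  }
}
```

With a real provider endpoint, each decoded piece would be flushed to the client as it arrives rather than buffered.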
Hi @jakehop
Thanks for raising this. We are working internally to reproduce and identify the root cause of the issue that you are seeing. Are you able to share the following information to help with the investigation:
How you’re calling createCompletionStreamed?
What request type(s) you’re using?
Model(s) that exhibit this behavior?
Thanks, your feedback is appreciated as we attempt to resolve this issue.
This is more about Foundry architecture, rather than specific models.
Foundry functions do not support streaming responses (see below). I think this is also true even if I’m trying to handle this with a function in an OSDK app, but feel free to keep me honest here.
Looking at the LMS, it appears the following is happening:
The streaming happens server-side within the Foundry Function runtime
The LLM response is collected as chunks on the server
The API returns a Promise of an array of chunks.
The entire array of chunks is returned at once to the caller
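The steps above can be sketched as follows. This is illustrative only: `modelStream` and `collectAllChunks` are hypothetical names modeling the observed behavior, not actual Foundry internals.

```typescript
// `modelStream` stands in for the LLM emitting tokens incrementally.
async function* modelStream(): AsyncGenerator<string> {
  for (const token of ["Streaming", " is", " collected", " server-side"]) {
    yield token;
  }
}

// What the runtime appears to do: drain the whole stream server-side,
// then resolve a Promise with the full chunk array, so the caller never
// sees tokens incrementally.
async function collectAllChunks(stream: AsyncIterable<string>): Promise<string[]> {
  const chunks: string[] = [];
  for await (const chunk of stream) {
    chunks.push(chunk);
  }
  return chunks; // the entire array arrives at once
}
```

From the caller's side, awaiting this Promise yields everything in one shot, which is exactly what I'm observing.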
So back to my original question: Does this API exist solely for Foundry implementations (e.g. Agent Studio), and is it therefore not of much use in its current state?
Docs:
Is there support for streaming responses from LLM queries in Ontology functions?
No, there is currently no support for streaming responses from LLM queries in Ontology functions, but the feature is actively being worked on.