Streamed responses from LLM models

Is it possible to call an LLM model, let's take GPT-4o as an example, so that the response is streamed?

This is already achievable inside Workshop, or via AI Agents.
However, my use case is a custom app built in React that uses the OSDK, and I'm currently struggling to find a way either to call GPT-4o directly from the frontend so that its response is streamed, or to write a backend function that leverages the StreamedChatCompletion interface.

Can anyone show me PoC code in TypeScript for a function that returns the LLM response as a real-time stream?

import {
    FunctionsGenericChatCompletionRequestMessages,
    GenericCompletionParams,
    ArrayOfFunctionsGenericChatCompletionChunk,
} from "@palantir/languagemodelservice/api";
import { StreamedChatCompletion } from "@palantir/languagemodelservice/contracts";

@StreamedChatCompletion()
public myStreamedChatCompletion(
    messages: FunctionsGenericChatCompletionRequestMessages,
    params: GenericCompletionParams
): ArrayOfFunctionsGenericChatCompletionChunk {
    return // what to put here?
}
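To clarify the shape I'm after, here is a plain-TypeScript sketch (nothing Palantir-specific; fakeModelStream is just a stand-in for whatever the real streamed completion would return):

```typescript
// Illustrative only: fakeModelStream stands in for a real streamed model call.
async function* fakeModelStream(_prompt: string): AsyncGenerator<string> {
    for (const chunk of ["Streaming ", "works", "!"]) {
        yield chunk; // in reality, each chunk arrives as the model generates it
    }
}

// A consumer can render chunks incrementally and still end up with the full text.
async function consumeStream(prompt: string): Promise<string> {
    let full = "";
    for await (const chunk of fakeModelStream(prompt)) {
        full += chunk; // e.g. setState(prev => prev + chunk) in React
    }
    return full;
}
```

The point is the async-iterable shape: the caller sees partial output as it is produced instead of awaiting one final string.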

My current gpt4o function looks like this:

// Blocking (non-streamed) completion: resolves only once the full answer is ready.
export async function gpt4oCompletion(systemPrompt: string, prompt: string, temperature: Double = 0.7): Promise<string> {
    const systemMessage = { role: "SYSTEM", contents: [{ text: systemPrompt }] };
    const userMessage = { role: "USER", contents: [{ text: prompt }] };
    const gptResponse = await GPT_4o.createChatCompletion({ messages: [systemMessage, userMessage], params: { temperature } });
    return gptResponse.choices[0].message.content ?? "Uncertain";
}
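As a stopgap, a blocking call like the one above can at least be adapted to the streaming shape by wrapping it in an async generator that yields once. Not real streaming, but type-compatible with a chunk consumer (completionOnce below is a mock, since I can't share the real service call):

```typescript
// Mocked stand-in for the blocking completion call above.
async function completionOnce(prompt: string): Promise<string> {
    return "full answer for: " + prompt;
}

// Fallback adapter: exposes the same async-iterable shape a streaming
// consumer expects, but yields the entire answer as a single chunk.
async function* asSingleChunkStream(prompt: string): AsyncGenerator<string> {
    yield await completionOnce(prompt);
}
```
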

I read some docs stating this is not possible to achieve with the OSDK… which sounds silly, as streaming is like the #2 feature used in the AI world xd.

Obviously I also tried the new feature… AI Agents, saving the agent as a function.

But ultimately the minimal PoC for AI Agents returns a string, which won't fit:

let response = await Queries.agentForStreamingAiAssist({ instructions: "", prompt: "hello" });
// response is of type string
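What I'd want on the frontend instead is something that consumes a byte stream chunk by chunk. Here's a self-contained sketch using the standard ReadableStream API (the stream is mocked locally, since there's no streaming endpoint to call; in the real case the body would come from fetch(...).body):

```typescript
// Mock a streaming HTTP body: a ReadableStream of UTF-8 encoded chunks,
// standing in for fetch(...).body from a streaming endpoint.
function mockResponseBody(chunks: string[]): ReadableStream<Uint8Array> {
    const encoder = new TextEncoder();
    return new ReadableStream({
        start(controller) {
            for (const c of chunks) controller.enqueue(encoder.encode(c));
            controller.close();
        },
    });
}

// Read the stream chunk by chunk, the way a React component would
// in order to update the UI incrementally.
async function readAll(body: ReadableStream<Uint8Array>): Promise<string> {
    const decoder = new TextDecoder();
    const reader = body.getReader();
    let text = "";
    for (;;) {
        const { done, value } = await reader.read();
        if (done) break;
        text += decoder.decode(value, { stream: true }); // append incrementally
    }
    return text;
}
```
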

So, following that logic, I tried to add an interface to the AI model,

but there is no way to pick the correct input and output types with your mouse (no-code smh).

Even if I could provide such types, there is no way to declare the return type in this no-code studio; how on earth do you express Array<|> in this silly interface?

And even when I provide the input correctly, the error still says the input is invalid; the output will of course be invalid as well, since there is no way to choose that type from the no-code platform.

tl;dr I want to achieve this:

stream: true,

in the standard GPT-4o call that the whole world is using.
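That is, the pattern every OpenAI-SDK user knows: chunks carry deltas that you append as they arrive. The chunk shape below mimics OpenAI's streamed responses, but the stream itself is simulated so the sketch is self-contained:

```typescript
// Minimal shape of an OpenAI-style streamed chunk.
interface ChatChunk {
    choices: { delta: { content?: string } }[];
}

// Simulated stream of deltas, mimicking what `stream: true` responses yield.
async function* simulatedChunks(): AsyncGenerator<ChatChunk> {
    for (const piece of ["Hi", " there"]) {
        yield { choices: [{ delta: { content: piece } }] };
    }
}

// The standard accumulation loop: append each delta as it arrives.
async function collectDeltas(): Promise<string> {
    let answer = "";
    for await (const chunk of simulatedChunks()) {
        answer += chunk.choices[0].delta.content ?? "";
    }
    return answer;
}
```
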

Please, can someone respond with actual working code, a PoC that this can be done in Palantir?