What models are supported with LLM proxies?

When calling OpenAI LLM proxies, the only models I can get to work so far are the GPT-4 series, i.e. gpt-4.1, gpt-4.1-mini, etc. Is the GPT-5 series supported? How am I supposed to know which models are supported?

import {
  SupportedFoundryClients,
  type OpenAIService,
} from '@codestrap/developer-foundations-types';
import OpenAI from 'openai';
import { foundryClientFactory } from '../factory/foundryClientFactory';
import type { ChatCompletionCreateParamsStreaming } from 'openai/resources/chat';
import type { RequestOptions } from 'openai/core';
import type { ResponseCreateParamsStreaming } from 'openai/resources/responses/responses';

// Add type definitions for the OpenAI response here, or in a separate file and import them, to ensure type safety when working with the API response data.
export function makeOpenAIService(): OpenAIService {
  const { getToken, url, ontologyRid } = foundryClientFactory(
    process.env.FOUNDRY_CLIENT_TYPE || SupportedFoundryClients.PRIVATE,
    undefined,
  );

  return {
    // TODO code out all methods using OSDK API calls
    completions: async (
      body: ChatCompletionCreateParamsStreaming,
      options?: RequestOptions,
    ) => {
      const token = await getToken();
      const client = new OpenAI({
        baseURL: `${url}/api/v2/llm/proxy/openai/v1`,
        apiKey: token, // use the token fetched from the Foundry client factory
      });

      const stream = await client.chat.completions.create(body, options);

      let text = '';
      for await (const chatCompletionChunk of stream) {
        text += chatCompletionChunk.choices[0]?.delta?.content || '';
      }
      return text;
    },
    responses: async (
      body: ResponseCreateParamsStreaming,
      options?: RequestOptions,
    ) => {
      const token = await getToken();
      const client = new OpenAI({
        baseURL: `${url}/api/v2/llm/proxy/openai/v1`,
        apiKey: token, // use the token fetched from the Foundry client factory
      });

      // Responses API streaming emits semantic events (delta, completed, error, etc.)
      const stream = await client.responses.create(
        { ...body, stream: true },
        options,
      );

      let text = '';

      for await (const event of stream) {
        if (event.type === 'error') {
          throw new Error(`OpenAI API error: ${event.code} - ${event.message}`);
        }

        if (event.type === 'response.output_text.delta') {
          text += event.delta ?? '';
        }
      }

      return text;
    },
  };
}

Hey @CodeStrap, we’re working on exposing this information directly in Model Catalog, with code/usage examples. In the meantime, the currently supported models/endpoints for the proxy [1] are:

  • Anthropic “Text Completion” models support the proxied Anthropic Messages API.
  • OpenAI “Text Completion” models (generally) support both the proxied OpenAI Chat Completions and Responses APIs. (OpenAI itself doesn’t support Chat Completions for all models, e.g. gpt-5.3-codex only supports Responses.)
  • OpenAI “Embeddings” models support the OpenAI embeddings API.

The above should be the case for any models exposed in Model Catalog.

Additionally, we’re starting to roll out in beta xAI Chat Completions, xAI Responses, Google generateContent, and Google streamGenerateContent APIs. There might be inconsistencies between the Foundry API and provider’s specs which we are actively working to remove as we GA the endpoints. If you encounter any issues with those, please let us know.

You can use the model RID from Model Catalog (e.g. ri.language-model-service..language-model.gpt-5 for GPT-5) as the model name within requests. We also support model aliases within requests (e.g. gpt-5 for GPT-5) and are working to expose the supported aliases within Model Catalog.
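To make that concrete, here is a minimal sketch of the same Chat Completions request body referencing the model by its Model Catalog RID and by its alias (both values copied from the GPT-5 example above):

```typescript
// The same request body, first with the Model Catalog RID, then with the alias.
const base = {
  stream: true,
  messages: [{ role: 'user', content: 'Hello' }],
};

const byRid = {
  ...base,
  model: 'ri.language-model-service..language-model.gpt-5',
};
const byAlias = { ...base, model: 'gpt-5' };
```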

[1] https://www.palantir.com/docs/foundry/aip/llm-provider-compatible-apis

All tests passing. Note that I removed max_tokens (it was too low for reasoning models) and temperature, which is an unsupported parameter for reasoning models. Sample code and test cases that are passing, for anyone else looking into this issue:

import {
  SupportedFoundryClients,
  type OpenAIService,
} from '@codestrap-tech/developer-foundations-types';
import OpenAI from 'openai';
import { foundryClientFactory } from '../factory/foundryClientFactory';
import type { ChatCompletionCreateParamsStreaming } from 'openai/resources/chat';
import type { RequestOptions } from 'openai/core';
import type { ResponseCreateParamsStreaming } from 'openai/resources/responses/responses';

// Palantir has not documented which models are supported by this proxy.
// I opened an issue: https://community.palantir.com/t/what-models-are-supported-with-llm-proxies/6065
// Tested with gpt-4.1-mini; gpt-4.1 should also work. GPT-5 series models
// work once temperature is removed and max_tokens is not set too low for
// reasoning (see the note above).
export function makeOpenAIService(): OpenAIService {
  const { getToken, url, ontologyRid } = foundryClientFactory(
    process.env.FOUNDRY_CLIENT_TYPE || SupportedFoundryClients.PRIVATE,
    undefined,
  );

  return {
    // TODO code out all methods using OSDK API calls
    completions: async (
      body: ChatCompletionCreateParamsStreaming,
      options?: RequestOptions,
    ) => {
      const token = await getToken();
      const client = new OpenAI({
        baseURL: `${url}/api/v2/llm/proxy/openai/v1`,
        apiKey: token, // use the token fetched from the Foundry client factory
      });

      const stream = await client.chat.completions.create(body, options);

      return stream;
    },
    responses: async (
      body: ResponseCreateParamsStreaming,
      options?: RequestOptions,
    ) => {
      const token = await getToken();
      const client = new OpenAI({
        baseURL: `${url}/api/v2/llm/proxy/openai/v1`,
        apiKey: token, // use the token fetched from the Foundry client factory
      });

      // Responses API streaming emits semantic events (delta, completed, error, etc.)
      const stream = await client.responses.create(
        { ...body, stream: true },
        options,
      );

      return stream;
    },
  };
}
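Since the fix above was to drop temperature and max_tokens for reasoning models, a hypothetical helper like this (not part of the service, name is mine) could strip them before the request goes out, rather than relying on callers to remember:

```typescript
// Hypothetical helper: reasoning models (e.g. the GPT-5 series) reject
// `temperature`, and a low `max_tokens` budget can be consumed entirely by
// hidden reasoning tokens, so strip both before sending.
export function sanitizeForReasoningModel<
  T extends { temperature?: unknown; max_tokens?: unknown },
>(body: T): Omit<T, 'temperature' | 'max_tokens'> {
  const { temperature, max_tokens, ...rest } = body;
  return rest;
}
```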

Test case:

it('completions() returns aggregated streaming text using gpt-5-mini', async () => {
  const svc = makeOpenAIService();

  const stream = await svc.completions({
    model: 'ri.language-model-service..language-model.gpt-5-mini',
    stream: true,
    messages: [
      {
        role: 'system',
        content:
          'You MUST reply with exactly: OK (no punctuation, no extra text).',
      },
      { role: 'user', content: 'Reply now.' },
    ],
  });

  let text = '';

  for await (const chatCompletionChunk of stream) {
    text += chatCompletionChunk.choices[0]?.delta?.content || '';
  }

  expect(text).toBeDefined();
  expect(text.trim()).toBe('OK');
}, 10000);
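For the responses() path, the event loop from the first snippet can be pulled into a hypothetical standalone helper (name is mine), which keeps the stream-consuming logic testable without a live proxy:

```typescript
// Simplified event shape covering the two cases the loop cares about.
type ResponsesEvent =
  | { type: 'response.output_text.delta'; delta?: string }
  | { type: 'error'; code?: string; message?: string };

// Aggregates Responses API semantic events into a single string,
// throwing on an `error` event (mirrors the loop in the first snippet).
export async function aggregateResponsesStream(
  stream: AsyncIterable<ResponsesEvent>,
): Promise<string> {
  let text = '';
  for await (const event of stream) {
    if (event.type === 'error') {
      throw new Error(`OpenAI API error: ${event.code} - ${event.message}`);
    }
    if (event.type === 'response.output_text.delta') {
      text += event.delta ?? '';
    }
  }
  return text;
}
```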

Please update the error messages for unsupported properties like temperature so we don't get back opaque errors like a 500 with no body.