LLMs in the Model Catalog do not support all documented options

I’ve noticed that options covered in the public documentation, such as structured outputs, are not supported for the models found in the Model Catalog. For example: https://ai.google.dev/gemini-api/docs/structured-output#javascript. But the params defined for the Gemini family of models in AIP are:

type Parameters = {
    "stopSequences"?: Array<string> | undefined;
    "temperature"?: FunctionsApi.Double | undefined;
    "maxTokens"?: FunctionsApi.Integer | undefined;
    "topP"?: FunctionsApi.Double | undefined;
};
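For comparison, here is a sketch of a Gemini generateContent request body using the structured-output fields from the docs linked above (responseMimeType and responseSchema). Field names follow the public Gemini REST API; the schema itself is a made-up example, not from any real project:

```typescript
// Sketch of a Gemini request body with structured output enabled.
// The generationConfig field names come from the public Gemini REST API;
// the schema below is an illustrative example.
const geminiBody = {
    contents: [{ role: 'user', parts: [{ text: 'List two colors.' }] }],
    generationConfig: {
        temperature: 0.2,
        maxOutputTokens: 256,
        // These two fields are what the AIP Parameters type has no slot for:
        responseMimeType: 'application/json',
        responseSchema: {
            type: 'ARRAY',
            items: {
                type: 'OBJECT',
                properties: { name: { type: 'STRING' } },
                required: ['name'],
            },
        },
    },
};
```

The AIP Parameters type above has nowhere to put responseMimeType or responseSchema, which is the gap in question.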

Ideally, we could use the SDK from Google, OpenAI, or a similar package like Vercel’s AI offering. Is there a reason we are forced to use the proxy API in AIP? Is there a workaround?

You can either register the model in the flavour of Bring Your Own Model, or you can call it as an external system via your Functions to get the full suite of functionality.

The issue with AIP’s one-size-fits-all approach is that whatever one model doesn’t support, none can support, so some API- or model-specific functionality must be sacrificed on the altar of broad support.

This shouldn’t stop you, however.

Have you tried defining this schema as the output of, e.g., an AIP Logic function? I’m not fully sure what you are looking for in the Gemini API that this output format can’t deliver, so feel free to elaborate if there’s more to your use case than a structured output.

I think you are missing the point. The advantage of using built-in AIP models is not the abstraction offered by the Model Catalog. It’s the fact that these are private endpoints provisioned by the hyperscaler in partnership with Palantir. They tend to offer better rate limits and better security than anything I could provision on my own.

I’m totally familiar with Bring Your Own Model, etc. In the early days of AIP, Palantir provided us with API keys and the URLs of these private endpoints, and I used to make raw API requests to them in my transforms. Then this concept of abstraction was introduced. I can see how it might be valuable in providing an interface that allows features in modules like AIP Logic to interoperate with different models. But that’s not a problem that most developers building AI systems close to the foundation layer have.

For example, we made a focused decision to use the Gemini family of models for planning due to their cost and performance characteristics. We use OpenAI for coding as those models tend to make fewer mistakes. As such, we are choosing to couple these models to the solution tightly.

A simple example of how the current abstraction in Foundry makes this difficult is the inability to use structured outputs that pass a schema. Both Gemini and OpenAI support structured outputs, but within Foundry I don’t see a structured-outputs option in the Parameters type. This feature is critical for driving error rates in generated output to zero.

Below is a sample request that uses structured outputs with OpenAI’s gpt-5-mini:

const response = await fetch('https://api.openai.com/v1/responses', {
    method: 'POST',
    headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${process.env.OPEN_AI_KEY}`,
    },
    body: JSON.stringify({
        model: 'gpt-5-mini',
        input: [
            { role: 'system', content: [{ type: 'input_text', text: system }] },
            { role: 'user', content: [{ type: 'input_text', text: user }] },
        ],
        reasoning: { effort: 'low' },
        // Optional: keep or remove web_search; it isn't needed if you fully inline the spec + code
        tools: [
            {
                type: 'web_search',
                user_location: { type: 'approximate', country: 'US' },
            },
        ],
        text: {
            format: {
                type: 'json_schema',
                name: 'EditPlanV0',
                schema,
                strict: true,
            },
            verbosity: 'low',
        },
        store: true,
    }),
});
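With strict: true, the model cannot emit text that fails to parse against the schema, which is what eliminates the retry/repair loop downstream. A minimal sketch of that consumption side, using a hard-coded stand-in for the response’s output text (the EditPlanV0 shape here is illustrative; no real request is made):

```typescript
// Illustrative shape for the EditPlanV0 schema named in the request above.
interface EditPlanV0 {
    edits: Array<{ file: string; action: string }>;
}

// Stand-in for the output text of a strict structured-output response;
// in strict mode the server guarantees the text conforms to the schema.
const outputText = '{"edits":[{"file":"src/index.ts","action":"rename"}]}';

// Because conformance is enforced server-side, a plain parse-and-use
// pattern is safe, with no validation or error-recovery step.
const plan: EditPlanV0 = JSON.parse(outputText);
console.log(plan.edits[0].file); // src/index.ts
```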

Additionally, tools like web search don’t appear to be supported, and the responses may lack details I need. For example, are input/output token counts included in the responses from AIP? Ideally, developers could have direct access to the model endpoint and use native tooling from OpenAI, Google, or Vercel, or make a raw fetch request.
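For reference, when you call the endpoint directly, the raw Responses API does surface token accounting in a usage object. A minimal sketch reading those fields from a fabricated sample payload (the numbers are invented, not from a real response):

```typescript
// Fabricated sample of the usage object the raw OpenAI Responses API
// returns alongside the model output; the field names follow OpenAI's
// public API, the numbers are made up for illustration.
const sampleResponse = {
    usage: { input_tokens: 212, output_tokens: 84, total_tokens: 296 },
};

const { input_tokens, output_tokens, total_tokens } = sampleResponse.usage;
console.log(`in=${input_tokens} out=${output_tokens} total=${total_tokens}`);
```

This is the kind of per-request accounting that an abstraction layer needs to pass through for cost tracking to work.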


It looks like this issue should be solved with the API Proxy pictured below, but I can’t find any documentation on it. Can you please provide a link to the docs? Also, is TypeScript supported?

Full video:
https://youtu.be/vEU_UgsQZAA?si=t0nTfO_BbaCX_RMt
