Hi,
I have recently installed the public GPT-4o Vision Marketplace product, which allows you to input a picture and then gives you information about it.
It’s super cool!
My issue is that I want to use one of the functions in my OSDK, but when I try to import the function, it doesn't appear as an option that I can select. If someone could explain what I need to do, that would be great!
I have copied the function code below that I am trying to import.
import { Function } from "@foundry/functions-api"
import * as modelsApi from "@foundry/models-api";
export class VisionFunctions {
  /**
   * Shows a more complex call to the GPT Vision model with custom interfaces where model input
   * comprises an image as a base64 string and a prompt, both provided by the user.
   *
   * @param b64Image - The image as a base64 string (without the data-URI prefix) for model input.
   * @param visionPrompt - The user prompt for model input.
   * @returns If successful, the model's response text.
   * @throws Error if the model returns no choices or an empty message content.
   */
  @Function()
  public async runVisionWithUserPrompt(b64Image: string, visionPrompt: string): Promise<string> {
    type ImageDetail = 'HIGH' | 'LOW';

    // Interfaces to show the structure of model input and all the details that can be included
    // in the model call.
    interface Base64ImageContent {
      imageUrl: string;
      detail: ImageDetail;
    }
    interface ChatMessageContent {
      text?: string;
      image?: Base64ImageContent;
    }
    interface MultiContentChatMessage {
      contents: ChatMessageContent[];
      role: 'USER' | 'SYSTEM';
    }
    interface Params {
      maxTokens: number;
      temperature: number;
    }
    interface GptChatWithVisionCompletionRequest {
      messages: MultiContentChatMessage[];
      params: Params;
    }

    // Single user message carrying both the image (as a data URI) and the text prompt.
    const userMessage: MultiContentChatMessage = {
      role: 'USER',
      contents: [
        {
          image: {
            // Image base64 string with prefix, e.g. "data:image/jpeg;base64,..."
            imageUrl: "data:image/jpeg;base64," + b64Image,
            detail: 'HIGH',
          },
        },
        {
          text: visionPrompt,
        },
      ],
    };

    // Full request comprising the message and generation parameters for the model.
    const completionRequest: GptChatWithVisionCompletionRequest = {
      messages: [userMessage],
      params: { maxTokens: 3000, temperature: 0 }, // Adjust parameters as necessary
    };

    // Send the request to the GPT-4o model and await its completion.
    const response = await modelsApi.LanguageModels.GPT_4o.createChatCompletion(completionRequest);

    // Parse the model's response; the first choice carries the generated text.
    const completion = response.choices?.[0]?.message?.content;
    if (!completion) {
      throw new Error('No response from GPT-4o model');
    }
    return completion;
  }
}