Using a TypeScript function to convert an audio file to text

I am trying to convert an audio file, recorded via the audio widget in Workshop, to text. I have an audio media set, and I plan to have a function-backed Action that runs this function (I still need to add the create-object bit to the code below; for now I just need help getting the transcription to work). I copied the code from the documentation (https://www.palantir.com/docs/foundry/functions/api-media#transcription), but I am getting the error "Module '@foundry/ontology-api' has no exported member 'AudioFile'." on the line import { AudioFile } from "@foundry/ontology-api";

import { Function, MediaItem, TranscriptionLanguage, TranscriptionPerformanceMode } from "@foundry/functions-api";
import { AudioFile } from "@foundry/ontology-api";

@Function()
public async transcribeAudioFile(file: AudioFile): Promise<string|undefined> {
    if (MediaItem.isAudio(file.mediaReference!)) {
        return await file.mediaReference!.transcribeAsync({
            language: TranscriptionLanguage.ENGLISH,
            performanceMode: TranscriptionPerformanceMode.MORE_ECONOMICAL,
            outputFormat: {type: "plainTextNoSegmentData", addTimestamps: true}
        });
    }

    return undefined;
}

What am I doing wrong, please?

Hello!

The AudioFile referenced in the function is the name of an object type, so you'll need to replace it with whatever your object type is called.

You will also need to make sure you import the object type into the code repository. You can do this from the 'Resource imports' tab on the left-hand side.
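For example, here is a minimal sketch assuming your object type is called AudioRecording and has a media reference property named mediaReference (both are placeholders, so swap in your actual object type and property names):

import { Function, MediaItem, TranscriptionLanguage, TranscriptionPerformanceMode } from "@foundry/functions-api";
import { AudioRecording } from "@foundry/ontology-api"; // placeholder object type; this import only resolves once the type is added under 'Resource imports'

export class TranscriptionFunctions {
    @Function()
    public async transcribeAudioRecording(recording: AudioRecording): Promise<string | undefined> {
        const mediaReference = recording.mediaReference;
        // Only attempt transcription when the reference is present and points at audio
        if (mediaReference !== undefined && MediaItem.isAudio(mediaReference)) {
            return await mediaReference.transcribeAsync({
                language: TranscriptionLanguage.ENGLISH,
                performanceMode: TranscriptionPerformanceMode.MORE_ECONOMICAL,
                outputFormat: { type: "plainTextNoSegmentData", addTimestamps: true },
            });
        }
        return undefined;
    }
}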

Hopefully that helps!

Thanks for your reply; that makes sense. At the moment, Pipeline Builder rebuilds whenever a new audio file is added to the media set, and Pipeline Builder also does the speech-to-text. This process takes a long time, and I was hoping a TypeScript function would make transcription much quicker. Would you mind describing the high-level process, from recording a new audio file in Workshop to creating a new object in an object type with the transcribed text? If the TypeScript function needs to work with the object type, I am not sure how to get the new audio file added to the object type without building in Pipeline Builder.

I think I have figured this out:

1. In the Ontology Manager, define a mediaReference property on your object type with the Media Reference type.
2. Create a create Action that adds a new object with the path and mediaReference properties.
3. In Workshop, add an audio widget that fires this Action when a recording is complete (path and mediaReference can be pre-populated).
4. Create a TypeScript function that converts an audio file to text as per the docs, with your object type name in place of AudioFile, then tag and release it.
5. I need to run some LLM tasks on this text, so in AIP Logic I read in the object and add a Function block that runs this TypeScript function to convert the audio to text, then pass the result on to the other LLM blocks.
6. At the end of the AIP Logic, an Action updates the object.
7. Finally, an automation (Automate) triggers this AIP Logic whenever a new object is added.
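As an aside, the transcript can also be written back onto the object directly from TypeScript with an Ontology edit function, rather than returning it for a separate update Action. A rough sketch, assuming the same placeholder AudioRecording object type with a mediaReference property and a string property called transcript:

import { Edits, OntologyEditFunction, MediaItem, TranscriptionLanguage, TranscriptionPerformanceMode } from "@foundry/functions-api";
import { AudioRecording } from "@foundry/ontology-api"; // placeholder object type name

export class TranscriptionEditFunctions {
    @Edits(AudioRecording)
    @OntologyEditFunction()
    public async transcribeAndStore(recording: AudioRecording): Promise<void> {
        const mediaReference = recording.mediaReference;
        if (mediaReference === undefined || !MediaItem.isAudio(mediaReference)) {
            return;
        }
        const transcript = await mediaReference.transcribeAsync({
            language: TranscriptionLanguage.ENGLISH,
            performanceMode: TranscriptionPerformanceMode.MORE_ECONOMICAL,
            outputFormat: { type: "plainTextNoSegmentData", addTimestamps: true },
        });
        // Write the result to a (placeholder) transcript property on the object
        recording.transcript = transcript;
    }
}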

In TypeScript, I recommend setting performanceMode: TranscriptionPerformanceMode.MORE_PERFORMANT if you want performance over economy.