I have been working with compute modules to expose LLMs in Bedrock via the Bedrock SDK.
Here is my current setup:
- A compute module with a function named InvokeLLM that takes input messages, calls Bedrock, and returns the response
- A webhook that calls the compute module
- A TypeScript function that calls the webhook. I had to do this because, as per a previous post, I cannot call a compute function directly from a TypeScript function
- The function published via TypeScript now shows up as a model to choose from in the list of registered models in AIP Logic
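For concreteness, here is a minimal sketch of what an InvokeLLM handler like the one above might look like in Python. The function and parameter names, the helper `build_request_body`, and the Anthropic-style request shape are my assumptions for illustration, not the actual setup described in this post:

```python
import json

# Hypothetical sketch of an InvokeLLM-style handler (names are assumptions).
# The Bedrock call itself goes through boto3's bedrock-runtime client.

def build_request_body(messages, max_tokens=1024):
    """Build an Anthropic-messages-style request body for Bedrock."""
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": messages,
    })

def invoke_llm(messages, model_id="anthropic.claude-3-sonnet-20240229-v1:0"):
    """Call Bedrock with the given chat messages and return the parsed response."""
    import boto3  # deferred import: only needed when actually calling Bedrock
    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(
        modelId=model_id,
        body=build_request_body(messages),
    )
    return json.loads(response["body"].read())
```

The point relevant to the question below is that each call to `invoke_llm` is independent and blocks on the Bedrock request, so whether two such calls run at the same time is decided entirely by the compute module runtime, not by this code.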
My question is: if multiple users use the same model at the same time, will the compute module process these requests sequentially?
I guess I can control the RPM at the webhook level; if the webhook is configured with a concurrency limit of 10, I would assume up to 10 of these requests will hit the compute module at once.
Will the compute module process these requests sequentially?
If so, how do I make the compute module process more than one request at a time?
Based on the documentation, I understand that the compute module manages concurrency when using the SDK. This concurrency is set when configuring the container, and the replicas will scale to meet the concurrency target.
So I need to make sure that I have enough scaling configured in both the webhook and the compute module settings.