Use of Model Adapters in Pipeline Builder

Hello,

Following a question I previously raised on Stack Overflow:
palantir foundry - How to use a Model Adapter in a User Defined Functions (UDF) for Pipeline Builder - Stack Overflow.

With the new Python UDF functions, we would like to know whether it is possible to use a Model Adapter API within a UDF in order to run inference on all rows of the input data?

Best regards,

Hi! Welcome to the Palantir Developer Community!

As of now, model assets are not supported in Pipeline Builder. We definitely want to build an integration; however, we do not have a concrete timeline at this time.

As the Stack Overflow answer suggests, you can use a Python transform or a batch deployment to create a dataset of model predictions and then use that dataset in your pipeline.
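For reference, a minimal sketch of the Python transform approach, assuming the palantir_models transforms API and a model adapter whose inference output is exposed as df_out; the dataset paths and parameter names are placeholders:

from transforms.api import transform, Input, Output
from palantir_models.transforms import ModelInput


@transform(
    inference_input=Input("/path/to/input_dataset"),          # placeholder path
    model_input=ModelInput("/path/to/model_asset"),            # placeholder path
    predictions_output=Output("/path/to/predictions_dataset"),  # placeholder path
)
def compute(inference_input, model_input, predictions_output):
    # Run the model adapter's inference over the full input dataset
    inference_results = model_input.transform(inference_input)
    # df_out is assumed to be the output name defined in the model adapter's API
    predictions_output.write_pandas(inference_results.df_out)

The predictions dataset produced by this transform can then be used as a regular input in Pipeline Builder.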

Hello Tucker,

Thank you for your reply; I will look forward to this implementation.

Slightly related to this topic, but from a Workshop module: what would be the way to run inference against a live-deployed model from Workshop?
Would using a TypeScript or Python function with an API call to the model endpoint work? (It does work in Slate.)

Best regards,

This should work out of the box:

https://palantir.com/docs/foundry/functions/functions-on-models/

Hello,

I would like to follow up on this feature. Is there any update on:

  • Using a model asset in Pipeline Builder via a UDF?
  • Using a registered model (not Palantir-provided; externally hosted) via a UDF?

Cheers,

It depends on what type of model you want to use.

If you want to use an LLM, you can register the LLM in Foundry, which makes it available in the same manner as Palantir-provided models (easy to use in Pipeline Builder, AIP Logic, etc.). See the docs about that here: https://www.palantir.com/docs/foundry/administration/bring-your-own-model.

If your model is not an LLM and you want to call it from Pipeline Builder, you can use a Python function as a UDF, and inside the function you can call the model.

Calling the model from a Python function isn't the easiest thing: you need to register the model as a function, and then use the platform SDK to call it.

In the example below, I have a model that is registered with the API name com.foundryfrontend.models.TitanicModel2:

import foundry
from functions.api import function
from foundry.v2 import FoundryClient
from foundry_sdk_runtime import FOUNDRY_HOSTNAME, FOUNDRY_TOKEN

# API name of the model registered as a function
TITANIC_MODEL = "com.foundryfrontend.models.TitanicModel2"

@function
def execute_titanic_model(age: int, gender: str) -> str:
    # Build a platform SDK client authenticated with the calling user's token
    client = FoundryClient(
        auth=foundry.UserTokenAuth(hostname=FOUNDRY_HOSTNAME.get(), token=FOUNDRY_TOKEN.get()),
        hostname=FOUNDRY_HOSTNAME.get(),
    )
    # Execute the registered model function with this row's parameters
    result = client.functions.Query.execute(
        TITANIC_MODEL,
        parameters={"age": age, "gender": gender},
        preview=True,
    )

    if result.value:
        return "The person is predicted to survive"
    else:
        return "The person is predicted to NOT survive"

You’ll have to change the code to make it work for your use case.

Hello Tucker,

The provided code snippet works well in preview from the Code Repository; I manage to get my results back from the model.
However, when calling the UDF from Pipeline Builder, I get an ApiUsageDenied error in preview and a different error at build time.

 Failed to call udf for recoverable error: {transformId=ri.function-registry.main.function.08b46bac-78f7-4ce7-a2b6-91d91ac67af7, transformVersion=Version{major: 0, minor: OptionalInt[0], patch: OptionalInt[2], preReleaseIdentifier: Optional.empty, semanticVersionRange: Optional.empty}, transformName=ModelAsset_model, batchFailedResult={result={type=runtimeError, runtimeError={message=PermissionDeniedError: {
    "errorCode": "PERMISSION_DENIED",
    "errorInstanceId": "edc07d65-5ff4-4a74-9ee2-2c0212213715",
    "errorName": "ApiUsageDenied",
    "parameters": {
        "missingScope": "api:usage:functions-read"
    }

At build time:

Caused by: com.google.common.util.concurrent.UncheckedExecutionException: java.util.concurrent.ExecutionException: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 11.0 failed 1 times, most recent failure: Lost task 0.0 in stage 11.0 (TID 7) (localhost executor driver): org.apache.spark.SparkException: [TASK_WRITE_FAILED] Task failed while writing rows to foundry://my_hostname.com/datasets/ri.foundry.main.dataset.772dxxxx/views/ri.foundry.main.transaction.0000xxxxx/startTransactionRid/ri.foundry.main.transaction.0000xxxxxf/files/spark."

[...]

  File "/app/var/cache/yorw9IXjpm2/lib/python3.9/site-packages/foundry_sdk/v2/functions/query.py", line 46, in __init__
    self._api_client = core.ApiClient(auth=auth, hostname=hostname, config=config)
  File "/app/var/cache/yorw9IXjpm2/lib/python3.9/site-packages/foundry_sdk/_core/api_client.py", line 344, in __init__
    assert_non_empty_string(hostname, "hostname")
  File "/app/var/cache/yorw9IXjpm2/lib/python3.9/site-packages/foundry_sdk/_core/utils.py", line 120, in assert_non_empty_string
    raise TypeError(f"The {name} must be a string, not {type(value)}.")
TypeError: The hostname must be a string, not <class 'NoneType'>.
, parameters={}}}, index=0}}

So it looks like the hostname resolves to None (or is malformed) when using hostname=FOUNDRY_HOSTNAME.get() from Pipeline Builder?
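For what it's worth, here is a minimal diagnostic sketch I could add to the UDF to confirm that hypothesis; it only reuses the FOUNDRY_HOSTNAME and FOUNDRY_TOKEN runtime sources from the snippet above, and the error messages are mine:

from foundry_sdk_runtime import FOUNDRY_HOSTNAME, FOUNDRY_TOKEN


def _resolve_credentials():
    # Surface a clear error instead of the downstream TypeError if the
    # runtime sources are not populated in the Pipeline Builder execution context
    hostname = FOUNDRY_HOSTNAME.get()
    token = FOUNDRY_TOKEN.get()
    if not hostname:
        raise RuntimeError("FOUNDRY_HOSTNAME is not set in this execution context")
    if not token:
        raise RuntimeError("FOUNDRY_TOKEN is not set in this execution context")
    return hostname, token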