Making a Model Adapter for Image Data

As the subject states, I’m having problems making a model adapter for image data. Specifically, I can’t work out the api method. All of the examples in the documentation use pm.Pandas(). Of course, for my task, I need images as inputs, not Pandas. When I looked at the list of available functions in the palantir_models library, these seemed promising:

  • Object
  • ObjectSet
  • Pandas

However, I don’t know how to use them, and the lack of examples is making this workflow pretty difficult. Here’s what I have so far, with comments abstracting what I don’t know how to do:

class ImageGeolocatorV1Adapter(pm.ModelAdapter):

    @pm.auto_serialize(
        model=PytorchStateSerializer()
    )
    def __init__(self, model):
        self.model = model

        # Input image preprocessing steps
        self.transform = transforms.Compose([
            transforms.Resize((224, 224)),
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
        ])

    @classmethod
    def api(cls):
        # Define inputs, outputs
        
        return inputs, outputs

    def predict(self, image_in):
        # Some way to load the images 

        image = self.transform(image).unsqueeze(0).to(device) # Convert to Tensor
        image = image.to(torch.device('cpu'))
        
        # Run the model to predict latitude and longitude
        self.model.eval()
        with torch.no_grad():
            prediction = self.model(image)  
            
        # Return the output (latitude, longitude) as the most convenient format

Any help with this?

Hi!

If you define your inputs (in #api) using a MediaReference (from palantir_models import MediaReference), this should allow you to pass in an image:

inputs = [
            ModelInput.Tabular(
                name="inference_data",
                df_type=DFType.PANDAS,
                columns=[
                    ModelApiColumn(name="media_item_rid", type=str, required=True),
                    ModelApiColumn(name="media_reference", type=MediaReference, required=True),
                ],
            )
        ]

You can then in your #predict access the image using

from PIL import Image  # Pillow, for loading the image

input_media: MediaReference = row["media_reference"].get_media_item()
image = Image.open(input_media)

I hope this helps!

-Eirik


Hey @adampsu @eiriklt ,

Media reference model inputs are a little more complicated and would not work with just the RID and MediaReference input.

It is easier to pass the base64 encoded image as a string input and decode it in the adapter. This might look like:

class MyModelAdapter(pm.ModelAdapter):
    ...
    
    @classmethod
    def api(cls):
        inputs = {
            "image_base64": pm.Parameter(type=str),
        }
        outputs = {
            "your_output": pm.Parameter(type=str),
        }
        return inputs, outputs
    
    def predict(self, image_base64: str) -> str:
        # decode the base64 and pass it into the model
        return "..."

If this does not work for your case or you have any other questions, feel free to reach out.

Hey @william,

Thanks for flagging! Are you sure this is still the case? It is explicitly listed as supported here, and the source code I posted above is from an example using it. (Disclaimer: I did not verify that example.)

Of course, using this requires wrapping your input as tabular, so if you want to avoid that, base64 is probably easier. Media references are also not supported as outputs.

-Eirik

Hey! This works for the most part, and it is relatively easy to implement. I’ve submitted the model, but I’m getting a different error when testing it out via the Sandbox tool.

TypeError: Compose.__call__() got an unexpected keyword argument 'image_base64'

I’m following an example, and my inputs and outputs match pretty closely. I must be overlooking something entirely; I can’t quite tell what the error is. The model’s .predict() function seems to work fine (I was able to pass in an input in my Code Workspace and it worked), but there seems to be an error elsewhere.
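For context on the error itself: torchvision’s Compose.__call__ takes a single positional image, so it rejects any keyword argument. A minimal stand-in class (not the real torchvision code) reproduces the shape of the failure:

```python
class Compose:
    """Minimal stand-in for torchvision.transforms.Compose: chains a list of
    callables over one positional argument."""

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, img):
        for t in self.transforms:
            img = t(img)
        return img

transform = Compose([str.strip, str.lower])
print(transform("  HELLO  "))  # positional call works

try:
    transform(image_base64="...")
except TypeError as e:
    # e.g. Compose.__call__() got an unexpected keyword argument 'image_base64'
    print(e)
```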

class ImageGeolocatorV1Adapter(pm.ModelAdapter):
    @pm.auto_serialize(
        model=pms.DillSerializer()
    )                  
    def __init__(self, model):
        self.model = model
        self.transform = transforms.Compose([
            transforms.Resize((224, 224)),
            transforms.ToTensor(), 
            transforms.Normalize(
            mean=[0.485, 0.456, 0.406],
            std=[0.229, 0.224, 0.225])
        ])

    @classmethod
    def api(cls):
        inputs = {
            "image_base64": pm.Parameter(type=str),
        }
        outputs = {
            "coordinates": pm.Parameter(type=dict),
        }
        return inputs, outputs

    def predict(self, image_base64):
        image_data = base64.b64decode(image_base64)
        
        # Convert the byte data to an image
        image = Image.open(BytesIO(image_data)).convert('RGB')
        image = self.transform(image).unsqueeze(0)

        with torch.no_grad():
            outputs = self.model(image)
            latitude, longitude = outputs.squeeze().cpu().numpy()

        result = {'latitude': latitude.item(), 'longitude': longitude.item()}

        return result

Any help would be appreciated.

Do you have a stack trace? Also, what does the JSON input you are passing in look like? Could you also send the imports for the model adapter?

This potentially looks like a PyTorch error.

Hello William,

I’m not sure it’s a PyTorch error. I’m able to use the model adapter in Jupyter Code Workspaces; it works fine. The below image shows my process:

And here’s the result when I paste the same image into the Sandbox environment:

The stack trace is included. Please let me know if you notice something; I’m grateful for your help thus far.

Could you send your entire model adapter (with imports)?

Also, in Code Workspaces, could you try calling image_geolocator_v1_adapter.transform(image_base64="...")?

I’ve found the issue! It seems the image_base64 input was getting passed into all constructor members, which wasn’t an intended effect. I simply had to move my self.transform definition into the predict() method. Thanks for the help!!
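Reduced to a structural sketch (plain callables stand in for the torch model and for transforms.Compose), the fix keeps __init__ free of extra callable attributes and builds the transform inside predict():

```python
class AdapterSketch:
    """Structural sketch of the fix, with plain callables standing in for
    the model and for transforms.Compose."""

    def __init__(self, model):
        # Only the serialized model lives on the instance.
        self.model = model

    def predict(self, image_base64: str) -> str:
        # Build the preprocessing pipeline here, per call, instead of
        # storing it as self.transform in __init__.
        def transform(s):
            return s.strip()
        return self.model(transform(image_base64))

adapter = AdapterSketch(model=str.upper)
print(adapter.predict("  abc  "))  # -> ABC
```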

Hello @william ,

Following up on this topic, is the base64-encoded string still the preferred method for passing an image to a model?
I see some added value in using a media reference instead of base64, for example to run inference on the model from the Workshop UI with a backing media set. (The base64 option is also possible with a function…)

Cheers,

Yes, this is still the preferred method, but we are working on adding a more first-class way of accessing media sets in live deployed models this term.

The current beta flow would be to use a model adapter with the platform SDK like this:

import requests

import palantir_models as pm
from palantir_models_serializers import DillSerializer

from foundry_sdk_runtime import FOUNDRY_HOSTNAME, FOUNDRY_TOKEN


class MediaReferenceModelAdapter(pm.ModelAdapter):
    @pm.auto_serialize(
        model=DillSerializer(),
    )
    def __init__(self, model):
        pass

    @classmethod
    def api(cls):
        inputs = {
            "mio_set_rid": pm.Parameter(str),
            "mio_item_rid": pm.Parameter(str),
        }
        outputs = {
            "properties": pm.Parameter(str),
        }
        return inputs, outputs

    def predict(self, mio_set_rid, mio_item_rid):
        token = FOUNDRY_TOKEN.get()
        hostname = FOUNDRY_HOSTNAME.get()
        resp = requests.get(
            f"https://{hostname}/api/v2/mediasets/{mio_set_rid}/items/{mio_item_rid}/content?preview=true",
            headers={
                "Authorization": f"Bearer {token}",
                "Content-Type": "image/png",
                "Accept-Encoding": "gzip, deflate, br",
                "Accept": "*/*",
            },
            stream=True,
        )
        print(resp)
        return str(resp.content)


Hello @william ,

I just came across this announcement from Feb 2025: February 2025 • Announcements • Palantir

In your beta flow, I understand that you are directly feeding a media item and media set RID, but not an object.
Will the beta flow you are proposing here be integrated somehow with the “object layer”?
The idea would be to feed media from an object to a model, I believe using a mediaReference (backed by a media set).

Cheers,

The “beta” flow calls a URL directly, which is not very stable, and as you pointed out, it is not connected to an object at all.

The improved flow would be for the input to the model to be an object (which contains a media reference property); you could then use this media reference however you would like in your model. Currently, you can have a media reference as an object property, but the SDK does not generate type bindings for these properties in Python.

I believe the Ontology SDK team is working on this feature right now, and I would be happy to reach out and ask for a timeline if you are interested. I can also provide an example Model Adapter when this feature is released.


This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.