Best way to use a Workshop user input as a pipeline input

I’m trying to let a user enter a phrase in a Workshop UI that will then be processed by a pipeline. Is there a way to link the user’s text input so that it feeds into a pipeline input?

3 Likes

Hey,

You can store user inputs in a new Ontology object, generate a materialization, and use that materialization as an input to your pipeline.

This would entail:

  1. Create a dedicated Object Type (e.g., User Responses) with a simple schema (response_id and phrase/input at a minimum; you can also log things like the user’s ID and a timestamp if helpful).
  2. Create an associated “Create new Object” Action Type.
  3. Add a button group widget that triggers the Action Type in your Workshop application.
  4. Generate a materialization for that Object Type.
  5. Use the materialization as an input in the relevant transform code, accessing the column containing the users’ responses. Consider making this transform incremental so you avoid re-processing responses that have already been handled (see the sketch below).
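
To make step 5 concrete, here’s a minimal sketch of an incremental Python transform in Code Repositories. The dataset paths and column names (response_id, phrase, submitted_at) are placeholders for whatever your Object Type and materialization actually contain:

```python
from pyspark.sql import functions as F
from transforms.api import Input, Output, incremental, transform_df


# With @incremental, on incremental runs the input frame contains only rows
# added since the last successful build, so previously processed responses
# are skipped automatically.
@incremental()
@transform_df(
    Output("/Project/pipelines/processed_user_phrases"),  # placeholder path
    responses=Input("/Project/ontology/user_responses_materialization"),  # placeholder path
)
def process_new_phrases(responses):
    return (
        responses
        .select("response_id", "phrase", "submitted_at")
        .withColumn("phrase_length", F.length(F.col("phrase")))
    )
```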

Hope this helps and happy holidays!

2 Likes

This might work as well: instead of creating a new Ontology object, you could leverage an existing object or dataset that already captures user interaction (if one is available). You could just add a field specifically for the user input phrase.

Once that’s in place, I’d set up a UI component where users can input their response and link it directly to the object. Then you’d trigger the pipeline whenever new input is added, either by setting up a trigger on the object or by using a scheduled job that periodically checks for new responses (see the sketch below).

This might be a bit simpler if you want to avoid creating a whole new schema, and it could still feed into the pipeline effectively. Let me know if this helps!
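
As a rough illustration of the scheduled-check variant, the transform below just filters the existing object’s materialization down to rows where the new input field has been populated; the dataset paths and the user_input_phrase field name are hypothetical:

```python
from transforms.api import Input, Output, transform_df


# Runs on whatever schedule you configure; keeps only the interactions where
# the newly added user-input field was actually filled in. Names are illustrative.
@transform_df(
    Output("/Project/pipelines/user_input_phrases"),
    interactions=Input("/Project/ontology/user_interactions_materialization"),
)
def extract_user_phrases(interactions):
    return interactions.filter(interactions["user_input_phrase"].isNotNull())
```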

2 Likes

Hi @joshOntologize, adding to this: would it be possible for users to upload CSV files in Workshop?

I would like to implement the following workflow:

  1. A user uploads a CSV file in Workshop containing past data
  2. The data is fed to a trained machine learning model (following the steps in ‘Tutorial - Train a model in a Jupyter® notebook’: https://www.palantir.com/docs/foundry/model-integration/tutorial-train-jupyter-notebook)
  3. Once predictions are generated, they are displayed in Workshop, where users can filter to explore the results.

Hey,

This should be possible!

For simplicity, I’ll refer to the relevant datasets in the following notional pipeline as Dataset A, Dataset B, etc.

You can use the Media Uploader widget in Workshop. You’ll want to configure it so that it a) accepts only CSVs (rejecting other file types avoids dataset parsing issues) and b) appends each new file to a single, designated dataset (the schema of new CSVs must match that dataset’s existing schema). This is Dataset A.

That same designated dataset (Dataset A) should be the input to your Jupyter notebook in Code Workspaces, where your code handles the train/test split accordingly. Assuming you save your model’s predictions to an output dataset, that output is Dataset B (a minimal sketch of this step is below).
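
Purely for illustration, assuming Dataset A has a label column and you read/write datasets the way the Code Workspaces tutorial shows, the notebook step could look roughly like this. The local CSV reads/writes stand in for the actual dataset reads/writes, and all column and file names are hypothetical:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Stand-in for reading Dataset A; in Code Workspaces you would read the
# registered dataset rather than a local CSV.
history = pd.read_csv("dataset_a.csv")

features = history.drop(columns=["label"])  # assumes a 'label' column exists
labels = history["label"]

# Train/test split handled inside the notebook, as mentioned above.
X_train, X_test, y_train, y_test = train_test_split(features, labels, test_size=0.2)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))

# Score everything and write the predictions out; this output is Dataset B.
predictions = history.assign(prediction=model.predict(features))
predictions.to_csv("dataset_b.csv", index=False)
```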

Since Workshop can only display Ontology data, you will likely also need a Dataset C (built from Dataset B) to serve as the backing dataset for an Object Type, which you can import into the same Workshop module (perhaps on a new tab or page) so end users can explore and interact with your model’s predictions (a sketch is below).
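
If you go the Code Repositories route for Dataset C, a minimal sketch might just ensure each row has a stable primary key for the Object Type; the dataset paths and the prediction_id column are placeholders:

```python
from pyspark.sql import functions as F
from transforms.api import Input, Output, transform_df


# Dataset B in, Dataset C out. The Object Type's backing dataset needs a stable
# primary key, so derive one here if the predictions don't already have one.
@transform_df(
    Output("/Project/pipelines/predictions_for_ontology"),  # Dataset C (placeholder path)
    predictions=Input("/Project/pipelines/model_predictions"),  # Dataset B (placeholder path)
)
def prepare_dataset_c(predictions):
    return predictions.withColumn(
        "prediction_id",
        F.sha2(F.concat_ws("||", *[F.col(c).cast("string") for c in predictions.columns]), 256),
    )
```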

All in all, your Workshop Application will:

  • permit user uploads and store new files in Dataset A
  • eventually display results from your end-to-end pipeline via the Object Type backed by Dataset C

In the background, your Code Workspaces/Jupyter Notebook will:

  • take Dataset A as an input
  • and produce Dataset B containing model predictions

In the background, you should also:

  • create Dataset C using either Code Repositories or Pipeline Builder and sync it to an Object Type in OMA
  • configure a schedule so that this process runs automatically whenever a new file is uploaded from the frontend

Keep in mind that there will likely be a few minutes of latency between when a user uploads a new CSV and when the new results appear in the same application, due to the pipeline run plus the Ontology sync jobs.

3 Likes

This worked, thank you!

1 Like

Why not use deployed pipelines and pass the data as a parameter?