Pipeline builder "use LLM" troubleshoot"Cannot coerce to provided output type"

I have been trying to use the “Use LLM” feature in Pipeline Builder, but it looks like there are a few drawbacks with this product feature:

  1. There are times when the output of the LLM block is “Cannot coerce to provided output type”, but it’s not clear what response from the LLM failed to coerce, because when I run the same row as input in a trial run, I get a response. How do I troubleshoot further? Maybe surface the response the LLM provided, to help understand what went wrong? Could PB call the LLM again with an error message saying the response was not coercible and let the LLM edit its response, maybe with a retry count of 3? (See the sketch after this list for the behavior I have in mind.)
  2. There are some rows where the request is rate limited. Could there be a feature where I can rerun the same transform and PB only runs inference on the rows that got throttled? And could you expose configuration where I can edit the retry logic?
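To make request 1 concrete, here is a rough sketch of the retry-with-feedback behavior I have in mind. Everything here (`call_llm`, `coerce_to_output_type`, `CoercionError`) is a hypothetical stand-in, not a real Pipeline Builder API:

```python
# Rough sketch of the retry behavior I'm asking for; hypothetical pseudocode,
# not a real Pipeline Builder API.

class CoercionError(Exception):
    """Raised when a response cannot be coerced to the expected output type."""

def call_llm(prompt: str) -> str:
    """Stand-in for the LLM call made by the Use LLM block."""
    raise NotImplementedError

def coerce_to_output_type(response: str) -> dict:
    """Stand-in for PB coercing the raw response into the declared output type."""
    raise NotImplementedError

MAX_RETRIES = 3  # the retry count I'd like to be configurable

def run_with_retry(prompt: str) -> dict:
    last_response = None
    for _ in range(MAX_RETRIES):
        last_response = call_llm(prompt)
        try:
            return coerce_to_output_type(last_response)
        except CoercionError as err:
            # Feed the coercion error back so the LLM can edit its response.
            prompt = (
                f"{prompt}\n\nYour previous response could not be coerced to the "
                f"expected output type ({err}). Please correct it:\n{last_response}"
            )
    # Surface the last raw response instead of only the generic error message.
    raise CoercionError(
        f"Cannot coerce to provided output type; last LLM response: {last_response}"
    )
```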

Hey @maddyAWS, would you be able to provide more details on the output type you’re trying to coerce to, or on the prompt?

Right now we don’t support the items in 1, but we do currently retry 3 times to get the desired output type. Only if it fails after those 3 attempts do we output the “Cannot coerce to provided output type” error message.

For 2, have you tried the Skip recomputing rows functionality? We only save the rows that were successful, so you could build your pipeline again without worrying about recomputing outputs that already succeeded, and only recompute rows that errored or are brand new. You can learn more here: https://www.palantir.com/docs/foundry/pipeline-builder/pipeline-builder-llm/#skip-computing-already-processed-rows
(This is supported on both the Use LLM and text-to-embeddings transforms.)

Here is an example; I have attached a screenshot of the output where it fails to coerce when the pipeline executes, but when I select the same row in a trial run, it works fine.

Here is the output from the LLM, which is a struct:

The output type is a struct:

I will try the Skip recomputing rows option, but the documentation you refer to here doesn’t clearly say whether it will help with throttling or coerce errors from previous runs.

Can you please provide a little more clarity in the documentation on what this feature does?

" When Skip recomputing rows is enabled, rows will be compared with previously processed rows based on the columns and parameters passed into the input prompt. Matching rows with the same column and parameter values will get the cached output value without reprocessing in future deployments."

When I read this text, I interpret it as a feature that checks whether the prompt and its variables match a previous row with the same input data; if so, instead of invoking the LLM, the system just duplicates that row’s LLM output. This makes me think it helps when there are duplicate rows in the dataset, where the same output is reused to reduce LLM invocation calls (roughly the behavior sketched below).
If it is instead to be interpreted as: PB checks whether there is already output in the inference column, and if so skips that row and only processes rows that were recently added/changed, then I think the documentation has to be updated.
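In rough Python, my first reading amounts to something like this (all names are illustrative only, not actual PB internals):

```python
# My first reading of the docs: the cache key is the input data, so duplicate
# rows reuse a single LLM output. Illustrative pseudocode only.

def call_llm(prompt: str) -> str:
    """Stand-in for one LLM invocation."""
    raise NotImplementedError

cache: dict[tuple, str] = {}  # (prompt template, input column values) -> output

def process_row(prompt_template: str, input_values: tuple) -> str:
    key = (prompt_template, input_values)
    if key in cache:
        return cache[key]  # duplicate input: reuse the output, skip the LLM call
    output = call_llm(prompt_template.format(*input_values))
    cache[key] = output
    return output
```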

Here are my output stats before running the pipeline a second time with the skip option:

Here is a screenshot after running the pipeline for the second time:

Nothing changed with the rows that got throttled or failed to complete due to coerce errors.

Here is what I have done so far, including choosing a different model with a larger context window:

  1. Selected a small profile, hoping the request rate would be reduced
  2. Selected Claude 3 Sonnet with a 200k context window
  3. Enabled the “Skip recomputing rows” option and ran the job for the first time; got throttling errors
  4. Reran the same job; nothing changed with the rows that got throttled

When I use the trial run option to debug, it says it is truncating the prompt when it’s too long.
I’m now stuck, unable to troubleshoot these errors further: `{"ok":null,"error":"Cannot coerce to provided output type"}`

Any suggestions or recommendations on how to see the response the LLM gave that caused PB issues coercing it?

Hey Maddy! Appreciate your use of the Use LLM block - I have some clarifying points, and maybe some things to try to unblock your use case!

On the note of the skip recomputing rows bit - you can think of this as us maintaining a hidden map from prompts to successful responses that persists across deployments and builds. When we get a valid successful response (i.e., one that is not rate limited, that can be coerced, etc.) from the LLM, we associate that prompt with that response. On any build thereafter, for each row, we check whether the dataset has the constructed prompt for that row. If so, we use the previously computed successful response; if not, we call the LLM. This can help with rate limit errors, for example, by not hitting the LLM with as many calls, since some of the rows have already succeeded.
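A minimal sketch of that behavior, in illustrative Python (not the actual implementation; `call_llm` is a stand-in for one LLM invocation):

```python
# Minimal sketch of skip-recomputing-rows: a map from constructed prompt to
# successful response that persists across deployments and builds.

def call_llm(prompt: str) -> str:
    """Stand-in for one LLM invocation; may fail (rate limit, coercion, ...)."""
    raise NotImplementedError

persistent_cache: dict[str, str] = {}  # survives across builds

def process_row(constructed_prompt: str) -> str:
    if constructed_prompt in persistent_cache:
        # Row already succeeded in a previous build: reuse it, no LLM call.
        return persistent_cache[constructed_prompt]
    response = call_llm(constructed_prompt)  # raises on failure
    # Only valid, successful responses are cached; failed rows are retried
    # on the next build.
    persistent_cache[constructed_prompt] = response
    return response
```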

On the cannot coerce error - you have quite a complicated struct response you expect from the LLM here! This error means we got a response from the LLM, but we couldn’t actually “fit” (coerce) it into the type you requested. We will try three times to coerce - if at first we can’t fit the response into the type you request, we’ll tell the LLM that and ask it to retry. If this fails 3 times, we return the error message you’re seeing here. Something that might help the LLM return a valid response is asking it for something simpler. In essence this means splitting the prompt and response into sub-components, and then manually fitting them into the larger struct you’re asking for. For example, instead of asking for tools, parts, trim, make, model, year, etc. all at once, you could ask just for the make/model/year and put that into one output column, ask just for the trim in a separate Use LLM block and put it into another output column, and so on, finishing by manually constructing this large struct type.
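Roughly, the decomposition looks like this (field names are taken from your struct; `ask_llm` is just a stand-in for one Use LLM block, not a real PB API):

```python
# Illustrative sketch of splitting one complex struct request into simpler ones.
# In PB these would be separate Use LLM blocks writing to separate output
# columns; each ask_llm call stands in for one block.

def ask_llm(prompt: str) -> dict:
    """Stand-in for a single Use LLM block with a small struct output type."""
    raise NotImplementedError

def extract_vehicle_struct(row_text: str) -> dict:
    # Block 1: just make/model/year -- a small struct the LLM can reliably emit.
    base = ask_llm(f"Extract the make, model, and year from: {row_text}")
    # Block 2: just the trim, written to its own output column.
    trim = ask_llm(f"Extract the trim level from: {row_text}")
    # ...one simple block per remaining sub-component (tools, parts, etc.)...
    # Final step: manually assemble the large struct from the simple pieces.
    return {**base, **trim}
```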

On the rate limit errors - there’s unfortunately not very much we can do here at the moment. We retry the call tens of times (over 40) before returning this error. There is a larger effort to improve resource utilization, but this constraint comes from the LLM provider itself, and isn’t something we can easily change for you. You just need to retry here (and remember that LLMs have global rate limits, so you might have more success if you run the build at a time when fewer other people are making calls to the LLM).

Good luck, and please reach out if you need any further clarification on any of these points!

Hi David, thank you for the detailed explanation. It would be helpful for others if the documentation on “skip recomputing rows” were updated with this explanation.

It would be great if there were a retry option where PB could retry the pipeline build after an interval of a few hours. Or can I create a schedule on the dataset to build every hour, so it keeps retrying throttled rows until it succeeds?

The last time I checked, the rows that got throttled didn’t succeed when retrying the build. I will rerun the pipeline once again.

When there are 8,000 rows and only a couple of hundred fail with the coerce error message, it would be helpful to understand what response the LLM provided.

Can you provide the LLM response in the error, to help troubleshoot the prompt?

I agree the output structure is complicated, but it works in most cases.

Maybe I can tweak the prompt with some examples, if I can see the response.

@david I followed your suggestion to rerun the build with “Skip recomputing rows” turned on.

It ran for 2 minutes, but the output didn’t change. I have attached two screenshots with timestamps.

Not sure if I’m doing something wrong, but I wish there were some logs showing PB trying to reprocess rows that were throttled. The build report shows it ran, but I can’t think of any more troubleshooting ideas to debug this.


I’m still struggling with this PB issue; here is some progress I have made.

Looking at the query plan showed me a dataset that seems to be the one used to track which rows need to be re-run when using “Skip recomputing rows”.

Interestingly, when I ran a query to select rows WHERE value.error LIKE '%Request was rate limited%', I got the rows that were throttled.
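For reference, here is the same check written as a PySpark filter; the dataset path is illustrative, but the column names are what I observed in the tracking dataset:

```python
# Filter the tracking dataset for throttled rows (path is illustrative).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()
tracking = spark.read.parquet("/path/to/skip-recompute-tracking-dataset")

throttled = tracking.filter(F.col("value.error").like("%Request was rate limited%"))
throttled.select("shouldBeReadFromMemory", "value.error").show(truncate=False)
```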

But it shows “shouldBeReadFromMemory” as true. I would expect this to be false, since these rows were throttled and would have to be re-run.

Any thoughts?