Hey Maddy! Thanks for using the UseLLM block - I have some clarifying points, and a few things you could try to unblock your use case!
On the note of skipping recomputation of rows - you can think of this as us maintaining a hidden map from prompts to successful responses that persists across deployments and builds. When we get a valid, successful response from the LLM (i.e. one that is not rate limited, that can be coerced into the requested type, etc.), we associate that prompt with that response. On every build thereafter, for each row, we check whether the map already contains the constructed prompt for that row. If so, we reuse the previously computed successful response; if not, we call the LLM. This can help with rate limit errors, for example, by reducing how many calls we make to the LLM, since some of the rows have already succeeded.
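If it helps to picture it, the logic is roughly like the sketch below (a minimal Python illustration - the names, structure, and helpers are my own stand-ins, not our actual internals):

```python
# Illustrative sketch only: names and structure are assumptions, not the real implementation.
from typing import Callable

def cached_llm_call(
    row: dict,
    build_prompt: Callable[[dict], str],
    call_llm: Callable[[str], str],
    is_valid: Callable[[str], bool],
    prompt_cache: dict[str, str],
) -> str:
    prompt = build_prompt(row)           # the constructed prompt for this row
    if prompt in prompt_cache:           # a previous build already got a valid response
        return prompt_cache[prompt]      # reuse it - no LLM call for this row
    response = call_llm(prompt)          # otherwise, hit the LLM
    if is_valid(response):               # not rate limited, coercible into the type, etc.
        prompt_cache[prompt] = response  # persist so later builds can skip this row
    return response
```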
On the cannot coerce error - you have quite a complicated struct response that you expect from the LLM here! This error means we got a response from the LLM, but we couldn't actually "fit" (coerce) it into the type you requested. We will try three times to coerce - if at first we can't fit the response into the type you requested, we tell the LLM that and ask it to retry. If this fails three times, we return the error message you're seeing here. Something that might help the LLM return a valid response is asking it for something simpler. In essence, this means splitting the prompt and response into sub-components, and then manually fitting those into the larger struct you're asking for. For example, instead of asking for tools, parts, trim, make, model, year, etc. all at once, you could ask just for the make/model/year and put that into one output column, ask just for the trim in a separate UseLLM block and put that into another output column, and so on, finishing by manually constructing the large struct from those columns.
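To make the splitting idea a bit more concrete, here's a rough Python sketch - the struct, prompts, and `ask` helper are all made up for illustration, not the actual UseLLM API; in practice each small question would be its own UseLLM block writing to its own output column:

```python
# Illustrative only: the field names, prompts, and `ask` helper are assumptions.
from dataclasses import dataclass
from typing import Callable

@dataclass
class VehicleInfo:
    make: str
    model: str
    year: str
    trim: str

def extract_vehicle(description: str, ask: Callable[[str], str]) -> VehicleInfo:
    # Each small, simple question would be its own UseLLM block / output column...
    make_model_year = ask(
        f"Give the make, model, and year of this vehicle, comma separated: {description}"
    )
    trim = ask(f"Give only the trim of this vehicle: {description}")
    make, model, year = (part.strip() for part in make_model_year.split(","))
    # ...and the large struct gets assembled manually from the simpler answers.
    return VehicleInfo(make=make, model=model, year=year, trim=trim)
```

The simpler each individual question is, the easier it is for the LLM to return something we can coerce, which is why this tends to avoid the error.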
On the rate limit errors - there's unfortunately not much we can do here at the moment. We retry the call tens of times (over 40) before returning this error. There is a larger effort underway to improve resource utilization, but this constraint comes from the LLM provider itself, and isn't something we can easily change for you. For now you'll just need to retry (and remember that LLMs have global rate limits, so you may have more success if you run the build at a time when fewer other people are making calls to the LLM).
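Conceptually, the retry behaviour looks something like the sketch below (Python; the backoff, jitter, and error class are assumptions on my part to make the idea concrete, not the exact implementation):

```python
# Conceptual sketch of "retry many times, then give up" - details are illustrative.
import random
import time
from typing import Callable

class RateLimitError(Exception):
    """Placeholder for whatever error the LLM provider returns when rate limited."""

def call_with_retries(call_llm: Callable[[str], str], prompt: str, max_attempts: int = 40) -> str:
    for attempt in range(max_attempts):
        try:
            return call_llm(prompt)
        except RateLimitError:
            # Back off (capped, with jitter) before retrying; the limit is global to
            # the provider, so waiting is really all that can be done.
            time.sleep(min(60.0, 2.0 ** attempt) + random.random())
    raise RateLimitError(f"still rate limited after {max_attempts} attempts")
```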
Good luck, and please reach out if you need any further clarification on any of these points!