How many retries are done (to ensure structure) in the backend when using the LLM board in the builder?
And is that configurable?
Today we retry the output-type coercion up to three times. That retry count isn't configurable at the moment.
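For anyone curious what that looks like, here's a minimal sketch of the retry-on-coercion pattern, assuming a hypothetical `call_llm` helper and a Pydantic model standing in for the expected structure (neither is the actual backend API):

```python
# Sketch of retrying structured-output coercion a fixed number of times.
# `call_llm` and `Task` are hypothetical stand-ins, not the real backend code.
import json
from pydantic import BaseModel, ValidationError

MAX_COERCION_RETRIES = 3  # fixed today; not exposed as a setting


class Task(BaseModel):
    title: str
    priority: int


def call_llm(prompt: str) -> str:
    """Placeholder for a single LLM call returning raw model text."""
    raise NotImplementedError


def generate_task(prompt: str) -> Task:
    last_error: Exception | None = None
    for _ in range(MAX_COERCION_RETRIES):
        raw = call_llm(prompt)
        try:
            # Coerce the raw output into the expected structure.
            return Task.model_validate(json.loads(raw))
        except (json.JSONDecodeError, ValidationError) as err:
            last_error = err
            # Feed the error back so the next attempt can self-correct.
            prompt = f"{prompt}\n\nPrevious output was invalid ({err}); return valid JSON only."
    raise RuntimeError("Output coercion failed after retries") from last_error
```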
Hey @drew, follow-up: are we batching requests in the backend? Asking since batched API calls are cheaper, and cost is becoming a concern at scale.
We’re not batching calls today, but we do parallelize individual LLM calls.
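To illustrate the distinction, here's a rough sketch of "parallel, not batched": each prompt is still its own API request, just issued concurrently. `call_llm_async` is a hypothetical helper, not the actual backend function:

```python
# Each prompt is a separate request, run concurrently with asyncio;
# no requests are combined into a single batched API call.
import asyncio


async def call_llm_async(prompt: str) -> str:
    """Placeholder for one async LLM API call."""
    raise NotImplementedError


async def run_prompts(prompts: list[str]) -> list[str]:
    # One request per prompt, awaited together.
    return await asyncio.gather(*(call_llm_async(p) for p in prompts))
```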