Pipeline build w/ LLM hangs indefinitely

Hi there - I have a build pipeline using an LLM transform that was working fine, and now it appears to hang indefinitely. In the logs, the two entries below just repeat endlessly. I am using advanced > local spark and have constraint propagation disabled, which I did based on another topic here.

INFO 
[2024-12-24T16:53:53.302323Z]
Retrying call after failure {}/{} backoff: {}, channel: {}, service: {}, endpoint: {}, status: {}
{
  "7": null
}
INFO 
[2024-12-24T16:53:53.496217Z]
Exhausted {} retries, returning last received response with status {}, channel: {}, service: {}, endpoint: {}
{
  "5": null
}

Something else was very strange: in data lineage, there were 4 unexpected ‘Compute Module - xxx’ datasets tied to the intended output dataset for this pipeline. Additionally, I could no longer deploy any changes to the pipeline. I would get a permission denied error on the output dataset, even though I clearly had permissions.

Ultimately I resolved this by creating a new pipeline, copying and pasting all the steps, creating a new output dataset, and updating any connections to the old one. The new pipeline deployed and built without issue.

Figured out the real issue. I think there is a bug: if you have an LLM node in Pipeline Builder that references a file from a media set, the media set itself has to be an input into the pipeline. If you instead use the rid / media reference from a downstream dataset, the LLM trial runs will work, but you will not be able to build the pipeline (it will just hang indefinitely), and the previews for the LLM node and every downstream node will hang indefinitely as well.
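
In case it helps anyone auditing a larger pipeline, here is a rough sketch of the check I ended up doing by hand. It is not a Pipeline Builder or Foundry API: the function name, the input shapes, and the identifiers like "media-set-A" are all made up for illustration. It assumes you have written down, yourself, the inputs of the pipeline and which media set each LLM node's file reference points at.

```python
from typing import Dict, Iterable


def find_unlinked_media_sets(
    pipeline_input_ids: Iterable[str],
    llm_media_refs: Dict[str, str],
) -> Dict[str, str]:
    """Return the LLM nodes whose media set is not a direct pipeline input.

    pipeline_input_ids: identifiers of everything added as a pipeline input
        (hypothetical, hand-collected values).
    llm_media_refs: maps an LLM node name to the identifier of the media set
        its file reference comes from (also hand-collected).
    """
    inputs = set(pipeline_input_ids)
    # Keep only the nodes that reference a media set missing from the inputs.
    return {
        node: media_set
        for node, media_set in llm_media_refs.items()
        if media_set not in inputs
    }


# Hypothetical example: the media reference arrives via a downstream dataset,
# so the media set itself was never added as an input.
problem_nodes = find_unlinked_media_sets(
    pipeline_input_ids=["dataset-1"],
    llm_media_refs={"llm_node": "media-set-A"},
)
print(problem_nodes)
```

Any node this returns is one where, in my case, adding the media set as a direct pipeline input was what fixed the hang.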
