I’ve been using the LLM Node in my pipeline and it’s been working well, but recently I’ve noticed it occasionally skips or doesn’t properly process certain lines in my data. It happens randomly - most of the time everything works fine, but some rows just get missed or produce incomplete outputs.
Has anyone experienced this before? I’m wondering if it could be related to token limits, timeouts, or some configuration I might be missing. Any insights would be really helpful!
Hey @Jacob_SE, one thing you could do to test the values that are producing nulls or incomplete outputs is (on a separate branch) to enable the "include errors" option on the output and use a trial run to see what the error actually is.
Thanks for the suggestion! I’ve already tried the “include errors” option, but unfortunately most errors just show up as “unknown error” with no actionable information. Even after resolving context limit issues, these random failures continue with no clear pattern.
This issue is significantly affecting my operations:
Build times: 3-4 hours on average (I’ve had builds fail after running for 8 hours)
Substantial resource waste from failed runs
Extensive time spent on trial-and-error optimization with disappointing results
Interestingly, I rarely experienced these issues during the early LLM Node days with GPT-4o. These random failures seem to have emerged with newer model integrations, which makes me wonder if there’s a regression or compatibility issue.
Have you or the Palantir team identified any patterns for these “unknown errors”?
Hey, sorry for the delayed reply! What model are you currently using, and what build profile? Can you also double-check your enrollment rate limits, or whether you have any project rate limits that may be affecting your builds?
Hey, I also had the same issue with Pipeline Builder. In my case, the cause was that I was passing the entire document to the model, and if the document was long enough, the remaining context window for some models became too small. The output was null, and the error was likewise reported as 'unknown error'. I think Claude handled it better, but I ended up splitting documents into pages and then running the query on each page.
Hi, I’ve already tried several models, including Claude, but none of them deliver both quality and consistency. We need every output to reliably match our expectations.