I’m building out a pipeline to process customer service tickets and translate them to English.
One of the preprocessing steps is to identify those cases that are non-English or a combo of English and another language before sending the non-English ones for translation.
When I use Gemini for that initial identification step, all of the Gemini models append a line break to the end of the categorisation output.
Hi @Samwise_AIP, I was also able to repro that behaviour with the Gemini models, but not with other models. I have seen other folks online report similar issues with the Gemini models, so some more prompt engineering might fix it, but it looks like a Gemini issue.
However, you can easily clean these up with either the “Clean string” or “Trim whitespaces” board from Pipeline Builder. To use it, add a new transform right after the Use LLM node. Let me know if you have issues adding the board.
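If you ever need the same cleanup in code rather than through a board, here is a minimal Python sketch of the idea. It is not Pipeline Builder's implementation: the function name `clean_category` and the sample labels are made up for illustration, and it assumes the model output arrives as a plain string.

```python
def clean_category(raw: str) -> str:
    """Remove leading and trailing whitespace, including a trailing line break."""
    return raw.strip()


# Hypothetical model outputs with the stray line break appended
outputs = ["English\n", "Non-English\r\n", " Mixed \n"]

cleaned = [clean_category(o) for o in outputs]
print(cleaned)
```

Stripping before you compare against your expected category values means the stray line break cannot cause a mismatch in the routing step that follows.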