Kafka Message Too Large Error in Stream Pipeline

Hi everyone :waving_hand:

I’m building a streaming pipeline using Pipeline Builder, where I call a UDF that performs inference using an Ontology Model.
In addition, the pipeline includes a processing cube called Use LLM.

The stream generally works well, and I can see data successfully reaching the output :white_check_mark:
However, once every few days, I encounter the following error:

Root exception after attempted recovery (occurred

Caused by: org.apache.flink.streaming.connectors.kafka.FlinkKafkaException: Failed to send data to Kafka: The request included a message larger than the max message size the server will accept.

Caused by: org.apache.kafka.common.errors.RecordTooLargeException: The request included a message larger than the max message size the server will accept.

Important context:
:repeat_button: There is another upstream stream called Stream Nof that feeds data into my pipeline — and it runs continuously without any failures.
The key difference is that my pipeline includes both the Use LLM cube and the UDF inference call, which may be contributing to the larger message sizes.

At one point, I reduced the stream’s resources to a medium configuration, which resolved the issue for about 10 days, but the error eventually returned.

:hammer_and_wrench: Has anyone encountered a similar issue or has recommendations for handling oversized Kafka messages in Foundry Stream pipelines?
Any help, configuration tips, or message-size mitigation strategies would be greatly appreciated :folded_hands:

Thanks in advance!

It sounds like your pipeline can occasionally produce individual records that are very large (10 MB is a historical Kafka limit that I have seen), which is not something the product can support out of the box.
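If it helps to confirm which records are the offenders, one rough check is to measure the serialized size of each output row before it leaves the pipeline. Here is a minimal Python sketch, assuming the record is available as a plain dict; the actual on-the-wire size depends on the serialization the platform uses, so treat it only as an estimate:

```python
import json

# Rough byte-size estimate of a record as UTF-8 JSON.
# The real Kafka payload size depends on the serializer and headers,
# so this is a lower-bound sanity check, not an exact measurement.
def approx_record_size_bytes(record: dict) -> int:
    return len(json.dumps(record, ensure_ascii=False).encode("utf-8"))

# Example: flag records approaching a ~10 MB limit.
MAX_BYTES = 10 * 1024 * 1024
record = {"id": "123", "llm_output": "..."}  # hypothetical row
if approx_record_size_bytes(record) > MAX_BYTES:
    print("record too large, needs filtering or splitting")
```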

I would suggest applying some filtering logic where possible, determining why individual record sizes are so large, and considering whether those records can be split into multiple rows.
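As an illustration of the splitting idea, here is a hedged sketch of UDF-style Python that chunks an oversized LLM output field across several rows. The field names (`llm_output`, `chunk_index`) are hypothetical and would need to match your actual schema:

```python
from typing import Iterator

CHUNK_BYTES = 512 * 1024  # stay well below the broker's max message size

# Split one logical record into several smaller rows by chunking the large
# text field; downstream consumers can reassemble on (id, chunk_index).
def split_record(record: dict) -> Iterator[dict]:
    text = record.get("llm_output", "")
    data = text.encode("utf-8")
    if len(data) <= CHUNK_BYTES:
        yield record
        return
    for i in range(0, len(data), CHUNK_BYTES):
        # Note: decoding with errors="ignore" is a simplification; a production
        # version should split on character boundaries to avoid losing bytes.
        chunk = data[i : i + CHUNK_BYTES].decode("utf-8", errors="ignore")
        yield {**record, "llm_output": chunk, "chunk_index": i // CHUNK_BYTES}
```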

Outside of that, there is also special runtime configuration that a stack admin can apply; however, there are limits to how high record sizes can go.
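For reference, the underlying Kafka knobs that this kind of runtime config typically corresponds to are the standard record-size settings shown below. These are not user-facing in Foundry and would have to be changed by a stack admin, and the exact settings exposed (if any) may differ; this is just the standard Kafka side of the picture:

```properties
# Broker-wide cap on a single record batch (default is roughly 1 MB)
message.max.bytes=10485760
# Per-topic override of the same limit
max.message.bytes=10485760
# Producer-side cap on a single request
max.request.size=10485760
# Consumers must also be able to fetch large records
max.partition.fetch.bytes=10485760
```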