What is the difference between jaggedRowBehavior "IGNORE" and "DROP_ROW"?

I can see the option to drop jagged rows in the Edit Schema section, but I can also directly edit the schema to “IGNORE”. Is there a difference between these two behaviors? Does IGNORE replace missing columns with null, whereas dropping would remove the row entirely?


Hey, are you seeing this in the Pipeline Builder app?

  1. Ignore:
  • When jaggedRowBehavior is set to “Ignore”, the system processes the jagged rows as they are. I would typically use REPLACE_WITH_NULL or Drop instead of Ignore.
  2. Drop:
  • This discards any row that does not match the expected number of columns, so only complete rows are passed through the pipeline. Rows with missing columns are left out of the output, which filters out incomplete data.
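
To make the two behaviors above concrete, here is a minimal Python sketch. It is not Foundry's parser: `read_rows` and the sample data are hypothetical, and the "IGNORE" branch simply assumes the description above (jagged rows are passed through as they are, with no padding or nulls added).

```python
import csv
import io


def read_rows(text, jagged_row_behavior):
    """Parse CSV text and apply a jagged-row policy (illustrative only)."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = []
    for row in reader:
        is_jagged = len(row) != len(header)
        if is_jagged and jagged_row_behavior == "DROP_ROW":
            # Discard rows whose column count does not match the header.
            continue
        if jagged_row_behavior not in ("DROP_ROW", "IGNORE"):
            raise ValueError(f"unsupported behavior: {jagged_row_behavior}")
        # "IGNORE" (assumption): keep the row exactly as parsed.
        rows.append(row)
    return header, rows


# The second data row is jagged: it has two values for three columns.
sample = "id,name,city\n1,Ann,Oslo\n2,Bob\n3,Cy,Rome\n"

header, dropped = read_rows(sample, "DROP_ROW")
header, ignored = read_rows(sample, "IGNORE")
print(len(dropped), len(ignored))
```

In this sketch the short row stays short under "IGNORE", so anything downstream that expects three columns would have to cope with it. Under "DROP_ROW" the row never reaches the output.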

Ah sorry, I saw this issue due to some PB transforms erroring out, but this is definitely more of a general dataset thing than a PB thing.

Also, it looks like REPLACE_WITH_NULL is an invalid value for this field. I see the options are DROP_ROW, THROW_EXCEPTION, IGNORE. I was imagining IGNORE as REPLACE_WITH_NULL effectively, but are you saying that with IGNORE, the system would process jagged rows by “mixing” rows together?

Thanks for your help!