I made changes in Pipeline Builder that changed the schema of several datasets that this pipeline both imports and produces. The pipeline runs well on the branch, since I fixed all the schema mismatches there. However, when I try to merge it into master, the merge is blocked and I get the error below:
How can I force my changes on the master branch? I’m aware it won’t be deployable immediately, but I can fix the schema mismatches on the branch later.
Hey, just to clarify: are some of the inputs in your Pipeline Builder file also outputs in the same file? Right now we don’t have a way to force a merge into a protected branch. What happens when you click Fix schemas on the error above?
Yup, exactly. Since it’s a longer pipeline, we’re splitting it into stages: we save to a dataset after each stage, then immediately import that dataset and continue the transformations on top of it. (Ideally, the checkpoints would also save to a dataset, so we wouldn’t need to do this.)
For this reason, it’s hard to fix the schema. On the branch the whole pipeline works well, but on master, the parts that come after one of the “checkpoints” depend on the new schema while being served an old dataset that doesn’t have it.
I worked around it by copying each stage, one at a time, from the branch to master, deploying it so that it creates a new dataset with the correct schema, and only then copying the next stage over.
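For anyone hitting the same thing, here is a rough Python sketch of why the all-at-once merge fails while the stage-by-stage copy works. It is only a toy model of the reasoning, not Pipeline Builder's API: the `Stage` class, the dataset names (`checkpoint_1`, `checkpoint_2`, `final`) and the column names are all made up for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stage:
    """Hypothetical model of one pipeline stage between two checkpoints."""
    name: str
    reads: Optional[str]      # dataset this stage imports (None for the first stage)
    needs: frozenset          # columns the stage expects on its input
    writes: str               # dataset this stage produces
    produces: frozenset       # columns of the produced dataset


def mismatches_if_merged_at_once(stages, master_schemas):
    """Check every stage against the schemas currently deployed on master."""
    problems = {}
    for stage in stages:
        if stage.reads is None:
            continue
        missing = stage.needs - master_schemas.get(stage.reads, frozenset())
        if missing:
            problems[stage.name] = sorted(missing)
    return problems


def staged_rollout(stages, master_schemas):
    """Copy one stage, deploy it (which updates its output schema), then move on."""
    schemas = dict(master_schemas)  # work on a copy, leave the input untouched
    order = []
    for stage in stages:
        if stage.reads is not None:
            missing = stage.needs - schemas.get(stage.reads, frozenset())
            if missing:
                raise ValueError(f"{stage.name}: {stage.reads} lacks {sorted(missing)}")
        schemas[stage.writes] = stage.produces  # "deploy": output now has the new schema
        order.append(stage.name)
    return order, schemas


# Made-up example: the branch renames `amount` to `amount_usd` and adds `bucket`.
stages = [
    Stage("stage_1", None, frozenset(), "checkpoint_1", frozenset({"id", "amount_usd"})),
    Stage("stage_2", "checkpoint_1", frozenset({"amount_usd"}),
          "checkpoint_2", frozenset({"id", "amount_usd", "bucket"})),
    Stage("stage_3", "checkpoint_2", frozenset({"bucket"}), "final", frozenset({"id", "bucket"})),
]
master = {
    "checkpoint_1": frozenset({"id", "amount"}),
    "checkpoint_2": frozenset({"id", "amount"}),
    "final": frozenset({"id"}),
}

at_once = mismatches_if_merged_at_once(stages, master)
order, final_schemas = staged_rollout(stages, master)
```

In this toy model, `at_once` reports mismatches for `stage_2` and `stage_3`, because both read checkpoint datasets that still have the old schema on master. `staged_rollout` goes through without errors, because each checkpoint is rebuilt with the new schema before the next stage that reads it is checked.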