I am aware that we can define a Python Transforms pipeline from a JSON file by using a transform generator and ensuring that any inputs to the pipeline already exist before we commit the JSON file.
To do this, we would have a pre-defined path prefix for all pipeline input and output types. For example, if we had 3 sequential stages “X”, “Y” and “Z”, we could put our inputs for X in namespace/project/X/inputs/{input_name} and the outputs in namespace/project/X/outputs/{output_name}, with each subsequent stage of the pipeline using the previous stage’s outputs as its inputs.
The transforms generator can then iterate through each stage of the JSON and dynamically define the inputs when the code is committed.
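As a minimal sketch of the path convention described above (not the actual generator), the following plain-Python snippet resolves input and output paths for each stage. The JSON field names ("stages", "name", "inputs", "outputs"), the dataset names, and the helper resolve_stage_paths are all hypothetical, and the snippet deliberately stops short of calling any Transforms API; the resolved paths would still need to be handed to whatever the generator uses to declare inputs and outputs.

```python
import json

# Hypothetical pipeline definition. The field names and dataset names are
# assumptions made for illustration; a real JSON file may look different.
PIPELINE_JSON = """
{
  "stages": [
    {"name": "X", "inputs": ["raw"], "outputs": ["cleaned"]},
    {"name": "Y", "outputs": ["joined"]},
    {"name": "Z", "outputs": ["report"]}
  ]
}
"""

# Path prefix taken from the example in the post.
PREFIX = "namespace/project"


def resolve_stage_paths(pipeline, prefix=PREFIX):
    """Return, for every stage, the dataset paths of its inputs and outputs.

    The first stage reads from {prefix}/{stage}/inputs/{input_name};
    every later stage reads the previous stage's outputs.
    """
    resolved = []
    previous_outputs = None
    for stage in pipeline["stages"]:
        name = stage["name"]
        outputs = {
            out: f"{prefix}/{name}/outputs/{out}" for out in stage["outputs"]
        }
        if previous_outputs is None:
            # Only the first stage has its own "inputs" folder.
            inputs = {
                inp: f"{prefix}/{name}/inputs/{inp}" for inp in stage["inputs"]
            }
        else:
            # Later stages consume the previous stage's outputs.
            inputs = dict(previous_outputs)
        resolved.append({"name": name, "inputs": inputs, "outputs": outputs})
        previous_outputs = outputs
    return resolved


stages = resolve_stage_paths(json.loads(PIPELINE_JSON))
for stage in stages:
    print(stage["name"], stage["inputs"], stage["outputs"])
```

Keeping the path resolution separate from the transform definitions means the same function could be reused by a script that checks the input datasets exist before the JSON file is committed.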
Our problem is this: if the JSON file comes from, e.g., an external library or some other source outside the repository, we want the jobspecs of these transforms to be updated automatically when the library updates, without having to manually bump the version in the repo. We would also be OK with this being done via API calls that we can automate. Is this possible, and if so, how can it be done?
If I understand your use case properly, you want to have some arbitrary file (JSON in your case, although it doesn’t really need to be) which you read and parse in the transform generator code in your code repository, and then create transforms from it?
Where are you intending to get this file from? Are you trying to ingest it as a dataset and read it? That won’t be easy to make work, given the timing of when checks run. You mention reading it from an external library; in that case it might be slightly easier to work with.
I’d be interested to have some more information about the problem you’re trying to solve. You’re going against quite a few significant design decisions in the platform, and this will end up being a fragile setup of scripts and external transforms that will be a nightmare to maintain. Of course, it might still be worth it, but there is likely a much better way to accomplish whatever you’re trying to achieve.