Hello,
I would like to implement the following process.
I want to create “datasetB_yyyymmdd” as a backup of “datasetA”.
Note: yyyymmdd is the date on which the process is executed.
I want the process to be executed once a month.
(One dataset will be created each month,
such as
datasetB_20250101,
datasetB_20250201,
datasetB_20250301, etc.)
【Question】
1. Is it possible to output a new dataset every time the process is executed?
2. If 1 is possible, can the process be run on a schedule even though the output dataset is dynamic?
CI checks must run whenever an input or output is added to a transform, so a new output cannot be created dynamically at run time.
However, you could create N empty datasets for the next N months in a multi-output transform, and have code to choose and write to the output corresponding to the desired month. Something like this for instance:
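from datetime import datetime

from transforms.api import transform, Input, Output


@transform(
    my_input=Input("ri.foundry.main.dataset.XXX"),
    # One pre-declared output per month, keyed by its YYYYMM label
    **{
        month: Output(f"folder_path/datasetB_{month}")
        for month in ["202411", "202412"]
    },
)
def my_transform(my_input, **outputs):
    # Pick the output that matches the month in which the build runs
    current_month = datetime.now().strftime("%Y%m")
    if current_month not in outputs:
        raise ValueError("Current month has no corresponding output")
    outputs[current_month].write_dataframe(my_input.dataframe())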
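
Typing out the month list by hand gets tedious for larger N, so here is a minimal sketch of a helper that generates the labels. The name month_keys and its parameters are hypothetical and not part of any Foundry API. It takes a fixed start date rather than the current date, on the assumption that the set of declared outputs should stay the same between CI runs.

```python
from datetime import date


def month_keys(start, count):
    """Return `count` consecutive YYYYMM labels, beginning with the month of `start`."""
    keys = []
    year, month = start.year, start.month
    for _ in range(count):
        keys.append(f"{year:04d}{month:02d}")
        month += 1
        if month > 12:
            # Roll over into January of the next year
            year, month = year + 1, 1
    return keys


# Hypothetical usage: labels for four months starting in November 2024
print(month_keys(date(2024, 11, 1), 4))
# prints ['202411', '202412', '202501', '202502']
```

The returned list could replace the hard-coded ["202411", "202412"] in the dictionary comprehension above. Extending it later would still mean committing the change so that CI checks run again.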