Hi,
I have a media set (or unstructured dataset) containing images extracted from several PDF documents.
My objective is to generate a summary of each image and produce a dataset with the columns media_item_rid, path, timestamp, media_reference and image_summary.
I want to use an incremental lightweight transform for faster processing.
The issue I’m facing is a strange one.
I define my @transform as shown here. The first build, which runs as a snapshot, completes fine:
@lightweight(cpu_cores=4, memory_gb=16)
@incremental(v2_semantics=True)
@transform(
    output=Output("ri.foundry.main.dataset.feb18440-757b-4958-8138-62cc4ba8eb21"),
    input_images=MediaSetInput("ri.mio.main.media-set.020d9d31-b3b9-4178-b76b-43f6346d8199"),
    vision_model=GenericVisionCompletionLanguageModelInput("ri.language-model-service..language-model.gemini-2-0-flash")
)
On the second build, when the incremental load kicks off, it fails with this error:
Failed to resolve dataset properties for input datasets: Build2:MissingIncrementalInputResolution {rid=ri.language-model-service..language-model.gemini-2-0-flash, inputType=language-model} (2dfd0e12-3a35-48e7-a351-f67584e5e573)
If I instead define vision_model manually inside the compute function, rather than in the @transform decorator, the transform runs and builds the dataset incrementally.
What could be causing this, and how can I resolve it? Any guidance would be appreciated.
Thanks in advance.
Regards,
Solon Das