Pipeline Builder - Images? Big Data?

I apologize in advance for the potentially trivial question – does Pipeline Builder have support for image datasets? If so, can it handle large files (4.3m images, ~120GB)? If not, what are some worthy resources at my disposal?


Pipeline Builder supports Media Sets, which can hold an arbitrary number of images of arbitrary size. Processing them at scale should be straightforward in Pipeline Builder, since you can scale up the compute assigned to the pipeline. Note, though, that vision models and other expensive operations may still be slow.

Given this is tagged with the winter fellowship label, you might struggle to process data at this scale on the free dev-tier enrollments, because compute there is quite limited. Would your workflow still make sense with a subset of the data? It might be best to build out your workflow on a subset first, then scale up once development is done.
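
If it helps, here is a minimal sketch of drawing a reproducible random subset of the image files locally before uploading them to a Media Set. This is plain Python, not a Pipeline Builder or Foundry API; the function names, the extension list, and the 1% fraction are all assumptions for illustration only.

```python
import random
from pathlib import Path

# Hypothetical list of extensions; adjust to whatever your dataset contains.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def list_images(root):
    """Return image paths under root, sorted so that sampling is reproducible."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def sample_subset(paths, fraction, seed=0):
    """Pick a random fraction of paths; the same seed gives the same subset."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not paths:
        return []
    k = max(1, round(len(paths) * fraction))
    return random.Random(seed).sample(list(paths), k)
```

For example, `sample_subset(list_images("/path/to/images"), 0.01)` would select roughly 1% of the files (the path is a placeholder). Keeping the seed fixed means you can rebuild the same subset later when comparing results against the full run.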

What you’ve proposed is likely the smarter approach; I’ll work with a subset before scaling up post-development. I appreciate the clarification about Media Sets. Thanks!
