Embeddings via pipeline builder and Ada 002 taking long?

Peymanp · June 25, 2024, 8:09am

Hey, I am creating embedding vectors for ~150k rows of data, each row containing about 180 characters on average. The pipeline has been running for the past 2 hours. I saw the pop-up saying that the model is slow, but did not see any quantifiable measure of its speed. I'd appreciate an estimate of the pipeline runtime for the inputs above!

Xander · June 25, 2024, 10:24am

Hey! There is a rate limit imposed on the model of about 1,000 embeddings per minute, so for ~150k rows (150,000 / 1,000 = 150 minutes) I would expect this build to complete in ~2.5 hours, depending on other transformations that are being done in the pipeline.
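
For anyone wanting to redo this estimate with their own numbers, here is a minimal sketch of the arithmetic. It assumes one embedding per row and that the rate limit is the only bottleneck; the function name `estimated_minutes` is just illustrative and not part of any product API.

```python
def estimated_minutes(rows: int, embeddings_per_minute: int) -> float:
    """Rough lower bound on runtime, assuming one embedding per row
    and that the rate limit is the only bottleneck."""
    return rows / embeddings_per_minute


rows = 150_000   # ~150k rows from the question
rate = 1_000     # approximate rate limit quoted above (embeddings per minute)

minutes = estimated_minutes(rows, rate)
print(f"{minutes:.0f} min = {minutes / 60:.1f} h")  # prints "150 min = 2.5 h"
```

Other transformations in the pipeline would add to this, so treat the result as a floor rather than a precise prediction.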

We’re also investing in observability tooling that will help you see embedding counts while the build is running.