Hey, I am creating embedding vectors for ~150k rows of data, each row containing on average 180 characters. The pipeline has been running for the past 2 hours. I saw the pop-up saying that the model is slow, but did not see any quantifiable measure of the speed. I'd appreciate any insight on an estimated pipeline runtime with the inputs above!
Hey! There is a rate limit imposed on the model of about 1,000 embeddings per minute, so I would expect this build to complete in ~2.5 hours (150,000 rows / 1,000 per minute = 150 minutes), depending on other transformations that are being done in the pipeline.
We’re also investing in observability tooling that will help you see embedding counts while the build is ongoing.
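If you want to redo the estimate for a different row count, here is a rough sketch of the arithmetic behind the ~2.5 hour figure. It assumes one embedding per row and that the ~1,000 per minute limit is sustained for the whole build; the function name is just illustrative, and any other transformations in the pipeline would add time on top.

```python
import math


def estimate_embedding_minutes(num_rows: int, embeddings_per_minute: int = 1_000) -> int:
    """Rough lower bound on embedding time in minutes.

    Assumes one embedding per row and a constant rate limit
    (about 1,000 embeddings per minute, per the reply above).
    """
    return math.ceil(num_rows / embeddings_per_minute)


minutes = estimate_embedding_minutes(150_000)
print(f"{minutes} min (~{minutes / 60:.1f} h)")  # 150 min (~2.5 h)
```

Since the limit is on the number of embeddings, row length (about 180 characters here) should not change this estimate much.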