AIP logic function execution is too slow

Hi everyone,

We have a latency-sensitive workflow, and an AIP Logic function is currently the latency bottleneck.

Current latency

  • We have LLM blocks producing single-completion string outputs, and action blocks that create objects based on those string outputs.
  • Each action block currently takes 5-6 seconds.
  • There are 3 action blocks in the AIP Logic function, and the total execution time is 45-60 seconds.

Workflow constraints

  • We really want to bring this latency down to 10-20 seconds if possible.
  • The AIP Logic function is executed in real time in Workshop by the end user. We cannot pre-run the functions due to workflow constraints: the function's input is a combination of N products, there are millions of products, and there are far too many potential combinations to pre-run and store.

Curious if anyone has faced similar challenges or has any suggestions on how we can achieve 10-20 second latency.


Hey - I think you are on the right path with using single completion and understanding the blocks. It'd be worthwhile to figure out which part of the Logic is running slowly.

A few leading questions:

  • How long does each block take? Can anything run in parallel?
  • Are you having the LLM call the actions or are you calling the action with an action block?
  • When you run your actions in a code repo, do they take a long time? If so, can you write more efficient code? Can you call the actions in parallel?
  • How long does each block take?
    Action blocks are the biggest bottleneck, at 5-6 seconds each. All other blocks run in less than 0.5 seconds.

  • Can anything run in parallel?
    This is a great suggestion. I can explore breaking it up into multiple Logic functions and parallelizing them.

  • Are you having the LLM call the actions or are you calling the action with an action block?
    LLM blocks output strings via single completion. Action blocks execute the actions.

  • When you run your action in code repo, do they take a long time? If so, can you write more efficient code?
    The actions are not function-backed. Each action creates an object from the given inputs, with no other side effects. I don't see an opportunity for optimization here.

One option would be to write a function-backed action that applies all three edits at the same time (this might be more efficient). Are you just modifying three objects, or are you applying a lot of edits?

One way of running things in parallel that I have used is to import a few Logic functions into a code repo as queries and call them all inside a Promise.all(). I'm not sure how much this will help, given that the bottleneck appears to be the actions.
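The Promise.all() pattern looks roughly like this — a minimal sketch where placeholder async functions stand in for the imported Logic queries (all names here are hypothetical, not the actual Foundry query API):

```typescript
// Placeholder async functions standing in for imported Logic queries.
// In a real code repo these would be the queries imported from your Logics.
async function createObjectA(input: string): Promise<string> {
  return `created A from ${input}`;
}
async function createObjectB(input: string): Promise<string> {
  return `created B from ${input}`;
}
async function createObjectC(input: string): Promise<string> {
  return `created C from ${input}`;
}

// Awaiting each call sequentially costs the SUM of the three latencies.
// Firing all three and awaiting them together costs roughly the MAX.
async function runAll(input: string): Promise<string[]> {
  return Promise.all([
    createObjectA(input),
    createObjectB(input),
    createObjectC(input),
  ]);
}
```

Promise.all resolves once every promise has resolved, and preserves the order of the input array, so the results line up with the calls regardless of which finishes first.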

This is actually a great suggestion. I'm creating 3 objects of the same object type.

This will parallelize the action execution. I will try the async calls with Promise.all()

Yeah - note that all edits are stored and applied once at the end of the action, so there will still be some overhead, but it should help improve the perf.

Another avenue to explore is if you can change to use a faster model for the LLM blocks. If you set up some Logic Evaluations (more guidance here and here) you can quantify precisely the change in behavior using different models and evaluate the tradeoffs between cost, speed, and accuracy for each block of your Logic.

Also, I'll second @bkaplan with the general guidance to "factor out" anything that can be done deterministically from the LLM blocks into either TypeScript Functions or other Logic blocks, as they will run much more quickly than having the LLM block determine and use a tool.


Following the suggestions, I was able to bring latency down to ~5 seconds!
Thanks everyone for the help.


Amazing - can you share what helped you get it down so much? I think it might be helpful for others

Mostly by following the recommendations in here and on Slack:

I created an ontology edit function in a code repo. This function calls the AIP Logic function.

The Logic function no longer creates objects; instead, it returns a list of structs, each struct representing an object that needs to be created.

Objects are created with the Objects.create() syntax in the FoO instead of using an ontology edit action.

Object creations are parallelized with promises in TypeScript.
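For anyone following along, the fan-out step can be sketched like this — plain TypeScript with a placeholder createObject standing in for the platform's Objects.create() call, and a hypothetical struct shape (your actual struct fields and object type will differ):

```typescript
// Struct shape returned by the Logic function: one entry per object
// to create. Field names here are hypothetical.
interface ProductStruct {
  id: string;
  name: string;
}

// Placeholder for the platform object-creation call (e.g. Objects.create()
// inside a Foundry ontology edit function). Here it just reports the edit.
async function createObject(s: ProductStruct): Promise<string> {
  return `created ${s.id}: ${s.name}`;
}

// The ontology edit function receives the structs from the Logic function
// and kicks off all creations concurrently instead of one at a time.
async function applyEdits(structs: ProductStruct[]): Promise<string[]> {
  return Promise.all(structs.map((s) => createObject(s)));
}
```

Note that, as mentioned upthread, the platform buffers ontology edits and applies them once at the end of the function, so the parallelism mainly helps with the per-call work before the edits are committed.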

