Summary: The revamped Model Selector simplifies the process of choosing the optimal Large Language Model (LLM) for a use case by providing a user-friendly interface with comparative analysis, metadata insights, and performance indicators.
Why did we build this?
In the ever-evolving landscape of Large Language Models (LLMs), selecting the right model can be a daunting task. Previously, users found it challenging to gather comprehensive information about models without navigating away from the tools provided on Foundry. Our newly introduced Model Selector addresses this challenge by providing a streamlined interface that enables users to choose the best LLM for their specific workflow.
What has changed?
The Model Selector UI tool provides users with a dropdown to select an LLM for a resource, such as the “Use LLM” tool in Pipeline Builder or AIP Logic.
Some changes to highlight are:
- A smarter list of recommended LLMs, directing users towards the latest flagship models.
- Exposing metadata such as Context Window, Training Data, and Capability to help users determine whether a model fits their workflow's demands.
- Four performance indicators (Model Class, Cost, Speed, and Availability) to help users compare models against each other.
- Similar Models to the currently selected model, allowing for easy switching between models as needed.
What does the Model Selector offer?
The Model Selector is designed to enhance your decision-making process by offering:
- Comparative Analysis: Easily compare models through relational attributes, helping you identify the best fit for your specific use case.
- Metadata Driven Insights: Gain immediate access to detailed information about each model’s capabilities, context window, and training data.
- Similar Models: Quickly find alternative options with similar attributes, ensuring you have a wide range of choices at your fingertips.
What does each indicator actually mean?
Model class
A Lightweight model is typically characterized by faster, less computationally intensive processing, making it ideal for smaller tasks. In contrast, a Heavyweight model is more resource-intensive and designed to handle complex tasks with a higher degree of accuracy and depth. A Reasoning model is specialized for tasks that require logical inference and decision-making, excelling in applications that demand understanding and manipulation of complex relationships and abstract concepts. Each model class serves distinct purposes, balancing trade-offs between performance, resource consumption, and task complexity.
Cost
Cost measures the average expense of processing input and generating output tokens. Lower-cost models have a value of Low, while more expensive models have a value of High.
Speed
Speed measures the average time it takes a model to generate output tokens back to the user. Faster models have a value of High, while slower models have a value of Low.
Availability
Availability reflects the capacity and readiness of a model to handle requests, directly derived from its enrollment limit sizes. This metric is determined by comparing the maximum tokens per minute (TPM) and requests per minute (RPM) that each model can utilize without running into rate limits. High availability signifies that a model can consume large amounts of TPM/RPM, making it suitable for high-demand scenarios, while Low availability indicates a more limited capacity. For detailed information on each model's specific enrollment limit sizes, please refer to this documentation.
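As a rough illustration, the idea of deriving an availability tier from enrollment limits can be sketched as follows. This is a hypothetical example, not Foundry's actual logic: the function name and the TPM/RPM cutoff values are assumptions chosen purely for demonstration.

```python
# Hypothetical sketch: classify a model's availability tier from its
# enrollment limits. Thresholds are illustrative only, not Foundry's.

def availability_tier(tpm_limit: int, rpm_limit: int) -> str:
    """Map tokens-per-minute (TPM) and requests-per-minute (RPM)
    enrollment limits to a coarse availability tier."""
    # Assumed cutoffs for illustration only.
    if tpm_limit >= 500_000 and rpm_limit >= 2_000:
        return "High"
    if tpm_limit >= 100_000 and rpm_limit >= 500:
        return "Medium"
    return "Low"

# A model with generous limits suits high-demand workflows;
# one with tight limits is better reserved for lighter use.
print(availability_tier(800_000, 5_000))  # High
print(availability_tier(50_000, 100))     # Low
```

The point is simply that availability is a relative ranking derived from rate-limit capacity, which is why two models with the same quality can carry different availability values.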
How do I get started?
Simply put… just start building with AIP! In applications leveraging AIP, you’ll often see our Model Selector Dropdown, which hopefully simplifies your job of picking a model! Feel free to share thoughts, feedback, and future ideas for the Model Selector in the comments!
Happy building!