Is there a way to restrict the models a user has access to?

I understand that under AIP settings there is the option to disable a model family. But what about a specific model within the family? Many enterprises want to restrict access to older models, or sometimes newer ones, for various reasons including cost savings. Is there a way to do this? If not, is it something Palantir could configure on the enterprise's behalf?

To add on: the backup idea I thought would work (zeroing the rate limits) doesn't seem to apply to requests that are not associated with a project. I even tried temporarily zeroing the limits completely on a builder account in AIP usage and limits, then sent multiple back-to-back requests through the LLM proxy endpoints and was very surprised to see them all succeed. Are limits cached, or how else can this be governed?
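
For anyone who wants to reproduce the check, here is a minimal sketch of the back-to-back test. It assumes nothing about the actual proxy API: `send` is a hypothetical callable that you would wire to your own LLM proxy request and that returns the HTTP status code, and `fake_send` is only a stand-in so the snippet runs on its own. Treating any 2xx response as "limit not enforced" is also my assumption.

```python
from collections import Counter
from typing import Callable


def probe_limit(send: Callable[[], int], attempts: int = 5) -> Counter:
    """Call `send` back to back and tally the HTTP status codes it returns."""
    return Counter(send() for _ in range(attempts))


def limit_enforced(tally: Counter) -> bool:
    """Return True only if no request succeeded (no 2xx status in the tally)."""
    return not any(200 <= status < 300 for status in tally)


def fake_send() -> int:
    # Hypothetical stand-in: replace with a real request to your proxy
    # endpoint and return the response's status code.
    return 200


tally = probe_limit(fake_send, attempts=5)
print(tally, "enforced" if limit_enforced(tally) else "NOT enforced")
```

Spacing a second run a few minutes after changing the limit would help tell a caching delay apart from the limit simply not applying to these requests.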

Hi @CodeStrap :waving_hand:

Setting the rate limit to 0/0 for specific models is possible with backend flags (service configuration) on single-tenant Foundry stacks.

There is also an API to set model overrides for specific groups of people, and a UI is supposed to ship soon within RMA / AIP Limits. These model overrides currently support a minimum of 1% of the enrollment capacity, so they cannot completely disable a model.

We are using both together to restrict Opus usage by default and to "jail" users who have reached their monthly budget for Sonnet.

I would also like to +1 the feature request for more granular control over model enablement.