Hi All,
The AI FDE is amazing. I love it. The one issue I’m having is that once I get to 500k context, it becomes very slow and not really usable. So I have to switch over to another FDE, which is a workable approach.
However, I wanted to flag this, as the whole point of having 1M tokens is to be able to use them all. If performance makes that untenable, what’s the point of having all those tokens? So I’m raising this in case you can fix the performance issues.
Thank you
Sam
Hi!
Although the latest advanced inference models support up to 1 million tokens, and it may seem that you can solve problems by providing a large context, this is not actually the case. Please manage your context so that it does not exceed 200,000 tokens. Beyond that, the excess context causes the “Lost in the Middle” phenomenon, resulting in a significant drop in performance and a snowballing increase in token consumption. In AI FDE, use the “Summarize above” feature to reset the context for the next session before continuing the conversation. This is a recommended practice not only in AI FDE but also in all LLM services outside of Foundry, including Claude Code and Codex.
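To make the "keep it under 200,000 tokens, then summarize" advice concrete, here is a minimal Python sketch of the idea. It is not how AI FDE implements “Summarize above”. The names `estimate_tokens`, `compact_history`, and the `summarize` callback are hypothetical, and the 4-characters-per-token estimate is a rough assumption rather than a real tokenizer.

```python
from typing import Callable, List

TOKEN_BUDGET = 200_000  # the limit suggested in the reply above


def estimate_tokens(text: str) -> int:
    """Rough heuristic (assumption): about 4 characters per token."""
    return max(1, len(text) // 4)


def compact_history(
    messages: List[str],
    summarize: Callable[[List[str]], str],
    budget: int = TOKEN_BUDGET,
    keep_recent: int = 2,
) -> List[str]:
    """Replace older messages with one summary once the budget is exceeded.

    `summarize` is a caller-supplied function (hypothetical here) that turns
    a list of messages into a short summary string.
    """
    total = sum(estimate_tokens(m) for m in messages)
    if total <= budget or len(messages) <= keep_recent:
        return messages  # still within budget, or nothing older to fold away

    split = len(messages) - keep_recent
    older, recent = messages[:split], messages[split:]
    # The summary stands in for everything before the most recent messages.
    return [summarize(older)] + recent
```

In practice you would call `compact_history` before each new request, passing a `summarize` function that asks the model for a short recap, so the next session starts from the summary plus the last few messages instead of the full history.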
Thanks for your note!
We’re shipping improvements to address perf issues in longer sessions. We’re attacking it from a few angles, including virtualizing the main chat list and improving memoization in our state management.
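For readers unfamiliar with the two techniques mentioned, here is a small Python sketch of the general ideas only. It says nothing about the actual AI FDE front end. The function names, the fixed row height, and the `overscan` parameter are all assumptions made for illustration.

```python
from functools import lru_cache
from typing import Tuple


def visible_range(
    scroll_top: int,
    viewport_height: int,
    row_height: int,
    total_rows: int,
    overscan: int = 3,
) -> Tuple[int, int]:
    """Virtualization: return the half-open [start, end) rows worth rendering.

    Assumes every row has the same height (a simplification).
    """
    first = scroll_top // row_height
    last = (scroll_top + viewport_height - 1) // row_height  # last row touching the viewport
    start = min(max(0, first - overscan), total_rows)
    end = min(total_rows, last + 1 + overscan)
    return start, max(start, end)


@lru_cache(maxsize=1024)
def render_message(message_id: int, text: str) -> str:
    """Memoization: identical inputs reuse the cached result instead of re-rendering."""
    return f"<div data-id='{message_id}'>{text}</div>"
```

With 10,000 messages, a 600-pixel viewport, and 50-pixel rows, only a few dozen rows are rendered at any scroll position, which is why virtualization helps long sessions stay responsive.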
I also want to note that model performance degrades as the context grows, so in many cases using double the tokens means your model is working at roughly half capacity, even if the model allows for a larger context window.
Amazing, thanks for this. I learnt something new today, so I appreciate it.
Where is this ‘summarise above’ feature? I couldn’t find it, and it would help solve my issue, since I could then just start a new session.
Once the context has filled with tokens to a certain extent, hover over the area where you were chatting and you’ll see a button that says “Summarize above.”