Hi SriramN
Thanks for the detailed question. This is a common observation when using Azure AI Foundry agents with the Teams connector.
There is currently no setting in the AI Foundry agent, the Teams connector, or Bot Service to do any of the following: limit conversation history to the last N turns, reset the conversation automatically after a timeout, or prevent tool responses from being stored in history.
This behavior is by design with the current connector implementation. The full conversation history is maintained and sent with each message, and the size is controlled only by the model context limits. [ai.azure.com]
Why this is happening:
When your agent is used in Teams, it keeps reusing the same conversation thread. Every time you send a message, the entire history is sent again to the model. If your agent includes large tool responses or long chats, the total tokens keep increasing and eventually impact performance.
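To make the growth concrete, here is a rough sketch of how the tokens sent per request accumulate when the whole history is resent on every turn. The 4-characters-per-token ratio and the helper names `estimate_tokens` and `tokens_sent_per_turn` are assumptions for illustration only, not the tokenizer or accounting the service actually uses.

```python
def estimate_tokens(text):
    # Rough heuristic (assumption): about 4 characters per token.
    return max(1, len(text) // 4)


def tokens_sent_per_turn(turn_texts):
    """Approximate tokens sent on each turn when the full history
    accumulated so far is resent with every new message."""
    sent = []
    running = 0
    for text in turn_texts:
        running += estimate_tokens(text)
        sent.append(running)
    return sent


# One large tool response in the middle inflates every later request.
history = ["short question", "x" * 4000, "follow-up question"]
print(tokens_sent_per_turn(history))
```

Note that a single large tool response keeps being paid for on every subsequent turn, which is why the per-request cost never goes back down on its own.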
What you can do to manage this:
Since there is no built-in control, the solution is to manage conversation history from your application or agent logic.
Here are practical approaches:
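Keep only the last few messages instead of sending the full conversation
def trim_messages(messages, max_turns=3):
    # Keeps the last max_turns messages. messages[-0:] would return the
    # whole list, so treat 0 (or less) as "keep nothing".
    return messages[-max_turns:] if max_turns > 0 else []
Call this before sending the request to the model.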
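The slice above counts individual messages, so it can cut a user/assistant exchange in half or drop the system prompt. A minimal sketch of a turn-aware variant follows, assuming messages are dicts with `role` and `content` keys as in the snippets in this answer; `trim_to_turns` is a hypothetical helper name, not part of any SDK.

```python
def trim_to_turns(messages, max_turns=3):
    """Keep all system messages plus the last max_turns user turns.
    A turn is a user message and everything after it up to the next
    user message (assistant replies, tool results)."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    if max_turns <= 0:
        return system
    # Indices in `rest` where a new user turn starts.
    starts = [i for i, m in enumerate(rest) if m["role"] == "user"]
    if len(starts) <= max_turns:
        return system + rest
    return system + rest[starts[-max_turns]:]
```

This way the model always sees complete exchanges and the original instructions, at the cost of a slightly less predictable message count.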
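Summarize the older conversation
Replace the long history with a short summary
def summarize_history(messages):
    # Placeholder text; in practice, generate the summary from the older messages
    summary = "User discussed earlier context. Keep latest question."
    recent = messages[-2:]  # keep the last two messages verbatim
    return [{"role": "system", "content": summary}] + recent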
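The fixed summary string above is only a placeholder. In practice the summary would come from a separate model call or some other summarizer. Here is a sketch with the summarizer passed in as a function, so no particular SDK is assumed; `compact_history`, `naive_summarize`, `keep_recent`, and `max_messages` are illustrative names.

```python
def compact_history(messages, summarize, keep_recent=2, max_messages=10):
    """If the history exceeds max_messages, replace everything except the
    last keep_recent messages with one system message holding a summary."""
    if len(messages) <= max_messages:
        return list(messages)
    split = len(messages) - keep_recent
    older, recent = messages[:split], messages[split:]
    summary_text = summarize(older)  # e.g. a separate, cheaper model call
    summary_message = {
        "role": "system",
        "content": "Summary of earlier conversation: " + summary_text,
    }
    return [summary_message] + recent


def naive_summarize(older):
    # Stand-in summarizer (assumption): first 80 characters of each
    # older user message, joined together.
    return " | ".join(m["content"][:80] for m in older if m["role"] == "user")
```

You would call `compact_history(messages, naive_summarize)` before each request and swap `naive_summarize` for a real summarization call once the basic flow works.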
Split into new sessions
If the conversation becomes large, reset it manually
messages = []
messages.append({"role": "user", "content": "Start fresh question"})
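Since there is no automatic reset after a timeout, one option is to do the reset in your own code. The sketch below assumes your application owns the message list; the `Session` class, its parameters, and the default thresholds are hypothetical and not part of the Teams connector or Bot Service.

```python
import time


class Session:
    """Holds the message list and starts fresh after idle_seconds of
    inactivity or once max_messages has been reached."""

    def __init__(self, idle_seconds=1800, max_messages=40, clock=time.monotonic):
        self.idle_seconds = idle_seconds
        self.max_messages = max_messages
        self.clock = clock  # injectable so the timeout can be tested
        self.messages = []
        self.last_activity = clock()

    def add_user_message(self, content):
        now = self.clock()
        expired = now - self.last_activity > self.idle_seconds
        too_long = len(self.messages) >= self.max_messages
        if expired or too_long:
            self.messages = []  # start a new session
        self.messages.append({"role": "user", "content": content})
        self.last_activity = now
        return self.messages
```

Keeping one `Session` per Teams conversation ID would give each chat its own timeout, though how you map conversations to sessions depends on your bot code.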
Avoid large tool outputs in history
Store a short note instead of the full tool response
messages.append({
    "role": "assistant",
    "content": "Tool executed successfully. Key result captured."
})
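If the model still needs part of the tool result later, a middle ground is to store a truncated copy rather than a fixed note. A small sketch, where `compact_tool_output` and the 500-character default are assumptions chosen for illustration:

```python
def compact_tool_output(output, max_chars=500):
    """Return the output unchanged if it is short, otherwise the first
    max_chars characters plus a note on how much was dropped."""
    if len(output) <= max_chars:
        return output
    dropped = len(output) - max_chars
    return output[:max_chars] + f"... [truncated {dropped} characters]"


messages = []
raw_tool_output = "row data " * 1000  # stand-in for a large tool response
messages.append({
    "role": "assistant",
    "content": compact_tool_output(raw_tool_output),
})
```

Character-based truncation is crude; for structured results, picking out only the fields you need is usually a better fit.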
Send only required context
Prepare minimal input before model call
final_messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    *trim_messages(messages, max_turns=3),
]
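Because the only hard limit is the model context size, you may prefer to trim by an approximate token budget rather than by message count. A sketch under the same rough 4-characters-per-token assumption; `build_request` and `token_budget` are hypothetical names, and a real tokenizer would give more accurate counts.

```python
def build_request(messages, system_prompt, token_budget=2000):
    """Keep the newest messages that fit within token_budget (rough
    estimate) and always include the system prompt."""

    def estimate(text):
        return max(1, len(text) // 4)  # assumption: ~4 characters per token

    remaining = token_budget - estimate(system_prompt)
    kept = []
    # Walk from newest to oldest and stop at the first message that does not fit.
    for message in reversed(messages):
        cost = estimate(message["content"])
        if cost > remaining:
            break
        kept.append(message)
        remaining -= cost
    kept.reverse()  # restore chronological order
    return [{"role": "system", "content": system_prompt}] + kept
```

Stopping at the first message that does not fit keeps the retained history contiguous, so the model never sees a conversation with gaps in the middle.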
In short: the Teams connector keeps the full history by default and does not trim or reset it automatically. There is no configuration today to control this behavior, so managing history on your side by trimming, summarizing, or resetting is the recommended approach.
This will help reduce token usage, improve response time, and avoid timeout issues.
I hope this helps. Do let me know if you have any further queries.
If this answers your query, please click Accept Answer and Yes for "Was this answer helpful".
Thank you!