Maintain Context with Multi-Turn Conversations

Implement state management using agent sessions to handle follow-up questions and preserve history.

Overview

In the previous tutorials, Jordan Miller built an agent that could use tools to check service health. However, each interaction was a “one-and-done” affair. If Jordan were to add a follow-up turn to the investigation, the agent would lose the context:

// Turn 1: (Already implemented in Module 2)
// await foreach (var update in agent.RunStreamingAsync(incident)) { ... }

// Turn 2: (Add this follow-up)
Console.WriteLine(await agent.RunAsync("Who is on call for that?")); 

// Result: "I need to know which service you are referring to in order to tell you who the on-call engineer is."

In a high-pressure incident, Jordan needs to be able to iterate—asking for more details or refining a search without repeating the entire history. This guide walks you through implementing Memory. You’ll use an Agent Session to preserve the conversation state, allowing Jordan’s assistant to maintain context across multiple turns.

Agent Anatomy

An AI Agent uses Memory to maintain a “thread” of conversation. Without it, every message is treated as the very first time the agent has ever met you.

🎭 Persona: Jordan’s on-call identity.
🧠 Brain: Reasoning about alerts.
🛠️ Tools: External capabilities.
💾 Memory (Building): State and history.
☁️ Hosting (Upcoming): Exposing as a service.

Understanding the Context Window

While the Agent Session manages history, every LLM has a finite Context Window: a hard limit on how many tokens it can process at once. As your conversation grows, the entire thread (Persona + History + Input) must fit within this window. If it overflows, the agent will begin to “forget” the earliest parts of the chat.

Coming Soon: We’ll explore Compaction and Persistence in the next module.
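To make the “forgetting” effect concrete, here is a minimal sketch of one possible trimming strategy: keep the persona and the new input, then add history from newest to oldest until a token budget is reached. It is a standalone illustration, not what the Agent Framework does internally. The helper names `EstimateTokens` and `FitToWindow` are hypothetical, and the four-characters-per-token estimate is a rough assumption that real tokenizers will not match exactly.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: assume roughly one token per four characters.
// Real tokenizers differ by model; this is only a stand-in for illustration.
static int EstimateTokens(string text) => Math.Max(1, text.Length / 4);

// Hypothetical helper: keep the persona, the new input, and as many of the
// most recent history messages as fit in the budget. Older messages drop first.
static List<string> FitToWindow(string persona, IReadOnlyList<string> messages, string input, int maxTokens)
{
    int used = EstimateTokens(persona) + EstimateTokens(input);
    var kept = new List<string>();

    // Walk backwards so the newest turns are retained first.
    for (int i = messages.Count - 1; i >= 0; i--)
    {
        int cost = EstimateTokens(messages[i]);
        if (used + cost > maxTokens) break;
        used += cost;
        kept.Insert(0, messages[i]);
    }

    var result = new List<string> { persona };
    result.AddRange(kept);
    result.Add(input);
    return result;
}

var history = new List<string>
{
    "user: Checkout latency is above threshold in West Europe (4.8s).",
    "assistant: Checkout is degraded. Suggest checking payment retries.",
    "user: Who is on call for that?",
    "assistant: Taylor Vance (@tvance) is on call."
};

var window = FitToWindow(
    "You are an incident triage assistant.",
    history,
    "user: Summarize our discussion.",
    maxTokens: 40);

foreach (var line in window) Console.WriteLine(line);
```

With this deliberately small budget, only the two most recent history messages fit, so the original incident report never reaches the model. That is the situation the Context Window limit creates in a long conversation.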

Set up your environment

If you are continuing from the previous tutorial, you can use your existing project. Otherwise, follow the steps below to initialize a new one.

📋 Pre-flight Checklist

🛠️ .NET 10.0 SDK (or later) installed.
🤖 AI Provider: Access to Azure OpenAI or a local service (Ollama/LM Studio).
💾 Session Management: We will use ChatClientAgentSession to handle history.

1 Install required packages

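We are using the same core packages as the previous modules. Use the first set of commands for an OpenAI-compatible endpoint (including local services such as Ollama or LM Studio) and the second set for Azure OpenAI.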

dotnet add package Microsoft.Agents.AI.OpenAI
dotnet add package OpenAI
dotnet restore

Package Anatomy

🔌Microsoft.Agents.AI.OpenAI

The core Agent Framework package. It provides the AsAIAgent extension and state management abstractions.

💾OpenAI

The official OpenAI client. We use this to establish the ChatClient which the agent uses for multi-turn reasoning.

dotnet add package Microsoft.Agents.AI.OpenAI
dotnet add package Azure.AI.OpenAI
dotnet add package Azure.Identity
dotnet restore

Package Anatomy

🔌 Microsoft.Agents.AI.OpenAI

The core Agent Framework package. It provides the AsAIAgent extension and state management abstractions.

☁️ Azure.AI.OpenAI

The official Azure SDK for OpenAI. We use this to create the ChatClient for your Azure OpenAI deployment.

🔑 Azure.Identity

Provides DefaultAzureCredential, which the Azure version of the code uses to authenticate without an API key.

Build the agent

We will now implement a two-turn conversation. In the first turn, we tell the agent about an incident. In the second turn, we ask a follow-up without repeating any details.

sequenceDiagram
    participant App as Console App
    participant Session as 💾 Session
    participant Agent as 🎭 Persona
    participant Brain as 🧠 Brain
    
    App->>Agent: CreateSessionAsync()
    Agent-->>Session: New Session Object
    
    Note over App,Session: Turn 1
    App->>Agent: RunAsync("Checkout latency in West Europe", Session)
    Agent->>Brain: Persona + History (Empty) + Input
    Brain-->>Agent: "I see checkout is degraded..."
    Agent->>Session: Store Turn 1
    Agent-->>App: Response 1
    
    Note over App,Session: Turn 2
    App->>Agent: RunAsync("Who is on call for that?", Session)
    Agent->>Brain: Persona + History (Turn 1) + Input
    Brain-->>Agent: "Taylor Vance is on call for West Europe."
    Agent->>Session: Store Turn 2
    Agent-->>App: Response 2

1 Implement Multi-Turn Logic
🎭 Persona

💾 Memory

We use agent.CreateSessionAsync() to generate a state container. When we call RunAsync or RunStreamingAsync, we pass this session as a parameter. The Agent Framework automatically handles the injection of previous messages into the LLM’s context.
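As a mental model for that injection step, the following standalone sketch mimics what a session does conceptually: it stores each completed turn and prepends the stored messages to the next request. Everything here is hypothetical and for illustration only (`toySession`, `BuildRequest`, and `RecordTurn` are not Agent Framework APIs), and the model reply is a placeholder string rather than a real LLM call.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a session: an ordered list of prior messages.
var toySession = new List<(string Role, string Text)>();

// Builds what the model would receive for one turn:
// instructions first, then stored history, then the new input.
static List<(string Role, string Text)> BuildRequest(
    string instructions,
    List<(string Role, string Text)> sessionHistory,
    string userInput)
{
    var request = new List<(string Role, string Text)> { ("system", instructions) };
    request.AddRange(sessionHistory);
    request.Add(("user", userInput));
    return request;
}

// Appends a completed turn so that later requests can include it.
static void RecordTurn(List<(string Role, string Text)> sessionHistory, string userInput, string reply)
{
    sessionHistory.Add(("user", userInput));
    sessionHistory.Add(("assistant", reply));
}

const string Instructions = "You are an enterprise incident triage assistant.";
const string FirstInput = "Checkout latency is above threshold in West Europe (4.8s).";

// Turn 1: history is empty, so the request holds only instructions + input.
var turn1Request = BuildRequest(Instructions, toySession, FirstInput);
RecordTurn(toySession, FirstInput, "(placeholder model reply for turn 1)");

// Turn 2: the same session now carries turn 1, so "that" has something to refer to.
var turn2Request = BuildRequest(Instructions, toySession, "Who is on call for that?");

Console.WriteLine($"Turn 1 request: {turn1Request.Count} messages");
Console.WriteLine($"Turn 2 request: {turn2Request.Count} messages");
```

The first request contains 2 messages and the second contains 4, because the second one replays the stored turn ahead of the new question. Passing no session corresponds to always calling `BuildRequest` with an empty history.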

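Replace the contents of Program.cs with the listing that matches your provider. The first listing targets an OpenAI-compatible endpoint; the second targets Azure OpenAI.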

using OpenAI;
using System.ClientModel;
using Microsoft.Agents.AI;
using Microsoft.Extensions.AI; // AIFunctionFactory
using OpenAI.Chat;
using System.ComponentModel;

// 1. Configure the Provider
// The defaults target a local OpenAI-compatible server (such as LM Studio).
// Local servers typically ignore the API key, so a placeholder value is used here.
var endpoint = Environment.GetEnvironmentVariable("OPENAI_ENDPOINT") ?? "http://localhost:1234/v1";
var modelName = Environment.GetEnvironmentVariable("OPENAI_MODEL_NAME") ?? "google/gemma-4-e4b";
var chatClient = new OpenAIClient(new ApiKeyCredential("dummy"), new OpenAIClientOptions { Endpoint = new Uri(endpoint) })
    .GetChatClient(modelName);

// 2. Initialize the Agent with Tools

AIAgent agent = chatClient.AsAIAgent(new ChatClientAgentOptions
{
    Name = "TriageAgent",
    ChatOptions = new()
    {
        Instructions = """
        You are an enterprise incident triage assistant.
        Summarize the incident, identify likely severity, 
        and suggest the next investigation step.
        Always address the operator by their name and use their role to tailor your response.
        Keep answers concise and operational.
        """,
        Tools = [
            AIFunctionFactory.Create(GetServiceStatus, "GetServiceStatus"),
            AIFunctionFactory.Create(GetOnCallEngineer, "GetOnCallEngineer")
        ]
    }
});

// 3. Create the Session (Memory)
AgentSession session = await agent.CreateSessionAsync();

// 4. Turn 1: Providing Context
Console.WriteLine("--- Turn 1 ---");
var turn1 = "Checkout latency is above threshold in West Europe (4.8s).";
await foreach (var update in agent.RunStreamingAsync(turn1, session))
{
    Console.Write(update);
}
Console.WriteLine("\n");

// 5. Turn 2: Follow-up (Relies on Memory)
Console.WriteLine("--- Turn 2 ---");
var turn2 = "Who is on call for that?";
await foreach (var update in agent.RunStreamingAsync(turn2, session))
{
    Console.Write(update);
}
Console.WriteLine();

// Tool Definitions (from Module 2)
[Description("Gets the current health status for an enterprise service.")]
static string GetServiceStatus(
    [Description("The service name to check, such as checkout, payments, or inventory.")] string serviceName)
{
    return serviceName.ToLowerInvariant() switch
    {
        "checkout" => "Checkout is DEGRADED in West Europe. P95 latency is 4.8s. Payment retries are elevated.",
        "payments" => "Payments is HEALTHY. No active regional alerts.",
        "inventory" => "Inventory is HEALTHY. Last sync 2 minutes ago.",
        _ => $"{serviceName} has no active status record in the demo store."
    };
}

[Description("Gets the name of the engineer currently on-call.")]
static string GetOnCallEngineer(
    [Description("The service name to check.")] string serviceName) => "Taylor Vance (@tvance)";

using Azure.AI.OpenAI;
using Azure.Identity;
using Microsoft.Agents.AI;
using Microsoft.Extensions.AI; // AIFunctionFactory
using OpenAI.Chat;
using System.ComponentModel;

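// 1. Configure the Provider
// Fail fast with a clear message if the required settings are missing.
var endpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")
    ?? throw new InvalidOperationException("Set the AZURE_OPENAI_ENDPOINT environment variable.");
var deploymentName = Environment.GetEnvironmentVariable("AZURE_OPENAI_DEPLOYMENT_NAME")
    ?? throw new InvalidOperationException("Set the AZURE_OPENAI_DEPLOYMENT_NAME environment variable.");
var chatClient = new AzureOpenAIClient(new Uri(endpoint), new DefaultAzureCredential())
    .GetChatClient(deploymentName);

// 2. Initialize the Agent with Tools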
AIAgent agent = chatClient.AsAIAgent(new ChatClientAgentOptions
{
    Name = "TriageAgent",
    ChatOptions = new()
    {
        Instructions = """
        You are an enterprise incident triage assistant.
        Summarize the incident, identify likely severity, 
        and suggest the next investigation step.
        Always address the operator by their name and use their role to tailor your response.
        Keep answers concise and operational.
        """,
        Tools = [
            AIFunctionFactory.Create(GetServiceStatus, "GetServiceStatus"),
            AIFunctionFactory.Create(GetOnCallEngineer, "GetOnCallEngineer")
        ]
    }
});

// 3. Create the Session (Memory)
AgentSession session = await agent.CreateSessionAsync();

// 4. Turn 1: Providing Context
Console.WriteLine("--- Turn 1 ---");
var turn1 = "Checkout latency is above threshold in West Europe (4.8s).";
await foreach (var update in agent.RunStreamingAsync(turn1, session))
{
    Console.Write(update);
}
Console.WriteLine("\n");

// 5. Turn 2: Follow-up (Relies on Memory)
Console.WriteLine("--- Turn 2 ---");
var turn2 = "Who is on call for that?";
await foreach (var update in agent.RunStreamingAsync(turn2, session))
{
    Console.Write(update);
}
Console.WriteLine();

// Tool Definitions (from Module 2)
[Description("Gets the current health status for an enterprise service.")]
static string GetServiceStatus(
    [Description("The service name to check, such as checkout, payments, or inventory.")] string serviceName)
{
    return serviceName.ToLowerInvariant() switch
    {
        "checkout" => "Checkout is DEGRADED in West Europe. P95 latency is 4.8s. Payment retries are elevated.",
        "payments" => "Payments is HEALTHY. No active regional alerts.",
        "inventory" => "Inventory is HEALTHY. Last sync 2 minutes ago.",
        _ => $"{serviceName} has no active status record in the demo store."
    };
}

[Description("Gets the name of the engineer currently on-call.")]
static string GetOnCallEngineer(
    [Description("The service name to check.")] string serviceName) => "Taylor Vance (@tvance)";

With everything in place, execute the application from your terminal:

dotnet run

Try it

Experiment with how the agent maintains (or loses) state by modifying the session usage.

Deep History

Add a third turn to your code that asks for a summary of the entire conversation so far.

Console.WriteLine("--- Turn 3 ---");
await foreach (var update in agent.RunStreamingAsync("Summarize our discussion.", session))
{
    Console.Write(update);
}

Result: The agent will recount the specific incident details and the follow-up question you asked.

Break the Memory

Try running the second turn without passing the session object:

// Turn 2 without session
await foreach (var update in agent.RunStreamingAsync(turn2))
{
    Console.Write(update);
}

Result: Without the session, the agent has no record of Turn 1 and cannot tell what “that” refers to. It will typically ask for clarification, responding with something like “I need to know which service you are referring to in order to tell you who the on-call engineer is.”

Enforce a Protocol

Memory doesn’t just store facts; it stores behavioral rules established during the conversation. Update your turn1 variable in Program.cs to include a specific protocol:

var turn1 = """
Checkout latency is above threshold in West Europe (4.8s).
Also, for any incident in West Europe, you must always include 
a 'Directive' section at the end of your response that says 'Notify @emea-oncall'.
""";

Then, in Turn 2, ask a follow-up about the latency that does not mention the protocol.

Result: The agent will remember the protocol established in the first turn and apply it to the second, showing that the Session maintains the “evolved” persona as well as the incident data.

Summary and Next Steps

You’ve successfully implemented state management! By using Sessions, your agent can now hold intelligent, multi-turn conversations that feel natural and context-aware.

While our agent now has a memory, it’s currently volatile. Because the session lives only in the application’s in-memory state, an application restart or a system crash wipes the slate clean. In a real-world outage, an investigation might span days or involve multiple team members across different shifts.

For example, if you restart your app and ask a follow-up, the memory is gone:

// Session 1: Investigation begins...
// [App Exit]

// Session 2: A new engineer joins...
Console.WriteLine(await agent.RunAsync("Give me a summary of the incident so far."));

// Result: "I'm sorry, I don't have any record of an ongoing incident. Could you provide the details?"
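The volatility described above comes down to where the history lives. As a rough, standalone illustration (not the framework’s serialization API, which the next tutorial covers), the sketch below writes a toy transcript to a JSON file and reads it back after a simulated restart. The `transcript` list and the file name `triage-session-demo.json` are hypothetical stand-ins chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Hypothetical stand-in for session history; not the framework's session type.
var transcript = new List<string>
{
    "user: Checkout latency is above threshold in West Europe (4.8s).",
    "assistant: Checkout is degraded. Suggest checking payment retries."
};

// Write the history to disk so it outlives the process.
var path = Path.Combine(Path.GetTempPath(), "triage-session-demo.json");
File.WriteAllText(path, JsonSerializer.Serialize(transcript));

// Simulate a restart: the in-memory list is gone, only the file remains.
transcript = null;

// A "new" process reloads the history before asking the next question.
var restored = JsonSerializer.Deserialize<List<string>>(File.ReadAllText(path))
    ?? new List<string>();

Console.WriteLine($"Restored {restored.Count} messages");

// Clean up the demo file.
File.Delete(path);
```

Running this prints "Restored 2 messages". The in-memory copy is lost at the simulated restart, but the copy on disk can be handed to whoever picks up the investigation next.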

In the next tutorial, Persist Conversations and Smart Memory, we will solve this. We will learn how to serialize our session history and use AIContextProviders to extract structured facts that survive application restarts.