Using MemoryCache to Store Semantic Kernel Conversation Memory in a LINE Bot

Author: DD - September 15, 2025

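When you build an AI agent or customer-service bot on LINE, memory is one of the most important features. Users expect the bot to "remember the previous sentence" and carry on the conversation in context.

Semantic Kernel provides the ChatHistory object to maintain conversational context. It records the sequence of system / user / assistant messages, and when that sequence is passed to a large language model (LLM), the model can produce replies that fit the context.

There is one problem, though:
the LINE Bot Webhook API is stateless. Every incoming message event is handled by a new Controller instance, and nothing automatically keeps the previous ChatHistory for you.

So if we want Semantic Kernel to remember the conversation, we have to design a separate "memory storage mechanism".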

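A solution for short-term memory: MemoryCache

There are many ways to do this, but suppose your scenario looks like this:

- AI agent / QA customer service
- A conversation usually ends within half an hour

In that case you do not need a full database; the MemoryCache built into .NET is enough.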

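MemoryCache characteristics

- Data is stored in the server's memory
- You can set a sliding expiration (SlidingExpiration) → the entry is evicted after a period without interaction
- You can set an absolute expiration (AbsoluteExpiration) → a maximum lifetime even if the user keeps interacting (to avoid high cost or exceeding the token limit)
- It is fast. But because the data lives in server memory, it is lost when the application restarts or is redeployed, and it is not shared between servers in a multi-server HA setup.

That still makes it a reasonable fit for "short-term memory" scenarios.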

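Project structure

Below is how this mechanism is integrated into a LINE Bot WebAPI project. We created three files:

- Controllers/LineBotChatGPTWebHookController.cs (handles the LINE Webhook)
- Controllers/ChatCompletion.cs (uses Semantic Kernel to generate the AI reply)
- Controllers/ChatHistoryMemoryStore.cs (stores the short-term conversation memory)

The complete source code is at:
https://github.com/isdaviddong/LineBotWithMemory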

Using it in the Webhook Controller

Let's first look at how the Webhook Controller uses it. When LINE sends a message, the flow is roughly:

First, get the user's ChatHistory

var history = _store.GetOrCreate(LineEvent.source.userId);

Call Semantic Kernel and pass in the history

var chatGPT = new ChatGPT();
var responseMsg = chatGPT.getResponseFromGPT(LineEvent, history);

Append this turn's messages to the ChatHistory

_store.AppendUser(LineEvent.source.userId, LineEvent.message.text);
_store.AppendAssistant(LineEvent.source.userId, responseMsg);

Reply

this.ReplyMessage(LineEvent.replyToken, responseMsg);

The _store used in the code above is the part that holds the memory. We define an IChatHistoryStore interface and implement it with MemoryCache:

using Microsoft.Extensions.Caching.Memory;
using Microsoft.SemanticKernel.ChatCompletion;

public class ChatHistoryMemoryStore : IChatHistoryStore
{
    private readonly IMemoryCache _cache;
    private readonly MemoryCacheEntryOptions _opts;
    private const int MaxMessagesPerUser = 24; // keep only the latest 24 messages so token usage does not spike

    public ChatHistoryMemoryStore(IMemoryCache cache)
    {
        _cache = cache;
        _opts = new MemoryCacheEntryOptions()
            .SetSlidingExpiration(TimeSpan.FromMinutes(30)) // evict after 30 minutes without interaction
            .SetAbsoluteExpiration(TimeSpan.FromHours(6));  // keep for at most 6 hours after creation
    }

    // Returns the cached history for this user, creating an empty one on first use.
    public ChatHistory GetOrCreate(string userId)
    {
        return _cache.GetOrCreate(userId, entry =>
        {
            entry.SetOptions(_opts);
            return new ChatHistory();
        })!;
    }

    public void AppendUser(string userId, string text)
    {
        var h = GetOrCreate(userId);
        h.AddUserMessage(text);
        Trim(h);
    }

    public void AppendAssistant(string userId, string text)
    {
        var h = GetOrCreate(userId);
        h.AddAssistantMessage(text);
        Trim(h);
    }

    public void Reset(string userId) => _cache.Remove(userId);

    private static void Trim(ChatHistory h)
    {
        if (h.Count > MaxMessagesPerUser)
        {
            var tail = h.Skip(Math.Max(0, h.Count - MaxMessagesPerUser)).ToList();
            h.Clear();
            foreach (var m in tail) h.Add(m);
        }
    }
}
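
The post does not show the IChatHistoryStore interface or how the store reaches the controller. The following is only a sketch: the interface is inferred from the public members of ChatHistoryMemoryStore, and the Program.cs wiring is an assumption about a typical ASP.NET Core project (with implicit usings enabled), not something taken from the repository. It relies on the ChatHistoryMemoryStore class defined above being in the same project.

```csharp
// Program.cs (hypothetical wiring, assuming the ASP.NET Core Web SDK's implicit usings)
using Microsoft.SemanticKernel.ChatCompletion;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddControllers();

// Registers IMemoryCache so it can be injected into ChatHistoryMemoryStore.
builder.Services.AddMemoryCache();

// Singleton: a single store instance is shared by every request.
builder.Services.AddSingleton<IChatHistoryStore, ChatHistoryMemoryStore>();

var app = builder.Build();
app.MapControllers();
app.Run();

// Interface inferred from the members ChatHistoryMemoryStore implements.
public interface IChatHistoryStore
{
    ChatHistory GetOrCreate(string userId);
    void AppendUser(string userId, string text);
    void AppendAssistant(string userId, string text);
    void Reset(string userId);
}
```

With a registration like this, the controller could receive the store through constructor injection (for example a constructor parameter of type IChatHistoryStore assigned to _store). Because the controller is created per request while the store is a singleton, the history survives between webhook calls.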

The actual call to the LLM goes through Semantic Kernel:

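using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;

public class ChatGPT
{
    const string OpenAIModel = "gpt-4o";
    const string OpenAIapiKey = "👉sk-xxxx"; // placeholder; put your own API key here

    public string getResponseFromGPT(isRock.LineBot.Event LineEvent, ChatHistory history)
    {
        var builder = Kernel.CreateBuilder()
            .AddOpenAIChatCompletion(OpenAIModel, OpenAIapiKey);
        Kernel kernel = builder.Build();

        // System prompt ("You are an AI assistant that can answer the user's questions")
        var SysPrompt = @"你是一個 AI 助理，可以回答使用者的問題";

        // Build the request on a copy: the cached history stays owned by the store,
        // so the system prompt is not re-added on every turn and the user message
        // is not stored twice when the controller later calls AppendUser.
        var request = new ChatHistory(SysPrompt);
        request.AddRange(history);
        request.AddUserMessage(LineEvent.message.text);

        OpenAIPromptExecutionSettings execSettings = new()
        {
            ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions,
        };

        var chatCompletionService = kernel.GetRequiredService<IChatCompletionService>();
        var result = chatCompletionService.GetChatMessageContentAsync(
            request,
            executionSettings: execSettings,
            kernel: kernel
        );

        // Blocks on the async call to keep the synchronous signature;
        // Content can be null, so fall back to an empty string.
        return result.GetAwaiter().GetResult().Content ?? string.Empty;
    }
}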

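Notice that the main program passes the previously retrieved history into getResponseFromGPT(), so when we call GetChatMessageContentAsync() the LLM already has the earlier turns of the conversation as context.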
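
To see the effect without calling a real model, here is a small console sketch. It is an illustration only: HandleTurn is a hypothetical helper, the "reply" is a canned string instead of an LLM call, and the user IDs are made up. It shows that two separate "webhook calls" for the same user end up sharing one ChatHistory through the cache, while a different user gets a separate one.

```csharp
using System;
using Microsoft.Extensions.Caching.Memory;
using Microsoft.SemanticKernel.ChatCompletion;

var cache = new MemoryCache(new MemoryCacheOptions());

// Hypothetical stand-in for one webhook call: fetch (or create) the user's
// history from the cache, then record the user message and a canned reply.
ChatHistory HandleTurn(string userId, string text)
{
    var history = cache.GetOrCreate(userId, entry =>
    {
        entry.SlidingExpiration = TimeSpan.FromMinutes(30);
        return new ChatHistory();
    })!;

    history.AddUserMessage(text);
    history.AddAssistantMessage($"(canned reply to: {text})");
    return history;
}

HandleTurn("U-alice", "My name is Alice.");
var second = HandleTurn("U-alice", "What is my name?");
var other = HandleTurn("U-bob", "Hello");

// The second call for the same user sees both turns: 2 user + 2 assistant messages.
Console.WriteLine(second.Count); // 4
// A different user ID maps to a different cache entry.
Console.WriteLine(other.Count);  // 2

foreach (var m in second)
{
    Console.WriteLine($"{m.Role}: {m.Content}");
}
```

In the real bot, the canned reply is replaced by the Semantic Kernel call, and the accumulated messages are what give the model its context.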

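Summary

In LINE Bot development the Webhook is stateless, so we need a separate memory storage mechanism to maintain conversational context.
The ChatHistory provided by Semantic Kernel is well suited to recording a conversation, but it has to be paired with a storage strategy.

If the conversation lifetime is short (for example, half an hour to a few hours), the .NET MemoryCache is convenient. We can set a sliding expiration (evict when there is no interaction) and an absolute expiration (a maximum lifetime) to get the effect of "short-term memory". For long-term memory (across days or across machines), switch to Redis or a database instead.

With this design, your LINE Bot can hold a coherent conversation and act as a more natural AI assistant or QA customer-service agent.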

Comments

Alan wrote…

The SK package provides two classes for reducing the chat history: ChatHistoryTruncationReducer and ChatHistorySummarizationReducer.
Reference: https://learn.microsoft.com/zh-tw/semantic-kernel/concepts/ai-services/chat-completion/chat-history?pivots=programming-language-csharp

September 16, 2025, 1:19 PM
