Hey everyone!
I'm currently working on building a test chatbot using langchain-go and I need to be able to flush messages from the chat history when the token size of the full prompt hits a certain limit.
To tackle this, I've been digging into the codebase and exploring options to contribute to the repo. The Python version has this nifty reference to BaseLanguageModel, which handles the logic for measuring the token size of the stored memory buffer. However, the Go version doesn't yet have anything similar built in, so I ended up using the tiktoken-go module to get the token size of my history.
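For reference, counting tokens with tiktoken-go looks roughly like this (the model name is just the one I happen to use):

package main

import (
	"fmt"
	"log"

	"github.com/pkoukk/tiktoken-go"
)

func main() {
	// pick the encoding that matches the target model
	tkm, err := tiktoken.EncodingForModel("gpt-3.5-turbo")
	if err != nil {
		log.Fatal(err)
	}
	tokens := tkm.Encode("How many tokens is this prompt?", nil, nil)
	fmt.Println(len(tokens)) // number of tokens in the string
}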
Since I'm still a bit new to Go, I was wondering whether this difference in design approach is something specific to the language or just a design choice made by the author. As far as I understand, Go gets inheritance-like behavior through struct embedding, so I created my own memory wrapper by embedding the memory.Buffer struct:
package memory

import (
	"github.com/pkoukk/tiktoken-go"

	mem "github.com/tmc/langchaingo/memory"
	"github.com/tmc/langchaingo/schema"
)

// AsaiMemory embeds the langchaingo memory.Buffer and keeps its own
// copy of the chat messages so they can be trimmed by token count.
type AsaiMemory struct {
	*mem.Buffer
	Encoding      string
	EncodingModel string
	TokenLimit    int
	Messages      []schema.ChatMessage
}
// NewAsaiMemory returns a wrapper with the defaults I use for my bot.
func NewAsaiMemory() *AsaiMemory {
	m := AsaiMemory{
		Buffer:        mem.NewBuffer(),
		Encoding:      "",
		EncodingModel: "gpt-3.5-turbo",
		TokenLimit:    2800,
	}
	return &m
}
// LoadMemory copies the messages out of the embedded buffer's chat
// history into the wrapper's own slice, which we're free to mutate.
func (m *AsaiMemory) LoadMemory() error {
	m.Messages = m.ChatHistory.Messages()
	return nil
}
// GetMemoryString renders the stored messages as a single prompt string.
func (m *AsaiMemory) GetMemoryString() string {
	bufferString, err := schema.GetBufferString(m.Messages, m.HumanPrefix, m.AIPrefix)
	if err != nil {
		return ""
	}
	return bufferString
}
// TrimContext drops the oldest messages until the encoded history
// fits within the configured token limit.
func (m *AsaiMemory) TrimContext() error {
	tkm, err := tiktoken.EncodingForModel(m.EncodingModel)
	if err != nil {
		return err
	}
	bufferString, err := schema.GetBufferString(m.Messages, m.HumanPrefix, m.AIPrefix)
	if err != nil {
		return err
	}
	bufferLength := len(tkm.Encode(bufferString, nil, nil))
	// drop the oldest message first, then re-measure
	for bufferLength > m.TokenLimit {
		m.Messages = m.Messages[1:]
		bufferString, err = schema.GetBufferString(m.Messages, m.HumanPrefix, m.AIPrefix)
		if err != nil {
			return err
		}
		bufferLength = len(tkm.Encode(bufferString, nil, nil))
	}
	return nil
}
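For context, here's roughly how I wire the wrapper into the bot. The import path and the AddUserMessage/AddAIMessage helpers on ChatHistory are assumptions on my side, just to show the flow:

package main

import (
	"fmt"
	"log"

	"example.com/asai/memory" // placeholder path for the package above
)

func main() {
	m := memory.NewAsaiMemory()

	// messages accumulate on the embedded buffer's chat history during
	// the conversation (assuming the usual add-message helpers)
	m.ChatHistory.AddUserMessage("Hi there!")
	m.ChatHistory.AddAIMessage("Hello! How can I help?")

	if err := m.LoadMemory(); err != nil {
		log.Fatal(err)
	}
	if err := m.TrimContext(); err != nil {
		log.Fatal(err)
	}

	// trimmed history, safe to stuff into the next prompt
	fmt.Println(m.GetMemoryString())
}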
I might be missing something, but the main pickle for me, and the reason I needed the wrapper in the first place, was that I couldn't access ChatHistory.messages directly since it's an unexported ("private") field. Because I couldn't find a way to pop/slice out the messages that were overflowing the token limit, I needed my own storage that I could manipulate and access.
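Concretely, the naive thing I wanted to do doesn't compile from outside the schema package:

// compile error: messages is unexported outside the schema package
m.ChatHistory.messages = m.ChatHistory.messages[1:]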
So I was wondering if it would be a good idea, and the simplest solution for now, to just extend the existing Buffer struct with something like:
// TrimContext drops the oldest messages until the encoded history fits
// within the given token limit. (Note: this sketch assumes access to the
// unexported messages field, so it would need a helper on the chat history
// type or to live alongside it.)
func (m *Buffer) TrimContext(limit int, encodingModel string) error {
	tkm, err := tiktoken.EncodingForModel(encodingModel)
	if err != nil {
		return err
	}
	bufferString, err := schema.GetBufferString(m.ChatHistory.messages, m.HumanPrefix, m.AIPrefix)
	if err != nil {
		return err
	}
	bufferLength := len(tkm.Encode(bufferString, nil, nil))
	// drop the oldest message first, then re-measure
	for bufferLength > limit {
		m.ChatHistory.messages = m.ChatHistory.messages[1:]
		bufferString, err = schema.GetBufferString(m.ChatHistory.messages, m.HumanPrefix, m.AIPrefix)
		if err != nil {
			return err
		}
		bufferLength = len(tkm.Encode(bufferString, nil, nil))
	}
	return nil
}
I'm wondering if this should be included in some form in every type of memory, since I don't see a case where I won't worry about the token size of the prompt. Maybe a separate package that takes care of token counting?
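As a rough sketch of what I mean (all names here are made up, not an existing langchaingo API):

package tokencount

import "github.com/pkoukk/tiktoken-go"

// Counter reports the token length of a piece of text.
type Counter interface {
	CountTokens(text string) (int, error)
}

// TiktokenCounter counts tokens using tiktoken-go for a given model.
type TiktokenCounter struct {
	Model string
}

func (c TiktokenCounter) CountTokens(text string) (int, error) {
	tkm, err := tiktoken.EncodingForModel(c.Model)
	if err != nil {
		return 0, err
	}
	return len(tkm.Encode(text, nil, nil)), nil
}

Every memory type could then take a Counter plus a limit and trim itself the same way, instead of each one reimplementing the token math.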