Hey everyone!
I'm currently working on building a test chatbot using langchain-go and I need to be able to flush messages from the chat history when the token size of the full prompt hits a certain limit.
To tackle this, I've been digging into the codebase and exploring options to contribute to the repo. The Python version has this nifty reference to BaseLanguageModel, which handles the logic for measuring the token size of the stored memory buffer. However, the Go version doesn't yet have anything similar built in, so I ended up using the tiktoken-go module to get the token size of my history.
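For reference, counting tokens with tiktoken-go looks roughly like this (the model name is just the one I happen to use):

package main

import (
	"fmt"
	"log"

	"github.com/pkoukk/tiktoken-go"
)

func main() {
	// pick the encoding that matches the target model
	tkm, err := tiktoken.EncodingForModel("gpt-3.5-turbo")
	if err != nil {
		log.Fatal(err)
	}
	tokens := tkm.Encode("How many tokens is this prompt?", nil, nil)
	fmt.Println(len(tokens)) // number of tokens in the string
}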
Since I'm still a bit new to Go, I was wondering whether this difference in design approach is something specific to the language or just a design choice made by the author. As far as I understand, Go gets inheritance-like behavior through struct embedding, so I created my own memory wrapper by embedding the memory.Buffer struct:
package memory

import (
	"github.com/pkoukk/tiktoken-go"

	mem "github.com/tmc/langchaingo/memory"
	"github.com/tmc/langchaingo/schema"
)

// AsaiMemory embeds the langchaingo memory.Buffer and keeps its own
// copy of the chat messages so they can be trimmed by token count.
type AsaiMemory struct {
	*mem.Buffer
	Encoding      string
	EncodingModel string
	TokenLimit    int
	Messages      []schema.ChatMessage
}
// NewAsaiMemory returns a wrapper with the defaults I use for my bot.
func NewAsaiMemory() *AsaiMemory {
	m := AsaiMemory{
		Buffer:        mem.NewBuffer(),
		Encoding:      "",
		EncodingModel: "gpt-3.5-turbo",
		TokenLimit:    2800,
	}
	return &m
}
// LoadMemory copies the messages out of the embedded buffer's chat
// history into the wrapper's own slice, which we're free to mutate.
func (m *AsaiMemory) LoadMemory() error {
	m.Messages = m.ChatHistory.Messages()
	return nil
}
// GetMemoryString renders the stored messages as a single prompt string.
func (m *AsaiMemory) GetMemoryString() string {
	bufferString, err := schema.GetBufferString(m.Messages, m.HumanPrefix, m.AIPrefix)
	if err != nil {
		return ""
	}
	return bufferString
}
// TrimContext drops the oldest messages until the encoded history
// fits within the configured token limit.
func (m *AsaiMemory) TrimContext() error {
	tkm, err := tiktoken.EncodingForModel(m.EncodingModel)
	if err != nil {
		return err
	}
	bufferString, err := schema.GetBufferString(m.Messages, m.HumanPrefix, m.AIPrefix)
	if err != nil {
		return err
	}
	bufferLength := len(tkm.Encode(bufferString, nil, nil))
	// drop the oldest message first, then re-measure
	for bufferLength > m.TokenLimit {
		m.Messages = m.Messages[1:]
		bufferString, err = schema.GetBufferString(m.Messages, m.HumanPrefix, m.AIPrefix)
		if err != nil {
			return err
		}
		bufferLength = len(tkm.Encode(bufferString, nil, nil))
	}
	return nil
}
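For context, here's roughly how I wire the wrapper into the bot. The import path and the AddUserMessage/AddAIMessage helpers on ChatHistory are assumptions on my side, just to show the flow:

package main

import (
	"fmt"
	"log"

	"example.com/asai/memory" // placeholder path for the package above
)

func main() {
	m := memory.NewAsaiMemory()

	// messages accumulate on the embedded buffer's chat history during
	// the conversation (assuming the usual add-message helpers)
	m.ChatHistory.AddUserMessage("Hi there!")
	m.ChatHistory.AddAIMessage("Hello! How can I help?")

	if err := m.LoadMemory(); err != nil {
		log.Fatal(err)
	}
	if err := m.TrimContext(); err != nil {
		log.Fatal(err)
	}

	// trimmed history, safe to stuff into the next prompt
	fmt.Println(m.GetMemoryString())
}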
I might be missing something, but the main pickle for me, and the reason I needed the wrapper in the first place, was that I couldn't access ChatHistory.messages directly since it's an unexported ("private") field. Because I couldn't find a way to pop/slice out the messages that were overflowing the token limit, I needed my own storage that I could manipulate and access.
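Concretely, the naive thing I wanted to do doesn't compile from outside the schema package:

// compile error: messages is unexported outside the schema package
m.ChatHistory.messages = m.ChatHistory.messages[1:]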
So I was wondering if it would be a good idea, and the simplest solution for now, to just extend the existing Buffer struct with something like:
// TrimContext drops the oldest messages until the encoded history fits
// within the given token limit. (Note: this sketch assumes access to the
// unexported messages field, so it would need a helper on the chat history
// type or to live alongside it.)
func (m *Buffer) TrimContext(limit int, encodingModel string) error {
	tkm, err := tiktoken.EncodingForModel(encodingModel)
	if err != nil {
		return err
	}
	bufferString, err := schema.GetBufferString(m.ChatHistory.messages, m.HumanPrefix, m.AIPrefix)
	if err != nil {
		return err
	}
	bufferLength := len(tkm.Encode(bufferString, nil, nil))
	// drop the oldest message first, then re-measure
	for bufferLength > limit {
		m.ChatHistory.messages = m.ChatHistory.messages[1:]
		bufferString, err = schema.GetBufferString(m.ChatHistory.messages, m.HumanPrefix, m.AIPrefix)
		if err != nil {
			return err
		}
		bufferLength = len(tkm.Encode(bufferString, nil, nil))
	}
	return nil
}
I'm wondering if this should be included in some form in every type of memory, since I don't see a case where I won't worry about the token size of the prompt. Maybe a separate package that takes care of token counting?
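As a rough sketch of what I mean (all names here are made up, not an existing langchaingo API):

package tokencount

import "github.com/pkoukk/tiktoken-go"

// Counter reports the token length of a piece of text.
type Counter interface {
	CountTokens(text string) (int, error)
}

// TiktokenCounter counts tokens using tiktoken-go for a given model.
type TiktokenCounter struct {
	Model string
}

func (c TiktokenCounter) CountTokens(text string) (int, error) {
	tkm, err := tiktoken.EncodingForModel(c.Model)
	if err != nil {
		return 0, err
	}
	return len(tkm.Encode(text, nil, nil)), nil
}

Every memory type could then take a Counter plus a limit and trim itself the same way, instead of each one reimplementing the token math.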