patterns-ai-core / langchainrb
Build LLM-powered applications in Ruby
Home Page: https://rubydoc.info/gems/langchainrb
License: MIT License
Add additional examples to /examples. We don't have all of the current functionality/use-cases documented. I think it's helpful to showcase what this library can and cannot do!
Build a new Tool, Langchain::Tool::WolframAlpha, that can be connected to Agents.
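A hypothetical sketch of what such a tool could look like, modeled loosely on the gem's existing Tool classes; the Wolfram|Alpha Short Answers API endpoint, the app_id: parameter, and the execute interface are assumptions rather than settled design:

require "net/http"
require "uri"

module Langchain::Tool
  class WolframAlpha < Base
    def initialize(app_id:)
      @app_id = app_id
    end

    # Evaluate a natural-language query via Wolfram|Alpha and return the answer text
    def execute(input:)
      uri = URI("https://api.wolframalpha.com/v1/result")
      uri.query = URI.encode_www_form(appid: @app_id, i: input)
      Net::HTTP.get(uri)
    end
  end
end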
We have outgrown the single-file README. It is time to migrate to a more scalable solution.
Auto-generated RDoc-style documentation may not be ideal, as it reads like a dry technical manual.
Currently, if you'd like to only use Qdrant and OpenAI, this gem will install a bunch of other dependencies you don't need. There's no reason to do this.
We could build a system similar to omniauth, where different strategies are installed separately (omniauth-github, omniauth-facebook, etc.).
We'd ask users to install the gems on their own, in their applications, and then we'd require them in this gem (pseudo-codeish):
class Qdrant
  def initialize
    require "qdrant"
    # ...
  end
end
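One common way to make the failure friendly (a sketch, not existing gem code) is to rescue LoadError and tell the user which gem to add to their application:

def depends_on(gem_name)
  require gem_name
rescue LoadError
  raise LoadError, "Could not load #{gem_name}. Please add `gem \"#{gem_name}\"` to your Gemfile."
end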
Thank you for raising this issue @alchaplinsky!
Have you checked https://github.com/alexrudall/ruby-openai ?
It's common when prompt engineering to require the LLM to return results in JSON format and to include an example JSON response within the prompt. The current Prompt variable parser can't handle this as it interprets the example JSON as an input variable and then complains that it is missing:
require "langchain"
simple_template = "Tell me a {adjective} joke. Return in JSON in the format {{joke: 'The joke'}}"
Prompt::Base.extract_variables_from_template(simple_template)
=> ["adjective", "joke: 'The joke'"]
The current Python f-string parser handles this by allowing you to escape a curly brace with a double curly brace, as shown in the example above (see the Python f-string spec).
Looking at the regex, it's clearly trying to do something with double curly brackets, so before I try to fix it, can you shed some light on the original implementation?
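For reference, a minimal sketch (not the gem's actual parser) of how variable extraction could treat {{...}} as an escaped literal and only capture single-brace variables:

template = "Tell me a {adjective} joke. Return in JSON in the format {{joke: 'The joke'}}"

variables = template
  .gsub(/\{\{.*?\}\}/, "")   # drop escaped {{...}} sections first
  .scan(/\{([^{}]+)\}/)      # then capture the single-brace variables
  .flatten

p variables #=> ["adjective"]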
The prompt templates are hard to read and understand since they don't support multi-line strings.
Rename the files to .yaml, update the strings, and update the code.
A popular use-case is constructing a prompt for the LLM that includes the database schema and the user's question at hand, in order to construct a SQL query that gets executed on the LLM's behalf. The SQL query's result then gets fed back to the LLM to synthesize an answer to the user's question.
Create an entity (should it be a Tool? an Agent? or another abstraction?) that:
A popular use-case is constructing a prompt for the LLM that includes the API specs and the user's question at hand, in order to construct an API call that gets executed on the LLM's behalf. The API call's result then gets fed back to the LLM to synthesize an answer to the user's question.
Create an entity (should it be a Tool? an Agent? or another abstraction?) that:
As a prequel to #129, use Sequel 'reflection' methods to replace the current messy schema dump with clean table definitions and foreign keys (to which sample data can be added in subsequent issues, and the output can be limited to certain tables); a rough sketch using these methods follows the example format below:
https://sequel.jeremyevans.net/rdoc/files/doc/reflection_rdoc.html
Like this (as suggested by @bborn):
https://github.com/jerryjliu/llama_index/blob/b4618a2a24cd11b5c5949ab97389d62ac34ea336/llama_index/indices/struct_store/container_builder.py#L76
https://github.com/jerryjliu/llama_index/blob/c2f24363b8c6cd74f17647548187821ce9ea4ddf/llama_index/langchain_helpers/sql_wrapper.py#L44
Format (suggested by @rickychilcott from this paper)
CREATE TABLE Highschooler(
ID int primary key,
name text,
grade int)
/*
3 example rows:
SELECT * FROM Highschooler LIMIT 3;
ID name grade
1510 Jordan 9
1689 Gabriel 9
1381 Tiffany 9
*/
CREATE TABLE Friend(
student_id int,
friend_id int,
primary key (student_id,friend_id),
foreign key(student_id) references Highschooler(ID),
foreign key (friend_id) references Highschooler(ID)
)
/*
3 example rows:
SELECT * FROM Friend LIMIT 3;
student_id friend_id
1510 1381
1510 1689
1689 1709
*/
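A rough sketch of how Sequel's reflection methods (linked above) could produce something close to the target format; it assumes a Sequel::Database handle and an adapter that implements #schema and #foreign_key_list, and it leaves out the sample rows:

require "sequel"

db = Sequel.connect(ENV["DATABASE_URL"])

dump = db.tables.map do |table|
  columns = db.schema(table).map do |name, info|
    "  #{name} #{info[:db_type]}#{" primary key" if info[:primary_key]}"
  end
  foreign_keys = db.foreign_key_list(table).map do |fk|
    "  foreign key (#{fk[:columns].join(", ")}) references #{fk[:table]}"
  end
  "CREATE TABLE #{table}(\n#{(columns + foreign_keys).join(",\n")}\n)"
end

puts dump.join("\n\n")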
A small improvement to the docs for the Weaviate example is to include index_name
client = Langchain::Vectorsearch::Weaviate.new(
url: ENV["WEAVIATE_URL"],
api_key: ENV["WEAVIATE_API_KEY"],
index_name: "Document", # add this
llm: :openai, # or :cohere
llm_api_key: ENV["OPENAI_API_KEY"]
)
Since this is required by the API, it makes sense to include it in the example.
Reference: https://weaviate.io/developers/weaviate/quickstart/custom-vectors#schema
I think we need the ability to add and run some 'integration' tests that exercise interactions between high-level components and use actual APIs and keys. They would be run only on request and could be run before each release.
Start with a simple question to ChainOfThoughtAgent with OpenAI, like in the README, with the expectation that the result should be similar but not exactly equal to the result given in the README, since I assume the AI can respond slightly differently each time the test is called.
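A sketch of what an opt-in integration spec could look like; the :integration tag, the file path, and the loose assertion are suggestions rather than existing conventions in the repo (run on request with `bundle exec rspec --tag integration`):

# spec/integration/chain_of_thought_agent_spec.rb
require "langchain"

RSpec.describe "ChainOfThoughtAgent against live APIs", :integration do
  it "answers a simple question, allowing for non-deterministic output" do
    agent = Langchain::Agent::ChainOfThoughtAgent.new(
      llm: :openai,
      llm_api_key: ENV.fetch("OPENAI_API_KEY"),
      tools: ["calculator"]
    )

    answer = agent.run(question: "What is 2 + 2?")

    # The model may phrase the answer differently each run, so assert loosely.
    expect(answer).to include("4")
  end
end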
Stack trace:
-> langchain [main*]: gem install langchainrb
Successfully installed langchainrb-0.3.11
Parsing documentation for langchainrb-0.3.11
Done installing documentation for langchainrb after 0 seconds
1 gem installed
-> langchain [main*]: irb
irb(main):001:0> require "langchain"
/Users/andrei/.asdf/installs/ruby/3.0.0/lib/ruby/gems/3.0.0/gems/langchainrb-0.3.11/lib/langchain.rb:17:in `<module:Langchain>': uninitialized constant Langchain::Pathname (NameError)
from /Users/andrei/.asdf/installs/ruby/3.0.0/lib/ruby/gems/3.0.0/gems/langchainrb-0.3.11/lib/langchain.rb:7:in `<top (required)>'
from <internal:/Users/andrei/.asdf/installs/ruby/3.0.0/lib/ruby/3.0.0/rubygems/core_ext/kernel_require.rb>:160:in `require'
from <internal:/Users/andrei/.asdf/installs/ruby/3.0.0/lib/ruby/3.0.0/rubygems/core_ext/kernel_require.rb>:160:in `rescue in require'
from <internal:/Users/andrei/.asdf/installs/ruby/3.0.0/lib/ruby/3.0.0/rubygems/core_ext/kernel_require.rb>:149:in `require'
from (irb):1:in `<main>'
from /Users/andrei/.asdf/installs/ruby/3.0.0/lib/ruby/gems/3.0.0/gems/irb-1.6.4/exe/irb:9:in `<top (required)>'
from /Users/andrei/.asdf/installs/ruby/3.0.0/bin/irb:23:in `load'
from /Users/andrei/.asdf/installs/ruby/3.0.0/bin/irb:23:in `<main>'
<internal:/Users/andrei/.asdf/installs/ruby/3.0.0/lib/ruby/3.0.0/rubygems/core_ext/kernel_require.rb>:85:in `require': cannot load such file -- langchain (LoadError)
from <internal:/Users/andrei/.asdf/installs/ruby/3.0.0/lib/ruby/3.0.0/rubygems/core_ext/kernel_require.rb>:85:in `require'
from (irb):1:in `<main>'
from /Users/andrei/.asdf/installs/ruby/3.0.0/lib/ruby/gems/3.0.0/gems/irb-1.6.4/exe/irb:9:in `<top (required)>'
from /Users/andrei/.asdf/installs/ruby/3.0.0/bin/irb:23:in `load'
from /Users/andrei/.asdf/installs/ruby/3.0.0/bin/irb:23:in `<main>'
We currently have custom errors/exceptions spread out and defined all throughout the project within different namespaces.
Gather up and move all of the custom error class definitions to a dedicated errors.rb file. Here's an example of good organization: https://github.com/jnunemaker/httparty/blob/master/lib/httparty/exceptions.rb. Please annotate (write documentation for) each error class: when it's raised, etc.
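A hypothetical layout for such a file, mirroring the HTTParty example above; the error class names here are illustrative, not the gem's current classes:

# lib/langchain/errors.rb
module Langchain
  # Base class so callers can rescue every Langchain error at once.
  class Error < StandardError; end

  # Raised when a required API key is missing or rejected by the provider.
  class ApiKeyError < Error; end

  # Raised when a prompt template references an input variable that was
  # not supplied at format time.
  class MissingVariableError < Error; end
end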
Agents need to be able to store data in memory or a vector search DB for later retrieval.
There's a limitation with the current SQLQueryAgent: it will hit the context window limit when a large database table schema is passed in to the LLM. As pointed out by @bborn, LlamaIndex stores the DB schema in a vector search database to avoid stuffing the whole DB schema into a single prompt.
We should add a memory: option (or should we call it persistence: instead?) to Agents to solve problems like that ^^
Right now, all the Ruby files live in the lib directory. This can be a problem because the "require namespace" is flat, which can cause problems when there's overlap. It would be better to have them in lib/langchain.
Similarly, all the constants are in the global namespace, which is shared. It'd be better to have them under Langchain.
So I propose moving all Ruby files into lib/langchain (except lib/langchain.rb) and nesting all constants under the Langchain module.
I can make the change, but it is very prone to getting out of date and causing merge conflicts, so I wanted to get feedback before attempting it.
Pgvector's similarity_search_by_vector() should calculate the cosine distance by default: https://github.com/andreibondarev/langchainrb/blob/main/lib/vectorsearch/pgvector.rb#L75-L77
More info about building a cosine distance query here: https://github.com/pgvector/pgvector/blob/dee2c4feb1bc5b17b9fe6a0a1ce8dbf0963c1b05/README.md#vector-operators
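For illustration, a query using pgvector's cosine distance operator <=> from the README linked above; the connection handling (a PG::Connection in `client`), table and column names, and the `embedding` array are assumptions, not the gem's current code:

query = <<~SQL
  SELECT content FROM items
  ORDER BY vector <=> $1
  LIMIT $2
SQL
client.exec_params(query, ["[#{embedding.join(",")}]", k])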
1. Rename ChainOfThoughtAgent to ReActAgent, because that's what it actually is. This is what "chain of thought" is: https://learnprompting.org/docs/intermediate/chain_of_thought; and this is what "ReAct" actually is: https://arxiv.org/abs/2210.03629
2. Try asking the LLM for a JSON output format -- may or may not be more accurate. Source:
I.e. instead of RegEx-ing strings, we may be able to read off JSON keys instead.
(Separate PRs please.)
It would be super useful to accept an IO stream or a string directly.
It's particularly useful when you're working with files on cloud storage like Google Drive or S3.
Example:
drive = Google::Apis::DriveV3::DriveService.new
raw_content = drive.get_file(my_file_id, download_dest: StringIO.new).string
text = Langchain::Loader.load(raw_content)
Without this, I have to do something like:
require "tempfile"
require "fileutils"

drive = Google::Apis::DriveV3::DriveService.new
temp_file = Tempfile.new(my_file_id)
raw_content = drive.get_file(my_file_id, download_dest: temp_file.path)
text = Langchain::Loader.load(temp_file.path)
FileUtils.rm(temp_file.path)
Create a Loaders::Doc class (similar to https://github.com/andreibondarev/langchainrb/tree/main/lib/loaders) that processes .doc and .docx files.
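A hypothetical sketch of the .docx half, built on the docx gem and mirroring the structure of the existing processors; handling the legacy binary .doc format would likely need a different library:

require "docx"

module Langchain
  module Processors
    class Docx < Base
      EXTENSIONS = [".docx"]
      CONTENT_TYPES = ["application/vnd.openxmlformats-officedocument.wordprocessingml.document"]

      # Parse the document and return its text
      # @param [File] data
      # @return [String]
      def parse(data)
        ::Docx::Document.open(data.path).paragraphs.map(&:text).join("\n")
      end
    end
  end
end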
Currently, when an Agent is initialized to use Tools, we pass them as strings that are then matched to existing classes. See the following current usage:
agent = Langchain::Agent::ChainOfThoughtAgent.new(llm: :openai, llm_api_key: ENV["OPENAI_API_KEY"], tools: ['search', 'calculator'])
agent.tools
# => ["search", "calculator"]
agent.tools = ["wikipedia"]
This approach is not flexible because:
Let's change the Agent interface to accept Tool instances like so:
calculator_tool = Langchain::Tool::Calculator.new()
sql_db_tool = Langchain::Tool::Database.new(db_connection_string: "postgres://user:password@localhost:5432/db_name") # Coming in this PR: https://github.com/andreibondarev/langchainrb/pull/91/files#diff-9a2d0c4b8a1176be3d78866742f2ba4c2da2452cb05959f433c1204bb8211ebd
agent = Langchain::Agent::ChainOfThoughtAgent.new(
llm: :openai,
llm_api_key: ENV["OPENAI_API_KEY"],
tools: [calculator_tool, sql_db_tool]
)
Note: Modify the SerpApi tool to accept the api_key: in the initialize method instead of seeking out the ENV var here:
module Langchain::Tool
  class SerpApi < Base
    attr_reader :api_key

    def initialize(api_key:)
      @api_key = api_key
    end

    def execute_search(input:)
      GoogleSearch.new(
        q: input,
        serp_api_key: api_key
      ).get_hash
    end
  end
end
When I set a breakpoint with 'binding.pry' most debugging commands work except 'step' and 'continue.'
[2] pry(#<Langchain::Tool::Database>)> whereami
Inside #<Langchain::Tool::Database>.
[3] pry(#<Langchain::Tool::Database>)>
[4] pry(#<Langchain::Tool::Database>)> step
NameError: undefined local variable or method `step' for #<Langchain::Tool::Database:0x00000001040e0e20 @DB=#<Sequel::SQLite::Database: {:adapter=>:sqlite}>>
from (pry):1:in `__pry__'
When the :debug logger level is set (Langchain.logger.level = :debug), I think we should print a ton of data, similar to the output the ChainOfThoughtAgent currently generates. Examples:
[Langchain.rb] [Weaviate]: Saving data to database is successful
[Langchain.rb] [OpenAI]: Generating embeddings
[Langchain.rb] [Weaviate]: Searching the "products" index
etc.
The challenge is figuring out which log messages are most helpful; we don't want to just flood the developer / the logs with useless text.
There's another way to do this:
[Langchain.rb] { service: "Vectorsearch::Weviate", action: "similarity_search", parameter: "..." }
[Langchain.rb] { service: "LLM::OpenAI", action: "embed", parameter: "..." }
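A hypothetical helper along those lines (not existing code), assuming Langchain.logger is a standard Ruby Logger; it emits structured lines like the second format above:

module Langchain
  module Loggable
    # Log a structured debug line identifying the emitting service and action.
    def log_debug(action, **params)
      Langchain.logger.debug(
        "[Langchain.rb] { service: \"#{self.class.name}\", action: \"#{action}\", parameters: #{params.inspect} }"
      )
    end
  end
end

# Usage inside e.g. a Vectorsearch class:
#   log_debug("similarity_search", query: query, k: k)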
Under module Vectorsearch::Base, there is a method add_data accepting path: nil, paths: nil params.
From my point of view, we could remove path and only accept paths, or use the splat operator.
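A sketch of the splat-operator variant; the signature is the suggestion here, add_texts exists on the Vectorsearch classes, and the loading helper is hypothetical:

def add_data(*paths)
  raise ArgumentError, "Please pass at least one path" if paths.empty?

  texts = paths.flatten.map { |path| load_text(path) } # load_text is a hypothetical helper
  add_texts(texts: texts)
end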
Currently, the conversations with LLMs that offer the .chat() endpoint are not persisted, hence the LLM has no context of the previous chat messages that may have taken place.
The following 2 LLMs offer chat capabilities and accept a messages: array:
Modify the .chat() methods to persist and keep track of the previous chat exchanges that have taken place.
openai = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
openai.chat_persistence = true
openai.ask(question: ...)
#=> LLM answer...
openai.ask(question: ...)
#=> LLM answer...
openai.ask(question: ...)
#=> LLM answer...
openai.clear_chat_persistence!
openai.chat_persistence = false
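A rough sketch of what the persistence could do internally; the method names mirror the proposed interface above, and the messages: array on .chat() is the addition discussed earlier, so all of this is illustrative rather than existing code:

def ask(question:)
  # Keep history only while chat_persistence is enabled.
  history = chat_persistence ? (@messages ||= []) : []
  history << { role: "user", content: question }

  answer = chat(messages: history)

  history << { role: "assistant", content: answer }
  answer
end

def clear_chat_persistence!
  @messages = []
end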
We would like to collect any and all feedback people might have regarding Langchain.rb: GOOD and BAD! Have you already tried Langchain.rb in your project? Do you have specific requirements or use-cases that you don't think Langchain.rb could help you with? Please provide us with your feedback!
We'd like to ensure that this project is rooted in real needs and use-cases, and solves actual pain-points when developing LLM-driven applications.
Optional questions to think through and answer:
Thank you! ❤️
I was just trying out this simple chain of thought agent:
agent = Langchain::Agent::ChainOfThoughtAgent.new(llm: :openai, llm_api_key: ENV["OPENAI_API_KEY"], tools: ['calculator'])
puts agent.run(question: "What is the square root of 99?")
It fails with:
server returns error for url: https://serpapi.com/search?q=%E2%88%9A99&engine=google&output=json&source=ruby
/Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/google_search_results-2.2.0/lib/search/serp_api_search.rb:143:in `rescue in get_results': Invalid API key. Your API key should be here: https://serpapi.com/manage-api-key (RuntimeError)
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/google_search_results-2.2.0/lib/search/serp_api_search.rb:136:in `get_results'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/google_search_results-2.2.0/lib/search/serp_api_search.rb:50:in `get_json'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/google_search_results-2.2.0/lib/search/serp_api_search.rb:64:in `get_hash'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/langchainrb-0.4.1/lib/langchain/tool/serp_api.rb:50:in `execute_search'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/langchainrb-0.4.1/lib/langchain/tool/serp_api.rb:27:in `execute_search'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/langchainrb-0.4.1/lib/langchain/tool/calculator.rb:27:in `rescue in execute'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/langchainrb-0.4.1/lib/langchain/tool/calculator.rb:20:in `execute'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/langchainrb-0.4.1/lib/langchain/tool/base.rb:26:in `execute'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/langchainrb-0.4.1/lib/langchain/agent/chain_of_thought_agent/chain_of_thought_agent.rb:66:in `block in run'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/langchainrb-0.4.1/lib/langchain/agent/chain_of_thought_agent/chain_of_thought_agent.rb:44:in `loop'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/langchainrb-0.4.1/lib/langchain/agent/chain_of_thought_agent/chain_of_thought_agent.rb:44:in `run'
from main.rb:7:in `<main>'
/Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/3.2.0/open-uri.rb:369:in `open_http': 401 Unauthorized (OpenURI::HTTPError)
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/3.2.0/open-uri.rb:760:in `buffer_open'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/3.2.0/open-uri.rb:214:in `block in open_loop'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/3.2.0/open-uri.rb:212:in `catch'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/3.2.0/open-uri.rb:212:in `open_loop'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/3.2.0/open-uri.rb:153:in `open_uri'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/3.2.0/open-uri.rb:740:in `open'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/google_search_results-2.2.0/lib/search/serp_api_search.rb:139:in `get_results'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/google_search_results-2.2.0/lib/search/serp_api_search.rb:50:in `get_json'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/google_search_results-2.2.0/lib/search/serp_api_search.rb:64:in `get_hash'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/langchainrb-0.4.1/lib/langchain/tool/serp_api.rb:50:in `execute_search'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/langchainrb-0.4.1/lib/langchain/tool/serp_api.rb:27:in `execute_search'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/langchainrb-0.4.1/lib/langchain/tool/calculator.rb:27:in `rescue in execute'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/langchainrb-0.4.1/lib/langchain/tool/calculator.rb:20:in `execute'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/langchainrb-0.4.1/lib/langchain/tool/base.rb:26:in `execute'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/langchainrb-0.4.1/lib/langchain/agent/chain_of_thought_agent/chain_of_thought_agent.rb:66:in `block in run'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/langchainrb-0.4.1/lib/langchain/agent/chain_of_thought_agent/chain_of_thought_agent.rb:44:in `loop'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/langchainrb-0.4.1/lib/langchain/agent/chain_of_thought_agent/chain_of_thought_agent.rb:44:in `run'
from main.rb:7:in `<main>'
/Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/eqn-1.6.5/lib/eqn/parser.rb:12:in `parse': Parse error at offset: 0 -- Expected one of "\s", "\t", 'if', 'IF', 'round', 'ROUND', 'roundup', 'ROUNDUP', 'rounddown', 'ROUNDDOWN', '(', '+', '-', [0-9], '.', [a-zA-Z] at line 1, column 1 (byte 1) (Eqn::ParseError)
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/eqn-1.6.5/lib/eqn/calculator.rb:66:in `calculate'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/langchainrb-0.4.1/lib/langchain/tool/calculator.rb:23:in `execute'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/langchainrb-0.4.1/lib/langchain/tool/base.rb:26:in `execute'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/langchainrb-0.4.1/lib/langchain/agent/chain_of_thought_agent/chain_of_thought_agent.rb:66:in `block in run'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/langchainrb-0.4.1/lib/langchain/agent/chain_of_thought_agent/chain_of_thought_agent.rb:44:in `loop'
from /Users/josh.nichols/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/langchainrb-0.4.1/lib/langchain/agent/chain_of_thought_agent/chain_of_thought_agent.rb:44:in `run'
from main.rb:7:in `<main>'
I was kinda surprised that the calculator uses the search tool. I was also not expecting to need to provide an API key.
I am imagining two pieces to this:
One feature I would love to have in Langchain.rb, and that may be super useful, is summarization:
https://python.langchain.com/en/latest/modules/chains/index_examples/summarize.html
I don't think it's super hard to implement (at least a base version of it).
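A sketch of what a base version could look like; the method placement and prompt wording are assumptions, and complete(prompt:) mirrors the existing LLM interface:

def summarize(llm:, text:)
  prompt = "Write a concise summary of the following text:\n\n#{text}\n\nCONCISE SUMMARY:"
  llm.complete(prompt: prompt)
end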
Prompt + completion apparently sometimes exceeds the model's limit with the hardcoded max_tokens values in the agents.
I think it can be calculated as: model_limit - prompt_token_size.
Use the Utils class TokenLengthValidator and/or Tiktoken (https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb) for counting tokens.
Reference: https://platform.openai.com/docs/api-reference/completions/create#completions/create-max_tokens
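A sketch of the suggested calculation; the tiktoken_ruby gem and the limit table here are assumptions, not existing gem code:

require "tiktoken_ruby"

MODEL_TOKEN_LIMITS = {
  "gpt-3.5-turbo" => 4_096,
  "text-davinci-003" => 4_097
}

def max_tokens_for(model:, prompt:)
  encoder = Tiktoken.encoding_for_model(model)
  prompt_token_size = encoder.encode(prompt).length
  MODEL_TOKEN_LIMITS.fetch(model) - prompt_token_size
end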
Wrap all of the instances of @index_name inside of queries in a quote_ident(@index_name). More information about quote_ident: https://www.rubydoc.info/gems/pg/PG/Connection#quote_ident-instance_method
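A before/after illustration, assuming a PG::Connection in `client`; the table name and query are made up for the example:

# before: the identifier is interpolated raw into the SQL
client.exec("SELECT * FROM #{@index_name} LIMIT 1")

# after: quote_ident protects against unusual or malicious table names
client.exec("SELECT * FROM #{client.quote_ident(@index_name)} LIMIT 1")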
Ruby 2.7 has reached the End of Life (https://www.ruby-lang.org/en/downloads/branches/). It seems like a number of dependencies will start dropping support for Ruby 2.7 and only requiring Ruby >= 3.0. For example pgvector-ruby, a dependency of ours, already dropped Ruby 2.7 support.
I was looking over these, and noticed:
These files don't exist. I'm not sure if the intent is to run them from where they're located or not. If it's not, then we should at least include some comments to the effect of changing the file paths. I think it would be reasonable for them to live in the repository, though.
We're missing the specs for the Qdrant vectorsearch.
Tasks:
Create spec/vectorsearch/qdrant_spec.rb with specs.
AI21 Studio has a ton of interesting LLM task-specific endpoints and use-cases.
Implement LLM::AI21 that utilizes their APIs.
When I run the following code I get this error
require "langchain"
require 'google_search_results'
GoogleSearch.api_key = ENV[KEY_SERAPI]
agent = Agent::ChainOfThoughtAgent.new(llm: :openai, llm_api_key: ENV[OPEN_IA], tools: ['search', 'calculator'])
agent.run(question: "How many full soccer fields would be needed to cover the distance between NYC and DC in a straight line?")
chain_of_thought_agent.rb:53:in `+': no implicit conversion of nil into String (TypeError)
prompt += response
And when I don't get the error, I don't get the response message either.
TODO: Fill out the issue
When asking a question, depending on how the data has been stored, it's possible to have a context exceeding the max tokens of the LLM.
https://github.com/andreibondarev/langchainrb/blob/ccd0fd53a9737fb61c82058e86da1c9b855ccd7f/lib/langchain/vectorsearch/pinecone.rb#L113-L123
In this #ask method, we could:
WDYT?
When the ask() method is called on a Vectorsearch instance (e.g.: https://github.com/andreibondarev/langchainrb/blob/main/lib/vectorsearch/qdrant.rb#L89-L100), we call the completions() method on the OpenAI LLM. I think better answers would be served by the chat/completions endpoint instead.
Inspiration: https://python.langchain.com/en/latest/modules/agents/tools/examples/chatgpt_plugins.html
Starting from #55, copying here for reference:
pgvector specifies in the docs that the index must be defined AFTER the table has some data inside. So even if I defined it during schema creation, it wouldn't have any effect on performance. You must create (or reindex) it AFTER you've put data in the table.
Solutions:
a) Add a method like update_indexes that the user must call manually after adding data
b) Update the index implicitly in the add_texts method, maybe with an option (update_index: true)
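A sketch of option (a), assuming a PG::Connection in `client` and pgvector's ivfflat index type; the method name, index name, and vector column name are all illustrative assumptions:

def update_index!
  client.exec(
    "CREATE INDEX IF NOT EXISTS langchain_vector_idx " \
    "ON #{client.quote_ident(@index_name)} USING ivfflat (vector vector_cosine_ops)"
  )
end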
It should be pretty straightforward using the xsv gem.
Something like (untested):
# frozen_string_literal: true

require "xsv"

module Langchain
  module Processors
    class Xlsx < Base
      EXTENSIONS = [".xlsx"]
      CONTENT_TYPES = ["application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"]

      # Parse the document and return the text
      # @param [File] data
      # @return [Array<Array<String>>]
      def parse(data)
        xlsx_file = Xsv.open(data.read)
        xlsx_file.sheets.flat_map do |sheet|
          sheet.map do |row|
            row.map { |cell| cell.to_s.strip }
          end
        end
      end
    end
  end
end