langchain-benchmarks's People

Contributors

baskaryan, ccurme, eyurtsev, fpingham, hinthornw, hwchase17, leo-gan, maruthiko, rlancemartin

langchain-benchmarks's Issues

[Feature req] Agents: comparing fine-tuning techniques

Common question: I'm fine-tuning for an agent. What split of data should I prioritize collecting, and in what mixture?

  • Fewer long trajectories?
  • More short trajectories / single-step function calls?
  • If there is conversation in the mix, how much should I include? And do I include the full trajectory flattened out, or remove it for later calls?

ConnectionError

Hi, I'm trying to run custom_agent.py on my computer. When it reaches this line of code:

chain_results = run_on_dataset(
    client,
    dataset_name="Titanic CSV Data",
    llm_or_chain_factory=get_chain,
    evaluation=eval_config,
)

it generates an error message:

ConnectionError: HTTPConnectionPool(host='localhost', port=1984): Max retries exceeded with url: /sessions (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x000001D88859A3A0>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))

I'm running under Windows with Python 3.9.7.
Has anyone seen this error before? Thanks!
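
The host and port in the error (localhost:1984) suggest the client is talking to a local address rather than a hosted endpoint. A minimal sketch for checking this, assuming the client takes its URL from the LANGCHAIN_ENDPOINT environment variable and otherwise falls back to the address shown in the error. `resolve_endpoint` and `is_reachable` are hypothetical helper names used only for illustration; they are not part of any library.

```python
import os
import socket
from urllib.parse import urlparse

# Assumption: the fallback mirrors the host and port shown in the error
# message; the real client's default may differ between versions.
FALLBACK_ENDPOINT = "http://localhost:1984"


def resolve_endpoint(env=None):
    """Return (host, port) for the endpoint the environment points at."""
    env = os.environ if env is None else env
    url = urlparse(env.get("LANGCHAIN_ENDPOINT", FALLBACK_ENDPOINT))
    # Fall back to the scheme's usual port when the URL does not name one.
    port = url.port or (443 if url.scheme == "https" else 80)
    return url.hostname, port


def is_reachable(host, port, timeout=2.0):
    """Try a plain TCP connection; False means nothing accepted it."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `is_reachable(*resolve_endpoint())` before `run_on_dataset` shows whether anything is listening at the resolved address. If it returns False for localhost, either no local server is running or the endpoint variable is not set in that shell.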

Does LangChain-Benchmarks only support models from the model registry?

I'd like to run some benchmarks against models from Hugging Face. The tutorials seem tailored for models from the registry or OpenAI.

Before I go down the rabbit hole and try to use it myself, I thought I'd see if it was possible or if anyone has done this before and has examples I can look at.

Thanks

Error publishing feedback to LangSmith from Streamlit cloud

I've set LANGCHAIN_PROJECT and LANGCHAIN_API_KEY.

Feedback works locally.

App -
https://github.com/langchain-ai/langchain-benchmarks/tree/main/extraction

On Streamlit Cloud, I see this error:

Traceback (most recent call last):
  File "/home/adminuser/venv/lib/python3.9/site-packages/langsmith/utils.py", line 55, in raise_for_status_with_text
    response.raise_for_status()
  File "/home/adminuser/venv/lib/python3.9/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://api.smith.langchain.com/feedback

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/adminuser/venv/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 548, in _run_script
    self._session_state.on_script_will_rerun(rerun_data.widget_states)
  File "/home/adminuser/venv/lib/python3.9/site-packages/streamlit/runtime/state/safe_session_state.py", line 68, in on_script_will_rerun
    self._state.on_script_will_rerun(latest_widget_states)
  File "/home/adminuser/venv/lib/python3.9/site-packages/streamlit/runtime/state/session_state.py", line 484, in on_script_will_rerun
    self._call_callbacks()
  File "/home/adminuser/venv/lib/python3.9/site-packages/streamlit/runtime/state/session_state.py", line 497, in _call_callbacks
    self._new_widget_state.call_callback(wid)
  File "/home/adminuser/venv/lib/python3.9/site-packages/streamlit/runtime/state/session_state.py", line 249, in call_callback
    callback(*args, **kwargs)
  File "/mount/src/langchain-benchmarks/extraction/streamlit_app.py", line 9, in send_feedback
    client.create_feedback(run_id, "user_score", score=score)
  File "/home/adminuser/venv/lib/python3.9/site-packages/langsmith/client.py", line 1588, in create_feedback
    raise_for_status_with_text(response)
  File "/home/adminuser/venv/lib/python3.9/site-packages/langsmith/utils.py", line 57, in raise_for_status_with_text
    raise ValueError(response.text) from e
ValueError: {"detail":"Resource not found"}
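
A 404 on the feedback call is consistent with the server not knowing the run ID. One possible cause, not confirmed by the post, is that the deployed app never traced the run because a tracing variable is missing there. A minimal sketch for listing which variables are unset in the deployed environment. The list of names is an assumption about what the tracing setup reads; the post only confirms LANGCHAIN_PROJECT and LANGCHAIN_API_KEY. `missing_vars` is a hypothetical helper.

```python
import os

# Assumption: these are the variables the tracing setup reads. Only the
# last two are mentioned as set in the original report.
EXPECTED_VARS = (
    "LANGCHAIN_TRACING_V2",
    "LANGCHAIN_ENDPOINT",
    "LANGCHAIN_API_KEY",
    "LANGCHAIN_PROJECT",
)


def missing_vars(env=None):
    """Return the expected variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in EXPECTED_VARS if not env.get(name)]
```

Printing `missing_vars()` once at app start-up (for example with `st.write`) makes it easy to compare the local and hosted environments without exposing the key values themselves.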

Test some other models

Whenever we're ready with tool calling

  # ("fireworks-firefunction-v1", ChatFireworks(model="accounts/fireworks/models/firefunction-v1", temperature=0)),
  # ("cohere-command-light", ChatCohere(temperature=0, model="command-light")),
  # ("cohere-command", ChatCohere(temperature=0, model="command")),
  # ("cohere-command-r", ChatCohere(temperature=0, model="command-r")),
  # ("cohere-command-r-plus", ChatCohere(temperature=0, model="command-r-plus")),
  # ("mistral-large-2402", ChatMistralAI(model="mistral-large-2402", temperature=0)),

Code does not give plot related response and throws error at frontend

My code:

import pandas as pd
import streamlit as st
from langchain.chat_models import ChatOpenAI
from langchain.agents.agent_toolkits import create_pandas_dataframe_agent
from langchain.agents.agent_types import AgentType
import matplotlib.pyplot as plt

df = pd.read_csv('/Users/siddheshphapale/Desktop/project/sqlcsv.csv')

llm = ChatOpenAI(openai_api_key="s5", temperature=0, max_tokens=500, verbose=False)
agent = create_pandas_dataframe_agent(llm, df, agent_type=AgentType.OPENAI_FUNCTIONS)

from langsmith import Client
client = Client()

def send_feedback(run_id, score):
    client.create_feedback(run_id, "user_score", score=score)

st.set_page_config(page_title='🦜🔗')
st.title('📊🔗')
st.info("")

query_text = st.text_input('Enter your question:', placeholder='region wise total net amt')

# Form input and query
result = None
with st.form('myform', clear_on_submit=True):
    submitted = st.form_submit_button('Submit')
    if submitted:
        with st.spinner('Calculating...'):
            response = agent({"input": query_text}, include_run_info=True)
            result = response["output"]
            run_id = response["__run"].run_id
if result is not None:
    st.info(result)
    col_blank, col_text, col1, col2 = st.columns([10, 2, 1, 1])
    with col_text:
        st.text("Feedback:")
    with col1:
        st.button("👍", on_click=send_feedback, args=(run_id, 1))
    with col2:
        st.button("👎", on_click=send_feedback, args=(run_id, 0))

[Screenshot 2023-09-30 at 8 03 24 PM]

Is it possible to change the model evaluator?

I see here that the code uses the GPT-4 model for evaluation. Since it is the most expensive model to run, is it possible to swap the evaluator model for another one?
