
auto-evaluator's People

Contributors

eltociear, mziru, pgouy, prem2012, rlancemartin


auto-evaluator's Issues

Improve speed of SVM retriever index creation

Currently, the SVM retriever has a create_index function that embeds a list of chunks:

def create_index(contexts: List[str], embeddings: Embeddings) -> np.ndarray:
    return np.array([embeddings.embed_query(split) for split in contexts])

See code:
https://github.com/hwchase17/langchain/blob/f1d15b4a75238eb56e386608b7f631a853a803e0/langchain/retrievers/svm.py#L17

This is quite slow (> 10 min on ~2k splits).

Tried multiprocessing, but it appears to hang with:

Process SpawnPoolWorker-74:
Traceback (most recent call last):
  File "/Users/31treehaus/opt/anaconda3/envs/ml/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/Users/31treehaus/opt/anaconda3/envs/ml/lib/python3.9/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/31treehaus/opt/anaconda3/envs/ml/lib/python3.9/multiprocessing/pool.py", line 114, in worker
    task = get()
  File "/Users/31treehaus/opt/anaconda3/envs/ml/lib/python3.9/multiprocessing/queues.py", line 367, in get
    return _ForkingPickler.loads(res)
AttributeError: Can't get attribute 'embed_query' on <module '__main__' (built-in)>

Tried a DataFrame apply to parallelize, but saw no latency improvement.

Need to carve out more time to look into this. A couple of possible directions are sketched below.
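Two candidates worth trying (a minimal sketch, not the library's implementation; it assumes embeddings follows LangChain's Embeddings interface): batch the chunks through embed_documents instead of calling embed_query once per chunk, or fan out with a thread pool, which sidesteps the pickling failure that spawn-based multiprocessing hits, since the embedding calls are network-bound rather than CPU-bound.

from concurrent.futures import ThreadPoolExecutor
from typing import List

import numpy as np
from langchain.embeddings.base import Embeddings

def create_index_batched(contexts: List[str], embeddings: Embeddings) -> np.ndarray:
    # embed_documents sends the chunks in batched API calls rather than one
    # request per chunk, which is usually much faster against hosted models.
    return np.array(embeddings.embed_documents(contexts))

def create_index_threaded(contexts: List[str], embeddings: Embeddings, max_workers: int = 8) -> np.ndarray:
    # Threads avoid the pickling error above (the embedding client cannot be
    # serialized to spawned worker processes) and are enough here because the
    # work is network-bound, not CPU-bound.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return np.array(list(pool.map(embeddings.embed_query, contexts)))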

Is there a limit on the size of the PDF? A 200 KB PDF reported an error

Traceback (most recent call last):
  File "/Users/helen/code/xiaomi/auto-evaluator/venv/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
  File "/Users/helen/code/xiaomi/auto-evaluator/auto-evaluator.py", line 430, in <module>
    graded_answers, graded_retrieval, latency, predictions = run_evaluation(qa_chain, retriever, eval_set, grade_prompt,
  File "/Users/helen/code/xiaomi/auto-evaluator/auto-evaluator.py", line 342, in run_evaluation
    retrieval_grade = grade_model_retrieval(gt_dataset, retrieved_docs, grade_prompt)
  File "/Users/helen/code/xiaomi/auto-evaluator/auto-evaluator.py", line 277, in grade_model_retrieval
    graded_outputs = eval_chain.evaluate(
  File "/Users/helen/code/xiaomi/auto-evaluator/venv/lib/python3.9/site-packages/langchain/evaluation/qa/eval_chain.py", line 60, in evaluate
    return self.apply(inputs)
  File "/Users/helen/code/xiaomi/auto-evaluator/venv/lib/python3.9/site-packages/langchain/chains/llm.py", line 118, in apply
    response = self.generate(input_list)
  File "/Users/helen/code/xiaomi/auto-evaluator/venv/lib/python3.9/site-packages/langchain/chains/llm.py", line 62, in generate
    return self.llm.generate_prompt(prompts, stop)
  File "/Users/helen/code/xiaomi/auto-evaluator/venv/lib/python3.9/site-packages/langchain/chat_models/base.py", line 82, in generate_prompt
    raise e
  File "/Users/helen/code/xiaomi/auto-evaluator/venv/lib/python3.9/site-packages/langchain/chat_models/base.py", line 79, in generate_prompt
    output = self.generate(prompt_messages, stop=stop)
  File "/Users/helen/code/xiaomi/auto-evaluator/venv/lib/python3.9/site-packages/langchain/chat_models/base.py", line 54, in generate
    results = [self._generate(m, stop=stop) for m in messages]
  File "/Users/helen/code/xiaomi/auto-evaluator/venv/lib/python3.9/site-packages/langchain/chat_models/base.py", line 54, in <listcomp>
    results = [self._generate(m, stop=stop) for m in messages]
  File "/Users/helen/code/xiaomi/auto-evaluator/venv/lib/python3.9/site-packages/langchain/chat_models/openai.py", line 266, in _generate
    response = self.completion_with_retry(messages=message_dicts, **params)
  File "/Users/helen/code/xiaomi/auto-evaluator/venv/lib/python3.9/site-packages/langchain/chat_models/openai.py", line 228, in completion_with_retry
    return _completion_with_retry(**kwargs)
  File "/Users/helen/code/xiaomi/auto-evaluator/venv/lib/python3.9/site-packages/tenacity/__init__.py", line 289, in wrapped_f
    return self(f, *args, **kw)
  File "/Users/helen/code/xiaomi/auto-evaluator/venv/lib/python3.9/site-packages/tenacity/__init__.py", line 379, in __call__
    do = self.iter(retry_state=retry_state)
  File "/Users/helen/code/xiaomi/auto-evaluator/venv/lib/python3.9/site-packages/tenacity/__init__.py", line 314, in iter
    return fut.result()
  File "/usr/local/Cellar/python@3.9/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/usr/local/Cellar/python@3.9/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/Users/helen/code/xiaomi/auto-evaluator/venv/lib/python3.9/site-packages/tenacity/__init__.py", line 382, in __call__
    result = fn(*args, **kwargs)
  File "/Users/helen/code/xiaomi/auto-evaluator/venv/lib/python3.9/site-packages/langchain/chat_models/openai.py", line 226, in _completion_with_retry
    return self.client.create(**kwargs)
  File "/Users/helen/code/xiaomi/auto-evaluator/venv/lib/python3.9/site-packages/openai/api_resources/chat_completion.py", line 25, in create
    return super().create(*args, **kwargs)
  File "/Users/helen/code/xiaomi/auto-evaluator/venv/lib/python3.9/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
  File "/Users/helen/code/xiaomi/auto-evaluator/venv/lib/python3.9/site-packages/openai/api_requestor.py", line 226, in request
    resp, got_stream = self._interpret_response(result, stream)
  File "/Users/helen/code/xiaomi/auto-evaluator/venv/lib/python3.9/site-packages/openai/api_requestor.py", line 619, in _interpret_response
    self._interpret_response_line(
  File "/Users/helen/code/xiaomi/auto-evaluator/venv/lib/python3.9/site-packages/openai/api_requestor.py", line 679, in _interpret_response_line
    raise self.handle_error_response(
openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 4280 tokens. Please reduce the length of the messages.
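The error is not about the PDF's file size: the grading prompt (retrieved context plus the question/answer pair) exceeds the model's 4097-token context window, so large chunks or many retrieved chunks can trigger it even for a small PDF. Reducing the chunk size or the number of retrieved chunks in the UI avoids it; alternatively, the retrieved context could be trimmed before grading. A minimal sketch of such a trim, assuming tiktoken is available (the helper name and token budget are illustrative, not part of auto-evaluator):

import tiktoken

def truncate_to_token_limit(text: str, max_tokens: int = 3000, model: str = "gpt-3.5-turbo") -> str:
    # Illustrative helper: drop tokens from the end of the text so the grading
    # prompt stays under the model's context limit.
    enc = tiktoken.encoding_for_model(model)
    tokens = enc.encode(text)
    return text if len(tokens) <= max_tokens else enc.decode(tokens[:max_tokens])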

I've passed in the openai_api_key, but I'm still getting this validation error

ValidationError: 1 validation error for ChatOpenAI __root__ Did not find openai_api_key, please add an environment variable OPENAI_API_KEY which contains it, or pass openai_api_key as a named parameter. (type=value_error)

Traceback:
  File "/Users/yujiedou/Desktop/auto-evaluator-main/venv/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
  File "/Users/yujiedou/Desktop/auto-evaluator-main/auto-evaluator.py", line 302, in <module>
    eval_set = generate_eval(text, num_eval_questions, 3000)
  File "/Users/yujiedou/Desktop/auto-evaluator-main/venv/lib/python3.9/site-packages/streamlit/runtime/caching/cache_utils.py", line 194, in wrapper
    return cached_func(*args, **kwargs)
  File "/Users/yujiedou/Desktop/auto-evaluator-main/venv/lib/python3.9/site-packages/streamlit/runtime/caching/cache_utils.py", line 223, in __call__
    return self._get_or_create_cached_value(args, kwargs)
  File "/Users/yujiedou/Desktop/auto-evaluator-main/venv/lib/python3.9/site-packages/streamlit/runtime/caching/cache_utils.py", line 248, in _get_or_create_cached_value
    return self._handle_cache_miss(cache, value_key, func_args, func_kwargs)
  File "/Users/yujiedou/Desktop/auto-evaluator-main/venv/lib/python3.9/site-packages/streamlit/runtime/caching/cache_utils.py", line 302, in _handle_cache_miss
    computed_value = self._info.func(*func_args, **func_kwargs)
  File "/Users/yujiedou/Desktop/auto-evaluator-main/auto-evaluator.py", line 81, in generate_eval
    chain = QAGenerationChain.from_llm(ChatOpenAI(temperature=0, openai_api_key=OPENAI_API_KEY))
  File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
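The pydantic validation runs when ChatOpenAI is constructed, so the key has to be a non-empty string at that point; an empty or unset sidebar field produces exactly this error even though a key was "passed in". A minimal sketch of supplying the key, with illustrative variable values, using only the APIs shown in the traceback:

import os
from langchain.chat_models import ChatOpenAI
from langchain.chains import QAGenerationChain

OPENAI_API_KEY = "sk-..."  # illustrative: whatever was entered in the sidebar; must be non-empty
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY  # also covers any call that reads the environment variable
chain = QAGenerationChain.from_llm(ChatOpenAI(temperature=0, openai_api_key=OPENAI_API_KEY))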
