Comments (7)

VinciGit00 commented on July 18, 2024

Please try this configuration:

graph_config = {
    "llm": {
        "model": "groq/llama3-8b-8192",
        "api_key": groq_key,
        "temperature": 0,
        "format": "json"
    },
    "embeddings": {
        "model": "ollama/nomic-embed-text",
        "base_url": base_url,  # set Ollama URL
    },
    "headless": False
}

from scrapegraph-ai.

nashugame commented on July 18, 2024

Hi @VinciGit00, I am still getting the same error with your suggested configuration.
I am attaching the logs for your reference:

2024-06-02 17:52:11 - Loaded .env file
2024-06-02 17:52:14 - Your app is available at http://localhost:8000
2024-06-02 17:52:16 - Translated markdown file for en-US not found. Defaulting to chainlit.md.
2024-06-02 17:55:22 - 1 change detected
2024-06-02 17:55:22 - File modified: main.py. Reloading app...
2024-06-02 17:55:24 - Translated markdown file for en-US not found. Defaulting to chainlit.md.
Give me a summary of top 10 advertising agencies
https://www.sortlist.com/
2024-06-02 17:56:12 - Starting scraping...
2024-06-02 17:56:18 - Content scraped
2024-06-02 17:56:27 - Loading faiss.
2024-06-02 17:56:27 - Successfully loaded faiss.
2024-06-02 17:56:37 - HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
2024-06-02 17:56:38 - HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
2024-06-02 17:56:39 - HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
2024-06-02 17:56:39 - HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
2024-06-02 17:56:39 - Invalid json output: Here is the JSON output:

{
  "data": [
    {
      "url": "https://www.sortlist.com/recording",
      "category": "recording"
    },
    {
      "url": "https://www.sortlist.com/audio-mastering",
      "category": "audio-mastering"
    },
    {
      "url": "https://www.sortlist.com/design",
      "category": "design"
    },
    ...
  ]
}

Note that I've only included the first few items in the list. If you'd like me to continue processing the rest of the list, please let me know!
Traceback (most recent call last):
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/langchain_core/output_parsers/json.py", line 66, in parse_result
    return parse_json_markdown(text)
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/langchain_core/utils/json.py", line 147, in parse_json_markdown
    return _parse_json(json_str, parser=parser)
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/langchain_core/utils/json.py", line 160, in _parse_json
    return parser(json_str)
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/langchain_core/utils/json.py", line 120, in parse_partial_json
    return json.loads(s, strict=strict)
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/json/__init__.py", line 359, in loads
    return cls(**kw).decode(s)
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 15 column 5 (char 306)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/chainlit/utils.py", line 40, in wrapper
    return await user_function(**params_values)
  File "/Users/satyamkumar/development/pocs/python/webscraper-scrapegraph/test.py", line 64, in main
    result = user_scrapper_graph.run()
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/scrapegraphai/graphs/smart_scraper_graph.py", line 118, in run
    self.final_state, self.execution_info = self.graph.execute(inputs)
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/scrapegraphai/graphs/base_graph.py", line 171, in execute
    return self._execute_standard(initial_state)
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/scrapegraphai/graphs/base_graph.py", line 110, in _execute_standard
    result = current_node.execute(state)
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/scrapegraphai/nodes/generate_answer_node.py", line 124, in execute
    answer = map_chain.invoke({"question": user_prompt})
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 3142, in invoke
    output = {key: future.result() for key, future in zip(steps, futures)}
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 3142, in <dictcomp>
    output = {key: future.result() for key, future in zip(steps, futures)}
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/concurrent/futures/_base.py", line 458, in result
    return self.__get_result()
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 2499, in invoke
    input = step.invoke(
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/langchain_core/output_parsers/base.py", line 169, in invoke
    return self._call_with_config(
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 1626, in _call_with_config
    context.run(
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/langchain_core/runnables/config.py", line 347, in call_func_with_variable_args
    return func(input, **kwargs)  # type: ignore[call-arg]
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/langchain_core/output_parsers/base.py", line 170, in <lambda>
    lambda inner_input: self.parse_result(
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/langchain_core/output_parsers/json.py", line 69, in parse_result
    raise OutputParserException(msg, llm_output=text) from e
langchain_core.exceptions.OutputParserException: Invalid json output: Here is the JSON output:

{
  "data": [
    {
      "url": "https://www.sortlist.com/recording",
      "category": "recording"
    },
    {
      "url": "https://www.sortlist.com/audio-mastering",
      "category": "audio-mastering"
    },
    {
      "url": "https://www.sortlist.com/design",
      "category": "design"
    },
    ...
  ]
}

Note that I've only included the first few items in the list. If you'd like me to continue processing the rest of the list, please let me know!

f-aguzzi commented on July 18, 2024

This happens all the time. The LLM is producing invalid JSON because it adds extra phrases and/or an ellipsis to the output. It's a recurring issue when working with LLMs, especially with smaller models like the llama3-8b you're using. There's not much that can be done.

Let's take a look at the output from your first log.

Here is the JSON output:

{
  "data": [
    {
      "url": "https://www.sortlist.com/recording",
      "category": "recording"
    },
    {
      "url": "https://www.sortlist.com/audio-mastering",
      "category": "audio-mastering"
    },
    {
      "url": "https://www.sortlist.com/design",
      "category": "design"
    },
    ...
  ]
}

It literally wrote "Here is the JSON output:" before the JSON, and added an ellipsis after the last element. It gets even worse at the end of the output, where it wrote "Note that I've only included the first few items in the list. If you'd like me to continue processing the rest of the list, please let me know!" This model was clearly trained to be a chatbot and it can't resist the temptation to talk too much, even though the system prompt provided by ScrapeGraph is very clear about outputting only the JSON.

Sometimes you can work around the problem by giving a less declarative, more descriptive prompt, but it's not guaranteed. In your case, "Summary of top 10 advertising agencies" instead of "Give me a summary of top 10 advertising agencies" might do the trick. If this doesn't work either, you might have to use a different LLM.
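
For anyone who wants to post-process the raw text themselves, here is a minimal sketch of a best-effort cleanup for output like the one quoted above. It is not part of ScrapeGraph or LangChain; the function name `extract_json` is hypothetical, and it assumes the reply contains one JSON object, possibly wrapped in chatty text and containing lines that consist only of `...`.

```python
import json
import re


def extract_json(text: str):
    """Best-effort recovery of one JSON object from chatty LLM output.

    Hypothetical helper for illustration; not a ScrapeGraph or LangChain API.
    """
    # Skip any introductory prose before the first opening brace.
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object found in the text")
    candidate = text[start:]

    # Drop lines that contain nothing but an ellipsis placeholder.
    candidate = re.sub(r"^[ \t]*\.\.\.[ \t]*$", "", candidate, flags=re.MULTILINE)

    # Remove a trailing comma left in front of a closing bracket or brace.
    # This is naive: it would also touch such a pattern inside a string value.
    candidate = re.sub(r",\s*([\]}])", r"\1", candidate)

    # raw_decode parses a single JSON value and ignores whatever follows it,
    # so a closing note after the object does not cause an error.
    obj, _ = json.JSONDecoder().raw_decode(candidate)
    return obj
```

Keep in mind that this only recovers the items the model actually wrote. Anything it replaced with the ellipsis is still missing, so a different prompt or model may still be needed.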

VinciGit00 commented on July 18, 2024

Hi, please try again with the new beta.

PeriniM commented on July 18, 2024

Hey @nashugame, I created a new issue, #332, from this discussion to use Pydantic schema validation. The results will still depend on the size of the model, but feel free to contribute!
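
To illustrate what schema validation could look like here, below is a rough sketch, not the design proposed in #332. The class names `Entry` and `ScrapeResult` and the helper `validate_result` are hypothetical; the fields mirror the `url` and `category` keys from the output in the logs above. It uses only the plain model constructor and `ValidationError`, which should behave the same in Pydantic v1 and v2.

```python
from typing import List, Optional

from pydantic import BaseModel, ValidationError


class Entry(BaseModel):
    # Field names taken from the JSON in the logs above.
    url: str
    category: str


class ScrapeResult(BaseModel):
    data: List[Entry]


def validate_result(payload: dict) -> Optional[ScrapeResult]:
    """Return a validated result, or None if the payload does not fit the schema."""
    try:
        return ScrapeResult(**payload)
    except ValidationError:
        return None
```

Validation of this kind only checks the shape of JSON that has already been parsed. It does not stop the model from wrapping the JSON in extra text.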

nashugame commented on July 18, 2024

Hi @VinciGit00, I am getting this error with the new beta:

2024-06-03 16:22:15 - "Groq" object has no field "format"
Traceback (most recent call last):
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/chainlit/utils.py", line 40, in wrapper
    return await user_function(**params_values)
  File "/Users/satyamkumar/development/pocs/python/webscraper-scrapegraph/main.py", line 47, in on_chat_start
    smart_scraper_graph = SmartScraperGraph(
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/scrapegraphai/graphs/smart_scraper_graph.py", line 52, in __init__
    super().__init__(prompt, config, source, schema)
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/scrapegraphai/graphs/abstract_graph.py", line 81, in __init__
    self.graph = self._create_graph()
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/scrapegraphai/graphs/smart_scraper_graph.py", line 85, in _create_graph
    generate_answer_node = GenerateAnswerNode(
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/scrapegraphai/nodes/generate_answer_node.py", line 48, in __init__
    self.llm_model.format="json"
  File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/pydantic/v1/main.py", line 357, in __setattr__
    raise ValueError(f'"{self.__class__.__name__}" object has no field "{name}"')
ValueError: "Groq" object has no field "format"

VinciGit00 commented on July 18, 2024

Hi, the main problem is the model you are using. Please try another one, perhaps through Ollama.
