Comments (7)
please try this configuration:
graph_config = {
    "llm": {
        "model": "groq/llama3-8b-8192",
        "api_key": groq_key,
        "temperature": 0,
        "format": "json"
    },
    "embeddings": {
        "model": "ollama/nomic-embed-text",
        "base_url": base_url,  # set Ollama URL
    },
    "headless": False
}
from scrapegraph-ai.
Hi @VinciGit00, I am still getting the same error with your suggested configuration.
I am attaching the logs for your reference:
2024-06-02 17:52:11 - Loaded .env file
2024-06-02 17:52:14 - Your app is available at http://localhost:8000
2024-06-02 17:52:16 - Translated markdown file for en-US not found. Defaulting to chainlit.md.
2024-06-02 17:55:22 - 1 change detected
2024-06-02 17:55:22 - File modified: main.py. Reloading app...
2024-06-02 17:55:24 - Translated markdown file for en-US not found. Defaulting to chainlit.md.
Give me a summary of top 10 advertising agencies
https://www.sortlist.com/
2024-06-02 17:56:12 - Starting scraping...
2024-06-02 17:56:18 - Content scraped
2024-06-02 17:56:27 - Loading faiss.
2024-06-02 17:56:27 - Successfully loaded faiss.
2024-06-02 17:56:37 - HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
2024-06-02 17:56:38 - HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
2024-06-02 17:56:39 - HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
2024-06-02 17:56:39 - HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
2024-06-02 17:56:39 - Invalid json output: Here is the JSON output:
{
"data": [
{
"url": "https://www.sortlist.com/recording",
"category": "recording"
},
{
"url": "https://www.sortlist.com/audio-mastering",
"category": "audio-mastering"
},
{
"url": "https://www.sortlist.com/design",
"category": "design"
},
...
]
}
Note that I've only included the first few items in the list. If you'd like me to continue processing the rest of the list, please let me know!
Traceback (most recent call last):
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/langchain_core/output_parsers/json.py", line 66, in parse_result
return parse_json_markdown(text)
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/langchain_core/utils/json.py", line 147, in parse_json_markdown
return _parse_json(json_str, parser=parser)
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/langchain_core/utils/json.py", line 160, in _parse_json
return parser(json_str)
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/langchain_core/utils/json.py", line 120, in parse_partial_json
return json.loads(s, strict=strict)
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/json/__init__.py", line 359, in loads
return cls(**kw).decode(s)
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 15 column 5 (char 306)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/chainlit/utils.py", line 40, in wrapper
return await user_function(**params_values)
File "/Users/satyamkumar/development/pocs/python/webscraper-scrapegraph/test.py", line 64, in main
result = user_scrapper_graph.run()
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/scrapegraphai/graphs/smart_scraper_graph.py", line 118, in run
self.final_state, self.execution_info = self.graph.execute(inputs)
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/scrapegraphai/graphs/base_graph.py", line 171, in execute
return self._execute_standard(initial_state)
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/scrapegraphai/graphs/base_graph.py", line 110, in _execute_standard
result = current_node.execute(state)
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/scrapegraphai/nodes/generate_answer_node.py", line 124, in execute
answer = map_chain.invoke({"question": user_prompt})
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 3142, in invoke
output = {key: future.result() for key, future in zip(steps, futures)}
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 3142, in <dictcomp>
output = {key: future.result() for key, future in zip(steps, futures)}
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/concurrent/futures/_base.py", line 458, in result
return self.__get_result()
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 2499, in invoke
input = step.invoke(
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/langchain_core/output_parsers/base.py", line 169, in invoke
return self._call_with_config(
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 1626, in _call_with_config
context.run(
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/langchain_core/runnables/config.py", line 347, in call_func_with_variable_args
return func(input, **kwargs) # type: ignore[call-arg]
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/langchain_core/output_parsers/base.py", line 170, in <lambda>
lambda inner_input: self.parse_result(
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/langchain_core/output_parsers/json.py", line 69, in parse_result
raise OutputParserException(msg, llm_output=text) from e
langchain_core.exceptions.OutputParserException: Invalid json output: Here is the JSON output:
{
"data": [
{
"url": "https://www.sortlist.com/recording",
"category": "recording"
},
{
"url": "https://www.sortlist.com/audio-mastering",
"category": "audio-mastering"
},
{
"url": "https://www.sortlist.com/design",
"category": "design"
},
...
]
}
Note that I've only included the first few items in the list. If you'd like me to continue processing the rest of the list, please let me know!
This happens all the time. The LLM is producing invalid JSON because it adds phrases and/or ellipses inside the output. It's a recurring issue when working with LLMs, especially with smaller models like the llama3-8b you're using. There's not much that can be done.
Let's take a look at the output from your first log.
Here is the JSON output:
{
"data": [
{
"url": "https://www.sortlist.com/recording",
"category": "recording"
},
{
"url": "https://www.sortlist.com/audio-mastering",
"category": "audio-mastering"
},
{
"url": "https://www.sortlist.com/design",
"category": "design"
},
...
]
}
It literally wrote "Here is the JSON output:" in front of the JSON, and added an ellipsis after the last element. You can see something even worse in the second output, too, where it wrote "Note that I've only included the first few items in the list. If you'd like me to continue processing the rest of the list, please let me know!" at the end. This model was clearly trained to be a chatbot and it can't resist the temptation to talk too much, even though the system prompt provided by ScrapeGraph is very clear about outputting only the JSON.
Sometimes you can work around the problem by giving a less imperative, more descriptive prompt, but it's not guaranteed. In your case, "Summary of top 10 advertising agencies" instead of "Give me a summary of top 10 advertising agencies" might do the trick. If this doesn't work either, you might have to use a different LLM.
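To illustrate the failure described above, here is a minimal sketch of a best-effort cleanup step, assuming the raw model text is available before parsing. It cuts away everything outside the outermost braces and drops lines that contain only an ellipsis. The helper name extract_json_object and the sample string are hypothetical and are not part of ScrapeGraphAI or LangChain. It cannot bring back the items the model left out, so it is a workaround rather than a fix.

```python
import json
import re


def extract_json_object(raw: str) -> dict:
    """Best-effort recovery of a JSON object from chatty LLM output.

    Hypothetical helper for illustration only.
    """
    # Keep only the text between the first "{" and the last "}".
    start = raw.find("{")
    end = raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    candidate = raw[start:end + 1]

    # Drop lines that consist solely of an ellipsis placeholder.
    kept = [
        line for line in candidate.splitlines()
        if line.strip() not in ("...", "…")
    ]
    cleaned = "\n".join(kept)

    # Removing the placeholder can leave a trailing comma before "]" or "}".
    # Caveat: this regex is naive and would also touch such a sequence
    # if it appeared inside a string value.
    cleaned = re.sub(r",(\s*[\]}])", r"\1", cleaned)

    return json.loads(cleaned)


# Sample shaped like the output in the log above (shortened).
sample = (
    "Here is the JSON output:\n"
    "{\n"
    '"data": [\n'
    "{\n"
    '"url": "https://www.sortlist.com/recording",\n'
    '"category": "recording"\n'
    "},\n"
    "...\n"
    "]\n"
    "}\n"
    "Note that I've only included the first few items in the list."
)

result = extract_json_object(sample)
print(result["data"][0]["category"])
```

If the model wraps its answer in other ways (several JSON objects, braces in the surrounding chatter), this sketch will fail or return the wrong span, so switching to a different model remains the more reliable option.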
Hi, please try with the new beta
Hey, @nashugame created a new issue (#332) from the discussion, proposing Pydantic schema validation. How well it works will still depend on the size of the model, but feel free to contribute!
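For readers unfamiliar with the idea, the sketch below shows what checking a parsed answer against a schema could look like. It is not the design from issue #332 and it does not use Pydantic: it is a standard-library stand-in using a dataclass and manual checks, which a Pydantic model would replace. The names CategoryLink and validate_links are hypothetical, and the expected shape ("data" list of objects with "url" and "category") is taken from the log earlier in this thread.

```python
from dataclasses import dataclass


@dataclass
class CategoryLink:
    """One entry of the "data" list seen in the log (hypothetical name)."""
    url: str
    category: str


def validate_links(payload: object) -> list[CategoryLink]:
    """Check that payload looks like {"data": [{"url": str, "category": str}, ...]}.

    Raises ValueError with a short message on the first problem found.
    """
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object at the top level")
    items = payload.get("data")
    if not isinstance(items, list):
        raise ValueError('expected a "data" list')

    links = []
    for index, item in enumerate(items):
        if not isinstance(item, dict):
            raise ValueError(f"item {index}: expected an object")
        url = item.get("url")
        category = item.get("category")
        if not isinstance(url, str) or not isinstance(category, str):
            raise ValueError(f'item {index}: "url" and "category" must be strings')
        links.append(CategoryLink(url=url, category=category))
    return links


parsed = {
    "data": [
        {"url": "https://www.sortlist.com/recording", "category": "recording"},
        {"url": "https://www.sortlist.com/design", "category": "design"},
    ]
}
links = validate_links(parsed)
print(len(links), links[0].category)
```

Validation of this kind only catches answers that parse but have the wrong shape. It does not help when the model output is not valid JSON in the first place.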
Hi @VinciGit00, I'm getting this with the new beta:
2024-06-03 16:22:15 - "Groq" object has no field "format"
Traceback (most recent call last):
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/chainlit/utils.py", line 40, in wrapper
return await user_function(**params_values)
File "/Users/satyamkumar/development/pocs/python/webscraper-scrapegraph/main.py", line 47, in on_chat_start
smart_scraper_graph = SmartScraperGraph(
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/scrapegraphai/graphs/smart_scraper_graph.py", line 52, in __init__
super().__init__(prompt, config, source, schema)
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/scrapegraphai/graphs/abstract_graph.py", line 81, in __init__
self.graph = self._create_graph()
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/scrapegraphai/graphs/smart_scraper_graph.py", line 85, in _create_graph
generate_answer_node = GenerateAnswerNode(
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/scrapegraphai/nodes/generate_answer_node.py", line 48, in __init__
self.llm_model.format="json"
File "/opt/miniconda3/envs/source-x-ai/lib/python3.10/site-packages/pydantic/v1/main.py", line 357, in __setattr__
raise ValueError(f'"{self.__class__.__name__}" object has no field "{name}"')
ValueError: "Groq" object has no field "format"
Hi, the main problem is the model you are using. Please try another one, maybe with Ollama.