
autogpt-web-interaction's Introduction

AutoGPT Web Interaction Plugin


The AutoGPT Web Interaction Plugin enables Auto-GPT to interact with websites.

Note: The plugin is very flaky on GPT-3.5, so I recommend using GPT-4. However, it can still perform basic tasks on GPT-3.5.

Key Features:

  • Allows Auto-GPT to click elements.
  • Allows Auto-GPT to type text.
  • Allows Auto-GPT to select elements.
  • Allows Auto-GPT to scroll.

Installation

Follow these steps to configure the Auto-GPT Web Interaction Plugin:

1. Clone this repository.

2. cd into the directory, and run pip install -r requirements.txt

3. Run pip install playwright, then playwright install to download the browser binaries.

4. Zip/Compress the web_interaction folder.

5. Drag the new zip file into the Auto-GPT plugins folder.

6. Set ALLOWLISTED_PLUGINS=AutoGPTWebInteraction,example-plugin1,example-plugin2,etc in your AutoGPT .env file.

7. Edit goals

When using Auto-GPT, please set one of the goals to "Remember to use the Web Interaction Plugin when possible".
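The zip step above can also be scripted. The following is a minimal sketch, not part of the plugin: the function name `zip_plugin` and both directory arguments are hypothetical, and it assumes Auto-GPT accepts an archive that contains the web_interaction folder at its top level.

```python
import shutil
from pathlib import Path


def zip_plugin(repo_dir: str, plugins_dir: str, folder: str = "web_interaction") -> Path:
    """Zip `folder` from the cloned repo into the Auto-GPT plugins directory.

    `zip_plugin` and its arguments are illustrative names, not plugin API.
    """
    source = Path(repo_dir) / folder
    if not source.is_dir():
        raise FileNotFoundError(f"Plugin folder not found: {source}")

    target_dir = Path(plugins_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    # make_archive appends ".zip" to the base name and returns the final path.
    archive = shutil.make_archive(
        str(target_dir / folder), "zip", root_dir=repo_dir, base_dir=folder
    )
    return Path(archive)
```

Calling `zip_plugin("path/to/clone", "path/to/Auto-GPT/plugins")` would write web_interaction.zip into the plugins folder; both paths are placeholders.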


autogpt-web-interaction's Issues

About Headless Browser

I see headless = false in the code. Does this plugin only support obtaining web information through a GUI-based browser?

Can I change false to true to enable headless mode?
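Playwright itself accepts a `headless` flag when launching a browser, so switching it is possible at the library level; whether the rest of the plugin behaves the same in headless mode is not confirmed here. A minimal sketch of making the flag configurable, assuming a hypothetical environment variable `WEB_INTERACTION_HEADLESS` (not an existing plugin setting):

```python
import os


def launch_options(env=None) -> dict:
    """Build keyword arguments for Playwright's browser launch call."""
    env = os.environ if env is None else env
    # WEB_INTERACTION_HEADLESS is a made-up name used only for this sketch.
    raw = env.get("WEB_INTERACTION_HEADLESS", "false")
    return {"headless": raw.strip().lower() in {"1", "true", "yes"}}


def start_browser(env=None):
    """Launch Chromium with the options above.

    Requires Playwright and its browser binaries to be installed.
    """
    from playwright.sync_api import sync_playwright  # imported lazily

    playwright = sync_playwright().start()
    browser = playwright.chromium.launch(**launch_options(env))
    return playwright, browser
```

With the variable unset, `launch_options()` keeps the GUI browser; setting it to `true` requests headless mode.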

start_browser returns API error

SYSTEM: Command start_browser returned: Error: It looks like you are using Playwright Sync API inside the asyncio loop. Please use the Async API instead.
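This error appears when Playwright's sync API is called from a thread that already has a running asyncio event loop. One possible workaround, sketched below as an assumption rather than the plugin's actual fix, is to route all sync calls through a single dedicated worker thread that has no event loop. The helper names are hypothetical, and `loop_is_running` stands in for a real Playwright call so the sketch runs without a browser.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

# One dedicated worker thread: Playwright's sync objects should be used from
# the thread that created them, so every call goes through this executor.
_browser_thread = ThreadPoolExecutor(max_workers=1, thread_name_prefix="browser")


def run_in_browser_thread(func, *args, **kwargs):
    """Run a blocking function on the worker thread and wait for its result."""
    return _browser_thread.submit(func, *args, **kwargs).result()


def loop_is_running() -> bool:
    """Report whether the calling thread has a running asyncio event loop."""
    try:
        asyncio.get_running_loop()
        return True
    except RuntimeError:
        return False


async def demo():
    inside = loop_is_running()                        # called inside the loop
    outside = run_in_browser_thread(loop_is_running)  # called on the worker
    return inside, outside
```

`asyncio.run(demo())` reports a running loop in the caller and none in the worker thread. Note that waiting on `.result()` blocks the event loop for the duration of the call; switching the plugin to Playwright's async API would avoid that.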

Feature Request

Provide a More Robust Error Handling Mechanism: Each method should have exception handling to ensure that any errors that occur during the execution of a method are appropriately caught and dealt with. This is especially important for web interactions as many things can go wrong (e.g., a page failing to load, a page layout changing so an element can't be found, network issues).
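As a rough illustration of this request, a decorator could turn exceptions into messages that Auto-GPT can read instead of letting a command crash. This is a sketch under assumptions: `report_errors` is a hypothetical name, the `go_to_website` body is a stand-in rather than the plugin's code, and Playwright raises its own timeout exception type, which would need its own handler.

```python
import functools


def report_errors(command):
    """Wrap a command so failures come back as a message instead of a crash."""

    @functools.wraps(command)
    def wrapper(*args, **kwargs):
        try:
            return command(*args, **kwargs)
        except TimeoutError as exc:
            return f"Error: {command.__name__} timed out: {exc}"
        except Exception as exc:  # last resort; narrower handlers belong above
            return f"Error: {command.__name__} failed: {type(exc).__name__}: {exc}"

    return wrapper


@report_errors
def go_to_website(url: str) -> str:
    # Stand-in body; a real command would drive the browser here.
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"unsupported URL: {url}")
    return f"Navigated to {url}"
```

The message names the command and the exception type, which gives the model more to act on than a generic "please try again".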

Add Support for More Web Interaction Functions: The current implementation provides basic functions like clicking, typing, and navigation. You could extend these functionalities to handle more complex interactions such as dragging and dropping elements, dealing with pop-ups, or handling iframes.

State Management: A session management system could be beneficial to allow users to pause and resume their browsing session. This would involve storing the current state of the browser and restoring it when the user continues.

Better Image Handling: Currently, images are rendered as their alt text. You might want to consider implementing a feature that can perform image analysis or OCR (Optical Character Recognition) to provide more information about images when necessary.

Advanced DOM Navigation and Interaction: Add methods to interact with complex elements like sliders, dropdowns, or menus. Furthermore, instead of using a simplified DOM, consider using a full-fledged headless browser, which can provide more realistic interaction and better JavaScript execution.

Support for Asynchronous Actions: Some actions on the web are asynchronous, such as AJAX requests. Your plugin could have some support to handle these situations, like a "wait for element" feature.
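Playwright already provides `page.wait_for_selector` for this purpose. The underlying idea can also be sketched as a generic polling helper; `wait_for` below is a hypothetical function, not part of the plugin or of Playwright.

```python
import time


def wait_for(condition, timeout: float = 5.0, interval: float = 0.1):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(interval)
```

A command could pass a callable that looks up an element and returns it once present, so the caller receives either the element or a `TimeoutError`.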

Customizable User-Agent: Allow the user-agent string to be set by the user, to mimic different browsers or devices.

Proxy Support: Some users might want to route their requests through a proxy. Providing support for this can be helpful.
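Both of these requests map onto options Playwright already exposes: `launch` accepts a `proxy` dictionary with a `server` entry, and `new_context` accepts a `user_agent` string. A minimal sketch of collecting them, assuming a hypothetical helper `browser_settings` that the plugin does not currently have:

```python
from typing import Optional, Tuple


def browser_settings(
    user_agent: Optional[str] = None, proxy_server: Optional[str] = None
) -> Tuple[dict, dict]:
    """Return (launch_kwargs, context_kwargs) for Playwright.

    Intended use, assuming `playwright` is a started Playwright instance:
        browser = playwright.chromium.launch(**launch_kwargs)
        context = browser.new_context(**context_kwargs)
    """
    launch_kwargs: dict = {}
    context_kwargs: dict = {}
    if proxy_server:
        launch_kwargs["proxy"] = {"server": proxy_server}
    if user_agent:
        context_kwargs["user_agent"] = user_agent
    return launch_kwargs, context_kwargs
```

Leaving both arguments unset returns two empty dictionaries, so the default browser behavior is unchanged.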

It never gets past trying to install Playwright and then Docker

AutoGPT version 0.30
Linux Mint
No Docker
Local memory

I have followed your instructions regarding how to install it.

I set up the following goals:

ai_goals:

  • Log in to the Wordpress Dashboard at https://website/wp-login.php with username "username" and password "password"
  • Add a new post about AutoGPT that is SEO optimized for the word "autogpt".
  • Publish the post.
  • Send an email to hello@website with a report on what you did.
  • Remember to use the Web Interaction Plugin possible.
  • Terminate.

ai_name: SEOEditorGPT
ai_role: an AI WordPress SEO Editor that helps publishing SEO-optimized content on a website.
api_budget: 0.0

I installed Playwright before running AutoGPT. I gave it feedback that Playwright is already installed.

It never understands that playwright is already installed.

Unknown command "go_to_website"

Hi there,

I'm not sure why, but the command you showed in the screenshot doesn't seem to be recognized, even though I cloned the repo, zipped it, put it in my plugins folder, and allowed it on my plugins list.

Any ideas?

Thank you

logs:

Input:no, please use command "go_to_website"
SYSTEM: Human feedback: no, please use command "go_to_website"
THOUGHTS: To get the URLs for popular videos on Youtube, I will use the get_hyperlinks command on the Youtube homepage.
REASONING: The Youtube homepage is likely to contain links to some of the most popular videos on the platform, so using get_hyperlinks on this page will allow me to extract those URLs.
PLAN:

  • Use the go_to_website command to navigate to the Youtube homepage, and then use the get_hyperlinks command to extract the URLs for some popular videos.

CRITICISM: Some of the URLs returned by get_hyperlinks may not actually lead to videos, so I must be sure to filter out any unrelated links.
NEXT ACTION: COMMAND = go_to_website ARGUMENTS = {'url': 'https://www.youtube.com/'}
Enter 'y' to authorise command, 'y -N' to run N continuous commands, 's' to run self-feedback commands'n' to exit program, or enter feedback for ...
Asking user via keyboard...
Input:y

use playwright sync api inside

Error using plugin

REASONING:  Using the web_interaction plugin and API calls will allow us to gather the necessary information to develop an effective marketing plan and engage with potential customers on social media.
PLAN:
-  Use 'start_browser' command to open a browser window for web interaction
-  Use 'go_to_website' command to navigate to the business's website and gather information on competitors' websites
-  Use 'make_api_call' command to gather data on the target market and competitors
-  Develop a comprehensive marketing plan that highlights the unique value proposition of the business and targets the ideal customer base
-  Use 'post_tweet' command to share marketing messages and engage with potential customers on social media
CRITICISM:  We need to ensure that the information we gather is accurate and relevant to the business's target market. We also need to be careful not to overwhelm potential customers with too many marketing messages.
NEXT ACTION:  COMMAND = start_browser ARGUMENTS = {}
  Enter 'y' to authorise command, 'y -N' to run N continuous commands, 's' to run self-feedback commands, 'n' to exit program, or enter feedback for ...
  Asking user via keyboard...
Input:y
-=-=-=-=-=-=-= COMMAND AUTHORISED BY USER -=-=-=-=-=-=-=
SYSTEM:  Command start_browser returned: Error: It looks like you are using Playwright Sync API inside the asyncio loop. Please use the Async API instead.

ModuleNotFoundError: No module named 'playwright'

ModuleNotFoundError: No module named 'playwright'

So I posted this error to ChatGPT, and it said I need to install the 'playwright' module using pip, the Python package installer.

"pip install playwright"
"playwright install"

I did that and got the same error. I told ChatGPT, and it said:

"My apologies for the confusion, as an AI model, I'm unable to execute Python code or directly interact with your Python environment to check if the module is installed correctly. However, you can check if the Playwright module is installed correctly by running the following command in your Python environment:

python -c "import playwright"

If the command does not output any message, it means that the Playwright module is correctly installed. If it outputs an error message like ModuleNotFoundError: No module named 'playwright', it means that the Playwright module is not installed correctly."

The command does not output any messages, but I'm still getting the same error.
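A common cause of this symptom is that pip installed the module into a different Python environment than the one running Auto-GPT; that is an assumption about this report, not a confirmed diagnosis. A small sketch for checking which interpreter is in use and whether it can see the module (the function names are hypothetical):

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if the interpreter running this code can import `name`."""
    return importlib.util.find_spec(name) is not None


def describe_environment(module: str = "playwright") -> str:
    """Summarise the interpreter path and whether `module` is visible to it."""
    status = "found" if module_available(module) else "NOT found"
    return f"{sys.executable}: module '{module}' {status}"
```

Running `print(describe_environment())` from the same environment that launches Auto-GPT shows the interpreter path. If the module is missing there, installing with that interpreter (`python -m pip install playwright`) targets the right environment.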

url for website to visit

It doesn't seem to want to get an actual URL. I didn't specify one (maybe I can try that too), but it should work without one, for example by going to Google and searching.

Security Hardening

I am interested in branching to experiment with some ideas to harden the code and preventing exposure to potential security issues.

Possible security issues I am interested in trying to harden:

  1. Lack of Input Validation: The code does not validate user inputs, such as the url, id, and text parameters. This can lead to various security vulnerabilities, including URL manipulation, injection attacks, and cross-site scripting (XSS) attacks. It is important to validate and sanitize user inputs before using them in the code.

  2. Error Handling: The code uses a broad exception handling block with a generic except statement, which catches all exceptions without providing specific error messages. This can make it difficult to identify and handle specific errors, and it can also expose sensitive information in error messages. I think this was identified in another issue thread.

  3. Code Execution from User Input: The code uses the evaluate method to execute JavaScript code passed as strings. If user-supplied input is directly used in these evaluated JavaScript snippets, it can lead to code injection vulnerabilities. It is crucial to validate and sanitize user inputs before executing them as code.

  4. Use of Global Variables: The code uses global variables (browser, page, client, page_element_buffer) to store state information. Using global variables can make the code more error-prone, harder to maintain, and vulnerable to potential race conditions in a multi-threaded environment. It is recommended to use local variables or encapsulate the state in a more controlled manner.

  5. Lack of Content Security Policy (CSP): The code does not implement or enforce a Content Security Policy. CSP helps prevent various types of attacks, such as XSS and data injection, by restricting the sources from which certain types of content (e.g., scripts, stylesheets) can be loaded. Implementing a strong CSP can enhance the security of the application.

  6. Potential Clickjacking Vulnerability: The code removes the target attribute from all <a> elements on the page using injected JavaScript. This can potentially introduce a clickjacking vulnerability, where an attacker tricks users into clicking on a hidden or disguised element by overlaying it with a malicious element. It is advisable to use other methods, such as adding the rel="noopener" attribute, to improve the security of links instead of removing the target attribute.

  7. Blacklisted Elements: The code maintains a set of blacklisted elements and skips processing them. However, the list of blacklisted elements is limited and may not cover all potentially dangerous elements. It is recommended to use a whitelist-based approach instead, where only known safe elements are allowed, to mitigate potential security risks.
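As an illustration of item 1, URL input could be checked against an allowlist of schemes before the browser navigates. This is a sketch under assumptions: `validate_url` and `ALLOWED_SCHEMES` are hypothetical names, and the plugin may need a stricter or looser policy.

```python
from urllib.parse import urlsplit

# Only plain web schemes are accepted in this sketch.
ALLOWED_SCHEMES = {"http", "https"}


def validate_url(url: str) -> str:
    """Return the stripped URL if it looks acceptable, else raise ValueError."""
    candidate = url.strip()
    parts = urlsplit(candidate)
    if parts.scheme.lower() not in ALLOWED_SCHEMES:
        raise ValueError(f"scheme not allowed: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("URL has no host")
    return candidate
```

This rejects inputs such as `javascript:alert(1)` or `file:///etc/passwd` while accepting ordinary http and https addresses. For item 3, Playwright's `page.evaluate` accepts a separate argument value, which avoids building JavaScript source by string concatenation.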

Error: Cannot execute 'web_interaction': unknown command.

SYSTEM: Command web_interaction returned: Error: Cannot execute 'web_interaction': unknown command. Do not try to use this command again.

Keep getting this error.

  1. web_interaction.zip is in my plugins folder.
  2. My .env file reads:
    `## PLUGINS_CONFIG_FILE - The path to the plugins_config.yaml file, relative to the Auto-GPT root directory. (Default plugins_config.yaml)
    PLUGINS_CONFIG_FILE=plugins_config.yaml
    ALLOWLISTED_PLUGINS=AutoGPTWebInteraction`

I have not created a plugins_config.yaml, since every time I create it I get the message:
plugins_config.yaml does not exist, creating base config.

Also it says on AutoGPT plugins readme that:
Screenshot 2023-08-09 at 7 08 50 PM

go_to_website after start_browser results in RuntimeError

File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/base_events.py", line 584, in _check_running
    raise RuntimeError('This event loop is already running')
RuntimeError: This event loop is already running
sys:1: RuntimeWarning: coroutine 'Application.run_async' was never awaited

[BUG] AutoGPTWebInteraction Abstract class not found error

To run AutoGPTWebInteraction, you need to add the following methods to the AutoGPTWebInteraction class in the plugin's __init__.py file.

`def can_handle_text_embedding(self, text: str) -> bool:
    return False

def handle_text_embedding(self, text: str) -> list:
    pass

def can_handle_user_input(self, user_input: str) -> bool:
    return False

def user_input(self, user_input: str) -> str:
    return user_input

def can_handle_report(self) -> bool:
    return False

def report(self, message: str) -> None:
    pass`
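The reason these stubs matter can be shown with a small, self-contained sketch. Python refuses to instantiate a class that leaves any abstract method of its base class unimplemented. `PluginTemplate` below is a hypothetical stand-in for the Auto-GPT plugin base class, reduced to two methods for brevity.

```python
from abc import ABC, abstractmethod


class PluginTemplate(ABC):
    """Hypothetical stand-in for the Auto-GPT plugin base class."""

    @abstractmethod
    def can_handle_report(self) -> bool: ...

    @abstractmethod
    def report(self, message: str) -> None: ...


class IncompletePlugin(PluginTemplate):
    def can_handle_report(self) -> bool:
        return False
    # report() is missing, so this class is still abstract.


class CompletePlugin(IncompletePlugin):
    def report(self, message: str) -> None:
        pass
```

`IncompletePlugin()` raises `TypeError`, while `CompletePlugin()` instantiates normally. When a newer base class adds abstract methods, an older plugin fails in the same way until matching stubs are added.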

'go_to_website' not working

SYSTEM: Command go_to_website returned: Failed to go to url, please try again and make sure the url is correct.

I manually checked that the URL is correct.

Please update to support new version of Auto-GPT

Please update the plugin and documentation. Thank you.

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/app/autogpt/__main__.py", line 5, in <module>
    autogpt.cli.main()
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1635, in invoke
    rv = super().invoke(ctx)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/app/autogpt/cli.py", line 96, in main
    run_auto_gpt(
  File "/app/autogpt/main.py", line 124, in run_auto_gpt
    config.plugins = scan_plugins(config, config.debug_mode)
  File "/app/autogpt/plugins/__init__.py", line 270, in scan_plugins
    plugin_enabled = plugins_config.is_enabled(plugin_name)
AttributeError: 'dict' object has no attribute 'is_enabled'

Use ASync

If you start the browser, but it somehow closes, you can't start a new one.

SYSTEM: Command start_browser returned: Error: It looks like you are using Playwright Sync API inside the asyncio loop. Please use the Async API instead.
