Comments (12)

dimaqq commented on May 30, 2024

🤷🏿 🤔 Would that be a proxy between arsenic and the browser, or between the browser and the website?

from arsenic.

DiMiTriFrog commented on May 30, 2024

The truth is that I don't really know what would be needed, but the end result would be that with arsenic you can access any website through a proxy connection, like the "--proxy-server" option, but with support for a username and password.

DiMiTriFrog commented on May 30, 2024

At the moment I have a Docker container running the arsenic script. I configured the container to use the proxy, but arsenic doesn't load any page. I ran a test with Python requests, and the requests library does use the proxy; I don't know why arsenic isn't loading anything.

DiMiTriFrog commented on May 30, 2024

Arsenic returns a blank page -> <html><head></head><body></body></html>

dimaqq commented on May 30, 2024

I think this issue needs an MRE:
https://stackoverflow.com/help/minimal-reproducible-example

DiMiTriFrog commented on May 30, 2024

Well, I have a Dockerfile with the proxy configuration:

FROM public.ecr.aws/lambda/python:3.8
WORKDIR /app

COPY requirements.txt  .
RUN  pip3 install -r requirements.txt
COPY headless-chromium /opt/headless-chromium
RUN chmod 777 /opt/headless-chromium

ENV http_proxy http://urltoproxy:port
ENV https_proxy https://urltoproxy:port

CMD ["/app/app.handler"]

My Docker image is for Lambda use. I have two code samples (using the same Dockerfile); one works and the other doesn't. The sample that works is a simple Python request:

Code that works (requests)

import requests
ip = requests.get('https://api.ipify.org').text
print(ip)  # Prints the proxy's IP address.

Code that doesn't work (arsenic)

import asyncio
import json  # needed for json.dumps in handler
import os  # needed for os.devnull

from requests.packages.urllib3.exceptions import InsecureRequestWarning
from arsenic import get_session, stop_session
from arsenic.browsers import Chrome
from arsenic.services import Chromedriver

async def arsenic_simple():
    results_json = {}
    try:
        browser  = Chrome()
        browser.capabilities = {
            "goog:chromeOptions": {
                "binary": "/opt/headless-chromium",
                "args": [
                    "--headless",
                    "--disable-gpu",
                    "--no-sandbox",
                    "--allow-running-insecure-content",
                    "--ignore-certificate-errors",
                ],
            }
        }
        async with get_session(Chromedriver(log_file=os.devnull),browser) as session:
            await session.get('https://api.ipify.org')
            
            source_ip = await session.get_page_source()
            results_json.update({'source_ip': source_ip})
            results_json.update({'result':'Done!'})
            await session.close()
            return results_json

    except Exception as e:
        print(f"Error {e} ")
        return f"General error {str(e)} \n {results_json}"

    
def handler(event, context):
    resp = asyncio.run(arsenic_simple())
    print(resp)
    return {
        'statusCode': 200,
        'body': json.dumps(resp) 
    }

The response of get_page_source() is <html><head></head><body></body></html>, as if there were no internet connection. The container has a working proxy connection that other libraries can use to make requests, but with Arsenic I can't use the proxy.

Any idea?

I'm trying to use a Docker container with proxy configuration because I can't use a proxy with username and password authentication in arsenic.
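
One thing that may be worth trying before anything more elaborate: instead of relying on Chromium to pick up http_proxy from the environment, pass the proxy explicitly with Chrome's --proxy-server switch. The following is only a sketch under assumptions. `proxy_args` and `chrome_capabilities` are hypothetical helpers (not part of arsenic), the proxy URL must have a numeric port, and any credentials in the URL are dropped because --proxy-server does not accept a username and password.

```python
import os
from urllib.parse import urlsplit


def proxy_args(env=None):
    """Return Chrome args that pass the environment's proxy explicitly."""
    env = os.environ if env is None else env
    proxy = env.get("https_proxy") or env.get("http_proxy")
    if not proxy:
        return []
    parts = urlsplit(proxy)
    # Keep only scheme://host:port; user:password@ is intentionally dropped.
    server = f"{parts.scheme}://{parts.hostname}"
    if parts.port:
        server += f":{parts.port}"
    return [f"--proxy-server={server}"]


def chrome_capabilities(env=None):
    """Capabilities dict in the same shape as the code above."""
    return {
        "goog:chromeOptions": {
            "binary": "/opt/headless-chromium",
            "args": ["--headless", "--disable-gpu", "--no-sandbox", *proxy_args(env)],
        }
    }
```

The result can be assigned to browser.capabilities exactly as in the code above. This does not solve the authenticated-proxy case; it only removes the dependency on the environment variables.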

dimaqq commented on May 30, 2024

It seems you want to configure headless chromium to use a proxy.
I'd start with https://blog.apify.com/how-to-make-headless-chrome-and-puppeteer-use-a-proxy-server-with-authentication-249a21a79212/ or something

DiMiTriFrog commented on May 30, 2024

I'm investigating how I could implement it. I'm trying the selenium-wire approach, that is, running a local proxy with mitmproxy in upstream mode and connecting arsenic to that local proxy.
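
A rough sketch of that local-proxy idea: mitmdump listens on localhost without authentication and forwards to the authenticated upstream proxy, and Chrome is pointed at the local port. `mitmdump_command` is a hypothetical helper, the host, port and credentials are placeholders, and the flags (--mode upstream:..., --upstream-auth, --listen-port) are written as I understand them from mitmproxy's documentation, so check them against the installed version.

```python
def mitmdump_command(upstream_url, username, password, listen_port=8080):
    """Build an argv list for a local mitmdump that forwards to an upstream proxy."""
    return [
        "mitmdump",
        "--mode", f"upstream:{upstream_url}",
        "--upstream-auth", f"{username}:{password}",
        "--listen-port", str(listen_port),
    ]


def local_proxy_arg(listen_port=8080):
    """Chrome switch that points the browser at the local, unauthenticated proxy."""
    return f"--proxy-server=http://127.0.0.1:{listen_port}"
```

The list can be passed to subprocess.Popen before the arsenic session starts, and the switch added to the Chrome "args". Since mitmproxy intercepts TLS, HTTPS pages will likely need either its CA certificate trusted or the --ignore-certificate-errors flag that the code above already passes.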

dimaqq commented on May 30, 2024

My 2c: this issue can be closed.
Rationale: it's a corner case in chromium configuration; this library can't cover all of browser config, only pertinent/common flags.

DiMiTriFrog commented on May 30, 2024

I fixed the problem by using my own extension for an HTTP/HTTPS proxy with auth.
First, you need to add this flag -> "--load-extension=/path/folder_extension"

And inside folder_extension I have two files:

  • background.js

    var config = {
      mode: "fixed_servers",
      rules: {
        singleProxy: {
          scheme: "https",
          host: "HOST_PROXY",
          port: PORT
        },
        bypassList: ["localhost"]
      }
    };

    chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});

    function callbackFn(details) {
      return {
        authCredentials: {
          username: "USER",
          password: "PASSWORD"
        }
      };
    }

    chrome.webRequest.onAuthRequired.addListener(
      callbackFn,
      {urls: ["<all_urls>"]},
      ['blocking']
    );

  • manifest.json

    {
      "version": "1.0.0",
      "manifest_version": 2,
      "name": "Chrome Proxy",
      "permissions": [
        "proxy",
        "tabs",
        "unlimitedStorage",
        "storage",
        "<all_urls>",
        "webRequest",
        "webRequestBlocking"
      ],
      "background": { "scripts": ["background.js"] },
      "minimum_chrome_version": "22.0.0"
    }
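
For anyone who wants to generate that folder from Python rather than edit the placeholders by hand, here is a minimal sketch. `write_proxy_extension` is a hypothetical helper (not part of arsenic), and the host, port and credentials passed in are placeholders. It writes the same two files as above and uses json.dumps so the values are escaped correctly in the JavaScript.

```python
import json
from pathlib import Path


def write_proxy_extension(folder, host, port, username, password, scheme="https"):
    """Write manifest.json and background.js for the proxy-auth extension."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)

    manifest = {
        "version": "1.0.0",
        "manifest_version": 2,
        "name": "Chrome Proxy",
        "permissions": ["proxy", "tabs", "unlimitedStorage", "storage",
                        "<all_urls>", "webRequest", "webRequestBlocking"],
        "background": {"scripts": ["background.js"]},
        "minimum_chrome_version": "22.0.0",
    }
    config = {
        "mode": "fixed_servers",
        "rules": {
            "singleProxy": {"scheme": scheme, "host": host, "port": int(port)},
            "bypassList": ["localhost"],
        },
    }
    credentials = {"authCredentials": {"username": username, "password": password}}

    # JSON literals are valid JavaScript object literals, so they can be inlined.
    background = (
        f"var config = {json.dumps(config)};\n"
        'chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});\n'
        "chrome.webRequest.onAuthRequired.addListener(\n"
        f"  function(details) {{ return {json.dumps(credentials)}; }},\n"
        '  {urls: ["<all_urls>"]},\n'
        '  ["blocking"]\n'
        ");\n"
    )

    (folder / "manifest.json").write_text(json.dumps(manifest, indent=2))
    (folder / "background.js").write_text(background)
    return folder
```

The returned path is what goes into "--load-extension=...". Note that, as reported later in this thread, the extension did not load when Chrome was started with --headless.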

bcastane commented on May 30, 2024

Hi @DiMiTriFrog, where did you add the load-extension flag?
I have been trying here, but with no luck.

browser.capabilities = {
"goog:chromeOptions": {"args": ["--headless","--load-extension=C:/Path/folder/extension/"]}
}

Thanks!

DiMiTriFrog commented on May 30, 2024

Hi, extensions don't work in headless mode.
