Comments (12)
Would that be a proxy between arsenic and the browser, or between the browser and the website?
from arsenic.
The truth is that I don't really know what would be needed, but the end result would be that arsenic can access any website through a proxy connection, like the "--proxy-server=" option but with support for a username and password.
At the moment I have a Docker image with the arsenic script. I configured the container to use the proxy, but arsenic doesn't load any website. I ran a test with Python requests and the requests library does use the proxy, so I don't know why arsenic is not loading anything.
Arsenic returns a blank page -> <html><head></head><body></body></html>
I think this issue needs a MRE
https://stackoverflow.com/help/minimal-reproducible-example
Well, I have a Dockerfile with proxy conf:
FROM public.ecr.aws/lambda/python:3.8
WORKDIR /app
COPY requirements.txt .
RUN pip3 install -r requirements.txt
COPY headless-chromium /opt/headless-chromium
RUN chmod 777 /opt/headless-chromium
ENV http_proxy http://urltoproxy:port
ENV https_proxy https://urltoproxy:port
CMD ["/app/app.handler"]
My Docker image is for Lambda use. I have two code samples (using the same Dockerfile); one works and the other doesn't. The sample that works is a simple request with Python requests:
Code that works (requests)
import requests
ip = requests.get('https://api.ipify.org').text
print(ip)  # Returns the proxy's IP.
Code that doesn't work (arsenic)
import asyncio
import json
import os

from requests.packages.urllib3.exceptions import InsecureRequestWarning
from arsenic import get_session, stop_session
from arsenic.browsers import Chrome
from arsenic.services import Chromedriver


async def arsenic_simple():
    results_json = {}
    try:
        browser = Chrome()
        # Point Chromedriver at the bundled headless Chromium binary.
        browser.capabilities = {
            "goog:chromeOptions": {
                "binary": "/opt/headless-chromium",
                "args": [
                    "--headless",
                    "--disable-gpu",
                    "--no-sandbox",
                    "--allow-running-insecure-content",
                    "--ignore-certificate-errors",
                ],
            }
        }
        async with get_session(Chromedriver(log_file=os.devnull), browser) as session:
            await session.get('https://api.ipify.org')
            # With a working proxy this page should contain the proxy's IP.
            source_ip = await session.get_page_source()
            results_json.update({'source_ip': source_ip})
            results_json.update({'result': 'Done!'})
            await session.close()
            return results_json
    except Exception as e:
        print(f"Error {e} ")
        return f"General error {str(e)} \n {results_json}"


def handler(event, context):
    # Lambda entry point: run the coroutine and return its result as JSON.
    resp = asyncio.run(arsenic_simple())
    print(resp)
    return {
        'statusCode': 200,
        'body': json.dumps(resp)
    }
The response of get_page_source() is -> <html><head></head><body></body></html>
as if there were no internet connection. The Docker container has a working proxy connection that other request libraries can use, but with arsenic I can't use the proxy.
Any idea?
I'm trying to use Docker with a proxy configuration because I can't use a proxy with username and password auth in arsenic.
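One thing that may be worth trying here: instead of relying on the browser to pick up the container's environment variables, pass the proxy to Chromium explicitly. The sketch below is only an illustration under that assumption. The helper name proxy_arg_from_env is hypothetical; it reads the same http_proxy / https_proxy variables set in the Dockerfile and turns the first one it finds into a "--proxy-server=" argument that can be appended to the "args" list in "goog:chromeOptions". It does not address username/password auth.

```python
import os


def proxy_arg_from_env(environ=None):
    """Return a Chromium --proxy-server argument built from proxy env vars.

    Hypothetical helper: checks the conventional variable names in order
    and returns None when none of them is set.
    """
    env = os.environ if environ is None else environ
    for name in ("https_proxy", "HTTPS_PROXY", "http_proxy", "HTTP_PROXY"):
        value = env.get(name)
        if value:
            return f"--proxy-server={value}"
    return None


if __name__ == "__main__":
    args = ["--headless", "--disable-gpu", "--no-sandbox"]
    proxy_arg = proxy_arg_from_env()
    if proxy_arg is not None:
        args.append(proxy_arg)
    # These args would go into browser.capabilities["goog:chromeOptions"]["args"].
    print(args)
```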
It seems you want to configure headless chromium to use a proxy.
I'd start with https://blog.apify.com/how-to-make-headless-chrome-and-puppeteer-use-a-proxy-server-with-authentication-249a21a79212/ or something
I'm investigating how I could implement it. I'm trying the selenium-wire approach: run a local proxy with mitmproxy in upstream mode and connect arsenic to that local proxy.
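A rough sketch of that local-proxy idea, assuming mitmproxy's mitmdump is installed and that its "--mode upstream:", "--upstream-auth" and "--listen-port" options behave as documented for your version. The function name and the example host, port and credentials are hypothetical. The function only builds the command line; the browser would then be pointed at the local, unauthenticated listener with "--proxy-server=http://127.0.0.1:<port>".

```python
def mitmdump_upstream_command(upstream_url, user, password, listen_port=8080):
    """Build an argv list for a local mitmdump that forwards to an
    authenticated upstream proxy (hypothetical helper, not part of arsenic)."""
    return [
        "mitmdump",
        "--mode", f"upstream:{upstream_url}",
        "--upstream-auth", f"{user}:{password}",
        "--listen-port", str(listen_port),
    ]


if __name__ == "__main__":
    cmd = mitmdump_upstream_command("http://urltoproxy:3128", "USER", "PASSWORD")
    # To actually start it: subprocess.Popen(cmd)
    print(" ".join(cmd))
```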
My 2c: this issue can be closed.
Rationale: it's a corner case in chromium configuration; this library can't cover all of browser config, only pertinent/common flags.
I fixed the problem using my own extension for an HTTP/HTTPS proxy with auth.
First you need to add this flag -> "--load-extension=/path/folder_extension"
Inside folder_extension I have two files:
- background.js
var config = {
  mode: "fixed_servers",
  rules: {
    singleProxy: {
      scheme: "https",
      host: "HOST_PROXY",
      port: PORT
    },
    bypassList: ["localhost"]
  }
};
chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});
function callbackFn(details) {
  return {
    authCredentials: {
      username: "USER",
      password: "PASSWORD"
    }
  };
}
chrome.webRequest.onAuthRequired.addListener(
  callbackFn,
  {urls: ["<all_urls>"]},
  ['blocking']
);
- manifest.json
{
  "version": "1.0.0",
  "manifest_version": 2,
  "name": "Chrome Proxy",
  "permissions": [
    "proxy",
    "tabs",
    "unlimitedStorage",
    "storage",
    "<all_urls>",
    "webRequest",
    "webRequestBlocking"
  ],
  "background": {
    "scripts": ["background.js"]
  },
  "minimum_chrome_version": "22.0.0"
}
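If the proxy host and credentials change between runs, the two files can be generated from Python before starting the session. This is a minimal sketch under that assumption: write_proxy_extension and its parameters are hypothetical names, and the file contents simply mirror the background.js and manifest.json shown in this comment, with the placeholders filled in through json.dumps so the values are escaped safely.

```python
import json
from pathlib import Path

# Same manifest as in the comment above.
MANIFEST = {
    "version": "1.0.0",
    "manifest_version": 2,
    "name": "Chrome Proxy",
    "permissions": [
        "proxy", "tabs", "unlimitedStorage", "storage",
        "<all_urls>", "webRequest", "webRequestBlocking",
    ],
    "background": {"scripts": ["background.js"]},
    "minimum_chrome_version": "22.0.0",
}

BACKGROUND_TEMPLATE = """\
var config = %(config)s;
chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});
chrome.webRequest.onAuthRequired.addListener(
  function(details) { return {authCredentials: %(credentials)s}; },
  {urls: ["<all_urls>"]},
  ["blocking"]
);
"""


def write_proxy_extension(folder, host, port, user, password, scheme="https"):
    """Write background.js and manifest.json into `folder` and return its path."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    config = {
        "mode": "fixed_servers",
        "rules": {
            "singleProxy": {"scheme": scheme, "host": host, "port": int(port)},
            "bypassList": ["localhost"],
        },
    }
    credentials = {"username": user, "password": password}
    background = BACKGROUND_TEMPLATE % {
        "config": json.dumps(config, indent=2),
        "credentials": json.dumps(credentials),
    }
    (folder / "background.js").write_text(background, encoding="utf-8")
    (folder / "manifest.json").write_text(json.dumps(MANIFEST, indent=2), encoding="utf-8")
    return folder
```

The returned folder is what goes into the "--load-extension=" flag.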
Hi @DiMiTriFrog, where did you add the load-extension flag?
I have been trying this, but with no luck.
browser.capabilities = {
"goog:chromeOptions": {"args": ["--headless","--load-extension=C:/Path/folder/extension/"]}
}
Thanks!
Hi, extensions don't work in headless mode.
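To avoid hitting this by accident, the argument list can be checked before it is handed to the browser. The sketch below only encodes the claim from this reply (that the extension will not load when a headless flag is present); the helper name is hypothetical, and whether the restriction applies to your Chromium build is something to verify.

```python
def chrome_args_with_extension(extension_dir, extra_args=()):
    """Return Chromium args that load an unpacked extension.

    Hypothetical helper: refuses any --headless flag, following the
    reply above that extensions do not load in headless mode.
    """
    headless = [arg for arg in extra_args if arg.startswith("--headless")]
    if headless:
        raise ValueError(f"remove {headless[0]} to load an extension")
    return [f"--load-extension={extension_dir}", *extra_args]


if __name__ == "__main__":
    args = chrome_args_with_extension("/path/folder_extension", ["--no-sandbox"])
    # These args would go into browser.capabilities["goog:chromeOptions"]["args"].
    print(args)
```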