Comments (7)
You can initialize a requests session before creating the browser and pass it to the StatefulBrowser constructor (parameter session).
from mechanicalsoup.
No need to write a new server, httpbin is there ;-). It does work, indeed, see 72783b8.
from mechanicalsoup.
Cookies are naturally "persistent" in MechanicalSoup in the sense "kept from one request to another". They are not persistent from one MechanicalSoup session to another (i.e. if you terminate your Python program and restart it, cookies are lost).
Saving cookies in a file is possible; it has been discussed here: #37
No time to actually try this, but probably we should add a mention of it in the doc (no time either for that ;-) ).
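A minimal sketch of the save-and-reload idea, assuming a JSON file of cookie name/value pairs is enough (this drops each cookie's domain, path and expiry). The helper names and the file path are hypothetical, not part of MechanicalSoup; only `requests.utils.add_dict_to_cookiejar`, `requests.utils.dict_from_cookiejar` and the `session` parameter of `StatefulBrowser` are existing API.

```python
import json
from pathlib import Path


def save_cookies(cookie_dict, path):
    """Write a plain name -> value mapping to a JSON file."""
    Path(path).write_text(json.dumps(cookie_dict), encoding="utf-8")


def load_cookies(path):
    """Read the mapping back; a missing file means no saved cookies."""
    file = Path(path)
    if not file.exists():
        return {}
    return json.loads(file.read_text(encoding="utf-8"))


def browser_from_cookie_file(path):
    """Build a StatefulBrowser whose session starts with the saved cookies."""
    # Imported here so the two helpers above work without third-party packages.
    import mechanicalsoup
    import requests

    session = requests.Session()
    requests.utils.add_dict_to_cookiejar(session.cookies, load_cookies(path))
    return mechanicalsoup.StatefulBrowser(session=session)


def save_browser_cookies(browser, path):
    """Dump the browser's current cookies, e.g. just before the program exits."""
    import requests

    save_cookies(requests.utils.dict_from_cookiejar(browser.session.cookies), path)
```

Typical use would be `browser = browser_from_cookie_file("cookies.json")` at startup and `save_browser_cookies(browser, "cookies.json")` at the end of the run.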
from mechanicalsoup.
Saving is not the problem; the initial load is.
I tried several ways, but failed to set cookies before opening the URL.
requests.utils.add_dict_to_cookiejar(browser.session.cookies, cookies)
failed
browser = mechanicalsoup.StatefulBrowser(url,cookies=cookies)
failed
Replacing the MechanicalSoup browser's requests object, on init, with a requests object whose cookie jar was already set also failed.
from mechanicalsoup.
With requests, opening a URL with cookies is very simple.
my_custom_cookies = dict()
my_custom_cookies.update({'key1':'value1'})
res = requests.get(url, cookies=my_custom_cookies)
But with MechanicalSoup I haven't found a workable way. I need help.
from mechanicalsoup.
Thank you, this is the right way.
s = requests.Session()
requests.utils.add_dict_to_cookiejar(s.cookies, {'key1': 'val1'})
browser = mechanicalsoup.StatefulBrowser(session=s)
r = browser.open(url)
and browser.open returns a response r for which r.ok is true.
from mechanicalsoup.
I wrote a small web server to test cookies; both approaches work.
import http.server
import multiprocessing
import re
import socketserver
import time

import mechanicalsoup
import requests
from bs4 import BeautifulSoup

PORT = 8000
cookies_dict = {'key1': 'val1', 'key2': 'val2'}


class HTTPRequestHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # Echo the request path and any Cookie header back as HTML.
        self.send_response(200)
        self.send_header("Content-type", "text/html")
        self.end_headers()
        self.wfile.write(bytes("<html><head><title>test</title></head>\n", "utf-8"))
        self.wfile.write(bytes("<p>Request: %s</p>" % self.path, "utf-8"))
        self.wfile.write(bytes("<body>\n", "utf-8"))
        find_cookie = False
        for h in self.headers:
            if re.search('cookie', h, re.I):
                self.wfile.write(bytes(f"<p>{h}: {self.headers[h]}</p>\n", "utf-8"))
                find_cookie = True
        if not find_cookie:
            self.wfile.write(bytes("<p>NO Cookies.</p>\n", "utf-8"))
        self.wfile.write(bytes("</body></html>", "utf-8"))


def server():
    with socketserver.TCPServer(("", PORT), HTTPRequestHandler) as httpd:
        httpd.serve_forever()


def cookie_test_requests_get():
    # Plain requests: cookies passed per request.
    r = requests.get(f'http://127.0.0.1:{PORT}/test_requests_get',
                     cookies=cookies_dict)
    if r.ok:
        print(BeautifulSoup(r.text, features="lxml").get_text())


def cookie_test_mechanicalsoup_browser():
    # MechanicalSoup: cookies set on the session before the browser is created.
    s = requests.Session()
    requests.utils.add_dict_to_cookiejar(s.cookies, cookies_dict)
    browser = mechanicalsoup.StatefulBrowser(session=s)
    r = browser.open(f'http://127.0.0.1:{PORT}/test_mechanicalsoup_browser')
    if r.ok:
        print(str(browser.page.get_text()))
    # Keep the second response too, so r.ok below checks this request.
    r = browser.open_relative('/retry_mechanicalsoup_browser')
    if r.ok:
        print(str(browser.page.get_text()))


def test_cookies():
    # Run the server in a separate process and give it a moment to start.
    s = multiprocessing.Process(target=server)
    s.start()
    time.sleep(1)
    cookie_test_requests_get()
    cookie_test_mechanicalsoup_browser()
    s.terminate()


if __name__ == "__main__":
    test_cookies()
Output:
127.0.0.1 - - [15/Oct/2021 20:06:33] "GET /test_requests_get HTTP/1.1" 200 -
test
Request: /test_requests_get
Cookie: key1=val1; key2=val2
127.0.0.1 - - [15/Oct/2021 20:06:33] "GET /test_mechanicalsoup_browser HTTP/1.1" 200 -
test
Request: /test_mechanicalsoup_browser
Cookie: key1=val1; key2=val2
127.0.0.1 - - [15/Oct/2021 20:06:33] "GET /retry_mechanicalsoup_browser HTTP/1.1" 200 -
test
Request: /retry_mechanicalsoup_browser
Cookie: key1=val1; key2=val2
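A variant of the test server above can be sketched with a background thread and an OS-assigned port, which avoids both the fixed PORT and the one-second sleep. This is only an illustration under assumptions: the handler and helper names are hypothetical, and the client is urllib so that the block needs nothing beyond the standard library. The requests call or the MechanicalSoup browser from the script above could be pointed at the same base URL instead.

```python
import http.server
import threading
import urllib.request


class CookieEchoHandler(http.server.BaseHTTPRequestHandler):
    """Reply with the raw Cookie header of the request (hypothetical helper)."""

    def do_GET(self):
        body = (self.headers.get("Cookie") or "NO Cookies.").encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the per-request log lines


def start_echo_server():
    # Port 0 asks the OS for any free port; the socket is already listening
    # when the constructor returns, so no sleep is needed before connecting.
    httpd = http.server.HTTPServer(("127.0.0.1", 0), CookieEchoHandler)
    threading.Thread(target=httpd.serve_forever, daemon=True).start()
    return httpd, f"http://127.0.0.1:{httpd.server_port}"


def fetch_with_cookie_header(url, cookie_header):
    # An empty ProxyHandler keeps environment proxy settings out of the test.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request(url, headers={"Cookie": cookie_header})
    with opener.open(request) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    httpd, base_url = start_echo_server()
    try:
        print(fetch_with_cookie_header(base_url + "/test", "key1=val1; key2=val2"))
    finally:
        httpd.shutdown()
        httpd.server_close()
```

The server echoes whatever Cookie header it receives, so the same check works for any client that is expected to send preset cookies.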
from mechanicalsoup.