
Tesla Service Manual Scraper

This script downloads the Tesla Service Manual into a local docs/ folder for offline access. It works on Windows and macOS (macOS support thanks to CollinHeist).

Setup

  1. Go into secrets.py and fill in tesla_account_email and tesla_account_password with your Tesla account email and password (see the configuration sketch after this list).
  2. Go into scrape.py and enter the index URL of the manual you want saved by changing the service_manual_index and base_url variables. It defaults to the Model S (2012-2020) manual, but it is confirmed to work with the Model 3 as well.
  3. If you have 2FA or other login challenges, consider increasing login_delay in secrets.py to 2 or 3 seconds so you have time to enter your credentials manually.
  4. Set up Python 3. See the tutorial at: https://wiki.python.org/moin/BeginnersGuide/Download
  5. Set up Selenium for Python. To use the required stealth module, you must use the Chromium webdriver. See the tutorial at: https://blog.testproject.io/2019/07/16/installing-selenium-webdriver-using-python-chrome/
  6. Install the required packages with pip (including requests, selenium, selenium-stealth, and beautifulsoup4). On Windows, run the following commands in Command Prompt (CMD):
    1. cd C:\Users\Anson\Desktop\TeslaServiceManualScraper [template, the path should go wherever you saved this readme]
    2. run pip install -r requirements.txt
  7. Before scraping, it is always a good idea to use a VPN of some sort to avoid any issues with your account. I didn't run into any issues personally, but you can never be too safe. It is also worthwhile to open a new account to claim the manuals instead of using a personal account.
  8. Run scrape.py by typing python scrape.py
  9. The browser will automatically restart when it encounters problems with the files or login status.
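
As a reference for steps 1 and 2, here is a minimal sketch of the values involved. The variable names come from the project itself; the example values and the URL are placeholders (assumptions), so copy the real index URL of the manual you want from service.tesla.com:

    # secrets.py (sketch -- the real file also contains the tesla_login helper)
    tesla_account_email = "you@example.com"    # your Tesla account email
    tesla_account_password = "hunter2"         # your Tesla account password
    login_delay = 3                            # seconds to pause for manual 2FA/CAPTCHA entry

    # scrape.py (sketch -- illustrative URL only; use the index URL of your manual)
    base_url = "https://service.tesla.com/docs/ModelS/ServiceManual/en-us/"
    service_manual_index = base_url + "index.html"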

Viewing offline

Option 1: Easy Way

  1. Go into the docs/ folder and open index.html. You will get roughly 99% of the service manual that way, but with no search functionality.

Option 2: HTTP Server (thanks to TheNexusAvenger)

  1. Open CMD on Windows and change the directory to the docs folder, e.g. cd C:\Users\Anson\Desktop\TeslaServiceManualScraper\docs
  2. Run the following command: python -m http.server (Python needs to be installed, of course; a one-line variant is shown after this list)
  3. Use a web browser and navigate to: http://localhost:8000/ to see the full service manual including search.
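
The commands above are the documented route; as a small convenience, on Python 3.7+ the built-in server can also be pointed at the folder directly, so the cd step becomes optional:

    python -m http.server 8000 --directory docs

Either way, browse to http://localhost:8000/ for the full manual including search.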

Tips

  • A full scrape of the Model 3 service manual took over 30 minutes. The script is set up so that you can stop it and continue later.
  • Keep an eye out: Tesla's website seems to log you out after about 250 pages or roughly 20 minutes of continuous refreshing, so it may be worthwhile to run this on the side while keeping an eye on your login status.
  • Total file size of the Model 3 service manual is roughly 2.2GB.
  • On your first run, Tesla might throw a CAPTCHA or land on an error page. Most of the time, simply rerunning the script will work.
  • If you get an "Access Denied" page, Tesla has likely blocked you; the block usually expires after about 10 minutes.
  • If you need to reset for any reason (for example, to scrape another manual), delete or move the docs/ folder and the dict.picle file.
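
For the reset described in the last tip, something like the following from the project folder is all that is needed on Windows (filenames as listed above; adjust the paths if your copy differs):

    rmdir /s /q docs
    del dict.picle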

Contributors

ansonlai, collinheist, hlarsen, rashkash103


Issues

AttributeError: 'WebDriver' object has no attribute 'find_element_by_css_selector'

Chrome 103. Is this a ChromeDriver problem?

C:\Program Files\Python310\TeslaServiceManualScraper-master>py scrape.py
C:\Program Files\Python310\TeslaServiceManualScraper-master\scrape.py:1: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives
from distutils.command.clean import clean

DevTools listening on ws://127.0.0.1:54174/devtools/browser/3f8df9b0-340b-4578-ac26-aa680caabf37
****** SESSION LOADED ******
Traceback (most recent call last):
File "C:\Program Files\Python310\TeslaServiceManualScraper-master\scrape.py", line 272, in
run()
File "C:\Program Files\Python310\TeslaServiceManualScraper-master\scrape.py", line 261, in run
driver.get_index()
File "C:\Program Files\Python310\TeslaServiceManualScraper-master\scrape.py", line 46, in get_index
driver = tesla_login(self.driver)
File "C:\Program Files\Python310\TeslaServiceManualScraper-master\secrets.py", line 10, in tesla_login
driver.find_element_by_css_selector("#form-input-identity").send_keys(tesla_account_email)
AttributeError: 'WebDriver' object has no attribute 'find_element_by_css_selector'
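
For anyone hitting this: it is not a ChromeDriver problem. Recent Selenium 4 releases removed the old find_element_by_* helpers (they were only deprecated earlier in the 4.x line), so scripts written against the old API raise exactly this AttributeError. A minimal sketch of the modern call, assuming you are patching the lookup in secrets.py yourself:

    from selenium.webdriver.common.by import By

    # Selenium 4 style: one find_element() call with an explicit locator strategy
    driver.find_element(By.CSS_SELECTOR, "#form-input-identity").send_keys(tesla_account_email)

Alternatively, pinning an older Selenium release where the helpers still exist (for example 4.2.0 or below) avoids touching the script at all.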

reCAPTCHA issue, and then an "Access Denied" error

[Screenshot: Screen Shot 2022-05-23 at 18 17 42]

It passes the credentials, but there is a reCAPTCHA box that needs checking. After it failed, I tried again, checked the box and completed the human verification, and then I got Access Denied.

Logging in manually with my account works fine though.

New Tip: Search Functionality

Under the Tips section, there is a mention of getting other files for the styling of the website. While trying out a local download of the Model Y service manual, I noticed there is another index.json file that can be downloaded to get searching. The main caveat is that the search functionality requires fetching that file over an HTTP request, which necessitates running the manual with an HTTP server. python -m http.server works just fine from what I can tell. Adding these 2 notes may be beneficial.
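
A quick way to try this, assuming index.json ended up in docs/ alongside index.html (placement depends on what your scrape saved):

    cd docs
    python -m http.server 8000

Then browse to http://localhost:8000/ and use the search box. Opening index.html straight from the filesystem cannot make the HTTP request for index.json, which is why search does not work with Option 1.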

AttributeError: 'NoneType' object has no attribute 'endswith'

images to be downloaded: 7740
visited: 952
upcoming: 3
images to be downloaded: 7752
visited: 953
upcoming: 2
images to be downloaded: 7753
visited: 954
upcoming: 1
images to be downloaded: 7754
visited: 955
upcoming: 0
images to be downloaded: 7759
****** SESSION SAVED ******
Traceback (most recent call last):
File "C:\TeslaServiceManual\ts1\scrape.py", line 270, in
run()
File "C:\TeslaServiceManual\ts1\scrape.py", line 263, in run
driver.get_html()
File "C:\TeslaServiceManual\ts1\scrape.py", line 152, in get_html
self.restart_scrape()
File "C:\TeslaServiceManual\ts1\scrape.py", line 43, in restart_scrape
run()
File "C:\TeslaServiceManual\ts1\scrape.py", line 263, in run
driver.get_html()
File "C:\TeslaServiceManual\ts1\scrape.py", line 152, in get_html
self.restart_scrape()
File "C:\TeslaServiceManual\ts1\scrape.py", line 43, in restart_scrape
run()
File "C:\TeslaServiceManual\ts1\scrape.py", line 263, in run
driver.get_html()
File "C:\TeslaServiceManual\ts1\scrape.py", line 152, in get_html
self.restart_scrape()
File "C:\TeslaServiceManual\ts1\scrape.py", line 43, in restart_scrape
run()
File "C:\TeslaServiceManual\ts1\scrape.py", line 263, in run
driver.get_html()
File "C:\TeslaServiceManual\ts1\scrape.py", line 152, in get_html
self.restart_scrape()
File "C:\TeslaServiceManual\ts1\scrape.py", line 43, in restart_scrape
run()
File "C:\TeslaServiceManual\ts1\scrape.py", line 263, in run
driver.get_html()
File "C:\TeslaServiceManual\ts1\scrape.py", line 152, in get_html
self.restart_scrape()
File "C:\TeslaServiceManual\ts1\scrape.py", line 43, in restart_scrape
run()
File "C:\TeslaServiceManual\ts1\scrape.py", line 263, in run
driver.get_html()
File "C:\TeslaServiceManual\ts1\scrape.py", line 152, in get_html
self.restart_scrape()
File "C:\TeslaServiceManual\ts1\scrape.py", line 43, in restart_scrape
run()
File "C:\TeslaServiceManual\ts1\scrape.py", line 265, in run
clean_img_urls()
File "C:\TeslaServiceManual\ts1\scrape.py", line 248, in clean_img_urls
if url.endswith('jpg'):
AttributeError: 'NoneType' object has no attribute 'endswith'

help me.....
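
This crash means the collected image URL list contains a None entry, most likely from an img tag without a usable src. A hedged sketch of a defensive filter, assuming a helper shaped roughly like the clean_img_urls in the traceback (the real function's internals may differ):

    def clean_img_urls(img_urls):
        # Skip missing/empty entries before checking extensions so a None
        # src no longer crashes the run with AttributeError.
        cleaned = []
        for url in img_urls:
            if not url:
                continue
            if url.endswith('jpg') or url.endswith('png'):
                cleaned.append(url)
        return cleaned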

Traceback issues with find_element return self.execute(Command.FIND_ELEMEN

Using Selenium 4.2.0 with the latest scrape.py script. It seems like an Access Denied issue; it kicks me out immediately with the error below. The script gets stuck on the Model S Electrical/Harness section 1710 (a large quantity of images).

C:\Program Files\Python310\TeslaServiceManualScraper-master>py scrape.py
C:\Program Files\Python310\TeslaServiceManualScraper-master\scrape.py:1: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives
from distutils.command.clean import clean

DevTools listening on ws://127.0.0.1:65188/devtools/browser/35abb965-bbe7-49d4-a23f-84559bce2819
****** SESSION LOADED ******
Traceback (most recent call last):
File "C:\Program Files\Python310\TeslaServiceManualScraper-master\scrape.py", line 272, in
run()
File "C:\Program Files\Python310\TeslaServiceManualScraper-master\scrape.py", line 261, in run
driver.get_index()
File "C:\Program Files\Python310\TeslaServiceManualScraper-master\scrape.py", line 46, in get_index
driver = tesla_login(self.driver)
File "C:\Program Files\Python310\TeslaServiceManualScraper-master\secrets.py", line 10, in tesla_login
driver.find_element_by_css_selector("#form-input-identity").send_keys(tesla_account_email)
File "C:\Program Files\Python310\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 814, in find_element_by_css_selector
return self.find_element(by=By.CSS_SELECTOR, value=css_selector)
File "C:\Program Files\Python310\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 1251, in find_element
return self.execute(Command.FIND_ELEMENT, {
File "C:\Program Files\Python310\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 430, in execute
self.error_handler.check_response(response)
File "C:\Program Files\Python310\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 247, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"#form-input-identity"}
(Session info: chrome=103.0.5060.114)
Stacktrace:
Backtrace:
Ordinal0 [0x00DF6463+2188387]
Ordinal0 [0x00D8E461+1762401]
Ordinal0 [0x00CA3D78+802168]
Ordinal0 [0x00CD1880+989312]
Ordinal0 [0x00CD1B1B+989979]
Ordinal0 [0x00CFE912+1173778]
Ordinal0 [0x00CEC824+1099812]
Ordinal0 [0x00CFCC22+1166370]
Ordinal0 [0x00CEC5F6+1099254]
Ordinal0 [0x00CC6BE0+945120]
Ordinal0 [0x00CC7AD6+948950]
GetHandleVerifier [0x010971F2+2712546]
GetHandleVerifier [0x0108886D+2652765]
GetHandleVerifier [0x00E8002A+520730]
GetHandleVerifier [0x00E7EE06+516086]
Ordinal0 [0x00D9468B+1787531]
Ordinal0 [0x00D98E88+1805960]
Ordinal0 [0x00D98F75+1806197]
Ordinal0 [0x00DA1DF1+1842673]
BaseThreadInitThunk [0x76D36739+25]
RtlGetFullPathName_UEx [0x77E88FEF+1215]
RtlGetFullPathName_UEx [0x77E88FBD+1165]
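
The NoSuchElementException means the login form never appeared, which fits the Access Denied theory: the page Selenium landed on simply has no #form-input-identity field. A sketch (not the repo's code) of a more forgiving lookup that waits briefly and reports what actually loaded:

    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    try:
        # Wait up to 20 seconds for the email field instead of failing instantly.
        field = WebDriverWait(driver, 20).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "#form-input-identity"))
        )
        field.send_keys(tesla_account_email)
    except Exception:
        # Most likely an Access Denied or CAPTCHA page; the title usually says which.
        print("Login field not found; page title was:", driver.title)
        raise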

Browser sizing issue

[Screenshots: full screen vs. partial screen]
There are some issues with browser sizing: with the browser at full screen, some files are missing, but with only a partially sized browser window I can see the whole picture.
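
If the rendered layout really does depend on window size, forcing a consistent size before scraping may help; a small sketch (the script may already set its own window options, so treat this as an experiment):

    # Set an explicit window size so pages render the same way on every run.
    driver.set_window_size(1920, 1080)
    # Or via Chrome options before the driver is created:
    # options.add_argument("--window-size=1920,1080")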

IndexError (already downgraded to Selenium 4.2.0)

I am getting this same issue. It launches the browser, populates the username, refreshes the page, populates the password, and submits. The page then asks for a passcode (2FA), refreshes, says I don't have access to the page, and kicks me back to the 2FA prompt. I enter the code, click submit (or wait), and the browser closes.

C:\Users\stone\Downloads\TeslaServiceManualScraper-master\TeslaServiceManualScraper-master>python scrape.py
C:\Users\stone\Downloads\TeslaServiceManualScraper-master\TeslaServiceManualScraper-master\scrape.py:1: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives
from distutils.command.clean import clean

DevTools listening on ws://127.0.0.1:53787/devtools/browser/e4778eca-0cef-4c56-9229-57f471a67201
****** SESSION LOADED ******
[7332:23804:0722/094217.385:ERROR:device_event_log_impl.cc(214)] [09:42:17.386] USB: usb_device_handle_win.cc:1048 Failed to read descriptor from node connection: A device attached to the system is not functioning. (0x1F)
[7332:23804:0722/094217.395:ERROR:device_event_log_impl.cc(214)] [09:42:17.396] USB: usb_device_handle_win.cc:1048 Failed to read descriptor from node connection: A device attached to the system is not functioning. (0x1F)
Traceback (most recent call last):
File "C:\Users\stone\Downloads\TeslaServiceManualScraper-master\TeslaServiceManualScraper-master\scrape.py", line 273, in
run()
File "C:\Users\stone\Downloads\TeslaServiceManualScraper-master\TeslaServiceManualScraper-master\scrape.py", line 262, in run
driver.get_index()
File "C:\Users\stone\Downloads\TeslaServiceManualScraper-master\TeslaServiceManualScraper-master\scrape.py", line 53, in get_index
window1 = driver.window_handles[1]
IndexError: list index out of range

C:\Users\stone\Downloads\TeslaServiceManualScraper-master\TeslaServiceManualScraper-master>
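
The IndexError means only one browser window existed when the script expected a second one (driver.window_handles[1]), typically because the login flow never got far enough to open the manual in a new tab. A hedged sketch of a wait that avoids the hard crash, assuming the surrounding code matches the traceback:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    # Wait up to 30 seconds for the second window instead of indexing blindly.
    WebDriverWait(driver, 30).until(EC.number_of_windows_to_be(2))
    window1 = driver.window_handles[1]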

Unable to scrape, need help

Can I get some help?
I had to make some tweaks to get it to download at all. Now it downloads for a while, but the browser closes at exactly the same spot every time: while downloading the Model S manual, it reaches section 12 and simply closes the browser, then opens back up and I authenticate again, but it starts at the very beginning of the manual. It does not pick up from section 12.

Any help is appreciated.

Changes to the requirements document:
beautifulsoup4==4.10.0
certifi==2020.11.8
chardet==3.0.4
future==0.18.2
idna==2.10
lxml==4.8.0
Pillow==9.1.1
pywin32-ctypes==0.2.0
requests==2.25.1
soupsieve==2.2.1
urllib3==1.26.2
zipp==3.4.0

Scraper errors

Hi, thanks for creating this script. I am receiving the error below. Any idea why?

Also, in case it matters: I noticed that the script does not log in directly to service.tesla.com; rather, it logs into tesla.com. After it eventually gets to the manual I specified, it fails.

DevTools listening on ws://127.0.0.1:57338/devtools/browser/cd79339b-94d9-4d66-af54-37bfc30
****** SESSION LOADED ******
[7452:19956:0129/225725.215:ERROR:device_event_log_impl.cc(215)] [22:57:25.215] USB: usb_device_handle_win.cc:1046 Failed to read descriptor from node connection: A device attached to the system is not functioning. (0x1F)
[7452:19956:0129/225725.216:ERROR:device_event_log_impl.cc(215)] [22:57:25.216] USB: usb_device_handle_win.cc:1046 Failed to read descriptor from node connection: A device attached to the system is not functioning. (0x1F)
Traceback (most recent call last):
File "c:\users\j\downloads\ts\scrape.py", line 273, in <module>
run()
File "c:\users\j\downloads\ts\scrape.py", line 262, in run
driver.get_index()
File "c:\users\j\downloads\ts\scrape.py", line 53, in get_index
window1 = driver.window_handles[1]
IndexError: list index out of range

macOS + Python

Installed selenium, selenium-stealth, beautifulsoup4 and also ran chromedriver

Traceback (most recent call last):
File "scrape.py", line 7, in
import requests
ModuleNotFoundError: No module named 'requests'
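
ModuleNotFoundError: No module named 'requests' just means requests was not installed into the Python interpreter that runs scrape.py, which is easy to hit on macOS where several Pythons can coexist. Installing with that same interpreter usually fixes it, for example:

    python3 -m pip install -r requirements.txt
    python3 scrape.py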
